prometheus+influxdb2.0+grafana打造监控系统

prometheus本身使用本地的tsdb数据库来保存数据,如果prometheus服务部署在容器中,没有做持久化,将会导致数据丢失。我们可以使用prometheus的remote write功能将数据写入远程的时序性数据库,这儿我们使用influxdb2.0

在influxdb1.x中,prometheus可以直接将数据写入influxdb中,influxdb2.0的改动比较大,使用telegraf 插件来收集指标数据到influxdb中。所以这儿我们使用telegraf 来将prometheus的数据同步到influxdb中。在influxdb1.x中,使用Influxql来查询数据,在influxdb2.0中,引入了自己的脚本语言(flux)来做查询,如果在influxdb2.0中想使用influx1.0版本通过influxql来查询数据的话,需要在influxdb中配置database和retention policy,详情见官网。
influxdb2.0,使用tick(telegraf、influxdb2.0,chronnograf、kapacitor)组合来处理指标数据的收集、存储、展示和报警。

influxdb 相关概念

  1. org:组织,多租户
  2. bucket:类似influxdb1.x中databse的概念
  3. measurement:table,类似于表
  4. field:field key、field value,具体的值数据
  5. field set:值数据字段的集合
  6. tag:指标,标签字段,类似索引
  7. tag set:指标字段的集合(多个指标字段)
  8. telegraf:数据收集(类似prometheus exporter)
  9. user:用户

一、安装influxdb2.0

docker pull influxdb
docker run -d -p 8086:8086 influxdb
打开http://localhost:8086就可以本地访问influxdb的页面了,大概长这样。首次登录需要设置org、user、password来登录

image.png

可以在页面创建bucket及token,用于telegraf发送数据到influxdb

二、安装telegraf

docker pull telegraf
// 为了便于telegraf和influxdb之间的通信,telegraf容器直接共享influxdb的网络,influxdb-container-name 为influxdb的容器名称或ID
INFLUX_TOKEN 为页面配置的api-token值

image.png

image.png

docker run --net=container:influxdb-container-name -e INFLUX_TOKEN=iCVATJ7rNxZxnNsWf9-QRL-uiLCTKnZLb0mOuP5eAXeVKrBXuZMB7mKFVkTd7HM0oRessWZ3Q== -v /$HOME/k8s/telegraf/telegraf.conf:/etc/telegraf/telegraf.conf telegraf
telegraf配置文件,在这儿分为三部分
第一部分:output influxdb的配置,可以从influxdb页面把配置粘贴过来

image.png

image.png

第二部分:input prometheus的配置,从github上找telegraf prometheus插件的配置
https://github.com/influxdata/telegraf/tree/release-1.20/plugins/inputs/prometheus
第三部分:telegraf 开启端口,供prometheus remote write使用

 [[outputs.influxdb_v2]]
  ## The URLs of the InfluxDB cluster nodes.
  ##
  ## Multiple URLs can be specified for a single cluster, only ONE of the
  ## urls will be written to each interval.
  ##   ex: urls = ["https://us-west-2-1.aws.cloud2.influxdata.com"]
  urls = ["http://localhost:8086"]

  ## API token for authentication.
  token = "$INFLUX_TOKEN"

  ## Organization is the name of the organization you wish to write to; must exist.
  organization = "rekca"

  ## Destination bucket to write into.
  bucket = "prometheus"

  ## The value of this tag will be used to determine the bucket.  If this
  ## tag is not set the 'bucket' option is used as the default.
  # bucket_tag = ""

  ## If true, the bucket tag will not be added to the metric.
  # exclude_bucket_tag = false

  ## Timeout for HTTP messages.
  # timeout = "5s"

  ## Additional HTTP headers
  # http_headers = {"X-Special-Header" = "Special-Value"}

  ## HTTP Proxy override, if unset values the standard proxy environment
  ## variables are consulted to determine which proxy, if any, should be used.
  # http_proxy = "http://corporate.proxy:3128"

  ## HTTP User-Agent
  # user_agent = "telegraf"

  ## Content-Encoding for write request body, can be set to "gzip" to
  ## compress body or "identity" to apply no encoding.
  # content_encoding = "gzip"

  ## Enable or disable uint support for writing uints influxdb 2.0.
  # influx_uint_support = false

  ## Optional TLS Config for use on HTTP connections.
  # tls_ca = "/etc/telegraf/ca.pem"
  # tls_cert = "/etc/telegraf/cert.pem"
  # tls_key = "/etc/telegraf/key.pem"
  ## Use TLS but skip chain & host verification
  # insecure_skip_verify = false

# Read metrics from one or many prometheus clients
[[inputs.prometheus]]
  ## An array of urls to scrape metrics from.
  urls = ["http://172.17.0.3:9090/metrics"]

  ## Metric version controls the mapping from Prometheus metrics into
  ## Telegraf metrics.  When using the prometheus_client output, use the same
  ## value in both plugins to ensure metrics are round-tripped without
  ## modification.
  ##
  ##   example: metric_version = 1;
  ##            metric_version = 2; recommended version
  # metric_version = 1

  ## Url tag name (tag containing scrapped url. optional, default is "url")
  # url_tag = "url"

  ## An array of Kubernetes services to scrape metrics from.
  # kubernetes_services = ["http://my-service-dns.my-namespace:9100/metrics"]

  ## Kubernetes config file to create client from.
  # kube_config = "/path/to/kubernetes.config"

  ## Scrape Kubernetes pods for the following prometheus annotations:
  ## - prometheus.io/scrape: Enable scraping for this pod
  ## - prometheus.io/scheme: If the metrics endpoint is secured then you will need to
  ##     set this to 'https' & most likely set the tls config.
  ## - prometheus.io/path: If the metrics path is not /metrics, define it with this annotation.
  ## - prometheus.io/port: If port is not 9102 use this annotation
  # monitor_kubernetes_pods = true

  ## Get the list of pods to scrape with either the scope of
  ## - cluster: the kubernetes watch api (default, no need to specify)
  ## - node: the local cadvisor api; for scalability. Note that the config node_ip or the environment variable NODE_IP must be set to the host IP.
  # pod_scrape_scope = "cluster"

  ## Only for node scrape scope: node IP of the node that telegraf is running on.
  ## Either this config or the environment variable NODE_IP must be set.
  # node_ip = "10.180.1.1"

  ## Only for node scrape scope: interval in seconds for how often to get updated pod list for scraping.
  ## Default is 60 seconds.
  # pod_scrape_interval = 60

  ## Restricts Kubernetes monitoring to a single namespace
  ##   ex: monitor_kubernetes_pods_namespace = "default"
  # monitor_kubernetes_pods_namespace = ""
  # label selector to target pods which have the label
  # kubernetes_label_selector = "env=dev,app=nginx"
  # field selector to target pods
  # eg. To scrape pods on a specific node
  # kubernetes_field_selector = "spec.nodeName=$HOSTNAME"

  ## Scrape Services available in Consul Catalog
  # [inputs.prometheus.consul]
  #   enabled = true
  #   agent = "http://localhost:8500"
  #   query_interval = "5m"

  #   [[inputs.prometheus.consul.query]]
  #     name = "a service name"
  #     tag = "a service tag"
  #     url = 'http://{{if ne .ServiceAddress ""}}{{.ServiceAddress}}{{else}}{{.Address}}{{end}}:{{.ServicePort}}/{{with .ServiceMeta.metrics_path}}{{.}}{{else}}metrics{{end}}'
  #     [inputs.prometheus.consul.query.tags]
  #       host = "{{.Node}}"

  ## Use bearer token for authorization. ('bearer_token' takes priority)
  # bearer_token = "/path/to/bearer/token"
  ## OR
  # bearer_token_string = "abc_123"

  ## HTTP Basic Authentication username and password. ('bearer_token' and
  ## 'bearer_token_string' take priority)
  # username = ""
  # password = ""

  ## Specify timeout duration for slower prometheus clients (default is 3s)
  # response_timeout = "3s"

  ## Optional TLS Config
  # tls_ca = /path/to/cafile
  # tls_cert = /path/to/certfile
  # tls_key = /path/to/keyfile

  ## Use TLS but skip chain & host verification
  # insecure_skip_verify = false

[[inputs.http_listener_v2]]
 ## Address and port to host HTTP listener on
 service_address = ":1234"
 ## Path to listen to.
 path = "/receive"
 ## Data format to consume.
 data_format = "prometheusremotewrite"

三、安装prometheus

docker pull prom/prometheus
prometheus.yml 配置,remote_write 为前面telegraf第三部分开放的端口

# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ["localhost:9090"]
  - job_name: "mac"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
remote_write:
 - url: "http://172.17.0.2:1234/receive"

四、安装grafana

docker pull grafana/grafana
docker run -d -p 3000:3000 grafana/grafana
之后就可以在grafana上配置prometheus相关的dashboard了。

如果想直接在grafana上导入influxdb。需要注意query language


image.png

在import influxdb的dashboard id时,需要区分influxdb1.x和2.x的区别(influxql和flux查询语言的区别)
最后我单独启动了一个telegraf容器,来收集了system数据


image.png

image.png

在grafana上选择id= 14126 的dashboard导入,效果如下


image.png

总结

在influxdb2.x上,influxdb发展得越来越像prometheus生态了。influxdb有telegraf类似(exporter)的各种丰富的插件。开发了自己的flux脚本语言来操作数据,与promql功能类似。个人认为使用prometheus + influxdb 远程存储数据,grafana 做数据展示,使用prometheus 生态的alertManager来做监控告警,也是一种比较好的方案。

最后编辑于
©著作权归作者所有,转载或内容合作请联系作者
  • 序言:七十年代末,一起剥皮案震惊了整个滨河市,随后出现的几起案子,更是在滨河造成了极大的恐慌,老刑警刘岩,带你破解...
    沈念sama阅读 159,569评论 4 363
  • 序言:滨河连续发生了三起死亡事件,死亡现场离奇诡异,居然都是意外死亡,警方通过查阅死者的电脑和手机,发现死者居然都...
    沈念sama阅读 67,499评论 1 294
  • 文/潘晓璐 我一进店门,熙熙楼的掌柜王于贵愁眉苦脸地迎上来,“玉大人,你说我怎么就摊上这事。” “怎么了?”我有些...
    开封第一讲书人阅读 109,271评论 0 244
  • 文/不坏的土叔 我叫张陵,是天一观的道长。 经常有香客问我,道长,这世上最难降的妖魔是什么? 我笑而不...
    开封第一讲书人阅读 44,087评论 0 209
  • 正文 为了忘掉前任,我火速办了婚礼,结果婚礼上,老公的妹妹穿的比我还像新娘。我一直安慰自己,他们只是感情好,可当我...
    茶点故事阅读 52,474评论 3 287
  • 文/花漫 我一把揭开白布。 她就那样静静地躺着,像睡着了一般。 火红的嫁衣衬着肌肤如雪。 梳的纹丝不乱的头发上,一...
    开封第一讲书人阅读 40,670评论 1 222
  • 那天,我揣着相机与录音,去河边找鬼。 笑死,一个胖子当着我的面吹牛,可吹牛的内容都是我干的。 我是一名探鬼主播,决...
    沈念sama阅读 31,911评论 2 313
  • 文/苍兰香墨 我猛地睁开眼,长吁一口气:“原来是场噩梦啊……” “哼!你这毒妇竟也来了?” 一声冷哼从身侧响起,我...
    开封第一讲书人阅读 30,636评论 0 202
  • 序言:老挝万荣一对情侣失踪,失踪者是张志新(化名)和其女友刘颖,没想到半个月后,有当地人在树林里发现了一具尸体,经...
    沈念sama阅读 34,397评论 1 246
  • 正文 独居荒郊野岭守林人离奇死亡,尸身上长有42处带血的脓包…… 初始之章·张勋 以下内容为张勋视角 年9月15日...
    茶点故事阅读 30,607评论 2 246
  • 正文 我和宋清朗相恋三年,在试婚纱的时候发现自己被绿了。 大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
    茶点故事阅读 32,093评论 1 261
  • 序言:一个原本活蹦乱跳的男人离奇死亡,死状恐怖,灵堂内的尸体忽然破棺而出,到底是诈尸还是另有隐情,我是刑警宁泽,带...
    沈念sama阅读 28,418评论 2 254
  • 正文 年R本政府宣布,位于F岛的核电站,受9级特大地震影响,放射性物质发生泄漏。R本人自食恶果不足惜,却给世界环境...
    茶点故事阅读 33,074评论 3 237
  • 文/蒙蒙 一、第九天 我趴在偏房一处隐蔽的房顶上张望。 院中可真热闹,春花似锦、人声如沸。这庄子的主人今日做“春日...
    开封第一讲书人阅读 26,092评论 0 8
  • 文/苍兰香墨 我抬头看了看天上的太阳。三九已至,却和暖如春,着一层夹袄步出监牢的瞬间,已是汗流浃背。 一阵脚步声响...
    开封第一讲书人阅读 26,865评论 0 196
  • 我被黑心中介骗来泰国打工, 没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留,地道东北人。 一个月前我还...
    沈念sama阅读 35,726评论 2 276
  • 正文 我出身青楼,却偏偏与公主长得像,于是被迫代替她去往敌国和亲。 传闻我的和亲对象是个残疾皇子,可洞房花烛夜当晚...
    茶点故事阅读 35,627评论 2 270

推荐阅读更多精彩内容