ELK/Kibana with the Sentinl plugin: log monitoring and alerting (email/webhook)

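A while ago, our company's MongoDB instance kept quietly dying on its own, so every query that tried to connect to the database threw the same error over and over. Because the feature involved wasn't a core one, no user ever reported it, but the logs ballooned. That got me thinking: if I had known the moment the errors started, I could have fixed the problem right away. My original motive was just that I didn't want to look at so many repeated error logs, but for consumer-facing (2C) apps this really matters: if errors keep happening and nobody fixes them, users may quietly uninstall the app without telling you, and you lose a lot of them.

I chose Sentinl because it is a Kibana plugin with a friendly visual interface, so it is very easy to integrate.

This tutorial is based on 6.4.2. The interface and features are broadly the same across versions, though small details may differ.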

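1. Download and install Sentinl

Download: https://github.com/sirensolutions/sentinl/releases
Note: the Sentinl version you pick must match the Kibana version you are running.

The command is: <Kibana install dir>/bin/kibana-plugin install <path to the package>/<package file name>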
java07@java07-Vostro-3268:~/software/kibana-6.3.1-linux-x86_64/bin$ ./kibana-plugin install ../plugins/sentinl-v6.3.1.zip
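(My Linux skills aren't great, so following commands in blog posts sometimes frustrates me. I've spelled out the install step for fellow beginners; experts can install it however they like.)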
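
Since a version mismatch is the easiest thing to get wrong here, a quick check before installing can help. The following is a minimal sketch, assuming both the Kibana directory name and the Sentinl zip name contain an x.y.z version string, as they do in the command above. The function names are my own and are not part of Kibana or Sentinl.

```python
import re
from typing import Optional

# Matches the first x.y.z version number in a name.
VERSION_RE = re.compile(r"(\d+\.\d+\.\d+)")


def extract_version(name: str) -> Optional[str]:
    """Return the first x.y.z version found in a file or directory name."""
    match = VERSION_RE.search(name)
    return match.group(1) if match else None


def versions_match(kibana_dir: str, plugin_zip: str) -> bool:
    """True only when both names carry a version and the versions are equal."""
    kibana_version = extract_version(kibana_dir)
    plugin_version = extract_version(plugin_zip)
    return kibana_version is not None and kibana_version == plugin_version
```

For example, versions_match("kibana-6.3.1-linux-x86_64", "sentinl-v6.3.1.zip") is true, while pairing the same directory with a 6.4.2 package is not.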

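2. After a successful install, restart Kibana

(I installed the plugin while ELK was running, which caused no problems; just restart Kibana once the install finishes.)
You will see this new item on the page, which means the plugin is installed and ready to use: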


image.png

3. Create a new watcher

3.1. Click New


image.png

3.2. Click Watcher Wizard


image.png

3.3. Enter the index or indices to watch (several are allowed) and how often to search (Schedule)
image.png

3.4. Once the index is filled in, more settings appear further down the page


image.png

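"Match condition" defines the condition that must be met before an alert is raised. Its visual editor isn't very friendly, though, so I leave it at the default and edit it by hand later. Some versions have a friendlier editor (why not the version I need ╥﹏╥...).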
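
For reference, the condition that ends up in the advanced configuration later in this post is the script payload.hits.total > 0. The sketch below mirrors that check in Python against a search response, purely as an illustration; Sentinl itself evaluates the script, not this code. The function name and the threshold parameter are my own, and the handling of an object-shaped total (as newer Elasticsearch versions return) is an assumption that goes beyond the 6.x setup described here.

```python
def should_alert(payload, threshold=0):
    """Return True when the number of matching records exceeds threshold."""
    total = payload["hits"]["total"]
    # Elasticsearch 7+ reports total as {"value": n, "relation": ...};
    # the 6.x responses in this post use a plain integer.
    if isinstance(total, dict):
        total = total["value"]
    return total > threshold
```

Raising the threshold is one way to alert only when errors pile up rather than on every single hit.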

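3.5. Add "Actions"
"Actions" defines what should happen when the condition is met. There are several choices: Email, HTML email, Report, Console, Webhook, Slack, Elastic. An HTML email action is added by default; delete it if you don't need it.
I needed to send alert messages to our company's internal chat tool rather than by email, so I deleted the default email action and added a Webhook action, as follows: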


image.png
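
While tuning a Webhook action it can help to point it at a throwaway receiver first and look at exactly what arrives. Below is a minimal sketch using only the Python standard library; it is not part of Sentinl, and the names (HookHandler, start_receiver, received) and the default port 3000 are my own choices for illustration.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # raw bodies of every POST seen so far


class HookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes the client announced.
        length = int(self.headers.get("Content-Length", 0))
        raw = self.rfile.read(length).decode("utf-8")
        received.append(raw)
        print(self.path, raw)
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(b'{"ok": true}')

    def log_message(self, fmt, *args):
        pass  # silence the default per-request access log


def start_receiver(host="127.0.0.1", port=3000):
    """Start the receiver in a background thread and return the server object."""
    server = HTTPServer((host, port), HookHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Call start_receiver() in an interactive session, set the action's host and port to match, and each alert body is printed as it comes in; call shutdown() on the returned server when done.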

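That's the end of the part configured through the visual interface (honestly, this version's UI is not friendly). The rest is done through the advanced settings. Click here: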


image.png

image.png

It's irreversible. Fine, YES.
When you open the watcher again, the page looks different:


image.png

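Oh, one more thing: if you pick Watcher Advanced right at the start when clicking New, you get a blank page. When that leaves you baffled, just quietly create a Watcher Wizard instead; I have no fix for this plugin bug.
The configuration took me a long time, because I'm not very familiar with Elasticsearch syntax. The main difficulties were:

1. Configuring the message body, which has to pull out the content I want
2. The input part
3. The condition part
Below I go through my understanding of these parts. First, the complete configuration:

{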
  "actions": {
    "Webhook_5010c3fa-32d6-4cf1-8af0-087faef6a4e7": {
      "name": "Webhook-h",
      "throttle_period": "20s",
      "webhook": {
        "priority": "low",
        "stateless": false,
        "method": "POST",
        "host": "192.168.1.49",
        "port": "3000",
        "path": "/hooks/JfRf3TJhpfdBmKpv3/YcrNpCbv5XXXXXXXXXX",
        "body": "{\n  \"text\": \"watcher -- {{watcher.title}}\\nerrorCount -- {{payload.hits.total}}\\n{{#payload.hits.hits}} {{_source.springAppName}} -- {{_source.message}}\\n{{/payload.hits.hits}}\"\n}",
        "headers": {
          "Content-Type": "application/json;charset=UTF-8"
        }
      }
    }
  },
  "input": {
    "search": {
      "request": {
        "index": [
          "dev@work_order_service"
        ],
        "body": {
          "query": {
            "bool": {
              "must": [
                {
                  "match_all": {}
                },
                {
                  "match_phrase": {
                    "level": {
                      "query": "ERROR"
                    }
                  }
                },
                {
                  "exists": {
                    "field": "stack_trace"
                  }
                },
                {
                  "range": {
                    "@timestamp": {
                      "format": "epoch_millis",
                      "gte": "now-60m/m",
                      "lte": "now/m"
                    }
                  }
                }
              ],
              "filter": [],
              "should": [],
              "must_not": [
                {
                  "bool": {
                    "should": [
                      {
                        "match_phrase": {
                          "stack_trace": "com.shls.service.ServiceException"
                        }
                      },
                      {
                        "match_phrase": {
                          "stack_trace": "com.shls.db.service.ServiceException"
                        }
                      }
                    ],
                    "minimum_should_match": 1
                  }
                }
              ]
            }
          }
        }
      }
    }
  },
  "condition": {
    "script": {
      "script": "payload.hits.total > 0"
    }
  },
  "trigger": {
    "schedule": {
      "later": "every 20 second"
    }
  },
  "disable": true,
  "report": false,
  "title": "test_0408",
  "save_payload": false,
  "spy": false,
  "impersonate": false
}
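
The "body" value in the configuration above is JSON stored inside a JSON string, so every quote and line break is escaped by hand, which is easy to get wrong. One way to avoid that is to let a JSON library do the escaping. The following is a sketch of that idea, not something Sentinl provides; the helper names are hypothetical, and the Mustache tags are passed through untouched for Sentinl to fill in later.

```python
import json

# The message template written with real line breaks. The {{...}} tags are
# left as-is; they are only resolved later by the Mustache step.
text_template = (
    "watcher -- {{watcher.title}}\n"
    "errorCount -- {{payload.hits.total}}\n"
    "{{#payload.hits.hits}} {{_source.springAppName}} -- {{_source.message}}\n"
    "{{/payload.hits.hits}}"
)


def build_webhook_body(text):
    """Serialize {"text": ...}; this is the content of the action's body."""
    return json.dumps({"text": text}, indent=2)


def as_config_value(body):
    """Escape the body once more so it can be pasted as a JSON string literal."""
    return json.dumps(body)
```

Printing as_config_value(build_webhook_body(text_template)) gives a quoted string that can be pasted after "body": in the watcher configuration.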
  1. Message body (body) and trigger condition (condition)
    Here is my body:
    "body": "{\n \"text\": \"watcher -- {{watcher.title}}\\nerrorCount -- {{payload.hits.total}}\\n{{#payload.hits.hits}} {{_source.springAppName}} -- {{_source.message}}\\n{{/payload.hits.hits}}\"\n}",
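    Ignore the \n sequences; they are there because I want line breaks in the message. {{}} is Mustache syntax, similar to EL expressions; see the Mustache syntax documentation for details. It is fairly simple to use, and the part you will need most is the loop syntax.
    First you need to understand what payload is. You can think of it simply as a collection whose elements are the records that match our input conditions. What a record looks like depends on what your Logstash configuration sends over. Where can you see it? Here: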
    image.png

    The JSON view is clearer than the Table view. Mine looks like this:
{
  "_index": "dev@work_order_service",
  "_type": "doc",
  "_id": "DWO1C2oB4NvoQ2BimFUw",
  "_version": 1,
  "_score": null,
  "_source": {
    "level_value": 40000,
    "level": "ERROR",
    "logger_name": "com.shls.web.controller.GlobalExceptionHandler",
    "springAppName": "dev@work_order_service",
    "@timestamp": "2019-04-11T09:23:31.400Z",
    "port": 47450,
    "X-B3-SpanId": "f8d1aa1480dc2b35",
    "thread_name": "http-nio-8023-exec-7",
    "X-Span-Export": "true",
    "@version": "1",
    "stack_trace": "java.lang.ArithmeticException: / by zero\n\tat com.shls.web.controller.TestController.forLogger(TestController.java:167)\n\tat sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\n\tat sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)\n\tat sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat java.lang.reflect.Method.invoke(Method.java:498)\n\tat org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205)\n\tat org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:133)\n\tat org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:97)\n\tat org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:827)\n\tat org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:738)\n\tat org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:85)\n\tat org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:967)\n\tat org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:901)\n\tat org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:970)\n\tat org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:861)\n\tat javax.servlet.http.HttpServlet.service(HttpServlet.java:635)\n\tat org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:846)\n\tat javax.servlet.http.HttpServlet.service(HttpServlet.java:742)\n\tat org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231)\n\tat 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)\n\tat org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)\n\tat org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)\n\tat org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)\n\tat org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:317)\n\tat org.springframework.security.web.access.intercept.FilterSecurityInterceptor.invoke(FilterSecurityInterceptor.java:127)\n\tat org.springframework.security.web.access.intercept.FilterSecurityInterceptor.doFilter(FilterSecurityInterceptor.java:91)\n\tat org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)\n\tat org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:114)\n\tat org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)\n\tat org.springframework.security.web.session.SessionManagementFilter.doFilter(SessionManagementFilter.java:137)\n\tat org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)\n\tat org.springframework.security.web.authentication.AnonymousAuthenticationFilter.doFilter(AnonymousAuthenticationFilter.java:111)\n\tat org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)\n\tat org.springframework.security.web.servletapi.SecurityContextHolderAwareRequestFilter.doFilter(SecurityContextHolderAwareRequestFilter.java:170)\n\tat org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)\n\tat org.springframework.security.web.savedrequest.RequestCacheAwareFilter.doFilter(RequestCacheAwareFilter.java:63)\n\tat 
org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)\n\tat com.shls.config.security.JWTAuthenticationFilter.doFilterInternal(JWTAuthenticationFilter.java:38)\n\tat org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)\n\tat org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)\n\tat org.springframework.security.web.authentication.logout.LogoutFilter.doFilter(LogoutFilter.java:116)\n\tat org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)\n\tat org.springframework.web.filter.CorsFilter.doFilterInternal(CorsFilter.java:96)\n\tat org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)\n\tat org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)\n\tat org.springframework.security.web.header.HeaderWriterFilter.doFilterInternal(HeaderWriterFilter.java:64)\n\tat org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)\n\tat org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)\n\tat org.springframework.security.web.context.SecurityContextPersistenceFilter.doFilter(SecurityContextPersistenceFilter.java:105)\n\tat org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)\n\tat org.springframework.security.web.context.request.async.WebAsyncManagerIntegrationFilter.doFilterInternal(WebAsyncManagerIntegrationFilter.java:56)\n\tat org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)\n\tat org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)\n\tat org.springframework.security.web.FilterChainProxy.doFilterInternal(FilterChainProxy.java:214)\n\tat 
org.springframework.security.web.FilterChainProxy.doFilter(FilterChainProxy.java:177)\n\tat org.springframework.web.filter.DelegatingFilterProxy.invokeDelegate(DelegatingFilterProxy.java:347)\n\tat org.springframework.web.filter.DelegatingFilterProxy.doFilter(DelegatingFilterProxy.java:263)\n\tat org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)\n\tat org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)\n\tat org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:99)\n\tat org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)\n\tat org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)\n\tat org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)\n\tat org.springframework.web.filter.HttpPutFormContentFilter.doFilterInternal(HttpPutFormContentFilter.java:108)\n\tat org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)\n\tat org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)\n\tat org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)\n\tat org.springframework.web.filter.HiddenHttpMethodFilter.doFilterInternal(HiddenHttpMethodFilter.java:81)\n\tat org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)\n\tat org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)\n\tat org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)\n\tat org.springframework.cloud.sleuth.instrument.web.TraceFilter.doFilter(TraceFilter.java:166)\n\tat org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)\n\tat 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)\n\tat org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:197)\n\tat org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)\n\tat org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)\n\tat org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)\n\tat org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:199)\n\tat org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96)\n\tat org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:478)\n\tat org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:140)\n\tat org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:81)\n\tat org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:87)\n\tat org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:342)\n\tat org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:803)\n\tat org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66)\n\tat org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:868)\n\tat org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1459)\n\tat org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\tat org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)\n\tat java.lang.Thread.run(Thread.java:748)\n",
    "message": "/ by zero",
    "host": "192.168.13.243",
    "X-B3-TraceId": "f8d1aa1480dc2b35"
  },
  "fields": {
    "@timestamp": [
      "2019-04-11T09:23:31.400Z"
    ]
  },
  "sort": [
    1554974611400
  ]
}

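{{watcher.title}}: I haven't looked into what properties watcher has because I didn't really need them; interested readers can explore on their own.
{{payload.hits.total}}: payload deserves a careful look. The official docs left me dizzy because at first I didn't understand what payload was; once that clicked, everything got much easier. payload.hits.total is the number of matching records, and payload.hits.hits is the collection of matching records. So when we iterate with {{#payload.hits.hits}} XXX {{/payload.hits.hits}}, we get each JSON record in turn. Replace XXX with whatever you need: for example, {{_source.springAppName}} yields "dev@work_order_service", {{_source.message}} yields "/ by zero", {{fields.@timestamp}} yields "2019-04-11T09:23:31.400Z", and so on.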
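
To make the loop concrete, here is a toy renderer that handles just the two Mustache features used in my body: dotted variables and {{#name}}...{{/name}} sections over a list. It is a simplified sketch written for this post, not the Mustache implementation Sentinl uses; for instance it does no HTML escaping, and the function names are my own.

```python
import re

SECTION_RE = re.compile(r"\{\{#([^}]+)\}\}(.*?)\{\{/\1\}\}", re.S)
VARIABLE_RE = re.compile(r"\{\{([^#/}][^}]*)\}\}")


def lookup(context_stack, dotted_name):
    """Resolve a.b.c against the innermost context that knows the first part."""
    first, *rest = dotted_name.split(".")
    for context in reversed(context_stack):
        if isinstance(context, dict) and first in context:
            value = context[first]
            for part in rest:
                value = value.get(part, "") if isinstance(value, dict) else ""
            return value
    return ""


def render(template, context_stack):
    """Expand sections first (once per list item), then plain variables."""
    def expand_section(match):
        name, inner = match.group(1), match.group(2)
        value = lookup(context_stack, name)
        items = value if isinstance(value, list) else ([value] if value else [])
        # Each item becomes the innermost context while its copy is rendered.
        return "".join(render(inner, context_stack + [item]) for item in items)

    without_sections = SECTION_RE.sub(expand_section, template)
    return VARIABLE_RE.sub(
        lambda m: str(lookup(context_stack, m.group(1).strip())), without_sections
    )
```

Rendering the template from my body against a context that holds a watcher title and a payload with one hit produces one "springAppName -- message" line per element of payload.hits.hits, which is exactly what the webhook receives.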

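  2. The input part
    Because I really don't know the query syntax, this was painful. I read the official docs and blog posts, and the examples all looked alike, yet sometimes they worked and sometimes they didn't. After trying many approaches I finally found one that works really well: look at the requests Kibana itself sends!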
    image.png

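    Use Kibana's visual interface to filter down to the logs you want to see, which I'm sure you already know how to do. Put your filter conditions in, then look at the request parameters. Take the second one and format the JSON string with any tool you like: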
    image.png
"query": {
        "bool": {
            "must": [{
                "match_all": {}
            }, {
                "range": {
                    "@timestamp": {
                        "gte": 1554912000000,
                        "lte": 1554998399999,
                        "format": "epoch_millis"
                    }
                }
            }],
            "filter": [],
            "should": [],
            "must_not": [{
                "bool": {
                    "should": [{
                        "match_phrase": {
                            "message": "/ by zero"
                        }
                    }],
                    "minimum_should_match": 1
                }
            }]
        }
}

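Then copy this whole query into your own watcher, and you're done. If one condition isn't enough and you need several, send a request for each condition and copy over the part that the new condition adds. That's all, and the success rate turned out to be 100%; compared with my repeated failures when working out the syntax by myself, I could have cried. Note that the time range Kibana sends uses fixed epoch timestamps, whereas my watcher uses relative values such as now-60m/m, as in the complete configuration above. Studying the syntax yourself is of course very rewarding, but if you're short on time, this approach is quick and easy.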
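
The copy-and-combine step can also be scripted. The sketch below merges the bool clauses of two queries captured from Kibana and swaps the fixed epoch bounds for relative date math like the now-60m/m used in my configuration. Both helper names are hypothetical, and the merge simply concatenates clause lists, which matches how I combined conditions by hand but is not a general-purpose query merger.

```python
import copy


def merge_bool_queries(base, extra):
    """Return a copy of base whose bool clauses also contain those of extra."""
    merged = copy.deepcopy(base)
    for clause in ("must", "filter", "should", "must_not"):
        target = merged["bool"].setdefault(clause, [])
        for item in extra["bool"].get(clause, []):
            # Skip exact duplicates such as match_all or an identical range.
            if item not in target:
                target.append(copy.deepcopy(item))
    return merged


def use_relative_range(query, field="@timestamp",
                       lookback="now-60m/m", until="now/m"):
    """Replace fixed bounds copied from Kibana with relative date math."""
    result = copy.deepcopy(query)
    for item in result["bool"].get("must", []):
        if "range" in item and field in item["range"]:
            item["range"][field]["gte"] = lookback
            item["range"][field]["lte"] = until
    return result
```

Both functions return new dictionaries, so the captured queries stay unchanged and can be merged again in a different combination.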

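I haven't set up email yet. Broadly, it should only need the Action part to be configured, with everything else unchanged; since I haven't tried it myself, I won't copy other people's instructions here, so search for other posts if you need it. One thing I'm still unsure about is how the several time intervals involved interact: the watcher's trigger.schedule, the query's range, and the action's throttle_period. Each is easy to understand alone, but how to configure them so that alerts never repeat content while no record that should be counted is missed is something I haven't tested carefully yet. If you know, please leave a comment. Thanks!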
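
As a starting point for that open question, some arithmetic on the values in my configuration: with a 60-minute look-back and a run every 20 seconds, one error record is matched by roughly 180 consecutive runs. The sketch below computes that number and also shows the alternative of making the look-back equal to the schedule so the query windows tile without overlap. This is untested reasoning rather than verified Sentinl behavior: it assumes runs fire exactly on schedule, and the function names are my own.

```python
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def to_seconds(period):
    """Convert a period such as "20s" or "60m" to seconds."""
    return int(period[:-1]) * UNIT_SECONDS[period[-1]]


def runs_matching_same_record(lookback, schedule):
    """How many consecutive runs see one record when each looks back `lookback`."""
    return to_seconds(lookback) // to_seconds(schedule)


def windows(start, end, step):
    """Yield non-overlapping [gte, lt) windows that tile start..end."""
    for gte in range(start, end, step):
        yield gte, min(gte + step, end)
```

With windows that tile, each timestamp falls into exactly one window, so a record would be reported once; in practice a run that fires late or is skipped would leave a gap, so a small overlap plus de-duplication on the receiving side may be the safer trade-off.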
