实践elk

Elastic, The Open Source Elastic Stack, Reliably and securely take data from any source, in any format, and search, analyze, and visualize it in real time.

ELK工作原理

ElasticSearch

ElasticSearch

1
$ elasticsearch -d

ElasticSearch

Logstash

Logstash

Logstash Docs

Logstash工作原理

help

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
$ bin/logstash --help
ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console.
Usage:
    bin/logstash [OPTIONS]

Options:
    -n, --node.name NAME          Specify the name of this logstash instance, if no value is given
                                  it will default to the current hostname.
                                   (default: "angi-deepin")
    -f, --path.config CONFIG_PATH Load the logstash config from a specific file
                                  or directory.  If a directory is given, all
                                  files in that directory will be concatenated
                                  in lexicographical order and then parsed as a
                                  single config file. You can also specify
                                  wildcards (globs) and any matched files will
                                  be loaded in the order described above.
    -e, --config.string CONFIG_STRING Use the given string as the configuration
                                  data. Same syntax as the config file. If no
                                  input is specified, then the following is
                                  used as the default input:
                                  "input { stdin { type => stdin } }"
                                  and if no output is specified, then the
                                  following is used as the default output:
                                  "output { stdout { codec => rubydebug } }"
                                  If you wish to use both defaults, please use
                                  the empty string for the '-e' flag.
                                   (default: nil)
    --modules MODULES             Load Logstash modules.
                                  Modules can be defined using multiple instances
                                  '--modules module1 --modules module2',
                                     or comma-separated syntax
                                  '--modules=module1,module2'
                                  Cannot be used in conjunction with '-e' or '-f'
                                  Use of '--modules' will override modules declared
                                  in the 'logstash.yml' file.
    -M, --modules.variable MODULES_VARIABLE Load variables for module template.
                                  Multiple instances of '-M' or
                                  '--modules.variable' are supported.
                                  Ignored if '--modules' flag is not used.
                                  Should be in the format of
                                  '-M "MODULE_NAME.var.PLUGIN_TYPE.PLUGIN_NAME.VARIABLE_NAME=VALUE"'
                                  as in
                                  '-M "example.var.filter.mutate.fieldname=fieldvalue"'
    -w, --pipeline.workers COUNT  Sets the number of pipeline workers to run.
                                   (default: 8)
    -b, --pipeline.batch.size SIZE Size of batches the pipeline is to work in.
                                   (default: 125)
    -u, --pipeline.batch.delay DELAY_IN_MS When creating pipeline batches, how long to wait while polling
                                  for the next event.
                                   (default: 5)
    --pipeline.unsafe_shutdown    Force logstash to exit during shutdown even
                                  if there are still inflight events in memory.
                                  By default, logstash will refuse to quit until all
                                  received events have been pushed to the outputs.
                                   (default: false)
    --path.data PATH              This should point to a writable directory. Logstash
                                  will use this directory whenever it needs to store
                                  data. Plugins will also have access to this path.
                                   (default: "/home/angi/software/logstash/data")
    -p, --path.plugins PATH       A path of where to find plugins. This flag
                                  can be given multiple times to include
                                  multiple paths. Plugins are expected to be
                                  in a specific directory hierarchy:
                                  'PATH/logstash/TYPE/NAME.rb' where TYPE is
                                  'inputs' 'filters', 'outputs' or 'codecs'
                                  and NAME is the name of the plugin.
                                   (default: [])
    -l, --path.logs PATH          Write logstash internal logs to the given
                                  file. Without this flag, logstash will emit
                                  logs to standard output.
                                   (default: "/home/angi/software/logstash/logs")
    --log.level LEVEL             Set the log level for logstash. Possible values are:
                                    - fatal
                                    - error
                                    - warn
                                    - info
                                    - debug
                                    - trace
                                   (default: "info")
    --config.debug                Print the compiled config ruby code out as a debug log (you must also have --log.level=debug enabled).
                                  WARNING: This will include any 'password' options passed to plugin configs as plaintext, and may result
                                  in plaintext passwords appearing in your logs!
                                   (default: false)
    -i, --interactive SHELL       Drop to shell instead of running as normal.
                                  Valid shells are "irb" and "pry"
    -V, --version                 Emit the version of logstash and its friends,
                                  then exit.
    -t, --config.test_and_exit    Check configuration for valid syntax and then exit.
                                   (default: false)
    -r, --config.reload.automatic Monitor configuration changes and reload
                                  whenever it is changed.
                                  NOTE: use SIGHUP to manually reload the config
                                   (default: false)
    --config.reload.interval RELOAD_INTERVAL How frequently to poll the configuration location
                                  for changes, in seconds.
                                   (default: 3)
    --http.host HTTP_HOST         Web API binding host (default: "127.0.0.1")
    --http.port HTTP_PORT         Web API http port (default: 9600..9700)
    --log.format FORMAT           Specify if Logstash should write its own logs in JSON form (one
                                  event per line) or in plain text (using Ruby's Object#inspect)
                                   (default: "plain")
    --path.settings SETTINGS_DIR  Directory containing logstash.yml file. This can also be
                                  set through the LS_SETTINGS_DIR environment variable.
                                   (default: "/home/angi/software/logstash/config")
    --verbose                     Set the log level to info.
                                  DEPRECATED: use --log.level=info instead.
    --debug                       Set the log level to debug.
                                  DEPRECATED: use --log.level=debug instead.
    --quiet                       Set the log level to info.
                                  DEPRECATED: use --log.level=quiet instead.
    -h, --help                    print help
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
# 控制台输入-》控制台编码输出
$ bin/logstash -e ""
ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console.
Sending Logstash's logs to /home/angi/software/logstash/logs which is now configured via log4j2.properties
[2017-11-07T13:31:07,061][INFO ][logstash.pipeline        ] Starting pipeline {"id"=>"main", "pipeline.workers"=>8, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>1000}
[2017-11-07T13:31:07,075][INFO ][logstash.pipeline        ] Pipeline main started
The stdin plugin is now waiting for input:
[2017-11-07T13:31:07,126][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
helo
{
    "@timestamp" => 2017-11-07T05:34:13.765Z,
      "@version" => "1",
          "host" => "angi-deepin",
       "message" => "helo",
          "type" => "stdin"
}
1
2
# 指定配置文件
$ bin/logstash -f config/logstash.conf

plugins

1
2
3
4
$ bin/logstash-plugin install <plugin-name>
$ bin/logstash-plugin list
$ bin/logstash-plugin update
$ bin/logstash-plugin remove <plugin-name>

input

log4j input plugin

log4j input plugin,基于tcp协议传输日志(org.apache.log4j.spi.LoggingEvent),已不被推荐,使用filebeat替代。

logstash.conf(log4j plugin):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
input {
  # For detail config for log4j as input,
  # See: https://www.elastic.co/guide/en/logstash/current/plugins-inputs-log4j.html
  log4j {
    mode => "server"
    host => "localhost"
    port => 4567
  }
}

# The filter part of this file is commented out to indicate that it is
# optional.
# filter {
#
# }

output {
  elasticsearch {
    hosts => "localhost:9200"
    action => "index"
    index => "applog"
  }
}

log4j.properties

1
2
3
4
5
6
7
8
9
10
11
12
13
log4j.rootLogger=INFO,socket,console

# appender socket
log4j.appender.socket=org.apache.log4j.net.SocketAppender
log4j.appender.socket.Port=4567
log4j.appender.socket.RemoteHost=localhost
log4j.appender.socket.ReconnectionDelay=10000

# appender console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.out
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d %m\n

filter

grok

grok parse arbitrary text and structure it. Grok works by combining text patterns into something that matches your logs.

The syntax for a grok pattern is %{SYNTAX:SEMANTIC:DATA_TYPE}, SYNTAX is the name of the pattern (shipped pattern),SEMANTIC is the identifier or field_name, DATA_TYPE is the object data type that conversion from string to.

custom pattern
  • (?<field_name>pattern)

    配置:

1
2
3
  grok {
     match => { "message" => "(?<duration>[0-9A-F]{1,3}) (?<client>[0-9A-F]{1,3})" }
  }
结果:
1
2
3
4
5
6
7
8
9
10
123 456
{
      "duration" => "123",
    "@timestamp" => 2017-11-08T03:46:34.217Z,
      "@version" => "1",
          "host" => "angi-deepin",
        "client" => "456",
       "message" => "123 456",
          "type" => "stdin"
}
  • patterns_dir

    logstash根目录新建pattern目录,pattern目录下新建文件,例如:custom-pattern,内容格式:<pattern_name> <pattern>,例如:

    1
    2
    STR [0-9A-F]{1,3}
    STR6 [0-9A-F]{1,6}
    

    配置:

    1
    2
    3
    4
      grok {
         patterns_dir => ["./patterns"]
         match => { "message" => "%{STR:duration} %{STR:client}" }
      }
    

    结果:

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    123 456
    {
          "duration" => "123",
        "@timestamp" => 2017-11-09T01:46:06.025Z,
          "@version" => "1",
              "host" => "angi-deepin",
            "client" => "456",
           "message" => "123 456",
              "type" => "stdin"
    }
    
    

    这种形式自定义的pattern可以被复用。

  • pattern_definitions

    pattern_definitions值类型为hash,key为pattern_name,value为pattern。

    配置:

    1
    2
    3
    4
    grok {
         pattern_definitions => { "STR9" => "[0-9A-F]{1,9}"}
         match => { "message" => "%{STR9:duration} %{STR9:client}" }
      }
    

    结果:

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    123456789 987654321
    {
          "duration" => "123456789",
        "@timestamp" => 2017-11-09T01:56:41.810Z,
          "@version" => "1",
              "host" => "angi-deepin",
            "client" => "987654321",
           "message" => "123456789 987654321",
              "type" => "stdin"
    }
    

    这种形式自定义的pattern只有当前grok可用

GEO IP

fingerprint

mutate

date

match

output

Beats

Beats

Beats Docs

Filebeat

轻量型日志采集器,当您要面对成百上千、甚至成千上万的服务器、虚拟机和容器生成的日志时,请告别 SSH 吧。Filebeat 将为您提供一种轻量型方法,用于转发和汇总日志与文件,让简单的事情不再繁杂。

Filebeat Docs

filebeat.yml

1
2
3
4
5
6
filebeat.prospectors:
- input_type: log
  paths:
    - /app/log/*.log
output.logstash:
  hosts: ["127.0.0.1:5044"]
1
2
3
4
5
6
7
8
9
10
11
12
$ ./import_dashboards -only-index
Created temporary directory /tmp/tmp566769272
Downloading https://artifacts.elastic.co/downloads/beats/beats-dashboards/beats-dashboards-5.5.2.zip
Unzip archive /tmp/tmp566769272
Importing Kibana from /tmp/tmp566769272/beats-dashboards-5.5.2/filebeat
Import directory /tmp/tmp566769272/beats-dashboards-5.5.2/filebeat/index-pattern
Import index to /.kibana/index-pattern/filebeat-* from /tmp/tmp566769272/beats-dashboards-5.5.2/filebeat/index-pattern/filebeat.json

Importing Kibana from /tmp/tmp566769272/beats-dashboards-5.5.2/heartbeat
Importing Kibana from /tmp/tmp566769272/beats-dashboards-5.5.2/metricbeat
Importing Kibana from /tmp/tmp566769272/beats-dashboards-5.5.2/packetbeat
Importing Kibana from /tmp/tmp566769272/beats-dashboards-5.5.2/winlogbeat 
1
$ filebeat start

logstash.conf(filebeat):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
input {
  # for Filebeat
  beats {
    port => 5044
  }
}

# The filter part of this file is commented out to indicate that it is
# optional.
# filter {
#
# }

output {
  elasticsearch {
    hosts => "localhost:9200"
    manage_template => false
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}" 
    document_type => "%{[@metadata][type]}" 
  }
}

Metricbeat

Metricbeat,轻量型指标采集器,用于从系统和服务收集指标。从 CPU 到内存,从 Redis 到 Nginx,Metricbeat 能够以一种轻量型的方式,输送各种系统和服务统计数据。

Metricbeat Docs

Packetbeat

Packetbeat,轻量型网络数据采集器,用于深挖网线上传输的数据,了解应用程序动态。Packetbeat 是一款轻量型网络数据包分析器,能够将数据发送至 Logstash 或 Elasticsearch。

Packetbeat Docs

Winlogbeat

轻量型 Windows 事件日志采集器,用于密切监控基于 Windows 的基础架构上发生的事件。Winlogbeat 能够以一种轻量型的方式,将 Windows 事件日志实时地流式传输至 Elasticsearch 和 Logstash。

Heartbeat

Heartbeat,轻量型运行时间监控采集器,通过主动探测来监控服务可用性。Heartbeat 面对一系列 URL 时会问这个简单的问题:你还活着吗?Heartbeat 会将这些信息和响应时间输送至 Elastic Stack 的其他部分,以做进一步分析。

Heartbeat Docs

Kibana

Kibana

config/kibana.yml

1
2
3
4
# server.port: 5601
# server.host: "localhost"
# elasticsearch.url: http://localhost:9200
# kibana.index: ".kibana"
1
$ kibana

Kibana

X-Pack

参考资料

Elastic Stack Getting Started