Docker - Building a log monitoring system

Centralized log collection tools commonly used in projects:

  • Logstash

Logstash is an open source data collection engine with real-time pipelining capabilities. It can dynamically unify data from disparate sources and normalize that data into the destinations of your choice.

  • Advantages

    Logstash's main strength is its flexibility, which comes from its large number of plugins. Its detailed documentation and straightforward configuration format also make it applicable in a wide variety of scenarios. As a result, plenty of resources are available online, covering almost any problem.

  • Disadvantages

    Logstash's critical weakness is its performance and resource consumption (the default heap size is 1GB). Although its performance has improved considerably in recent years, it is still much slower than the alternatives, as comparisons of Logstash with rsyslog and of Logstash with Filebeat show. This can be a problem when handling large volumes of data.

  • Filebeat

As a member of the Beats family, Filebeat is a lightweight log shipper that exists to make up for Logstash's shortcomings: as a lightweight shipper, it pushes logs to a central Logstash.

  • Advantages

    Filebeat is just a single binary with no dependencies. It uses very few resources and, although it is still young, it is highly reliable precisely because it is simple and there is little that can go wrong. It also offers plenty of tunable settings, for example how it searches for new files and when it closes the file handle of a file that has not changed for a while.

  • Disadvantages

    Filebeat's scope is very limited, so in some scenarios the problem has to be solved elsewhere. For example, if Logstash is used as the downstream pipeline, the same performance problems remain. Because of this, Filebeat's scope is expanding. Initially it could only send logs to Logstash and Elasticsearch; it can now also send logs to Kafka and Redis, and in the 5.x versions it gained filtering capabilities.

  • Fluentd (supported as a Docker logging driver)

Fluentd was created around the idea of logging in JSON wherever possible, so that log shippers and the pipeline downstream of them do not have to guess the type of each field within a substring. To that end, it provides libraries for almost every language, which also means it can be plugged into custom applications.

  • Advantages

    Like most Logstash plugins, Fluentd plugins are written in Ruby and are very easy to write and maintain. There are therefore a great many of them, and almost every source and destination has a plugin (although the maturity of the plugins varies). This also means Fluentd can be used to connect almost anything to anything.

  • Disadvantages

    Because Fluentd receives structured data in most scenarios, it is not very flexible, although regular expressions can still be used to parse unstructured data. Performance is good in most scenarios, but it is not the best: as with syslog-ng, buffers exist only on the output side, and the single-threaded core together with the Ruby GIL in plugins limits its performance on large nodes. Its resource consumption, however, is acceptable in most scenarios. For small or embedded devices, consider Fluent Bit, whose relationship to Fluentd is similar to that of Filebeat to Logstash.
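
The idea of logging in JSON, mentioned in the Fluentd description above, can be sketched from the application side. This is a minimal illustration using only the Python standard library, not Fluentd's own client libraries. The names JsonFormatter, build_logger and the "fields" key are hypothetical and chosen only for this example.

```python
import json
import logging
import sys
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        payload = {
            "time": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Extra fields keep their native JSON types (int, bool, ...),
        # so nothing downstream has to infer them from a substring.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload)


def build_logger(name="app", stream=None):
    """Return a logger that writes JSON lines to `stream` (stdout by default)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger
```

A call such as build_logger().info("request served", extra={"fields": {"status": 200}}) writes a single JSON line in which status stays a number. How that line reaches Fluentd (logging driver, file tailing, or a client library) is outside this sketch.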

Building an EFK collection center with Docker Compose

  1. Create docker-compose.yml

Create a new efk directory, change into it, and create docker-compose.yml there:

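services:
  web:
    image: httpd
    ports:
      - "80:80"
    links:
      - fluentd
    logging:
      driver: "fluentd"
      options:
        tag: httpd.access
  fluentd:
    build: ./fluentd
    volumes:
      - ./fluentd/conf:/fluentd/etc
    links:
      - "elasticsearch"
    ports:
      - "24224:24224"
      - "24224:24224/udp"
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.10.2  # assumed image and tag
    environment:
      - "discovery.type=single-node"
    expose:
      - "9200"
    ports:
      - "9200:9200"
  kibana:
    image: docker.elastic.co/kibana/kibana:7.10.2  # assumed image and tag
    links:
      - "elasticsearch"
    ports:
      - "5601:5601"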

  2. Create the fluentd image, configuration, and plugins

Create fluentd/Dockerfile:

FROM fluent/fluentd:v1.12.0-debian-1.0
USER root
RUN ["gem", "install", "fluent-plugin-elasticsearch", "--no-document", "--version", "4.3.3"]
USER fluent

Create fluentd/conf/fluent.conf:

<source>
  @type forward
  port 24224
</source>
<match *.**>
  @type copy
  <store>
    @type elasticsearch
    host elasticsearch
    port 9200
    logstash_format true
    logstash_prefix fluentd
    logstash_dateformat %Y%m%d
    include_tag_key true
    type_name access_log
    tag_key @log_name
    flush_interval 1s
  </store>
  <store>
    @type stdout
  </store>
</match>

  3. Start the services

docker-compose up

  4. Send several requests to the httpd service to generate logs

$ curl localhost:80
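
To produce more than a handful of log lines, the curl call above can be repeated in a loop. The following is a minimal sketch using only the Python standard library; the function name send_requests and its opener parameter are hypothetical, and the URL assumes the httpd port mapping used in this walkthrough.

```python
from urllib.error import HTTPError
from urllib.request import urlopen


def send_requests(url, count, opener=urlopen):
    """Send `count` GET requests to `url` and return the status of each one.

    A status of None means the request never reached the server
    (for example, the container is not running yet).
    """
    statuses = []
    for _ in range(count):
        try:
            with opener(url, timeout=5) as response:
                statuses.append(response.status)
        except HTTPError as error:
            # 4xx/5xx responses still produce an access log entry
            statuses.append(error.code)
        except OSError:
            # Connection refused, timeout, DNS failure, ...
            statuses.append(None)
    return statuses
```

For example, send_requests("http://localhost:80", 20) issues twenty requests, each of which should appear as one access log line once the stack is running.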

  5. Verify log collection

Open http://localhost:5601 in a browser.

On first use, create the fluentd-* index pattern.

You can now see that the logs generated by httpd have been collected.
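
The same check can be made without Kibana by asking Elasticsearch directly. The sketch below assumes Elasticsearch is reachable on localhost:9200, as in the port mapping used in this walkthrough, and uses its _cat/indices API with format=json. The function names are hypothetical.

```python
import json
from urllib.request import urlopen


def parse_cat_indices(payload):
    """Map index name to document count from a _cat/indices?format=json body."""
    counts = {}
    for entry in json.loads(payload):
        # docs.count arrives as a string and may be null (e.g. closed indices)
        counts[entry["index"]] = int(entry.get("docs.count") or 0)
    return counts


def fluentd_index_counts(base_url="http://localhost:9200", pattern="fluentd-*"):
    """Ask Elasticsearch which indices match `pattern` and how many docs each holds."""
    url = f"{base_url}/_cat/indices/{pattern}?format=json"
    with urlopen(url, timeout=5) as response:
        return parse_cat_indices(response.read().decode("utf-8"))
```

With logstash_format enabled and the prefix set to fluentd, the index names are expected to carry a date suffix such as fluentd-20210101. An empty result would suggest that nothing has been indexed yet.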



Key points when collecting logs with Fluentd

  1. How to specify the fluentd logging driver

  • Modify daemon.json (global)

  • Per container

    # Add these options when starting the container
    --log-driver=fluentd --log-opt fluentd-address=localhost:24224
    # Note: if the fluentd service is down at this point, the container fails to start unless an additional option is supplied at startup
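
Both variants above can be sketched as follows. The helper names daemon_config and docker_run_args are hypothetical. The keys log-driver and log-opts and the option fluentd-address come from Docker's logging configuration. The fluentd-async option is an assumption about what the note above refers to: recent Docker releases document it as a way to let a container start while Fluentd is unreachable, and older releases used fluentd-async-connect.

```python
import json


def daemon_config(address="localhost:24224"):
    """Return daemon.json text that makes fluentd the default log driver."""
    config = {
        "log-driver": "fluentd",
        "log-opts": {"fluentd-address": address},
    }
    return json.dumps(config, indent=2)


def docker_run_args(image, address="localhost:24224", tag=None, async_connect=False):
    """Build the `docker run` argument list for a single container."""
    args = [
        "docker", "run",
        "--log-driver=fluentd",
        "--log-opt", f"fluentd-address={address}",
    ]
    if tag:
        args += ["--log-opt", f"tag={tag}"]
    if async_connect:
        # Assumption: recent Docker releases; older ones used fluentd-async-connect
        args += ["--log-opt", "fluentd-async=true"]
    args.append(image)
    return args
```

Printing daemon_config() shows the fragment to merge into daemon.json (commonly /etc/docker/daemon.json on Linux); the Docker daemon has to be restarted before the global setting applies, and it affects only containers created afterwards. The list returned by docker_run_args("httpd", tag="httpd.access") can be passed to subprocess.run as is.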


