Fluentd - RubyKansai 65

  • date post

    16-Jul-2015
  • Category

    Technology

Transcript of Fluentd - RubyKansai 65

  • Masahiro Nakagawa, Feb 21, 2015

    RubyKansai #65

    Fluentd: Unified logging layer

  • Who are you?

    > Masahiro Nakagawa
    > github/twitter: @repeatedly

    > Treasure Data, Inc.
    > Senior Software Engineer
    > Fluentd / td-agent developer

    > Living at OSS :)
    > D language - Phobos committer
    > Fluentd - Main maintainer
    > MessagePack / RPC - D and Python (only RPC)
    > The organizer of several meetups (Presto, DTM, etc)
    > etc

  • Structured logging

    Reliable forwarding

    Pluggable architecture

    http://fluentd.org/

  • What's Fluentd?

    > Data collector for unified logging layer
    > Streaming data transfer based on JSON
    > Written in Ruby

    > Various gem-based plugins
    > http://www.fluentd.org/plugins

    > Working in production
    > http://www.fluentd.org/testimonials

  • Background

  • Data Analytics Flow

    Collect Store Process Visualize

    Data source

    Reporting Monitoring

  • Data Analytics Flow

    Store Process

    Cloudera

    Horton Works

    Treasure Data

    Collect Visualize

    Tableau

    Excel

    R

    easier & shorter time

    ???

  • TD Service Architecture

    Time to Value

    Send query result Result Push

    Acquire Analyze Store

    Plazma DB: Flexible, Scalable, Columnar Storage

    Web Log

    App Log

    Sensor

    CRM

    ERP

    RDBMS

    Treasure Agent (Server), SDK (JS, Android, iOS, Unity)

    Streaming Collector

    Batch / Reliability

    Ad-hoc / Low latency

    KPI$

    KPI Dashboard

    BI Tools

    Other Products

    RDBMS, Google Docs, AWS S3, FTP Server, etc.

    Metric Insights

    Tableau, Motion Board, etc.

    POS

    REST API, ODBC / JDBC, SQL, Pig

    Bulk Uploader

    Embulk, TD Toolbelt

    SQL-based query

    @AWS or @IDCF

    Connectivity

    Economy & Flexibility Simple & Supported

  • Dive into

  • Divide & Conquer & Retry

    error retry

    error retry retry

    retry

    Batch

    Stream

    Other stream
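
The idea on this slide can be sketched in plain Ruby: split the stream into small chunks and retry only the chunk that failed, so one error does not force the whole batch to be resent. This is a minimal illustration, not Fluentd's implementation; `flush_in_chunks`, the chunk size and the retry limit are hypothetical names and values chosen for the example.

```ruby
# Minimal sketch (not Fluentd code): split a stream into chunks and retry
# only the chunk that failed. All names here are hypothetical.
def flush_in_chunks(events, chunk_size:, max_retries:)
  failed = []
  events.each_slice(chunk_size) do |chunk|
    attempts = 0
    begin
      attempts += 1
      yield chunk, attempts          # the caller's "write" step; may raise
    rescue StandardError
      retry if attempts <= max_retries
      failed << chunk                # give up on this chunk only
    end
  end
  failed
end

written = []
failed = flush_in_chunks((1..6).to_a, chunk_size: 3, max_retries: 2) do |chunk, attempt|
  # Simulate a destination that rejects the second chunk on its first attempt.
  raise IOError, "temporary failure" if chunk.first == 4 && attempt == 1
  written << chunk
end

puts "written: #{written.inspect}"
puts "failed:  #{failed.inspect}"
```

The first chunk is written once and is never resent, even though the second chunk needed a retry.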

  • Application

    Server2

    Application

    Server3

    Application

    Server1

    Fluent

    Log Server

    High latency! Must wait for a day...

    Before

  • Application

    Server2

    Application

    Server3

    Application

    Server1

    Fluentd Fluentd Fluentd

    Fluentd Fluentd

    In streaming!

    After

  • Core

    > Divide & Conquer
    > Buffering & Retrying
    > Error handling
    > Message routing
    > Parallelism

    Plugins

    > read / receive data
    > from API, database, command, etc
    > write / send data
    > to API, database, alert, graph, etc

  • Apache to Mongo

    tail

    insert

    event buffering

    127.0.0.1 - - [11/Dec/2012:07:26:27] "GET / ...
    127.0.0.1 - - [11/Dec/2012:07:26:30] "GET / ...
    127.0.0.1 - - [11/Dec/2012:07:26:32] "GET / ...
    127.0.0.1 - - [11/Dec/2012:07:26:40] "GET / ...
    127.0.0.1 - - [11/Dec/2012:07:27:01] "GET / ...

    ...

    Fluentd

    Web Server

    2012-02-04 01:33:51

    apache.log

    {
      "host": "127.0.0.1",
      "method": "GET",
      ...
    }
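
The tail-and-insert flow above turns each text line into an event made of a time, a tag and a record. A minimal Ruby sketch of that step follows, assuming a simplified log pattern. `LINE_PATTERN`, `parse_line` and the complete sample line are hypothetical (the slide's lines are truncated), the timestamp is assumed to be UTC, and Fluentd's real apache parser handles more fields than this.

```ruby
require "time"

# Hypothetical, simplified pattern: host, two ignored fields, [time],
# "METHOD PATH ...", status code.
LINE_PATTERN = /\A(?<host>\S+) \S+ \S+ \[(?<time>[^\]]+)\] "(?<method>\S+) (?<path>\S+)[^"]*" (?<code>\d+)/

# Turn one log line into a [tag, time, record] triple, or nil if it does not match.
def parse_line(tag, line)
  m = LINE_PATTERN.match(line)
  return nil unless m

  # Assumption: the timestamp carries no zone, so it is read as UTC.
  time = Time.strptime("#{m[:time]} +0000", "%d/%b/%Y:%H:%M:%S %z").to_i
  record = {
    "host" => m[:host],
    "method" => m[:method],
    "path" => m[:path],
    "code" => m[:code].to_i
  }
  [tag, time, record]
end

sample = '127.0.0.1 - - [11/Dec/2012:07:26:27] "GET / HTTP/1.1" 200 777'
tag, time, record = parse_line("apache.log", sample)
puts tag
puts Time.at(time).utc
puts record
```

The record is a plain hash, so a document store can take it as-is without a schema.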

  • Event structure (log message)

    Time
    > default second unit
    > from data source

    Tag
    > for message routing
    > where is it from?

    Record
    > JSON format
    > MessagePack internally
    > schema-free

  • Architecture (v0.12 or later)

    Engine (not pluggable)

    Input
    > Forward
    > File tail
    > ...

    Parser

    Filter
    > grep
    > record_transformer

    Buffer
    > File
    > Memory

    Output
    > Forward
    > File
    > ...

    Formatter

  • Configuration and operation

    > No central / master node
    > include helps configuration sharing

    > Operation depends on your environment
    > Use your daemon management
    > Use Chef in Treasure Data

    > Apache like syntax and Ruby DSL

  • # receive events via HTTP
    <source>
      type http
      port 8888
    </source>

    # read logs from a file
    <source>
      type tail
      path /var/log/httpd.log
      format apache
      tag apache.access
    </source>

    # save access logs to MongoDB
    <match apache.access>
      type mongo
      database apache
      collection log
    </match>

    # save alerts to a file
    <match alert.**>
      type file
      path /var/log/fluent/alerts
    </match>

    # forward other logs to servers
    <match **>
      type forward
      <server>
        host 192.168.0.11
        weight 20
      </server>
      <server>
        host 192.168.0.12
        weight 60
      </server>
    </match>

    include http://example.com/conf
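
The configuration above routes events by tag, and the first matching block wins. In Fluentd's match patterns `*` matches exactly one tag part and `**` matches zero or more parts. The Ruby sketch below imitates that rule on a small route table. It is a simplified illustration, not Fluentd's matcher: `tag_match?`, `route`, and the `ROUTES` table (including its patterns) are assumptions made for the example.

```ruby
# Simplified sketch of tag matching; not Fluentd's actual matcher.
# Both arguments are arrays of dot-separated parts.
def tag_match?(pattern_parts, tag_parts)
  return tag_parts.empty? if pattern_parts.empty?

  head, *rest = pattern_parts
  case head
  when "**"
    # "**" may consume zero or more tag parts
    (0..tag_parts.size).any? { |n| tag_match?(rest, tag_parts.drop(n)) }
  when "*"
    # "*" consumes exactly one tag part
    !tag_parts.empty? && tag_match?(rest, tag_parts.drop(1))
  else
    tag_parts.first == head && tag_match?(rest, tag_parts.drop(1))
  end
end

# Hypothetical route table, checked top to bottom.
ROUTES = [
  ["apache.access", :mongo],
  ["alert.**", :file],
  ["**", :forward]
].freeze

def route(tag)
  ROUTES.each do |pattern, output|
    return output if tag_match?(pattern.split("."), tag.split("."))
  end
  nil
end

%w[apache.access alert.disk.full app.login].each do |tag|
  puts "#{tag} -> #{route(tag)}"
end
```

Order matters here: if the catch-all `**` entry came first, no other entry would ever be reached.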

  • Plugins - use rubygems

    $ fluent-gem search -rd fluent-plugin

    $ fluent-gem search -rd fluent-mixin

    $ fluent-gem install fluent-plugin-mongo

  • in_tail

    read a log file
    custom regexp
    custom parser in Ruby

    Apache

    access.log

    Fluentd

    Supported format:
    > apache
    > apache_error
    > apache2
    > nginx
    > json
    > csv
    > tsv
    > ltsv
    > syslog
    > none

  • out_webhdfs

    Fluentd

    buffer

    retry automatically
    exponential retry wait
    persistent on a file

    slice files based on time
    2013-01-01/01/access.log.gz
    2013-01-01/02/access.log.gz
    2013-01-01/03/access.log.gz
    ...

    HDFS

    custom text formatter

    Apache

    access.log
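
Two ideas on this slide are easy to show with small Ruby helpers: a retry wait that doubles on each attempt, and an output path derived from the event time so that files are sliced per hour. This is only a sketch under stated assumptions. `retry_waits` and `sliced_path` are hypothetical helpers, the initial wait of 1 second is an arbitrary example value, and Fluentd's real settings (limits, randomization) are not modelled.

```ruby
# Hypothetical helper: waits that double on every retry (exponential backoff).
def retry_waits(initial_wait, retries)
  (0...retries).map { |n| initial_wait * (2**n) }
end

# Hypothetical helper: hourly slice path in the layout shown on the slide.
def sliced_path(time)
  time.utc.strftime("%Y-%m-%d/%H/access.log.gz")
end

puts retry_waits(1, 5).inspect
puts sliced_path(Time.utc(2013, 1, 1, 1, 30, 0))
puts sliced_path(Time.utc(2013, 1, 1, 2, 5, 0))
```

Because the path depends only on the event time, all events from the same hour land in the same slice regardless of when the buffer is flushed.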

  • out_copy

    routing based on tags
    copy to multiple storages

    Amazon S3

    Fluentd

    buffer

    Apache

    access.log

  • out_forward

    apache

    automatic fail-over
    load balancing

    Apache
    access.log

    Fluentd
    buffer

    retry automatically
    exponential retry wait
    persistent on a file

    Fluentd

    Fluentd

    Fluentd
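
Load balancing with fail-over can be pictured as a weighted choice among the servers that are currently alive. The Ruby sketch below uses the two weights from the earlier configuration slide (20 and 60). It illustrates the idea only and is not how out_forward is implemented; `Server`, `server_for` and `pick_server` are hypothetical names.

```ruby
# Hypothetical model of a forward destination.
Server = Struct.new(:host, :weight, :alive)

# Map a point in 0...total_weight to a server, in proportion to the weights.
def server_for(servers, point)
  servers.each do |s|
    return s if point < s.weight
    point -= s.weight
  end
  nil
end

# Weighted random choice among live servers; dead servers are skipped (fail-over).
def pick_server(servers, rng = Random.new)
  alive = servers.select(&:alive)
  return nil if alive.empty?

  server_for(alive, rng.rand(alive.sum(&:weight)))
end

servers = [
  Server.new("192.168.0.11", 20, true),
  Server.new("192.168.0.12", 60, true)
]

puts pick_server(servers).host
servers[1].alive = false           # simulate a failed node
puts pick_server(servers).host     # only the remaining server can be chosen
```

With weights 20 and 60, the second server is expected to receive about three quarters of the traffic while both are alive.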

  • Before

  • After

    or Embulk

  • Nagios

    MongoDB

    Hadoop

    Alerting

    Amazon S3

    Analysis

    Archiving

    MySQL

    Apache

    Frontend

    Access logs

    syslogd

    App logs

    System logs

    Backend

    Databases

    buffering / processing / routing

    M x N → M + N
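
The last line of this slide is an arithmetic claim: with M sources and N destinations, direct integration needs M x N connections, while routing through one layer needs M + N. A tiny Ruby check, using hypothetical example lists taken loosely from the names on the slide:

```ruby
# Hypothetical example lists; only their sizes matter.
sources = %w[access_logs app_logs system_logs databases]
destinations = %w[nagios mongodb mysql hadoop s3]

direct = sources.product(destinations).size   # every source wired to every destination
via_layer = sources.size + destinations.size  # every endpoint wired to the layer once

puts "direct:    #{direct}"
puts "via layer: #{via_layer}"
```

The gap widens quickly: adding one more destination costs M new connections in the direct case but only one through the layer.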

  • Use-cases

  • Treasure Data

    Frontend
    Job Queue
    Worker
    Hadoop
    Presto

    Fluentd

    Applications push metrics to Fluentd (via local Fluentd)

    Librato Metrics
    for realtime analysis

    Treasure Data
    for historical analysis

    Fluentd sums up data every minute (partial aggregation)
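
Partial aggregation means summing metrics into per-minute buckets before sending them on, so the downstream service receives one value per metric per minute instead of every raw event. A minimal Ruby sketch follows, assuming events are `[unix_time, name, value]` triples. `aggregate_by_minute` and the sample data are hypothetical.

```ruby
# Sum values per (metric name, minute) bucket.
# Each event is assumed to be [unix_time, name, value].
def aggregate_by_minute(events)
  events.each_with_object(Hash.new(0)) do |(time, name, value), sums|
    minute = time - (time % 60)     # truncate the timestamp to the minute
    sums[[name, minute]] += value
  end
end

events = [
  [120, "requests", 1],
  [150, "requests", 2],
  [185, "requests", 4],
  [130, "errors", 1]
]

aggregate_by_minute(events).each do |(name, minute), sum|
  puts "#{name} @#{minute}: #{sum}"
end
```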

  • hundreds of app servers

    sends event logs

    sends event logs

    sends event logs

    Rails app td-agent

    td-agent

    td-agent

    Google Spreadsheet

    Treasure Data

    MySQL

    Logs are available after several mins.

    Daily/Hourly Batch

    KPI visualization

    Feedback rankings

    Rails app

    Rails app

    Unlimited scalability
    Flexible schema
    Realtime
    Less performance impact

    Cookpad

    Over 100 RoR servers (2012/2/4)

  • Slideshare

    http://engineering.slideshare.net/2014/04/skynet-project-monitor-scale-and-auto-heal-a-system-in-the-cloud/

  • Log Analysis System And its designs in LINE Corp. 2014 early

  • Roadmap

  • v0.10 (old stable)

    > Mainly for log forwarding
    > with good performance
    > working in production
    > almost all users use td-agent

    > Various plugins
    > http://www.fluentd.org/plugins

  • v0.12 (current stable)

    > Event handling improvement
    > Filter
    > Label
    > Error Stream

    > At-least-once semantics in forwarding
    > require_ack_response parameter
    > http://ogibayashi.github.io/blog/2014/12/16/try-fluentd-v0-dot-12-at-least-once/
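
At-least-once forwarding means the sender keeps a chunk until the receiver acknowledges it and resends when no acknowledgement arrives. The receiver may therefore see the same chunk twice. The Ruby sketch below simulates that with an in-process receiver that loses one ack. It is not the forward protocol itself; `FlakyReceiver`, `send_with_ack` and the retry limit are hypothetical.

```ruby
require "securerandom"

# Hypothetical receiver that stores every chunk but "loses" some acks.
class FlakyReceiver
  attr_reader :received

  def initialize(drop_acks:)
    @drop_acks = drop_acks   # how many acks to lose before answering
    @received = []
  end

  # Stores the chunk, then returns the chunk id as the ack (nil if the ack is lost).
  def receive(chunk_id, events)
    @received << [chunk_id, events]
    if @drop_acks > 0
      @drop_acks -= 1
      nil
    else
      chunk_id
    end
  end
end

# Resend the same chunk until its id comes back; returns the number of attempts.
def send_with_ack(receiver, events, max_attempts: 5)
  chunk_id = SecureRandom.hex(8)
  max_attempts.times do |i|
    ack = receiver.receive(chunk_id, events)
    return i + 1 if ack == chunk_id
  end
  raise "no ack after #{max_attempts} attempts"
end

receiver = FlakyReceiver.new(drop_acks: 1)
attempts = send_with_ack(receiver, [{ "event" => 1 }])
puts "attempts: #{attempts}"
puts "copies stored by receiver: #{receiver.received.size}"
```

The chunk is never lost, but it is stored twice, which is why downstream consumers of an at-least-once stream need to tolerate duplicates.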

  • Filter

    > Apply filtering routine to event stream
    > No more tag tricks!

    v0.10:

    type record_reformer
    tag reformed.${tag}

    type growthforecast

    v0.12:

    type record_transformer

    type growthforecast
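
The difference on this slide is that a filter changes or drops records while the tag stays the same, so no re-tagging is needed to reach the next step. A minimal Ruby sketch of such a filter chain follows. It does not use Fluentd's plugin API; the lambdas, `apply_filters`, and the `hostname` value are hypothetical.

```ruby
# Each filter is a lambda: (tag, time, record) -> record, or nil to drop the event.
only_get = ->(_tag, _time, record) { record["method"] == "GET" ? record : nil }
add_host = ->(_tag, _time, record) { record.merge("hostname" => "web1") }

# Run the record through every filter in order; stop as soon as one drops it.
def apply_filters(filters, tag, time, record)
  filters.reduce(record) do |rec, filter|
    break nil if rec.nil?
    filter.call(tag, time, rec)
  end
end

filters = [only_get, add_host]
kept = apply_filters(filters, "apache.access", 0, { "method" => "GET" })
dropped = apply_filters(filters, "apache.access", 0, { "method" => "POST" })

puts kept.inspect
puts dropped.inspect
```

The tag `apache.access` is passed through untouched, so the output stage can match on the original tag.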

  • Label

    > Internal event routing
    > Redirect events to another group
    > much easier to group and share plugins

    v0.10:

    type forward

    type record_reformer

    v0.12:

    type forward
    @label @APP1

    type s3

  • Error stream with Label

    > Can handle an error at each record level
    > It is still prototype

    Input

    chunk1: {"event":1, ...} {"event":2, ...} {"event":3, ...}
    chunk2: {"event":4, ...} {"event":5, ...} {"event":6, ...}

    Output

    OK / ERROR! per record

    Error stream

    type file
    ...

    Built-in @ERROR is used when an error occurs in emit
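
Record-level error handling means one bad record is diverted to an error stream while the rest of its chunk goes through. The Ruby sketch below shows that shape with a plain array standing in for the error stream. It is not Fluentd's @ERROR implementation; `emit_chunk` and the failure condition are hypothetical.

```ruby
# Process each record; a failing record goes to error_stream, the others continue.
def emit_chunk(chunk, error_stream)
  chunk.each_with_object([]) do |record, ok|
    begin
      ok << yield(record)
    rescue StandardError => e
      error_stream << { "record" => record, "error" => e.message }
    end
  end
end

errors = []
chunk1 = [{ "event" => 1 }, { "event" => 2 }, { "event" => 3 }]

written = emit_chunk(chunk1, errors) do |record|
  # Simulate a record the output cannot handle.
  raise ArgumentError, "bad record" if record["event"] == 2
  record
end

puts "written: #{written.inspect}"
puts "errors:  #{errors.inspect}"
```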

  • v0.14 (next stable)

    > New plugin APIs
    > Actor
    > New base classes (#309)

    > ServerEngine based core engine
    > Robust supervisor

    > Sub-second time support (#461)
    > Zero downtime restart

  • Actor

    > Easy to write popular routines
    > Hide implementation details

    class TimerWatcher

  • Zero downtime restart

    > Socket manager shares resources with workers

    Supervisor

    TCP

    1. Listen to TCP socket

  • Zero downtime restart

    > Socket manager shares resources with workers

    Supervisor

    TCP

    Worker

    TCP

    heartbeat

    1. Listen to TCP socket

    2. Pass its socket to worker

  • Zero downtime restart

    > Socket manager shares resources with workers

    Supervisor

    TCP

    Worker

    Worker

    TCP

    heartbeat

    1. Listen to TCP socket

    2. Pass its socket to worker

    3. Do the same when a worker restarts, keeping the TCP socket open

    TODO: How
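
The three steps above can be tried in miniature with plain Ruby on a Unix-like system. In this sketch the supervisor opens the listening socket once, and each worker is a forked child that inherits it. Note the substitution: fork inheritance is used here only because it is the shortest way to share a socket in a self-contained example, and it is not claimed to be how ServerEngine's socket manager passes sockets. `spawn_worker` and the worker names are hypothetical.

```ruby
require "socket"

# Supervisor side: open the listening socket once and keep it for the whole run.
server = TCPServer.new("127.0.0.1", 0)   # port 0 lets the OS pick a free port
port = server.addr[1]

# Hypothetical worker: a forked child that inherits the listening socket,
# answers exactly one connection, and exits.
def spawn_worker(server, name)
  fork do
    client = server.accept
    client.puts "handled by #{name}"
    client.close
    exit! 0
  end
end

pid1 = spawn_worker(server, "worker-1")
first = TCPSocket.open("127.0.0.1", port) { |s| s.gets.chomp }
Process.wait(pid1)

# No worker is running now, but the supervisor still holds the socket,
# so this connection is queued by the OS instead of being refused.
waiting = TCPSocket.new("127.0.0.1", port)

pid2 = spawn_worker(server, "worker-2")  # the "restarted" worker
second = waiting.gets.chomp
waiting.close
Process.wait(pid2)
server.close

puts first
puts second
```

The client that connected between the two workers is served by the new worker, which is the point of keeping the socket in the supervisor.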