Fluentd at HKOScon

45
Fluentd: Unified logging layer June 25, 2016 / HKOScon Masahiro Nakagawa

Transcript of Fluentd at HKOScon

Page 1: Fluentd at HKOScon

Fluentd: Unified logging layerJune 25, 2016 / HKOScon Masahiro Nakagawa

Page 2: Fluentd at HKOScon

Who am I• Masahiro Nakagawa

• github: @repeatedly

• Treasure Data, Inc. • Senior Software Engineer • Fluentd / td-agent developer

• Living at OSS :) • D language - Phobos, a.k.a standard library, committer • Fluentd - Main maintainer • MessagePack / RPC • etc…

Page 3: Fluentd at HKOScon

Background

Page 4: Fluentd at HKOScon

Before

✓ duplicated code for error handling...✓ messy code for retrying mechanism...

Page 5: Fluentd at HKOScon

After

Page 6: Fluentd at HKOScon

Structured logging

Reliable forwarding

Pluggable architecture

http://fluentd.org/

Page 7: Fluentd at HKOScon

Concept /

Design

Page 8: Fluentd at HKOScon

What’s Fluentd?• Data collector for unified logging layer

• Streaming data transfer based on JSON / MessagePack

• Robust core and plugins written in Ruby

• Rubygems based various plugins by the community

• www.fluentd.org/plugins • Apache License, Version 2.0

• https://github.com/fluent/fluentd

Page 9: Fluentd at HKOScon

Reliable data transfer

error retry

error retry retry

retryBatch

Stream

Other stream

(micro-batch)

Page 10: Fluentd at HKOScon

Core

Common concerns Use case specific

PLUGINSCORE• Read Data• Parse Data• Buffer Data• Write Data• Format Data

• Divide & Conquer• Buffering & Retries• Error Handling• Message Routing• Parallelism

Page 11: Fluentd at HKOScon

> logged time

Event structure(log message)✓ Time

> for message routing

✓ Tag> Actual content

> JSON / MessagePack

✓ Record

127.0.0.1 - - [11/Dec/2012:07:26:27] "GET / ... 127.0.0.1 - - [11/Dec/2012:07:26:30] "GET / ... 127.0.0.1 - - [11/Dec/2012:07:26:32] "GET / ... 127.0.0.1 - - [11/Dec/2012:07:26:40] "GET / ...

...

2012-12-11 07:26:27apache.log

{ "host": "127.0.0.1", "method": "GET", ...}

Apache log Fluentd event

convert

timetag

record

Page 12: Fluentd at HKOScon

Architecture

Page 13: Fluentd at HKOScon

Data processing pipeline

ParserInput Buffer Output FormatterFilter

“output-ish”“input-ish”

Page 14: Fluentd at HKOScon

Input plugins

File tail (in_tail)Syslog (in_syslog)HTTP (in_http)RDBMS (in_sql)...

✓ Receive logs

✓ Or pull logs from data sources

✓ non-blocking

InputInput

Page 15: Fluentd at HKOScon

Parser plugins

JSONRegexpApache/NginxSyslogCSV/TSVetc.

✓ Parse into JSON

✓ Common formats out of the box

✓ Some inputs plugin depends on

Parser plugin

ParseParser

Page 16: Fluentd at HKOScon

Filter plugins

greprecord_transformersuppress…

✓ Filter / Mutate record

✓ Record level and Stream level

✓ v0.12 and above

Filter

Page 17: Fluentd at HKOScon

Buffer plugins

✓ Improve performance

✓ Provide reliability

✓ Provide thread-safetyMemory (buf_memory)File (buf_file)

Buffer

Page 18: Fluentd at HKOScon

Buffer internal

✓ Chunk = adjustable unit of data

✓ Buffer = Queue of chunks

chunk

chunk

chunk output

Input

Page 19: Fluentd at HKOScon

Formatter plugins

✓ Format output

✓ Convert json into other format

✓ Some plugins depends on

Formatter plugins

JSONCSV/TSV“single value”msgpack

Formatter

Page 20: Fluentd at HKOScon

Output plugins

✓ Write to external systems

✓ Buffered & Non-buffered

✓ 300+ pluginsFile (out_file)Amazon S3 (out_s3)Kafka (out_kafka)...

Output

Page 21: Fluentd at HKOScon

Use cases

Page 22: Fluentd at HKOScon

Simple Forwarding

Page 23: Fluentd at HKOScon

# logs from a file<source> @type tail path /var/log/httpd.log pos_file /tmp/pos_file format apache2 tag backend.apache</source>

# logs from client libraries<source> @type forward port 24224</source>

# store logs to MongoDB<match backend.*> @type mongo database fluent collection test</match>

Page 24: Fluentd at HKOScon

Less Simple Forwarding

- At-most-once / At-least-once - HA (failover)- Load-balancing

Page 25: Fluentd at HKOScon

Container Logging

Page 26: Fluentd at HKOScon

Treasure DataFrontend

Job Queue

WorkerHadoop

Presto

Fluentd

Applications push metrics to Fluentd (via local Fluentd)

Datadogfor realtime monitoring / alerting

Treasure Datafor historical analysis

Fluentd sums up data minutes(partial aggregation)

Page 27: Fluentd at HKOScon

Slideshare

http://engineering.slideshare.net/2014/04/skynet-project-monitor-scale-and-auto-heal-a-system-in-the-cloud/

Page 28: Fluentd at HKOScon

Other companies…

https://www.fluentd.org/testimonials

Page 29: Fluentd at HKOScon

Roadmap

Page 30: Fluentd at HKOScon

History / Roadmap• v0.10.0 (In Oct 2011)

• v0.12.0 (In Dec 2014) -> Current stable

• v0.14.0 (In May 2016)

• v0.14.x (some versions in 2016)

• v1 (4Q in 2016 / 1Q in 2017)

Page 31: Fluentd at HKOScon

v0.10 (old stable)

> First stable release> Mainly for log forwarding

> Only Input and Output> No complex event handling support

> Only At-most-once semantics in forwarding> Treasure Data provides td-agent 1 for v0.10

Page 32: Fluentd at HKOScon

v0.12 (current stable)> Event handling improvement

> Filter, Label, Error Stream> New configuration format> Add at-least-once semantics in forwarding

> Use require_ack_response parameter> HTTP RPC based process management> Treasure Data provides td-agent 2 for v0.12

Page 33: Fluentd at HKOScon

• New Plugin APIs, Plugin Helpers & Plugin Storage

• Supervisor using ServerEngine

• Time with Nanosecond precision

• Windows support

v0.14 (Next stable)

Page 34: Fluentd at HKOScon

Time with nanosecond• Fluent::EventTime

• behaves as Integer (used as time in v0.12) • has methods to get sub-second precision • be serialized into msgpack using Ext type

• Fluentd core can handle both of Integer and EventTime as time • compatible with older versions and software in eco-

system (e.g., fluent-logger, Docker logging driver)

Page 35: Fluentd at HKOScon

Windows support• Fluentd and core plugin work on Windows

• several companies have already used v0.14.0.pre version on production

• We will send a patch to popular plugins ifit doesn’t work on Windows

• Use HTTP RPC instead of signals

• Treasure Data will provide td-agent msi package

Page 36: Fluentd at HKOScon

v0.14.x - v1

• v0.14.x (some versions in 2016) • Symmetric multi processing model • Counter API • TLS/authentication/authorization support

(merging secure-forward into core)

• v1 (4Q in 2016 / 1Q in 2017) • Stable version for new APIs/features • fully compatible with v0.12.x and v0.14.x

Page 37: Fluentd at HKOScon

Symmetric multi processing model

• 2 or more workers share a configuration file • and share listening sockets via PluginHelper • under a supervisor process (ServerEngine)

• Multi core scalability for huge traffic • one input plugin for a tcp port, some filters and

one (or some) output plugin • No more fluent-plugin-multiprocess

Page 38: Fluentd at HKOScon

Worker

Supervisor

Worker Worker

Worker

Supervisor

Worker Worker

Supervisor Supervisor

Using fluent-plugin-multiprocess

v0.14

Page 39: Fluentd at HKOScon

TLS/Authn/Authz support for forward plugin

• secure-forward will be merged into built-in forward • TLS w/ at-least-one semantics • Simple authentication/authorization w/ non-SSL

forwarding

• Authentication and Authorization providers • Who can connect to input plugins?

What tags are permitted for clients? • New plugin types (3rd party authors can write it) • Mainly for in/out forward, but available from others

Page 40: Fluentd at HKOScon

Ecosystem

Page 41: Fluentd at HKOScon

Treasure Agent (td-agent)> Treasure Data distribution of Fluentd

> include Ruby runtime, fluentd and popular plugins> deb, rpm, dmg are supported

> Treasure Agent 2 is current stable> fluentd v0.12 and Ruby 2.1

> Treasure Agent 3 will be released at fall> fluentd v0.14 and Ruby 2.3> msi will be supported for Windows

Page 42: Fluentd at HKOScon

fluentd-ui> Manage Fluentd instance via Web UI

> https://github.com/fluent/fluentd-ui

Page 43: Fluentd at HKOScon

fluent-bit

> http://fluentbit.io/> Log collector mainly for embedded systems

> Written in C> Can communicate with Fluentd

> There are several plugins> Treasure Data, Elasticsearch, MQTT, etc…

Page 44: Fluentd at HKOScon

> Bulk Loader version of Fluentd> Pluggable architecture

> JRuby, JVM languages> High throughput by parallel processing

> Invented by Treasure Data> Share your script as a plugin> https://github.com/embulkSlide: Fighting Against Chaotically Separated Values with Embulk

Page 45: Fluentd at HKOScon

treasuredata.com

Enjoy logging!