Posted on 08-May-2015
Log everything in JSON.
Sadayuki Furuhashi, Treasure Data, Inc.
Self-introduction
> Sadayuki Furuhashi (twitter: @frsyuki)
> Original author of Fluentd
> Treasure Data, Inc. - Software Architect; Founder
> Open source: MessagePack - efficient serialization format
0. Why logging?
1. Why Fluentd? - Design of Fluentd
> Extensibility
> Unified log format
> Simplicity
2. Who uses Fluentd?
3. Future of Fluentd
0. Why logging?
> Error notifications
> Performance monitoring
> User segment analysis
> Funnel analysis
> Heatmap analysis
> Market prediction
etc...
0. Why logging? - Error notifications
Error!
0. Why logging? - Performance monitor
0. Why logging? - User segment analysis
0. Why logging? - Funnel analysis
[Chart: funnel with -27% and -28% drop-offs between steps]
0. Why logging? - Heatmap analysis
0. Why logging? - Market prediction
0. Why logging?
1. Why Fluentd? - Design of Fluentd
> Extensibility
> Unified log format
> Simplicity
2. Who uses Fluentd?
3. Future of Fluentd
[Diagram: log sources (Apache frontend access logs, syslogd system logs, backend app logs and databases) wired to log utilization targets (Nagios for alerting, MongoDB and Hadoop for analysis, Amazon S3 for archiving, MySQL) through a tangle of ad-hoc bash/perl scripts and rsync servers]
Problems...
> No unified method to collect logs - too many bash/perl scripts, fragile to changes, less reliable
> Mixed log formats - old-fashioned "human-readable" text logs, not ready to analyze
> High latency - must wait a day for log rotation
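The "too many bash/perl scripts" problem looked roughly like this (a hypothetical sketch; the log line, paths, and field positions are illustrative):

```shell
#!/bin/sh
# Ad-hoc collector: pull the HTTP status out of an Apache "combined"
# access log line by whitespace field position. Any change to the log
# format silently breaks it - there is no schema, only convention.
LINE='127.0.0.1 - - [04/Feb/2012:01:33:51 +0900] "GET /buyItem HTTP/1.1" 500 150 "/landing" "curl"'
echo "$LINE" | awk '{ print $9 }'   # field 9 happens to be the status code
```

Dozens of one-off scripts like this, each coupled to one log format and one destination, are exactly what Fluentd's unified pipeline replaces.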
[Diagram: the same log sources and destinations, now connected through a single Fluentd layer that handles filter / buffer / routing in place of the ad-hoc scripts]
[Diagram: Input Plugins → Filter Plugins / Buffer Plugins → Output Plugins]
JSON format:

    2012-02-04 01:33:51  myapp.buy  { "user": "me", "path": "/buyItem", "price": 150, "referer": "/landing" }
          time              tag                                     record
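An event like the one above (time, tag, JSON record) can be produced with nothing but Ruby's standard library. This is an illustrative sketch: the tag and field names follow the slide, and a real application would post the record to Fluentd (e.g. via the fluent-logger gem) instead of printing it:

```ruby
require 'json'
require 'time'

# One Fluentd-style event: a timestamp, a routing tag, and a JSON record.
time   = Time.utc(2012, 2, 4, 1, 33, 51)
tag    = "myapp.buy"
record = { "user" => "me", "path" => "/buyItem",
           "price" => 150, "referer" => "/landing" }

# Machine-readable: any consumer can JSON.parse the record and query fields
# like record["price"] directly, with no regex scraping.
puts "#{time.strftime('%Y-%m-%d %H:%M:%S')} #{tag} #{record.to_json}"
```

Because the record is structured at the point of emission, "ready to analyze" holds end to end: no downstream parser has to guess the log format.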
Why Fluentd?
> Extensibility - plugin architecture: collect logs from various systems, forward logs to various systems
> Unified log format - JSON: modern "machine-readable" logs, immediately ready to analyze
> Reliable - HA configuration
> Easy to install - RPM/deb packages, deploy instantly everywhere
Comparison with other log collectors:
> Scribe - less extensible, no unified log format, no longer developed?
> Flume - less simple, no unified log format, little information about Flume-NG
0. Why logging?
1. Why Fluentd? - Design of Fluentd
> Extensibility
> Unified log format
> Simplicity
2. Who uses Fluentd?
3. Future of Fluentd
NHN Japan COOKPAD NAVER
Crocos
http://www.quora.com/Who-uses-Fluentd-in-production
0. Why logging?
1. Why Fluentd? - Design of Fluentd
> Extensibility
> Unified log format
> Simplicity
2. Who uses Fluentd?
3. Future of Fluentd
Future of Fluentd
> <filter>
> <match> in <source>
> <label>
> MessagePack for Ruby v5
> td-agent-lite
> Pub/Sub & Monitoring API
> New process model & Live restart
> Backward compatibility
<source>
  type tail
  path /var/log/httpd.log
  format apache
  tag not_filtered.apache
</source>

<match not_filtered.**>
  type rewrite
  remove_prefix not_filtered
  <rule>
    key status
    pattern ^500$
    ignore true
  </rule>
</match>

<match **>
  type forward
  host log.server
</match>
Before - note the mysterious tag and the tag operations
<source>
  type tail
  path /var/log/httpd.log
  format apache
  tag apache
</source>

<filter **>
  type rewrite
  <rule>
    key status
    pattern ^500$
    ignore true
  </rule>
</filter>

<match **>
  type forward
  host log.server
</match>
After (v11)
Filter plugins!
<source>
  type tail
  path /var/log/httpd.log
  format apache
  tag apache

  <filter **>
    type rewrite
    <rule>
      key status
      pattern ^500$
      ignore true
    </rule>
  </filter>
</source>

<match **>
  type forward
  host log.server
</match>
After (v11)
<filter>/<match> in <source>
<source>
  type tail
  path /var/log/httpd.log
  tag apache
</source>

<match **>
  type forward
  host log.server
</match>
Before
I want to add flowcounter here...
<source>
  type tail
  path /var/log/httpd.log
  tag apache
</source>

<match flow.traffic>
  type forward
  host traffic.server
</match>

<match **>
  type copy
  <store>
    type flowcounter
    tag flow.traffic
  </store>
  <store>
    type forward
    host log.server
  </store>
</match>
Before
Nested!
<source>
  type tail
  path /var/log/httpd.log
  tag apache
</source>

<filter **>
  type copy
  <match>
    type flowcounter
    tag flow.traffic
    <match>
      type forward
      host traffic.server
    </match>
  </match>
</filter>

<match **>
  type forward
  host log.server
</match>
After (v11)
Filtering pipeline
<source>
  type forward
</source>

<filter **>
  type copy
  <match>
    type file
    path /mnt/local_archive
  </match>
</filter>

<label alert>
  <match **>
    ...
  </match>
</label>

<label analysis>
  ...
</label>

# copy & label & forward
<filter **>
  type copy
  <match>
    type forward
    label alert
    host alerting.server
  </match>
</filter>

# copy & label & forward
<filter **>
  type copy
  <match>
    type forward
    label analysis
    host analysis.server
  </match>
</filter>
After (v11)
MessagePack for Ruby v5
[Chart: serialize/deserialize throughput in tweets/sec - msgpack v5 vs. msgpack v4 vs. yajl vs. json]
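Part of why MessagePack beats text JSON in those benchmarks is its compact binary framing. The toy encoder below is written against the public MessagePack spec (positive fixint, fixstr, and fixmap only) purely to illustrate that framing; it is not the `msgpack` gem's API:

```ruby
# Toy MessagePack encoder covering three single-byte-header spec types:
#   positive fixint (0x00-0x7f), fixstr (0xa0 | len), fixmap (0x80 | size).
def to_msgpack_min(obj)
  case obj
  when Integer
    raise "only 0..127 supported" unless (0..127).cover?(obj)
    [obj].pack("C")                          # the value itself is the byte
  when String
    raise "only strings < 32 bytes" unless obj.bytesize < 32
    [0xa0 | obj.bytesize].pack("C") + obj    # 1-byte header, then raw bytes
  when Hash
    raise "only maps < 16 entries" unless obj.size < 16
    [0x80 | obj.size].pack("C") +            # 1-byte header, then pairs
      obj.map { |k, v| to_msgpack_min(k) + to_msgpack_min(v) }.join
  else
    raise "unsupported type #{obj.class}"
  end
end

packed = to_msgpack_min({ "price" => 100 })
# 1 (fixmap) + 1 (fixstr header) + 5 ("price") + 1 (fixint) = 8 bytes,
# versus 13 bytes for the JSON text {"price":100}.
```

No quoting, no delimiters, no number-to-text conversion: each element is a tiny header plus raw bytes, which is where both the size and the (de)serialization speed advantage come from.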
td-agent-lite
> in_tail + out_forward in a "single" binary: a statically linked Ruby binary plus scripts tied to the binary
New process model & Live restart
[Diagram: old multiprocess model - the Supervisor fork()s the Engine and detached processes; all data pass through the central process]
New process model & Live restart
[Diagram: new multiprocess model - Supervisor and Engine, with a ProcessManager giving detached processes direct communication]
New process model & Live restart
[Diagram: the same new multiprocess model during a live restart - the Engine is restarted via the ProcessManager while detached processes keep running]
Backward compatibility
Fluentd v11 includes 2 namespaces:
> Fluentd:: - new code base
> Fluent:: - old code base + wrapper classes
Check out the repository for details:
> http://github.com/frsyuki/fluentd-v11
Conclusion
Fluentd makes logging better:
> Plugin architecture
> JSON format
> HA configuration
> RPM/deb packages
Fluentd is under active development
Fluentd is supported by many committers
Tools used for log collection/analysis
Where logs are stored
Barriers to adopting Fluentd