Fluentd meetup


Transcript of Fluentd meetup

Page 1: Fluentd meetup

Sadayuki Furuhashi

Fluentd

@frsyuki

The Event Collector Service

Treasure Data, Inc.

Structured logging

Pluggable architecture

Reliable forwarding

Page 2: Fluentd meetup

Fluentd in brief

It's like syslogd, but uses JSON for log messages

Page 3: Fluentd meetup

Fluentd :: format of logs

Application

Fluentd

Storage

2012-02-04 01:33:51 myapp.buylog {"user": "me", "path": "/buyItem", "price": 150, "referer": "/landing"}

Page 4: Fluentd meetup

Fluentd :: format of logs

Application

Fluentd

Storage

2012-02-04 01:33:51 myapp.buylog {"user": "me", "path": "/buyItem", "price": 150, "referer": "/landing"}

time

tag

record
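
The three parts labelled above (time, tag, record) can be pulled apart from a line in the slide's layout. This is a minimal sketch, not Fluentd's own parser: the `Event` struct and `parse_event` helper are hypothetical names, and treating the timestamp as UTC is an assumption.

```ruby
require "json"

# Hypothetical container for the three parts of an event.
Event = Struct.new(:time, :tag, :record)

# Split "<date> <clock> <tag> <json record>" into time, tag and record.
# Assumption: the timestamp is interpreted as UTC.
def parse_event(line)
  date, clock, tag, json = line.split(" ", 4)
  time = Time.utc(*date.split("-").map(&:to_i), *clock.split(":").map(&:to_i))
  Event.new(time, tag, JSON.parse(json))
end

line = '2012-02-04 01:33:51 myapp.buylog ' \
       '{"user": "me", "path": "/buyItem", "price": 150, "referer": "/landing"}'
event = parse_event(line)

puts event.tag
puts event.record["price"]
```

Because the record is already JSON, no format-specific text parser is needed to reach a field such as `price`.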

Page 5: Fluentd meetup

Fluentd :: plugins

Application

Fluentd

Fluentd Storage SaaS

filter / buffer / routing

Plug-in Plug-in Plug-in

Page 6: Fluentd meetup

Fluentd :: plugins

Application

Fluentd

Fluentd Storage SaaS

filter / buffer / routing

File

tail

Scribe syslogd

Plug-in Plug-in

Plug-in

Plug-in Plug-in Plug-in
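
The pluggable layout in the diagram (input plug-ins on one side, output plug-ins on the other, a small engine in between) can be sketched roughly as follows. Everything here is hypothetical and self-contained: `PLUGINS`, `register`, `ArrayInput` and `MemoryOutput` are illustration names, not Fluentd's real plugin API, and the filter and buffer stages are left out.

```ruby
# Hypothetical registry mapping a plugin type name to its class.
PLUGINS = { input: {}, output: {} }

def register(kind, name, klass)
  PLUGINS[kind][name] = klass
end

# Input plugin: yields [tag, time, record] events from an in-memory array.
class ArrayInput
  def initialize(events)
    @events = events
  end

  def each_event(&block)
    @events.each(&block)
  end
end

# Output plugin: collects events in memory, standing in for storage or SaaS.
class MemoryOutput
  attr_reader :stored

  def initialize
    @stored = []
  end

  def write(tag, time, record)
    @stored << [tag, time, record]
  end
end

register(:input, "array", ArrayInput)
register(:output, "memory", MemoryOutput)

# The engine only knows plugin type names; concrete classes are looked up.
input = PLUGINS[:input].fetch("array").new(
  [["myapp.login", 1328331361, { "user" => 38 }]]
)
output = PLUGINS[:output].fetch("memory").new

input.each_event { |tag, time, record| output.write(tag, time, record) }
p output.stored
```

Swapping the source or destination then means registering another class under a new name, without touching the engine loop.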

Page 7: Fluentd meetup

Fluentd :: client libraries

• Client libraries
> Ruby
> Perl
> PHP
> Python
> Java
> ...

Fluent.open("myapp")

Fluent.event("login", {"user"=>38})

#=> 2012-02-04 04:56:01 myapp.login {"user":38}

Application

Fluentd
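
To illustrate how the prefix given at open time and the label given per event combine into the tag `myapp.login`, here is a minimal stand-in. `MiniFluent` is a hypothetical module written for this sketch, not the real client library API, and it writes to an IO object instead of sending to a Fluentd daemon so that it stays self-contained.

```ruby
require "json"
require "stringio"

# Hypothetical stand-in for a client library; NOT the real logger API.
module MiniFluent
  # Remember the tag prefix and the destination IO for later events.
  def self.open(prefix, io: $stdout)
    @prefix = prefix
    @io = io
  end

  # Emit one event as "<time> <prefix>.<label> <json record>".
  def self.event(label, record, time: Time.now)
    timestamp = time.strftime("%Y-%m-%d %H:%M:%S")
    @io.puts("#{timestamp} #{@prefix}.#{label} #{JSON.generate(record)}")
  end
end

buffer = StringIO.new
MiniFluent.open("myapp", io: buffer)
MiniFluent.event("login", { "user" => 38 }, time: Time.utc(2012, 2, 4, 4, 56, 1))
print buffer.string
```

The explicit `time:` argument exists only to make the example reproducible; without it the current time is used.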

Page 8: Fluentd meetup

Typical architecture before Fluentd

Application

File File File ...

App server

Application

File File File ...

App server

File

Application

File File File ...

App server

Log server

Burst of traffic

High latency: must wait for a day

Hard to analyze: complex text parsers

Page 9: Fluentd meetup

Architecture after Fluentd

Application

App server

Fluentd

Application

App server

Fluentd

Application

App server

Fluentd

Fluentd Fluentd

Realtime!

Page 10: Fluentd meetup

Architecture after Fluentd

Fluentd Fluentd Fluentd

Fluentd Fluentd

Hadoop / Hive, MongoDB, Amazon S3 / EMR

Ready to analyze!

Realtime!

Page 11: Fluentd meetup

Fluentd Fluentd

Fluentd Fluentd Fluentd

Case study

Ruby on Rails Ruby on Rails Ruby on Rails

Hadoop / Hive MongoDB

PV logs

User behavior logs

routing

✓ 127 RoR servers
✓ 70,000 msgs/sec
✓ 120 Mbps at peak
✓ 650 GB/day
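
As a rough cross-check of these figures, the daily volume can be converted into an average bandwidth and compared with the quoted peak. This is back-of-the-envelope arithmetic under stated assumptions: 1 GB is taken as 10^9 bytes, and the message rate is assumed to coincide with the bandwidth peak, which the slide does not say.

```ruby
# Figures quoted on the slide.
BYTES_PER_DAY = 650 * 10**9       # 650 GB/day, assuming 1 GB = 10^9 bytes
PEAK_BITS_PER_SEC = 120 * 10**6   # 120 Mbps at peak
MSGS_PER_SEC = 70_000             # assumed here to be the rate at peak

SECONDS_PER_DAY = 24 * 60 * 60

# Average bandwidth implied by the daily volume, in Mbps.
average_mbps = BYTES_PER_DAY * 8.0 / SECONDS_PER_DAY / 10**6

# Ratio between the quoted peak and the implied average.
peak_ratio = PEAK_BITS_PER_SEC / 10.0**6 / average_mbps

# Average message size if peak bandwidth and message rate coincide.
bytes_per_msg = PEAK_BITS_PER_SEC / 8.0 / MSGS_PER_SEC

puts format("average bandwidth: %.1f Mbps", average_mbps)
puts format("peak / average: %.1f", peak_ratio)
puts format("approx. message size at peak: %.0f bytes", bytes_per_msg)
```

Under these assumptions the average comes out at roughly half the quoted peak, so the numbers on the slide are mutually plausible.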

Page 12: Fluentd meetup

# read logs from a file
<source>
  type tail
  path /var/log/httpd.log
  format apache
  tag apache.access
</source>

# save access logs to MongoDB
<match apache.access>
  type mongo
  host 127.0.0.1
</match>

# forward other logs to servers
# (load-balancing + fail-over)
<match **>
  type forward
  <server>
    host 192.168.0.11
    weight 20
  </server>
  <server>
    host 192.168.0.12
    weight 60
  </server>
</match>
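
The routing in this configuration can be sketched as tag matching in which the first matching pattern wins. The sketch below is a simplified model, not Fluentd's implementation: `tag_match?`, `route`, `ROUTES` and `WEIGHTS` are hypothetical names, and it assumes that `*` matches exactly one tag part, that `**` matches zero or more parts, and that weights split traffic proportionally.

```ruby
# Match a pattern against a tag, both already split on ".".
# Assumed semantics: "*" = exactly one part, "**" = zero or more parts.
def tag_match?(pattern_parts, tag_parts)
  return tag_parts.empty? if pattern_parts.empty?

  head, *rest = pattern_parts
  case head
  when "**"
    # Try letting "**" consume 0, 1, 2, ... tag parts.
    (0..tag_parts.length).any? { |n| tag_match?(rest, tag_parts.drop(n)) }
  when "*"
    !tag_parts.empty? && tag_match?(rest, tag_parts.drop(1))
  else
    tag_parts.first == head && tag_match?(rest, tag_parts.drop(1))
  end
end

# Patterns in the same order as the <match> sections above.
ROUTES = [
  ["apache.access", "mongo"],
  ["**", "forward"]
]

# Return the output type of the first pattern that matches the tag.
def route(tag)
  pair = ROUTES.find { |pattern, _type| tag_match?(pattern.split("."), tag.split(".")) }
  pair && pair[1]
end

# Expected share of forwarded traffic per server, from the weights above.
WEIGHTS = { "192.168.0.11" => 20, "192.168.0.12" => 60 }
total = WEIGHTS.values.sum
shares = WEIGHTS.transform_values { |w| w.to_f / total }

puts route("apache.access")
puts route("myapp.login")
p shares
```

Order matters in this model: if the `**` entry came first, it would capture `apache.access` as well and the MongoDB output would never receive anything.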

Page 13: Fluentd meetup

Scribe

Frontend servers

Aggregator nodes

scribe scribe scribe

scribe scribe scribe

Hadoop HDFS

Scribe: log collector by Facebook

Page 14: Fluentd meetup

Scribe’s Pros & Cons

• Pros.
> Fast (C++)

• Cons.
> VERY hard to install
> Deals with unstructured logs: you must parse logs before analyzing them
> Hard to extend: you must re-compile C++ programs
> No longer maintained?

Page 15: Fluentd meetup

Fluentd vs Scribe

• Easy to install
> "gem install fluentd"
> stable RPM and DEB packages: http://packages.treasure-data.com/

• Easy to write plugins
> you can use Ruby

• Easy to distribute plugins
> "gem search -rd fluent-plugin"

Page 16: Fluentd meetup

Flume

Flume: distributed log collector by Cloudera

Flume

Hadoop HDFS

Flume Flume

Flume Master

Physical Topology

Logical Topology

Page 17: Fluentd meetup

Flume’s Pros & Cons

• Pros.
> Central master server manages all nodes

• Cons.
> Difficult to understand logical topologies, physical servers, and the configuration of the logical/physical mapping
> Difficult to configure replicated master servers, log servers, and agents
> Big footprint: 50,000 lines of Java code

Page 18: Fluentd meetup

Fluentd vs Flume

• Easy to understand
> "syslogd that understands JSON"

• Easy to set up
> "sudo fluentd --setup && fluentd"

• Very small footprint
> small engine (3,000 lines) + plugins

• Easy to configure

Page 19: Fluentd meetup

Fluentd vs Scribe/Flume

                     Fluentd               Scribe               Flume
Installation         gem/rpm/deb           make                 rpm/deb
Footprint            3,000 lines of Ruby   8,000 lines of C++   50,000 lines of Java
Plugin               Ruby                  N/A                  Java
Plugin distribution  RubyGems.org          N/A                  N/A
Master Server        No                    No                   Yes
License              Apache License        Apache License       Apache License

Page 20: Fluentd meetup

Fluentd

• Documents
> http://fluentd.org

• Source code
> http://github.com/fluent
> 14 committers across many organizations

• Mailing list
> Google groups

Page 21: Fluentd meetup

• Sadayuki Furuhashi
> twitter: @frsyuki

• Treasure Data, Inc.
> Software Engineer; founder

• Author of MessagePack

• Author of Fluentd