Fluentd and Docker - running fluentd within a docker container

FLUENTD: UNIFIED LOGGING LAYER John Hammink October 14, 2015 Cask Big Data Application Meetup


About Me

• A recovering software & QA engineer turned digital artist once interested in fractals;

• now into data visualization based on large datasets rendered directly to GPU (RGL, various Python GL libraries, etc.)

• github: jammink2; twitter: rijksband

Tweet NOW! “At @caskdata learning how to collect more event data using #Fluentd”

WHAT’S FLUENTD?

An extensible & reliable data collection tool

simple core + plugins

buffering, HA (failover), load balancing, etc.

like syslogd

What’s Fluentd?

• Data collector for a unified logging layer

• Streaming data transfer based on JSON

• Written in Ruby

• Various gem-based plugins: http://www.fluentd.org/plugins

• Running in production: http://www.fluentd.org/testimonials

data collection tool

✓ duplicated code for error handling...

✓ messy code for retry mechanisms...

[Diagram: log sources (Apache, frontend, backend; access logs, app logs, system logs via syslogd) feed destinations (Blueflood, MongoDB, Hadoop, Amazon S3, MySQL; metrics, analysis, archiving) through ad hoc glue in your system: bash, Ruby, and Python scripts, rsync, log files, custom loggers, cron, and other custom scripts.]

(this is painful!!!)

[Diagram: log sources (Apache, frontend, backend; access logs, app logs, system logs via syslogd) feed destinations (Blueflood, MongoDB, Hadoop, Amazon S3, MySQL; metrics, analysis, archiving) through a single filter / buffer / route layer in your system.]

extensible

CORE (Common Concerns)

• Divide & Conquer

• Buffering & Retries

• Error Handling

• Message Routing

• Parallelism

PLUGINS (Use Case Specific)

• Read Data

• Parse Data

• Buffer Data

• Write Data

• Format Data

architecture

INTERNAL ARCHITECTURE

“input-ish” “output-ish”

Input, Parser, Buffer, Output, Formatter, Filter

Internal Architecture (Simplified)

Input Plugin → Buffer Plugin → Output Plugin

time: 2012-02-04 01:33:51

tag: myapp.buylog

record: { "user": "me", "path": "/buyItem", "price": 150, "referer": "/landing" }
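The time / tag / record split shown above can be sketched in a few lines of Python. This is only an illustration of the three-part event shape, assuming the text layout used on the slide; the function name `parse_event` is hypothetical and this is not how Fluentd serializes events internally.

```python
import json
from datetime import datetime


def parse_event(line):
    """Split a 'DATE TIME TAG JSON' line into (tag, time, record).

    Assumption: the line follows the layout shown on the slide.
    """
    date, clock, tag, raw_record = line.split(" ", 3)
    time = datetime.strptime(f"{date} {clock}", "%Y-%m-%d %H:%M:%S")
    record = json.loads(raw_record)
    return tag, time, record


line = (
    '2012-02-04 01:33:51 myapp.buylog '
    '{"user": "me", "path": "/buyItem", "price": 150, "referer": "/landing"}'
)
tag, time, record = parse_event(line)
print(tag, time.isoformat(), record["path"])
```

The tag is what routing decisions are made on, the time orders the events, and the record carries the JSON payload.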

Architecture :: Input plugins

✓ Receive logs

✓ Or pull logs from data sources

✓ In a non-blocking manner

Input plugins: HTTP+JSON (in_http), File tail (in_tail), Syslog (in_syslog), ...
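As a rough illustration of the HTTP+JSON input, the sketch below builds the kind of request in_http accepts: a POST whose URL path is the tag and whose form body carries the record in a `json` field. It assumes an in_http source listening on localhost at port 9880 (the plugin's documented default); the helper name `build_http_event` is hypothetical, and the request is only constructed here, not sent.

```python
import json
import urllib.parse
import urllib.request


def build_http_event(tag, record, host="localhost", port=9880):
    """Build (but do not send) a POST request for Fluentd's in_http.

    Assumptions: host and port are illustrative defaults; the tag becomes
    the URL path and the record travels as a form field named 'json'.
    """
    body = urllib.parse.urlencode({"json": json.dumps(record)}).encode("ascii")
    url = f"http://{host}:{port}/{tag}"
    return urllib.request.Request(url, data=body, method="POST")


request = build_http_event("myapp.buylog", {"user": "me", "price": 150})
print(request.get_method(), request.full_url)
# With a Fluentd instance running, urllib.request.urlopen(request) would send it.
```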

Architecture :: Output plugins

✓ Write or send event logs

Output plugins: File (out_file), Amazon S3 (out_s3), MongoDB (out_mongo), ...

Architecture :: Buffer plugins

✓ Improve performance

✓ Provide reliability

✓ Provide thread-safety

Buffer plugins: Memory (buf_memory), File (buf_file)

[Diagram: Input → buffer holding a queue of chunks → Output]
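The chunk idea can be sketched as follows. This is a simplified in-memory model, not Fluentd's buffer implementation: the class name `ChunkedBuffer`, the `chunk_limit` parameter, and the count-based limit are all assumptions made for illustration. Events accumulate into chunks, full chunks are queued, and a chunk leaves the queue only after the output has accepted it.

```python
from collections import deque


class ChunkedBuffer:
    """Toy memory buffer: groups events into chunks and queues them."""

    def __init__(self, chunk_limit, output):
        self.chunk_limit = chunk_limit  # events per chunk (illustrative)
        self.output = output            # callable that writes one chunk
        self.current = []               # chunk currently being filled
        self.queue = deque()            # full chunks waiting to be written

    def append(self, event):
        self.current.append(event)
        if len(self.current) >= self.chunk_limit:
            self._enqueue_current()

    def _enqueue_current(self):
        if self.current:
            self.queue.append(self.current)
            self.current = []

    def flush_all(self):
        """Write queued chunks in order; a failed chunk stays queued."""
        self._enqueue_current()
        while self.queue:
            self.output(self.queue[0])  # may raise; chunk is not lost
            self.queue.popleft()


written = []
buffer = ChunkedBuffer(chunk_limit=2, output=written.append)
for event_id in range(5):
    buffer.append(event_id)
buffer.flush_all()
print(written)
```

Because a chunk is removed only after the output call returns, an output failure leaves the data in the queue for a later attempt.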

reliable data transfer

DIVIDE & CONQUER & RETRY

[Diagram: chunks are sent independently; each error is followed by one or more retries]
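A minimal sketch of the retry idea, assuming an exponential backoff between attempts. The function name `send_with_retry` and the parameters `max_retries` and `base_wait` are hypothetical and do not correspond to Fluentd configuration keys; the `sleep` argument is injectable so the example can run without waiting.

```python
import time


def send_with_retry(send, chunk, max_retries=5, base_wait=1.0, sleep=time.sleep):
    """Try to send a chunk, doubling the wait after each failure.

    Re-raises the last error once max_retries retries have been used.
    """
    attempt = 0
    while True:
        try:
            return send(chunk)
        except Exception:
            if attempt >= max_retries:
                raise
            sleep(base_wait * (2 ** attempt))
            attempt += 1


# Usage: a destination that fails twice, then accepts the chunk.
calls = {"count": 0}
waits = []


def flaky_send(chunk):
    calls["count"] += 1
    if calls["count"] <= 2:
        raise ConnectionError("destination unavailable")
    return len(chunk)


result = send_with_retry(flaky_send, ["e1", "e2"], sleep=waits.append)
print(result, waits)
```

Each chunk is retried on its own, so one slow or failing destination does not force every other chunk to be resent.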

reliable process

THIS?

OR THIS?

M × N → M + N
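The arithmetic behind this slide can be spelled out with a small sketch. The counts used (4 sources, 5 destinations) are hypothetical, chosen only to make the difference visible: wiring every source directly to every destination needs M × N integrations, while routing everything through one layer needs M + N.

```python
def point_to_point_integrations(sources, destinations):
    """Every source is wired to every destination: M x N."""
    return sources * destinations


def hub_integrations(sources, destinations):
    """Every source and destination connects once to the hub: M + N."""
    return sources + destinations


m, n = 4, 5  # hypothetical counts
direct = point_to_point_integrations(m, n)
hub = hub_integrations(m, n)
print(direct, hub)
```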

[Diagram: log sources (Apache, frontend, backend, databases; access logs, app logs, system logs via syslogd) feed destinations (Nagios, MongoDB, Hadoop, Amazon S3, MySQL; alerting, analysis, archiving) through a single buffer / filter / route layer.]

use cases

SIMPLE FORWARDING

# logs from a file
<source>
  type tail
  path /var/log/httpd.log
  format apache2
  tag backend.apache
</source>

# logs from client libraries
<source>
  type forward
  port 24224
</source>

# store logs to MongoDB
<match backend.*>
  type mongo
  database fluent
  collection test
</match>
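To show how a pattern such as `backend.*` selects events by tag, here is a simplified Python sketch of tag matching. It covers only two wildcard forms, `*` for exactly one tag part and `**` for zero or more parts; other pattern syntax is left out, and the function name `tag_matches` is hypothetical rather than part of Fluentd.

```python
def tag_matches(pattern, tag):
    """Return True if a dot-separated tag matches a match pattern.

    Simplified model: '*' matches one tag part, '**' matches zero or more.
    """
    return _match_parts(pattern.split("."), tag.split("."))


def _match_parts(pattern_parts, tag_parts):
    if not pattern_parts:
        return not tag_parts
    head, rest = pattern_parts[0], pattern_parts[1:]
    if head == "**":
        # Try consuming 0, 1, 2, ... tag parts with the '**'.
        return any(
            _match_parts(rest, tag_parts[i:]) for i in range(len(tag_parts) + 1)
        )
    if not tag_parts:
        return False
    if head == "*" or head == tag_parts[0]:
        return _match_parts(rest, tag_parts[1:])
    return False


print(tag_matches("backend.*", "backend.apache"))
print(tag_matches("backend.*", "frontend.apache"))
```

With the configuration above, events tagged `backend.apache` by the tail source would reach the MongoDB output, while events arriving on the forward port are stored only if their tag also has the form `backend.<something>`.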

LESS SIMPLE FORWARDING

LAMBDA ARCHITECTURE

# logs from a file
<source>
  type tail
  path /var/log/httpd.log
  format apache2
  tag web.access
</source>

# logs from client libraries
<source>
  type forward
  port 24224
</source>

# store logs to ES and HDFS
<match *.*>
  type copy

  <store>
    type elasticsearch
    logstash_format true
  </store>

  <store>
    type webhdfs
    host namenode
    port 50070
    path /path/on/hdfs/
  </store>
</match>
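The copy output in this configuration fans each matched event out to every `<store>`. A minimal Python sketch of that fan-out follows, assuming each store is just a callable; the name `make_copy_output` and the two list-backed stores standing in for Elasticsearch and HDFS are hypothetical.

```python
def make_copy_output(stores):
    """Return an emit function that delivers each event to every store."""

    def emit(tag, time, record):
        for store in stores:
            # Each store gets its own copy so one cannot mutate another's data.
            store(tag, time, dict(record))

    return emit


# Stand-ins for the two stores in the configuration above.
search_index = []
archive = []

emit = make_copy_output([
    lambda tag, time, record: search_index.append((tag, time, record)),
    lambda tag, time, record: archive.append((tag, time, record)),
])
emit("web.access", "2015-10-14 18:00:00", {"path": "/", "code": 200})
print(len(search_index), len(archive))
```

Sending the same stream to a search store and to long-term storage is what lets one pipeline serve both quick queries and later batch analysis.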

FLUENTD ON KUBERNETES (NOV 2015)

FLUENTD LOGGING DRIVER (APR 2015)

Tweet Again! “Happy v1 #k8s and congrats #Fluentd for becoming a #docker logging driver”

DEMO: FLUENTD + DOCKER
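The slides do not include the demo commands, so the following is only a sketch of how a container might be started with Docker's Fluentd logging driver. It builds the `docker run` argument list using the `--log-driver=fluentd` option and the `fluentd-address` log option; the helper name, the image name `ubuntu`, and the address `localhost:24224` are assumptions for illustration. The command is constructed but not executed.

```python
def docker_run_with_fluentd(image, command=(), address="localhost:24224"):
    """Build a 'docker run' argv that routes container logs to Fluentd.

    Assumptions: a Fluentd forward input is reachable at `address`;
    image and command are placeholders chosen by the caller.
    """
    return [
        "docker", "run",
        "--log-driver=fluentd",
        "--log-opt", f"fluentd-address={address}",
        image,
        *command,
    ]


argv = docker_run_with_fluentd("ubuntu", ["echo", "Hello Fluentd"])
print(" ".join(argv))
# subprocess.run(argv, check=True) would launch it on a host with Docker installed.
```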

THANK YOU!

AND TREASURE DATA IS HIRING! WWW.TREASUREDATA.COM/CAREERS