Fluentd Overview, Now and Then

Fluentd Overview, Now and Then. Satoshi Tagomori (@tagomoris). Fluentd meetup in Matsue #fluentdmeetup

Transcript of Fluentd Overview, Now and Then

Page 1: Fluentd Overview, Now and Then

Fluentd Overview, Now and Then

Satoshi Tagomori (@tagomoris)

Fluentd meetup in Matsue #fluentdmeetup

Page 2: Fluentd Overview, Now and Then

Satoshi "Moris" Tagomori (@tagomoris)

Fluentd, MessagePack-Ruby, Norikra, ...

Treasure Data, Inc.

Page 3: Fluentd Overview, Now and Then

Fluentd overview

Page 4: Fluentd Overview, Now and Then

What’s Fluentd?

Simple core + Variety of plugins

Buffering, HA (failover), Secondary output, etc.

Like syslogd, in a streaming manner

AN EXTENSIBLE & RELIABLE DATA COLLECTION TOOL

Page 5: Fluentd Overview, Now and Then

Log collection with traditional logrotate + rsync

Log Server

Application

Server A

File File File

Hard to analyze!! Complex text parsers

Application

Server C

File File File

Application

Server B

File File File

High latency!! Must wait for a day

Page 6: Fluentd Overview, Now and Then

Streaming way with Fluentd

Log Server

Application

Server A

File File File

Application

Server C

File File File

Application

Server B

File File File

Low latency! Seconds or minutes

Easy to analyze!! Parsed and formatted

Page 7: Fluentd Overview, Now and Then

M x N problem for data integration

LOG

script to parse data

cron job for loading

filtering script

syslog script

Tweet-fetching script

aggregation script

aggregation script

script to parse data

rsync server

Page 8: Fluentd Overview, Now and Then

LOG

A solution: centralized log collection service

M + N

Page 9: Fluentd Overview, Now and Then

Fluentd Architecture

Page 10: Fluentd Overview, Now and Then

Internal Architecture (simplified)

Plugin

Input Filter Buffer Output

Plugin Plugin Plugin

Time: 2012-02-04 01:33:51

Tag: myapp.buylog

Record: { "user": "me", "path": "/buyItem", "price": 150, "referer": "/landing" }
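The time / tag / record triple above can be modeled as a small data structure. This is only an illustrative sketch in Python, not Fluentd's internal representation: the `Event` class name is hypothetical, and the slide's timestamp is assumed to be UTC.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Event:
    """Hypothetical model of one Fluentd event."""
    tag: str      # dot-separated routing key, e.g. "myapp.buylog"
    time: int     # Unix time in seconds
    record: dict  # JSON-like payload


# Assumption: the timestamp on the slide is interpreted as UTC.
ts = datetime(2012, 2, 4, 1, 33, 51, tzinfo=timezone.utc)

event = Event(
    tag="myapp.buylog",
    time=int(ts.timestamp()),
    record={"user": "me", "path": "/buyItem", "price": 150, "referer": "/landing"},
)
```

Keeping the tag and time outside the record is what lets routing and time slicing work without inspecting the payload.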

Page 11: Fluentd Overview, Now and Then

Architecture: Input Plugins

• HTTP+JSON (in_http)
• File tail (in_tail)
• Syslog (in_syslog)
• …

Receive logs

Or pull logs from data sources

In non-blocking manner

Plugin

Input

Page 12: Fluentd Overview, Now and Then

Filter

Architecture: Filter Plugins

Transform logs

Filter out unnecessary logs

Enrich logs

Plugin

• Encrypt personal data
• Convert IP to countries
• Parse User-Agent
• …
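The three filter roles on this slide (transform, filter out, enrich) can be sketched as a chain of plain functions. This is a simplified Python illustration, not the Fluentd plugin API; the function names, the "/health" path, and the "web-01" hostname are all made up for the example. Returning None to drop an event is an assumption chosen for the sketch.

```python
def drop_health_checks(tag, time, record):
    """Filter out unnecessary logs: return None to drop the event."""
    if record.get("path") == "/health":
        return None
    return record


def add_hostname(tag, time, record):
    """Enrich logs: return a copy of the record with an extra field."""
    enriched = dict(record)
    enriched["hostname"] = "web-01"  # hypothetical value for illustration
    return enriched


def apply_filters(filters, tag, time, record):
    """Run the record through each filter in order; stop if one drops it."""
    for f in filters:
        record = f(tag, time, record)
        if record is None:
            return None
    return record
```

For example, `apply_filters([drop_health_checks, add_hostname], "nginx.access", 0, {"path": "/buyItem"})` returns the record with a `hostname` field added, while a record whose path is "/health" is dropped.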

Page 13: Fluentd Overview, Now and Then

Buffer

Architecture: Buffer Plugins

Plugin

Improve performance

Provide reliability

Provide thread-safety

• Memory (buf_memory)
• File (buf_file)

Page 14: Fluentd Overview, Now and Then

Buffer

Architecture: Buffer Plugins

Chunk

Plugin

Improve performance

Provide reliability

Provide thread-safety

Input

Output

Chunk

Chunk

Page 15: Fluentd Overview, Now and Then

Architecture: Output Plugins

Output

Write or send event logs

Plugin

• File (out_file)
• Amazon S3 (out_s3)
• Kafka (out_kafka_buffered)
• …

Page 16: Fluentd Overview, Now and Then

Retry

Error

Retry

Batch

Stream Error

Retry

Retry

Divide & Conquer for retry

Page 17: Fluentd Overview, Now and Then

Divide & Conquer for recovery

Buffer (on-disk or in-memory)

Error

Overloaded!!

recovery

recovery + flow control

queued chunks

Page 18: Fluentd Overview, Now and Then

Example Use Cases

Page 19: Fluentd Overview, Now and Then

Streaming from Apache/Nginx to Elasticsearch

in_tail /var/log/access.log

/var/log/fluentd/buffer

buf_file

Page 20: Fluentd Overview, Now and Then

Error Handling and Recovery

in_tail /var/log/access.log

/var/log/fluentd/buffer

buf_file

• Buffering for any outputs
• Retrying automatically
• With exponential wait and persistence on a disk
• And secondary output
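Automatic retry with exponential wait and a secondary output can be sketched roughly as follows. This is a minimal Python illustration of the idea, not Fluentd's implementation; `flush_with_retry` and its parameters are hypothetical names, and real Fluentd also persists the chunk on disk between attempts.

```python
import time


def flush_with_retry(chunk, primary, secondary,
                     retry_limit=5, retry_wait=1.0, sleep=time.sleep):
    """Try the primary output; double the wait after each failure.

    After retry_limit failed attempts, hand the chunk to the secondary
    output so the data is not lost.
    """
    wait = retry_wait
    for _ in range(retry_limit):
        try:
            primary(chunk)
            return "primary"
        except Exception:
            sleep(wait)  # back off before the next attempt
            wait *= 2    # exponential wait: 1s, 2s, 4s, ...
    secondary(chunk)
    return "secondary"
```

The `sleep` argument is injectable only so that the waits can be observed without actually sleeping.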

Page 21: Fluentd Overview, Now and Then

Tailing & parsing files

Supported built-in formats:

• Read a log file
• Custom regexp
• Custom parser in Ruby

• apache • apache_error • apache2 • nginx

• json • csv • tsv • ltsv

• syslog • multiline • none

pos file

events.log

?(your app)
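To make "parsed and formatted" concrete, here is a rough Python sketch of what a parser for one of the listed formats (ltsv) does: it turns a raw line into a structured record. It is an illustration only, not the code Fluentd uses; `parse_ltsv` is a hypothetical name.

```python
def parse_ltsv(line):
    """Parse one LTSV line (tab-separated "label:value" fields) into a dict."""
    record = {}
    for field in line.rstrip("\n").split("\t"):
        if not field:
            continue
        # Split on the first colon only, so values may contain colons.
        label, _, value = field.partition(":")
        record[label] = value
    return record
```

Calling `parse_ltsv("host:127.0.0.1\tpath:/buyItem\tstatus:200\n")` yields a dict with the keys `host`, `path`, and `status`; all values stay strings in this sketch.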

Page 22: Fluentd Overview, Now and Then

Out to Multiple Locations

• Routing based on tags
• Copy to multiple storages

buffer

access.log

in_tail
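Tag-based routing with copying to several outputs can be sketched as below. This Python sketch covers only two wildcard forms, assuming `*` matches exactly one tag part and `**` matches zero or more parts; other pattern syntax is left out, and the function names and route layout are hypothetical.

```python
def _match(pat, parts):
    """Match a list of pattern parts against a list of tag parts."""
    if not pat:
        return not parts
    head, rest = pat[0], pat[1:]
    if head == "**":
        # "**" may consume zero or more tag parts.
        return any(_match(rest, parts[i:]) for i in range(len(parts) + 1))
    if not parts:
        return False
    if head == "*" or head == parts[0]:
        return _match(rest, parts[1:])
    return False


def tag_matches(pattern, tag):
    return _match(pattern.split("."), tag.split("."))


def route(routes, tag, record):
    """Deliver the record to every output of the first matching route."""
    for pattern, outputs in routes:
        if tag_matches(pattern, tag):
            for out in outputs:
                out.append((tag, dict(record)))  # copy to each destination
            return pattern
    return None
```

With `routes = [("nginx.*", [es, s3]), ("**", [fallback])]`, where `es`, `s3`, and `fallback` are plain lists standing in for outputs, an event tagged `nginx.access` lands in both `es` and `s3`, and anything else goes to `fallback`.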

Page 23: Fluentd Overview, Now and Then

Example configuration for real time batch combo

Page 24: Fluentd Overview, Now and Then

Data partitioning by time on HDFS / S3

access.log

buffer

Custom file formatter

Slice files based on time

2016-01-01/01/access.log.gz
2016-01-01/02/access.log.gz
2016-01-01/03/access.log.gz
…

in_tail
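Slicing files by time comes down to deriving the destination path from each event's timestamp. The Python sketch below shows that mapping for the hourly layout on this slide; `slice_path` is a hypothetical helper, and UTC is assumed for the time zone.

```python
from datetime import datetime, timezone


def slice_path(unix_time, path_format="%Y-%m-%d/%H/access.log.gz"):
    """Return the hourly partition path for an event time (UTC assumed)."""
    return datetime.fromtimestamp(unix_time, tz=timezone.utc).strftime(path_format)


def group_by_slice(events):
    """Group (unix_time, record) pairs by their destination path."""
    slices = {}
    for unix_time, record in events:
        slices.setdefault(slice_path(unix_time), []).append(record)
    return slices
```

Events from the same hour share one path, so each output file holds exactly one time slice.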

Page 25: Fluentd Overview, Now and Then

3rd party input plugins

dstat

df

AMQP

munin

jvmwatcher

SQL

Page 26: Fluentd Overview, Now and Then

3rd party output plugins

Graphite

Page 27: Fluentd Overview, Now and Then

Real World Use Cases

Page 28: Fluentd Overview, Now and Then

Microsoft

Operations Management Suite uses Fluentd: "The core of the agent uses an existing open source data aggregator called Fluentd. Fluentd has hundreds of existing plugins, which will make it really easy for you to add new data sources."

[Diagram: Linux computer with Syslog, Operating System, Apache, MySQL and Containers as data sources; omsconfig (DSC); PS DSC Providers; OMI Server (CIM Server); omsagent; Firewall / proxy; OMS Service; Upload Data (HTTPS); Pull configuration (HTTPS)]

Page 29: Fluentd Overview, Now and Then

Atlassian

"At Atlassian, we've been impressed by Fluentd and have chosen to use it in Atlassian Cloud's logging and analytics pipeline."

Kinesis

Elasticsearch cluster

Ingestion service

Page 30: Fluentd Overview, Now and Then

Amazon Web Services

The architecture of Fluentd (Sponsored by Treasure Data) is very similar to Apache Flume or Facebook’s Scribe. Fluentd is easier to install and maintain and has better documentation and support than Flume and Scribe.

Types of Data

• Transactional: database reads & writes (OLTP), cache
• Search: logs, streams
• File: log files (/var/log), log collectors & frameworks
• Stream: log records, sensors & IoT data

[Diagram: Collect and Store columns, with labels Web Apps, Mobile Apps, IoT, Applications, Logging, Database, Search, File Storage, Stream Storage]

Page 31: Fluentd Overview, Now and Then

Container and Logging

Page 32: Fluentd Overview, Now and Then

The Container Era

                      Server Era         Container Era
Service Architecture  Monolithic         Microservices
System Image          Mutable            Immutable
Managed By            Ops Team           DevOps Team
Local Data            Persistent         Ephemeral
Log Collection        syslogd / rsync    ?
Metrics Collection    Nagios / Zabbix    ?

Page 33: Fluentd Overview, Now and Then

The Container Era

                      Server Era         Container Era
Service Architecture  Monolithic         Microservices
System Image          Mutable            Immutable
Managed By            Ops Team           DevOps Team
Local Data            Persistent         Ephemeral
Log Collection        syslogd / rsync    ?
Metrics Collection    Nagios / Zabbix    ?

How should log & metrics collection be done in The Container Era?

Page 34: Fluentd Overview, Now and Then

Problems

Page 35: Fluentd Overview, Now and Then

The traditional logrotate + rsync on containers

Log Server

Application

Container A

File File File

Hard to analyze!! Complex text parsers

Application

Container C

File File File

Application

Container B

File File File

High latency!! Must wait for a day

Ephemeral!! Could be lost at any time

Page 36: Fluentd Overview, Now and Then

Server 1

Container A: Application

Container B: Application

Server 2

Container C: Application

Container D: Application

Kafka

elasticsearch

HDFS

Container

Container

Container

Container

Small & many containers make storages overloaded

Too many connections from micro containers!

Page 37: Fluentd Overview, Now and Then

Server 1

Container A: Application

Container B: Application

Server 2

Container C: Application

Container D: Application

Kafka

elasticsearch

HDFS

Container

Container

Container

Container

System images are immutable

Too many connections from micro containers!

Embedding destination IPs in ALL Docker images makes management hard

Page 38: Fluentd Overview, Now and Then

How to collect logs from Docker containers

Page 39: Fluentd Overview, Now and Then

Text logging with --log-driver=fluentd

Server

Container

App

Fluentd

STDOUT / STDERR

docker run \
  --log-driver=fluentd \
  --log-opt fluentd-address=localhost:24224

{ "container_id": "ad6d5d32576a", "container_name": "myapp", "source": "stdout" }

Page 40: Fluentd Overview, Now and Then

Metrics collection with fluent-logger

Server

Container

App

Fluentd

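fluent-logger library:

from fluent import sender
from fluent import event

# Events sent through this sender get the tag prefix 'app.events'
sender.setup('app.events', host='localhost')

# Emits one event with the tag app.events.purchase
event.Event('purchase', {
    'user_id': 21,
    'item_id': 321,
    'value': 1
})

Resulting event:

tag = app.events.purchase
{ "user_id": 21, "item_id": 321, "value": 1 }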

Page 41: Fluentd Overview, Now and Then

Shared data volume and tailing

Server

Container

App

Fluentd

<source>
  @type tail
  path /mnt/nginx/logs/access.log
  pos_file /var/log/fluentd/access.log.pos
  format nginx
  tag nginx.access
</source>

/mnt/nginx/logs

Page 42: Fluentd Overview, Now and Then

Logging methods for each purpose

• Collecting log messages

> --log-driver=fluentd

• Application metrics

> fluent-logger

• Access logs, logs from middleware

> Shared data volume

• System metrics (CPU usage, Disk capacity, etc.)

> Fluentd's input plugins (Fluentd pulls this data periodically)

Page 43: Fluentd Overview, Now and Then

Deployment Patterns

Page 44: Fluentd Overview, Now and Then

Server 1

Container A: Application

Container B: Application

Server 2

Container C: Application

Container D: Application

Kafka

elasticsearch

HDFS

Container

Container

Container

Container

Primitive deployment…

Too many connections from many containers!

Embedding destination IPs in ALL Docker images makes management hard

Page 45: Fluentd Overview, Now and Then

Server 1

Container A: Application

Container B: Application

Fluentd

Server 2

Container C: Application

Container D: Application

Fluentd

Kafka

elasticsearch

HDFS

Container

Container

Container

Container

The destination is always localhost from the app's point of view

Source aggregation decouples config from apps

Page 46: Fluentd Overview, Now and Then

Server 1

Container A: Application

Container B: Application

Fluentd

Server 2

Container C: Application

Container D: Application

Fluentd

active / standby / load balancing

Destination aggregation makes storages scalable for high traffic

Aggregation server(s)

Page 47: Fluentd Overview, Now and Then

Aggregation servers

• Logging directly from microservices makes log storages overloaded.
  > Too many RX connections
  > Too frequent import API calls
• Aggregation servers make the logging infrastructure more reliable and scalable.
  > Connection aggregation
  > Buffering for less frequent import API calls
  > Data persistency during downtime
  > Automatic retry at recovery from downtime

Page 48: Fluentd Overview, Now and Then

Fluentd ♡ Container

• Fluentd model fits container-based systems
  > This is why Treasure Data joined CNCF
  > TD wants to improve the cloud native ecosystem
• Fluentd, Prometheus, Docker and Kubernetes collaboration is good for modern systems
  • Easy to scale and easy to maintain
  • Fluentd logging driver in Docker
  • fluent-plugin-prometheus to send application metrics to Prometheus
  • EFK for log visualization in Kubernetes

Page 49: Fluentd Overview, Now and Then

Fluentd v0.14 and Later

Page 50: Fluentd Overview, Now and Then

v0.14

• v0.14.0: Released on May 31, 2016
• v0.14.1: Released on Jun 30, 2016
• New Features
  • New Plugin APIs, Plugin Helpers & Plugin Storage
  • Time with nanosecond resolution
  • ServerEngine-based Supervisor
  • Windows support

Page 51: Fluentd Overview, Now and Then

New Plugin APIs

• Input/Output plugin APIs w/ well-controlled lifecycle
  • stop, shutdown, close, terminate
• New Buffer API for delayed commit of chunks
  • parallel/async "commit" operation for chunks
• 100% compatible w/ v0.12 plugins
  • compatibility layer for traditional APIs
  • it will be supported throughout v1.x versions

Page 52: Fluentd Overview, Now and Then

v0.12 buffer design

[Diagram: Input / Filter plugins emit events (Tag, Time, Record) through the Router to an Output, which emits them into its Buffer. The Buffer holds chunks per key (key:foo, key:bar, key:baz), each limited by buffer_chunk_limit. Enqueue: when a chunk exceeds flush_interval or buffer_chunk_limit. The Queue of chunks is limited by buffer_queue_limit and is consumed by the Output.]

Key pattern:

• BufferedOutput: empty string or specified key
• ObjectBufferedOutput: tag
• TimeSlicedOutput: time slice
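The keyed chunk and queue behavior of the v0.12 buffer design can be approximated in a few lines of Python. This is a simplified sketch, not Fluentd's buffer code: `KeyedBuffer`, `QueueFull`, and the method names are hypothetical, `chunk_limit` and `queue_limit` only mirror the roles of buffer_chunk_limit and buffer_queue_limit, and the flush_interval timer is reduced to an explicit `flush_all()` call.

```python
from collections import deque


class QueueFull(Exception):
    """Raised when the queue already holds queue_limit chunks."""


class KeyedBuffer:
    def __init__(self, chunk_limit, queue_limit):
        self.chunk_limit = chunk_limit  # max bytes per chunk
        self.queue_limit = queue_limit  # max number of queued chunks
        self.open_chunks = {}           # key -> bytearray still being filled
        self.queue = deque()            # chunks waiting for the output

    def emit(self, key, data):
        """Append data to the open chunk for key, enqueueing it first if full."""
        chunk = self.open_chunks.setdefault(key, bytearray())
        if chunk and len(chunk) + len(data) > self.chunk_limit:
            self.enqueue(key)
            chunk = self.open_chunks.setdefault(key, bytearray())
        chunk.extend(data)

    def enqueue(self, key):
        """Move the open chunk for key to the queue."""
        if len(self.queue) >= self.queue_limit:
            raise QueueFull(key)  # the open chunk is left untouched
        chunk = self.open_chunks.pop(key, None)
        if chunk:
            self.queue.append((key, bytes(chunk)))

    def flush_all(self):
        """Stand-in for flush_interval elapsing: enqueue every open chunk."""
        for key in list(self.open_chunks):
            self.enqueue(key)
```

A full queue surfaces as an error to the caller here, which is one simple way to model the flow control mentioned earlier in the deck.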

Page 53: Fluentd Overview, Now and Then

v0.14 buffer design

Page 54: Fluentd Overview, Now and Then

Plugin Storage & Helpers

• Plugin Storage: new plugin type for plugins
  • provides key-value storage for plugins
  • to persist intermediate state of plugins
  • built-in plugins (planned): in-memory, local file
  • pluggable: 3rd party plugin to store data to Redis?
• Plugin Helpers:
  • collections of utility methods for plugins
  • making threads, sockets, network servers, ...
  • fully integrated with test drivers to run test code after the setup phase of helpers (e.g., after created threads have started)

Page 55: Fluentd Overview, Now and Then

v0.12 plugins

Parser, Input, Buffer, Output, Formatter, Filter

"output-ish" / "input-ish"

Page 56: Fluentd Overview, Now and Then

v0.14 plugins

Parser, Input, Buffer, Output, Formatter, Filter

"output-ish" / "input-ish"

Storage

Helper

Page 57: Fluentd Overview, Now and Then

Time with nanosecond

• For sub-second systems: Elasticsearch, InfluxData, etc.
• Fluent::EventTime
  • behaves as Integer (used as time in v0.12)
  • has methods to get sub-second resolution
  • is serialized into msgpack using an Ext type
• Fluentd core can handle both Integer and EventTime as time
  • compatible with older versions and software in the ecosystem (e.g., fluent-logger, Docker logging driver)
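The idea of a time value that still behaves as an integer but carries nanoseconds can be sketched in Python as an int subclass. This is an analogy to Fluent::EventTime (which is Ruby), not its actual code. The payload layout used below, 32-bit big-endian seconds followed by 32-bit big-endian nanoseconds, is my understanding of the Ext payload and should be treated as an assumption; the method names are hypothetical.

```python
import struct


class EventTime(int):
    """Integer seconds, plus a nanosecond part for sub-second resolution."""

    def __new__(cls, sec, nsec=0):
        obj = super().__new__(cls, sec)
        obj.nsec = nsec
        return obj

    def to_ext_payload(self):
        # Assumed layout: uint32 seconds + uint32 nanoseconds, big-endian.
        return struct.pack(">II", int(self), self.nsec)

    @classmethod
    def from_ext_payload(cls, payload):
        sec, nsec = struct.unpack(">II", payload)
        return cls(sec, nsec)
```

Because the class is an int, code written for integer timestamps keeps working, which mirrors the compatibility point on the slide.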

Page 58: Fluentd Overview, Now and Then

ServerEngine based Supervisor

• Replacing supervisor process with ServerEngine
  • it has SocketManager to share listening sockets between 2 or more worker processes
• Replacing Fluentd's processing model from fork to spawn
  • to support Windows environment

Page 59: Fluentd Overview, Now and Then

Windows support

• Fluentd and core plugins work on Windows
  • several companies have already used the v0.14.0.pre version in production
  • We will send a patch to popular plugins if they don't work on Windows
• Use HTTP RPC instead of signals

Page 60: Fluentd Overview, Now and Then

v0.14.x - v1

• v0.14.x (some versions in 2016)
  • Symmetric multi-core processing
  • Counter API
  • TLS/authentication/authorization support (merging secure forward)
  • https://github.com/fluent/fluentd/issues/1000
• v1 (4Q in 2016 or 1Q in 2017)
  • Stable version for new APIs / features
  • Fully compatible with v0.12
    • excluding v0 config syntax and detach_process

Page 61: Fluentd Overview, Now and Then

Symmetric multi-core processing

• 2 or more workers share a configuration file
  • and share listening sockets via PluginHelper
  • under a supervisor process (ServerEngine)
• Multi-core scalability for huge traffic
  • one input plugin for a TCP port, some filters and one (or some) output plugin
  • buffer paths are managed automatically by Fluentd core

Page 62: Fluentd Overview, Now and Then

[Diagram: Supervisor and Worker process layouts, comparing "Using fluent-plugin-multiprocess" with "v0.14"]

Page 63: Fluentd Overview, Now and Then

Counter API

• APIs to increment/decrement values
  • shared by some processes
  • persisted on disk, backed by Storage API
• Useful for collecting metrics or stats filters

Page 64: Fluentd Overview, Now and Then

TLS/Authn/Authz support for forward plugin

• secure-forward will be merged into built-in forward
  • TLS w/ at-least-once semantics
  • Simple authentication/authorization w/ non-SSL forwarding
• Authentication and Authorization providers
  • Who can connect to input plugins? What tags are permitted for clients?
  • New plugin types (3rd party authors can write them)
  • Mainly for in/out forward, but available from others

Page 65: Fluentd Overview, Now and Then

Benchmark (1 CPU usage)

100,000 msgs/sec                   v0.14   v0.12
in_tail (none) + out_forward       70%     66%
in_forward + flowcounter_simple    11%     11%
in_forward + tdlog                 43%     38%

※ Measured on EC2 c3.8xlarge
※ Not fully optimized yet

Page 66: Fluentd Overview, Now and Then

Treasure Agent 3.0 (td-agent 3)

• fluentd v0.14

• Ruby 2.3 and latest core components

• Environments
  • Add msi Windows package
  • Remove CentOS 5, Ubuntu 10.04 support
• Release date is not fixed…

Page 67: Fluentd Overview, Now and Then

Enjoy logging!

Page 68: Fluentd Overview, Now and Then
Page 69: Fluentd Overview, Now and Then

H.A. configuration (high availability)

• Retry automatically
• Exponential retry wait
• Persistent on a disk

buffer

• Automatic fail-over
• Load balancing

access.log

in_tail