Fluentd v0.14 Plugin API Details

Fluentd v0.14 Plugin API Details Fluentd meetup 2016 Summer Jun 1, 2016 Satoshi "Moris" Tagomori (@tagomoris)

Transcript of Fluentd v0.14 Plugin API Details

Page 1: Fluentd v0.14 Plugin API Details

Fluentd v0.14 Plugin API Details
Fluentd meetup 2016 Summer
Jun 1, 2016
Satoshi "Moris" Tagomori (@tagomoris)

Page 2: Fluentd v0.14 Plugin API Details

Satoshi "Moris" Tagomori (@tagomoris)

Fluentd, MessagePack-Ruby, Norikra, ...

Treasure Data, Inc.

Page 3: Fluentd v0.14 Plugin API Details

Topics

• Why Fluentd v0.14 has a new API set for plugins

• Compatibility of v0.12 plugins/configurations

• Plugin APIs: Input, Filter, Output & Buffer

• Storage Plugin, Plugin Helpers

• New Test Drivers for plugins

• Plans for v0.14.x & v1

Page 4: Fluentd v0.14 Plugin API Details

Why Fluentd v0.14 has a New API set for plugins?

Page 5: Fluentd v0.14 Plugin API Details

Fluentd v0.12 Plugins

• No support from Fluentd core for writing plugins
  • plugins create their own threads, sockets, timers and event loops
  • writing tests is very hard and messy with sleeps
• Fragmented implementations
  • Output, BufferedOutput, ObjectBufferedOutput and TimeSlicedOutput
• Mixture of configuration parameters from output & buffer
• Uncontrolled plugin instance lifecycle (no "super" in start/shutdown)
• Imperfect buffering control and useless configurations
  • the reason why fluent-plugin-forest exists and is widely used

Page 6: Fluentd v0.14 Plugin API Details

Fluentd v0.12 Plugins

• Insufficient buffer chunking control
  • only by size, without number of events in chunks
• Forced synchronous buffer flushing
  • no way to flush-and-commit chunks asynchronously
• Ultimate freedom for using mix-ins
  • everything overrides Plugin#emit ... (the only entry point for events to plugins)
  • no valid hook points to get metrics or anything else
• Bad Ruby coding rules and practices
  • too many classes at "Fluent::*" in fluent/plugin, no "require", ...

Page 7: Fluentd v0.14 Plugin API Details

And many others!

Page 8: Fluentd v0.14 Plugin API Details

Compatibility of v0.12 plugins/configurations

Page 9: Fluentd v0.14 Plugin API Details

Compatibility of plugins

• v0.12 plugins are subclasses of Fluent::*
  • Fluent::Input, Fluent::Filter, Fluent::Output, ...
• Compatibility layers for v0.12 plugins in v0.14
  • Fluent::Compat::Klass -> Fluent::Klass (e.g., Input, Output, ...)
  • it provides transformation of:
    • namespaces, configuration parameters
    • internal APIs, argument objects
• IT SHOULD WORK, except for :P
  • 3rd party buffer plugins, part of test code
  • "Engine.emit"

Page 10: Fluentd v0.14 Plugin API Details

Compatibility of configurations

• v0.14 plugins have another set of parameters
  • many old-fashioned parameters are removed
  • "buffer_type", "num_threads", "timezone", "time_slice_format", "buffer_chunk_limit", "buffer_queue_limit", ...
• Plugin helper "compat_parameters"
  • transforms parameters between v0.12 style configuration and v0.14 plugin

(Diagram: v0.12 -> convert internally -> v0.14)

Page 11: Fluentd v0.14 Plugin API Details

FAQ: Can we create plugins like this?

• it uses v0.14 API
• it runs on Fluentd v0.12

Impossible :P

Page 12: Fluentd v0.14 Plugin API Details

Overview of v0.14 Plugin classes

Page 13: Fluentd v0.14 Plugin API Details

v0.14 plugin classes

• All files MUST be in `fluent/plugin/*.rb` (in gems)
  • or just a "*.rb" file in directory specified by "-r"

• All classes MUST be under Fluent::Plugin

• All plugins MUST be subclasses of Fluent::Plugin::Base

• All plugins MUST call `super` in methods overriding default implementation (e.g., #configure, #start, #shutdown, ...)
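
A minimal sketch of why the `super` rule matters. `SketchBase` is a hypothetical stand-in, not Fluentd's `Fluent::Plugin::Base`; it only mimics a base class that tracks lifecycle state, which is exactly what a subclass bypasses when it forgets `super`.

```ruby
# Hypothetical stand-in for a plugin base class (NOT Fluent::Plugin::Base).
class SketchBase
  def initialize
    @configured = false
    @started = false
  end

  def configure(conf)
    @conf = conf
    @configured = true
  end

  def start
    raise "configure must run before start" unless @configured
    @started = true
  end

  def shutdown
    @started = false
  end

  def started?
    @started
  end
end

# A well-behaved plugin: every overridden lifecycle method calls super.
class GoodPlugin < SketchBase
  attr_reader :path

  def configure(conf)
    super
    @path = conf.fetch("path")
  end

  def start
    super
    @opened = true
  end
end

# A plugin that forgets super: the base class never learns that it started.
class ForgetfulPlugin < SketchBase
  def start
    @opened = true
  end
end
```

With these stand-ins, `GoodPlugin#started?` is true after `configure` and `start`, while `ForgetfulPlugin#started?` stays false, so anything the core does based on lifecycle state would silently skip it.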

Page 14: Fluentd v0.14 Plugin API Details

Classes hierarchy (v0.12)

• Fluent::Input
• F::Filter
• F::Output
  • BufferedOutput
    • ObjectBufferedOutput
    • TimeSlicedOutput
  • MultiOutput
• F::Buffer
• F::Parser
• F::Formatter

(Diagram label: 3rd party plugins)

Page 15: Fluentd v0.14 Plugin API Details

Classes hierarchy (v0.14)

• Fluent::Plugin::Base
  • F::P::Input
  • F::P::Filter
  • F::P::Output (both of buffered/non-buffered)
  • F::P::Buffer
  • F::P::Parser
  • F::P::Formatter
  • F::P::Storage
  • F::P::BareOutput (not for 3rd party plugins)
  • F::P::MultiOutput (copy, roundrobin)

Page 16: Fluentd v0.14 Plugin API Details

Tour of New Plugin APIs:

Fluent::Plugin::Input

Page 17: Fluentd v0.14 Plugin API Details

Fluent::Plugin::Input

• Nothing changed :)
  • except for overall rules
• But it's much easier to write plugins than in v0.12 :)
  • fetch HTTP resource per specified interval
  • parse response body with format specified in config
  • emit parse result

Page 18: Fluentd v0.14 Plugin API Details

Fluent::Plugin::Input

Page 19: Fluentd v0.14 Plugin API Details

Tour of New Plugin APIs:

Fluent::Plugin::Filter

Page 20: Fluentd v0.14 Plugin API Details

Fluent::Plugin::Filter

• Almost nothing changed :)

• Required: #filter(tag, time, record) #=> record | nil

• Optional: #filter_stream(tag, es) #=> event_stream
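
The contract of the two methods can be sketched in plain Ruby. `SketchFilter` is a hypothetical stand-in, not `Fluent::Plugin::Filter`, and the event stream is modelled as a simple array of `[time, record]` pairs, which is an assumption made for illustration only.

```ruby
# Hypothetical stand-in for a filter base class (NOT Fluent::Plugin::Filter).
class SketchFilter
  # Required in subclasses: return the (possibly modified) record,
  # or nil to drop the event.
  def filter(tag, time, record)
    raise NotImplementedError
  end

  # Optional override: this default walks the stream and keeps non-nil results.
  # An event stream is modelled here as an array of [time, record] pairs.
  def filter_stream(tag, es)
    es.each_with_object([]) do |(time, record), out|
      filtered = filter(tag, time, record)
      out << [time, filtered] unless filtered.nil?
    end
  end
end

# Example filter: drops debug records and adds a host field to the rest.
class DropDebug < SketchFilter
  def filter(tag, time, record)
    return nil if record["level"] == "debug"
    record.merge("host" => "web01")
  end
end
```

A plugin that only needs per-record logic implements `#filter`; overriding `#filter_stream` makes sense when the whole stream can be handled more cheaply at once.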

Page 21: Fluentd v0.14 Plugin API Details

Tour of New Plugin APIs:

Fluent::Plugin::Output

Page 22: Fluentd v0.14 Plugin API Details

Fluent::Plugin::Output

• Many things changed!
• Merged Output, BufferedOutput, ObjectBufferedOutput, TimeSlicedOutput
• Output plugins can be
  • with buffering
  • without buffering
  • both (buffering or not, decided by configuration)
• Buffers chunk events by:
  • byte size, interval, tag
  • number of records (new!)
  • time (by any unit (new!): 30s, 5m, 15m, 3h, ...)
  • any specified field in records (new!)
  • any combination of above (new!)

Page 23: Fluentd v0.14 Plugin API Details

Variations of buffering

NO MORE forest plugin!

Page 24: Fluentd v0.14 Plugin API Details

Output Plugin: Methods to be implemented

• Non-buffered: #process(tag, es)
• Buffered synchronous: #write(chunk)
• Buffered asynchronous: #try_write(chunk)
  • New feature for destinations with huge latency to write chunks
  • Plugins must call #commit_write(chunk_id) (otherwise, #try_write will be retried)
• Buffered w/ custom format: #format(tag, time, record)
  • Without this method, output uses standard format

Page 25: Fluentd v0.14 Plugin API Details

(Flowchart: how the mode of an output plugin is decided)

• Decision nodes:
  • implements #process or #write or #try_write? (NO: error)
  • <buffer> section exists?
  • implements #write or #try_write?
  • #prefer_buffered_processing called (default true)
  • implements #write and #try_write?
  • #prefer_delayed_commit called (default true)
  • implements #try_write?
• Outcomes: error, non-buffered, sync buffered, async buffered

Page 26: Fluentd v0.14 Plugin API Details

In other words :P

• If users configure a "<buffer>" section
  • plugin tries to do buffering
• Else if plugin implements both (buffering/non-buffering)
  • plugin calls #prefer_buffered_processing to decide
• Else plugin does as implemented

• When plugin does buffering
  • If plugin implements both (sync/async write)
    • plugin calls #prefer_delayed_commit to decide
  • Else plugin does as implemented
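
The rules above can be written down as one small function. This is a sketch of the decision as described on these slides, not Fluentd's actual code; `output_mode` and its argument names are made up for illustration.

```ruby
# Decide how an output plugin runs. All names here are illustrative.
#   implemented:     methods the plugin defines, e.g. [:process, :write]
#   buffer_section:  true if the user configured a <buffer> section
#   prefer_buffered: what #prefer_buffered_processing would return
#   prefer_delayed:  what #prefer_delayed_commit would return
def output_mode(implemented, buffer_section:, prefer_buffered: true, prefer_delayed: true)
  can_process = implemented.include?(:process)
  can_sync    = implemented.include?(:write)
  can_async   = implemented.include?(:try_write)
  can_buffer  = can_sync || can_async

  raise ArgumentError, "no output method implemented" unless can_process || can_buffer

  buffered =
    if buffer_section
      raise ArgumentError, "<buffer> given but plugin cannot buffer" unless can_buffer
      true
    elsif can_process && can_buffer
      prefer_buffered   # plugin implements both: ask the plugin
    else
      can_buffer        # plugin does as implemented
    end

  return :non_buffered unless buffered

  # When buffering: ask the plugin only if it implements both write styles.
  delayed = can_sync && can_async ? prefer_delayed : can_async
  delayed ? :async_buffered : :sync_buffered
end
```

For example, `output_mode([:process, :write], buffer_section: false, prefer_buffered: false)` gives `:non_buffered`, while the same plugin with a `<buffer>` section gives `:sync_buffered`.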

Page 27: Fluentd v0.14 Plugin API Details

Delayed commit (1)

• high latency #write operations lock a flush thread for a long time (e.g., ACK in forward)

(Diagram: the output plugin's #write sends data to a destination w/ high latency and returns only after the destination sends its ACK; a flush thread stays locked in between.)

Page 28: Fluentd v0.14 Plugin API Details

Delayed commit (2)

• #try_write & delayed #commit_write

(Diagram: the output plugin's #try_write sends data to a destination w/ high latency and returns; an async check thread receives the ACK and calls #commit_write.)

Page 29: Fluentd v0.14 Plugin API Details

Use cases: delayed commit

• Forward protocol w/ ACK
• Distributed file systems or databases
  • put data -> confirm to read data -> commit
• Submit tasks to job queues
  • submit a job -> detect executed -> commit
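
The handshake behind these use cases can be sketched without threads or a real destination. `DelayedCommitSketch` is a hypothetical stand-in, not `Fluent::Plugin::Output`; only the method names `try_write` and `commit_write` come from the slides, and `check_acks` is an invented name for whatever the async check thread would do.

```ruby
# Hypothetical stand-in (NOT Fluent::Plugin::Output): models only the
# try_write / commit_write handshake, without threads or a real destination.
class DelayedCommitSketch
  attr_reader :committed

  def initialize
    @waiting = {}     # chunk_id => payload sent but not yet acknowledged
    @committed = []   # chunk ids whose write has been confirmed
  end

  # Returns right after sending, so a flush thread is not blocked on the ACK.
  def try_write(chunk_id, payload)
    @waiting[chunk_id] = payload
    nil
  end

  # Called later (for example from a checker thread) with the ids
  # that the destination has acknowledged.
  def check_acks(acked_ids)
    acked_ids.each do |id|
      commit_write(id) if @waiting.key?(id)
    end
  end

  def commit_write(chunk_id)
    @waiting.delete(chunk_id)
    @committed << chunk_id
  end

  # Chunks still listed here are the ones that would be retried on timeout.
  def pending
    @waiting.keys
  end
end
```

The same shape fits all three use cases: "send" becomes put data or submit a job, and the check becomes confirming a read or detecting that the job ran.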

Page 30: Fluentd v0.14 Plugin API Details

Standard chunk format

• Buffering w/o #format method
  • Almost the same as ObjectBufferedOutput
• No need to always implement #format
  • Implement it for performance/low-latency
• Tool to dump & read buffer chunks on disk w/ standard format
  • To be implemented in v0.14.x :)

Page 31: Fluentd v0.14 Plugin API Details

<buffer CHUNK_KEYS>

• comma-separated tag, time or ANY_KEYS
• Nothing specified: all events are in same chunk
  • flushed when chunk is full
  • (optional) "flush_interval" after first event in chunk
• tag: events w/ same tag are in same chunks
• time: buffer chunks will be split by timekey
  • timekey: unit of time to be chunked (1m, 15m, 3h, ...)
  • flushed after expiration of timekey unit + timekey_wait
• ANY_KEYS: any key names in records
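
A small sketch of how chunk keys split events, assuming events are `[tag, unix_time, record]` triples and `timekey` is given in seconds. The function names are invented for illustration and the real buffer implementation is not shown in the slides.

```ruby
# Compute the chunk an event belongs to, following the chunk key rules above.
# chunk_keys is e.g. [], ["tag"], ["tag", "time"] or ["user"].
def chunk_id_for(tag, time, record, chunk_keys, timekey: 3600)
  chunk_keys.map do |key|
    case key
    when "tag"  then tag
    when "time" then time - (time % timekey)  # start of the timekey window
    else record[key]                         # any key name in the record
    end
  end
end

# Group events ([tag, unix_time, record]) into chunks by their chunk id.
def split_into_chunks(events, chunk_keys, timekey: 3600)
  events.group_by do |tag, time, record|
    chunk_id_for(tag, time, record, chunk_keys, timekey: timekey)
  end
end

# A time-keyed chunk is flushed after its window expires plus timekey_wait.
def flush_time(window_start, timekey:, timekey_wait:)
  window_start + timekey + timekey_wait
end
```

With no chunk keys every event maps to the same id, so there is a single chunk; with `["tag", "time"]` events are separated by tag and by time window, which is the combination the forest plugin was typically used to emulate.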

Page 32: Fluentd v0.14 Plugin API Details

<buffer CHUNK_KEYS>

• comma-separated tag, time or ANY_KEYS
• Nothing specified: all events are in same chunk (BufferedOutput in v0.12)
  • flushed when chunk is full
  • (optional) "flush_interval" after first event in chunk
• tag: events w/ same tag are in same chunks (ObjectBufferedOutput in v0.12)
• time: buffer chunks will be split by timekey (TimeSlicedOutput in v0.12)
  • timekey: unit of time to be chunked (1m, 15m, 3h, ...)
  • flushed after expiration of timekey unit + timekey_wait
• ANY_KEYS: any key names in records

Page 33: Fluentd v0.14 Plugin API Details

configurations: flushing buffers

• flush_mode: lazy, interval, immediate
  • default: lazy if "time" specified, otherwise interval
• flush_interval, flush_thread_count
  • flush_thread_count: number of threads for flushing
• delayed_commit_timeout
  • output plugin will retry #try_write when it expires

Page 34: Fluentd v0.14 Plugin API Details

Retries, Secondary

• Explicit timeout for retries:
  • retry_timeout: time limit after which no more retries are made
  • retry_max_times: how many times to retry
• retry_type: "periodic" w/ fixed retry_wait
• retry_secondary_threshold (percentage)
  • output will use secondary if the specified percentage of retry_timeout has elapsed after the first error
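
A worked sketch of the arithmetic, under stated assumptions: `retry_target` is an invented name, the threshold is expressed as a fraction (0.8 meaning 80%), and the 0.8 default is an assumption for the example rather than something the slides state.

```ruby
# Which destination a retry would go to, given the seconds elapsed since the
# first error and the number of attempts so far. Simplified sketch only.
# retry_secondary_threshold is a fraction of retry_timeout (assumed 0.8 here).
def retry_target(elapsed, attempts, retry_timeout:, retry_max_times: nil,
                 retry_secondary_threshold: 0.8, has_secondary: true)
  return :give_up if elapsed >= retry_timeout
  return :give_up if retry_max_times && attempts >= retry_max_times

  switch_at = retry_timeout * retry_secondary_threshold
  has_secondary && elapsed >= switch_at ? :secondary : :primary
end
```

With `retry_timeout: 100` and the threshold at 0.8, retries go to the primary output until 80 seconds after the first error, to the secondary from 80 up to 100 seconds, and stop at 100 seconds.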

Page 35: Fluentd v0.14 Plugin API Details

Buffer parameters

• chunk_limit_size
  • maximum bytesize per chunk
• chunk_records_limit (default: not specified)
  • maximum number of records per chunk
• total_limit_size
  • maximum bytesize which a buffer plugin can use
• (optional) queue_length_limit: no need to specify
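
How the three limits interact can be sketched with a toy buffer. `BufferSketch` is a hypothetical stand-in, not `Fluent::Plugin::Buffer`; only the parameter names come from the slide, and raising an `Overflow` error when `total_limit_size` is reached is an assumption about behavior made for this example.

```ruby
# Hypothetical stand-in for a buffer (NOT Fluent::Plugin::Buffer).
class BufferSketch
  class Overflow < StandardError; end

  attr_reader :chunks

  def initialize(chunk_limit_size:, total_limit_size:, chunk_records_limit: nil)
    @chunk_limit_size = chunk_limit_size
    @total_limit_size = total_limit_size
    @chunk_records_limit = chunk_records_limit
    @chunks = [[]]  # each chunk is an array of serialized records
  end

  def total_bytes
    @chunks.flatten.sum(&:bytesize)
  end

  # Appends one serialized record, opening a new chunk when the current
  # one would exceed its byte limit or has reached its record limit.
  def append(data)
    if total_bytes + data.bytesize > @total_limit_size
      raise Overflow, "total_limit_size exceeded"
    end
    @chunks << [] if full?(@chunks.last, data)
    @chunks.last << data
  end

  private

  def full?(chunk, data)
    return false if chunk.empty?
    too_big  = chunk.sum(&:bytesize) + data.bytesize > @chunk_limit_size
    too_many = @chunk_records_limit && chunk.size >= @chunk_records_limit
    too_big || too_many
  end
end
```

The record limit is what v0.12 lacked: two tiny records can now close a chunk even though its byte size is far below `chunk_limit_size`.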

Page 36: Fluentd v0.14 Plugin API Details

Chunk metadata

• Stores various information of buffer chunks
  • key-values of chunking unit
  • number of records
  • created_at, modified_at
• `chunk.metadata`
  • extract_placeholders(@path, chunk.metadata)
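
To illustrate what filling a path from chunk metadata could look like, here is a stand-in written from scratch. `expand_placeholders` is not Fluentd's `extract_placeholders`; the `${tag}` and `${key}` syntax, the strftime handling and the metadata hash layout are all assumptions chosen for this sketch.

```ruby
# Hypothetical stand-in for placeholder extraction (NOT Fluentd's code).
# metadata: { tag: String or nil, timekey: Integer unix time or nil,
#             variables: Hash of chunk key name => value }
def expand_placeholders(template, metadata)
  result = template.dup
  # strftime-style parts are filled from the chunk's time window (UTC here).
  result = Time.at(metadata[:timekey]).utc.strftime(result) if metadata[:timekey]
  result = result.gsub("${tag}", metadata[:tag]) if metadata[:tag]
  (metadata[:variables] || {}).each do |key, value|
    result = result.gsub("${#{key}}", value.to_s)
  end
  result
end
```

Because every chunk carries the values of its own chunk keys, one output definition can write each chunk to a different path, which is the job that previously required one plugin instance per tag.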

Page 37: Fluentd v0.14 Plugin API Details

Tour of New Plugin APIs:

Other plugin types

Page 38: Fluentd v0.14 Plugin API Details

Classes hierarchy (v0.14)

• Fluent::Plugin::Base
  • F::P::Input
  • F::P::Filter
  • F::P::Output (both of buffered/non-buffered)
  • F::P::Buffer
  • F::P::Parser
  • F::P::Formatter
  • F::P::Storage
  • F::P::BareOutput (not for 3rd party plugins)
  • F::P::MultiOutput (copy, roundrobin)

Page 39: Fluentd v0.14 Plugin API Details

Classes hierarchy (v0.14)

• Fluent::Plugin::Base
  • F::P::Input
  • F::P::Filter
  • F::P::Output (both of buffered/non-buffered)
  • F::P::Buffer ("Owned" plugin)
  • F::P::Parser ("Owned" plugin)
  • F::P::Formatter ("Owned" plugin)
  • F::P::Storage ("Owned" plugin)
  • F::P::BareOutput (not for 3rd party plugins)
  • F::P::MultiOutput (copy, roundrobin)

Page 40: Fluentd v0.14 Plugin API Details

"Owned" plugins

• Primary plugins: Input, Output, Filter
  • Instantiated by Fluentd core
• "Owned" plugins are owned by primary plugins
  • Buffer, Parser, Formatter, Storage, ...
  • They can refer to the owner's plugin id, logger, ...
  • Fluent::Plugin.new_xxx("kind", parent: @input)
• "Owned" plugins can be configured by owner plugins

Page 41: Fluentd v0.14 Plugin API Details

Owner plugins can control defaults of owned plugins

Fluentd provides a standard way to configure owned plugins

Page 42: Fluentd v0.14 Plugin API Details

Tour of New Plugin APIs:

Fluent::Plugin::Storage

Page 43: Fluentd v0.14 Plugin API Details

Storage plugins

• Pluggable Key-Value store for plugins
  • configurable: autosave, persistent, save_at_shutdown
  • get, fetch, put, delete, update (transactional)
• Various possible implementations
  • built-in: local (json) on-disk / on-memory
  • possible: Redis, Consul, or whatever supports serialize/deserialize of json-like objects
• To store states of plugins:
  • counter values of data-counter plugin
  • pos data of file plugin
• To load configuration dynamically for plugins:
  • load configurations from any file systems
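
A minimal sketch of a local JSON key-value store with the five operations named above. `LocalJsonStorageSketch` is a hypothetical stand-in, not `Fluent::Plugin::Storage`; here `persistent` simply means "write the file on every change", and "transactional" is modelled as a read-modify-write under one lock. Both are simplifying assumptions.

```ruby
require "json"
require "tmpdir"

# Hypothetical stand-in for a local JSON storage (NOT Fluent::Plugin::Storage).
class LocalJsonStorageSketch
  def initialize(path, persistent: true)
    @path = path
    @persistent = persistent
    @store = File.exist?(path) ? JSON.parse(File.read(path)) : {}
    @mutex = Mutex.new
  end

  def get(key)
    @mutex.synchronize { @store[key.to_s] }
  end

  def fetch(key, default)
    @mutex.synchronize { @store.fetch(key.to_s, default) }
  end

  def put(key, value)
    @mutex.synchronize do
      @store[key.to_s] = value
      save
      value
    end
  end

  def delete(key)
    @mutex.synchronize do
      removed = @store.delete(key.to_s)
      save
      removed
    end
  end

  # Read-modify-write under one lock, so concurrent updates do not interleave.
  def update(key)
    @mutex.synchronize do
      @store[key.to_s] = yield(@store[key.to_s])
      save
      @store[key.to_s]
    end
  end

  private

  # Writes the whole store as JSON when persistence is enabled.
  def save
    File.write(@path, JSON.generate(@store)) if @persistent
  end
end
```

A counter plugin could keep its totals with `update("count") { |v| (v || 0) + 1 }` and find them again after a restart, which is the "store states of plugins" use case.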

Page 44: Fluentd v0.14 Plugin API Details

Tour of New Plugin APIs:

Plugin Helpers

Page 45: Fluentd v0.14 Plugin API Details

Plugin Helpers

• No more mixin!
  • declare to use helpers by "helpers :name"
• Utility functions to support difficult things
  • creating threads, timers, child processes...
  • created timers will be stopped automatically in plugin's shutdown sequence
• Integrated w/ New Test Drivers
  • tests run after helpers started everything requested

Page 46: Fluentd v0.14 Plugin API Details

Plugin Helpers Example

• Thread: thread_create, thread_current_running?
• Timer: timer_execute
• ChildProcess: child_process_execute
  • command, arguments, subprocess_name, interval, immediate, parallel, mode, stderr, env, unsetenv, chdir, ...
• EventEmitter: router (Output doesn't have router in v0.14 default)
• Storage: storage_create
• (TBD) Socket/Server for TCP/UDP/TLS, Parser, Formatter
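
The idea of declaring helpers and getting cleanup for free can be sketched in plain Ruby. `TimerHelperSketch` and `PluginSketch` are hypothetical stand-ins, not Fluentd's helper modules; only the names `helpers` and `timer_execute` are taken from the slides, and the timer here is just recorded, not actually run.

```ruby
# Hypothetical stand-ins (NOT Fluentd's helper modules).
module TimerHelperSketch
  # Registers a timer; a real helper would also schedule the callback.
  def timer_execute(name, interval, &block)
    (@timers ||= []) << { name: name, interval: interval, callback: block, running: true }
  end

  def running_timers
    (@timers || []).select { |t| t[:running] }.map { |t| t[:name] }
  end

  # Runs as part of the shutdown sequence, so plugins need no cleanup code.
  def shutdown
    (@timers || []).each { |t| t[:running] = false }
    super
  end
end

class PluginSketch
  HELPERS = { timer: TimerHelperSketch }.freeze

  # Class-level declaration, used as `helpers :timer` in a plugin class.
  def self.helpers(*names)
    names.each { |n| include HELPERS.fetch(n) }
  end

  def start; end

  def shutdown; end
end

class PollingInput < PluginSketch
  helpers :timer

  def start
    super
    timer_execute(:poll, 30) { :fetch_something }
  end
end
```

Because the helper module sits between the plugin and the base class in the method lookup chain, its `shutdown` runs whenever the plugin is shut down, and it relies on `super` just as the plugin rules require.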

Page 47: Fluentd v0.14 Plugin API Details
Page 48: Fluentd v0.14 Plugin API Details

Tour of New Plugin APIs:

New Test Drivers

Page 49: Fluentd v0.14 Plugin API Details

New Test Drivers

• Instead of old drivers Fluent::Test::*TestDriver
• Fluent::Test::Driver::Input, Output or Filter
  • fully emulates actual plugin behavior
  • w/ override SystemConfig
  • capturing emitted events & error event streams
  • inserting TestLogger to capture/test logs of plugins
  • capturing "format" result of output plugins
  • controlling "flush" timing of output plugins
• Running tests under control
  • Plugin Helper integration
  • conditions to keep/break running tests
  • timeouts, number of emits/events to stop tests
  • automatic start/shutdown call for plugins
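
What "running tests under control" means can be sketched with a toy driver. `DriverSketch` and `CounterInput` are hypothetical stand-ins, not `Fluent::Test::Driver::*`; the `poll` method and the argument names are invented for this example. It shows the three ideas from the list: capture emitted events, stop on an emit count or a timeout, and call start/shutdown automatically.

```ruby
# Hypothetical stand-in for a test driver (NOT Fluent::Test::Driver::*).
class DriverSketch
  attr_reader :events

  def initialize(plugin)
    @plugin = plugin
    @events = []
  end

  # Starts the plugin, lets it emit, stops once enough events were captured
  # or the timeout passed, and always shuts the plugin down.
  def run(expect_emits: 1, timeout: 1.0)
    @plugin.start
    deadline = Time.now + timeout
    until @events.size >= expect_emits || Time.now > deadline
      @plugin.poll { |tag, time, record| @events << [tag, time, record] }
    end
  ensure
    @plugin.shutdown
  end
end

# Toy input plugin that emits an increasing counter on every poll.
class CounterInput
  attr_reader :state

  def initialize
    @n = 0
    @state = :created
  end

  def start
    @state = :started
  end

  def shutdown
    @state = :shutdown
  end

  def poll
    @n += 1
    yield "test.counter", @n, { "count" => @n }
  end
end
```

No `sleep` is needed in the test itself: the driver owns the stop condition, which is the main complaint about v0.12 tests that this design addresses.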

Page 50: Fluentd v0.14 Plugin API Details

Plans for v0.14.x

Page 51: Fluentd v0.14 Plugin API Details

New Features

• Symmetric multi processing
  • to use 2 or more CPU cores!
  • by sharing a configuration between all processes
  • "detach_process" will be deprecated
• forward: TLS + authentication/authorization support
  • secure-forward integration

• Buffer supports compression & forward it

• Plugin generator & template

Page 52: Fluentd v0.14 Plugin API Details

New APIs

• Controlling global configuration from SystemConfig
  • configured via <system> tag
  • root buffer path + plugin id: remove paths from each buffer
  • process total buffer size control
• Counter APIs
  • counting everything over processes via RPC
  • creating metrics for a whole fluentd cluster

Page 53: Fluentd v0.14 Plugin API Details

For v1

Page 54: Fluentd v0.14 Plugin API Details

v1: stable version of v0.14

• v0.12 plugins will still be supported at v1.0.0
  • deprecated, and will be obsoleted at v1.x
• Will be obsoleted:
  • v0 (traditional) configuration syntax
  • "detach_process" feature

• Q4 2016?

Page 55: Fluentd v0.14 Plugin API Details

To Be Written by me :-)

• As soooooooooon as possible...
• Plugin developers' guide for
  • Updating v0.12 plugins with v0.14 APIs
  • Writing plugins with v0.14 APIs
  • Writing tests of plugins with v0.14 APIs
• Users' guide for
  • How to use buffering in general (w/ <buffer>)
  • Updated plugin documents

Page 56: Fluentd v0.14 Plugin API Details

Enjoy logging!