Fluentd v0.14 Plugin API Details
Fluentd meetup 2016 Summer
Jun 1, 2016
Satoshi "Moris" Tagomori (@tagomoris)
Fluentd, MessagePack-Ruby, Norikra, ...
Treasure Data, Inc.
Topics
• Why Fluentd v0.14 has a new API set for plugins
• Compatibility of v0.12 plugins/configurations
• Plugin APIs: Input, Filter, Output & Buffer
• Storage Plugin, Plugin Helpers
• New Test Drivers for plugins
• Plans for v0.14.x & v1
Why Fluentd v0.14 has a New API set for plugins?
Fluentd v0.12 Plugins
• No support from Fluentd core for writing plugins
  • plugins create their own threads, sockets, timers and event loops
  • writing tests is very hard and messy with sleeps
• Fragmented implementations
  • Output, BufferedOutput, ObjectBufferedOutput and TimeSlicedOutput
• Mixture of configuration parameters from output & buffer
• Uncontrolled plugin instance lifecycle (no "super" in start/shutdown)
• Imperfect buffering control and useless configurations
  • the reason why fluent-plugin-forest exists and is widely used
Fluentd v0.12 Plugins
• Insufficient buffer chunking control
  • only by size, not by number of events in chunks
• Forcibly synchronized buffer flushing
  • no way to flush-and-commit chunks asynchronously
• Ultimate freedom for using mix-ins
  • everything overrides Plugin#emit ... (the only entry point for events to plugins)
  • no valid hook points to get metrics or something else
• Bad Ruby coding rules and practices
  • too many classes at "Fluent::*" in fluent/plugin, no "require", ...
And many others!
Compatibility of v0.12 plugins/configurations
Compatibility of plugins
• v0.12 plugins are subclasses of Fluent::*
  • Fluent::Input, Fluent::Filter, Fluent::Output, ...
• Compatibility layers for v0.12 plugins in v0.14
  • Fluent::Compat::Klass -> Fluent::Klass (e.g., Input, Output, ...)
  • it provides transformation of:
    • namespaces, configuration parameters
    • internal APIs, argument objects
• IT SHOULD WORK, except for :P
  • 3rd party buffer plugins, part of test code
  • "Engine.emit"
Compatibility of configurations
• v0.14 plugins have another set of parameters
  • many old-fashioned parameters are removed
  • "buffer_type", "num_threads", "timezone", "time_slice_format", "buffer_chunk_limit", "buffer_queue_limit", ...
• Plugin helper "compat_parameters"
  • transforms parameters between v0.12 style configuration and v0.14 plugin
(Diagram: v0.12 parameters are converted internally to v0.14 parameters)
FAQ: Can we create plugins like this?
• it uses v0.14 API
• it runs on Fluentd v0.12
Impossible :P
Overview of v0.14 Plugin classes
v0.14 plugin classes
• All files MUST be in `fluent/plugin/*.rb` (in gems)
  • or just a "*.rb" file in directory specified by "-r"
• All classes MUST be under Fluent::Plugin
• All plugins MUST be subclasses of Fluent::Plugin::Base
• All plugins MUST call `super` in methods overriding default implementation (e.g., #configure, #start, #shutdown, ...)
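The `super` rule can be illustrated with a small plain-Ruby sketch. `BaseModel` below is a hypothetical stand-in for Fluent::Plugin::Base (it is not Fluentd code, and `MyInputModel` and the "interval" key are made up); it only shows why the core can track the lifecycle when every override calls `super`.

```ruby
# Stand-in for Fluent::Plugin::Base (assumption: the real base class does
# far more; this only records lifecycle calls).
class BaseModel
  attr_reader :log

  def initialize
    @log = []
    @started = false
  end

  def configure(conf)
    @log << :base_configure
  end

  def start
    @started = true
    @log << :base_start
  end

  def shutdown
    @started = false
    @log << :base_shutdown
  end

  def started?
    @started
  end
end

# Hypothetical plugin: every overridden method calls super.
class MyInputModel < BaseModel
  def configure(conf)
    super
    @interval = conf.fetch("interval", 60)
  end

  def start
    super               # base state is set up before plugin-specific work
    @log << :my_start
  end

  def shutdown
    @log << :my_shutdown
    super               # base teardown runs after plugin-specific cleanup
  end
end

plugin = MyInputModel.new
plugin.configure("interval" => 10)
plugin.start
plugin.shutdown
puts plugin.log.inspect
```

Without the `super` calls, the base class would never see `start` or `shutdown`, which is the "uncontrolled lifecycle" problem described for v0.12.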
Classes hierarchy (v0.12)
• Fluent::Input
• Fluent::Filter
• Fluent::Output
  • BufferedOutput
    • ObjectBufferedOutput
    • TimeSlicedOutput
  • MultiOutput
• Fluent::Buffer
• Fluent::Parser
• Fluent::Formatter
(3rd party plugins subclass these classes)
Classes hierarchy (v0.14)
• Fluent::Plugin::Base
  • F::P::Input
  • F::P::Filter
  • F::P::Output (both of buffered/non-buffered)
  • F::P::BareOutput (not for 3rd party plugins)
  • F::P::MultiOutput (copy, roundrobin)
  • F::P::Buffer
  • F::P::Parser
  • F::P::Formatter
  • F::P::Storage
Tour of New Plugin APIs:
Fluent::Plugin::Input
Fluent::Plugin::Input
• Nothing changed :)
  • except for overall rules
• But it's much easier to write plugins than v0.12 :)
  • fetch HTTP resource per specified interval
  • parse response body with format specified in config
  • emit parse result
Fluent::Plugin::Input
Tour of New Plugin APIs:
Fluent::Plugin::Filter
Fluent::Plugin::Filter
• Almost nothing changed :)
• Required: #filter(tag, time, record) #=> record | nil
• Optional: #filter_stream(tag, es) #=> event_stream
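A plain-Ruby sketch of the two hooks, under stated assumptions: the class below does not inherit from Fluent::Plugin::Filter, the event stream is modeled as an array of [time, record] pairs, and the "level" field is made up. It shows the contract only: #filter returns a record or nil (nil drops the event), and #filter_stream handles a whole stream at once.

```ruby
# Hypothetical filter model (not a real Fluentd plugin class).
class DropDebugFilterModel
  # Required hook: return the (possibly modified) record, or nil to drop it.
  def filter(tag, time, record)
    return nil if record["level"] == "debug"
    record.merge("tag" => tag)
  end

  # Optional hook: process a whole stream. This version simply applies
  # #filter to each event and keeps the non-nil results.
  def filter_stream(tag, es)
    es.each_with_object([]) do |(time, record), out|
      filtered = filter(tag, time, record)
      out << [time, filtered] if filtered
    end
  end
end

stream = [
  [1464739200, { "level" => "info",  "msg" => "started" }],
  [1464739201, { "level" => "debug", "msg" => "noise" }]
]
puts DropDebugFilterModel.new.filter_stream("app.log", stream).inspect
```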
Tour of New Plugin APIs:
Fluent::Plugin::Output
Fluent::Plugin::Output
• Many things changed!
• Merged Output, BufferedOutput, ObjectBufferedOutput, TimeSlicedOutput
• Output plugins can be
  • with buffering
  • without buffering
  • both (buffering or not, depending on configuration)
• Buffers chunk events by:
  • byte size, interval, tag
  • number of records (new!)
  • time (by any unit (new!): 30s, 5m, 15m, 3h, ...)
  • any specified field in records (new!)
  • any combination of above (new!)
Variations of buffering
NO MORE forest plugin!
Output Plugin: Methods to be implemented
• Non-buffered: #process(tag, es)
• Buffered synchronous: #write(chunk)
• Buffered asynchronous: #try_write(chunk)
  • New feature for destinations with huge latency to write chunks
  • Plugins must call #commit_write(chunk_id) (otherwise, #try_write will be retried)
• Buffered w/ custom format: #format(tag, time, record)
  • Without this method, output uses standard format
(Flowchart: how an output plugin ends up non-buffered, sync buffered or async buffered. Inputs are whether a <buffer> section exists, which of #process, #write and #try_write are implemented, and the results of #prefer_buffered_processing and #prefer_delayed_commit, both true by default. Implementing none of the methods, or configuring <buffer> without #write or #try_write, is an error.)
In other words :P
• If users configure "<buffer>" section
  • plugin tries to do buffering
• Else if plugin implements both (buffered/non-buffered)
  • plugin calls #prefer_buffered_processing to decide
• Else plugin does as implemented
• When plugin does buffering:
  • If plugin implements both (sync/async write), plugin calls #prefer_delayed_commit to decide
  • Else plugin does as implemented
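The decision rules above can be written down as a small plain-Ruby function. This is a sketch of the rules as stated on the slides, not Fluentd's actual implementation; `choose_mode` and its arguments are hypothetical names.

```ruby
# Model of the mode selection (assumption: mirrors the slide's rules only).
# implements: which of :process, :write, :try_write the plugin defines.
def choose_mode(buffer_section:, implements:,
                prefer_buffered_processing: true, prefer_delayed_commit: true)
  can_process = implements.include?(:process)
  can_write   = implements.include?(:write)
  can_try     = implements.include?(:try_write)
  can_buffer  = can_write || can_try

  unless can_process || can_buffer
    raise ArgumentError, "plugin implements none of #process, #write, #try_write"
  end

  buffered =
    if buffer_section
      unless can_buffer
        raise ArgumentError, "<buffer> configured, but neither #write nor #try_write is implemented"
      end
      true
    elsif can_process && can_buffer
      prefer_buffered_processing   # plugin decides
    else
      can_buffer                   # plugin does as implemented
    end

  return :non_buffered unless buffered

  if can_write && can_try
    prefer_delayed_commit ? :async_buffered : :sync_buffered
  elsif can_try
    :async_buffered
  else
    :sync_buffered
  end
end

puts choose_mode(buffer_section: false, implements: [:process])
puts choose_mode(buffer_section: true,  implements: [:process, :write])
puts choose_mode(buffer_section: false, implements: [:process, :write, :try_write],
                 prefer_buffered_processing: false)
```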
Delayed commit (1)
• high-latency #write operations lock a flush thread for a long time (e.g., ACK in forward)
(Diagram: the output plugin sends data from #write to a destination w/ high latency and returns from #write only after the destination sends the ACK; the flush thread is locked in between.)
Delayed commit (2)
• #try_write & delayed #commit_write
(Diagram: the output plugin sends data from #try_write to a destination w/ high latency and returns from #try_write right away; an async check thread receives the ACK and calls #commit_write.)
Use cases: delayed commit
• Forward protocol w/ ACK
• Distributed file systems or databases
  • put data -> confirm to read data -> commit
• Submit tasks to job queues
  • submit a job -> detect executed -> commit
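The delayed commit flow can be sketched as a toy model in plain Ruby. It is not Fluentd code: `DelayedCommitModel` is hypothetical, and the "destination" is a queue that ACKs every chunk. Only the shape matches the slides: `try_write` returns without waiting, and a separate thread calls `commit_write` once the ACK arrives.

```ruby
# Toy model of delayed commit (assumption: names mirror the API on the
# slides, but nothing here touches real buffers or sockets).
class DelayedCommitModel
  attr_reader :committed

  def initialize
    @pending = {}        # chunk_id => payload, waiting for an ACK
    @committed = []
    @acks = Queue.new    # stands in for ACKs coming back from the destination
    @lock = Mutex.new
  end

  # Like #try_write: send the data and return immediately.
  def try_write(chunk_id, payload)
    @lock.synchronize { @pending[chunk_id] = payload }
    @acks << chunk_id    # pretend the destination will ACK this chunk later
  end

  # Like #commit_write: mark the chunk as done so it is not retried.
  def commit_write(chunk_id)
    @lock.synchronize do
      @committed << chunk_id if @pending.delete(chunk_id)
    end
  end

  def pending_ids
    @lock.synchronize { @pending.keys }
  end

  # The "async check thread": commits chunks as their ACKs arrive.
  def start_ack_checker
    Thread.new do
      while (id = @acks.pop) != :stop
        commit_write(id)
      end
    end
  end

  def stop
    @acks << :stop
  end
end

model = DelayedCommitModel.new
model.try_write("chunk-1", "data")
model.try_write("chunk-2", "data")
checker = model.start_ack_checker
model.stop
checker.join
puts model.committed.inspect
```

In the real API, a chunk that is never committed is retried after delayed_commit_timeout; the model leaves that part out.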
Standard chunk format
• Buffering w/o #format method
  • Almost same as ObjectBufferedOutput
• No need to always implement #format
  • Implement it for performance/low-latency
• Tool to dump & read buffer chunks on disk w/ standard format
  • To be implemented in v0.14.x :)
<buffer CHUNK_KEYS>
• CHUNK_KEYS: comma-separated tag, time or ANY_KEYS
• Nothing specified: all events are in same chunk
  • flushed when chunk is full
  • (optional) "flush_interval" after first event in chunk
• tag: events w/ same tag are in same chunks
• time: buffer chunks will be split by timekey
  • timekey: unit of time to be chunked (1m, 15m, 3h, ...)
  • flushed after expiration of timekey unit + timekey_wait
• ANY_KEYS: any key names in records
(Slide annotation: these chunking modes cover what BufferedOutput, TimeSlicedOutput and ObjectBufferedOutput did in v0.12.)
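How chunk keys split events can be sketched as a toy model in plain Ruby. It is not Fluentd's buffer implementation: events are modeled as [tag, unix_time, record] triples, and the function names and the "user" field are made up. Each event gets a key built from the configured chunk keys, and events with the same key share a chunk.

```ruby
# Build the chunk key for one event (assumption: "time" is rounded down
# to the start of its timekey window, given in seconds).
def chunk_key_for(tag, time, record, chunk_keys, timekey: nil)
  chunk_keys.map do |key|
    case key
    when "tag"  then tag
    when "time" then time - (time % timekey)
    else record[key]            # any other key names a field in the record
    end
  end
end

# Group events into chunks; an empty key list puts everything in one chunk.
def group_into_chunks(events, chunk_keys, timekey: nil)
  events.group_by do |tag, time, record|
    chunk_key_for(tag, time, record, chunk_keys, timekey: timekey)
  end
end

SAMPLE_EVENTS = [
  ["app.a", 1464739205, { "user" => "alice" }],
  ["app.a", 1464739230, { "user" => "bob" }],
  ["app.a", 1464739270, { "user" => "alice" }],
  ["app.b", 1464739210, { "user" => "bob" }]
].freeze

puts group_into_chunks(SAMPLE_EVENTS, []).size
puts group_into_chunks(SAMPLE_EVENTS, %w[tag time], timekey: 60).keys.inspect
```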
configurations: flushing buffers
• flush_mode: lazy, interval, immediate
  • default: lazy if "time" specified, otherwise interval
• flush_interval, flush_thread_count
  • flush_thread_count: number of threads for flushing
• delayed_commit_timeout
  • output plugin will retry #try_write when it expires
Retries, Secondary
• Explicit timeout for retries:
  • retry_timeout: total time after which output stops retrying
  • retry_max_times: how many times to retry
• retry_type: "periodic" w/ fixed retry_wait
• retry_secondary_threshold (percentage)
  • output will use secondary once the specified percentage of retry_timeout has elapsed after the first error
Buffer parameters
• chunk_limit_size
  • maximum byte size per chunk
• chunk_records_limit (default: not specified)
  • maximum number of records per chunk
• total_limit_size
  • maximum byte size which a buffer plugin can use
• (optional) queue_length_limit: no need to specify
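A configuration fragment tying these parameters together might look like the following. It is a sketch, not taken from the slides: the plugin name "my_output" is hypothetical and all values are arbitrary; only parameter names mentioned in this talk are used.

```
<match app.**>
  # "my_output" is a hypothetical v0.14 output plugin
  @type my_output
  <buffer tag,time>
    # chunking: one chunk per tag and per 15-minute window
    timekey 15m
    timekey_wait 1m
    # size limits
    chunk_limit_size 8m
    total_limit_size 1g
    # flushing and retries
    flush_thread_count 2
    retry_timeout 1h
  </buffer>
</match>
```

With "time" in the chunk keys, flush_mode defaults to lazy, so each chunk is flushed after its timekey window plus timekey_wait has passed.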
Chunk metadata
• Stores various information about buffer chunks
  • key-values of chunking unit
  • number of records
  • created_at, modified_at
• `chunk.metadata`
  • extract_placeholders(@path, chunk.metadata)
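What placeholder extraction does with chunk metadata can be sketched in plain Ruby. This is an approximation under stated assumptions, not Fluentd's `extract_placeholders`: `MetadataModel` and `expand_placeholders` are hypothetical names, time placeholders are assumed to be strftime directives filled from the chunk's timekey, and `${tag}` and `${field}` are assumed to be filled from the chunking key-values.

```ruby
# Stand-in for chunk metadata: the key-values a chunk was grouped by.
MetadataModel = Struct.new(:timekey, :tag, :variables)

def expand_placeholders(template, meta)
  str = template.dup
  # Time placeholders first, so substituted values are never read as directives.
  str = Time.at(meta.timekey).utc.strftime(str) if meta.timekey
  str = str.gsub("${tag}", meta.tag) if meta.tag
  (meta.variables || {}).each do |key, value|
    str = str.gsub("${#{key}}", value.to_s)
  end
  str
end

meta = MetadataModel.new(1464739200, "app.access", { host: "web01" })
puts expand_placeholders("/logs/${tag}/%Y%m%d/${host}.log", meta)
# prints "/logs/app.access/20160601/web01.log"
```

Because the values come from the chunk's own metadata, every record in the chunk is guaranteed to agree with the resulting path.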
Tour of New Plugin APIs:
Other plugin types
"Owned" plugins
• Primary plugins: Input, Output, Filter
  • Instantiated by Fluentd core
• "Owned" plugins are owned by primary plugins
  • Buffer, Parser, Formatter, Storage, ...
  • They can refer to the owner's plugin id, logger, ...
  • Fluent::Plugin.new_xxx("kind", parent: @input)
• "Owned" plugins can be configured by owner plugins
  • Owner plugins can control defaults of owned plugins
  • Fluentd provides a standard way to configure owned plugins
Tour of New Plugin APIs:
Fluent::Plugin::Storage
Storage plugins
• Pluggable Key-Value store for plugins
  • configurable: autosave, persistent, save_at_shutdown
  • get, fetch, put, delete, update (transactional)
• Various possible implementations
  • built-in: local (json) on-disk / on-memory
  • possible: Redis, Consul, or whatever supports serializing/deserializing json-like objects
• To store states of plugins:
  • counter values of data-counter plugin
  • pos data of file plugin
• To load configuration dynamically for plugins:
  • load configurations from any file systems
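The key-value interface can be sketched as a toy on-disk JSON store in plain Ruby. It is not the built-in "local" storage plugin: `JsonFileStoreModel` is a hypothetical class that only mirrors the method names listed above (get, fetch, put, delete, update), with an explicit `save` instead of autosave.

```ruby
require "json"
require "tmpdir"

# Toy stand-in for a storage plugin: JSON-serializable values in one file.
class JsonFileStoreModel
  def initialize(path)
    @path = path
    @data = File.exist?(path) ? JSON.parse(File.read(path)) : {}
  end

  def get(key)
    @data[key.to_s]
  end

  def fetch(key, default)
    @data.fetch(key.to_s, default)
  end

  def put(key, value)
    @data[key.to_s] = value
  end

  def delete(key)
    @data.delete(key.to_s)
  end

  # Read-modify-write in one call; the block receives the current value.
  def update(key)
    put(key, yield(get(key)))
  end

  def save
    File.write(@path, JSON.generate(@data))
  end
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "pos.json")
  store = JsonFileStoreModel.new(path)
  store.put(:pos, 0)
  store.update(:pos) { |v| v + 128 }
  store.save

  reloaded = JsonFileStoreModel.new(path)   # e.g. after a restart
  puts reloaded.get(:pos)                   # prints 128
end
```

This is the "pos data of file plugin" use case in miniature: state written before shutdown is available again after a restart.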
Tour of New Plugin APIs:
Plugin Helpers
Plugin Helpers
• No more mixin!
  • declare helpers to use by "helpers :name"
• Utility functions to support difficult things
  • creating threads, timers, child processes...
  • created timers will be stopped automatically in plugin's shutdown sequence
• Integrated w/ New Test Drivers
  • tests run after helpers have started everything requested
Plugin Helpers Example
• Thread: thread_create, thread_current_running?
• Timer: timer_execute
• ChildProcess: child_process_execute
  • command, arguments, subprocess_name, interval, immediate, parallel, mode, stderr, env, unsetenv, chdir, ...
• EventEmitter: router (Output doesn't have router in v0.14 default)
• Storage: storage_create
• (TBD) Socket/Server for TCP/UDP/TLS, Parser, Formatter
Tour of New Plugin APIs:
New Test Drivers
New Test Drivers
• Instead of old drivers Fluent::Test::*TestDriver
• Fluent::Test::Driver::Input, Output or Filter
  • fully emulates actual plugin behavior
  • w/ override SystemConfig
  • capturing emitted events & error event streams
  • inserting TestLogger to capture/test logs of plugins
  • capturing "format" result of output plugins
  • controlling "flush" timing of output plugins
• Running tests under control
  • Plugin Helper integration
  • conditions to keep/break running tests
  • timeouts, number of emits/events to stop tests
  • automatic start/shutdown call for plugins
Plans for v0.14.x
New Features
• Symmetric multi processing
  • to use 2 or more CPU cores!
  • by sharing a configuration between all processes
  • "detach_process" will be deprecated
• forward: TLS + authentication/authorization support
  • secure-forward integration
• Buffer supports compression & forwarding compressed chunks
• Plugin generator & template
New APIs
• Controlling global configuration from SystemConfig
  • configured via <system> tag
  • root buffer path + plugin id: remove paths from each buffer
  • process total buffer size control
• Counter APIs
  • counting everything over processes via RPC
  • creating metrics for a whole fluentd cluster
For v1
v1: stable version of v0.14
• v0.12 plugins will still be supported at v1.0.0
  • deprecated, and will be obsoleted at v1.x
• Will be obsoleted:
  • v0 (traditional) configuration syntax
  • "detach_process" feature
• Q4 2016?
To Be Written by me :-)
• As soooooooooon as possible...
• Plugin developers' guide for
  • Updating v0.12 plugins with v0.14 APIs
  • Writing plugins with v0.14 APIs
  • Writing tests of plugins with v0.14 APIs
• Users' guide for
  • How to use buffering in general (w/ <buffer>)
  • Updated plugin documents
Enjoy logging!