Heka - Rob Miller

Posted on 26-Jan-2015

Heka: Unified Data Processing

So. Much. Data.

•Server level ops data

•Process level data

•Ops data / metrics

•Business data

•Logging output

•Error reports / tracebacks

So. Many. Tools.

•collectd / tcollector

•statsd / graphite / etc.

•[r]syslog[-ng]

•Logstash

•Riemann / Esper / other CEP

•Nagios / Zenoss

One Basic Pattern

•Acquire data

•Transform and/or Transport data

•Output data
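The three steps above can be sketched as a chain of generators. This is only an illustration of the pattern, not Heka code: the function names, the "name = value" record format, and the in-memory source and sink are all assumptions made for the example.

```python
from typing import Callable, Dict, Iterable, Iterator, List


def acquire(lines: Iterable[str]) -> Iterator[str]:
    """Acquire: yield raw records from a source (here, an in-memory list)."""
    for line in lines:
        yield line.rstrip("\n")


def transform(records: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Transform: turn each raw 'name = value' record into a structured event."""
    for record in records:
        name, _, value = record.partition("=")
        yield {"name": name.strip(), "value": value.strip()}


def output(events: Iterable[Dict[str, str]],
           sink: Callable[[Dict[str, str]], None]) -> int:
    """Output: hand each event to a sink and report how many were delivered."""
    count = 0
    for event in events:
        sink(event)
        count += 1
    return count


# Hypothetical sample data; a real source would be a socket, file, etc.
delivered: List[Dict[str, str]] = []
raw = ["cpu = 0.75\n", "mem = 512\n"]
n = output(transform(acquire(raw)), delivered.append)
```

Because each stage only consumes an iterable and produces another, any stage can be swapped without touching the others, which is the property the one-tool argument relies on.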

One Multi-Tool?

What would it be like to build a tool to tackle this in the general case?

Wins:

•Fewer processes to manage

•Increased client / configuration consistency

•Processing shared across domains

One Multi-Tool?

Requirements:

•Lightweight

•Flexible and configurable

•Easily extended

I know, I know...

BUT!

Replacing even two services on each box is a net ops win.

SCIENCE!

How Heka Is Put Together

Inputs

•Listen for incoming data or fetch it

•Concerned only with the low-level transport

Splitters

•Slice Inputs' raw data streams into discrete events

•Text or binary protocols

•Decouple protocols from their transports
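A minimal sketch of the splitting idea, assuming a simple delimiter-based text protocol. The function name and the newline delimiter are illustrative choices, not Heka's splitter API. The point is that the transport may deliver chunks that cut across record boundaries, and the splitter reassembles them.

```python
from typing import Iterable, Iterator


def split_records(chunks: Iterable[bytes],
                  delimiter: bytes = b"\n") -> Iterator[bytes]:
    """Reassemble arbitrary transport chunks into delimiter-separated records."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        while True:
            record, sep, rest = buffer.partition(delimiter)
            if not sep:
                break  # no complete record yet; wait for more data
            yield record
            buffer = rest
    if buffer:
        yield buffer  # trailing record without a final delimiter


# Chunk boundaries deliberately fall in the middle of records.
chunks = [b"first li", b"ne\nsecond line\nthi", b"rd line"]
records = list(split_records(chunks))
```

The same function works whether the chunks come from a TCP socket, a UDP packet, or a file read, which is what decoupling the protocol from the transport means in practice.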

Decoders

•Parse event data to populate a metadata envelope for all event types

•Extract structure from unstructured data...

•... or just wrap a blob

•Sandbox-able (Lua)
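A rough sketch of what a decoder does, written in Python rather than Lua. The envelope keys (Type, Logger, Severity, Payload, Fields) are taken from the matcher examples in this deck; the "logger LEVEL text" line format, the severity numbers, and the function name are assumptions for the example.

```python
import re
from typing import Any, Dict

# Assumed line format: "<logger> <LEVEL> <free text>"
LINE_RE = re.compile(r"^(?P<logger>\S+) (?P<level>[A-Z]+) (?P<text>.*)$")

# Assumed syslog-style numbering, where a lower number is more severe.
SEVERITIES = {"ERROR": 3, "WARNING": 4, "INFO": 6, "DEBUG": 7}


def decode(raw: str) -> Dict[str, Any]:
    """Populate a common envelope from a raw log line."""
    match = LINE_RE.match(raw)
    if match is None or match.group("level") not in SEVERITIES:
        # Unstructured input: just wrap the blob.
        return {"Type": "raw", "Logger": "", "Severity": 7,
                "Payload": raw, "Fields": {}}
    return {
        "Type": "applog",
        "Logger": match.group("logger"),
        "Severity": SEVERITIES[match.group("level")],
        "Payload": match.group("text"),
        "Fields": {"level": match.group("level")},
    }
```

For example, decode("marketplace ERROR payment failed") yields an "applog" message with Logger "marketplace", while a line that does not fit the pattern comes back wrapped as a "raw" message with the original text as its Payload.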

Router

Simple, efficient grammar for matching messages:

Type == "counter" && Payload == "1"

Type == "applog" && Logger == "marketplace"

Type == "alert" && (Severity == 7 || Payload == "emergency")

Type == "myapp.metric" && Fields[name] =~ /.*\.stat/
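To make the routing idea concrete, the four expressions above can be restated as plain Python predicates. This does not parse the matcher grammar; it only mirrors its logic. The plugin names, the make_message helper, and the dict-based message are hypothetical.

```python
import re
from typing import Any, Callable, Dict, List

Message = Dict[str, Any]
Matcher = Callable[[Message], bool]

# One predicate per matcher expression shown above (names are made up).
MATCHERS: Dict[str, Matcher] = {
    "counter": lambda m: m["Type"] == "counter" and m["Payload"] == "1",
    "marketplace_log": lambda m: (m["Type"] == "applog"
                                  and m["Logger"] == "marketplace"),
    "alert": lambda m: (m["Type"] == "alert"
                        and (m["Severity"] == 7
                             or m["Payload"] == "emergency")),
    "stat_metric": lambda m: (m["Type"] == "myapp.metric"
                              and re.search(r".*\.stat",
                                            str(m["Fields"].get("name", "")))
                              is not None),
}


def make_message(**overrides: Any) -> Message:
    """Build a message with default envelope values (helper for the example)."""
    message: Message = {"Type": "", "Logger": "", "Severity": 7,
                        "Payload": "", "Fields": {}}
    message.update(overrides)
    return message


def route(message: Message, matchers: Dict[str, Matcher]) -> List[str]:
    """Return the names of every plugin whose matcher accepts the message."""
    return [name for name, matches in matchers.items() if matches(message)]
```

A message can match several predicates at once, so route returns a list rather than a single destination.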

Filters

•Watch flowing data

•Generate output messages

•Sandbox-able (Lua)
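A small sketch of the filter role: watch messages as they flow past, keep some state, and periodically emit new messages. The class name, the process and flush method names, the "error_count" message type, and the severity threshold are all assumptions for illustration, and the example is in Python rather than Lua.

```python
from collections import Counter
from typing import Any, Dict, List


class ErrorCountFilter:
    """Count error-level messages per logger and emit summaries on flush."""

    def __init__(self, max_severity: int = 3) -> None:
        # Assumes syslog-style severities, where lower means more severe.
        self.max_severity = max_severity
        self.counts: Counter = Counter()

    def process(self, message: Dict[str, Any]) -> None:
        """Watch one message flowing past; only errors are counted."""
        if message["Severity"] <= self.max_severity:
            self.counts[message["Logger"]] += 1

    def flush(self) -> List[Dict[str, Any]]:
        """Generate one output message per logger, then reset the counts."""
        out = [
            {"Type": "error_count", "Logger": logger, "Severity": 6,
             "Payload": str(count), "Fields": {"count": count}}
            for logger, count in sorted(self.counts.items())
        ]
        self.counts.clear()
        return out
```

The generated messages use the same envelope as any other message, so they can be routed to outputs or to further filters.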

Outputs

•Deliver to external service...

•... and/or to upstream Heka...

•... and/or directly to Heka Dashboard UI

•Configurable reconnect
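One common way to implement a configurable reconnect is retry with exponential backoff. The sketch below shows that general technique; it is not Heka's implementation, and the parameter names (max_retries, base_delay, max_delay) are assumptions chosen for the example.

```python
import time
from typing import Callable


def deliver_with_retry(send: Callable[[bytes], None],
                       payload: bytes,
                       max_retries: int = 3,
                       base_delay: float = 0.5,
                       max_delay: float = 30.0,
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Try to deliver payload, doubling the wait after each failed attempt.

    Returns True on success, False once max_retries retries are exhausted.
    """
    delay = base_delay
    for attempt in range(max_retries + 1):
        try:
            send(payload)
            return True
        except ConnectionError:
            if attempt == max_retries:
                return False
            sleep(delay)
            delay = min(delay * 2, max_delay)  # cap the backoff
    return False
```

The sleep function is passed in so the behavior can be exercised without real waiting, and so a caller could substitute a different scheduling mechanism.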

Sandboxes Are Fun!

•Dynamically added to running Heka with no config changes, no restart

•CPU cycles and RAM usage monitored

•Misbehaving plugins are shut off
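The shut-off behavior can be illustrated with a much simpler stand-in: a wrapper that disables a plugin once it raises an error or exceeds a time budget. This is only a sketch of the idea. It measures wall-clock time after the call returns, so unlike a real sandbox it cannot interrupt a runaway plugin and it does not track memory. The class and parameter names are hypothetical.

```python
import time
from typing import Any, Callable


class GuardedPlugin:
    """Wrap a plugin handler and disable it when it misbehaves."""

    def __init__(self,
                 handler: Callable[[Any], None],
                 max_seconds_per_call: float,
                 clock: Callable[[], float] = time.perf_counter) -> None:
        self.handler = handler
        self.max_seconds_per_call = max_seconds_per_call
        self.clock = clock
        self.enabled = True

    def __call__(self, message: Any) -> bool:
        """Run the handler; return True only if the plugin is still enabled."""
        if not self.enabled:
            return False  # already shut off: the handler is never called
        start = self.clock()
        try:
            self.handler(message)
        except Exception:
            self.enabled = False  # a crashing plugin is shut off
            return False
        if self.clock() - start > self.max_seconds_per_call:
            self.enabled = False  # over budget: shut off for later messages
        return self.enabled
```

The rest of the pipeline keeps running either way; only the offending plugin stops receiving messages.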

Sandboxes Are Fun!

•LPeg (parsing expression grammar) & JSON libraries for data parsing

•Circular buffer library for time series data

Sandboxes Are Fun!

Circular buffers auto-generate dashboard graphs
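For readers unfamiliar with the data structure, here is a minimal Python sketch of a circular buffer for time series: a fixed number of rows, each covering a fixed time window, with the oldest rows overwritten as time advances. It shows the general technique only and does not reproduce the API of the Lua library mentioned above; the class and method names are assumptions.

```python
from typing import List


class CircularBuffer:
    """Fixed-size time series: one row per time window, oldest rows reused."""

    def __init__(self, rows: int, seconds_per_row: int) -> None:
        self.rows = rows
        self.seconds_per_row = seconds_per_row
        self.values = [0.0] * rows
        self.newest_slot = 0  # absolute index of the most recent time window

    def add(self, timestamp: float, value: float) -> bool:
        """Add value to the row covering timestamp; False if it is too old."""
        slot = int(timestamp // self.seconds_per_row)
        if slot <= self.newest_slot - self.rows:
            return False  # older than the window the buffer still holds
        if slot > self.newest_slot:
            # Time moved forward: zero every row that is about to be reused.
            first = max(self.newest_slot + 1, slot - self.rows + 1)
            for s in range(first, slot + 1):
                self.values[s % self.rows] = 0.0
            self.newest_slot = slot
        self.values[slot % self.rows] += value
        return True

    def series(self) -> List[float]:
        """Return the rows in chronological order, oldest first."""
        oldest = self.newest_slot - self.rows + 1
        return [self.values[s % self.rows]
                for s in range(oldest, self.newest_slot + 1)]
```

Memory use stays constant no matter how long the process runs, and series() returns evenly spaced points, which is the shape of data a graph needs.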

Try It Out

https://github.com/mozilla-services/heka

http://hekad.readthedocs.org

https://mail.mozilla.org/listinfo/heka

irc.mozilla.org, #heka

rmiller@mozilla.com