Streamy, Pipy, Analyticy
Copyright Push Technology 2012
LNUG London
January 2013
About me?
• Distributed Systems / HPC guy.
• Chief Scientist at Push Technology
• Responds to: Guinness, Whisky
• Twitter: @darachennis
Streamy Pipy
Analyticy
EEP + ‘Streams & Pipes’= CEP
• An experiment in Embedded Event Processing
• Sliding, Tumbling, Monotonic and Periodic windows
• Separate ‘window’ definition from operation
• Aggregate functions. A window of data produces a scalar result
• But? No filtering, branching or combinators, no flows …
• That’s a job for Streams & Pipes. Let’s add that.
eep.js: Functional Operations on Streaming Data Windows
Windows
Windows + Aggregate Functions
• A window of data is a slice of data over time, number of events or some other dimension
• An aggregate function is something you do in the context of a window.
Example: What is this?
• Average – Aggregate function
• CPU – Data (events)
• On a second-by-second basis – Periodic time window
Tumbling Windows
• Every N events, give me an average of the last N events
• Does not overlap windows
• ‘Closing’ a window ‘emits’ a result (the average)
• Closing a window opens a new window
What is a tumbling window?
[Diagram: three consecutive, non-overlapping windows on a timeline t0 … t11. Each window calls init() when it opens, x() once per event (four events per window) and emit() when it closes.]
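The init() / x() / emit() cycle described above can be sketched in a few lines. This is a minimal sketch, not the eep.js API: `tumblingWindow`, `average` and the `init` / `accumulate` / `emit` method names are assumptions made for illustration.

```javascript
// Hypothetical helper, not the eep.js API: a count-based tumbling window.
// `aggregate` supplies init(), accumulate(state, value) and emit(state).
function tumblingWindow(size, aggregate, onEmit) {
  let state = aggregate.init();
  let count = 0;
  return function push(value) {
    state = aggregate.accumulate(state, value); // the x() step in the diagram
    count += 1;
    if (count === size) {
      onEmit(aggregate.emit(state)); // closing the window emits a result
      state = aggregate.init();      // and opens a new, empty window
      count = 0;
    }
  };
}

// An average expressed as an aggregate function over the window's events.
const average = {
  init: () => ({ sum: 0, n: 0 }),
  accumulate: (s, v) => ({ sum: s.sum + v, n: s.n + 1 }),
  emit: (s) => s.sum / s.n,
};

const tumblingResults = [];
const pushTumbling = tumblingWindow(4, average, (r) => tumblingResults.push(r));
[1, 2, 3, 4, 5, 6, 7, 8, 9].forEach((v) => pushTumbling(v));
// tumblingResults holds one average per closed window of 4 events;
// the ninth event sits in a window that has not closed yet.
```

Keeping the aggregate separate from the window mirrors the ‘separate window definition from operation’ idea: the same `average` object could be handed to a different kind of window.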
Sliding Windows
• Like tumbling, except windows can overlap.
• Typically O(N^2), so keep N small. The exception is EEP.js, which gets O(N) performance.
• Every event opens a new window.
• After N events, every subsequent event emits a result.
• Like all windows, the cost of calculation is amortized over events
What is a sliding window?
[Diagram: overlapping windows of four events on a timeline t0 … t11. Each new event opens a window; labels show init() followed by x() per event.]
Periodic Windows
• Driven by ‘wall clock time’ in milliseconds
• Not monotonic, natch. Beware of NTP
What is a periodic window?
[Diagram: three consecutive windows on a timeline t0, t1, t2, t3, …, each with init(), four x() calls and emit().]
Monotonic Windows
• Driven mad by ‘wall clock time’? Need a logical clock?
• No worries. Provide your own clock! E.g. a vector clock
What is a monotonic window?
[Diagram: three consecutive windows on a timeline t0, t1, t2, t3, …, each with init(), four x() calls and emit(); the timeline ticks are labeled ‘my’.]
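A window that takes its notion of time from the caller can be sketched as below. This is a minimal sketch under assumptions, not the eep.js API: `clockedWindow`, `logicalClock` and the `now()` / `tick()` names are hypothetical. A periodic window would be the same shape with a clock whose `now()` returns `Date.now()`.

```javascript
// Hypothetical sketch, not the eep.js API: a window closed by a caller-supplied clock.
function clockedWindow(interval, clock, onEmit) {
  let openedAt = clock.now();
  let sum = 0; // init()
  return {
    push(value) { sum += value; }, // x()
    tick() {
      // Close the window once the supplied clock has moved `interval` units on.
      if (clock.now() - openedAt >= interval) {
        onEmit(sum);            // emit()
        sum = 0;                // init() for the next window
        openedAt = clock.now();
      }
    },
  };
}

// A logical clock: it only moves forward when advance() is called,
// so it is immune to wall-clock adjustments.
const logicalClock = {
  t: 0,
  now() { return this.t; },
  advance() { this.t += 1; },
};

const clockedResults = [];
const win = clockedWindow(2, logicalClock, (r) => clockedResults.push(r));
win.push(1); logicalClock.advance(); win.tick(); // one unit elapsed, window stays open
win.push(2); logicalClock.advance(); win.tick(); // two units elapsed, window closes
win.push(5); logicalClock.advance(); win.tick();
win.push(7); logicalClock.advance(); win.tick(); // second window closes
```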
Slide better with Compensating Aggregates
[Diagram: the sliding-window picture again on a timeline t0 … t11, with labels init(), x(), do { … } while (…) and compensate().]
Bad Sliding - O(N^2)
Good Sliding
• Takes us from O(N^2) to O(N) for Sliding windows
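The difference between ‘bad’ and ‘good’ sliding can be sketched with a sliding sum. This is an illustration under assumptions, not the eep.js source: it assumes ‘compensating’ means undoing the contribution of the event that just fell out of the window, and both function names are hypothetical.

```javascript
// Naive sliding sum: re-scan the whole window on every event.
// Work per event grows with the window size.
function naiveSlidingSum(size, events) {
  const out = [];
  for (let i = size - 1; i < events.length; i++) {
    let sum = 0;
    for (let j = i - size + 1; j <= i; j++) sum += events[j];
    out.push(sum);
  }
  return out;
}

// Compensating sliding sum: constant work per event.
function compensatingSlidingSum(size, events) {
  const out = [];
  const ring = new Array(size).fill(0); // the last `size` events
  let sum = 0;
  events.forEach((value, i) => {
    const slot = i % size;
    if (i >= size) sum -= ring[slot]; // compensate for the expired event
    ring[slot] = value;
    sum += value;                     // accumulate the new event
    if (i >= size - 1) out.push(sum); // emit once the first window is full
  });
  return out;
}

const slidingInput = [1, 2, 3, 4, 5, 6];
const slow = naiveSlidingSum(4, slidingInput);
const fast = compensatingSlidingSum(4, slidingInput);
// Both produce the same sums; only the amount of work per event differs.
```

The trick only works for aggregates whose contribution can be undone (sum, count, mean); something like max needs a different strategy.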
EEP.js is fast
Using Sliding, Tumbling Windows
Using Periodic, Monotonic Windows
Custom clocks (notion of time)
EEP.js v0.1, v0.2 were ugly babies.
Sorry! Swear, the next version will be just as functional but pretty…
Streams & Pipes
What about Streams & Pipes?
[Diagram: eep + ????]
Streams & Pipes: Origins
• Do one thing. Do it well
• Compose sophisticated behaviors from simple parts
• Maximize reuse
• Unix, ‘Chain of Responsibility’ (GoF), Interceptor (POSA2), XPipe, Builder, …
• The ‘Assembly Line Principle’ is nothing new
Streams & Pipes: Node.JS
• var events = require('events')
  • Publish/Subscribe to event (streams)
• var stream = require('stream')
  • Readable – Consume a (finite) set of events
  • Writable – Produce a (finite) set of events
  • readable.pipe(writable)
  • pipe() starts from the readable side; to chain, put a stream that is both readable and writable in the middle: readable.pipe(duplex).pipe(writable)
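The publish/subscribe half of this slide can be sketched with Node's built-in EventEmitter. The event name 'tick' and the `received` array are made up for the example.

```javascript
const { EventEmitter } = require('events');

const emitter = new EventEmitter();
const received = [];

// Subscribe: each listener sees every 'tick' event, in registration order.
emitter.on('tick', (value) => received.push(['a', value]));
emitter.on('tick', (value) => received.push(['b', value]));

// Publish: emit() calls the registered listeners synchronously.
emitter.emit('tick', 1);
emitter.emit('tick', 2);
```

Note that the wiring (who listens to what) lives in the same place as the usage, which is the entanglement the talk returns to later.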
Streams & Pipes: streams2
• Transform – Compress, Encrypt, Encode, …
• Duplex – Readable and Writable
• PassThrough – The canonical ‘noop’ transform
• Node.js Streams history (so far) http://bit.ly/XupqkO - by @izs
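A Transform can be sketched with Node's `stream` module as below. The upper-casing logic and the variable names are invented for the example; in byte mode the chunk is expected to arrive as a Buffer, so it has to be decoded before it can be treated as a JS string.

```javascript
const { Transform } = require('stream');

// A Transform is writable on its input side and readable on its output side.
const upperCase = new Transform({
  transform(chunk, encoding, callback) {
    // Decode the incoming bytes, transform, and hand the result downstream.
    callback(null, chunk.toString('utf8').toUpperCase());
  },
});

upperCase.write('hello');
const transformed = upperCase.read(); // a Buffer, or null if nothing is buffered yet
const transformedText = transformed === null ? null : transformed.toString('utf8');
```

In a real pipeline the same object would usually sit in the middle of a chain, as in `source.pipe(upperCase).pipe(destination)`.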
Streams & Pipes: but …
• Oriented for IO, not compute/analytics
• Array-like buffers, not individual datums
• @dominictarr event-stream? Array based
• ASCII, UTF-8, Binary - not JS types
• Often require copying, parsing, … (slow)
• So, streams & pipes for JS types? Yes!
• Do one thing. Do it well
• Compose sophisticated behaviors from simple parts
• Maximize reuse
Introducing Beam.js
Beams, Pipes
• Streams & Pipes for analytics
• Not designed for IO. Use Streams for that
• Not concerned with CEP.
  • … Use EEP for that? :)
• Not concerned with arrays of things
  • … Use Dominic Tarr’s event-stream for that
• Beam
  • Crunch events
  • Pipeline, Branch & Combine
Beams & Pipes.
• Streams & Pipes, reconsidered for JS types
• var Beam = require('beam');
• Beam.Source -- Push data in
• Beam.Sink -- Suck analysis out
• Beam.Operator -- OODA / PDCA
• Really Simple: ~150 LOC
Beams & Pipes: Operators
• Three types of operator
• Transform
  • 1 in, 1 out. Output data/type may differ
• Filter
  • 1 in, 1 or none out. Output data/type same as input
• Custom
  • May transform, filter
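The operator types above can be sketched with plain functions over JS values. Everything here (`NOTHING`, `transformOp`, `filterOp`, `pipeline`) is a hypothetical helper for illustration, not the beam.js API.

```javascript
// Hypothetical helpers for illustration; these are not the beam.js API.
const NOTHING = Symbol('nothing'); // marks "no output" from a filter

// Transform: 1 in, 1 out; the output value or type may differ from the input.
const transformOp = (fn) => (value) => fn(value);

// Filter: 1 in, 1 or none out; whatever passes is unchanged.
const filterOp = (predicate) => (value) => (predicate(value) ? value : NOTHING);

// Pipe operators left to right, stopping early when one emits nothing.
// A custom operator is any function that returns a value or NOTHING.
function pipeline(...operators) {
  return (value) => {
    let current = value;
    for (const op of operators) {
      current = op(current);
      if (current === NOTHING) return NOTHING;
    }
    return current;
  };
}

const isEven = filterOp((x) => x % 2 === 0);
const square = transformOp((x) => x * x);
const evenSquares = pipeline(isEven, square);

const operatorOutput = [1, 2, 3, 4]
  .map((v) => evenSquares(v))
  .filter((v) => v !== NOTHING);
```

Each event passes through as an ordinary JS value, so no buffers, encoding or parsing are involved.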
Example: Definitions
Example: Usage
Example: Easy to debug …
Example: Streams & Beams
Branch
• You can define 1 or many
• They can overlap or not as you see fit
• It’s just an application of predicate (boolean) filters
• Simple
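Branching as ‘just predicate filters’ can be sketched like this. The `branch` helper and its `{ name: predicate }` shape are assumptions for illustration, not the beam.js API.

```javascript
// Hypothetical sketch, not the beam.js API: branching as predicate filters.
function branch(branches) {
  // branches: { name: predicate }. Each event is offered to every branch.
  const outputs = {};
  for (const name of Object.keys(branches)) outputs[name] = [];
  const push = (event) => {
    for (const name of Object.keys(branches)) {
      if (branches[name](event)) outputs[name].push(event);
    }
  };
  return { push, outputs };
}

const router = branch({
  even: (x) => x % 2 === 0,
  big: (x) => x > 3, // overlaps with `even` for 4 and 6
});
[1, 2, 3, 4, 5, 6].forEach((v) => router.push(v));
```

Because every branch sees every event, overlap is simply a matter of which predicates return true.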
Combine?
• You can combine many sources or branches into one
• Works like a union. First in, first out.
• You can write your own. It’s just an Operator
• You can branch from, combine to … any beam
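A union-style combine can be sketched as several inputs forwarding to one output in arrival order. The `combine` helper and its `input()` method are hypothetical names, not the beam.js API.

```javascript
// Hypothetical sketch, not the beam.js API: a union-style combine.
function combine(sink) {
  // Every input created here forwards to the same sink, first in, first out.
  return { input: () => (event) => sink(event) };
}

const combined = [];
const union = combine((event) => combined.push(event));
const fromA = union.input();
const fromB = union.input();

// Events from the two sources interleave in the order they arrive.
fromA('a1');
fromB('b1');
fromA('a2');
```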
Streams & Pipes, ++
• In Node.js the definition and usage of streams in a pipe are entangled.
• Typically, with Streams & Pipes for IO, you only ever want one.
• In algorithms you may want to reuse.
• Think about it …
• Event Emitter. 1 square … 2 branches?
Pipes ++
• Beam Pipes are different (& really really really simple)
• You can define a filter once
• You can store it in a module
• Store like operations together
• Make libraries
• Use ‘em. Share ‘em.
EEP based on Beam soon!
Until then?
• npm install beam
• Filter data events
• Transform data events
• Analyze, crunch all the things
• Branch all the things
• Combine all the things
Beam futures?
• Taps – Convert events into beams
• Drain – Convert beams into events
• Beams
  • Write Beam operators in ‘beam’
  • Beams ‘inside’ beams
  • Source.pipe(op).compile(); // Maybe?
Questions
Questions?
• Thank you for listening to, and having, me
• Le Twitter: @darachennis
• https://github.com/darach/beam-js
• https://github.com/darach/eep-js
• npm install eep
• npm install beam
• EEP built on beam? EEP in other langs? Soon
• Fork it, Port it, Enjoy it!