Post on 08-Jan-2017
Streaming, Database & Distributed Systems: Bridging the Divide
Ben Stopford (@benstopford)
Codemesh 2016
Event Driven Systems
Most stateful systems have to pull from these three worlds
Today we have 2 goals
1. Understand Stateful Stream Processing (now & near future)
2. Case for SSP as a general framework for building data-centric systems.
Data systems come in different forms
• Database (OLTP)
• Analytics Database (OLAP/Hadoop)
• Messaging
• Distributed log
• Stream Processing
• Stateful Stream Processing
Database (OLTP)
Focuses on providing a consistent view that supports updates and queries on individual tuples.
Analytics Database (OLAP/Hadoop)
1. Focuses on aggregations via table scans.
2. Executes as a distributed system
Messaging
Focuses on asynchronous information transfer with limited state
Distributed Log
1. Similar to messaging, but data can be retained
2. Executes as a distributed system (scale + fault tolerance)
Stream Processing
Manipulate concurrent streams of events
Comes from CEP background (ephemeral)
Stateful Stream Processing
Moves stream processing to be a more general framework for building data-centric systems.
What is stream processing?
[Diagram: Database (data, index, query engine) with a finite source vs Stream Processor (query engine) with an infinite source]
Infinite streams need windows
How many items will we bring into the machine at one time?
Windows bound a computation
Buffering allows us to handle late events
Some query, over some time window, emitting at some frequency
Continually executing query
Stream(s)
Stream Processing Engine
Derived Stream
Avg(p.time - o.time)
From orders, payments
Group by payments.region
over 1 day window
emitting every second
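A minimal sketch of what such a continually executing query computes at each emit, assuming in-memory lists of events. The field names (`order_id`, `time`, `region`) and the function name are illustrative assumptions, not part of any streaming engine's API.

```python
from collections import defaultdict

DAY = 24 * 60 * 60  # window length in seconds


def completion_time_by_region(orders, payments, now, window=DAY):
    """Average (payment time - order time) per region, for payments inside the window."""
    order_time = {o["order_id"]: o["time"] for o in orders}
    totals = defaultdict(lambda: [0.0, 0])  # region -> [sum of durations, count]
    for p in payments:
        in_window = now - window < p["time"] <= now
        if in_window and p["order_id"] in order_time:
            acc = totals[p["region"]]
            acc[0] += p["time"] - order_time[p["order_id"]]
            acc[1] += 1
    return {region: total / count for region, (total, count) in totals.items()}
```

A real engine would call something like this on every tick (here, every second) and maintain the result incrementally rather than rescanning the window each time.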
Stream Processing
orders
payments
Completion time, by region
Avg(p.time - o.time) From orders, payments Group by payments.region over 1 day window emitting every second
Materialised View (DB)
Query
orders
payments
Completion time, by region
Avg(p.time - o.time) From orders, payments, user Group by user.region over 1 day window emitting every second
Stateful Stream Processing
Streams
Stream Processing Engine
Derived Stream
Query
Derived “Table”
“View” is output as table or stream
Table == Stream + Window (0 -> now)
Table is a stream with an infinite window (i.e. buffer from 0 -> now)
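One way to read this, sketched under the assumption that a stream is a list of (key, value) events: folding the whole stream from offset 0 yields the table, while folding only a bounded window yields a partial view. The function names are hypothetical.

```python
def as_table(stream):
    """Fold a stream of (key, value) events into a table of the latest value per key."""
    table = {}
    for key, value in stream:
        table[key] = value  # a later event for the same key overwrites the earlier one
    return table


def last_n(stream, size):
    """A bounded window: only the most recent `size` events of the same stream."""
    return stream[-size:]
```

With events `[("a", 1), ("b", 2), ("a", 3)]`, the infinite window gives the full table, whereas a window of one event only knows about key `a`.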
SSP is about creating materialised views.
Materialised as a table, or materialised as a stream
Features: similar to database query engine
Join, Filter, Aggregate
View
Windowed Streams
Can distribute over many machines in two dimensions
[Diagram: several Join / Filter / Aggregate -> View pipelines, labelled "Scale Out" and "Scale Forward"]
Stateful Stream Processing engines typically use Kafka (a distributed commit log)
[Diagram: Join / Filter / Aggregate -> View, running on top of Kafka (a distributed log)]
A log is a very simple idea
Messages are added at the end of the log
Just think of the log as a file
Old New
Readers have a position & scan
[Diagram: a log running from Old to New; readers Sally, George and Fred each hold their own position and scan forward]
Can “Rewind & Replay” the log
Rewind & Replay
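The log and its readers can be sketched in a few lines. This is an in-memory illustration only, with hypothetical class names; it says nothing about how Kafka stores data.

```python
class Log:
    """An append-only sequence of messages, addressed by offset."""

    def __init__(self):
        self._messages = []

    def append(self, message):
        self._messages.append(message)
        return len(self._messages) - 1  # offset of the new message

    def read(self, offset):
        """Return every message from `offset` to the end of the log."""
        return self._messages[offset:]


class Reader:
    """A reader is just a position in the log; the log keeps no per-reader state here."""

    def __init__(self, log, position=0):
        self.log = log
        self.position = position

    def poll(self):
        batch = self.log.read(self.position)
        self.position += len(batch)  # scan forward
        return batch

    def rewind(self, position=0):
        self.position = position  # the next poll replays from here
```

Because reading does not remove anything, rewinding is only a matter of moving the reader's position back.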
Compacted Log (Tabular View)
[Diagram: three keys, holding Versions 1-3, Versions 1-2 and Versions 1-5]
STREAM (All versions): every version of every key
COMPACTED STREAM (Latest Key only): Version 2, Version 3, Version 5
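Compaction can be illustrated as a pure function over a list of (key, value) records. This is a sketch of the idea only; the function name is an assumption and real compaction runs in the background over log segments.

```python
def compact(log):
    """Keep only the latest record per key, preserving the order of the survivors."""
    last_offset = {key: offset for offset, (key, _) in enumerate(log)}
    return [record for offset, record in enumerate(log)
            if last_offset[record[0]] == offset]
```

Reading the compacted log from offset 0 therefore gives one row per key, which is why it behaves like a table.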
The log is a Distributed System
For scalability and fault tolerance
Shard on the way in
Producers
Kafka
Consumers
Each shard is a queue
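A sketch of sharding on the way in, assuming string keys and a list of lists standing in for partitions. `zlib.crc32` is used here only as a convenient stable hash; Kafka's own partitioner uses a different hash function.

```python
import zlib


def partition_for(key, num_partitions):
    """Map a key to a partition with a stable hash, so equal keys share a partition."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def produce(partitions, key, value):
    """Append to the partition chosen by the key; each partition is its own ordered queue."""
    index = partition_for(key, len(partitions))
    partitions[index].append((key, value))
    return index
```

Ordering is kept per key because all messages for one key land in the same queue.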
[Diagram: Producers -> Kafka -> Consumers]
Many consumers share partitions in one topic
Consumers share consumption of a single topic
The Log reassigns data on failure
[Diagram: Producers -> Kafka; many consumers share partitions in one topic]
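Reassignment on failure can be pictured as recomputing the partition assignment over whichever consumers are still alive. The round-robin rule below is an illustrative assumption; Kafka's actual assignment strategies are more involved.

```python
def assign(partitions, consumers):
    """Spread partitions round-robin over the live consumers of one group."""
    members = sorted(consumers)
    if not members:
        return {}
    assignment = {member: [] for member in members}
    for i, partition in enumerate(sorted(partitions)):
        assignment[members[i % len(members)]].append(partition)
    return assignment
```

When a consumer drops out of the set, calling `assign` again hands its partitions to the survivors.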
Kafka supplies two levels of leader election
Replicas in Kafka have an elected leader
Consumers in Kafka have an elected leader
The log is important for SSP
Maintains History: Acts like a “push based” distributed file system
The log is important: Two Primitives
Stream
Compacted Stream (‘table’)
The Log is, to a streaming engine, what HDFS is to Hadoop
But it’s a bit more than a HDFS replacement: Processors inherit the idea of “membership” from the log
So stateful Stream Processors use the Log
[Diagram: Join / Filter / Aggregate -> View, on top of Kafka (Distributed Log)]
They also use local storage
[Diagram: Join / Filter / Aggregate -> View, backed by (1) Kafka and (2) a local KV store]
Local KV store has a few uses
(1) It caches streams on disk
(2) It caches “tables” on disk
This makes join operations fast as they’re entirely local
Streams just cache recent messages to help with joins
Tables are fully “realised” locally
Stateful Stream Processing
[Diagram: Kafka supplies a stream (stream data, an infinite stream) and a compacted stream (stream-tabular data); Kafka Streams joins them, holding the table as a locally cached, disk-resident table]
e.g. Useful for Enrichment
[Diagram: Orders (stream) and Customers (compacted stream) flow from Kafka into Kafka Streams, which joins them against a local DB]
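Enrichment can be sketched as a stream-table join against a local dictionary. The event shapes and the function name are assumptions for illustration, and the dictionary stands in for the disk-resident local store.

```python
def enrich_orders(events):
    """Join orders against a locally held customers table.

    `events` is a list of ("customer", customer_id, record) and
    ("order", customer_id, order) tuples in arrival order.
    """
    customers = {}  # local table: latest record per customer
    enriched = []
    for kind, customer_id, payload in events:
        if kind == "customer":
            customers[customer_id] = payload  # table update, latest value wins
        elif customer_id in customers:
            enriched.append({**payload, "customer": customers[customer_id]})
        # orders for unknown customers are dropped in this sketch (inner join)
    return enriched
```

The lookup never leaves the process, which is the point of caching the table locally.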
Aggregates need intermediary state
[Diagram: Orders (stream) joined with Customers (compacted stream) from Kafka, feeding Sum(orders) group by region]
Persist current value, in case we fail
State store inherits durability from the log
State store flushes back to the log
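A rough sketch of a state store that inherits its durability from the log, assuming a plain list stands in for a changelog topic. The class is hypothetical, and it writes through on every update for simplicity, where a real store would batch its flushes.

```python
class StateStore:
    """A local key-value store that writes every update through to a changelog."""

    def __init__(self, changelog):
        self.changelog = changelog  # a list standing in for a compacted topic
        self.local = {}
        for key, value in changelog:  # restore: replay the changelog from offset 0
            self.local[key] = value

    def add(self, key, amount):
        """Update a running total and record the new value in the changelog."""
        value = self.local.get(key, 0) + amount
        self.local[key] = value
        self.changelog.append((key, value))
        return value
```

If the process fails, a replacement builds a new `StateStore` over the same changelog and resumes from the last persisted totals.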
Separate Data, Processing & View
[Diagram: Storage Layer (Kafka) holding Orders and Payments; a Processing & View layer of several Views above it; Query against the views]
You can query the views from anywhere
So what happens on failure?
Clustering Reroutes Data to surviving node
Ownership of partitions is re-routed from dead node
But what about state?
“Cold” replica of state takes over
Primitives for sharding & replication
Stock
Orders Payments Stock
Stock
Redundant copies are cached on other nodes
Sharding spreads data over processors
So processors inherit much from the log
Clustering comes from the log
You just write the functional bit
General framework for distributed, realtime data computation
Protection from broker failure
Protection from engine failure
Join tables & streams (in process)
Event Driven
Create views which can be queried
Query
But stream processing has a problem
Correctness Guarantees in multi layer topologies
[Diagram: a multi-layer topology of Join / Filter / Aggregate -> View stages feeding one another]
Duplicates are a side effect of all at-least-once delivery mechanisms
Data is rerouted, on failure, which can cause duplicates
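The duplicate problem can be reproduced with a toy consumer that processes messages and only then commits its offset. Everything here is a simplified assumption (a list as the log, a `crash_after` parameter to simulate failure), not a model of any real client.

```python
def consume(log, committed, crash_after=None):
    """Process messages from the committed offset; return (processed, new committed offset).

    If `crash_after` is set, the consumer fails after processing that many
    messages and before committing, so the committed offset does not move.
    """
    processed = []
    for offset in range(committed, len(log)):
        if crash_after is not None and len(processed) == crash_after:
            return processed, committed  # crashed: work done, offset not committed
        processed.append(log[offset])
    return processed, len(log)
```

After a crash, the node that takes over restarts from the old committed offset, so messages already processed are processed again and any downstream sum counts them twice.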
Idempotence isn’t enough
[Diagram: chained Join / Filter / Aggregate -> View stages, with a Filter between layers]
Distributed Snapshots* (transactions)
Transaction markers: [Begin], [Prepare], [Commit], [Abort]
Buffer
Chandy, Lamport - Distributed Snapshots: Determining Global States of Distributed Systems
*In development in Kafka
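As a loose illustration of how markers and a buffer fit together, the sketch below holds back messages until it sees a commit marker and discards them on abort. It is a single-stream toy under assumed semantics, not Kafka's protocol (which the slide notes is still in development) and not the Chandy-Lamport algorithm itself.

```python
def read_committed(stream):
    """Return only the messages that belong to committed transactions."""
    delivered, buffer, in_txn = [], [], False
    for item in stream:
        if item == "[Begin]":
            buffer, in_txn = [], True
        elif item == "[Commit]":
            delivered.extend(buffer)  # release everything buffered in this transaction
            buffer, in_txn = [], False
        elif item == "[Abort]":
            buffer, in_txn = [], False  # drop the buffered messages
        elif item == "[Prepare]":
            continue  # the prepare phase is not modelled in this sketch
        elif in_txn:
            buffer.append(item)
        else:
            delivered.append(item)  # outside any transaction: pass straight through
    return delivered
```

Messages from a transaction that never commits stay in the buffer, so a downstream stage does not see partial or duplicated work.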
So why use these tools?
(1) Streaming is a superset of batch
Databases look backwards
Batch == Streaming from offset 0
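A small sketch of the claim, assuming the log is a list of numbers: the streaming computation emits a result after every message, and its final emission when started from offset 0 equals the batch answer. The function names are illustrative.

```python
def running_totals(log, start=0):
    """Streaming: emit the running sum after each message, beginning at `start`."""
    total = 0
    for value in log[start:]:
        total += value
        yield total


def batch_total(log):
    """Batch: one answer over the whole data set."""
    return sum(log)
```

The streaming version also keeps going as new messages arrive, which the batch version cannot do without being rerun.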
MPP Batch System: Queries over a Distributed File System (HDFS)
MPP Streaming System: Queries over a Distributed Log (Kafka)
Streaming is the superset of batch
Streaming
Batch
Database
Global, Linearisable consistency model
(2) Separates store & view
“Engine” part is lightweight but stateful
Storage
Just a Java process which uses a library
Log handles fault tolerance of both layers
Separates Concerns of Model & View – Think MVC
Storage View & Controller
Model
Physically Separates Read & Write – Think CQRS
Storage View & Controller
Model
Database vs SSP
[Diagram: Database (query, query engine, data, index) vs Stateful Stream Processor (query, view, query engine, index, data)]
(3) Decentralised approaches are more general
Rather than pushing processing into an “appliance”
(code -> data)
Centralised Processing
App
Data Decentric Architecture
Distributed Log
Decentralised Processing over many user-specific views
This is more general than just analytics use cases
It’s more than taking a database and adding push notifications
Whether you’re building a hulking, multistage, analytic platform
Query
Final View
Intermediary View (2)
Intermediary View (1)
Or a simple microservice that needs to run hot-hot & scale
Business Logic
Manage local state
Join various streams
Hot secondary instance
Composable Primitives
Declarative Function
Traditional DB
Work Distribution
Replication
Sharding
Query Engine
Distributed DB Distributed Systems
Membership
Global Consistency
General framework for distributed, event-driven data computation
Protection from broker failure
Protection from engine failure
Join tables & streams (in process)
Event Driven
Create views which can be queried
Query
Stateful Stream Processing
A framework for building streaming data systems, just for you ;-)
Find out more:
• http://www.confluent.io/blog/introducing-kafka-streams-stream-processing-made-simple/
• https://martin.kleppmann.com/2015/02/11/database-inside-out-at-salesforce.html
• http://cidrdb.org/cidr2015/Papers/CIDR15_Paper16.pdf
• https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/cidr07p42.pdf
• http://highscalability.com/blog/2015/5/4/elements-of-scale-composing-and-scaling-data-platforms.html
• https://speakerdeck.com/bobbycalderwood/commander-decoupled-immutable-rest-apis-with-kafka-streams
• https://timothyrenner.github.io/engineering/2016/08/11/kafka-streams-not-looking-at-facebook.html
• https://www.madewithtea.com/processing-tweets-with-kafka-streams.html
• http://www.infolace.com/blog/2016/07/14/simple-spatial-windowing-with-kafka-streams/
• http://www.slideshare.net/zacharycox/updating-materialized-views-and-caches-using-kafka
The end
@benstopford http://benstopford.com