Druid at SF Big Analytics 2015-12-01
DRUID INTERACTIVE EXPLORATORY ANALYTICS AT SCALE
GIAN MERLINO · DRUID COMMITTER · COFOUNDER @ IMPLY
OVERVIEW
MOTIVATION · WHY DRUID?
DEMO · AN EXAMPLE APPLICATION
ARCHITECTURE · HIGH LEVEL OVERVIEW
COMMUNITY · CONTRIBUTE TO DRUID
2013
HISTORY & MOTIVATION
‣ Druid was started in 2011 ‣ Powers interactive data applications ‣ Multi-tenancy: lots of concurrent users ‣ Scalability: trillions of events/day, sub-second queries ‣ Real-time analysis
HISTORY & MOTIVATION
‣ Questions lead to more questions ‣ Dig into the dataset using filters, aggregates, and comparisons ‣ Not all interesting queries can be determined upfront
DEMO
IN CASE THE INTERNET DIDN’T WORK PRETEND YOU SAW SOMETHING COOL
A GENERAL SOLUTION?
‣ Load all your data into Hadoop. Query it. Done! ‣ Good job guys, let’s go home
FINDING A SOLUTION
[Diagram: Event Streams → Hadoop → Insight]
FINDING A SOLUTION
Hadoop (pre-processing and storage) Query Layer
[Diagram: Event Streams → Hadoop → Query Layer → Insight]
POSSIBLE SOLUTIONS
MAKE QUERIES FASTER
‣ Optimizing business intelligence (OLAP) queries
• Aggregate measures over time, broken down by dimensions • Revenue over time broken down by product type • Top selling products by volume in San Francisco • Number of unique visitors broken down by age • Not dumping the entire dataset • Not examining individual events
FINDING A SOLUTION
Hadoop (pre-processing and storage) Sharded RDBMS?
[Diagram: Event Streams → Hadoop → Sharded RDBMS? → Insight]
‣ The idea • Row store • Star schema • Aggregate tables • Query cache
‣ But! • Scanning raw data is slow and expensive
GENERAL PURPOSE RDBMS
FINDING A SOLUTION
Hadoop (pre-processing and storage) NoSQL K/V Stores?
[Diagram: Event Streams → Hadoop → NoSQL K/V Stores? → Insight]
‣ Pre-computation • Pre-compute every possible query • Pre-compute a subset of queries • Exponential scaling costs
‣ Range scans • Primary key: dimensions/attributes • Value: measures/metrics (things to aggregate) • Still too slow!
KEY/VALUE STORES
FINDING A SOLUTION
Hadoop (pre-processing and storage) Column Stores
[Diagram: Event Streams → Hadoop → Column Stores → Insight]
‣ Load/scan exactly what you need for a query ‣ Different compression algorithms for different columns ‣ Encoding for string columns ‣ Compression for measure columns
‣ Different indexes for different columns
COLUMN STORES
DRUID
KEY FEATURES
LOW LATENCY INGESTION · FAST AGGREGATIONS · ARBITRARY SLICE-N-DICE CAPABILITIES · HIGHLY AVAILABLE · APPROXIMATE & EXACT CALCULATIONS
DRUID
DATA STORAGE
DATA!
timestamp             page           language  city     country  ...  added  deleted
2011-01-01T00:01:35Z  Justin Bieber  en        SF       USA      ...  10     65
2011-01-01T00:01:63Z  Justin Bieber  en        SF       USA      ...  15     62
2011-01-01T01:02:51Z  Justin Bieber  en        SF       USA      ...  32     45
2011-01-01T01:01:11Z  Ke$ha          en        Calgary  CA       ...  17     87
2011-01-01T01:02:24Z  Ke$ha          en        Calgary  CA       ...  43     99
2011-01-01T02:03:12Z  Ke$ha          en        Calgary  CA       ...  12     53
...
PRE-AGGREGATION/ROLL-UP
Rolled up (hourly granularity):
timestamp             page           language  city     country  ...  added  deleted
2011-01-01T00:00:00Z  Justin Bieber  en        SF       USA      ...  25     127
2011-01-01T01:00:00Z  Justin Bieber  en        SF       USA      ...  32     45
2011-01-01T01:00:00Z  Ke$ha          en        Calgary  CA       ...  60     186
2011-01-01T02:00:00Z  Ke$ha          en        Calgary  CA       ...  12     53
...

Raw events:
timestamp             page           language  city     country  ...  added  deleted
2011-01-01T00:01:35Z  Justin Bieber  en        SF       USA      ...  10     65
2011-01-01T00:01:63Z  Justin Bieber  en        SF       USA      ...  15     62
2011-01-01T01:02:51Z  Justin Bieber  en        SF       USA      ...  32     45
2011-01-01T01:01:11Z  Ke$ha          en        Calgary  CA       ...  17     87
2011-01-01T01:02:24Z  Ke$ha          en        Calgary  CA       ...  43     99
2011-01-01T02:03:12Z  Ke$ha          en        Calgary  CA       ...  12     53
...
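The roll-up step described above (truncate timestamps to the query granularity, then sum the metrics for each distinct combination of dimensions) can be sketched in a few lines. This is illustrative only; Druid does this during ingestion. The timestamps are valid variants of the slide's rows.

```python
from collections import defaultdict

# Raw edit events: timestamp, dimensions, then the metrics `added`/`deleted`.
events = [
    ("2011-01-01T00:01:35Z", "Justin Bieber", "en", "SF", "USA", 10, 65),
    ("2011-01-01T00:01:53Z", "Justin Bieber", "en", "SF", "USA", 15, 62),
    ("2011-01-01T01:02:51Z", "Justin Bieber", "en", "SF", "USA", 32, 45),
    ("2011-01-01T01:01:11Z", "Ke$ha", "en", "Calgary", "CA", 17, 87),
    ("2011-01-01T01:02:24Z", "Ke$ha", "en", "Calgary", "CA", 43, 99),
    ("2011-01-01T02:03:12Z", "Ke$ha", "en", "Calgary", "CA", 12, 53),
]

def rollup(events):
    """Truncate each timestamp to the hour, then sum the metrics for every
    distinct (hour, dimensions) combination."""
    totals = defaultdict(lambda: [0, 0])
    for ts, page, lang, city, country, added, deleted in events:
        hour = ts[:13] + ":00:00Z"  # hourly query granularity
        totals[(hour, page, lang, city, country)][0] += added
        totals[(hour, page, lang, city, country)][1] += deleted
    return {k: tuple(v) for k, v in totals.items()}

summary = rollup(events)  # 6 raw rows become 4 summarized rows
```

Coarser granularities (day, week) roll up more aggressively and trade detail for storage.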
PARTITION DATA
timestamp             page           language  city     country  ...  added  deleted
2011-01-01T00:00:00Z  Justin Bieber  en        SF       USA      ...  25     127
2011-01-01T01:00:00Z  Justin Bieber  en        SF       USA      ...  32     45
2011-01-01T01:00:00Z  Ke$ha          en        Calgary  CA       ...  60     186
2011-01-01T02:00:00Z  Ke$ha          en        Calgary  CA       ...  12     53
‣ Shard data by time ‣ Immutable blocks of data called “segments”
Segment 2011-01-01T02/2011-01-01T03
Segment 2011-01-01T01/2011-01-01T02
Segment 2011-01-01T00/2011-01-01T01
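Time sharding as shown above can be sketched by mapping each row's timestamp to a segment interval. This is a simplification: real Druid segment identifiers also carry a datasource name and version, and intervals roll over day boundaries.

```python
from collections import defaultdict

def segment_interval(ts):
    """Map a timestamp to its hourly segment interval, e.g.
    '2011-01-01T00:...' -> '2011-01-01T00/2011-01-01T01'.
    (Simplified: does not handle the 23:00 -> next-day rollover.)"""
    day, hour = ts[:10], int(ts[11:13])
    return f"{day}T{hour:02d}/{day}T{hour + 1:02d}"

# Bucket the rolled-up rows from the slide into immutable segments.
timestamps = ["2011-01-01T00:00:00Z", "2011-01-01T01:00:00Z",
              "2011-01-01T01:00:00Z", "2011-01-01T02:00:00Z"]
segments = defaultdict(list)
for ts in timestamps:
    segments[segment_interval(ts)].append(ts)
# -> three segments: T00/T01, T01/T02 (two rows), T02/T03
```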
IMMUTABLE SEGMENTS
‣ Fundamental storage unit in Druid ‣ No contention between reads and writes ‣ One thread scans one segment ‣ Multiple threads can access same underlying data
COLUMNAR STORAGE
‣ Scan/load only what you need ‣ Compression! ‣ Indexes!
timestamp             page           language  city     country  ...  added  deleted
2011-01-01T00:01:35Z  Justin Bieber  en        SF       USA      ...  10     65
2011-01-01T00:03:63Z  Justin Bieber  en        SF       USA      ...  15     62
2011-01-01T00:04:51Z  Justin Bieber  en        SF       USA      ...  32     45
2011-01-01T01:00:00Z  Ke$ha          en        Calgary  CA       ...  17     87
2011-01-01T02:00:00Z  Ke$ha          en        Calgary  CA       ...  43     99
2011-01-01T02:00:00Z  Ke$ha          en        Calgary  CA       ...  12     53
...
COLUMN COMPRESSION · DICTIONARIES
‣ Create ids • Justin Bieber -> 0, Ke$ha -> 1
‣ Store • page -> [0 0 0 1 1 1] • language -> [0 0 0 0 0 0]
timestamp             page           language  city     country  ...  added  deleted
2011-01-01T00:01:35Z  Justin Bieber  en        SF       USA      ...  10     65
2011-01-01T00:03:63Z  Justin Bieber  en        SF       USA      ...  15     62
2011-01-01T00:04:51Z  Justin Bieber  en        SF       USA      ...  32     45
2011-01-01T01:00:00Z  Ke$ha          en        Calgary  CA       ...  17     87
2011-01-01T02:00:00Z  Ke$ha          en        Calgary  CA       ...  43     99
2011-01-01T02:00:00Z  Ke$ha          en        Calgary  CA       ...  12     53
...
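Dictionary encoding as in the bullets above can be sketched as follows. One simplification: ids are assigned here in order of first appearance, whereas Druid assigns them in sorted order of the dictionary.

```python
def dictionary_encode(column):
    """Assign each distinct string an integer id and store the column as
    ids only; low-cardinality columns compress extremely well this way."""
    ids, encoded = {}, []
    for value in column:
        if value not in ids:
            ids[value] = len(ids)
        encoded.append(ids[value])
    return ids, encoded

page_dict, page_col = dictionary_encode(["Justin Bieber"] * 3 + ["Ke$ha"] * 3)
lang_dict, lang_col = dictionary_encode(["en"] * 6)
# page -> [0, 0, 0, 1, 1, 1]; language -> [0, 0, 0, 0, 0, 0]
```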
BITMAP INDICES
‣ Justin Bieber -> [0, 1, 2] -> [111000] ‣ Ke$ha -> [3, 4, 5] -> [000111]
timestamp             page           language  city     country  ...  added  deleted
2011-01-01T00:01:35Z  Justin Bieber  en        SF       USA      ...  10     65
2011-01-01T00:03:63Z  Justin Bieber  en        SF       USA      ...  15     62
2011-01-01T00:04:51Z  Justin Bieber  en        SF       USA      ...  32     45
2011-01-01T01:00:00Z  Ke$ha          en        Calgary  CA       ...  17     87
2011-01-01T02:00:00Z  Ke$ha          en        Calgary  CA       ...  43     99
2011-01-01T02:00:00Z  Ke$ha          en        Calgary  CA       ...  12     53
...
FAST AND FLEXIBLE QUERIES
JUSTIN BIEBER           [1, 1, 0, 0]
KE$HA                   [0, 0, 1, 1]
JUSTIN BIEBER OR KE$HA  [1, 1, 1, 1]
row  page
0    Justin Bieber
1    Justin Bieber
2    Ke$ha
3    Ke$ha
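The bitmap-index filtering shown on this slide can be sketched with plain 0/1 lists. Druid actually stores these as compressed bitmaps (Concise, and later Roaring), but the boolean algebra is the same.

```python
def build_bitmaps(column):
    """One bitmap per distinct value: bit i is set iff row i holds that
    value. (Druid keeps these compressed; plain lists suffice here.)"""
    bitmaps = {}
    for i, value in enumerate(column):
        bitmaps.setdefault(value, [0] * len(column))[i] = 1
    return bitmaps

rows = ["Justin Bieber", "Justin Bieber", "Ke$ha", "Ke$ha"]
bitmaps = build_bitmaps(rows)

# "page = Justin Bieber OR page = Ke$ha" is a bitwise OR of two bitmaps:
either = [a | b for a, b in zip(bitmaps["Justin Bieber"], bitmaps["Ke$ha"])]
matching_rows = [i for i, bit in enumerate(either) if bit]
```

AND and NOT filters work the same way, so arbitrary boolean filters reduce to fast bitmap operations before any rows are scanned.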
ARCHITECTURE
ARCHITECTURE (BATCH ONLY)
[Diagram: Data → Hadoop → Segments → Historical Nodes (×3)]
‣ Main workhorses of a Druid cluster ‣ Respond to queries on segments ‣ Shared-nothing architecture
HISTORICAL NODES
ARCHITECTURE (BATCH ONLY)
[Diagram: Data → Hadoop → Segments → Historical Nodes (×3) → Broker Nodes (×2) → Queries]
‣ Knows which nodes hold what data ‣ Query scatter/gather (send requests to nodes and merge results) ‣ Caching
BROKER NODES
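The broker's scatter/gather step can be sketched as below. The names (`query_node`, `scatter_gather`) and the in-memory "segments" are illustrative stand-ins, not Druid's API; a real broker routes over the network using its view of which node serves which segment.

```python
def query_node(node, segments):
    """Stand-in for a network call to one historical node: sum the metric
    per key across the segments that node serves."""
    partial = {}
    for segment in segments:
        for key, value in segment.items():
            partial[key] = partial.get(key, 0) + value
    return partial

def scatter_gather(segment_map):
    """Fan the query out to every node, then merge the partial results."""
    merged = {}
    for node, segments in segment_map.items():
        for key, value in query_node(node, segments).items():
            merged[key] = merged.get(key, 0) + value
    return merged

# Two historical nodes, each serving two (toy) segments:
segment_map = {
    "historical-1": [{"Justin Bieber": 25}, {"Justin Bieber": 32}],
    "historical-2": [{"Ke$ha": 60}, {"Ke$ha": 12}],
}
result = scatter_gather(segment_map)
```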
EVOLVING A SOLUTION
Hadoop (pre-processing and storage) Druid
[Diagram: Event Streams → Hadoop → Druid → Insight]
MORE PROBLEMS
‣ We’ve solved the query problem
• Druid gave us arbitrary data exploration & fast queries
‣ But what about data freshness? • Batch loading is slow! • We want “real-time” • Alerts, operational monitoring, etc.
FAST LOADING WITH DRUID
‣ We have an indexing system ‣ We have a serving system that runs queries on data ‣ We can serve queries while building indexes! ‣ Real-time indexing workers do this
‣ Write-optimized data structure: hash map in heap
‣ Convert write optimized -> read optimized
‣ Read-optimized data structure: Druid segments
‣ Query data immediately
REAL-TIME NODES
[Diagram: Events → in-memory buffer → convert → Segment; queries served from both]
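The write-optimized-to-read-optimized flow of a real-time node can be sketched as below. This is a toy, assuming a single metric and a dict in place of a real Druid segment; the point is that data is queryable the moment it lands in the buffer.

```python
class RealtimeIndexer:
    """Sketch: buffer events in a write-optimized in-heap map, periodically
    persist the buffer as an immutable read-optimized 'segment', and serve
    queries by scanning persisted segments plus the live buffer."""

    def __init__(self):
        self.buffer = {}    # write-optimized: hash map in heap
        self.segments = []  # read-optimized: immutable persisted chunks

    def ingest(self, key, added):
        self.buffer[key] = self.buffer.get(key, 0) + added

    def persist(self):
        # Convert the write-optimized buffer into an immutable segment.
        self.segments.append(dict(self.buffer))
        self.buffer = {}

    def query_sum(self, key):
        # Queryable immediately: scan persisted segments and the buffer.
        total = sum(seg.get(key, 0) for seg in self.segments)
        return total + self.buffer.get(key, 0)
```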
ARCHITECTURE (STREAMING-ONLY)
[Diagram: Streaming Data → Real-time Nodes → Segments → Historical Nodes (×3) → Broker Nodes (×2) → Queries]
ARCHITECTURE (LAMBDA)
[Diagram: Batch Data → Hadoop → Segments → Historical Nodes (×3); Streaming Data → Real-time Nodes → Segments; both paths → Broker Nodes (×2) → Queries]
APPROXIMATE ANSWERS
‣ Drastically reduce storage space and compute time
• Cardinality estimation • Histograms • Quantiles • Add your own proprietary modules
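To give a flavor of how approximate cardinality estimation trades accuracy for space: the sketch below uses the k-minimum-values idea (keep only the k smallest hashes; the k-th smallest tells you how densely distincts fill [0, 1)). This is a stand-in to show the principle; Druid itself ships a HyperLogLog-based aggregator.

```python
import hashlib

def estimate_cardinality(values, k=256):
    """K-minimum-values sketch. Hash every value into [0, 1); keep the k
    smallest distinct hashes; if the k-th smallest is x, estimate the
    number of distinct values as (k - 1) / x."""
    hashes = set()
    for v in values:
        h = int(hashlib.sha1(str(v).encode()).hexdigest()[:15], 16)
        hashes.add(h / float(16 ** 15))  # normalize into [0, 1)
    smallest = sorted(hashes)[:k]
    if len(smallest) < k:
        return len(smallest)  # fewer than k distinct values: exact count
    return int((k - 1) / smallest[-1])

estimate = estimate_cardinality(range(100_000), k=256)
# close to 100,000 while remembering only 256 hashes
```

The error shrinks as k grows; either way the memory footprint is fixed, which is what makes these sketches cheap to store per segment and merge across nodes.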
QUERY INTERFACE
‣ Query libraries:
• JSON over HTTP • SQL • R • Python • Ruby • Perl
‣ UIs • Pivot • Grafana • Panoramix
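To make the "JSON over HTTP" bullet concrete, here is a native timeseries query over the example dataset: hourly sums of `added` and `deleted` for one day. The datasource name, host, and port are illustrative.

```python
import json

# Druid's native query language is JSON, POSTed to a broker node.
query = {
    "queryType": "timeseries",
    "dataSource": "wikipedia",
    "granularity": "hour",
    "aggregations": [
        {"type": "longSum", "name": "added", "fieldName": "added"},
        {"type": "longSum", "name": "deleted", "fieldName": "deleted"},
    ],
    "intervals": ["2011-01-01/2011-01-02"],
}
body = json.dumps(query)
# POST to the broker, e.g.:
# requests.post("http://broker:8082/druid/v2", data=body,
#               headers={"Content-Type": "application/json"})
```

The language bindings and UIs listed above all build payloads like this under the hood.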
DRUID TODAY
THE COMMUNITY
‣ Growing Community
• 130+ contributors from many different companies • In production at many different companies, we’re hoping for more!
• Ad-tech, network traffic, operations, activity streams, etc. • We love contributions!
PRODUCTION READY
‣ High availability through replication ‣ Rolling restarts ‣ Four years of software updates and restarts with no downtime ‣ Battle tested ‣ Used by hundreds of companies in production
REALTIME INGESTION
>3M EVENTS / SECOND SUSTAINED (200B+ EVENTS / DAY)
10 – 100K EVENTS / SECOND / CORE
DRUID IN PRODUCTION
CLUSTER SIZE
>500TB OF SEGMENTS (>50 TRILLION RAW EVENTS)
>5000 CORES (>400 NODES, >100TB RAM)
IT’S CHEAP
MOST COST EFFECTIVE AT THIS SCALE
DRUID IN PRODUCTION
[Chart: query latency percentiles (90%ile, 95%ile, 99%ile) by datasource (a–h), Feb 03–Feb 24; y-axis: query time in seconds]
QUERY LATENCY (500MS AVERAGE)
90% < 1S · 95% < 2S · 99% < 10S
DRUID IN PRODUCTION
QUERY VOLUME
SEVERAL HUNDRED QUERIES / SECOND
VARIETY OF GROUP BY & TOP-K QUERIES
DRUID IN PRODUCTION
TAKE AWAYS
TAKE-AWAYS
‣ When Druid?
• You want to power user-facing data applications • You want to do your analysis on data as it’s happening (real-time) • Arbitrary data exploration with sub-second ad-hoc queries • OLAP, BI, Pivot (anything involving aggregates) • You need availability, extensibility, and flexibility
MY INFORMATION
[email protected] twitter @gianmerlino LinkedIn gianmerlino
THANK YOU