DRUID - events.csdn.net · 2013
DRUID INTERACTIVE EXPLORATORY ANALYTICS AT SCALE
FANGJIN YANG · DRUID COMMITTER
OVERVIEW
DEMO · SEE SOME NEAT THINGS
MOTIVATION · WHY DRUID?
ARCHITECTURE · PICTURES WITH ARROWS
COMMUNITY · CONTRIBUTE TO DRUID
2013
THE PROBLEM
‣ Arbitrary and interactive exploration of time series data • Ad-tech, system/app metrics, network/website traffic analysis ‣ Multi-tenancy: lots of concurrent users ‣ Scalability: 10+ TB/day, ad-hoc queries on trillions of events ‣ Recency matters! Real-time analysis
DEMO
IN CASE THE INTERNET DIDN’T WORK PRETEND YOU SAW SOMETHING COOL
2015
REQUIREMENTS
‣ Scalable & highly available ‣ Real-time data ingestion ‣ Arbitrary data exploration with ad-hoc queries ‣ Sub-second queries ‣ Many concurrent reads
FINDING A SOLUTION
‣ Load all your data into Hadoop. Query it. Done! ‣ Good job guys, let’s go home
FINDING A SOLUTION
[Diagram: Event Streams → Hadoop → Insight]
PROBLEMS WITH THE NAIVE SOLUTION
‣ MapReduce can handle almost every distributed computing problem ‣ MapReduce over your raw data is flexible but slow ‣ Hadoop is not optimized for query latency ‣ To optimize queries, we need a query layer
FINDING A SOLUTION
[Diagram: Event Streams → Hadoop (pre-processing and storage) → Query Layer → Insight]
MAKE QUERIES FASTER
‣ What types of queries to optimize for? • Business intelligence/OLAP/pivot-table queries • Aggregations, filters, groupBys
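These query shapes map directly onto Druid's native JSON query API. As a sketch, a groupBy query with a filter and two sum aggregations might look like the following (the datasource and column names are borrowed from the Wikipedia example later in this deck, and the exact values are illustrative):

```python
import json

# A sketch of a Druid native groupBy query; field names follow Druid's
# JSON query API, while the datasource/columns here are illustrative.
query = {
    "queryType": "groupBy",
    "dataSource": "wikipedia",
    "granularity": "hour",
    "dimensions": ["page", "country"],
    "filter": {"type": "selector", "dimension": "language", "value": "en"},
    "aggregations": [
        {"type": "longSum", "name": "added", "fieldName": "added"},
        {"type": "longSum", "name": "deleted", "fieldName": "deleted"},
    ],
    "intervals": ["2011-01-01T00:00:00Z/2011-01-02T00:00:00Z"],
}

# The query would be POSTed as JSON to a broker node.
print(json.dumps(query, indent=2))
```

Aggregations, filters, and groupBys are all first-class fields of the query object, which is what makes them cheap to optimize for.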
WHAT WE TRIED
FINDING A SOLUTION
[Diagram: Event Streams → Hadoop (pre-processing and storage) → RDBMS? → Insight]
‣ Common solution in data warehousing: • Star Schema • Aggregate Tables • Query Caching
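The idea behind aggregate tables can be sketched in a few lines: roll the base fact table up once, so dashboard queries read the small rollup instead of scanning raw rows. The fact rows below are hypothetical:

```python
from collections import defaultdict

# Hypothetical fact rows: (day, gender, revenue)
facts = [
    ("2011-01-01", "M", 0.15),
    ("2011-01-01", "F", 1.03),
    ("2011-01-01", "F", 0.01),
    ("2011-01-02", "M", 0.40),
]

# Build a daily aggregate table once; queries for daily revenue
# then never touch the base fact table.
daily_revenue = defaultdict(float)
for day, gender, revenue in facts:
    daily_revenue[day] += revenue

print(round(daily_revenue["2011-01-01"], 2))  # 1.19, served from the rollup
```

The catch, as the next slides show, is that any query the rollup (or cache) doesn't anticipate falls through to the base table.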
I. RDBMS - THE SETUP
‣ Queries that were cached • fast
‣ Queries against aggregate tables • fast to acceptable
‣ Queries against base fact table • generally unacceptable
I. RDBMS - THE RESULTS
I. RDBMS - PERFORMANCE
Naive benchmark scan rate: ~5.5M rows / second / core
1 day of summarized aggregates: 60M+ rows
1 query over 1 week, 16 cores: ~5 seconds
Page load with 20 queries over a week of data: a long time
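The "~5 seconds" figure follows directly from the other numbers on this slide:

```python
# Back-of-the-envelope check of the slide's numbers.
scan_rate = 5.5e6        # rows / second / core
rows_per_day = 60e6      # 1 day of summarized aggregates
cores = 16

rows_per_week = rows_per_day * 7               # 420M rows
seconds = rows_per_week / (cores * scan_rate)  # ~4.8 s, i.e. "~5 seconds"
print(round(seconds, 1))
```

Multiply by 20 queries per page load and the latency budget is blown by two orders of magnitude.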
FINDING A SOLUTION
[Diagram: Event Streams → Hadoop (pre-processing and storage) → NoSQL K/V Stores? → Insight]
‣ Pre-aggregate all dimensional combinations ‣ Store results in a NoSQL store
II. NOSQL - THE SETUP
ts  gender  age  revenue
1   M       18   $0.15
1   F       25   $1.03
1   F       18   $0.01

Key     Value
1       revenue=$1.19
1,M     revenue=$0.15
1,F     revenue=$1.04
1,18    revenue=$0.16
1,25    revenue=$1.03
1,M,18  revenue=$0.15
1,F,18  revenue=$0.01
1,F,25  revenue=$1.03
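The key/value table above is the result of aggregating over every subset of the dimensions. A sketch of that precomputation, using the three rows and the "ts,dim,dim" key convention from the table:

```python
from collections import defaultdict
from itertools import combinations

# Rows from the table above: (ts, gender, age, revenue)
rows = [(1, "M", 18, 0.15), (1, "F", 25, 1.03), (1, "F", 18, 0.01)]
dims = ["gender", "age"]

# Pre-aggregate revenue for every subset of dimensions, keyed like "1,F,18".
agg = defaultdict(float)
for ts, gender, age, revenue in rows:
    values = {"gender": gender, "age": age}
    for r in range(len(dims) + 1):
        for combo in combinations(dims, r):
            key = ",".join([str(ts)] + [str(values[d]) for d in combo])
            agg[key] += revenue

print(round(agg["1"], 2))      # 1.19
print(round(agg["1,F"], 2))    # 1.04
print(round(agg["1,F,25"], 2)) # 1.03
```

Each record fans out into one key per dimension subset, which is exactly why the next slide's precompute times blow up.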
‣ Queries were fast • range scan on primary key ‣ Inflexible • combinations that weren't pre-aggregated aren't available ‣ Does not work well with streams
II. NOSQL - THE RESULTS
‣ Processing scales exponentially! ‣ Example: ~500k records • Precompute 11 dimensions → 4.5 hours on a 15-node Hadoop cluster • Precompute 14 dimensions → 9 hours on a 25-node Hadoop cluster
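The exponential blow-up is simple to see: each of the d dimensions is either present or absent in a pre-aggregated key, so every record fans out into 2^d combinations:

```python
# Each of d dimensions may be present or absent in a pre-aggregated key,
# so every record fans out into 2**d combination rows.
for d in (11, 14):
    print(d, 2 ** d)  # 11 -> 2048 combinations, 14 -> 16384
```

Going from 11 to 14 dimensions means 8x the combinations per record, consistent with needing a bigger cluster and double the wall-clock time.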
II. NOSQL - PERFORMANCE
FINDING A SOLUTION
[Diagram: Event Streams → Hadoop (pre-processing and storage) → Commercial Databases → Insight]
DRUID AS A QUERY LAYER
KEY FEATURES
LOW LATENCY INGESTION · FAST AGGREGATIONS · ARBITRARY SLICE-N-DICE CAPABILITIES · HIGHLY AVAILABLE · APPROXIMATE & EXACT CALCULATIONS
DRUID
DATA STORAGE
DATA!
timestamp             page           language  city     country  ...  added  deleted
2011-01-01T00:01:35Z  Justin Bieber  en        SF       USA           10     65
2011-01-01T00:03:63Z  Justin Bieber  en        SF       USA           15     62
2011-01-01T00:04:51Z  Justin Bieber  en        SF       USA           32     45
2011-01-01T01:00:00Z  Ke$ha          en        Calgary  CA            17     87
2011-01-01T02:00:00Z  Ke$ha          en        Calgary  CA            43     99
2011-01-01T02:00:00Z  Ke$ha          en        Calgary  CA            12     53
...
PARTITION DATA
‣ Shard data by time ‣ Immutable chunks of data called “segments”

Segment 2011-01-01T00/2011-01-01T01:
2011-01-01T00:01:35Z  Justin Bieber  en  SF  USA  ...  10  65
2011-01-01T00:03:63Z  Justin Bieber  en  SF  USA  ...  15  62
2011-01-01T00:04:51Z  Justin Bieber  en  SF  USA  ...  32  45

Segment 2011-01-01T01/2011-01-01T02:
2011-01-01T01:00:00Z  Ke$ha  en  Calgary  CA  ...  17  87

Segment 2011-01-01T02/2011-01-01T03:
2011-01-01T02:00:00Z  Ke$ha  en  Calgary  CA  ...  43  99
2011-01-01T02:00:00Z  Ke$ha  en  Calgary  CA  ...  12  53
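Time-based sharding is just bucketing rows by their interval. A sketch, bucketing sample events (modeled on the table above) into hourly segments:

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical events, modeled on the table above.
events = [
    ("2011-01-01T00:01:35Z", "Justin Bieber"),
    ("2011-01-01T00:04:51Z", "Justin Bieber"),
    ("2011-01-01T01:00:00Z", "Ke$ha"),
    ("2011-01-01T02:00:00Z", "Ke$ha"),
]

# Shard by hour: each bucket becomes one immutable segment.
segments = defaultdict(list)
for ts, page in events:
    start = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(minute=0, second=0)
    interval = f"{start.isoformat()}/{(start + timedelta(hours=1)).isoformat()}"
    segments[interval].append((ts, page))

print(sorted(segments))  # three hourly intervals -> three segments
```

Once written, a segment for a closed interval never changes, which is what makes the read path contention-free.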
IMMUTABLE SEGMENTS
‣ Fundamental storage unit in Druid ‣ No contention between reads and writes ‣ One thread scans one segment ‣ Multiple threads can access the same underlying data
COLUMNAR STORAGE
‣ Scan/load only what you need ‣ Compression! ‣ Indexes!
[Same example table as above]
COLUMN COMPRESSION · DICTIONARIES
‣ Create ids • Justin Bieber -> 0, Ke$ha -> 1 ‣ Store • page -> [0 0 0 1 1 1] • language -> [0 0 0 0 0 0]
[Same example table as above]
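Dictionary encoding is a small amount of code: assign each distinct value an id on first sight, then store the column as ids. A sketch, using the page column from the example table:

```python
# Dictionary-encode the page column from the example table.
column = ["Justin Bieber"] * 3 + ["Ke$ha"] * 3

dictionary = {}
encoded = []
for value in column:
    # First occurrence assigns the next id, so ids follow appearance order.
    code = dictionary.setdefault(value, len(dictionary))
    encoded.append(code)

print(dictionary)  # {'Justin Bieber': 0, 'Ke$ha': 1}
print(encoded)     # [0, 0, 0, 1, 1, 1]
```

Small integers compress far better than repeated strings, and the id space also gives the bitmap indexes on the next slide something compact to key on.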
BITMAP INDICES
‣ Justin Bieber -> [0, 1, 2] -> [111000] ‣ Ke$ha -> [3, 4, 5] -> [000111]
[Same example table as above]
FAST AND FLEXIBLE QUERIES
row  page
0    Justin Bieber
1    Justin Bieber
2    Ke$ha
3    Ke$ha

JUSTIN BIEBER  [1, 1, 0, 0]
KE$HA  [0, 0, 1, 1]
JUSTIN BIEBER OR KE$HA  [1, 1, 1, 1]
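Building the bitmaps and evaluating an OR filter can be sketched with plain integers as bitsets (here using the six-row page column from the earlier table, where bit i is set iff row i holds the value):

```python
from collections import defaultdict

column = ["Justin Bieber"] * 3 + ["Ke$ha"] * 3

# One bitmap per distinct value: bit i is set iff row i holds that value.
bitmaps = defaultdict(int)
for row, value in enumerate(column):
    bitmaps[value] |= 1 << row

def rows(bitmap, n):
    """Expand a bitmap into a per-row 0/1 list for display."""
    return [(bitmap >> i) & 1 for i in range(n)]

print(rows(bitmaps["Justin Bieber"], 6))  # [1, 1, 1, 0, 0, 0]

# An OR filter is a single bitwise OR across value bitmaps.
either = bitmaps["Justin Bieber"] | bitmaps["Ke$ha"]
print(rows(either, 6))                    # [1, 1, 1, 1, 1, 1]
```

Production systems compress these bitmaps (e.g. roaring or CONCISE encodings), but the filter algebra is exactly this bitwise logic.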
ARCHITECTURE
ARCHITECTURE (BATCH ONLY)
[Diagram: Hadoop (data) → Segments → Historical Nodes ×3]
‣ Main workhorses of a Druid cluster ‣ Scan segments ‣ Shared-nothing architecture
HISTORICAL NODES
ARCHITECTURE (BATCH ONLY)
[Diagram: Hadoop (data) → Segments → Historical Nodes ×3 → Broker Nodes ×2 → Queries]
‣ Knows which nodes hold what data ‣ Query scatter/gather (send requests to nodes and merge results) ‣ Caching
BROKER NODES
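The scatter/gather step can be sketched as: fan the query out to the nodes owning segments in the interval, then merge the partial aggregates. Node names and the partial sums below are hypothetical (modeled on the "added" column of the example table):

```python
# Toy scatter/gather: each historical node holds partial sums per segment
# (node names and values are illustrative).
historical_nodes = {
    "node1": {"2011-01-01T00/01": 57},
    "node2": {"2011-01-01T01/02": 17},
    "node3": {"2011-01-01T02/03": 55},
}

def broker_query(interval_prefix):
    partials = []
    for node, segments in historical_nodes.items():   # scatter
        partials.extend(v for k, v in segments.items()
                        if k.startswith(interval_prefix))
    return sum(partials)                              # gather: merge partials

print(broker_query("2011-01-01"))  # 129
```

Because sums (and most Druid aggregators) are mergeable, the broker only ever touches small partial results, never raw rows.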
EVOLVING A SOLUTION
[Diagram: Event Streams → Hadoop (pre-processing and storage) → Druid → Insight]
MORE PROBLEMS
‣ We’ve solved the query problem • Druid gave us arbitrary data exploration & fast queries ‣ But what about data freshness? • Batch loading is slow! • We want “real-time” • Alerts, operational monitoring, etc.
FAST LOADING WITH DRUID
‣ We have an indexing system ‣ We have a serving system that runs queries on data ‣ We can serve queries while building indexes! ‣ Real-time indexing workers do this
‣ Write-optimized data structure: hash map in heap
‣ Read-optimized data structure: Druid segments
‣ Convert write optimized -> read optimized
‣ Query data as soon as it is ingested ‣ Log-structured merge-tree
REAL-TIME NODES
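The bullets above describe a log-structured merge-tree. A minimal sketch (class and method names are illustrative, not Druid's API): writes land in an in-heap hash map, which is periodically frozen into a sorted, immutable structure, and reads merge across both:

```python
from collections import defaultdict

class RealtimeNode:
    """Toy LSM-style real-time node (illustrative, not Druid's API)."""

    def __init__(self):
        self.heap = defaultdict(int)   # write-optimized: key -> running aggregate
        self.segments = []             # read-optimized: sorted immutable lists

    def ingest(self, key, value):
        self.heap[key] += value        # queryable immediately after ingestion

    def persist(self):
        # Convert write-optimized -> read-optimized, then start a fresh map.
        self.segments.append(sorted(self.heap.items()))
        self.heap = defaultdict(int)

    def query(self, key):
        # Reads merge the live map with all persisted segments.
        total = self.heap[key]
        for segment in self.segments:
            total += sum(v for k, v in segment if k == key)
        return total

node = RealtimeNode()
node.ingest("Ke$ha", 17)
node.persist()
node.ingest("Ke$ha", 43)
print(node.query("Ke$ha"))  # 60: spans a persisted segment and the live map
```

In Druid the persisted pieces are real columnar segments that eventually get handed off to historical nodes.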
ARCHITECTURE (STREAMING-ONLY)
[Diagram: Streaming Data → Real-time Nodes → Segments → Historical Nodes ×3; Broker Nodes ×2 → Queries]
ARCHITECTURE (LAMBDA)
[Diagram: Batch Data → Hadoop → Segments → Historical Nodes ×3; Streaming Data → Real-time Nodes → Segments; Broker Nodes ×2 → Queries]
APPROXIMATE ANSWERS
‣ Drastically reduce storage space and compute time • Cardinality estimation • Histograms and quantiles • Funnel analysis • Add your own proprietary modules
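As a flavor of how sketches trade accuracy for space, here is a k-minimum-values cardinality estimator (one of the simplest such sketches; Druid itself ships HyperLogLog-based aggregators, so this is an illustration, not Druid's implementation):

```python
import hashlib

def kmv_estimate(values, k=100):
    """k-minimum-values sketch: keep the k smallest normalized hashes;
    for large n, estimate n ~ (k - 1) / h_k (typically ~10% error for k=100)."""
    hashes = sorted({
        int(hashlib.md5(str(v).encode()).hexdigest(), 16) / 2**128
        for v in values
    })
    if len(hashes) < k:
        return len(hashes)  # small sets: the sketch is exact
    return int((k - 1) / hashes[k - 1])

exact = 10000
approx = kmv_estimate(range(exact))
print(approx)  # close to 10000, using only 100 stored hashes
```

Only k hashes are ever stored, regardless of how many distinct values stream past, and sketches from different segments can be merged at the broker.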
PRODUCTION READY
‣ High availability through replication ‣ Rolling restarts ‣ 4 years of no downtime for software updates and restarts ‣ Battle tested
DRUID TODAY
THE COMMUNITY
‣ Growing community • 120+ contributors from many different companies • In production at many different companies, and we’re hoping for more! • Ad-tech, network traffic, operations, activity streams, etc. • We love contributions!
2014
REALTIME INGESTION >3M EVENTS / SECOND SUSTAINED (200B+ EVENTS/DAY)
10 – 100K EVENTS / SECOND / CORE
DRUID IN PRODUCTION
CLUSTER SIZE >500TB OF SEGMENTS (>30 TRILLION RAW EVENTS)
>5000 CORES (>350 NODES, >100TB RAM)
IT’S CHEAP: MOST COST-EFFECTIVE AT THIS SCALE
DRUID IN PRODUCTION
[Chart: query latency percentiles (90%ile, 95%ile, 99%ile), in seconds, for datasources a–h, Feb 03 – Feb 24]
QUERY LATENCY (500MS AVERAGE) 90% < 1S 95% < 2S 99% < 10S
DRUID IN PRODUCTION
QUERY VOLUME SEVERAL HUNDRED QUERIES / SECOND
VARIETY OF GROUP BY & TOP-K QUERIES
DRUID IN PRODUCTION
DRUID AND THE DATA INFRASTRUCTURE SPACE
STREAMING SETUP
[Diagram: Event Streams → Kafka → Samza → Druid → Insight]
STREAMING ONLY INGESTION
‣ Stream processing isn’t perfect ‣ Difficult to handle corrections of existing data ‣ Windows may be too small for fully accurate operations ‣ Hadoop was actually good at these things
OPEN SOURCE LAMBDA ARCHITECTURE
[Diagram: Event Streams → Kafka → Samza (real-time, only on-time data) → Druid → Insight; Kafka → Hadoop (some hours later, all data) → Druid]
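The serving rule of this lambda setup can be sketched in a few lines: the batch path eventually rewrites each interval with complete data, and wherever a batch segment exists it shadows the real-time segment built from only on-time events (interval keys and values below are illustrative):

```python
# Illustrative per-interval aggregates from the two paths.
realtime = {"2011-01-01T00/01": 55, "2011-01-01T01/02": 17}  # only on-time data
batch    = {"2011-01-01T00/01": 57}                          # all data, hours later

def serve(interval):
    # Batch results shadow real-time results once they exist.
    return batch.get(interval, realtime.get(interval))

print(serve("2011-01-01T00/01"))  # 57: the corrected batch segment wins
print(serve("2011-01-01T01/02"))  # 17: still served from the real-time path
```

This is how late data and corrections get fixed without the stream processor ever having to reprocess anything.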
TAKE-AWAYS
‣ When Druid? • You want to power user-facing data applications • You want to do your analysis on data as it’s happening (real-time) • Arbitrary data exploration with sub-second ad-hoc queries • OLAP, BI, pivot tables (anything involving aggregates) • You need availability, extensibility and flexibility
MY INFORMATION
[email protected] twitter @fangjin LinkedIn fangjin
THANK YOU