Druid at SF Big Analytics 2015-12-01
DRUID INTERACTIVE EXPLORATORY ANALYTICS AT SCALE
GIAN MERLINO · DRUID COMMITTER · COFOUNDER @ IMPLY
OVERVIEW
MOTIVATION · WHY DRUID?
DEMO · AN EXAMPLE APPLICATION
ARCHITECTURE · HIGH LEVEL OVERVIEW
COMMUNITY · CONTRIBUTE TO DRUID
2013
HISTORY & MOTIVATION
‣ Druid was started in 2011 ‣ Powers interactive data applications ‣ Multi-tenancy: lots of concurrent users ‣ Scalability: trillions of events/day, sub-second queries ‣ Real-time analysis
HISTORY & MOTIVATION
‣ Questions lead to more questions ‣ Dig into the dataset using filters, aggregates, and comparisons ‣ Not all interesting queries can be determined upfront
DEMO
IN CASE THE INTERNET DIDN’T WORK PRETEND YOU SAW SOMETHING COOL
A GENERAL SOLUTION?
‣ Load all your data into Hadoop. Query it. Done! ‣ Good job guys, let’s go home
FINDING A SOLUTION
[Diagram: Event Streams → Hadoop → Insight]
FINDING A SOLUTION
Hadoop (pre-processing and storage) Query Layer
[Diagram: Event Streams → Hadoop → Query Layer → Insight]
POSSIBLE SOLUTIONS
MAKE QUERIES FASTER
‣ Optimizing business intelligence (OLAP) queries
• Aggregate measures over time, broken down by dimensions • Revenue over time broken down by product type • Top selling products by volume in San Francisco • Number of unique visitors broken down by age • Not dumping the entire dataset • Not examining individual events
FINDING A SOLUTION
Hadoop (pre-processing and storage) Sharded RDBMS?
[Diagram: Event Streams → Hadoop → Sharded RDBMS? → Insight]
‣ The idea • Row store • Star schema • Aggregate tables • Query cache
‣ But! • Scanning raw data is slow and expensive
GENERAL PURPOSE RDBMS
FINDING A SOLUTION
Hadoop (pre-processing and storage) NoSQL K/V Stores?
[Diagram: Event Streams → Hadoop → NoSQL K/V Stores? → Insight]
‣ Pre-computation • Pre-compute every possible query • Pre-compute a subset of queries • Exponential scaling costs
‣ Range scans • Primary key: dimensions/attributes • Value: measures/metrics (things to aggregate) • Still too slow!
KEY/VALUE STORES
FINDING A SOLUTION
Hadoop (pre-processing and storage) Column Stores
[Diagram: Event Streams → Hadoop → Column Stores → Insight]
‣ Load/scan exactly what you need for a query ‣ Different compression algorithms for different columns ‣ Encoding for string columns ‣ Compression for measure columns
‣ Different indexes for different columns
COLUMN STORES
DRUID
KEY FEATURES
LOW LATENCY INGESTION · FAST AGGREGATIONS · ARBITRARY SLICE-N-DICE CAPABILITIES · HIGHLY AVAILABLE · APPROXIMATE & EXACT CALCULATIONS
DRUID
DATA STORAGE
DATA!
timestamp             page           language  city     country  ...  added  deleted
2011-01-01T00:01:35Z  Justin Bieber  en        SF       USA      ...  10     65
2011-01-01T00:01:63Z  Justin Bieber  en        SF       USA      ...  15     62
2011-01-01T01:02:51Z  Justin Bieber  en        SF       USA      ...  32     45
2011-01-01T01:01:11Z  Ke$ha          en        Calgary  CA       ...  17     87
2011-01-01T01:02:24Z  Ke$ha          en        Calgary  CA       ...  43     99
2011-01-01T02:03:12Z  Ke$ha          en        Calgary  CA       ...  12     53
...
PRE-AGGREGATION/ROLL-UP
Rolled up (hourly granularity):
timestamp             page           language  city     country  ...  added  deleted
2011-01-01T00:00:00Z  Justin Bieber  en        SF       USA      ...  25     127
2011-01-01T01:00:00Z  Justin Bieber  en        SF       USA      ...  32     45
2011-01-01T01:00:00Z  Ke$ha          en        Calgary  CA       ...  60     186
2011-01-01T02:00:00Z  Ke$ha          en        Calgary  CA       ...  12     53
...

Raw events:
timestamp             page           language  city     country  ...  added  deleted
2011-01-01T00:01:35Z  Justin Bieber  en        SF       USA      ...  10     65
2011-01-01T00:01:63Z  Justin Bieber  en        SF       USA      ...  15     62
2011-01-01T01:02:51Z  Justin Bieber  en        SF       USA      ...  32     45
2011-01-01T01:01:11Z  Ke$ha          en        Calgary  CA       ...  17     87
2011-01-01T01:02:24Z  Ke$ha          en        Calgary  CA       ...  43     99
2011-01-01T02:03:12Z  Ke$ha          en        Calgary  CA       ...  12     53
...
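The roll-up step described above (truncate timestamps to the query granularity, then sum the metrics for each distinct combination of dimensions) can be sketched in a few lines. This is illustrative only; Druid does this during ingestion. The timestamps are valid variants of the slide's rows.

```python
from collections import defaultdict

# Raw edit events: timestamp, dimensions, then the metrics `added`/`deleted`.
events = [
    ("2011-01-01T00:01:35Z", "Justin Bieber", "en", "SF", "USA", 10, 65),
    ("2011-01-01T00:01:53Z", "Justin Bieber", "en", "SF", "USA", 15, 62),
    ("2011-01-01T01:02:51Z", "Justin Bieber", "en", "SF", "USA", 32, 45),
    ("2011-01-01T01:01:11Z", "Ke$ha", "en", "Calgary", "CA", 17, 87),
    ("2011-01-01T01:02:24Z", "Ke$ha", "en", "Calgary", "CA", 43, 99),
    ("2011-01-01T02:03:12Z", "Ke$ha", "en", "Calgary", "CA", 12, 53),
]

def rollup(events):
    """Truncate each timestamp to the hour, then sum the metrics for every
    distinct (hour, dimensions) combination."""
    totals = defaultdict(lambda: [0, 0])
    for ts, page, lang, city, country, added, deleted in events:
        hour = ts[:13] + ":00:00Z"  # hourly query granularity
        totals[(hour, page, lang, city, country)][0] += added
        totals[(hour, page, lang, city, country)][1] += deleted
    return {k: tuple(v) for k, v in totals.items()}

summary = rollup(events)  # 6 raw rows become 4 summarized rows
```

Coarser granularities (day, week) roll up more aggressively and trade detail for storage.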
PARTITION DATA
timestamp             page           language  city     country  ...  added  deleted
2011-01-01T00:00:00Z  Justin Bieber  en        SF       USA      ...  25     127
2011-01-01T01:00:00Z  Justin Bieber  en        SF       USA      ...  32     45
2011-01-01T01:00:00Z  Ke$ha          en        Calgary  CA       ...  60     186
2011-01-01T02:00:00Z  Ke$ha          en        Calgary  CA       ...  12     53
‣ Shard data by time ‣ Immutable blocks of data called “segments”
Segment 2011-01-01T02/2011-01-01T03
Segment 2011-01-01T01/2011-01-01T02
Segment 2011-01-01T00/2011-01-01T01
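Time sharding as shown above can be sketched by mapping each row's timestamp to a segment interval. This is a simplification: real Druid segment identifiers also carry a datasource name and version, and intervals roll over day boundaries.

```python
from collections import defaultdict

def segment_interval(ts):
    """Map a timestamp to its hourly segment interval, e.g.
    '2011-01-01T00:...' -> '2011-01-01T00/2011-01-01T01'.
    (Simplified: does not handle the 23:00 -> next-day rollover.)"""
    day, hour = ts[:10], int(ts[11:13])
    return f"{day}T{hour:02d}/{day}T{hour + 1:02d}"

# Bucket the rolled-up rows from the slide into immutable segments.
timestamps = ["2011-01-01T00:00:00Z", "2011-01-01T01:00:00Z",
              "2011-01-01T01:00:00Z", "2011-01-01T02:00:00Z"]
segments = defaultdict(list)
for ts in timestamps:
    segments[segment_interval(ts)].append(ts)
# -> three segments: T00/T01, T01/T02 (two rows), T02/T03
```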
IMMUTABLE SEGMENTS
‣ Fundamental storage unit in Druid ‣ No contention between reads and writes ‣ One thread scans one segment ‣ Multiple threads can access same underlying data
COLUMNAR STORAGE
‣ Scan/load only what you need ‣ Compression! ‣ Indexes!
timestamp             page           language  city     country  ...  added  deleted
2011-01-01T00:01:35Z  Justin Bieber  en        SF       USA      ...  10     65
2011-01-01T00:03:63Z  Justin Bieber  en        SF       USA      ...  15     62
2011-01-01T00:04:51Z  Justin Bieber  en        SF       USA      ...  32     45
2011-01-01T01:00:00Z  Ke$ha          en        Calgary  CA       ...  17     87
2011-01-01T02:00:00Z  Ke$ha          en        Calgary  CA       ...  43     99
2011-01-01T02:00:00Z  Ke$ha          en        Calgary  CA       ...  12     53
...
COLUMN COMPRESSION · DICTIONARIES
‣ Create ids • Justin Bieber -> 0, Ke$ha -> 1
‣ Store • page -> [0 0 0 1 1 1] • language -> [0 0 0 0 0 0]
timestamp             page           language  city     country  ...  added  deleted
2011-01-01T00:01:35Z  Justin Bieber  en        SF       USA      ...  10     65
2011-01-01T00:03:63Z  Justin Bieber  en        SF       USA      ...  15     62
2011-01-01T00:04:51Z  Justin Bieber  en        SF       USA      ...  32     45
2011-01-01T01:00:00Z  Ke$ha          en        Calgary  CA       ...  17     87
2011-01-01T02:00:00Z  Ke$ha          en        Calgary  CA       ...  43     99
2011-01-01T02:00:00Z  Ke$ha          en        Calgary  CA       ...  12     53
...
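Dictionary encoding as in the bullets above can be sketched as follows. One simplification: ids are assigned here in order of first appearance, whereas Druid assigns them in sorted order of the dictionary.

```python
def dictionary_encode(column):
    """Assign each distinct string an integer id and store the column as
    ids only; low-cardinality columns compress extremely well this way."""
    ids, encoded = {}, []
    for value in column:
        if value not in ids:
            ids[value] = len(ids)
        encoded.append(ids[value])
    return ids, encoded

page_dict, page_col = dictionary_encode(["Justin Bieber"] * 3 + ["Ke$ha"] * 3)
lang_dict, lang_col = dictionary_encode(["en"] * 6)
# page -> [0, 0, 0, 1, 1, 1]; language -> [0, 0, 0, 0, 0, 0]
```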
BITMAP INDICES
‣ Justin Bieber -> [0, 1, 2] -> [111000] ‣ Ke$ha -> [3, 4, 5] -> [000111]
timestamp             page           language  city     country  ...  added  deleted
2011-01-01T00:01:35Z  Justin Bieber  en        SF       USA      ...  10     65
2011-01-01T00:03:63Z  Justin Bieber  en        SF       USA      ...  15     62
2011-01-01T00:04:51Z  Justin Bieber  en        SF       USA      ...  32     45
2011-01-01T01:00:00Z  Ke$ha          en        Calgary  CA       ...  17     87
2011-01-01T02:00:00Z  Ke$ha          en        Calgary  CA       ...  43     99
2011-01-01T02:00:00Z  Ke$ha          en        Calgary  CA       ...  12     53
...
FAST AND FLEXIBLE QUERIES
JUSTIN BIEBER           [1, 1, 0, 0]
KE$HA                   [0, 0, 1, 1]
JUSTIN BIEBER OR KE$HA  [1, 1, 1, 1]
row  page
0    Justin Bieber
1    Justin Bieber
2    Ke$ha
3    Ke$ha
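The bitmap-index filtering shown on this slide can be sketched with plain 0/1 lists. Druid actually stores these as compressed bitmaps (Concise, and later Roaring), but the boolean algebra is the same.

```python
def build_bitmaps(column):
    """One bitmap per distinct value: bit i is set iff row i holds that
    value. (Druid keeps these compressed; plain lists suffice here.)"""
    bitmaps = {}
    for i, value in enumerate(column):
        bitmaps.setdefault(value, [0] * len(column))[i] = 1
    return bitmaps

rows = ["Justin Bieber", "Justin Bieber", "Ke$ha", "Ke$ha"]
bitmaps = build_bitmaps(rows)

# "page = Justin Bieber OR page = Ke$ha" is a bitwise OR of two bitmaps:
either = [a | b for a, b in zip(bitmaps["Justin Bieber"], bitmaps["Ke$ha"])]
matching_rows = [i for i, bit in enumerate(either) if bit]
```

AND and NOT filters work the same way, so arbitrary boolean filters reduce to fast bitmap operations before any rows are scanned.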
ARCHITECTURE
ARCHITECTURE (BATCH ONLY)
[Diagram: Data → Hadoop → Segments → Historical Nodes (×3)]
‣ Main workhorses of a Druid cluster ‣ Respond to queries on segments ‣ Shared-nothing architecture
HISTORICAL NODES
ARCHITECTURE (BATCH ONLY)
[Diagram: Data → Hadoop → Segments → Historical Nodes (×3) → Broker Nodes (×2) → Queries]
‣ Knows which nodes hold what data ‣ Query scatter/gather (send requests to nodes and merge results) ‣ Caching
BROKER NODES
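The broker's scatter/gather step can be sketched as below. The names (`query_node`, `scatter_gather`) and the in-memory "segments" are illustrative stand-ins, not Druid's API; a real broker routes over the network using its view of which node serves which segment.

```python
def query_node(node, segments):
    """Stand-in for a network call to one historical node: sum the metric
    per key across the segments that node serves."""
    partial = {}
    for segment in segments:
        for key, value in segment.items():
            partial[key] = partial.get(key, 0) + value
    return partial

def scatter_gather(segment_map):
    """Fan the query out to every node, then merge the partial results."""
    merged = {}
    for node, segments in segment_map.items():
        for key, value in query_node(node, segments).items():
            merged[key] = merged.get(key, 0) + value
    return merged

# Two historical nodes, each serving two (toy) segments:
segment_map = {
    "historical-1": [{"Justin Bieber": 25}, {"Justin Bieber": 32}],
    "historical-2": [{"Ke$ha": 60}, {"Ke$ha": 12}],
}
result = scatter_gather(segment_map)
```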
EVOLVING A SOLUTION
Hadoop (pre-processing and storage) Druid
[Diagram: Event Streams → Hadoop → Druid → Insight]
MORE PROBLEMS
‣ We’ve solved the query problem
• Druid gave us arbitrary data exploration & fast queries
‣ But what about data freshness? • Batch loading is slow! • We want “real-time” • Alerts, operational monitoring, etc.
FAST LOADING WITH DRUID
‣ We have an indexing system ‣ We have a serving system that runs queries on data ‣ We can serve queries while building indexes! ‣ Real-time indexing workers do this
‣ Write-optimized data structure: hash map in heap
‣ Convert write optimized -> read optimized
‣ Read-optimized data structure: Druid segments
‣ Query data immediately
REAL-TIME NODES
[Diagram: Events → in-memory buffer → convert → Segment; queries served from both]
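The write-optimized-to-read-optimized flow of a real-time node can be sketched as below. This is a toy, assuming a single metric and a dict in place of a real Druid segment; the point is that data is queryable the moment it lands in the buffer.

```python
class RealtimeIndexer:
    """Sketch: buffer events in a write-optimized in-heap map, periodically
    persist the buffer as an immutable read-optimized 'segment', and serve
    queries by scanning persisted segments plus the live buffer."""

    def __init__(self):
        self.buffer = {}    # write-optimized: hash map in heap
        self.segments = []  # read-optimized: immutable persisted chunks

    def ingest(self, key, added):
        self.buffer[key] = self.buffer.get(key, 0) + added

    def persist(self):
        # Convert the write-optimized buffer into an immutable segment.
        self.segments.append(dict(self.buffer))
        self.buffer = {}

    def query_sum(self, key):
        # Queryable immediately: scan persisted segments and the buffer.
        total = sum(seg.get(key, 0) for seg in self.segments)
        return total + self.buffer.get(key, 0)
```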
ARCHITECTURE (STREAMING-ONLY)
[Diagram: Streaming Data → Real-time Nodes → Segments → Historical Nodes (×3) → Broker Nodes (×2) → Queries]
ARCHITECTURE (LAMBDA)
[Diagram: Batch Data → Hadoop → Segments → Historical Nodes (×3); Streaming Data → Real-time Nodes → Segments; both paths → Broker Nodes (×2) → Queries]
APPROXIMATE ANSWERS
‣ Drastically reduce storage space and compute time
• Cardinality estimation • Histograms • Quantiles • Add your own proprietary modules
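To give a flavor of how approximate cardinality estimation trades accuracy for space: the sketch below uses the k-minimum-values idea (keep only the k smallest hashes; the k-th smallest tells you how densely distincts fill [0, 1)). This is a stand-in to show the principle; Druid itself ships a HyperLogLog-based aggregator.

```python
import hashlib

def estimate_cardinality(values, k=256):
    """K-minimum-values sketch. Hash every value into [0, 1); keep the k
    smallest distinct hashes; if the k-th smallest is x, estimate the
    number of distinct values as (k - 1) / x."""
    hashes = set()
    for v in values:
        h = int(hashlib.sha1(str(v).encode()).hexdigest()[:15], 16)
        hashes.add(h / float(16 ** 15))  # normalize into [0, 1)
    smallest = sorted(hashes)[:k]
    if len(smallest) < k:
        return len(smallest)  # fewer than k distinct values: exact count
    return int((k - 1) / smallest[-1])

estimate = estimate_cardinality(range(100_000), k=256)
# close to 100,000 while remembering only 256 hashes
```

The error shrinks as k grows; either way the memory footprint is fixed, which is what makes these sketches cheap to store per segment and merge across nodes.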
QUERY INTERFACE
‣ Query libraries:
• JSON over HTTP • SQL • R • Python • Ruby • Perl
‣ UIs • Pivot • Grafana • Panoramix
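To make the "JSON over HTTP" bullet concrete, here is a native timeseries query over the example dataset: hourly sums of `added` and `deleted` for one day. The datasource name, host, and port are illustrative.

```python
import json

# Druid's native query language is JSON, POSTed to a broker node.
query = {
    "queryType": "timeseries",
    "dataSource": "wikipedia",
    "granularity": "hour",
    "aggregations": [
        {"type": "longSum", "name": "added", "fieldName": "added"},
        {"type": "longSum", "name": "deleted", "fieldName": "deleted"},
    ],
    "intervals": ["2011-01-01/2011-01-02"],
}
body = json.dumps(query)
# POST to the broker, e.g.:
# requests.post("http://broker:8082/druid/v2", data=body,
#               headers={"Content-Type": "application/json"})
```

The language bindings and UIs listed above all build payloads like this under the hood.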
DRUID TODAY
THE COMMUNITY
‣ Growing Community
• 130+ contributors from many different companies • In production at many different companies, we’re hoping for more!
• Ad-tech, network traffic, operations, activity streams, etc. • We love contributions!
PRODUCTION READY
‣ High availability through replication ‣ Rolling restarts ‣ Four years of software updates and restarts with no downtime ‣ Battle tested ‣ Used by hundreds of companies in production
REALTIME INGESTION
>3M EVENTS / SECOND SUSTAINED (200B+ EVENTS / DAY)
10 – 100K EVENTS / SECOND / CORE
DRUID IN PRODUCTION
CLUSTER SIZE
>500TB OF SEGMENTS (>50 TRILLION RAW EVENTS)
>5000 CORES (>400 NODES, >100TB RAM)
IT’S CHEAP
MOST COST EFFECTIVE AT THIS SCALE
DRUID IN PRODUCTION
[Chart: query latency percentiles (90%ile, 95%ile, 99%ile) by datasource (a–h), Feb 03–Feb 24; y-axis: query time in seconds]
QUERY LATENCY (500MS AVERAGE)
90% < 1S · 95% < 2S · 99% < 10S
DRUID IN PRODUCTION
QUERY VOLUME
SEVERAL HUNDRED QUERIES / SECOND
VARIETY OF GROUP BY & TOP-K QUERIES
DRUID IN PRODUCTION
TAKE AWAYS
TAKE-AWAYS
‣ When Druid?
• You want to power user-facing data applications • You want to do your analysis on data as it’s happening (real-time) • Arbitrary data exploration with sub-second ad-hoc queries • OLAP, BI, Pivot (anything involving aggregates) • You need availability, extensibility, and flexibility
MY INFORMATION
[email protected] twitter @gianmerlino LinkedIn gianmerlino
THANK YOU