C* Summit 2013: Dude, Where's My Tweet? Taming the Twitter Firehose by Andrew Noonan

#Cassandra2013 Dude, Where’s My Tweet? Taming the Twitter Firehose Andrew Noonan Software Engineer at Gnip @noonanisms

description

Gnip ingests and must serve out hundreds of millions of social activities every day, and social platforms are only growing. This makes the scalability of applications essential for Gnip. Enter Cassandra. Problem solved, right? Not exactly. Gnip's relationship with Cassandra was not all rainbows and unicorns. In this session we will walk you through why we began looking at Cassandra as a data store in the first place and the valuable lessons we learned with Cassandra that have made it an invaluable part of our infrastructure.

Transcript of C* Summit 2013: Dude, Where's My Tweet? Taming the Twitter Firehose by Andrew Noonan

Page 1: C* Summit 2013: Dude, Where's My Tweet? Taming the Twitter Firehose by Andrew Noonan

#Cassandra2013

Dude, Where’s My Tweet? Taming the Twitter Firehose

Andrew Noonan Software Engineer at Gnip

@noonanisms

Page 2

Gnip

Cassandra

Rainbows

Unicorns

???  

Page 3

Page 4

Social Data

Page 5

90% of Fortune 500

120 Billion Activities Delivered Per Month

Page 6

Lots-O-Data

Redundancy & Reliability

Availability

Page 7

Page 8

High Write Throughput ✔

Scalable ✔

Highly Available ✔

Persistent ✔

Page 9

Right?

Page 10

Not Exactly…

Page 11

No Maintenance? Bad Idea

Begin Maintenance -> 2X Data Growth

Scalable, Right?

Bootstrap Failures Due To Cluster Load

Page 12

Reconsider (Life) Choices?

Page 13

Size Tiered Compaction vs Leveled Compaction

How Much Data To Store Per Node

Your Write Pattern Matters Too
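To make the size-tiered side of that comparison concrete, here is a rough sketch of how size-tiered compaction groups SSTables of similar size into buckets and compacts a bucket once it holds enough tables. This is a simplified illustration, not Cassandra's actual code: the function names are hypothetical, and the thresholds (0.5x and 1.5x of the bucket average, minimum of 4 tables) are assumed from the commonly documented defaults.

```python
from typing import List


def bucket_sstables(sizes: List[float],
                    bucket_low: float = 0.5,
                    bucket_high: float = 1.5) -> List[List[float]]:
    """Group SSTable sizes into buckets of 'similar' size.

    A size joins the first bucket whose current average it falls within
    [bucket_low * avg, bucket_high * avg]; otherwise it starts a new bucket.
    """
    buckets: List[List[float]] = []
    for size in sorted(sizes):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            if bucket_low * avg <= size <= bucket_high * avg:
                bucket.append(size)
                break
        else:
            # No existing bucket is close enough in size.
            buckets.append([size])
    return buckets


def compaction_candidates(sizes: List[float],
                          min_threshold: int = 4) -> List[List[float]]:
    """Return only the buckets that hold enough SSTables to be compacted."""
    return [b for b in bucket_sstables(sizes) if len(b) >= min_threshold]


if __name__ == "__main__":
    # Sizes in MB (made-up numbers): four small tables, two large ones.
    print(compaction_candidates([10, 11, 9, 10, 100, 400]))
```

Under this model a large SSTable can wait a long time for similarly sized peers, which is one reason the write pattern and the amount of data per node influence the choice between the two strategies.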

Page 14

compaction_throughput_mb_per_sec

16-32X write rate?

Lots-o-options – explore them
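The "16-32X write rate" rule of thumb for compaction_throughput_mb_per_sec is simple arithmetic, sketched below. The helper name and the sample write rate are assumptions for illustration only; the slide poses the rule as a question, so treat the result as a starting range to test rather than a recommended setting.

```python
from typing import Tuple


def compaction_throughput_range(write_rate_mb_per_sec: float) -> Tuple[float, float]:
    """Return the (low, high) range in MB/s implied by the 16-32x rule of thumb."""
    if write_rate_mb_per_sec < 0:
        raise ValueError("write rate must be non-negative")
    return (16 * write_rate_mb_per_sec, 32 * write_rate_mb_per_sec)


if __name__ == "__main__":
    # Hypothetical sustained insert rate of 2 MB/s per node.
    low, high = compaction_throughput_range(2.0)
    print(f"compaction_throughput_mb_per_sec: try between {low:.0f} and {high:.0f}")
```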

Page 15

Lookup by Tweet ID

Read Rate < Write Rate

Dynamic ColumnFamilies

Page 16

For realz this time!?

Page 17

Page 18

Bloom Filter False Positive Rate

Index Intervals

Only Change Schema On One Node! (For Now)
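The trade-off behind the Bloom filter false positive rate can be estimated with the textbook formula for an optimally sized Bloom filter: bits per key = -ln(p) / (ln 2)^2, where p is the target false positive rate. The sketch below applies that formula; the function names and the key count are hypothetical, and Cassandra's real memory use may differ from this idealized estimate.

```python
import math


def bloom_bits_per_key(fp_rate: float) -> float:
    """Bits per key for an optimally sized Bloom filter at the given rate."""
    if not 0.0 < fp_rate < 1.0:
        raise ValueError("fp_rate must be strictly between 0 and 1")
    return -math.log(fp_rate) / (math.log(2) ** 2)


def bloom_memory_bytes(num_keys: int, fp_rate: float) -> float:
    """Approximate filter size in bytes for num_keys keys."""
    return num_keys * bloom_bits_per_key(fp_rate) / 8


if __name__ == "__main__":
    keys = 1_000_000_000  # assumed number of row keys on one node
    for p in (0.01, 0.1):
        mib = bloom_memory_bytes(keys, p) / (1024 ** 2)
        print(f"fp rate {p}: about {mib:.0f} MiB of filter memory")
```

Loosening the false positive rate saves memory at the cost of more unnecessary SSTable reads, which matters less when the read rate is well below the write rate.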

Page 19

You Won’t Always Fit The Mold and That’s Okay

Explore Your Options No Matter What

Understand The Consequences Of Your Choices

Staging Environment Identical To Production

Page 20

www.gnip.com

@noonanisms