Cabs, Cassandra, and Hailo (at Cassandra EU)

#CASSANDRAEU Cabs, Cassandra, and Hailo David Gardner, Architect at Hailo CASSANDRASUMMITEU

description

My talk from #CassandraEU covering Hailo's use of Cassandra including insight from developers, operations and management, plus lessons learned.

Transcript of Cabs, Cassandra, and Hailo (at Cassandra EU)

Page 1: Cabs, Cassandra, and Hailo (at Cassandra EU)

#CASSANDRAEU

Cabs, Cassandra, and Hailo

David Gardner, Architect at Hailo

CASSANDRASUMMITEU

Page 2: Cabs, Cassandra, and Hailo (at Cassandra EU)

Page 3: Cabs, Cassandra, and Hailo (at Cassandra EU)

Page 4: Cabs, Cassandra, and Hailo (at Cassandra EU)

0.6 to 1.2

• 1,352 changed files with 235,413 additions and 47,487 deletions

• 7,429 commits

• 1,653 tickets completed

https://github.com/apache/cassandra/compare/cassandra-0.6.0...cassandra-1.2

https://github.com/apache/cassandra/blob/trunk/CHANGES.txt

Page 5: Cabs, Cassandra, and Hailo (at Cassandra EU)

What this talk is about

Cassandra adoption at Hailo from three perspectives:

1. Development

2. Operational

3. Management

Page 6: Cabs, Cassandra, and Hailo (at Cassandra EU)

What is Hailo?

Hailo is The Taxi Magnet. Use Hailo to get a cab wherever you are, whenever you want.

Page 7: Cabs, Cassandra, and Hailo (at Cassandra EU)

Page 8: Cabs, Cassandra, and Hailo (at Cassandra EU)

Page 9: Cabs, Cassandra, and Hailo (at Cassandra EU)

Page 10: Cabs, Cassandra, and Hailo (at Cassandra EU)

What is Hailo?

• The world’s highest-rated taxi app – over 11,000 five-star reviews

• Over 500,000 registered passengers

• A Hailo hail is accepted around the world every 4 seconds

• Hailo operates in 15 cities on 3 continents from Tokyo to Toronto in nearly 2 years of operation

Page 11: Cabs, Cassandra, and Hailo (at Cassandra EU)

Hailo is growing

• Hailo is a marketplace that facilitates over $100M in run-rate transactions and is making the world a better place for passengers and drivers

• Hailo has raised over $50M in financing from the world's best investors including Union Square Ventures, Accel, the founder of Skype (via Atomico), Wellington Partners (Spotify), Sir Richard Branson, and our CEO's mother, Janice

Page 12: Cabs, Cassandra, and Hailo (at Cassandra EU)

The history

The story behind Cassandra adoption at Hailo

Page 13: Cabs, Cassandra, and Hailo (at Cassandra EU)

Hailo launched in London in November 2011

• Launched on AWS

• Two PHP/MySQL web apps plus a Java backend

• Mostly built by a team of 3 or 4 backend engineers

• MySQL multi-master for single AZ resilience

Page 14: Cabs, Cassandra, and Hailo (at Cassandra EU)

Why Cassandra?

• A desire for greater resilience (“become a utility”): Cassandra is designed for high availability

• Plans for international expansion around a single consumer app: Cassandra is good at global replication

• Expected growth: Cassandra scales linearly for both reads and writes

• Prior experience: I had experience with Cassandra and could recommend it

Page 15: Cabs, Cassandra, and Hailo (at Cassandra EU)

The path to adoption

• Largely unilateral decision by developers – a result of a startup culture

• Replacement of key consumer app functionality, splitting up the PHP/MySQL web app into a mixture of global PHP/Java services backed by a Cassandra data store

• Launched into production in September 2012 – originally just powering North American expansion, before gradually switching over Dublin and London

Page 16: Cabs, Cassandra, and Hailo (at Cassandra EU)

One year on...

• Further breakdown of functionality into Go/Java SOA

• Migrating all online databases to Cassandra

Page 17: Cabs, Cassandra, and Hailo (at Cassandra EU)

Development perspective

Page 18: Cabs, Cassandra, and Hailo (at Cassandra EU)

“Cassandra just works”

Dom W, Senior Engineer

Page 19: Cabs, Cassandra, and Hailo (at Cassandra EU)

Use cases

1. Entity storage

2. Time series data

Page 20: Cabs, Cassandra, and Hailo (at Cassandra EU)

CF = customers

126007613634425612:
  createdTimestamp: 1370465412
  email: [email protected]: Dave
  familyName: Gardner
  locale: en_GB
  phone: +447911111111

Page 21: Cabs, Cassandra, and Hailo (at Cassandra EU)

Considerations for entity storage

• Do not read the entire entity, update one property and then write back a mutation containing every column

• Only mutate columns that have been set

• This avoids read-before-write race conditions
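
The considerations above can be illustrated with a small in-memory simulation. This is only a sketch: `ColumnStore` is a hypothetical stand-in for a column family where the newest write wins per column, not a client library API, and the updated values are made up.

```python
import itertools


class ColumnStore:
    """Toy column family: each column stores (value, timestamp) and the
    newest timestamp wins, column by column."""

    def __init__(self):
        self._rows = {}
        self._clock = itertools.count(1)  # hypothetical write timestamps

    def mutate(self, row_key, columns):
        ts = next(self._clock)
        row = self._rows.setdefault(row_key, {})
        for name, value in columns.items():
            row[name] = (value, ts)

    def read(self, row_key):
        row = self._rows.get(row_key, {})
        return {name: value for name, (value, _) in row.items()}


def run(partial_updates):
    store = ColumnStore()
    key = "126007613634425612"
    store.mutate(key, {"familyName": "Gardner", "locale": "en_GB",
                       "phone": "+447911111111"})

    # Two writers read the same snapshot before either of them writes.
    snapshot_a = store.read(key)
    snapshot_b = store.read(key)

    if partial_updates:
        # Each writer sends only the column it actually changed.
        store.mutate(key, {"locale": "en_US"})
        store.mutate(key, {"phone": "+447922222222"})
    else:
        # Each writer sends back every column from its stale snapshot.
        store.mutate(key, {**snapshot_a, "locale": "en_US"})
        store.mutate(key, {**snapshot_b, "phone": "+447922222222"})
    return store.read(key)
```

With `run(False)` the second writer silently restores the old `locale`, because its mutation carries the stale value with a newer timestamp. With `run(True)` both changes survive.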

Page 22: Cabs, Cassandra, and Hailo (at Cassandra EU)

Page 23: Cabs, Cassandra, and Hailo (at Cassandra EU)

CF = stats_db

2013-06-01:
  55374fa0-ce2b-11e2-8b8b-0800200c9a66: {“action”:”…
  a48bd800-ce2b-11e2-8b8b-0800200c9a66: {“action”:”…
  b0e15850-ce2b-11e2-8b8b-0800200c9a66: {“action”:”…
  bfac6c80-ce2b-11e2-8b8b-0800200c9a66: {“action”:”…

Page 24: Cabs, Cassandra, and Hailo (at Cassandra EU)

CF = stats_db

LON123456:
  13b247f0-ce2c-11e2-8b8b-0800200c9a66: {“action”:”…
  20f70a40-ce2c-11e2-8b8b-0800200c9a66: {“action”:”…
  2b44d3b0-ce2c-11e2-8b8b-0800200c9a66: {“action”:”…
  338a22f0-ce2c-11e2-8b8b-0800200c9a66: {“action”:”…

Page 25: Cabs, Cassandra, and Hailo (at Cassandra EU)

Page 26: Cabs, Cassandra, and Hailo (at Cassandra EU)

Considerations for time series storage

• Choose row key carefully, since this partitions the records

• Think about how many records you want in a single row

• Denormalise on write into many indexes
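
A minimal sketch of denormalising on write, modelled on the two `stats_db` rows shown earlier (one keyed by day, one keyed by driver). The event fields (`timestamp`, `driver`, `action`) and the `index_entries` helper are assumptions for illustration. A real writer would send each triple to Cassandra through its client library.

```python
import json
import uuid
from datetime import datetime, timezone


def index_entries(event, event_id=None):
    """Return one (row_key, column_name, value) triple per index row."""
    if event_id is None:
        event_id = uuid.uuid1()  # version 1 UUIDs are time-based
    day = datetime.fromtimestamp(event["timestamp"], tz=timezone.utc)
    value = json.dumps(event, sort_keys=True)
    return [
        # Row per day: bounds the row to one day's worth of events.
        (day.strftime("%Y-%m-%d"), str(event_id), value),
        # Row per driver: grows for as long as the driver is active.
        (event["driver"], str(event_id), value),
    ]
```

The choice of row key decides how large each row can get. In this sketch the per-driver row has no upper bound, so a busier index might need a finer key such as driver plus month.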

Page 27: Cabs, Cassandra, and Hailo (at Cassandra EU)

Client libraries

• Gossie (Go)

• Astyanax (Java)

• phpcassa (PHP)

Page 28: Cabs, Cassandra, and Hailo (at Cassandra EU)

Analytics

• With Cassandra we lost the ability to carry out analytics, e.g. COUNT, SUM, AVG, GROUP BY

• We use Acunu Analytics to give us this ability in real time, for pre-planned query templates

• It is backed by Cassandra and therefore highly available, resilient and globally distributed

• Integration is straightforward

Page 29: Cabs, Cassandra, and Hailo (at Cassandra EU)

[Diagram labels: events, NSQ, Acunu, C*]

Page 30: Cabs, Cassandra, and Hailo (at Cassandra EU)

AQL

SELECT SUM(accepted), SUM(ignored), SUM(declined), SUM(withdrawn)
FROM Allocations
WHERE timestamp BETWEEN '1 week ago' AND 'now'
  AND driver='LON123456789'
GROUP BY timestamp(day)
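
To make the result of this query concrete, here is a plain Python sketch that computes the same per-day sums over an in-memory list of events. It does not show how Acunu Analytics works internally. The event shape (a `driver`, a Unix `timestamp` and a 0/1 counter per outcome) and the function name are assumptions.

```python
from collections import defaultdict
from datetime import datetime, timezone

OUTCOMES = ("accepted", "ignored", "declined", "withdrawn")


def allocations_by_day(events, driver, start, end):
    """Sum each outcome per UTC day for one driver within [start, end]."""
    totals = defaultdict(lambda: dict.fromkeys(OUTCOMES, 0))
    for event in events:
        if event["driver"] != driver:
            continue
        if not (start <= event["timestamp"] <= end):
            continue
        day = datetime.fromtimestamp(event["timestamp"], tz=timezone.utc)
        bucket = totals[day.date().isoformat()]
        for outcome in OUTCOMES:
            bucket[outcome] += event.get(outcome, 0)
    return dict(totals)
```

Scanning raw events like this on every request is exactly what becomes impractical at scale, which is why the slide stresses pre-planned query templates.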

Page 31: Cabs, Cassandra, and Hailo (at Cassandra EU)

Page 32: Cabs, Cassandra, and Hailo (at Cassandra EU)

Operational perspective

Page 33: Cabs, Cassandra, and Hailo (at Cassandra EU)

“Allows a team of 2 to achieve things they wouldn’t have considered before Cassandra existed”

Chris H, Operations Engineer

Page 34: Cabs, Cassandra, and Hailo (at Cassandra EU)

Page 35: Cabs, Cassandra, and Hailo (at Cassandra EU)

3 clusters

6 machines per region

3 regions

(stats cluster is a long story)

[Diagram labels: Operational Cluster, Stats Cluster; regions ap-southeast-1, us-east-1, eu-west-1]

Page 36: Cabs, Cassandra, and Hailo (at Cassandra EU)

[Diagram: regions eu-west-1, us-east-1 and ap-southeast-1, each showing two nodes in each of AZ1, AZ2 and AZ3]

Page 37: Cabs, Cassandra, and Hailo (at Cassandra EU)

AWS VPCs with OpenVPN links

3 AZs per region

m1.large machines

Provisioned IOPS EBS

[Diagram labels: Operational Cluster, Stats Cluster; ~1TB/node, ~200GB/node]

Page 38: Cabs, Cassandra, and Hailo (at Cassandra EU)

Backups

• SSTable snapshot

• Used to upload to S3, but this was taking >6 hours and consuming all our network bandwidth

• Now take EBS snapshot of the data volumes

Page 39: Cabs, Cassandra, and Hailo (at Cassandra EU)

Encryption

• Requirement for NYC launch

• We use dmcrypt to encrypt the entire EBS volume

• Chose dmcrypt because it is uncomplicated

• Our tests show a 1% hit to disk performance, which concurs with what Amazon suggest

Page 40: Cabs, Cassandra, and Hailo (at Cassandra EU)

DataStax OpsCenter is a quick win

Page 41: Cabs, Cassandra, and Hailo (at Cassandra EU)

Multi DC

• Something that Cassandra makes trivial

• Would have been very difficult to accomplish active-active inter-DC replication with a team of 2 without Cassandra

• Rolling repair needed to make it safe (we use LOCAL_QUORUM)

• We schedule “narrow repairs” on different nodes in our cluster each night
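
One way to picture the nightly rotation is a simple round-robin plan. This is a sketch under assumptions: the node names are hypothetical, and the slide does not say how a “narrow repair” is invoked. A primary-range repair (`nodetool repair -pr`) is one possible reading.

```python
from datetime import date, timedelta


def repair_schedule(nodes, first_night, nights):
    """Assign one node to each night, cycling through the list in order."""
    plan = []
    for offset in range(nights):
        night = first_night + timedelta(days=offset)
        plan.append((night, nodes[offset % len(nodes)]))
    return plan


if __name__ == "__main__":
    # Hypothetical node names for a single six-node region.
    nodes = ["node-%d" % i for i in range(1, 7)]
    for night, node in repair_schedule(nodes, date(2013, 6, 1), 7):
        print(night.isoformat(), node)
```

A full cycle takes as many nights as there are nodes, so the cycle length is worth checking against `gc_grace_seconds`: every node needs a repair within that window.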

Page 42: Cabs, Cassandra, and Hailo (at Cassandra EU)

Compression

• Our stats cluster was running at ~1.5TB per node

• We didn’t want to add more nodes

• With compression, we are now back to ~600GB

• Easy to accomplish

• `nodetool upgradesstables` on a rolling schedule

Page 43: Cabs, Cassandra, and Hailo (at Cassandra EU)

Management perspective

Page 44: Cabs, Cassandra, and Hailo (at Cassandra EU)

“The days of the quick and dirty are over”

Simon V, EVP Operations

Page 45: Cabs, Cassandra, and Hailo (at Cassandra EU)

Technically, everything is fine…

• Our COO feels that C* is “technically good and beautiful”, a “perfectly good option”

• Our EVPO says that C* reminds him of a time series database in use at Goldman Sachs that had “very good performance”

…but there are concerns

Page 46: Cabs, Cassandra, and Hailo (at Cassandra EU)

People who can attempt to query MySQL

People who can attempt to query Cassandra

Page 47: Cabs, Cassandra, and Hailo (at Cassandra EU)

Page 48: Cabs, Cassandra, and Hailo (at Cassandra EU)

Lessons learned

Page 49: Cabs, Cassandra, and Hailo (at Cassandra EU)

There might be a gulf in experience

Page 50: Cabs, Cassandra, and Hailo (at Cassandra EU)

[Chart: average years of experience per team member, MySQL vs Cassandra; axis tick at 10]

Page 51: Cabs, Cassandra, and Hailo (at Cassandra EU)

Lesson learned

• Have an advocate - get someone who will sell the vision internally

• Learn the theory - teach each team member the fundamentals

• Make an effort to get everyone on board

Page 52: Cabs, Cassandra, and Hailo (at Cassandra EU)

Things can drift into failure

Page 53: Cabs, Cassandra, and Hailo (at Cassandra EU)

Page 54: Cabs, Cassandra, and Hailo (at Cassandra EU)

Page 55: Cabs, Cassandra, and Hailo (at Cassandra EU)

Page 56: Cabs, Cassandra, and Hailo (at Cassandra EU)

Page 57: Cabs, Cassandra, and Hailo (at Cassandra EU)

Page 58: Cabs, Cassandra, and Hailo (at Cassandra EU)

Lesson learned

• Be proactive with Cassandra, even if it seems to be running smoothly

• Peer-review data models, take time to think about them

• Big rows are bad - use cfstats to look for them

• Mixed workloads can cause problems - use cfhistograms and look out for signs of data modeling problems

• Think about the compaction strategy for each CF
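
As a sketch of the “use cfstats to look for big rows” advice, the snippet below scans saved `nodetool cfstats` output for column families whose largest compacted row exceeds a threshold. The `Column Family:` and `Compacted row maximum size:` labels are assumed from 1.x-era output and may differ in other versions. The sample text and its numbers are made up.

```python
import re

# Hypothetical excerpt in the assumed cfstats layout.
SAMPLE = """\
Keyspace: example_ks
    Column Family: customers
    Compacted row mean size: 310
    Compacted row maximum size: 1331
    Column Family: stats_db
    Compacted row mean size: 2816159
    Compacted row maximum size: 386857368
"""


def big_rows(cfstats_text, threshold_bytes):
    """Map column family name to max row size for rows over the threshold."""
    current = None
    found = {}
    for line in cfstats_text.splitlines():
        match = re.match(r"\s*Column Family: (\S+)", line)
        if match:
            current = match.group(1)
            continue
        match = re.match(r"\s*Compacted row maximum size: (\d+)", line)
        if match and current is not None:
            size = int(match.group(1))
            if size > threshold_bytes:
                found[current] = size
    return found


if __name__ == "__main__":
    print(big_rows(SAMPLE, 100 * 1024 * 1024))
```

The 100 MB threshold in the usage line is an arbitrary illustration. What counts as a big row depends on the workload.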

Page 59: Cabs, Cassandra, and Hailo (at Cassandra EU)

EBS is terrible

Page 60: Cabs, Cassandra, and Hailo (at Cassandra EU)

Lessons learned

• EBS is nearly always the cause of Amazon outages

• EBS is a single point of failure (it will fail everywhere in your cluster)

• EBS is slow

• EBS is expensive

• EBS is unnecessary!

Page 61: Cabs, Cassandra, and Hailo (at Cassandra EU)

Management need to know the trade-offs

Page 62: Cabs, Cassandra, and Hailo (at Cassandra EU)

Lessons learned

• Keep the business informed – explain the tradeoffs in simple terms

• Sing from the same hymn sheet

• Make sure there are solutions in place for every use case from the beginning

Page 63: Cabs, Cassandra, and Hailo (at Cassandra EU)

People who can attempt to query MySQL

People who can attempt to query Cassandra

Page 64: Cabs, Cassandra, and Hailo (at Cassandra EU)

Conclusions

Page 65: Cabs, Cassandra, and Hailo (at Cassandra EU)

We like Cassandra

• Solid design

• HA characteristics

• Easy multi-DC setup

• Simplicity of operation

Page 66: Cabs, Cassandra, and Hailo (at Cassandra EU)

Lessons for successful adoption

• Have an advocate, sell the dream

• Learn the fundamentals, get the best out of Cassandra

• Invest in tools to make life easier

• Keep management in the loop, explain the trade offs

Page 67: Cabs, Cassandra, and Hailo (at Cassandra EU)

The future

• We will continue to invest in Cassandra as we expand globally

• We will hire people with experience running Cassandra

• We will focus on expanding our reporting facilities

• We aspire to extend our network (1M consumer installs, wallet) beyond cabs

• We will continue to hire the best engineers in London, NYC and Asia

Page 68: Cabs, Cassandra, and Hailo (at Cassandra EU)

Questions?