Page 1: Building a Distributed Reservation System with Cassandra (Andrew Baker & Jeffrey Carpenter, Choice Hotels) | C* Summit 2016

Jeff Carpenter and Andrew Baker, Choice Hotels International

Building a Distributed Reservation System Using Cassandra

Page 2

1 The Problem – Replacing a Reservation System

2 Managing Consistency

3 Microservices and Data Integrity

4 Schema Evolution

5 Data Retention and TTL

6 Performance and Cost Tradeoffs

© DataStax, All Rights Reserved.

Page 3

Central Reservation System Interfaces

CRS

• Property Systems
• Web and Mobile
• External Channels
• Reporting & Billing
• Customer & Loyalty

Page 4

Current Reservation System – By The Numbers

• 25 years
• 6,000 hotels
• 50 transactions / second
• 4,000 distribution channels
• 1 instance

Page 5

Architecture Tenets

• Microservices
• Cloud-native
• Rules-based
• Open Source Infrastructure
• Stable, Scalable, Secure

Page 6

Non-relational

Rules-based

Reporting & Analytics

Cloud deployment

RESTful services

High availability

Page 7

Our configuration

• 18 nodes
• Cassandra 2.2.X
• I2.2XL (Smaller in Dev/Test)
• 1 TB and growing

• 3 regions
• AWS VPC
• Direct Connect
• Legacy systems in on-prem data center

C*

Page 8

Project Timeline

Inception
• Proof of concept

Beta
• Initial Capability
• Beta Release
• <1% production traffic

Release 1
• Full Capability
• ~10% production traffic

Completion
• 100% production traffic
• Legacy System Retirement

Look, Ma, 100K writes/sec!

Why are my repairs failing?

We got this!

Page 9

Key Data Types

• rates
• inventory
• hotels
• reservations

Page 10

Key Data Types → Microservices

Hotel Service

Booking Service

Rates Service

Shopping Service

Data Maintenance Apps

Inventory Service

Reservation Service

Inventory keyspace

Rates keyspace

Hotels keyspace

Reservations keyspace

Page 11

Varying Consistency Needs

Eventual consistency Immediate consistency

rates, reservations, hotels, inventory

Page 12

Distributed Transactions, Anyone?

Commit the contract

Reserve the inventory

Booking Service

Data Maintenance Apps

Inventory Service

Reservation Service

inventory

reservations

Data synchronization

Page 13

Alternatives to Distributed Transactions

Approach | Example | Scope
Lightweight Transaction | Updating inventory counts | Data Tier
Logged Batch | Writing to multiple denormalized hotel tables | Data Tier
Retrying failed calls | Data synchronization, reservation processing | Service
Compensating processes | Verifying reservation processing | System
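
Two of the approaches above, a conditional update and retrying a failed call, can be sketched together. This is a minimal in-memory simulation in Python, not Cassandra driver code: `InventoryStore`, `compare_and_set`, and `reserve_room` are hypothetical names, and the store only mimics the apply-if-unchanged behaviour that a lightweight transaction provides.

```python
import threading


class InventoryStore:
    """Hypothetical in-memory stand-in for a table updated with conditional writes."""

    def __init__(self, counts):
        self._counts = dict(counts)
        self._lock = threading.Lock()

    def read(self, key):
        with self._lock:
            return self._counts[key]

    def compare_and_set(self, key, expected, new_value):
        """Apply the write only if the stored value still equals `expected`.

        Mirrors the shape of a conditional update (UPDATE ... IF count = ?):
        returns True when applied, False when another writer got there first.
        """
        with self._lock:
            if self._counts[key] != expected:
                return False
            self._counts[key] = new_value
            return True


def reserve_room(store, key, max_attempts=5):
    """Decrement the available count by one, retrying on lost races."""
    for _ in range(max_attempts):
        available = store.read(key)
        if available <= 0:
            return False  # sold out; nothing to retry
        if store.compare_and_set(key, available, available - 1):
            return True
    return False  # gave up; a compensating process would need to follow up


# Hypothetical key: (hotel, stay date, room type) with two rooms available.
key = ("hotel-1", "2016-09-08", "KING")
store = InventoryStore({key: 2})
results = [reserve_room(store, key) for _ in range(3)]
```

With two rooms available, the first two calls succeed and the third is refused, so the count never goes negative even if callers race.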


Eventual consistency

Strong consistency

Page 14

Where is my data?

Diagram: a C* cluster of four nodes handling three operations.

Operations: Create Block; Update Inventory; Check Block & Adjust Inventory Count

Consistency levels shown: LOCAL_ONE, LOCAL_QUORUM, LOCAL_QUORUM
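
Why mixing LOCAL_ONE and LOCAL_QUORUM can leave a reader asking "where is my data?" comes down to replica overlap within a data center: a read is only guaranteed to see the latest write when the replicas read plus the replicas written exceed the replication factor. A small sketch of that arithmetic follows; the replication factor of 3 is an assumption, since the talk does not state it.

```python
def quorum(replication_factor):
    """Replicas needed for a quorum: a strict majority of the replicas."""
    return replication_factor // 2 + 1


def replicas_required(level, replication_factor):
    """Replicas that must acknowledge an operation at the given level."""
    if level == "LOCAL_ONE":
        return 1
    if level == "LOCAL_QUORUM":
        return quorum(replication_factor)
    raise ValueError(f"unsupported level in this sketch: {level}")


def read_sees_latest_write(write_level, read_level, replication_factor):
    """True when read and write replica sets must overlap (R + W > RF)."""
    w = replicas_required(write_level, replication_factor)
    r = replicas_required(read_level, replication_factor)
    return r + w > replication_factor


RF = 3  # assumed replication factor per data center; not stated in the talk
overlap_one_then_quorum = read_sees_latest_write("LOCAL_ONE", "LOCAL_QUORUM", RF)
overlap_quorum_both = read_sees_latest_write("LOCAL_QUORUM", "LOCAL_QUORUM", RF)
```

Under that assumption, a LOCAL_ONE write followed by a LOCAL_QUORUM read gives 1 + 2 = 3, which does not exceed 3, so the read may miss the write; LOCAL_QUORUM on both sides gives 2 + 2 = 4 and always overlaps.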

Page 15

Configurable Consistency Levels

Diagram: a test of two configurations, each shown against a four-node C* cluster.

• A: LOCAL_ONE
• B: LOCAL_QUORUM

Page 16

Cross Region Issues

Page 17

Service Reliability

Cluster.Builder.addContactPoint()

VS

Cluster.Builder.addContactPoints()

Diagram: the Java Driver connecting to a four-node C* cluster in which one node is down (X).
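
One reading of the contrast on this slide is that a service configured with a single contact point cannot start if that one node happens to be down, while a list of contact points lets the driver fall through to the next one. The sketch below simulates only that fallback in plain Python; `first_reachable`, `is_reachable`, and the addresses are hypothetical and are not part of any driver API.

```python
def first_reachable(contact_points, is_reachable):
    """Return the first contact point that answers, or None if none do."""
    for host in contact_points:
        if is_reachable(host):
            return host
    return None


# Hypothetical addresses; 10.0.0.1 plays the node marked X in the diagram.
down_nodes = {"10.0.0.1"}


def is_reachable(host):
    return host not in down_nodes


single = first_reachable(["10.0.0.1"], is_reachable)
several = first_reachable(["10.0.0.1", "10.0.0.2", "10.0.0.3"], is_reachable)
```

Here `single` is `None` because the only configured node is down, while `several` resolves to the second address.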

Page 18

Complex Time Queries

Shopping request

Rate data

I can’t do a range query on departure date

If I do a range query on arrival date…

Time
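
The difficulty is that matching rate data to a stay needs two date bounds, while a query can push only one range predicate to the database. One common workaround, sketched here in Python as an assumption rather than as the approach the speakers chose, is to range-query on one bound and apply the other bound in the service. The row layout (`start`, `end`, `code`) and the function names are hypothetical, and windows are treated as half-open.

```python
from datetime import date


def range_query_on_start(rows, departure):
    """Stand-in for the single range predicate a query can push to the database:
    rows whose window starts before the guest departs."""
    return [row for row in rows if row["start"] < departure]


def rates_for_stay(rows, arrival, departure):
    """Apply the second bound in the application: keep windows that end after arrival."""
    candidates = range_query_on_start(rows, departure)
    return [row for row in candidates if row["end"] > arrival]


rates = [
    {"code": "SUMMER", "start": date(2016, 6, 1), "end": date(2016, 9, 1)},
    {"code": "FALL", "start": date(2016, 9, 1), "end": date(2016, 12, 1)},
    {"code": "WINTER", "start": date(2016, 12, 1), "end": date(2017, 3, 1)},
]
stay = rates_for_stay(rates, arrival=date(2016, 8, 30), departure=date(2016, 9, 2))
codes = [row["code"] for row in stay]
```

A stay from 30 August to 2 September overlaps both the SUMMER and FALL windows. The cost of this pattern is that the single-bound query returns every earlier window as a candidate, so the amount filtered in the service grows with the history kept.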

Page 19

Keyspace edge_hotel

Denormalization Gone Wild (aka “Hotel Access Patterns”)

Access patterns:

• Locate hotel by identifier
• Find hotels within X miles of point Y
• Find hotels by city, state, country
• Find hotels by postal code
• Hotels by amenity
• Find hotels by brand

Tables:

• hotels_by_id
• hotels_by_brand
• hotels_by_postal_code
• Hotels by this
• Hotels by that
• Hotels by something else

Page 20

Schema Evolution

schema.cql

CREATE TABLE rates_by_hotel (
  id text, hotel_id text, code text, name text,
  product_ids set<text>, categories set<text>,
  PRIMARY KEY ((hotel_id), code));

001_code_to_integer.cql

-- Illustrative only: code is a clustering column in the schema above,
-- and CQL does not allow dropping a PRIMARY KEY column.
ALTER TABLE rates_by_hotel DROP code;
ALTER TABLE rates_by_hotel ADD code int;

001_rollback.cql

ALTER TABLE rates_by_hotel DROP code;
ALTER TABLE rates_by_hotel ADD code text;
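
The file names on this slide suggest a convention of numbered change scripts, each with a matching rollback. A minimal Python sketch of tooling that could enforce that pairing is shown below; the naming pattern and the `plan_migrations` function are assumptions inferred from the three file names, not the speakers' actual tooling.

```python
import re

# Assumed convention: three-digit version, underscore, label, ".cql".
MIGRATION = re.compile(r"^(\d{3})_(.+)\.cql$")


def plan_migrations(filenames):
    """Pair each forward script with its rollback, ordered by numeric prefix."""
    forward, rollback = {}, {}
    for name in filenames:
        match = MIGRATION.match(name)
        if not match:
            continue  # e.g. schema.cql, the baseline definition
        version, label = match.groups()
        target = rollback if label == "rollback" else forward
        target[version] = name
    missing = sorted(set(forward) - set(rollback))
    if missing:
        raise ValueError(f"migrations without a rollback script: {missing}")
    return [(forward[v], rollback[v]) for v in sorted(forward)]


plan = plan_migrations(
    ["schema.cql", "001_rollback.cql", "001_code_to_integer.cql"]
)
```

Refusing to plan a change that has no rollback script keeps every schema step reversible before it reaches a shared cluster.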

Page 21

Size and Cost Estimation


Data Size (pie chart)

Legend: Inventory, Rates, Hotels, Reservations

Segments: 47%, 37%, 7%, 9%

Schemas

Sizes

Chebotko Formulas

Estimates

• String length
• Collection size
• Partition and row counts
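
The Chebotko formulas estimate the number of stored values and the on-disk size of a partition from the schema and the size inputs listed above. The Python sketch below follows the formulas as they are commonly published: N_v = N_r (N_c - N_pk - N_s) + N_s for the value count, and the sum of partition key and static column sizes, plus per-row regular and clustering column sizes, plus a per-value metadata overhead usually estimated at 8 bytes. The column sizes and row count in the example are invented for illustration and are not figures from the talk.

```python
def values_per_partition(rows, columns, primary_key_columns, static_columns):
    """N_v = N_r * (N_c - N_pk - N_s) + N_s"""
    return rows * (columns - primary_key_columns - static_columns) + static_columns


def partition_size_bytes(rows, partition_key_sizes, static_sizes,
                         regular_sizes, clustering_sizes,
                         columns, primary_key_columns, metadata_bytes=8):
    """S_t = sum(partition key) + sum(static)
             + N_r * (sum(regular) + sum(clustering))
             + N_v * metadata_bytes"""
    n_v = values_per_partition(rows, columns, primary_key_columns, len(static_sizes))
    return (sum(partition_key_sizes) + sum(static_sizes)
            + rows * (sum(regular_sizes) + sum(clustering_sizes))
            + n_v * metadata_bytes)


# rates_by_hotel has 6 columns, 2 of them in the primary key, none static.
# Assumed average sizes in bytes: hotel_id 8, code 6, id 36, name 30,
# product_ids 60, categories 40; assumed 100 rate rows per hotel partition.
n_values = values_per_partition(rows=100, columns=6,
                                primary_key_columns=2, static_columns=0)
size = partition_size_bytes(rows=100,
                            partition_key_sizes=[8],
                            static_sizes=[],
                            regular_sizes=[36, 30, 60, 40],
                            clustering_sizes=[6],
                            columns=6, primary_key_columns=2)
```

With these assumed inputs the partition holds 400 values and is estimated at 20,408 bytes; multiplying by the expected number of partitions and the replication factor gives a first cluster-level figure.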

Page 22

TTL for Data Cleanup

Now

Time

Yesterday’s data is ancient history

Rate + Inventory Data
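
Since rate and inventory rows lose their value once the stay date has passed, each write can carry a TTL computed from that date (CQL's `USING TTL` takes a value in seconds). The Python sketch below shows one way to derive it; the one-day grace period, the use of UTC, and the `ttl_seconds` name are assumptions.

```python
from datetime import date, datetime, time, timedelta, timezone


def ttl_seconds(stay_date, now, grace_days=1):
    """Seconds from `now` until `grace_days` after the end of `stay_date` (UTC).

    Returns None when that moment has already passed, so the caller can skip
    the write. Zero is deliberately not used for that case, because a TTL of 0
    in CQL means the data never expires.
    """
    expires_at = datetime.combine(
        stay_date + timedelta(days=1 + grace_days), time.min, tzinfo=timezone.utc
    )
    remaining = int((expires_at - now).total_seconds())
    return remaining if remaining > 0 else None


now = datetime(2016, 9, 7, 12, 0, tzinfo=timezone.utc)
ttl = ttl_seconds(date(2016, 9, 8), now)
expired = ttl_seconds(date(2016, 9, 1), now)
```

For a stay date of 8 September written at noon on 7 September, the row expires at midnight on 10 September, which is 216,000 seconds away.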

Page 23

Service Level Agreement Decomposition

• The shopping service must complete in 80 ms
• The inventory service must complete in 20 ms
• C* inventory query must complete in 8 ms
• The rates service must complete in 30 ms
• C* rates query must complete in 10 ms

A shopping request for a 3-night stay at rack rates for a property that has 15 room types must have a 95th-percentile completion time of 80 ms.
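
The decomposition can be checked mechanically: each budget must leave room for the budgets nested inside it, and measured latencies can be compared against the 95th-percentile target. The Python sketch below uses the millisecond figures from the slide; the nesting of calls, the assumption that the nested calls run one after another, and the sample latencies are all invented for illustration.

```python
# Budgets in milliseconds, taken from the slide.
BUDGETS = {
    "shopping service": 80,
    "inventory service": 20,
    "C* inventory query": 8,
    "rates service": 30,
    "C* rates query": 10,
}
# Which budgets nest inside which; the call structure is an assumption.
CHILDREN = {
    "shopping service": ["inventory service", "rates service"],
    "inventory service": ["C* inventory query"],
    "rates service": ["C* rates query"],
}


def headroom(parent):
    """Budget left for the parent's own work if its children run one after another."""
    return BUDGETS[parent] - sum(BUDGETS[c] for c in CHILDREN.get(parent, []))


def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    ordered = sorted(samples)
    rank = (pct * len(ordered) + 99) // 100  # integer ceiling of pct% of n
    return ordered[max(rank, 1) - 1]


# Invented sample of shopping-request latencies in milliseconds.
latencies_ms = [42, 55, 61, 48, 95, 52, 58, 47, 63, 50,
                44, 57, 66, 49, 53, 71, 46, 59, 54, 78]
p95 = percentile(latencies_ms, 95)
meets_sla = p95 <= BUDGETS["shopping service"]
```

With sequential calls the shopping service keeps 30 ms for its own work, and the invented sample has a 95th percentile of 78 ms, so it meets the 80 ms target even though one request took 95 ms.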

Page 24

Roadmap

• Cassandra 3.X
• Materialized Views & SASI
• Row Cache
• Spark & Search

Page 25

Final Thoughts

• Let Cassandra be Cassandra
• Be Flexible in Consistency
• Manage the Joins
• Get to Scale
• Automate Everything… with care

Page 26

We’re Hiring!


http://careers.choicehotels.com

• Sr. Cassandra Database Administrator
• DevOps Architect
• Multiple Java development positions

Page 27

Now Available!


Cassandra: The Definitive Guide, 2nd Edition

Completely reworked for Cassandra 3.X:

• Data modeling in CQL

• SASI indexes

• materialized views

• lightweight transactions

• DataStax drivers

• New chapters on security, deployment, and integration

Page 28

Contact us

@JavaBakerag @choicehotels

Choice Hotels International

@jscarp