Data Modeling with Cassandra
![Page 1: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/1.jpg)
Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License
Data Modeling with Cassandra
Patricia Gorla @patriciagorla
Cassandra Consultant
![Page 2: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/2.jpg)
About The Last Pickle
Work with clients to deliver and improve Apache Cassandra based solutions. Apache Cassandra Committer, DataStax MVP, Hector Maintainer, Apache Usergrid Committer. Based in New Zealand & USA.
![Page 5: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/5.jpg)
A Few Notes about Cassandra
Open sourced in 2008 by Facebook
A lot has changed since then…
See issues.apache.org/jira/browse/CASSANDRA
![Page 6: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/6.jpg)
Cassandra is…
• Distributed
Data distributed by hash
[Diagram: keys 'foo' and 'bar' placed on nodes across the cluster]
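A minimal sketch of "data distributed by hash", written in Python for illustration. It is a simplification under stated assumptions: the node names are hypothetical, MD5 stands in for the cluster's real partitioner, and a plain modulo replaces token ranges.

```python
import hashlib

# Hypothetical node names, used only for this illustration.
NODES = ["node-a", "node-b", "node-c"]


def token(key: str) -> int:
    """Map a partition key to a stable integer token (MD5 is a stand-in hash)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def owner(key: str, nodes=NODES) -> str:
    """Choose the owning node from the key's token (modulo placement, a simplification)."""
    return nodes[token(key) % len(nodes)]


# The same key always lands on the same node, whichever client asks.
placement = {key: owner(key) for key in ("foo", "bar")}
```

The point of the sketch is that placement is a pure function of the key, so no central lookup table is needed to find a row.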
![Page 7: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/7.jpg)
Cassandra is…
• Distributed
Availability through Redundancy
[Diagram: copies of 'foo' and 'bar' stored on several nodes]
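One way to picture redundancy is a ring walk: hash the key, find the first node clockwise from it, then take the following nodes until the replication factor is met. The Python sketch below assumes hypothetical node names and an MD5-based token, and it ignores racks and datacenters, so it is closer in spirit to a simple single-datacenter placement than to what a production cluster does.

```python
import hashlib
from bisect import bisect_right


def token(name: str) -> int:
    """Stable integer token for a key or a node name (MD5 is a stand-in hash)."""
    return int.from_bytes(hashlib.md5(name.encode("utf-8")).digest()[:8], "big")


def replicas(key: str, nodes, replication_factor: int):
    """Return the nodes that hold a copy of `key`, walking the ring clockwise."""
    ring = sorted(nodes, key=token)
    tokens = [token(node) for node in ring]
    # First node whose token is greater than the key's token, wrapping around.
    start = bisect_right(tokens, token(key)) % len(ring)
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)] for i in range(count)]


# Hypothetical four-node cluster with three copies of every key.
cluster = ["node-a", "node-b", "node-c", "node-d"]
foo_replicas = replicas("foo", cluster, 3)
```

With three copies, any single node can be lost and 'foo' is still readable from the remaining two.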
![Page 8: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/8.jpg)
Cassandra is…
• Distributed
Geolocated datacenters
[Figure: world map labelled by region]
![Page 9: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/9.jpg)
Cassandra is…
• Distributed
• Eventually Consistent
Read Repair
Maintenance Repair
![Page 10: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/10.jpg)
Cassandra is…
• Distributed
• Eventually Consistent
Consistency Level
QUORUM, ONE, ALL, ANY
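The consistency levels above translate into a number of replica acknowledgements. The Python sketch below works through that arithmetic for a given replication factor; the function names are illustrative and not part of any driver API. A read is guaranteed to overlap the latest acknowledged write when the read and write acknowledgement counts together exceed the replication factor.

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of replicas."""
    return replication_factor // 2 + 1


def required_acks(level: str, replication_factor: int) -> int:
    """Replica acknowledgements needed before the coordinator answers the client."""
    table = {
        "ANY": 0,  # writes only: a stored hint is enough, no replica has to answer
        "ONE": 1,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    return table[level]


def read_sees_latest_write(read_level: str, write_level: str, replication_factor: int) -> bool:
    """True when the read set and the write set must share at least one replica."""
    acks = required_acks(read_level, replication_factor) + required_acks(
        write_level, replication_factor
    )
    return acks > replication_factor
```

For example, with a replication factor of 3, QUORUM reads paired with QUORUM writes overlap (2 + 2 > 3), while ONE paired with ONE does not (1 + 1 ≤ 3).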
![Page 12: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/12.jpg)
Cassandra is…
• Distributed
• Eventually Consistent
• Fast
See http://www.datastax.com/dev/blog/cassandra-2-1-now-over-50-faster
2.1 - 190,000 wps
2.0 - 105,000 wps
Note: Reads can be tuned through data model and JVM
![Page 13: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/13.jpg)
Cassandra is…
• Distributed
• Eventually Consistent
• Fast
• Familiar
CREATE TABLE IF NOT EXISTS foo (
    bar text,
    baz text,
    PRIMARY KEY (bar)
);
CQL - Cassandra Query Language
![Page 14: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/14.jpg)
Cassandra is…
• Distributed
• Eventually Consistent
• Fast
• Familiar
CREATE TABLE IF NOT EXISTS foo (
    bar text,
    baz text,
    PRIMARY KEY (bar)
);
INSERT INTO foo (bar, baz) VALUES ('one', 'two');
SELECT * FROM foo;
cqlsh - CLI tool
![Page 15: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/15.jpg)
Cassandra is…
• Distributed
• Eventually Consistent
• Fast
• Familiar
• Popular
Drivers
Datastax: C#, Java, C++, Python, Node.js*, Ruby*
.NET/C#: Cassandra Sharp, Aquiles, … Cassandra
Apache Spark: Datastax Spark Connector
C++: libQTCassandra
Clojure: CLJ-Hector, Cassaforte, Alia
Erlang: CQerl
Go: Gossie, GoCQL, CQLc
Haskell: Cassy
Java: Astyanax, Hector, Achilles
Node.js: Helenus, Node-Cassandra-CQL
ODBC: Simba ODBC
Perl: Cassandra::Simple, Perlcassa
PHP: CQL PHP, CQLSI, php-cassandra
Python: Datastax Python, Pycassa
R: R Cassandra
Ruby: Fauna, CQL Ruby, CQLEngine
Rust: Rust-CQL
Scala: Cascal
Storm: Storm-Cassandra
For full list, see http://planetcassandra.org/client-drivers-tools/
![Page 20: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/20.jpg)
The Hard Part (Data Modeling)
No JOINs, Denormalize
Duplicate the Data
Identify Usage
![Page 21: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/21.jpg)
Bikes Customers Stations Trips
© Noah Berger, Flickr
Case Study: City BikeShare
![Page 22: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/22.jpg)
CREATE KEYSPACE bikeshare
    WITH replication = {
        'class': 'NetworkTopologyStrategy',
        'datacenter1': 3
    };
USE bikeshare;
RF can be altered ex post facto
![Page 23: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/23.jpg)
Bikes Customers Stations Trips
- List the properties of the bike.
![Page 26: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/26.jpg)
CREATE TABLE IF NOT EXISTS bike (
    bike_id text,
    properties map<text, text>,
    is_damaged boolean,
    is_checked_out boolean,
    latitude double,
    longitude double,
    PRIMARY KEY (bike_id)
);
See www.datastax.com/documentation/cql/3.0/cql/cql_reference/cql_data_types_c.html for all data types
![Page 27: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/27.jpg)
INSERT INTO bike (
    bike_id, properties, is_damaged, is_checked_out, latitude, longitude
) VALUES (
    'bike1',
    {'serial_number' : 'GS-00143', 'type' : 'road bike'},
    False,
    False,
    37.7648,
    122.4200
);
![Page 29: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/29.jpg)
SELECT * FROM bike;

 bike_id | is_checked_out | is_damaged | latitude | longitude | properties
---------+----------------+------------+----------+-----------+-----------------------------------------------------
   bike3 |          False |       True |   37.793 |     122.4 | {'serial_number': 'GS-70159', 'type': 'fixed gear'}
   bike2 |           True |      False |   37.786 |     122.4 | {'serial_number': 'GS-79366', 'type': 'road bike'}
   bike1 |          False |      False |   37.765 |    122.42 | {'serial_number': 'GS-00143', 'type': 'road bike'}

(3 rows)
![Page 32: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/32.jpg)
UPDATE bike
    SET properties['color'] = 'royal blue'
    WHERE bike_id = 'bike1';

SELECT properties FROM bike WHERE bike_id = 'bike1';

 properties
---------------------------------------------------------------------------
 {'color': 'royal blue', 'serial_number': 'GS-00143', 'type': 'road bike'}

(1 rows)
![Page 34: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/34.jpg)
DELETE properties['color'] FROM bike WHERE bike_id = 'bike1';

SELECT properties FROM bike WHERE bike_id = 'bike1';

 properties
----------------------------------------------------
 {'serial_number': 'GS-00143', 'type': 'road bike'}

(1 rows)
![Page 35: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/35.jpg)
Bikes Customers Stations Trips
- List the properties of the bike.
- Verify whether the bike can be checked out.
![Page 37: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/37.jpg)
UPDATE bike
    SET is_checked_out = True
    WHERE bike_id = 'bike1'
    IF is_checked_out = False;
Set conditional statement
![Page 38: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/38.jpg)
UPDATE bike
    SET is_checked_out = True
    WHERE bike_id = 'bike1'
    IF is_checked_out = False;

 [applied]
-----------
      True
![Page 39: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/39.jpg)
UPDATE bike
    SET is_checked_out = True
    WHERE bike_id = 'bike1'
    IF is_checked_out = False;

 [applied] | is_checked_out
-----------+----------------
     False |           True

See www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0
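The conditional UPDATE behaves like a compare-and-set. The Python sketch below mimics only the observable result on a single in-memory row; it does not model the coordination between replicas that makes the real operation safe under concurrency, and the function name is hypothetical.

```python
def conditional_update(row: dict, column: str, expected, new_value) -> dict:
    """Apply the update only if `column` currently equals `expected`.

    The return value mirrors the shape of the [applied] result: on failure it
    also reports the value that caused the condition to fail.
    """
    current = row.get(column)
    if current == expected:
        row[column] = new_value
        return {"[applied]": True}
    return {"[applied]": False, column: current}


bike1 = {"bike_id": "bike1", "is_checked_out": False}

# First customer checks the bike out; the condition holds.
first = conditional_update(bike1, "is_checked_out", False, True)

# A second attempt finds the bike already checked out and is rejected.
second = conditional_update(bike1, "is_checked_out", False, True)
```

Here `first` is `{"[applied]": True}` and `second` is `{"[applied]": False, "is_checked_out": True}`, matching the two results shown on the slides.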
![Page 40: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/40.jpg)
Bikes Customers Stations Trips
- Get the customer details.
![Page 41: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/41.jpg)
CREATE TYPE IF NOT EXISTS address (
    street_name text,
    zip text
);
CREATE TABLE IF NOT EXISTS customer (
    customer_id text,
    email text,
    name text,
    password text,
    mailing_address address,
    PRIMARY KEY (customer_id)
);
Note: This example uses text fields for simplicity. Passwords should not be stored in plain text.
![Page 42: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/42.jpg)
CREATE TYPE IF NOT EXISTS address (
    street_name text,
    zip text
);
CREATE TABLE IF NOT EXISTS customer (
    customer_id text,
    email text,
    name text,
    password text,
    mailing_address frozen<address>,
    PRIMARY KEY (customer_id)
);
Limitations
Data is serialised
CASSANDRA-7857 - Freezing UDT
CASSANDRA-7423 - Query individual subfields
![Page 44: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/44.jpg)
INSERT INTO customer (
    customer_id, email, name, password, mailing_address
) VALUES (
    'customer1',
    '[email protected]',
    'Paul Van Haver',
    'p@ssw0rd1',
    {street_name: 'Capp Street', zip: '94110'}
);
![Page 45: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/45.jpg)
SELECT mailing_address.street_name FROM customer WHERE customer_id = 'customer2';

 mailing_address.street_name
-----------------------------
               Bryant Street

(1 rows)
![Page 46: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/46.jpg)
Bikes Customers Stations Trips
- List the available bikes at a station.
![Page 47: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/47.jpg)
CREATE TABLE IF NOT EXISTS station (
    station_name text,
    latitude double,
    longitude double,
    PRIMARY KEY (station_name)
);
![Page 49: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/49.jpg)
CREATE TABLE IF NOT EXISTS bikes_at_stations_count (
    station_name text,
    bikes_available counter,
    PRIMARY KEY (station_name)
);
All counters start at 0
Only increment, decrement
![Page 50: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/50.jpg)
UPDATE bikes_at_stations_count
    SET bikes_available = bikes_available + 1
    WHERE station_name = '16th & Mission';
2.1 - Creates a local lock
See www.datastax.com/dev/blog/whats-new-in-cassandra-2-1-a-better-implementation-of-counters
![Page 51: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/51.jpg)
UPDATE bikes_at_stations_count
    SET bikes_available = bikes_available + 1
    WHERE station_name = '16th & Mission';

SELECT * FROM bikes_at_stations_count WHERE station_name = '16th & Mission';

 station_name   | bikes_available
----------------+-----------------
 16th & Mission |               2

(1 rows)
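The counter rules (start at 0, only increment or decrement) can be pictured with a small in-memory stand-in. This Python sketch uses a hypothetical `CounterTable` class; it shows the interface only and says nothing about how replicas reconcile concurrent updates.

```python
from collections import defaultdict


class CounterTable:
    """Toy counter column: values start at 0 and change only by a delta."""

    def __init__(self):
        self._counts = defaultdict(int)

    def add(self, key: str, delta: int) -> None:
        """Increment (positive delta) or decrement (negative delta); no direct assignment."""
        self._counts[key] += delta

    def get(self, key: str) -> int:
        """Read the current value; a key never updated reads as 0."""
        return self._counts.get(key, 0)


bikes_available = CounterTable()
bikes_available.add("16th & Mission", 1)  # a bike is returned
bikes_available.add("16th & Mission", 1)  # another bike is returned
count_after_returns = bikes_available.get("16th & Mission")
bikes_available.add("16th & Mission", -1)  # a bike is checked out
```

After the two returns the value is 2, as in the SELECT output above; there is deliberately no method to set the counter to an arbitrary value.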
![Page 52: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/52.jpg)
Bikes Customers Stations Trips
- List all trips a bike has been on.
![Page 54: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/54.jpg)
CREATE TABLE IF NOT EXISTS BikeTrips (
    bike_id text,
    trip_id text,
    PRIMARY KEY (bike_id, trip_id)
);
Flaw: All trips for a bike will be stored in the same row (row will grow unbounded)
![Page 57: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/57.jpg)
Two components of a primary key
PRIMARY KEY ((a, b, …)…, c)
Partition Key: Where the row will be physically located
Clustering Key: How the columns will be ordered on disk
![Page 60: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/60.jpg)
Single PK
Each row is on a separate partition
Can be uniquely identified
CREATE TABLE IF NOT EXISTS user (
    first_name text,
    last_login timestamp,
    PRIMARY KEY (first_name)
);

Compound PK
Columns are ordered by logins
Most recent users will be at the top
CREATE TABLE IF NOT EXISTS user (
    first_name text,
    last_login timestamp,
    PRIMARY KEY (first_name, last_login)
) WITH CLUSTERING ORDER BY (last_login DESC);

Composite PK
Data is bucketed by composite
Row width will be limited
CREATE TABLE IF NOT EXISTS user (
    first_name text,
    last_name text,
    last_login timestamp,
    PRIMARY KEY ((first_name, last_name), last_login)
) WITH CLUSTERING ORDER BY (last_login DESC);
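The split between partition key and clustering key can be modelled as a dictionary of partitions, each holding rows ordered by the clustering columns. The Python sketch below is a toy model with hypothetical names; integer logins stand in for timestamps to keep it short.

```python
from collections import defaultdict


class Table:
    """Toy model of a table: rows grouped by partition key, ordered by clustering key."""

    def __init__(self, partition_key, clustering_key=()):
        self.partition_key = tuple(partition_key)
        self.clustering_key = tuple(clustering_key)
        self.partitions = defaultdict(dict)

    def insert(self, row: dict) -> None:
        pk = tuple(row[c] for c in self.partition_key)
        ck = tuple(row[c] for c in self.clustering_key)
        # Writing the same full primary key again overwrites the earlier row.
        self.partitions[pk][ck] = dict(row)

    def select(self, *pk_values, descending=False):
        """Read one partition; the whole partition key must be supplied."""
        rows = self.partitions.get(tuple(pk_values), {})
        return [rows[ck] for ck in sorted(rows, reverse=descending)]


# Composite partition key (first_name, last_name), clustered by last_login.
logins = Table(partition_key=("first_name", "last_name"), clustering_key=("last_login",))
logins.insert({"first_name": "Ada", "last_name": "Lovelace", "last_login": 1})
logins.insert({"first_name": "Ada", "last_name": "Lovelace", "last_login": 3})
logins.insert({"first_name": "Ada", "last_name": "Byron", "last_login": 2})

latest_first = logins.select("Ada", "Lovelace", descending=True)
```

Two partitions exist after these inserts, one per (first_name, last_name) pair, and `latest_first` lists the Lovelace logins newest first.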
![Page 62: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/62.jpg)
CREATE TABLE IF NOT EXISTS BikeTrips (
    bike_id text,
    trip_id text,
    PRIMARY KEY (bike_id, trip_id)
);
Flaw: All trips for a bike will be stored in the same partition (row will grow unbounded)

Solution: Create artificial bucket
CREATE TABLE IF NOT EXISTS BikeTrips (
    bike_id text,
    bucket int,
    trip_id text,
    PRIMARY KEY ((bike_id, bucket), trip_id)
);
![Page 63: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/63.jpg)
CREATE TABLE IF NOT EXISTS BikeTrips (
    bike_id text,
    bucket int,
    trip_id text,
    PRIMARY KEY ((bike_id, bucket), trip_id)
);
Must specify all parts on SELECT
SELECT * FROM BikeTrips WHERE bike_id = 'bike1' AND bucket = 0;
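The slides do not say how the bucket value is chosen. One possible approach, sketched in Python below, derives it from a hash of the trip id; the bucket count of 4 and both function names are assumptions made for illustration. Because every part of the partition key must be given on SELECT, a reader has to query each bucket to collect all trips for a bike.

```python
import zlib

NUM_BUCKETS = 4  # hypothetical; chosen so that no single partition grows too large


def bucket_for(trip_id: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Derive a stable bucket number from the trip id (CRC32 is an arbitrary choice)."""
    return zlib.crc32(trip_id.encode("utf-8")) % num_buckets


def partition_keys_for_bike(bike_id: str, num_buckets: int = NUM_BUCKETS):
    """Every (bike_id, bucket) pair that must be queried to see all trips for a bike."""
    return [(bike_id, bucket) for bucket in range(num_buckets)]


# The writer computes the bucket once per trip...
write_key = ("bike1", bucket_for("trip1"))

# ...and a reader issues one SELECT per bucket.
read_keys = partition_keys_for_bike("bike1")
```

A time-based bucket (for example, one bucket per month) is another common choice; it keeps reads for a date range confined to a few partitions.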
![Page 64: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/64.jpg)
Bikes Customers Stations Trips
- List all trips a bike has been on.
- List all trips a customer has taken.
![Page 65: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/65.jpg)
CREATE TABLE IF NOT EXISTS CustomerTrips (
    customer_id text,
    trip_id text,
    PRIMARY KEY (customer_id, trip_id)
);
Rows will not be as wide as BikeTrips
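With no JOINs, the same trip is written once per query pattern: once keyed by bike and once keyed by customer. The Python sketch below uses two dictionaries as stand-ins for the BikeTrips and CustomerTrips tables; the `record_trip` helper is hypothetical and leaves out the bucket column for brevity.

```python
from collections import defaultdict

# In-memory stand-ins for the two tables, keyed the way each query reads them.
bike_trips = defaultdict(set)      # bike_id -> trip ids
customer_trips = defaultdict(set)  # customer_id -> trip ids


def record_trip(trip_id: str, bike_id: str, customer_id: str) -> None:
    """Duplicate the trip id into both lookup structures at write time."""
    bike_trips[bike_id].add(trip_id)
    customer_trips[customer_id].add(trip_id)


record_trip("trip1", "bike15", "customer3")
record_trip("trip2", "bike15", "customer7")
```

Each read is then a single-key lookup: `bike_trips["bike15"]` holds both trips, while `customer_trips["customer3"]` holds only trip1.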
![Page 66: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/66.jpg)
Bikes Customers Stations Trips
- List all trips a bike has been on.
- List all trips a customer has taken.
- Show details of a particular trip (duration, distance traveled).
![Page 67: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/67.jpg)
CREATE TABLE IF NOT EXISTS trip (
    trip_id text,
    customer_id text static,
    bike_id text static,
    started_at timestamp static,
    stopped_at timestamp static,
    sequence timestamp,
    latitude decimal,
    longitude decimal,
    delta_distance double,
    PRIMARY KEY (trip_id, sequence)
) WITH CLUSTERING ORDER BY (sequence DESC);
![Page 68: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/68.jpg)
SELECT * FROM trip WHERE trip_id = 'trip1';

 trip_id | sequence                 | bike_id | customer_id | started_at               | stopped_at               | delta_distance | latitude    | longitude
---------+--------------------------+---------+-------------+--------------------------+--------------------------+----------------+-------------+-----------
   trip1 | 2014-08-10 06:10:05+0100 |  bike15 |   customer3 | 2014-08-10 06:07:55+0100 | 2014-08-10 06:07:55+0100 |         8.7951 | -122.405319 | 37.796936
   trip1 | 2014-08-10 06:10:00+0100 |  bike15 |   customer3 | 2014-08-10 06:07:55+0100 | 2014-08-10 06:07:55+0100 |         15.381 | -122.403347 | 37.795535
   trip1 | 2014-08-10 06:09:55+0100 |  bike15 |   customer3 | 2014-08-10 06:07:55+0100 | 2014-08-10 06:07:55+0100 |              0 | -122.403347 | 37.795535
   trip1 | 2014-08-10 06:09:50+0100 |  bike15 |   customer3 | 2014-08-10 06:07:55+0100 | 2014-08-10 06:07:55+0100 |         10.557 | -122.401702 | 37.795731
   trip1 | 2014-08-10 06:09:45+0100 |  bike15 |   customer3 | 2014-08-10 06:07:55+0100 | 2014-08-10 06:07:55+0100 |              0 | -122.401702 | 37.795731
   trip1 | 2014-08-10 06:09:40+0100 |  bike15 |   customer3 | 2014-08-10 06:07:55+0100 | 2014-08-10 06:07:55+0100 |         35.282 | -122.400589 | 37.790268
 ...
   trip1 | 2014-08-10 06:08:45+0100 |  bike15 |   customer3 | 2014-08-10 06:07:55+0100 | 2014-08-10 06:07:55+0100 |         6.1672 | -122.414782 | 37.771255
   trip1 | 2014-08-10 06:08:40+0100 |  bike15 |   customer3 | 2014-08-10 06:07:55+0100 | 2014-08-10 06:07:55+0100 |         2.6682 | -122.415047 | 37.770929
   trip1 | 2014-08-10 06:08:35+0100 |  bike15 |   customer3 | 2014-08-10 06:07:55+0100 | 2014-08-10 06:07:55+0100 |         2.9604 | -122.415287 | 37.770529
   trip1 | 2014-08-10 06:08:30+0100 |  bike15 |   customer3 | 2014-08-10 06:07:55+0100 | 2014-08-10 06:07:55+0100 |          2.775 |  -122.41544 | 37.770119
   trip1 | 2014-08-10 06:08:25+0100 |  bike15 |   customer3 | 2014-08-10 06:07:55+0100 | 2014-08-10 06:07:55+0100 |         5.7684 |  -122.41566 | 37.769236
   trip1 | 2014-08-10 06:08:20+0100 |  bike15 |   customer3 | 2014-08-10 06:07:55+0100 | 2014-08-10 06:07:55+0100 |         3.1183 | -122.415669 | 37.768744
   trip1 | 2014-08-10 06:08:15+0100 |  bike15 |   customer3 | 2014-08-10 06:07:55+0100 | 2014-08-10 06:07:55+0100 |         93.217 | -122.414251 | 37.754102
   trip1 | 2014-08-10 06:08:10+0100 |  bike15 |   customer3 | 2014-08-10 06:07:55+0100 | 2014-08-10 06:07:55+0100 |              0 | -122.414251 | 37.754102
   trip1 | 2014-08-10 06:08:05+0100 |  bike15 |   customer3 | 2014-08-10 06:07:55+0100 | 2014-08-10 06:07:55+0100 |         31.664 | -122.409291 | 37.754393
   trip1 | 2014-08-10 06:08:00+0100 |  bike15 |   customer3 | 2014-08-10 06:07:55+0100 | 2014-08-10 06:07:55+0100 |              0 | -122.409291 | 37.754393
   trip1 | 2014-08-10 06:07:55+0100 |  bike15 |   customer3 | 2014-08-10 06:07:55+0100 | 2014-08-10 06:07:55+0100 |        0.54761 | -122.409282 | 37.754307

(27 rows)
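The trip details (duration, distance traveled) can be derived client-side from the rows of one partition. The Python sketch below assumes rows arrive as dictionaries with `sequence` and `delta_distance` keys, and uses the three most recent readings from the output above as sample input; the `summarize` helper is hypothetical and the distance unit is whatever `delta_distance` is recorded in.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S%z"


def summarize(rows):
    """Return the elapsed seconds between first and last reading, and the summed distance."""
    times = [datetime.strptime(row["sequence"], TIMESTAMP_FORMAT) for row in rows]
    return {
        "duration_seconds": (max(times) - min(times)).total_seconds(),
        "distance": sum(row["delta_distance"] for row in rows),
    }


# Three readings copied from the query output above (newest first).
sample_rows = [
    {"sequence": "2014-08-10 06:10:05+0100", "delta_distance": 8.7951},
    {"sequence": "2014-08-10 06:10:00+0100", "delta_distance": 15.381},
    {"sequence": "2014-08-10 06:09:55+0100", "delta_distance": 0.0},
]

summary = summarize(sample_rows)
```

For this sample the readings span 10 seconds. Because every reading of a trip lives in one partition, the summary needs a single query.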
![Page 70: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/70.jpg)
SELECT sequence, latitude, longitude FROM trip
    WHERE trip_id = 'trip1' AND sequence > '2014-08-10 06:09:00+0100';

 sequence                 | latitude    | longitude
--------------------------+-------------+-----------
 2014-08-10 06:10:05+0100 | -122.405319 | 37.796936
 2014-08-10 06:10:00+0100 | -122.403347 | 37.795535
 2014-08-10 06:09:55+0100 | -122.403347 | 37.795535
 2014-08-10 06:09:50+0100 | -122.401702 | 37.795731
 2014-08-10 06:09:45+0100 | -122.401702 | 37.795731
 2014-08-10 06:09:40+0100 | -122.400589 | 37.790268
 2014-08-10 06:09:35+0100 | -122.400589 | 37.790268
 2014-08-10 06:09:30+0100 | -122.400404 | 37.790241
 2014-08-10 06:09:25+0100 | -122.400359 | 37.790128
 2014-08-10 06:09:20+0100 | -122.400359 | 37.790128
 2014-08-10 06:09:15+0100 | -122.408092 | 37.784008
 2014-08-10 06:09:10+0100 | -122.408092 | 37.784008
 2014-08-10 06:09:05+0100 | -122.403416 | 37.780284

Use comparator for data type
![Page 76: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/76.jpg)
Recap
• There is hope
• Identify usage
• Be mindful of storage engine
![Page 77: Data Modeling with Cassandra](https://reader033.fdocuments.us/reader033/viewer/2022051400/553a3d8e5503464e418b4af5/html5/thumbnails/77.jpg)
Patricia Gorla @patriciagorla
www.thelastpickle.com
Q&A