CQL Under the Hood

82
CQL Under the Hood Robbie Strickland

description

As a reformed CQL critic, I'd like to help dispel the myths around CQL and extol its awesomeness. Most criticism comes from people like me who were early Cassandra adopters and are concerned about the SQL-like syntax, the apparent lack of control, and the reliance on a defined schema. I'll pop open the hood, showing just how the various CQL constructs translate to the underlying storage layer--and in the process I hope to give novices and old-timers alike a reason to love CQL.

Transcript of CQL Under the Hood

Page 1: CQL Under the Hood

CQL Under the Hood Robbie Strickland

Page 2: CQL Under the Hood

whoami?

Robbie Strickland Software Development Manager @rs_atl

Page 3: CQL Under the Hood

Target Audience •  Veterans who don’t use or don’t understand CQL

•  Newcomers who only know CQL, but don’t know what’s happening underneath

Page 4: CQL Under the Hood

Definitions •  Thrift: legacy RPC protocol + code generation tool

•  batch_mutate, get_range_slices, multiget_slice, etc.

•  Deprecated in favor of native protocol

•  Native protocol: replacement for Thrift

•  Only works with CQL

Page 5: CQL Under the Hood

Definitions •  Storage rows: keys + columns as stored on disk

•  CQL rows: abstraction layer on top of storage rows

•  Not usually a direct mapping to storage rows

•  Make use of predefined schema

•  Data still sparse – no space used for null columns

Page 6: CQL Under the Hood

In the Old Days … •  Lots of clients with lots of APIs

•  No common language for describing schemas or queries

•  Steep learning curve

Page 7: CQL Under the Hood

In the Old Days … •  No cursors, so entire result set must fit in memory on client and server

•  Hard to add new features

•  Lag time between release of new features and client library adoption

Page 8: CQL Under the Hood

Solution: CQL? SELECT * FROM mytable WHERE mykey = ‘foo’;

WAT???!!!

Page 9: CQL Under the Hood

Solution: CQL? Veterans

“What happened to my NoSQL??!!”

Page 10: CQL Under the Hood

Solution: CQL? Newbies

“Sweet, SQL! I don’t have to learn

anything new!”

Page 11: CQL Under the Hood

Reality •  Don’t panic

•  Thrift problems solved

•  You didn’t lose anything

•  Underlying storage is unchanged

•  Don’t get lazy

•  CQL is not SQL

•  You need to know what you’re doing

•  No, you can’t just index everything

Page 12: CQL Under the Hood

Simple CQL CREATE TABLE Books (! title varchar,! author varchar,! year int,! PRIMARY KEY (title)!);!!INSERT INTO Books (title, author, year) !VALUES ('Patriot Games', 'Tom Clancy', 1987);!INSERT INTO Books (title, author, year) !VALUES ('Without Remorse', 'Tom Clancy', 1993);!

SELECT * FROM Books;!!

title | author | year!-----------------+------------+------! Without Remorse | Tom Clancy | 1993! Patriot Games | Tom Clancy | 1987!

Page 13: CQL Under the Hood

Storage Rows [default@unknown] create keyspace Library;![default@unknown] use Library;![default@Library] create column family Books!...! with key_validation_class=UTF8Type!...! and comparator=UTF8Type!...! and default_validation_class=UTF8Type;![default@Library] set Books['Patriot Games']['author'] = 'Tom Clancy';![default@Library] set Books['Patriot Games']['year'] = '1987';![default@Library] list Books;!!RowKey: Patriot Games!=> (name=author, value=Tom Clancy, timestamp=1393102991499000)!=> (name=year, value=1987, timestamp=1393103015955000)!

Page 14: CQL Under the Hood

Storage Rows [default@unknown] create keyspace Library;![default@unknown] use Library;![default@Library] create column family Books!...! with key_validation_class=UTF8Type!...! and comparator=UTF8Type!...! and default_validation_class=UTF8Type;![default@Library] set Books['Patriot Games']['author'] = 'Tom Clancy';![default@Library] set Books['Patriot Games']['year'] = '1987';![default@Library] list Books;!!RowKey: Patriot Games!=> (name=author, value=Tom Clancy, timestamp=1393102991499000)!=> (name=year, value=1987, timestamp=1393103015955000)!

Random hash (no ordering)

Page 15: CQL Under the Hood

Storage Rows [default@unknown] create keyspace Library;![default@unknown] use Library;![default@Library] create column family Books!...! with key_validation_class=UTF8Type!...! and comparator=UTF8Type!...! and default_validation_class=UTF8Type;![default@Library] set Books['Patriot Games']['author'] = 'Tom Clancy';![default@Library] set Books['Patriot Games']['year'] = '1987';![default@Library] list Books;!!RowKey: Patriot Games!=> (name=author, value=Tom Clancy, timestamp=1393102991499000)!=> (name=year, value=1987, timestamp=1393103015955000)!

Ordered by name

Page 16: CQL Under the Hood

Compound Key CREATE TABLE authors (!

!name text,!!year int,!!title text,!!isbn text,!!publisher text,!!PRIMARY KEY (name, year, title)!

);!

name | year | title | isbn | publisher!------------+------+-----------------+---------------+-----------! Tom Clancy | 1987 | Patriot Games | 0-399-13241-4 | Putnam! Tom Clancy | 1993 | Without Remorse | 0-399-13825-0 | Putnam!

Page 17: CQL Under the Hood

Compound Key CREATE TABLE authors (!

!name text,!!year int,!!title text,!!isbn text,!!publisher text,!!PRIMARY KEY (name, year, title)!

);!

name | year | title | isbn | publisher!------------+------+-----------------+---------------+-----------! Tom Clancy | 1987 | Patriot Games | 0-399-13241-4 | Putnam! Tom Clancy | 1993 | Without Remorse | 0-399-13825-0 | Putnam!

Partition key (row key)

Page 18: CQL Under the Hood

Compound Key CREATE TABLE authors (!

!name text,!!year int,!!title text,!!isbn text,!!publisher text,!!PRIMARY KEY (name, year, title)!

);!

name | year | title | isbn | publisher!------------+------+-----------------+---------------+-----------! Tom Clancy | 1987 | Patriot Games | 0-399-13241-4 | Putnam! Tom Clancy | 1993 | Without Remorse | 0-399-13825-0 | Putnam!

Clustering columns

Page 19: CQL Under the Hood

== Composite Columns [default@Library] list authors;!!RowKey: Tom Clancy!=> (name=1987:Patriot Games:ISBN, value=0-399-13241-4, timestamp=1393104011458000)!=> (name=1987:Patriot Games:publisher, value=Putnam, timestamp=1393103948577000)!=> (name=1993:Without Remorse:ISBN, value=0-399-13825-0, timestamp=1393104109214000)!=> (name=1993:Without Remorse:publisher, value=Putnam, timestamp=1393104083773000)!

Partition key (row key)

Page 20: CQL Under the Hood

== Composite Columns [default@Library] list authors;!!RowKey: Tom Clancy!=> (name=1987:Patriot Games:ISBN, value=0-399-13241-4, timestamp=1393104011458000)!=> (name=1987:Patriot Games:publisher, value=Putnam, timestamp=1393103948577000)!=> (name=1993:Without Remorse:ISBN, value=0-399-13825-0, timestamp=1393104109214000)!=> (name=1993:Without Remorse:publisher, value=Putnam, timestamp=1393104083773000)!

Clustering columns

Page 21: CQL Under the Hood

== Composite Columns [default@Library] list authors;!!RowKey: Tom Clancy!=> (name=1987:Patriot Games:ISBN, value=0-399-13241-4, timestamp=1393104011458000)!=> (name=1987:Patriot Games:publisher, value=Putnam, timestamp=1393103948577000)!=> (name=1993:Without Remorse:ISBN, value=0-399-13825-0, timestamp=1393104109214000)!=> (name=1993:Without Remorse:publisher, value=Putnam, timestamp=1393104083773000)!

Ordered by name

Page 22: CQL Under the Hood

Reversing Sort Order CREATE TABLE authors (!

!name text,!!year int,!!title text,!!isbn text,!!publisher text,!!PRIMARY KEY (name, year, title)!

) WITH CLUSTERING ORDER BY (year DESC);!

name | year | title | isbn | publisher!------------+------+-----------------+---------------+-----------! Tom Clancy | 1993 | Without Remorse | 0-399-13825-0 | Putnam! Tom Clancy | 1987 | Patriot Games | 0-399-13241-4 | Putnam!

Page 23: CQL Under the Hood

Reversing Sort Order [default@Library] list authors;!!RowKey: Tom Clancy!=> (name=1993:Without Remorse:publisher, value=Putnam, timestamp=1393104083773000)!=> (name=1993:Without Remorse:ISBN, value=0-399-13825-0, timestamp=1393104109214000)!=> (name=1987:Patriot Games:publisher, value=Putnam, timestamp=1393103948577000)!=> (name=1987:Patriot Games:ISBN, value=0-399-13241-4, timestamp=1393104011458000)!

Page 24: CQL Under the Hood

Composite Partition Key CREATE TABLE authors (!

!name text,!!year int,!!title text,!!isbn text,!!publisher text,!!PRIMARY KEY ((name, year), title)!

);!

name | year | title | isbn | publisher!------------+------+-----------------+---------------+-----------! Tom Clancy | 1987 | Patriot Games | 0-399-13241-4 | Putnam! Tom Clancy | 1993 | Without Remorse | 0-399-13825-0 | Putnam!

Composite partition key

Page 25: CQL Under the Hood

Composite Partition Key CREATE TABLE authors (!

!name text,!!year int,!!title text,!!isbn text,!!publisher text,!!PRIMARY KEY ((name, year), title)!

);!

name | year | title | isbn | publisher!------------+------+-----------------+---------------+-----------! Tom Clancy | 1987 | Patriot Games | 0-399-13241-4 | Putnam! Tom Clancy | 1993 | Without Remorse | 0-399-13825-0 | Putnam!

Clustering column

Page 26: CQL Under the Hood

== Composite Keys [default@Library] list authors;!!RowKey: Tom Clancy:1993!=> (name=Without Remorse:isbn, value=0-399-13241-4, timestamp=1409344246457000)!=> (name=Without Remorse:publisher, value=5075746e616d, timestamp=1409344246457000)!-------------------!RowKey: Tom Clancy:1987!=> (name=Patriot Games:isbn, value=0-399-13825-0, timestamp=1409344245715000)!=> (name=Patriot Games:publisher, value=5075746e616d, timestamp=1409344245715000)!

Partition keys (row key)

Page 27: CQL Under the Hood

== Composite Keys [default@Library] list authors;!!RowKey: Tom Clancy:1993!=> (name=Without Remorse:isbn, value=0-399-13241-4, timestamp=1409344246457000)!=> (name=Without Remorse:publisher, value=5075746e616d, timestamp=1409344246457000)!-------------------!RowKey: Tom Clancy:1987!=> (name=Patriot Games:isbn, value=0-399-13825-0, timestamp=1409344245715000)!=> (name=Patriot Games:publisher, value=5075746e616d, timestamp=1409344245715000)!

Clustering column

Page 28: CQL Under the Hood

Why This Matters •  Queries must respect underlying storage, else they will either be slow or

impossible

•  You have to know your partition key at query time

•  If you want fast multi-record queries, select a range, in storage order

•  With clustering columns, order matters

Page 29: CQL Under the Hood

Example CREATE TABLE authors (!!name text,!!year int,!!title text,!!isbn text,!!publisher text,!!PRIMARY KEY (name, year, title)!

) WITH CLUSTERING ORDER BY (year DESC);!

Page 30: CQL Under the Hood

Example name | year | title | isbn | publisher!------------+------+-----------------------------+---------------+-----------! Tom Clancy | 1996 | Executive Orders | 0-399-13825-0 | Putnam! Tom Clancy | 1994 | Debt of Honor | 0-399-13826-1 | Putnam! Tom Clancy | 1993 | Without Remorse | 0-399-13927-0 | Putnam! Tom Clancy | 1991 | The Sum of All Fears | 0-399-12341-6 | Putnam! Tom Clancy | 1989 | Clear and Present Danger | 0-399-13341-1 | Putnam! Tom Clancy | 1988 | The Cardinal of the Kremlin | 0-399-13241-4 | Putnam! Tom Clancy | 1987 | Patriot Games | 0-399-13231-4 | Putnam! Tom Clancy | 1986 | Red Storm Rising | 0-399-13230-2 | Putnam! Tom Clancy | 1984 | The Hunt for Red October | 0-399-13251-1 | Putnam!!!

Page 31: CQL Under the Hood

Query by Key SELECT * FROM authors WHERE name = ‘Tom Clancy’ (CL = QUORUM) RF = 3

Tom Clancy

Tom Clancy

Tom Clancy

Page 32: CQL Under the Hood

Query by Key RowKey: Tom Clancy!=> (name=1996:Executive Orders:publisher, value=Putnam, timestamp=1393104083773000)!=> (name=1996:Executive Orders:ISBN, value=0-399-13825-0, timestamp=1393104109214000)!=> (name=1994:Debt of Honor:publisher, value=Putnam, timestamp=1393104083773000)!=> (name=1994:Debt of Honor:ISBN, value=0-399-13826-1, timestamp=1393104109214000)!=> (name=1993:Without Remorse:publisher, value=Putnam, timestamp=1393104083773000)!=> (name=1993:Without Remorse:ISBN, value=0-399-13825-0, timestamp=1393104109214000)!=> (name=1991:The Sum of All Fears:publisher, value=Putnam, timestamp=1393103948577000)!=> (name=1991:The Sum of All Fears:ISBN, value=0-399-13241-6, timestamp=1393104011458000)!...!=> (name=1987:Patriot Games:publisher, value=Putnam, timestamp=1393103948577000)!=> (name=1987:Patriot Games:ISBN, value=0-399-13241-4, timestamp=1393104011458000)!

Find partition key

Page 33: CQL Under the Hood

Query by Key RowKey: Tom Clancy!=> (name=1996:Executive Orders:publisher, value=Putnam, timestamp=1393104083773000)!=> (name=1996:Executive Orders:ISBN, value=0-399-13825-0, timestamp=1393104109214000)!=> (name=1994:Debt of Honor:publisher, value=Putnam, timestamp=1393104083773000)!=> (name=1994:Debt of Honor:ISBN, value=0-399-13826-1, timestamp=1393104109214000)!=> (name=1993:Without Remorse:publisher, value=Putnam, timestamp=1393104083773000)!=> (name=1993:Without Remorse:ISBN, value=0-399-13825-0, timestamp=1393104109214000)!=> (name=1991:The Sum of All Fears:publisher, value=Putnam, timestamp=1393103948577000)!=> (name=1991:The Sum of All Fears:ISBN, value=0-399-13241-6, timestamp=1393104011458000)!...!=> (name=1987:Patriot Games:publisher, value=Putnam, timestamp=1393103948577000)!=> (name=1987:Patriot Games:ISBN, value=0-399-13241-4, timestamp=1393104011458000)!

Scan all columns in order

Page 34: CQL Under the Hood

Range Query SELECT * FROM authors WHERE name = ‘Tom Clancy’ AND year >= 1990 (CL = QUORUM)

Tom Clancy

Tom Clancy

Tom Clancy

Page 35: CQL Under the Hood

Range Query RowKey: Tom Clancy!=> (name=1996:Executive Orders:publisher, value=Putnam, timestamp=1393104083773000)!=> (name=1996:Executive Orders:ISBN, value=0-399-13825-0, timestamp=1393104109214000)!=> (name=1994:Debt of Honor:publisher, value=Putnam, timestamp=1393104083773000)!=> (name=1994:Debt of Honor:ISBN, value=0-399-13826-1, timestamp=1393104109214000)!=> (name=1993:Without Remorse:publisher, value=Putnam, timestamp=1393104083773000)!=> (name=1993:Without Remorse:ISBN, value=0-399-13825-0, timestamp=1393104109214000)!=> (name=1991:The Sum of All Fears:publisher, value=Putnam, timestamp=1393103948577000)!=> (name=1991:The Sum of All Fears:ISBN, value=0-399-13241-6, timestamp=1393104011458000)!...!=> (name=1987:Patriot Games:publisher, value=Putnam, timestamp=1393103948577000)!=> (name=1987:Patriot Games:ISBN, value=0-399-13241-4, timestamp=1393104011458000)!

Find partition key

Page 36: CQL Under the Hood

Range Query RowKey: Tom Clancy!=> (name=1996:Executive Orders:publisher, value=Putnam, timestamp=1393104083773000)!=> (name=1996:Executive Orders:ISBN, value=0-399-13825-0, timestamp=1393104109214000)!=> (name=1994:Debt of Honor:publisher, value=Putnam, timestamp=1393104083773000)!=> (name=1994:Debt of Honor:ISBN, value=0-399-13826-1, timestamp=1393104109214000)!=> (name=1993:Without Remorse:publisher, value=Putnam, timestamp=1393104083773000)!=> (name=1993:Without Remorse:ISBN, value=0-399-13825-0, timestamp=1393104109214000)!=> (name=1991:The Sum of All Fears:publisher, value=Putnam, timestamp=1393103948577000)!=> (name=1991:The Sum of All Fears:ISBN, value=0-399-13241-6, timestamp=1393104011458000)!...!=> (name=1987:Patriot Games:publisher, value=Putnam, timestamp=1393103948577000)!=> (name=1987:Patriot Games:ISBN, value=0-399-13241-4, timestamp=1393104011458000)!

Scan until < 1990

Page 37: CQL Under the Hood

Querying Tail of Range SELECT * FROM authors WHERE name = ‘Tom Clancy’ AND year <= 1990 (CL = QUORUM)

Tom Clancy

Tom Clancy

Tom Clancy

Page 38: CQL Under the Hood

Querying Tail of Range RowKey: Tom Clancy!=> (name=1996:Executive Orders:publisher, value=Putnam, timestamp=1393104083773000)!=> (name=1996:Executive Orders:ISBN, value=0-399-13825-0, timestamp=1393104109214000)!=> (name=1994:Debt of Honor:publisher, value=Putnam, timestamp=1393104083773000)!=> (name=1994:Debt of Honor:ISBN, value=0-399-13826-1, timestamp=1393104109214000)!=> (name=1993:Without Remorse:publisher, value=Putnam, timestamp=1393104083773000)!=> (name=1993:Without Remorse:ISBN, value=0-399-13825-0, timestamp=1393104109214000)!=> (name=1991:The Sum of All Fears:publisher, value=Putnam, timestamp=1393103948577000)!=> (name=1991:The Sum of All Fears:ISBN, value=0-399-13241-6, timestamp=1393104011458000)!...!=> (name=1987:Patriot Games:publisher, value=Putnam, timestamp=1393103948577000)!=> (name=1987:Patriot Games:ISBN, value=0-399-13241-4, timestamp=1393104011458000)!

Find partition key

Page 39: CQL Under the Hood

Querying Tail of Range RowKey: Tom Clancy!=> (name=1996:Executive Orders:publisher, value=Putnam, timestamp=1393104083773000)!=> (name=1996:Executive Orders:ISBN, value=0-399-13825-0, timestamp=1393104109214000)!=> (name=1994:Debt of Honor:publisher, value=Putnam, timestamp=1393104083773000)!=> (name=1994:Debt of Honor:ISBN, value=0-399-13826-1, timestamp=1393104109214000)!=> (name=1993:Without Remorse:publisher, value=Putnam, timestamp=1393104083773000)!=> (name=1993:Without Remorse:ISBN, value=0-399-13825-0, timestamp=1393104109214000)!=> (name=1991:The Sum of All Fears:publisher, value=Putnam, timestamp=1393103948577000)!=> (name=1991:The Sum of All Fears:ISBN, value=0-399-13241-6, timestamp=1393104011458000)!...!=> (name=1987:Patriot Games:publisher, value=Putnam, timestamp=1393103948577000)!=> (name=1987:Patriot Games:ISBN, value=0-399-13241-4, timestamp=1393104011458000)!

Scan all, then filter <= 1990

Page 40: CQL Under the Hood

DK

DKMG

MGTCMG

TC

TCDK

Multiple Keys SELECT * FROM authors WHERE name IN (‘Tom Clancy’, ‘Dean Koontz’, ‘Malcolm Gladwell’) (CL = QUORUM)

Page 41: CQL Under the Hood

Lessons Learned •  Sequential == fast

•  Query by key / clustering column (range) == Sequential

•  Multi-key query often == lots of nodes

•  Write in the intended read order

Page 42: CQL Under the Hood

Collections •  Sets: unordered, unique

•  Lists: ordered, allow duplicates

•  Maps: key/value pairs – can be a good substitute for dynamic columns

•  Max 64k items, 64k per item

•  Always returns entire collection

Page 43: CQL Under the Hood

Sets CREATE TABLE authors (!!name text,!!books set<text>,!

PRIMARY KEY (name)!);!!INSERT INTO authors (name, books) !VALUES ('Tom Clancy', {'Without Remorse', 'Patriot Games'});!

name | books!------------+--------------------------------------! Tom Clancy | {'Patriot Games', 'Without Remorse'}!

Page 44: CQL Under the Hood

Sets RowKey: Tom Clancy!=> (name=books:50617472696f742047616d6573, value=, ...)!=> (name=books:576974686f75742052656d6f727365, value=, ...)!

Set name Set item Empty value

Page 45: CQL Under the Hood

Lists CREATE TABLE authors (!!name text,!!books list<text>,!

PRIMARY KEY (name)!);!!INSERT INTO authors (name, books) !VALUES ('Tom Clancy', ['Without Remorse', 'Patriot Games']);!

name | books!------------+--------------------------------------! Tom Clancy | ['Without Remorse’, 'Patriot Games']!

Page 46: CQL Under the Hood

Lists RowKey: Tom Clancy!=> (name=books:d36de8b0305011e4a0dddbbeade718be, value=576974686f75742052656d6f727365, ...)!=> (name=books:d36de8b1305011e4a0dddbbeade718be, value=50617472696f742047616d6573, ...)!

List name Ordering ID List item

Page 47: CQL Under the Hood

Maps CREATE TABLE authors (!!name text,!!books map<text, int>,!

PRIMARY KEY (name)!);!!INSERT INTO authors (name, books) !VALUES ('Tom Clancy', !{'Without Remorse' : 1993, 'Patriot Games' : 1987});!

name | books!------------+-------------------------------------------------! Tom Clancy | {'Patriot Games': 1987, 'Without Remorse': 1993}!

Page 48: CQL Under the Hood

Maps RowKey: Tom Clancy!=> (name=books:50617472696f742047616d6573, value=000007c3, ...)!=> (name=books:576974686f75742052656d6f727365, value=000007c9, ...)!

Map name Map key Map value

Page 49: CQL Under the Hood

Indices CREATE INDEX author_publisher!ON author (publisher);!

Page 50: CQL Under the Hood

Indices •  Allow query by value in certain cases

•  Partitioned based on row key of indexed table

•  Are updated atomically along with the data being inserted

•  Must be low cardinality, or it won’t scale well (to large cluster sizes)

•  But not too low, or it’s sort of pointless

Page 51: CQL Under the Hood

Index Distribution

Node 1

“Tom Clancy” : “Putnam”!“Mark Twain” : “Putnam”!!

“Putnam” : “Tom Clancy”!“Putnam” : “Mark Twain”!!

Authors

Index

Node 2

“Mark Twain” : “Putnam”!“Dan Brown” : “Putnam”!!

“Putnam” : “Mark Twain”!“Putnam” : “Dan Brown”!!

Authors

Index

Node 3

“Dan Brown” : “Putnam” “Tom Clancy” : “Putnam”!!

“Putnam” : “Dan Brown”!“Putnam” : “Tom Clancy”!!

Authors

Index

Index key == indexed column value

Page 52: CQL Under the Hood

Index Distribution

Node 1

“Tom Clancy” : “Putnam”!“Mark Twain” : “Putnam”!!

“Putnam” : “Tom Clancy”!“Putnam” : “Mark Twain”!!

Authors

Index

Node 2

“Mark Twain” : “Putnam”!“Dan Brown” : “Putnam”!!

“Putnam” : “Mark Twain”!“Putnam” : “Dan Brown”!!

Authors

Index

Node 3

“Dan Brown” : “Putnam” “Tom Clancy” : “Putnam”!!

“Putnam” : “Dan Brown”!“Putnam” : “Tom Clancy”!!

Authors

Index

… but node distribution based on original table key

Page 53: CQL Under the Hood

pub idx

pub idx

pub idxpub idx

pub idx

pub idx

Querying by Value SELECT * FROM authors WHERE publisher = ‘Putnam’ (CL = QUORUM)

Page 54: CQL Under the Hood

Deletes DELETE FROM authors WHERE name = 'Tom Clancy';!!INSERT INTO authors (title, name) VALUES ('Patriot Games', 'Tom Clancy') USING TTL 86400;!!INSERT INTO authors (title, name, year) VALUES ('Patriot Games', 'Tom Clancy', null);!!UPDATE authors SET publisher = null WHERE name = 'Tom Clancy';!!

Page 55: CQL Under the Hood

Deletes •  Log-structured storage, so writes are immutable

•  Deletes create tombstones, one for each deleted column

•  Cassandra must read the tombstones to make sure it doesn’t revive deleted data

•  Lots of deletes is an anti-pattern

Page 56: CQL Under the Hood

Missing Columns INSERT INTO authors (title, name, year)!VALUES ('Without Remorse', 'Tom Clancy', 1993);!! name | year | title | isbn | publisher!------------+------+-----------------+------+-----------! Tom Clancy | 1993 | Without Remorse | null | null!!RowKey: Tom Clancy!=> (name=1993:Without Remorse:, value=, timestamp=1409936754170000)!

Page 57: CQL Under the Hood

Missing Columns activity | timestamp | source | source_elapsed!---------------------------------------------------------------------------+--------------+-----------+----------------! execute_cql3_query | 11:51:55,975 | 127.0.0.1 | 0! Parsing select * from authors where name = 'Tom Clancy' LIMIT 10000; | 11:51:55,975 | 127.0.0.1 | 47! Preparing statement | 11:51:55,975 | 127.0.0.1 | 105! Executing single-partition query on authors | 11:51:55,975 | 127.0.0.1 | 307! Acquiring sstable references | 11:51:55,975 | 127.0.0.1 | 315! Merging memtable tombstones | 11:51:55,975 | 127.0.0.1 | 328! Skipped 0/0 non-slice-intersecting sstables, included 0 due to tombstones | 11:51:55,975 | 127.0.0.1 | 374! Merging data from memtables and 0 sstables | 11:51:55,975 | 127.0.0.1 | 383!

Read 1 live and 0 tombstoned cells | 11:51:55,975 | 127.0.0.1 | 420! Request complete | 11:51:55,975 | 127.0.0.1 | 585!

Only 1 read required

Page 58: CQL Under the Hood

Null Columns INSERT INTO authors (title, name, year, isbn, publisher) VALUES ('Without Remorse', 'Tom Clancy', 1993, null, null);!! name | year | title | isbn | publisher!------------+------+-----------------+------+-----------! Tom Clancy | 1993 | Without Remorse | null | null!

Page 59: CQL Under the Hood

Null Columns activity | timestamp | source | source_elapsed!---------------------------------------------------------------------------+--------------+-----------+----------------! execute_cql3_query | 11:57:31,623 | 127.0.0.1 | 0! Parsing select * from authors where name = 'Tom Clancy' LIMIT 10000; | 11:57:31,623 | 127.0.0.1 | 41! Preparing statement | 11:57:31,623 | 127.0.0.1 | 101! Executing single-partition query on authors | 11:57:31,623 | 127.0.0.1 | 532! Acquiring sstable references | 11:57:31,623 | 127.0.0.1 | 544! Merging memtable tombstones | 11:57:31,623 | 127.0.0.1 | 571! Skipped 0/0 non-slice-intersecting sstables, included 0 due to tombstones | 11:57:31,623 | 127.0.0.1 | 596! Merging data from memtables and 0 sstables | 11:57:31,623 | 127.0.0.1 | 605!

Read 1 live and 2 tombstoned cells | 11:57:31,624 | 127.0.0.1 | 669! Request complete | 11:57:31,623 | 127.0.0.1 | 777!

3 reads required!

Page 60: CQL Under the Hood

Why Queries Fail SELECT name FROM authors WHERE title = 'Patriot Games';!

Bad Request: PRIMARY KEY part title cannot be restricted (preceding part year is either not restricted or by a non-EQ relation)!

Page 61: CQL Under the Hood

Why Queries Fail •  Failure to provide the full partition key

•  Querying by value without an index

•  Misunderstanding clustering columns

Page 62: CQL Under the Hood

Missing Key Parts CREATE TABLE authors (!!name text,!!year int,!!title text,!!isbn text,!!publisher text,!!PRIMARY KEY ((name, year), title)!

);!

!

!

SELECT name FROM authors WHERE title = 'Patriot Games';!

Bad Request: PRIMARY KEY part title cannot be restricted (preceding part year is either not restricted or by a non-EQ relation)!

Page 63: CQL Under the Hood

Missing Key Parts [default@Library] list authors;!!RowKey: Tom Clancy:1993!=> (name=Without Remorse:isbn, value=0-399-13241-4, timestamp=1409344246457000)!=> (name=Without Remorse:publisher, value=5075746e616d, timestamp=1409344246457000)!-------------------!RowKey: Tom Clancy:1987!=> (name=Patriot Games:isbn, value=0-399-13825-0, timestamp=1409344245715000)!=> (name=Patriot Games:publisher, value=5075746e616d, timestamp=1409344245715000)!

Partition keys (row key)

Page 64: CQL Under the Hood

Missing Key Parts SELECT * FROM authors WHERE name = 'Tom Clancy’;!

Bad Request: Partition key part year must be restricted since preceding part is!

Page 65: CQL Under the Hood

Missing Key Parts SELECT * FROM authors WHERE year = 1987;!

Bad Request: partition key part year cannot be restricted (preceding part name is either not restricted or by a non-EQ relation)!

Page 66: CQL Under the Hood

Missing Key Parts SELECT * FROM authors WHERE year >= 1987;!

Bad Request: partition key part year cannot be restricted (preceding part name is either not restricted or by a non-EQ relation)!

Page 67: CQL Under the Hood

Missing Key Parts SELECT * FROM authors !WHERE name = 'Tom Clancy' and year >= 1987;!!Bad Request: Only EQ and IN relation are supported on the partition key (unless you use the token() function)!

Page 68: CQL Under the Hood

Querying by Value SELECT * FROM authors WHERE isbn = '0-399-13241-4';!

Bad Request: No indexed columns present in by-columns clause with Equal operator!

Page 69: CQL Under the Hood

Querying Clustering Columns CREATE TABLE authors (!!name text,!!year int,!!title text,!!isbn text,!!publisher text,!!PRIMARY KEY (name, year, title)!

);!

!

!

SELECT name FROM authors WHERE title = 'Patriot Games';!

Bad Request: PRIMARY KEY part title cannot be restricted (preceding part year is either not restricted or by a non-EQ relation)!

Page 70: CQL Under the Hood

Querying Clustering Columns RowKey: Tom Clancy!=> (name=1987:Patriot Games:ISBN, value=0-399-13241-4, timestamp=1393104011458000)!=> (name=1987:Patriot Games:publisher, value=Putnam, timestamp=1393103948577000)!=> (name=1993:Without Remorse:ISBN, value=0-399-13825-0, timestamp=1393104109214000)!=> (name=1993:Without Remorse:publisher, value=Putnam, timestamp=1393104083773000)!=> (name=1996:Executive Orders:ISBN, value=0-399-13825-0, timestamp=1393104109214000)!=> (name=1996:Executive Orders:publisher, value=Putnam, timestamp=1393104083773000)!

Clustering columns

Page 71: CQL Under the Hood

Querying Clustering Columns SELECT * FROM authors !WHERE name = 'Tom Clancy’!AND year = 1993;!

name | year | title | isbn | publisher!------------+------+-----------------+---------------+-----------! Tom Clancy | 1993 | Without Remorse | 0-399-13825-0 | Putnam!

Page 72: CQL Under the Hood

Querying Clustering Columns RowKey: Tom Clancy!=> (name=1987:Patriot Games:ISBN, value=0-399-13241-4, timestamp=1393104011458000)!=> (name=1987:Patriot Games:publisher, value=Putnam, timestamp=1393103948577000)!=> (name=1993:Without Remorse:ISBN, value=0-399-13825-0, timestamp=1393104109214000)!=> (name=1993:Without Remorse:publisher, value=Putnam, timestamp=1393104083773000)!=> (name=1996:Executive Orders:ISBN, value=0-399-13825-0, timestamp=1393104109214000)!=> (name=1996:Executive Orders:publisher, value=Putnam, timestamp=1393104083773000)!

Page 73: CQL Under the Hood

Querying Clustering Columns SELECT * FROM authors !WHERE name = 'Tom Clancy’!AND year <= 1993;!

name | year | title | isbn | publisher!------------+------+-----------------+---------------+-----------! Tom Clancy | 1987 | Patriot Games | 0-399-13241-4 | Putnam! Tom Clancy | 1993 | Without Remorse | 0-399-13825-0 | Putnam!

Page 74: CQL Under the Hood

Querying Clustering Columns RowKey: Tom Clancy!=> (name=1987:Patriot Games:ISBN, value=0-399-13241-4, timestamp=1393104011458000)!=> (name=1987:Patriot Games:publisher, value=Putnam, timestamp=1393103948577000)!=> (name=1993:Without Remorse:ISBN, value=0-399-13825-0, timestamp=1393104109214000)!=> (name=1993:Without Remorse:publisher, value=Putnam, timestamp=1393104083773000)!=> (name=1996:Executive Orders:ISBN, value=0-399-13825-0, timestamp=1393104109214000)!=> (name=1996:Executive Orders:publisher, value=Putnam, timestamp=1393104083773000)!

Page 75: CQL Under the Hood

Querying Clustering Columns SELECT * FROM authors !WHERE name = 'Tom Clancy’!AND title = 'Patriot Games';!!Bad Request: PRIMARY KEY part title cannot be restricted (preceding part year is either not restricted or by a non-EQ relation)!!!

Page 76: CQL Under the Hood

Querying Clustering Columns RowKey: Tom Clancy!=> (name=1987:Patriot Games:ISBN, value=0-399-13241-4, timestamp=1393104011458000)!=> (name=1987:Patriot Games:publisher, value=Putnam, timestamp=1393103948577000)!=> (name=1993:Without Remorse:ISBN, value=0-399-13825-0, timestamp=1393104109214000)!=> (name=1993:Without Remorse:publisher, value=Putnam, timestamp=1393104083773000)!=> (name=1996:Executive Orders:ISBN, value=0-399-13825-0, timestamp=1393104109214000)!=> (name=1996:Executive Orders:publisher, value=Putnam, timestamp=1393104083773000)!

?

Page 77: CQL Under the Hood

Querying Clustering Columns SELECT * FROM authors !WHERE year = 1987;!!Bad Request: Cannot execute this query as it might involve data filtering and thus may have unpredictable performance. If you want to execute this query despite the performance unpredictability, use ALLOW FILTERING!!

Page 78: CQL Under the Hood

Querying Clustering Columns SELECT * FROM authors !WHERE year = 1987 ALLOW FILTERING;!

name | year | title | isbn | publisher!------------+------+-----------------+---------------+-----------! Tom Clancy | 1987 | Patriot Games | 0-399-13241-4 | Putnam!

Page 79: CQL Under the Hood

Querying Clustering Columns SELECT * FROM authors !WHERE year = 1987 LIMIT 1 ALLOW FILTERING;!

name | year | title | isbn | publisher!------------+------+-----------------+---------------+-----------! Tom Clancy | 1987 | Patriot Games | 0-399-13241-4 | Putnam!

Page 80: CQL Under the Hood

Summary •  Stop using Thrift; use CQL instead

•  … but know what your model/query is doing

Page 81: CQL Under the Hood

Thanks!

Robbie Strickland Software Development Manager @rs_atl

Page 82: CQL Under the Hood