DataStax: A deep look at the CQL WHERE clause
datastax-academy
CQL WHERE clause
© 2015. All Rights Reserved.
The restrictions allowed in the WHERE clause depend on:
• The type of statement: SELECT, UPDATE or DELETE
• The type of column: partition key, clustering or regular column
• Whether a secondary index is used
SELECT statements
Partition key restrictions
Cluster      Date          Time     Count
'cluster 1'  '2015-09-21'  '12:00'  251
'cluster 1'  '2015-09-22'  '12:00'  342
'cluster 2'  '2015-09-21'  '12:00'  403
'cluster 2'  '2015-09-22'  '12:00'  451
CREATE TABLE numberOfRequests (
    cluster text,
    date text,
    time text,
    count int,
    PRIMARY KEY ((cluster, date))  -- partition key: (cluster, date)
);
Cluster      Date          Murmur3 hash
'cluster 1'  '2015-09-21'  -4782752162231423249
'cluster 1'  '2015-09-22'  4936127188075462704
'cluster 2'  '2015-09-21'  5822105674898716412
'cluster 2'  '2015-09-22'  2698159220916609751
[Ring diagram: four nodes (A, B, C, D), each owning one of the token ranges below]
-9223372036854775808 to -4611686018427387903
-4611686018427387904 to -1
-1 to 4611686018427387903
4611686018427387904 to 9223372036854775807
Cluster      Date          Node
'cluster 1'  '2015-09-21'  A
'cluster 1'  '2015-09-22'  D
'cluster 2'  '2015-09-21'  D
'cluster 2'  '2015-09-22'  C
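The mapping from hash to node can be sketched as a lookup in the sorted token ranges. This is a minimal sketch, not how Cassandra implements it. The assignment of ranges to nodes is an assumption inferred from the two tables above (the slides only show it in a diagram), and `node_for_token` is a hypothetical helper name.

```python
from bisect import bisect_left

# Upper bound of each token range, in ascending order, as printed on the slide.
RANGE_UPPER_BOUNDS = [
    -4611686018427387903,
    -1,
    4611686018427387903,
    9223372036854775807,
]
# Assumed owner of each range (inferred from the hash and node tables).
RANGE_OWNERS = ["A", "B", "C", "D"]


def node_for_token(token: int) -> str:
    """Return the node whose token range contains the given Murmur3 token."""
    return RANGE_OWNERS[bisect_left(RANGE_UPPER_BOUNDS, token)]


# Murmur3 hashes copied from the slide.
TOKENS = {
    ("cluster 1", "2015-09-21"): -4782752162231423249,
    ("cluster 1", "2015-09-22"): 4936127188075462704,
    ("cluster 2", "2015-09-21"): 5822105674898716412,
    ("cluster 2", "2015-09-22"): 2698159220916609751,
}

for key, token in TOKENS.items():
    print(key, "->", node_for_token(token))
```

Under these assumptions the lookup reproduces the node column of the table: A, D, D, C.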
SELECT * FROM numberOfRequests;
SELECT * FROM numberOfRequests WHERE cluster = 'cluster 1';
InvalidRequest: code=2200 [Invalid query]
message="Partition key parts: date must be restricted as other parts are"
SELECT * FROM numberOfRequests WHERE cluster = 'cluster 1'
AND date = '2015-09-21';
With TokenAwarePolicy, the driver can send the same query directly to a node that owns the partition.
SELECT * FROM numberOfRequests WHERE cluster = 'cluster 2'
AND date IN ('2015-09-21', '2015-09-22');
With TokenAwarePolicy and asynchronous calls, the IN query can be replaced by one query per partition:
SELECT * FROM numberOfRequests WHERE cluster = 'cluster 2'
AND date = '2015-09-21';
SELECT * FROM numberOfRequests WHERE cluster = 'cluster 2'
AND date = '2015-09-22';
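Splitting an IN restriction into per-partition queries can be sketched on the client side as follows. This is only an illustration: `split_in_restriction` is a hypothetical helper, the `?` bind markers follow the notation used later in the slides, and the actual asynchronous call depends on the driver and is left out.

```python
def split_in_restriction(cluster, dates):
    """Return one (statement, parameters) pair per partition instead of a single IN query."""
    statement = "SELECT * FROM numberOfRequests WHERE cluster = ? AND date = ?"
    return [(statement, (cluster, date)) for date in dates]


queries = split_in_restriction("cluster 2", ["2015-09-21", "2015-09-22"])
for statement, parameters in queries:
    # With a real driver, each pair would be sent asynchronously at this point,
    # so a token-aware policy could route each one to a node owning its partition.
    print(statement, parameters)
```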
SELECT * FROM numberOfRequests WHERE cluster = 'cluster 1'
AND date >= '2015-09-21';
InvalidRequest: code=2200 [Invalid query]
message="Only EQ and IN relation are supported on the partition key (unless you use the token() function)"
Cluster      Date          Node
'cluster 1'  '2015-09-21'  A
'cluster 1'  '2015-09-22'  D
'cluster 2'  '2015-09-21'  D
'cluster 2'  '2015-09-22'  C
The partitions that date >= '2015-09-21' would match for 'cluster 1' are stored on different nodes (A and D), because partitions are placed by hash and not by key order.
• Murmur3Partitioner (default): distributes data uniformly across the cluster based on MurmurHash hash values.
• RandomPartitioner: distributes data uniformly across the cluster based on MD5 hash values.
• ByteOrderedPartitioner: keeps the data ordered lexically by key bytes.
SELECT * FROM numberOfRequests
WHERE token(cluster, date) > token('cluster 1', '2015-09-21')
AND token(cluster, date) < token('cluster 1', '2015-09-23');
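A token range selects partitions by hash order, not by key order. The sketch below uses only the four Murmur3 hashes shown earlier in the deck. Since the hash of ('cluster 1', '2015-09-23') is not given, it uses ('cluster 1', '2015-09-22') as the upper bound instead, which is an assumption made for illustration. `partitions_in_token_range` is a hypothetical helper.

```python
# Murmur3 hashes copied from the earlier slide.
TOKENS = {
    ("cluster 1", "2015-09-21"): -4782752162231423249,
    ("cluster 1", "2015-09-22"): 4936127188075462704,
    ("cluster 2", "2015-09-21"): 5822105674898716412,
    ("cluster 2", "2015-09-22"): 2698159220916609751,
}


def partitions_in_token_range(lower_key, upper_key):
    """Return the partition keys whose token lies strictly between the two tokens."""
    low, high = TOKENS[lower_key], TOKENS[upper_key]
    hits = [(token, key) for key, token in TOKENS.items() if low < token < high]
    return [key for _, key in sorted(hits)]  # results come back in token order


result = partitions_in_token_range(
    ("cluster 1", "2015-09-21"), ("cluster 1", "2015-09-22")
)
print(result)
```

With these hashes, the range between the two 'cluster 1' partitions contains a 'cluster 2' partition, which is why token() ranges are mainly useful for scanning a table piece by piece rather than for date ranges.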
Partition key restrictions (SELECT)
• Without a secondary index, either all partition key components must be restricted or none of them
• = restrictions are allowed on any partition key component
• IN restrictions are allowed on any partition key component since 2.2
• Prior to 2.2, IN restrictions were only allowed on the last partition key component
• =, >, >=, <= and < restrictions are allowed with the token function
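The rules above can be written down as a small checker. This is a simplified model of the slide's rules for the (cluster, date) partition key, not Cassandra's validation code: it ignores secondary indexes and the token function, and the message for the pre-2.2 IN rule is my own wording. `check_partition_key` is a hypothetical name.

```python
PARTITION_KEY = ("cluster", "date")


def check_partition_key(restrictions: dict, version: tuple = (2, 2)) -> str:
    """restrictions maps a partition key component to its operator, e.g. {"cluster": "="}."""
    if not restrictions:
        return "ok"  # no partition key restriction at all: full table scan
    missing = [column for column in PARTITION_KEY if column not in restrictions]
    if missing:
        return (f"Partition key parts: {', '.join(missing)} "
                "must be restricted as other parts are")
    for column, operator in restrictions.items():
        if operator not in ("=", "IN"):
            return ("Only EQ and IN relation are supported on the partition key "
                    "(unless you use the token() function)")
        if operator == "IN" and version < (2, 2) and column != PARTITION_KEY[-1]:
            # Not a real Cassandra message, only a label for this sketch.
            return "IN only allowed on the last partition key component before 2.2"
    return "ok"


print(check_partition_key({"cluster": "="}))
print(check_partition_key({"cluster": "=", "date": ">="}))
print(check_partition_key({"cluster": "=", "date": "IN"}))
```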
Clustering column restrictions
CREATE TABLE numberOfRequests (
    cluster text,
    date text,
    datacenter text,
    server inet,
    time text,
    count int,
    PRIMARY KEY ((cluster, date), datacenter, server, time));
…
…
Datacenter  Server       Time   Count
Iowa        196.8.7.134  00:00  130
Iowa        196.8.7.134  00:01  125
Iowa        196.8.7.134  00:02  97
Iowa        196.8.7.135  00:00  178
Iowa        196.8.7.135  00:01  201
In the Memtables:
[Iowa, 196.8.7.134, 00:00, count] : 130
[Iowa, 196.8.7.134, 00:01, count] : 125
[Iowa, 196.8.7.134, 00:02, count] : 97
[Iowa, 196.8.7.135, 00:00, count] : 178
[Iowa, 196.8.7.135, 00:01, count] : 201
Each cell is a cell name and a value, e.g. [Iowa, 196.8.7.134, 00:00, count] : 130. The cell name is made of the clustering column values followed by the column name (count).
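How rows turn into sorted cells can be sketched as below. This is a simplified in-memory model, not Cassandra's storage format: `to_sorted_cells` is a hypothetical helper, and the rows are deliberately given out of order to show that the cell name decides the position.

```python
from ipaddress import ip_address

# (datacenter, server, time, count) rows from the slide, in arbitrary order.
ROWS = [
    ("Iowa", "196.8.7.135", "00:01", 201),
    ("Iowa", "196.8.7.134", "00:02", 97),
    ("Iowa", "196.8.7.134", "00:00", 130),
    ("Iowa", "196.8.7.135", "00:00", 178),
    ("Iowa", "196.8.7.134", "00:01", 125),
]


def to_sorted_cells(rows):
    """Build (cell name, value) pairs and sort them by cell name."""
    cells = [
        # Cell name: clustering column values followed by the column name.
        ((datacenter, ip_address(server), time, "count"), count)
        for datacenter, server, time, count in rows
    ]
    return sorted(cells, key=lambda cell: cell[0])


for name, value in to_sorted_cells(ROWS):
    datacenter, server, time, column = name
    print(f"[{datacenter}, {server}, {time}, {column}] : {value}")
```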
In the SSTables:
[Iowa, 196.8.7.134, 00:00, count] : 130
[Iowa, 196.8.7.134, 00:01, count] : 125
[Iowa, 196.8.7.134, 00:02, count] : 97
[Iowa, 196.8.7.135, 00:00, count] : 178
[Iowa, 196.8.7.135, 00:01, count] : 201
SELECT * FROM numberOfRequests
WHERE cluster = 'cluster1' AND date = '2015-09-21'
AND datacenter = 'Iowa' AND server = '196.8.7.135' AND time = '00:00';
This reads the cells whose names start with [Iowa, 196.8.7.135, 00:00], in the Memtables and in the SSTables.
SELECT * FROM numberOfRequests
WHERE cluster = 'cluster1' AND date = '2015-09-21'
AND datacenter = 'Iowa' AND server = '196.8.7.135';
This reads the cells whose names start with [Iowa, 196.8.7.135], in the Memtables and in the SSTables.
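Because the cells are sorted by name, a restriction on a prefix of the clustering columns maps to one contiguous run of cells. A minimal sketch, assuming the five rows from the slides and a hypothetical `lookup_prefix` helper (servers are kept as plain strings here, which sorts correctly only for these sample values):

```python
from bisect import bisect_left

# Sorted (clustering values, count) pairs of one partition, from the slides.
CELLS = [
    (("Iowa", "196.8.7.134", "00:00"), 130),
    (("Iowa", "196.8.7.134", "00:01"), 125),
    (("Iowa", "196.8.7.134", "00:02"), 97),
    (("Iowa", "196.8.7.135", "00:00"), 178),
    (("Iowa", "196.8.7.135", "00:01"), 201),
]
NAMES = [name for name, _ in CELLS]


def lookup_prefix(prefix):
    """Return the counts of all cells whose clustering values start with prefix."""
    start = bisect_left(NAMES, prefix)  # first cell name >= prefix
    counts = []
    for name, count in CELLS[start:]:
        if name[:len(prefix)] != prefix:
            break  # left the contiguous run, nothing further can match
        counts.append(count)
    return counts


print(lookup_prefix(("Iowa", "196.8.7.135", "00:00")))  # full clustering key
print(lookup_prefix(("Iowa", "196.8.7.135")))           # prefix only
```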
SELECT * FROM numberOfRequests
WHERE cluster = 'cluster1' AND date = '2015-09-21'
AND time = '00:00';
The cell name prefix [?, ?, 00:00] is incomplete, so the query is rejected:
InvalidRequest: code=2200 [Invalid query]
message="PRIMARY KEY column "time" cannot be restricted as preceding column "datacenter" is not restricted"
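The "no gap in the prefix" rule can be modelled in a few lines. This is a sketch of the rule as the slide states it, for queries without a secondary index; `check_clustering` is a hypothetical name, and reusing the message format for other columns is an assumption.

```python
CLUSTERING_COLUMNS = ("datacenter", "server", "time")


def check_clustering(restricted: set) -> str:
    """Reject a restricted clustering column that follows an unrestricted one."""
    first_gap = None
    for column in CLUSTERING_COLUMNS:
        if column not in restricted:
            if first_gap is None:
                first_gap = column  # remember the first unrestricted column
        elif first_gap is not None:
            return (f'PRIMARY KEY column "{column}" cannot be restricted as '
                    f'preceding column "{first_gap}" is not restricted')
    return "ok"


print(check_clustering({"time"}))
print(check_clustering({"datacenter", "server"}))
```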
…
AND datacenter = 'Iowa'
AND server IN ('196.8.7.134', '196.8.7.135')
AND time = '00:00';
In 2.2, this reads the cells whose names start with:
[Iowa, 196.8.7.134, 00:00]
[Iowa, 196.8.7.135, 00:00]
In 2.1, the same query is rejected:
InvalidRequest: code=2200 [Invalid query]
message="Clustering column "server" cannot be restricted by an IN relation"
= multi-column restriction:
(clustering1, clustering2, clustering3) = (?, ?, ?)
IN multi-column restriction:
(clustering1, clustering2, clustering3) IN ((?, ?, ?), (?, ?, ?))
Slice multi-column restriction:
(clustering1, clustering2, clustering3) > (?, ?, ?)
(clustering1, clustering2, clustering3) >= (?, ?, ?)
(clustering1, clustering2, clustering3) <= (?, ?, ?)
(clustering1, clustering2, clustering3) < (?, ?, ?)
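Multi-column restrictions compare the clustering values as one tuple, which behaves like tuple comparison in Python. The sketch below applies this to the (server, time) columns of the sample rows; the helper names are hypothetical and servers are compared as plain strings for simplicity.

```python
# (server, time, count) for the 'Iowa' rows from the slides.
ROWS = [
    ("196.8.7.134", "00:00", 130),
    ("196.8.7.134", "00:01", 125),
    ("196.8.7.134", "00:02", 97),
    ("196.8.7.135", "00:00", 178),
    ("196.8.7.135", "00:01", 201),
]


def multi_column_in(values):
    """(server, time) IN (values): the tuple must equal one of the listed tuples."""
    return [count for server, time, count in ROWS if (server, time) in values]


def multi_column_greater_than(bound):
    """(server, time) > bound: lexicographic comparison of the whole tuple."""
    return [count for server, time, count in ROWS if (server, time) > bound]


print(multi_column_in([("196.8.7.134", "00:00"), ("196.8.7.135", "00:00")]))
print(multi_column_greater_than(("196.8.7.134", "00:01")))
```

Note that the slice keeps the row (196.8.7.135, 00:00) even though its time is lower than 00:01: the tuple is compared as a whole, first by server and only then by time.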
…
AND datacenter = 'Iowa'
AND (server, time) IN (('196.8.7.134', '00:00'),
('196.8.7.135', '00:00'));
In 2.1, the multi-column IN is accepted and reads the cells whose names start with:
[Iowa, 196.8.7.134, 00:00]
[Iowa, 196.8.7.135, 00:00]
…
AND datacenter = 'Iowa'
AND server = '196.8.7.134'
AND time > '00:00';
This reads from after [Iowa, 196.8.7.134, 00:00] to the end of [Iowa, 196.8.7.134].
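A slice on the last restricted clustering column is again one contiguous run in the sorted cells: it starts just after a given cell name and stops where the prefix ends. A minimal sketch with a hypothetical `slice_after` helper over the sample rows:

```python
from bisect import bisect_right

# Sorted clustering values and counts of one partition, from the slides.
NAMES = [
    ("Iowa", "196.8.7.134", "00:00"),
    ("Iowa", "196.8.7.134", "00:01"),
    ("Iowa", "196.8.7.134", "00:02"),
    ("Iowa", "196.8.7.135", "00:00"),
    ("Iowa", "196.8.7.135", "00:01"),
]
COUNTS = [130, 125, 97, 178, 201]


def slice_after(prefix, lower_bound):
    """Counts for: clustering values start with prefix AND next column > lower_bound."""
    start = bisect_right(NAMES, prefix + (lower_bound,))  # skip the bound itself
    end = start
    while end < len(NAMES) and NAMES[end][:len(prefix)] == prefix:
        end += 1  # extend to the end of the prefix
    return COUNTS[start:end]


print(slice_after(("Iowa", "196.8.7.134"), "00:00"))
```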
Clustering column restrictions (SELECT)
• Without a secondary index, a clustering column cannot be restricted if one of the previous ones is not
• = restrictions (single and multi) are allowed on any clustering column
• IN restrictions (single and multi) are allowed on any clustering column since 2.2
• Prior to 2.2, IN restrictions (single and multi) were only allowed on the last clustering column or set of clustering columns
• >, >=, <=, < restrictions (single and multi) are only allowed on the last restricted clustering column or set of clustering columns
• CONTAINS and CONTAINS KEY restrictions are only allowed on indexed collections
Secondary index queries
CREATE TABLE numberOfRequests (
    cluster text,
    date text,
    datacenter text,
    server inet,
    time text,
    count int,
    PRIMARY KEY ((cluster, date), datacenter, server, time));
CREATE INDEX ON numberOfRequests (time);
…
The index behaves like a table that is local to each node (illustrative pseudo-CQL, not a statement you can run):
CREATE LOCAL TABLE numberOfRequests_time_idx (
    time text,
    cluster text,      -- table partition key
    date text,         -- table partition key
    datacenter text,   -- table remaining clustering column
    server inet,       -- table remaining clustering column
    PRIMARY KEY (time, cluster, date, datacenter, server));
…
Each node holds the index for its own data: IDX-A, IDX-B, IDX-C, IDX-D.
SELECT * FROM numberOfRequests WHERE time = '12:00';
The query is executed in two steps:
• idx: SELECT * FROM numberOfRequests_time_idx WHERE time = '12:00'; returns the primary keys of the matching rows
• table: for each primary key, SELECT with the full PK and add the result to the rows
SELECT * FROM numberOfRequests WHERE time >= '12:00';
InvalidRequest: code=2200 [Invalid query]
message="PRIMARY KEY column "time" cannot be restricted as preceding column "datacenter" is not restricted"
Direct queries on a secondary index support only =, CONTAINS or CONTAINS KEY restrictions.
SELECT * FROM numberOfRequests WHERE time = '12:00'
AND count >= 500 ALLOW FILTERING;
The query is executed in two steps, with the filter applied to each row:
• idx: SELECT * FROM numberOfRequests_time_idx WHERE time = '12:00'; returns the primary keys of the matching rows
• table: for each primary key, SELECT with the full PK and add the row to the results if count >= 500
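The two-step index read, with the optional filtering step, can be modelled in memory as below. This is only a sketch of the flow on one node, not Cassandra's implementation. It reuses the five sample rows from the clustering slides and assumes they all belong to partition ('cluster1', '2015-09-21'); because those rows have no '12:00' entries or counts above 500, the example queries time = '00:00' with a threshold of 150 instead. `select_by_time` is a hypothetical name.

```python
from collections import defaultdict

# Table: full primary key (cluster, date, datacenter, server, time) -> count.
TABLE = {
    ("cluster1", "2015-09-21", "Iowa", "196.8.7.134", "00:00"): 130,
    ("cluster1", "2015-09-21", "Iowa", "196.8.7.134", "00:01"): 125,
    ("cluster1", "2015-09-21", "Iowa", "196.8.7.134", "00:02"): 97,
    ("cluster1", "2015-09-21", "Iowa", "196.8.7.135", "00:00"): 178,
    ("cluster1", "2015-09-21", "Iowa", "196.8.7.135", "00:01"): 201,
}

# Local index: time -> primary keys of the rows with that time.
INDEX = defaultdict(list)
for primary_key in TABLE:
    INDEX[primary_key[4]].append(primary_key)


def select_by_time(time, min_count=None):
    """Step 1: read primary keys from the index. Step 2: read each row, then filter."""
    rows = []
    for primary_key in INDEX.get(time, []):
        count = TABLE[primary_key]  # SELECT with the full primary key
        if min_count is None or count >= min_count:  # the ALLOW FILTERING part
            rows.append(primary_key + (count,))
    return rows


print(select_by_time("00:00"))
print(select_by_time("00:00", min_count=150))
```

The filter only runs after each row has been read, so rows that are read and then discarded still cost a lookup.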
SELECT * FROM numberOfRequests
WHERE cluster = 'cluster 1' AND date = '2015-09-21' AND time = '12:00';
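The index lookup is restricted to the requested partition:
• idx: SELECT * FROM numberOfRequests_time_idx WHERE time = '12:00' AND cluster = 'cluster 1' AND date = '2015-09-21'; returns the primary keys of the matching rows
• table: for each primary key, SELECT with the full PK and add the result to the rows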
UPDATE/DELETE statements
UPDATE statements
In UPDATE statements, all the primary key columns must be restricted and the only allowed restrictions are:
• Prior to 3.0:
    • Single-column = restriction on any partition key or clustering column
    • Single-column IN restriction on the last partition key column
• In 3.0:
    • = and IN single-column restrictions on any partition key column
    • = and IN single- or multi-column restrictions on any clustering column
DELETE statements
Before 3.0, in DELETE statements all the primary key columns had to be restricted and the only allowed restrictions were:
• Single-column = restriction on any partition key or clustering column
• Single-column IN restriction on the last partition key column
Since 3.0, in DELETE statements:
• The partition key columns must be restricted by = or IN restrictions
• A clustering column may be left unrestricted only if none of the clustering columns after it is restricted
• Clustering columns can be restricted by:
    • Single- or multi-column = restriction
    • Single- or multi-column IN restriction
    • Single- or multi-column >, >=, <=, < restriction
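The DELETE rules since 3.0 can be checked with a small function. This models only what the bullets above say, for the numberOfRequests table with single-column restrictions, and is not Cassandra's validation code. `delete_where_is_valid` is a hypothetical name.

```python
PARTITION_KEY = ("cluster", "date")
CLUSTERING_COLUMNS = ("datacenter", "server", "time")
ALLOWED_CLUSTERING_OPERATORS = {"=", "IN", ">", ">=", "<=", "<"}


def delete_where_is_valid(restrictions: dict) -> bool:
    """restrictions maps a primary key column to its operator, e.g. {"cluster": "="}."""
    # Every partition key column must be restricted by = or IN.
    if any(restrictions.get(column) not in ("=", "IN") for column in PARTITION_KEY):
        return False
    gap_seen = False
    for column in CLUSTERING_COLUMNS:
        operator = restrictions.get(column)
        if operator is None:
            gap_seen = True  # unrestricted: nothing after it may be restricted
        elif gap_seen or operator not in ALLOWED_CLUSTERING_OPERATORS:
            return False
    return True


# Delete a whole partition, then a range of rows inside one datacenter.
print(delete_where_is_valid({"cluster": "=", "date": "="}))
print(delete_where_is_valid(
    {"cluster": "=", "date": "=", "datacenter": "=", "server": ">"}))
```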
Design your tables for the queries
you want to perform.
Thank you