Query Processing CS 347 Decomposition Parallel and...
Transcript of Query Processing CS 347 Decomposition Parallel and...
CS 347Parallel and Distributed
Data ProcessingSpring 2016
Notes 3: Query Processing
Query ProcessingDecompositionLocalizationOptimization
CS 347 Notes 3 2
DecompositionSame as in centralized system
1. Normalization2. Eliminating redundancy3. Algebraic rewriting
CS 347 Notes 3 3
NormalizationConvert from general language to a standard form E.g., relational algebra
CS 347 Notes 3 4
NormalizationExample:
select R.A, R.Cfrom R, Swhere R.A = S.A and
((R.B = 1 and S.D = 2) or (R.C > 3 and S.D = 2))
CS 347 Notes 3 5
∏R.A,R.C
⋈
R S
σ(R.B=1∨R.C>3)∧S.D=2
conjunctivenormal formA
NormalizationAlso detect invalid expressions
select * from R ✘ R does not have D attributewhere R.D = 3
CS 347 Notes 3 6
Eliminating RedundancyE.g., in conditions
(S.A = 1) ∧ (S.A > 5) ⟹ false
(S.A < 10) ∧ (S.A < 5) ⟹ S.A < 5
CS 347 Notes 3 7
Eliminating RedundancyE.g., sharing common sub-expressions
CS 347 Notes 3 8
∪
⋈ ⋈
S σcond σcond T
R R
∪
⋈ ⋈
S σcond T
R
Algebraic RewritingE.g., pushing conditions down
CS 347 Notes 3 9
σcond
⋈
R S
σcond3
⋈
σcond1 σcond2
R S
Query ProcessingDecomposition ✔→ One or more algebraic query trees on relations
LocalizationReplacing relations by corresponding fragments
CS 347 Notes 3 10
LocalizationSteps
(1) Start with query
(2) Replace relations by fragments
(3) Push ∪ up and ∏, σ down (apply CS 245 rules)
(4) Simplify ~ eliminate unnecessary operations
CS 347 Notes 3 11
LocalizationNotation for fragment
CS 347 Notes 3 12
[R : cond]
fragment
conditions its tuples satisfy
Example A(1)
CS 347 Notes 3 13
σE=3
R
Example A(2)
CS 347 Notes 3 14
σE=3
∪
[R1: E<10] [R2: E≥10]
Example A(3)
CS 347 Notes 3 15
σE=3
∪
[R1: E<10] [R2: E≥10]
σE=3
Example A(3)
CS 347 Notes 3 16
σE=3
∪
[R1: E<10] [R2: E≥10]
σE=3
⟹ Ø
Example A(4)
CS 347 Notes 3 17
σE=3
[R1: E<10]
LocalizationRule 1
σc1 [R: c2] ⟹ σc1 [R: c1∧c2]
[R: false] ⟹ Ø
CS 347 Notes 3 18
Example AσE=3 [R2: E ≥ 10] ⟹ σE=3 [R2: E=3∧E ≥ 10]
⟹ σE=3 [R2: false]⟹ Ø
CS 347 Notes 3 19
Example B(1)
CS 347 Notes 3 20
R
⋈
S
AA = common attribute
Example B(2)
CS 347 Notes 3 21
∪
⋈
∪
A
[R1: A<5] [R2: 5≤A≤10] [R3: A>10] [S1: A<5] [S2: A≥5]
Example B(3)
CS 347 Notes 3 22
∪
[R1: A<5]
[R2: 5≤A≤10]
[R3: A>10]
[S1: A<5] [S2: A≥5]
[S1: A<5] [R1: A<5] [S2: A≥5]
[R2: 5≤A≤10]
[R3: A>10] [S1: A<5] [S2: A≥5]
⋈A ⋈A ⋈A ⋈A
⋈A ⋈A
Example B(3)
CS 347 Notes 3 23
∪
[R1: A<5]
[R2: 5≤A≤10]
[R3: A>10]
[S1: A<5] [S2: A≥5]
[S1: A<5] [R1: A<5] [S2: A≥5]
[R2: 5≤A≤10]
[R3: A>10] [S1: A<5] [S2: A≥5]
⋈A ⋈A ⋈A ⋈A
⋈A ⋈A
Example B(4)
CS 347 Notes 3 24
[R1: A<5] [R3: A>10][S2: A≥5][S1: A<5] [R2: 5≤A≤10] [S2: A≥5]
⋈A ⋈A ⋈A
∪
LocalizationRule 2
[R: c1] [S: c2] ⟹ [R S: c1∧c2∧R.A=S.A]
[R: false] ⟹ Ø
CS 347 Notes 3 25
⋈A ⋈A
Example B[R1: A < 5] [S2: A ≥ 5]
⟹ [R1 S2: R1.A < 5 ∧ S2.A ≥ 5 ∧ R1.A=S2.A]
⟹ [R1 S2: false]
⟹ Ø
CS 347 Notes 3 26
⋈A
⋈A
⋈A
Example C(2)
CS 347 Notes 3 27
∪
⋈
∪
K
[R1: A<10] [R2: A≥10] [S1: K=R.K∧R.A<10] [S2: K=R.K∧R.A≥10]
derived fragmentation
Example C(3)
CS 347 Notes 3 28
[R1]
⋈K ⋈K⋈K ⋈K
∪
[S1] [R1] [S2] [S1][R2] [R2] [S2]
Example C(3)
CS 347 Notes 3 29
[R1]
⋈K ⋈K⋈K ⋈K
∪
[S1] [R1] [S2] [S1][R2] [R2] [S2]
Example C(4)
CS 347 Notes 3 30
[R1: A<10]
⋈K ⋈K
∪
[S1: K=R.K∧R.A<10] [R2: A≥10] [S2: K=R.K∧R.A≥10]
Example C[R1: A < 10] [S2: K = R.K ∧ R.A ≥ 10]
⟹ [R1 S2: R1.A < 10 ∧ S2.K = R.K ∧ R.A ≥ 10 ∧ R1.K = S2.K]
⟹ [R1 S2: false]
⟹ Ø
CS 347 Notes 3 31
⋈K
⋈K
⋈K
Example D(1)
CS 347 Notes 3 32
RR1 (K, A, B)
R2 (K, C, D)
vertical fragmentation
∏A
Example D(2)
CS 347 Notes 3 33
∏A
R1
⋈K
R2(K,A,B) (K,C,D)
Example D(2)
CS 347 Notes 3 34
∏A
R1
⋈K
R2(K,A,B) (K,C,D)
∏K,A ∏K,A
Example D(4)
CS 347 Notes 3 35
∏A
R1(K, A, B)
Rule 3Consider the vertical fragmentation
Ri = ∏ (R), Ai⊆A
Then for any B ⊆A
∏B(R) = ∏B( Ri | B ∩ Ai ≠ Ø )
CS 347 Notes 3 36
⋈i
Ai
Query ProcessingDecomposition ✔Localization ✔
OptimizationOverviewJoins and other operationsInter-operation parallelismOptimization strategies
CS 347 Notes 3 37
OverviewOptimization process1. Generate query plans
2. Estimate size ofintermediate results
3. Estimate cost of plan
CS 347 Notes 3 384. Pick minimum
P1 P2 P3 Pn
C1 C2 C3 Cn
§
§
§
§
§
§
§
§
§
§
§
§
OverviewDifferences from centralized optimizationNew strategies for some operations
Parallel/distributed sortParallel/distributed join
Semi-joinPrivacy preserving join
Duplicate eliminationAggregationSelection
Many ways to assign and schedule processorsCS 347 Notes 3 39
Parallel/Distributed SortInputa) Relation R on single site/diskb) Relation R fragmented/partitioned by sort attributec) Relation R fragmented/partitioned by another attribute
CS 347 Notes 3 40
Parallel/Distributed SortOutputa) Sorted R on single site/diskb) Sorted R fragments/partitions
CS 347 Notes 3 41
5 …610
12 …15
19 …202150
F1 F2 F3
Basic SortR(K, …) to be sorted on KHorizontally fragmented on K using vector (k0, k1, ... , kn)
CS 347 Notes 3 42
73
2722
111714
10 20
k0 k1
Basic SortAlgorithm1. Sort each fragment independently2. Ship results if necessary
CS 347 Notes 3 43
Basic SortSame idea on different architectures
Shared nothing
Shared memory
CS 347 Notes 3 44
P1
M F1…
P2
M F2
M
P1
F2
… P2
F1
Range-Partition SortR(K, …) to be sorted on KR located at one or more site/disk, not fragmented on K
CS 347 Notes 3 45
Range-Partition SortAlgorithm1. Range partition on K2. Basic sort
CS 347 Notes 3 46
R’1
R’2
R’3
k0
k1
local sort
local sort
local sort
Ra R1
R2
R3
Rbresult
Partition VectorsProblemSelect a good partition vector given fragments
CS 347 Notes 3 47
7 …521114
31 …815113217
10 …124
Ra Rb Rc
Partition VectorsExample approach
Each site sends to coordinatorMIN sort keyMAX sort keyNumber of tuples
CoordinatorComputes vector and distributes to sitesDecides the number of sites to perform local sorts
CS 347 Notes 3 48
Partition VectorsSample scenarioCoordinator receives
SA: MIN = 5 MAX = 9 # of tuples = 10SB: MIN = 7 MAX = 16 # of tuples = 10
CS 347 Notes 3 49
Sample scenarioCoordinator receives
SA: MIN = 5 MAX = 9 # of tuples = 10SB: MIN = 7 MAX = 16 # of tuples = 10
Partition Vectors
CS 347 Notes 3 50
Expected tuples
assuming we want to sort at 2 sites
5 10 15 20
k0?
1 = 10 / (16 – 7 + 1)
2 = 10 / (9 – 5 + 1)
Sample scenario
Partition Vectors
CS 347 Notes 3 51
Expected tuples with key < k0 = half of total tuples2(k0 – 5) + (k0 – 7) = 10k0 = (10 + 10 + 7) / 3 = 9
Expected tuples
5 10 15 20
k0?
1
2
Partition VectorsVariationsSend more info to coordinator:a) Partition vector for local site
b) Histogram
CS 347 Notes 3 52
5 6 8 107 9
3 3 3
5 6 8 10
# of tupleslocal vector
Partition VectorsVariationsMultiple rounds1. Sites send range and # of tuples to coordinator2. Coordinator distributes preliminary vector V0
3. Sites send coordinator # of tuples in each V0 range4. Coordinator computes and distributes final vector Vf
CS 347 Notes 3 53
Partition VectorsVariationsDistributed algorithm that does not require a coordinator?
CS 347 Notes 3 54
Parallel External Sort-MergeSame as range-partition sort except does sorting first
CS 347 Notes 3 55
R’1
R’2
R’3
k0
k1
local sort
local sort
Ra R’a
R’bRb result
in order
merge
Parallel/Distributed JoinInputRelations R, SMay or may not be fragmented
OutputR ⋈ SResult at one or more sites
CS 347 Notes 3 56
Partitioned Join (Equijoin)
CS 347 Notes 3 57
RA S1
S2
S3
RB
R1
R2
R3
SA
SB
SC
resultf(A)
f(A)
local join
Partitioned Join (Equijoin)Same partition function f is used for both R and S
f can be range or hash partitioning
Local join can be of any type (use any CS 245 optimization)
Various scheduling options, e.g., a) Partition R; partition S; joinb) Partition R; build local hash table for R; partition S and join
CS 347 Notes 3 58
Partitioned Join (Equijoin)We already know why it works
CS 347 Notes 3 59
∪
⋈
[R1] [R2] [R3]
∪
[S1] [S2] [S3]
∪
⋈ ⋈ ⋈
[R1] [S1] [R2] [S2] [R3] [S3]
Sometimes we partition just to make this join possible
Partitioned Join (Equijoin)Selecting a good partition function f is very important
Goal is to make all | Ri | + | Si | the same size
CS 347 Notes 3 60
Asymmetric Fragment + Replicate Join
CS 347 Notes 3 61
RA S
S
S
RB
R1
R2
R3
SA
SB
partition
unionresult
local join
Asymmetric Fragment + Replicate JoinCan use any partition function f for R (even round robin)
Can do any join, not just equijoin (e.g., R S)
CS 347 Notes 3 62
⋈R.A<S.B
General Fragment + Replicate Join
CS 347 Notes 3 63
RA R1
R2
R3
RB
R1
R2
R3
f
n copies of each fragment
partition→3 fragments
General Fragment + Replicate Join
CS 347 Notes 3 64
RA R1
R2
R3
RB
R1
R2
R3
f
n copies of each fragment
partition→3 fragments
S is partitioned in similar fashion
General Fragment + Replicate Join
CS 347 Notes 3 65
n × m pairings
of R, S fragments
R1 S1
R2 S1
R3 S1
R1 S2
R2 S2
R3 S2
result
General Fragment + Replicate Join
The asymmetric F+R join is special case of the general F+R
Asymmetric F+R may be good if S is small
Also works on non-equijoins
CS 347 Notes 3 66
Semi-JoinGoal: reduce communication traffic
R ⋈ S ⟹ ( R ⋉ S ) ⋈ S or
R ⋈ ( S ⋉ R ) or
( R ⋉ S ) ⋈ ( S ⋉ R )
CS 347 Notes 3 67
A A A
A A
AA A
Semi-JoinR ⋈ S
CS 347 Notes 3 68
2 a10 b25 c30 d
3 x10 y15 z25 w32 x
R S A AB C
Semi-JoinR ⋈ S
CS 347 Notes 3 69
2 a10 b25 c30 d
3 x10 y15 z25 w32 x
R S A AB C
∏AR = [2, 10, 25, 30]
Semi-JoinR ⋈ S
CS 347 Notes 3 70
2 a10 b25 c30 d
3 x10 y15 z25 w32 x
R S A AB C
∏AR = [2, 10, 25, 30]
S ⋉R = 10 y25 w
R ⋈ S
Semi-Join
CS 347 Notes 3 71
2 a10 b25 c30 d
3 x10 y15 z25 w32 x
R S A AB C
Transmitted dataWith semi-join R ⋈ ( S ⋉ R ): T = 4 | A | + 2 | A + C | + resultWith join R ⋈ S : T = 4 | A + B | + result
better if | B | is large
Semi-JoinAssume R is the smaller relation
Then, in general,
( R ⋉ S ) ⋈ S is better than ( R ⋈ S ) ifsize( ∏A S ) + size ( R ⋉ S ) < size ( R )
Can do similar comparison for other joins→ Only taking into account transmission cost
CS 347 Notes 3 72
A A A
A
Semi-JoinTrick: encode ∏A S (or ∏A R) as a bit vector
CS 347 Notes 3 73
0 0 1 1 0 1 0 0 0 0 1 0 1 0 0 keys in S
one bit/possible key
Semi-Join
CS 347 Notes 3 74
Useful for three way joins R ⋈ S ⋈ T
Semi-Join
CS 347 Notes 3 75
Useful for three way joins R ⋈ S ⋈ T
Option 1: R’ ⋈ S’ ⋈ T where R’ = R ⋉ S
S’ = S ⋉ T
Semi-Join
CS 347 Notes 3 76
Useful for three way joins R ⋈ S ⋈ T
Option 1: R’ ⋈ S’ ⋈ T where R’ = R ⋉ S
S’ = S ⋉ T
Option 2: R” ⋈ S’ ⋈ T where R” = R ⋉ S’
S’ = S ⋉ T
Semi-Join
CS 347 Notes 3 77
Useful for three way joins R ⋈ S ⋈ T
Option 1: R’ ⋈ S’ ⋈ T where R’ = R ⋉ S
S’ = S ⋉ T
Option 2: R” ⋈ S’ ⋈ T where R” = R ⋉ S’
S’ = S ⋉ T…→ Many options ~ exponential in # of relations
Privacy Preserving JoinSite 1 has R(A, B)Site 2 has S(A, C)
Want to compute R ⋈ S
Site 1 should not discover any S info not in the joinSite 2 should not discover any R info not in the join
CS 347 Notes 3 78
Privacy Preserving JoinSemi-join won’t workIf site 1 sends ∏A R to site 2, site 2 learns all keys of R
CS 347 Notes 3 79
a1 b1
a2 b2
a3 b3
a4 b4
a1 c1
a3 c2
a5 c3
a7 c4
R S A AB C
site 1 site 2
∏A R = (a1, a2, a3, a4)
Privacy Preserving JoinFix: send hash keys onlySite 1 hashes each value of A before sendingSite 2 hashes its own A values (same h) to see what tuples match
CS 347 Notes 3 80
a1 b1
a2 b2
a3 b3
a4 b4
a1 c1
a3 c2
a5 c3
a7 c4
R S A AB C
∏A R = (h(a1), h(a2), h(a3), h(a4))
site 2 sees it has h(a1), h(a3)
(a1, c1), (a3, c3)site 1 site 2
Privacy Preserving JoinWhat is the problem?
CS 347 Notes 3 81
a1 b1
a2 b2
a3 b3
a4 b4
a1 c1
a3 c2
a5 c3
a7 c4
R S A AB C
∏A R = (h(a1), h(a2), h(a3), h(a4))
site 2 sees it has h(a1), h(a3)
(a1, c1), (a3, c3)site 1 site 2
Privacy Preserving JoinWhat is the problem?
CS 347 Notes 3 82
a1 b1
a2 b2
a3 b3
a4 b4
a1 c1
a3 c2
a5 c3
a7 c4
R S A AB C
∏A R = (h(a1), h(a2), h(a3), h(a4))
site 2 sees it has h(a1), h(a3)
(a1, c1), (a3, c3)site 1 site 2
Dictionary attackSite 2 can take all keys a1, a2, a3, … and checks if h(a1), h(a2), h(a3), …matches what site 1 sent
Privacy Preserving JoinOur adversary model: honest but curious
Dictionary attack is possible (cheating is internal and can’t be caught)
Sending incorrect keys not possible (cheater could be caught)
CS 347 Notes 3 83
Privacy Preserving JoinOne solutionInformation Sharing Across Private Databases. Agrawal et al., SIGMOD 2003
→ Use commutative encryption functions
Ei(x) = x encrypted using a key private to site iE1(E2(x)) = E2(E1(x))
Shorthand: E1(x) is x, E2(x) is x, E1(E2(x)) is x
CS 347 Notes 3 84
One solution
Privacy Preserving Join
CS 347 Notes 3 85
a1 b1
a2 b2
a3 b3
a4 b4
a1 c1
a3 c2
a5 c3
a7 c4
R S A AB C
( a1, a2, a3, a4 )
( a1, a2, a3, a4 )
( a1, a3, a5, a7 )
Computes ( a1, a3, a5, a7 ), intersects with ( a1, a2, a3, a4 )
(a1, b1), (a3, b3)
site 1 site 2
Other Privacy Preserving OperationsInequality join
Similarity join
CS 347 Notes 3 86
R ⋈ SR.A>S.A
R ⋈ Ssim(R.A,S.A)<e
Other Parallel OperationsDuplicate eliminationSort first (in parallel) then eliminate duplicates in resultPartition tuples (range or hash) and eliminate locally
AggregatesPartition by grouping attributes; compute aggregate locally
CS 347 Notes 3 87
Parallel/Distributed Aggregates
CS 347 Notes 3 88
sum (sal) group by dept
id dept salary1 toy 102 toy 203 sales 15
Ra
id dept salary4 sales 55 toy 206 mgmt 157 sales 108 mgmt 30
Rb
Parallel/Distributed Aggregates
CS 347 Notes 3 89
id dept salary1 toy 102 toy 203 sales 15
Ra
id dept salary4 sales 55 toy 206 mgmt 157 sales 108 mgmt 30
Rb
id dept salary1 toy 102 toy 205 toy 206 mgmt 158 mgmt 30
id dept salary3 sales 154 sales 57 sales 10
sum (sal) group by dept
Parallel/Distributed Aggregates
CS 347 Notes 3 90
id dept salary1 toy 102 toy 203 sales 15
Ra
id dept salary4 sales 55 toy 206 mgmt 157 sales 108 mgmt 30
Rb
id dept salary1 toy 102 toy 205 toy 206 mgmt 158 mgmt 30
id dept salary3 sales 154 sales 57 sales 10
dept sumtoy 50
mgmt 45
dept sumsales 30
sum
sum
sum (sal) group by dept
Parallel/Distributed Aggregates
CS 347 Notes 3 91
id dept salary1 toy 102 toy 203 sales 15
Ra
id dept salary4 sales 55 toy 206 mgmt 157 sales 108 mgmt 30
Rb
sum (sal) group by dept with less data
Parallel/Distributed Aggregates
CS 347 Notes 3 92
id dept salary1 toy 102 toy 203 sales 15
Ra
id dept salary4 sales 55 toy 206 mgmt 157 sales 108 mgmt 30
Rb
dept salarytoy 30toy 20
mgmt 45
dept salarysales 15sales 15
sum
sum
sum (sal) group by dept with less data
Parallel/Distributed Aggregates
CS 347 Notes 3 93
id dept salary1 toy 102 toy 203 sales 15
Ra
id dept salary4 sales 55 toy 206 mgmt 157 sales 108 mgmt 30
Rb
dept salarytoy 30toy 20
mgmt 45
dept salarysales 15sales 15
sum
sum
dept sumtoy 50
mgmt 45
dept sumsales 30
sum
sum
sum (sal) group by dept with less data
Parallel/Distributed AggregatesEnhancementPerform aggregate during partition to reduce data transmitted→ Does not work for all aggregate functions—which ones?
CS 347 Notes 3 94
Preview: MapReduce
CS 347 Notes 3 95
data A1
data A2
data A3
data B1
data B2
data C1
data C1
map
reduce
Parallel/Distributed SelectionStraightforward if one can leverage range or hash partitioning
How do indexes work?
CS 347 Notes 3 96
Parallel/Distributed SelectionPartition vector can act as the root of a distributed index
CS 347 Notes 3 97
k0 k1
site 1 site 2 site 3
local indexes
Parallel/Distributed SelectionDistributed indexes on a non-partition attributes get complicated
CS 347 Notes 3 98
k0 k1
index sites
tuple sites
Parallel/Distributed SelectionIndexing schemesWhich one is better in a distributed environment?How to make updates and expansion efficient?Where to store the directory and set of participants?Is global knowledge necessary?
→ If the index is not too big, it may be better to keep it whole and replicate it
CS 347 Notes 3 99
Query ProcessingDecomposition ✔Localization ✔
OptimizationOverview ✔Joins and other operations ✔Inter-operation parallelismOptimization strategies
CS 347 Notes 3 100
Inter-Operation ParallelismPipelinedIndependent
CS 347 Notes 3 101
Pipelined Parallelism
CS 347 Notes 3 102
σc
⋈
R
S
site 2
site 1
site 1 join
R S
site 2
resultprobetuples matching σc
Pipelined ParallelismPipelining cannot be used in all cases
CS 347 Notes 3 103
⋈
stream ofR tuples
stream ofS tuples
Independent Parallelism
CS 347 Notes 3 104
⋈⋈ ⋈
R S T V
site 1 site 2
1. temp1 ⟵R ⋈ Stemp2 ⟵ T ⋈ V
2. result ⟵ temp1 ⋈ temp2