(Slides by Hector Garcia-Molina, hector/cs245/notes.htm)
Chapters 15-16a 1
(Slides by Hector Garcia-Molina, http://www-db.stanford.edu/~hector/cs245/notes.htm)
Chapters 15 and 16: Query Processing
Query Processing
Q → Query Plan
Focus: Relational System
Example
Select B, D
From R, S
Where R.A = “c” AND S.E = 2 AND R.C = S.C
R:

| A | B | C |
|---|---|---|
| a | 1 | 10 |
| b | 1 | 20 |
| c | 2 | 10 |
| d | 2 | 35 |
| e | 3 | 45 |

S:

| C | D | E |
|---|---|---|
| 10 | x | 2 |
| 20 | y | 2 |
| 30 | z | 2 |
| 40 | x | 1 |
| 50 | y | 3 |

Answer:

| B | D |
|---|---|
| 2 | x |
• How do we execute query?
One idea:
- Do Cartesian product
- Select tuples
- Do projection
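The three steps of this first idea can be sketched in a few lines of Python on the example relations R and S. This is only an illustration of the plan, not of how a real engine stores or scans data; the tuple layout and variable names are assumptions made for the sketch.

```python
from itertools import product

# Example relations from the slides: R(A, B, C) and S(C, D, E).
R = [("a", 1, 10), ("b", 1, 20), ("c", 2, 10), ("d", 2, 35), ("e", 3, 45)]
S = [(10, "x", 2), (20, "y", 2), (30, "z", 2), (40, "x", 1), (50, "y", 3)]

# Step 1: Cartesian product R x S (every R tuple paired with every S tuple).
rxs = list(product(R, S))

# Step 2: select the pairs with R.A = "c" AND S.E = 2 AND R.C = S.C.
selected = [(r, s) for r, s in rxs
            if r[0] == "c" and s[2] == 2 and r[2] == s[0]]

# Step 3: project onto B (from R) and D (from S).
answer = [(r[1], s[1]) for r, s in selected]

print(len(rxs), answer)
```

The product materializes all 25 pairs even though only one of them survives the selection, which is the motivation for looking at other plans.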
R × S:

| R.A | R.B | R.C | S.C | S.D | S.E |
|---|---|---|---|---|---|
| a | 1 | 10 | 10 | x | 2 |
| a | 1 | 10 | 20 | y | 2 |
| . | . | . | . | . | . |
| c | 2 | 10 | 10 | x | 2 |
| . | . | . | . | . | . |

Bingo! Got one... (the row c 2 10 10 x 2)
Relational Algebra - can be used to describe plans...
Ex: Plan I

$\Pi_{B,D}$
  $\sigma_{R.A=\text{“c”} \land S.E=2 \land R.C=S.C}$
    $\times$
      R, S

OR: $\Pi_{B,D}\,[\,\sigma_{R.A=\text{“c”} \land S.E=2 \land R.C=S.C}\,(R \times S)\,]$
Another idea: Plan II

$\Pi_{B,D}$
  $\bowtie$ (natural join)
    $\sigma_{R.A=\text{“c”}}$ over R
    $\sigma_{S.E=2}$ over S
| R (A B C) | σ(R) (A B C) | σ(S) (C D E) | S (C D E) |
|---|---|---|---|
| a 1 10 | c 2 10 | 10 x 2 | 10 x 2 |
| b 1 20 | | 20 y 2 | 20 y 2 |
| c 2 10 | | 30 z 2 | 30 z 2 |
| d 2 35 | | | 40 x 1 |
| e 3 45 | | | 50 y 3 |
Plan III: Use R.A and S.C Indexes
(1) Use R.A index to select R tuples with R.A = “c”
(2) For each R.C value found, use S.C index to find matching tuples
(3) Eliminate S tuples with S.E ≠ 2
(4) Join matching R,S tuples, project B,D attributes and place in result
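Steps (1) to (4) can be sketched with Python dictionaries standing in for the two indexes. The names `index_R_A` and `index_S_C` are hypothetical, and a dictionary is only a stand-in for a real index structure such as a B-tree or hash index.

```python
from collections import defaultdict

# Example relations from the slides: R(A, B, C) and S(C, D, E).
R = [("a", 1, 10), ("b", 1, 20), ("c", 2, 10), ("d", 2, 35), ("e", 3, 45)]
S = [(10, "x", 2), (20, "y", 2), (30, "z", 2), (40, "x", 1), (50, "y", 3)]

# Hypothetical in-memory "indexes": attribute value -> list of tuples.
index_R_A = defaultdict(list)
for r in R:
    index_R_A[r[0]].append(r)
index_S_C = defaultdict(list)
for s in S:
    index_S_C[s[0]].append(s)

result = []
for r in index_R_A.get("c", []):        # (1) R tuples with R.A = "c"
    for s in index_S_C.get(r[2], []):   # (2) S tuples with S.C = R.C
        if s[2] != 2:                   # (3) drop S tuples with S.E != 2
            continue
        result.append((r[1], s[1]))     # (4) join, project B and D

print(result)
```

Only the tuples reached through the indexes are touched, so neither relation is scanned in full.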
[Figure: R(A,B,C) and S(C,D,E) with the same tuples as before, index I1 on R.A and index I2 on S.C]
I1 lookup A = “c” → <c,2,10>
I2 lookup C = 10 → <10,x,2>
check E = 2? → output: <2,x>
next tuple: <c,7,15>
Overview of Query Optimization
SQL query
→ parse → parse tree
→ convert → logical query plan
→ apply laws → “improved” l.q.p
→ estimate result sizes → l.q.p. + sizes
→ consider physical plans → {P1,P2,…..}
→ estimate costs → {(P1,C1),(P2,C2)...}
→ pick best → Pi
→ execute → answer
Side input shown in the figure: statistics
Example: SQL query
SELECT title
FROM StarsIn
WHERE starName IN (
    SELECT name
    FROM MovieStar
    WHERE birthdate LIKE ‘%1960’
);
(Find the movies with stars born in 1960)
Example: Parse Tree

<Query>
  <SFW>
    SELECT <SelList> FROM <FromList> WHERE <Condition>
      <SelList> → <Attribute> → title
      <FromList> → <RelName> → StarsIn
      <Condition> → <Tuple> IN <Query>
        <Tuple> → <Attribute> → starName
        <Query> → ( <Query> )
          <Query> → <SFW>
            SELECT <SelList> FROM <FromList> WHERE <Condition>
              <SelList> → <Attribute> → name
              <FromList> → <RelName> → MovieStar
              <Condition> → <Attribute> LIKE <Pattern>
                <Attribute> → birthDate
                <Pattern> → ‘%1960’
Example: Generating Relational Algebra

$\Pi_{title}$
  $\sigma$
    StarsIn
    <condition>
      <tuple>
        <attribute>
          starName
      IN
      $\Pi_{name}$
        $\sigma_{\text{birthdate LIKE ‘\%1960’}}$
          MovieStar

An expression using a two-argument $\sigma$, midway between a parse tree and relational algebra
Example: Logical Query Plan

$\Pi_{title}$
  $\sigma_{starName=name}$
    $\times$
      StarsIn
      $\Pi_{name}$
        $\sigma_{\text{birthdate LIKE ‘\%1960’}}$
          MovieStar

Applying the rule for IN conditions
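The effect of the IN rule can be checked with Python's built-in `sqlite3` module by running the original subquery form next to the product-plus-selection form. The table contents below are made up for illustration, and the check assumes `name` is unique in MovieStar; without that assumption the product form could return extra duplicates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE StarsIn (title TEXT, starName TEXT);
    CREATE TABLE MovieStar (name TEXT PRIMARY KEY, birthdate TEXT);
    -- Hypothetical sample rows, not taken from the slides.
    INSERT INTO StarsIn VALUES
        ('Movie 1', 'Star A'), ('Movie 2', 'Star B'), ('Movie 3', 'Star A');
    INSERT INTO MovieStar VALUES
        ('Star A', '4/7/1960'), ('Star B', '9/1/1975');
""")

# Original form: condition with IN and a subquery.
in_query = """
    SELECT title FROM StarsIn
    WHERE starName IN (
        SELECT name FROM MovieStar WHERE birthdate LIKE '%1960')
"""

# Rewritten form: product with the subquery result, then starName = name.
join_query = """
    SELECT title
    FROM StarsIn,
         (SELECT name FROM MovieStar WHERE birthdate LIKE '%1960') AS m
    WHERE starName = m.name
"""

in_rows = sorted(conn.execute(in_query).fetchall())
join_rows = sorted(conn.execute(join_query).fetchall())
print(in_rows == join_rows, in_rows)
```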
Example: Improved Logical Query Plan

$\Pi_{title}$
  $\bowtie_{starName=name}$
    StarsIn
    $\Pi_{name}$
      $\sigma_{\text{birthdate LIKE ‘\%1960’}}$
        MovieStar

Fig. 7.20: An improvement on fig. 7.18.
Question: Push project to StarsIn?
Example: Estimate Result Sizes
[Figure: logical query plan over StarsIn and MovieStar, annotated “Need expected size”]
Example: One Physical Plan

Hash join [Parameters: join order, memory size, ...]
  SEQ scan: StarsIn
  index scan: MovieStar [Parameters: Select Condition, ...]
Example: Estimate costs
L.Q.P → physical plans P1, P2, …, Pn
with costs C1, C2, …, Cn
Pick best!
Textbook outline
Chapter 5: Algebra for queries [bags vs sets]
- Select, project, join, …. [project list a, a+b->x, …]
- Duplicate elimination, grouping, sorting
Chapter 15:
- Physical operators
  - Scan, sort, …
- Implementing operators and estimating their cost
Chapter 16:
- Parsing
- Algebraic laws
- Parse tree -> logical query plan
- Estimating result sizes
- Cost based optimization
Query Optimization
• Relational algebra level
• Detailed query plan level
  – Estimate Costs
    • without indexes
    • with indexes
  – Generate and compare plans
Relational algebra optimization
• Transformation rules (preserve equivalence)
• What are good transformations?
Rules: Natural joins & cross products & union
$R \bowtie S = S \bowtie R$
$(R \bowtie S) \bowtie T = R \bowtie (S \bowtie T)$
$R \times S = S \times R$
$(R \times S) \times T = R \times (S \times T)$
$R \cup S = S \cup R$
$R \cup (S \cup T) = (R \cup S) \cup T$
Note:
• Can also write as trees, e.g.:
[Figure: the tree for $(R \bowtie S) \bowtie T$ next to the tree for $R \bowtie (S \bowtie T)$]
Rules: Selects
$\sigma_{p_1 \land p_2}(R) = \sigma_{p_1}[\sigma_{p_2}(R)]$
$\sigma_{p_1 \lor p_2}(R) = [\sigma_{p_1}(R)] \cup [\sigma_{p_2}(R)]$
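Both selection rules can be checked on a small relation. The sketch below models R as a Python set, so it only illustrates the rules under set semantics; the relation contents and the predicates `p1` and `p2` are hypothetical.

```python
# R as a set of (A, B) tuples; contents are made up for illustration.
R = {(1, "cat"), (3, "cat"), (3, "dog"), (5, "bat")}


def select(pred, rel):
    """Selection: keep the tuples of rel that satisfy pred."""
    return {t for t in rel if pred(t)}


def p1(t):
    return t[0] == 3          # A = 3


def p2(t):
    return t[1] == "cat"      # B = "cat"


# Conjunction splits into a cascade of selections.
and_direct = select(lambda t: p1(t) and p2(t), R)
and_cascade = select(p1, select(p2, R))

# Disjunction splits into a union of selections
# (sets, so a tuple matching both predicates is kept once).
or_direct = select(lambda t: p1(t) or p2(t), R)
or_union = select(p1, R) | select(p2, R)

print(and_direct == and_cascade, or_direct == or_union)
```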
Rules: $\sigma$ + $\bowtie$ combined
Let p = predicate with only R attribs
    q = predicate with only S attribs
$\sigma_p(R \bowtie S) = [\sigma_p(R)] \bowtie S$
$\sigma_q(R \bowtie S) = R \bowtie [\sigma_q(S)]$
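A small check of the first combined rule on the running example, with tuples modeled as dictionaries. `natural_join` is a naive helper written for this sketch, not a library function, and the predicate `p` is an assumption chosen to match the earlier query.

```python
def natural_join(r_rows, s_rows):
    """Naive natural join of two lists of dicts on their shared attribute names."""
    out = []
    for r in r_rows:
        for s in s_rows:
            common = r.keys() & s.keys()
            if all(r[k] == s[k] for k in common):
                out.append({**r, **s})
    return out


R = [dict(A=a, B=b, C=c) for a, b, c in
     [("a", 1, 10), ("b", 1, 20), ("c", 2, 10), ("d", 2, 35), ("e", 3, 45)]]
S = [dict(C=c, D=d, E=e) for c, d, e in
     [(10, "x", 2), (20, "y", 2), (30, "z", 2), (40, "x", 1), (50, "y", 3)]]


def p(t):
    return t["A"] == "c"      # predicate that mentions only R attributes


late = [t for t in natural_join(R, S) if p(t)]      # select after the join
early = natural_join([r for r in R if p(r)], S)     # select before the join

print(late == early, len(natural_join(R, S)), len(early))
```

Both forms return the same tuples, but the early selection hands a single R tuple to the join instead of all five.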
Which are “good” transformations?
$\sigma_{p_1 \land p_2}(R) \rightarrow \sigma_{p_1}[\sigma_{p_2}(R)]$
$\sigma_p(R \bowtie S) \rightarrow [\sigma_p(R)] \bowtie S$
$R \bowtie S \rightarrow S \bowtie R$
Conventional wisdom: do projects early
Example: R(A,B,C,D,E), x = {E}, p: (A=3) $\land$ (B=“cat”)
$\Pi_x\{\sigma_p(R)\}$ vs. $\Pi_E\{\sigma_p\{\Pi_{ABE}(R)\}\}$
But: What if we have A, B indexes?
[Figure: index lookups for B = “cat” and A = 3]
Intersect pointers to get pointers to matching tuples
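The pointer intersection can be sketched with dictionaries that map an attribute value to a set of tuple positions. The relation contents and the names `index_A` and `index_B` are hypothetical; positions in a Python list stand in for record pointers.

```python
# R restricted to the attributes (A, B, E); contents are made up.
R = [(3, "cat", 7), (3, "dog", 8), (4, "cat", 9), (3, "cat", 1)]

# Hypothetical secondary indexes: attribute value -> set of tuple pointers.
index_A, index_B = {}, {}
for ptr, (a, b, _e) in enumerate(R):
    index_A.setdefault(a, set()).add(ptr)
    index_B.setdefault(b, set()).add(ptr)

# Intersect the two pointer sets, then fetch only the matching tuples.
pointers = index_A.get(3, set()) & index_B.get("cat", set())
result = sorted(R[ptr][2] for ptr in pointers)   # project E

print(sorted(pointers), result)
```

With both indexes available the selection is answered from the full tuples directly, so projecting R onto A, B, E first would only add work.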
Bottom line:
• No transformation is always good
• Usually good: early selections
In textbook: more transformations
• Eliminate common sub-expressions
• Other operations: duplicate elimination
Outline - Query Processing
• Relational algebra level
  – transformations
  – good transformations
• Detailed query plan level
  – estimate costs
  – generate and compare plans
• Estimating cost of query plan
(1) Estimating size of results
(2) Estimating # of IOs
Estimating result size
• Keep statistics for relation R
  – T(R) : # tuples in R
  – S(R) : # of bytes in each R tuple
  – B(R) : # of blocks to hold all R tuples
  – V(R, A) : # distinct values in R for attribute A
Example R
A: 20 byte string
B: 4 byte integer
C: 8 byte date
D: 5 byte string

| A | B | C | D |
|---|---|---|---|
| cat | 1 | 10 | a |
| cat | 1 | 20 | b |
| dog | 1 | 30 | a |
| dog | 1 | 40 | c |
| bat | 1 | 50 | d |

T(R) = 5, S(R) = 37
V(R,A) = 3, V(R,C) = 5
V(R,B) = 1, V(R,D) = 4
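The statistics on this slide can be recomputed from the example table. The sketch below is only a check of the arithmetic; the variable names are assumptions, and B(R) is left out because the slide gives no block size.

```python
# Example relation R(A, B, C, D) from the slide.
rows = [
    ("cat", 1, 10, "a"),
    ("cat", 1, 20, "b"),
    ("dog", 1, 30, "a"),
    ("dog", 1, 40, "c"),
    ("bat", 1, 50, "d"),
]
attrs = ["A", "B", "C", "D"]
field_bytes = {"A": 20, "B": 4, "C": 8, "D": 5}   # declared field widths

T = len(rows)                               # T(R): number of tuples
S = sum(field_bytes.values())               # S(R): bytes per tuple
V = {name: len({row[i] for row in rows})    # V(R, X): distinct values of X
     for i, name in enumerate(attrs)}

print(T, S, V)
```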
Size estimates for W = R1 x R2
$T(W) = T(R_1) \times T(R_2)$
$S(W) = S(R_1) + S(R_2)$
Selection cardinality
SC(R,A) = average # records that satisfy equality condition on R.A
$SC(R,A) = \dfrac{T(R)}{V(R,A)}$ or $\dfrac{T(R)}{DOM(R,A)}$
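Both forms of the selection cardinality are the same division with a different denominator. The sketch below assumes tuples are spread evenly over the possible values; the function name and the domain size of 10 are made up for illustration, while T(R) = 5 and V(R,A) = 3 come from the earlier example relation.

```python
def selection_cardinality(t_r, n_values):
    """Expected number of tuples matching R.A = value, assuming the tuples
    are spread evenly over n_values possible values of A."""
    return t_r / n_values


# Statistics of the earlier example relation: T(R) = 5, V(R, A) = 3.
sc_by_v = selection_cardinality(5, 3)

# DOM(R, A) = 10 is a hypothetical domain size, used only for illustration.
sc_by_dom = selection_cardinality(5, 10)

print(sc_by_v, sc_by_dom)
```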
What about W = $\sigma_{z \ge val}(R)$ ?
T(W) = ?
• Solution # 1: T(W) = T(R)/2
• Solution # 2: T(W) = T(R)/3
Size estimate for W = R1 ⋈ R2
Let X = attributes of R1
    Y = attributes of R2
Case 1: X ∩ Y = ∅
Same as R1 x R2
Case 2: W = R1 ⋈ R2, X ∩ Y = A
R1(A, B, C), R2(A, D)
Assumption:
$V(R_1,A) \le V(R_2,A) \Rightarrow$ Every A value in R1 is in R2
$V(R_2,A) \le V(R_1,A) \Rightarrow$ Every A value in R2 is in R1
“containment of value sets”
• $V(R_1,A) \le V(R_2,A)$: $T(W) = \dfrac{T(R_2)\,T(R_1)}{V(R_2,A)}$
• $V(R_2,A) \le V(R_1,A)$: $T(W) = \dfrac{T(R_2)\,T(R_1)}{V(R_1,A)}$
[A is common attribute]
In general, W = R1 ⋈ R2:
$T(W) = \dfrac{T(R_2)\,T(R_1)}{\max\{V(R_1,A),\, V(R_2,A)\}}$
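The general formula can be compared with an exact count on synthetic data that satisfies the containment-of-value-sets assumption and has evenly spread values. The columns below are made up for this sketch; on real, skewed data the estimate and the true size will generally differ.

```python
from collections import Counter


def estimate_join_size(t_r1, t_r2, v_r1_a, v_r2_a):
    """T(W) for W = R1 join R2 with a single common attribute A."""
    return t_r1 * t_r2 / max(v_r1_a, v_r2_a)


# Synthetic A columns (hypothetical data):
# R1 has 20 distinct A values, 5 tuples each; R2 has 50 values, 8 tuples each.
# Every A value of R1 also appears in R2.
r1_a = [a for a in range(20) for _ in range(5)]
r2_a = [a for a in range(50) for _ in range(8)]

estimate = estimate_join_size(len(r1_a), len(r2_a),
                              len(set(r1_a)), len(set(r2_a)))

# Exact join size: for each A value, (#R1 tuples) * (#R2 tuples).
c1, c2 = Counter(r1_a), Counter(r2_a)
actual = sum(c1[a] * c2[a] for a in c1)

print(estimate, actual)
```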
Using similar ideas, we can estimate sizes of:
$\Pi_{AB}(R)$ …..
$\sigma_{A=a \land B=b}(R)$ ….
$R \bowtie S$ with common attribs. A,B,C
Union, intersection, diff, ….
Example:
Z = R1(A,B) ⋈ R2(B,C) ⋈ R3(C,D)
R1: T(R1) = 1000, V(R1,A) = 50, V(R1,B) = 100
R2: T(R2) = 2000, V(R2,B) = 200, V(R2,C) = 300
R3: T(R3) = 3000, V(R3,C) = 90, V(R3,D) = 500
Partial Result: U = R1 ⋈ R2
$T(U) = \dfrac{1000 \times 2000}{200} = 10{,}000$
V(U,A) = 50
V(U,B) = 100
V(U,C) = 300
Z = U ⋈ R3
$T(Z) = \dfrac{1000 \times 2000 \times 3000}{200 \times 300} = 100{,}000$
V(Z,A) = 50
V(Z,B) = 100
V(Z,C) = 90
V(Z,D) = 500
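The two estimation steps of this example can be reproduced with a small helper. `join_stats` and the dictionary layout are assumptions of this sketch; for the join attribute it keeps the smaller distinct-value count, which matches the V values shown on the slides, and it copies the other V values unchanged.

```python
def join_stats(left, right, attr):
    """Estimate statistics for the natural join of left and right on attr.

    Each argument is a dict {"T": tuple count, "V": {attribute: #distinct}}.
    """
    t = left["T"] * right["T"] / max(left["V"][attr], right["V"][attr])
    v = {**left["V"], **right["V"]}
    # For the join attribute, keep the smaller distinct-value count.
    v[attr] = min(left["V"][attr], right["V"][attr])
    return {"T": t, "V": v}


R1 = {"T": 1000, "V": {"A": 50, "B": 100}}
R2 = {"T": 2000, "V": {"B": 200, "C": 300}}
R3 = {"T": 3000, "V": {"C": 90, "D": 500}}

U = join_stats(R1, R2, "B")   # U = R1 join R2
Z = join_stats(U, R3, "C")    # Z = U join R3

print(U["T"], Z["T"], Z["V"])
```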
Summary
• Estimating size of results is an “art”
• Don’t forget: Statistics must be kept up to date… (cost?)
Outline
• Estimating cost of query plan
  – Estimating size of results: done!
  – Estimating # of IOs
• Generate and compare plans