Index Interactions in Physical Design Tuning: Modeling, Analysis, and Applications
Karl Schnaitter, UC Santa Cruz
Neoklis Polyzotis, UC Santa Cruz
Lise Getoor, Univ. of Maryland
VLDB 2009, Lyon, France
University of California, Santa Cruz

Index Selection
• Index selection problem:
– Given a query workload
– Choose indices that improve workload performance
• Does index benefit depend on other indices?
– If so, this is called index interaction
• Index “benefit” is a key concept
– Informally, for an index i, [benefit of i] = [exec cost without i] – [exec cost with i]

Related Work
• Interactions are a key concern in physical tuning
– [Whang et al. 1981] make assumptions implying that indices on different tables do not interact
– [Finkelstein et al. 1988] assume that indices do not interact if they are relevant to separate queries
– [Bruno and Chaudhuri 2007] explicitly account for some interactions in on-line index selection
– Many more…
• These studies treat interactions as a secondary issue, and often rely on ad hoc assumptions

Index Interactions
• Let S be a set of indices relevant to a query Q
• cost(X) = cost of Q if only X ⊆ S is available
• benefit(Y, X) = cost(X) − cost(Y ∪ X)
[Figure: cost(X), cost(X ∪ {a}), cost(X ∪ {b}), and cost(X ∪ {a,b}), with benefit({a}, X) and benefit({a}, X ∪ {b}) marked]
Indices a,b are independent with respect to X

Index Interactions
[Figure: cost(X), cost(X ∪ {a}), cost(X ∪ {b}), and cost(X ∪ {a,b}), with benefit({a}, X) and benefit({a}, X ∪ {b}) marked]
Indices a,b positively interact with respect to X

Index Interactions
[Figure: cost(X), cost(X ∪ {a}), cost(X ∪ {b}), and cost(X ∪ {a,b}), with benefit({a}, X) and benefit({a}, X ∪ {b}) marked]
Indices a,b negatively interact with respect to X

Degree of Interaction
• doi(a,b,X) = degree of interaction between a,b with respect to X
$$
\mathrm{doi}(a,b,X) = \frac{\left|\mathrm{benefit}(\{a\},X) - \mathrm{benefit}(\{a\},X\cup\{b\})\right|}{\mathrm{cost}(X\cup\{a,b\})} = \frac{\left|\mathrm{cost}(X\cup\{a\}) - \mathrm{cost}(X) - \mathrm{cost}(X\cup\{a,b\}) + \mathrm{cost}(X\cup\{b\})\right|}{\mathrm{cost}(X\cup\{a,b\})}
$$
[Figure: lattice of the four index sets X, X ∪ {a}, X ∪ {b}, X ∪ {a,b}]
• doi is symmetric
• Overall degree of interaction:
$$
\mathrm{doi}(a,b) = \max_{X \subseteq S} \mathrm{doi}(a,b,X)
$$
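The symmetry claim can be checked directly from the definition of benefit. The short derivation below expands the numerator of doi using only benefit(Y, X) = cost(X) − cost(Y ∪ X). It assumes the numerator is taken in absolute value, which is what the later worked example in this deck suggests (a numerator of −35 is reported as 1.75).

```latex
\begin{align*}
\mathrm{benefit}(\{a\},X) - \mathrm{benefit}(\{a\},X\cup\{b\})
  &= \bigl[\mathrm{cost}(X) - \mathrm{cost}(X\cup\{a\})\bigr]
   - \bigl[\mathrm{cost}(X\cup\{b\}) - \mathrm{cost}(X\cup\{a,b\})\bigr] \\
  &= \mathrm{cost}(X) - \mathrm{cost}(X\cup\{a\})
   - \mathrm{cost}(X\cup\{b\}) + \mathrm{cost}(X\cup\{a,b\}) \\
  &= \mathrm{benefit}(\{b\},X) - \mathrm{benefit}(\{b\},X\cup\{a\})
\end{align*}
```

The middle line is unchanged when a and b are swapped, and the denominator cost(X ∪ {a,b}) is too, so doi(a,b,X) = doi(b,a,X). The cost-only form on the slide is the negation of the middle line, so the two forms agree once the absolute value is taken.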

Problem Statement
• Which indices in S interact?
• How strong are the interactions?
• The Degree of Interaction Problem: Compute doi(a,b) for all a,b ∈ S

Outline
• Properties of Query Optimization
• Degree of Interaction Algorithm
• Applying Interaction Information

Query Optimization
• Computing doi(a,b) is not practical if the optimizer is totally arbitrary
– Need to compute doi(a,b,X) for all X ⊆ S
• In practice, query optimization is not arbitrary
– E.g., we expect cost(∅) ≥ cost({a})
• We put mild assumptions on query optimization:
– Plans are selected from some fixed space P
– Optimizer chooses the cheapest feasible plan from P
– Ties are broken consistently

Index Benefit Graph
• An Index Benefit Graph (IBG) encodes the selection of optimal plans for a query
– Introduced by [Frank, Omiecinski, and Navathe 1992]
• Example IBG when S = {a,b,c,d}
[Figure: IBG whose nodes are index sets, each annotated with the indices used in the optimal plan and the cost of that plan. Node costs: {a,b,c,d} = 20, {a,b,c} = 45, {b,c,d} = 50, {a,c} = 80, {b,c} = 50, {c,d} = 65, {c} = 80, {d} = 80]
– There are 16 subsets of S
– IBG has 8 nodes
– But IBG can compute cost(X) for all X ⊆ S
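How eight nodes can answer cost(X) for all sixteen subsets is easier to see in code. The sketch below is a minimal illustration in Python, not the authors' implementation. The node costs follow the example figure as far as they can be recovered from the transcript, while the "used" sets are assumptions chosen to be consistent with those costs (the figure's markings did not survive extraction). The lookup rule assumed here: start at the root, and whenever the node's plan uses an index outside X, move to the child that drops that index.

```python
from typing import Dict, FrozenSet, Tuple

IndexSet = FrozenSet[str]

# Toy IBG: node -> (indices used by the optimal plan, cost of that plan).
# Costs follow the slide's example; the "used" sets are ASSUMED for illustration.
IBG: Dict[IndexSet, Tuple[IndexSet, float]] = {
    frozenset("abcd"): (frozenset("ad"), 20.0),
    frozenset("abc"): (frozenset("ab"), 45.0),
    frozenset("bcd"): (frozenset("b"), 50.0),
    frozenset("ac"): (frozenset(), 80.0),
    frozenset("bc"): (frozenset("b"), 50.0),
    frozenset("cd"): (frozenset("cd"), 65.0),
    frozenset("c"): (frozenset(), 80.0),
    frozenset("d"): (frozenset(), 80.0),
}
ROOT: IndexSet = frozenset("abcd")


def ibg_cost(x: IndexSet) -> float:
    """Return cost(X) by walking the IBG from the root."""
    node = ROOT
    while True:
        used, cost = IBG[node]
        missing = used - x
        if not missing:
            # The plan at this node only needs indices available in X.
            return cost
        # Drop one used index that X lacks and continue at that child node.
        node = node - {min(missing)}


if __name__ == "__main__":
    for name in ["abcd", "acd", "ac", "b", ""]:
        print(sorted(name), ibg_cost(frozenset(name)))
```

A node has one child per index its plan uses, so index sets whose plans ignore most of S never spawn further nodes. That is why the graph can stay much smaller than the full power set.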

Naive Algorithm
• Recall that we want the degree of interaction between all pairs of indices in S
• Each doi(a,b) may be computed directly

For all a,b ∈ S
    Initialize T[a,b] = 0
    For all X ⊆ S
        Let d = |cost(X ∪ {a}) − cost(X) − cost(X ∪ {a,b}) + cost(X ∪ {b})| / cost(X ∪ {a,b})
        Assign T[a,b] = max(d, T[a,b])

• Upon termination, T[a,b] = doi(a,b) for all a,b
• Can save time using an IBG as a cache of the cost function
• Downside: iteration over all subsets of S
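The naive procedure translates almost line for line into Python. The sketch below is illustrative only: `toy_cost` is a made-up cost function (not from the talk) in which a and b are only fully effective together and c is independent of both. The numerator is taken in absolute value, and only unordered pairs are visited, relying on the symmetry of doi.

```python
from itertools import combinations
from typing import Callable, Dict, FrozenSet, Iterable, Iterator, Tuple

CostFn = Callable[[FrozenSet[str]], float]


def subsets(items: Iterable[str]) -> Iterator[FrozenSet[str]]:
    """Yield every subset of items (2^n of them, the algorithm's downside)."""
    ordered = sorted(items)
    for size in range(len(ordered) + 1):
        for combo in combinations(ordered, size):
            yield frozenset(combo)


def doi_at(cost: CostFn, a: str, b: str, x: FrozenSet[str]) -> float:
    """Degree of interaction between a and b with respect to X."""
    numerator = cost(x | {a}) - cost(x) - cost(x | {a, b}) + cost(x | {b})
    return abs(numerator) / cost(x | {a, b})


def naive_doi(indices: Iterable[str], cost: CostFn) -> Dict[Tuple[str, str], float]:
    """Compute T[a,b] = max over X of doi(a,b,X) for every unordered pair."""
    names = sorted(indices)
    table: Dict[Tuple[str, str], float] = {}
    for a, b in combinations(names, 2):
        table[(a, b)] = 0.0
        for x in subsets(names):
            table[(a, b)] = max(doi_at(cost, a, b, x), table[(a, b)])
    return table


def toy_cost(x: FrozenSet[str]) -> float:
    """HYPOTHETICAL cost function, used only to exercise the code."""
    if {"a", "b"} <= x:
        base = 40.0
    elif "a" in x:
        base = 80.0
    elif "b" in x:
        base = 90.0
    else:
        base = 100.0
    return base - (10.0 if "c" in x else 0.0)


if __name__ == "__main__":
    for pair, value in naive_doi("abc", toy_cost).items():
        print(pair, value)
```

In practice `cost` would be backed by an IBG lookup rather than repeated optimizer calls, as the slide notes, but the loop over all subsets remains.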

The QINTERACT Algorithm
• We should avoid evaluating doi(a,b,X) for all X ⊆ S
• The QINTERACT algorithm processes two index sets per IBG node

Naive Algorithm (condensed)
For all a,b ∈ S
    Initialize T[a,b] = 0
    For all X ⊆ S
        Assign T[a,b] = max(doi(a,b,X), T[a,b])

QINTERACT Algorithm
For all a,b ∈ S
    Initialize T[a,b] = 0
    For all IBG nodes Y
        Construct two index sets X1, X2 ⊆ S (see paper)
        Assign T[a,b] = max(doi(a,b,X1), doi(a,b,X2), T[a,b])

QINTERACT Example
• Let’s calculate doi(a,b) on the graph below
• What happens on iteration Y = {u}?
[Figure: IBG with node costs {a,b,u,v} = 20, {a,u,v} = 30, {b,u,v} = 30, {a,u} = 40, {u,v} = 40, {b,v} = 40, {u} = 50, {v} = 50; node Y = {u} is marked. Two lattices show cost(∅), cost(a), cost(b), cost(ab) and cost(u), cost(ua), cost(ub), cost(aub)]
X1 = {u}:
$$
\mathrm{doi}(a,b,X_1) = \frac{|40 - 50 - 20 + 30|}{20} = 0
$$
X2 = ∅:
$$
\mathrm{doi}(a,b,X_2) = \frac{|40 - 50 - 20 + 40|}{20} = 0.5
$$
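The arithmetic for the two index sets can be replayed in a few lines of Python. This is only a check of the numbers on the slide, assuming the absolute-value form of doi; the helper name `doi` and its parameter names are chosen here for illustration.

```python
def doi(cost_xa: float, cost_x: float, cost_xab: float, cost_xb: float) -> float:
    """doi(a,b,X) from the four costs, in the term order used on the slides."""
    return abs(cost_xa - cost_x - cost_xab + cost_xb) / cost_xab


# X1 = {u}: cost(ua) = 40, cost(u) = 50, cost(uab) = 20, cost(ub) = 30
doi_x1 = doi(cost_xa=40, cost_x=50, cost_xab=20, cost_xb=30)

# X2 = {}: cost(a) = 40, cost({}) = 50, cost(ab) = 20, cost(b) = 40
doi_x2 = doi(cost_xa=40, cost_x=50, cost_xab=20, cost_xb=40)

# One QINTERACT update of T[a,b], starting from its initial value 0.
t_ab = max(doi_x1, doi_x2, 0.0)

print(doi_x1, doi_x2, t_ab)  # 0.0 0.5 0.5
```

With respect to X1 = {u} the two indices look independent, and the interaction only shows up at X2 = ∅, which illustrates why a single index set per node would not be enough here.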

Interleaved IBG Processing
• In QINTERACT, the IBG is built, then analyzed
– I.e., IBG construction and analysis are serial
• We can discover interactions in a partial IBG
• IBG construction and analysis may be interleaved
– Improves accuracy of doi over time
[Figure: partially constructed IBG for S = {a,b,c,d}, with some nodes still unexplored]
$$
\mathrm{doi}(b,d,\{a,c\}) = \frac{|45 - 80 - 20 + 20|}{20} = 1.75
$$

Outline
• Properties of Query Optimization
• Degree of Interaction Algorithm
• Applying Interaction Information
– Visualizing Index Interactions
– Scheduling Index Creation

Visualizing Index Interactions
• We can visualize the doi function as a graph
– Nodes correspond to indices
– Edge between a and b has weight doi(a,b)
[Figure: interaction graph for TPC-H Query 7. Nodes: O(CK,OK), C(CK,NK), C(NK,CK), LI(SK,SD,D,EP,OK), LI(SD,D), LI(SD,Q), S(NK,N,SK), S(NK,SK), S(SK,NK). Edge weights range from 0.01 to 0.09]

Interaction Graph
• The connected components have special meaning
[Figure: interaction graph partitioned into connected components C1, C2, C3]
1. The benefit of any X ⊆ Ci does not depend on S − Ci
2. Refining the partition loses property (1)
3. This is the only partition with properties (1) and (2)
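Given a table of doi values, the components C1, C2, ... can be extracted with a standard union-find pass. The sketch below is one possible way to do it, not the authors' code. The `threshold` parameter and the sample weights are assumptions for illustration: with threshold 0 every nonzero interaction becomes an edge, while a larger value ignores "minor" interactions.

```python
from typing import Dict, List, Sequence, Set, Tuple


def interaction_components(
    indices: Sequence[str],
    doi: Dict[Tuple[str, str], float],
    threshold: float = 0.0,
) -> List[Set[str]]:
    """Connected components of the graph with an edge wherever doi > threshold."""
    parent = {i: i for i in indices}

    def find(i: str) -> str:
        # Follow parent links to the representative, halving paths on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for (a, b), weight in doi.items():
        if weight > threshold:
            parent[find(a)] = find(b)

    groups: Dict[str, Set[str]] = {}
    for i in indices:
        groups.setdefault(find(i), set()).add(i)
    return sorted(groups.values(), key=lambda g: sorted(g))


if __name__ == "__main__":
    # HYPOTHETICAL doi values; pairs not listed are treated as non-interacting.
    sample = {("a", "b"): 0.4, ("b", "c"): 0.05, ("d", "e"): 0.2}
    names = ["a", "b", "c", "d", "e"]
    print(interaction_components(names, sample))        # every interaction counts
    print(interaction_components(names, sample, 0.1))   # ignore minor interactions
```

Raising the threshold can only split components further, so it trades some accuracy of property (1) for smaller groups that are easier to reason about independently.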

Scheduling Index Creation
• Suppose we want to materialize new indices
• In what order should they be created?
[Figure: benefit versus materialized indices for three schedules. Schedule = a,b,c passes through ∅, {a}, {a,b}, {a,b,c}. Schedule = b,a,c passes through ∅, {b}, {a,b}, {a,b,c}. Schedule = c,a,b passes through ∅, {c}, {a,c}, {a,b,c}]
• Choose the first schedule to maximize benefit over time (shaded area)

Scheduling Index Creation
• We define an optimization problem
– M = preexisting indices
– {a1, …, an} = new indices to create
– Permute new indices as t1, …, tn to maximize
$$
\sum_{i=1}^{n} \mathrm{benefit}(\{t_1, \ldots, t_i\}, M)
$$
• This problem is computationally hard
– There is a connection to the Set Cover problem, since each new index “covers” more benefit
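For a handful of indices the objective can simply be evaluated for every permutation. The sketch below does that in Python. The `BENEFIT` table is hypothetical (it stands for benefit(Y, M) with M held fixed) and is only meant to show how the sum over schedule prefixes is computed. Enumerating all n! orders is of course not a practical algorithm, which is consistent with the hardness remark above.

```python
from itertools import permutations
from typing import Dict, FrozenSet, Iterable, Sequence, Tuple

# HYPOTHETICAL values of benefit(Y, M) for each Y within {a, b, c}, M fixed.
BENEFIT: Dict[FrozenSet[str], float] = {
    frozenset(): 0,
    frozenset("a"): 4,
    frozenset("b"): 3,
    frozenset("c"): 5,
    frozenset("ab"): 20,
    frozenset("ac"): 7,
    frozenset("bc"): 6,
    frozenset("abc"): 20,
}


def schedule_value(schedule: Sequence[str], benefit: Dict[FrozenSet[str], float]) -> float:
    """Sum of benefit({t1..ti}, M) over all prefixes of the schedule."""
    total = 0.0
    built: FrozenSet[str] = frozenset()
    for index in schedule:
        built = built | {index}
        total += benefit[built]
    return total


def best_schedule(
    new_indices: Iterable[str], benefit: Dict[FrozenSet[str], float]
) -> Tuple[str, ...]:
    """Exhaustive search over all n! creation orders."""
    return max(
        permutations(sorted(new_indices)),
        key=lambda s: schedule_value(s, benefit),
    )


if __name__ == "__main__":
    best = best_schedule("abc", BENEFIT)
    print(best, schedule_value(best, BENEFIT))
```

In this toy table a and b interact strongly, so the best order builds them back to back even though c has the largest benefit on its own.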

Greedy Scheduling
• We are tempted to use a greedy heuristic
• This results in the third schedule
[Figure: benefit versus materialized indices for three schedules: Schedule = a,b,c, Schedule = b,a,c, and Schedule = c,a,b]
• Greedy schedule can be suboptimal by a factor of about (n – 1)
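A small sketch shows how a greedy choice can end up with the schedule c,a,b. It assumes that "greedy" means building, at each step, the index that yields the largest benefit right away; the talk does not spell out the rule, so treat this as one plausible reading. The `BENEFIT` table is hypothetical and stands for benefit(Y, M) with M fixed.

```python
from typing import Dict, FrozenSet, Iterable, List, Sequence

# HYPOTHETICAL values of benefit(Y, M) for each Y within {a, b, c}, M fixed.
BENEFIT: Dict[FrozenSet[str], float] = {
    frozenset(): 0,
    frozenset("a"): 4,
    frozenset("b"): 3,
    frozenset("c"): 5,
    frozenset("ab"): 20,
    frozenset("ac"): 7,
    frozenset("bc"): 6,
    frozenset("abc"): 20,
}


def schedule_value(schedule: Sequence[str], benefit: Dict[FrozenSet[str], float]) -> float:
    """Sum of benefit({t1..ti}, M) over all prefixes of the schedule."""
    total = 0.0
    built: FrozenSet[str] = frozenset()
    for index in schedule:
        built = built | {index}
        total += benefit[built]
    return total


def greedy_schedule(
    new_indices: Iterable[str], benefit: Dict[FrozenSet[str], float]
) -> List[str]:
    """At each step, build the index giving the largest benefit right now."""
    built: FrozenSet[str] = frozenset()
    remaining = set(new_indices)
    schedule: List[str] = []
    while remaining:
        nxt = max(sorted(remaining), key=lambda i: benefit[built | {i}])
        schedule.append(nxt)
        remaining.remove(nxt)
        built = built | {nxt}
    return schedule


if __name__ == "__main__":
    greedy = greedy_schedule("abc", BENEFIT)
    print(greedy, schedule_value(greedy, BENEFIT))
    print(["a", "b", "c"], schedule_value(["a", "b", "c"], BENEFIT))
```

Greedy starts with c because it looks best in isolation, and so delays the interacting pair a, b. The gap in this toy table is modest; it is only meant to show the mechanism, not the (n – 1) factor quoted on the slide.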

Interaction-Aware Scheduling
• Scheduling can use the interaction graph
[Figure: interaction graph with connected components C1, C2, C3]
• Idea: First find optimal sub-schedules for each Ci
• Then choose the best interleaving of sub-schedules
• This heuristic avoids the pitfalls of greedy scheduling
• We can also show stronger performance guarantees

Conclusions
• Index interactions provide useful insights for physical design tuning
• The doi metric is an effective characterization of interaction relationships
• We can analyze interactions efficiently when the Index Benefit Graph has limited size
• Future work?
Thank You

Performance Evaluation
• QINTERACT implementation in Java
– Uses JDBC to connect to IBM DB2 database
• Experiments use 22 TPC-H benchmark queries
• We generate indices based on the DB2 advisor
– S_ALL = all indices recommended by DB2
– S_1C = indices in S_ALL with first column only
• We monitor the progress of the “serial” and “interleaved” approaches over time

Experimental Results
[Figure: two result plots, one for the S_ALL index set and one for the S_1C index set, each with a 0.1 threshold]

Applications
• QINTERACT returns doi(a,b) for all a,b
• We propose two applications of this information
– Visualizing index interactions
  • Illustrates the global interactions as a graph
  • Useful when manually tuning the index set
– Scheduling index construction
  • Want to choose when new indices will be created
  • Goal is to increase performance as quickly as possible
  • Knowledge of index interactions can help

Problem Statement
• Which indices in S interact?
• How strong are the interactions?
• The Degree of Interaction Problem: Compute doi(a,b) for all a,b ∈ S
• It may be useful to ignore “minor” interactions
• A threshold-based variant: Decide if doi(a,b) > τ for all a,b ∈ S

Index Selection
• Index selection problem:
– W = a query workload
– S = a set of indices relevant to W
– cost(M) = cost of W when indices M ⊆ S are available
– Want to find M ⊆ S to minimize cost(M)
• We can quantify the benefit of an index:
– a = any index
– X = set of other indices
– benefit(a, X) = cost(X) − cost(X ∪ {a})
• Does benefit(a, X) depend on X?
– If so, this is called index interaction

Future Work
• Expand our support for updates
• Implementation of visualization tool
• Experiments with materialization scheduling
• Incremental updates to doi function
• Exploring stronger assumptions on query optimization
– Efficient upper bounds on doi function?