Query Optimization and Perspectives

13
Query Optimization and Perspectives March 12 th , 2003

description

Query Optimization and Perspectives. March 12 th , 2003. Administration. Exam next Wednesday, 2:30pm. Special office hours next week will be announced. Problem. Given: a query R1 R2 … Rn Assume we have a function cost() that gives us the cost of every join tree - PowerPoint PPT Presentation

Transcript of Query Optimization and Perspectives

Page 1: Query Optimization and Perspectives

Query Optimization and Perspectives

March 12th, 2003

Page 2: Query Optimization and Perspectives

Administration

• Exam next Wednesday, 2:30pm.

• Special office hours next week will be announced.

Page 3: Query Optimization and Perspectives

Problem

• Given: a query R1 R2 … Rn

• Assume we have a function cost() that gives us the cost of every join tree

• Find the best join tree for the query

Page 4: Query Optimization and Perspectives

Dynamic Programming

• For each subquery Q ⊆ {R1, …, Rn} compute the following:– Size(Q)– A best plan for Q: Plan(Q)– The cost of that plan: Cost(Q)

Page 5: Query Optimization and Perspectives

Dynamic Programming

• Step 1: For each {Ri} do:– Size({Ri}) = B(Ri)– Plan({Ri}) = Ri– Cost({Ri}) = (cost of scanning Ri)

Page 6: Query Optimization and Perspectives

Dynamic Programming

• Step i: For each Q ⊆ {R1, …, Rn} of cardinality i do:– Compute Size(Q) – For every pair of subqueries Q’, Q’’

s.t. Q = Q’ U Q’’compute cost(Plan(Q’) Plan(Q’’))

– Cost(Q) = the smallest such cost– Plan(Q) = the corresponding plan

Page 7: Query Optimization and Perspectives

Dynamic Programming

• Return Plan({R1, …, Rn})

Page 8: Query Optimization and Perspectives

Dynamic Programming

• Summary: computes optimal plans for subqueries:– Step 1: {R1}, {R2}, …, {Rn}– Step 2: {R1, R2}, {R1, R3}, …, {Rn-1, Rn}– …– Step n: {R1, …, Rn}

• We used naïve size/cost estimations• In practice:

– more realistic size/cost estimations – heuristics for Reducing the Search Space

• Restrict to left linear trees• Restrict to trees “without cartesian product”

– need more than just one plan for each subquery:• “interesting orders”

Page 9: Query Optimization and Perspectives

Plan with Materialization

⋈⋈

⋈ T

R S

U

HashTable Srepeat read(R, x)

y join(HashTable, x)write(V1, y)

HashTable Trepeat read(V1, y)

z join(HashTable, y)write(V2, z)

HashTable Urepeat read(V2, z)

u join(HashTable, z)write(Answer, u)

HashTable Srepeat read(R, x)

y join(HashTable, x)write(V1, y)

HashTable Trepeat read(V1, y)

z join(HashTable, y)write(V2, z)

HashTable Urepeat read(V2, z)

u join(HashTable, z)write(Answer, u)

V1

V2

Page 10: Query Optimization and Perspectives

Plan with Pipelining

⋈⋈

⋈ T

R S

U

HashTable1 SHashTable2 THashTable3 Urepeat read(R, x)

y join(HashTable1, x) z join(HashTable2, y)u join(HashTable3, z)write(Answer, u)

HashTable1 SHashTable2 THashTable3 Urepeat read(R, x)

y join(HashTable1, x) z join(HashTable2, y)u join(HashTable3, z)write(Answer, u)

pipe

line

Page 11: Query Optimization and Perspectives

Key Lessons in Optimization

• There are many approaches and many details to consider in query optimization– Classic search/optimization problem!– Not completely solved yet!

• Main points to take away are:– Algebraic rules and their use in transformations

of queries.– Deciding on join ordering: System-R style

(Selinger style) optimization.– Estimating cost of plans and sizes of

intermediate results.

Page 12: Query Optimization and Perspectives

The Course in Perspective

• The relational data model, SQL– Views, updates, transactions

• Conceptual design– Expressing the constraints on the domain– Using them to get good schema designs

• XML– A format for data exchange. Has its own query

language.– Basis for web services.

Page 13: Query Optimization and Perspectives

Perspective (continued)

• Building a database system:– Storage and indexing– Query execution (join algorithms)– Query optimization

• Data integration and sharing:– Databases were created independently– Schemas need to be matched– Need to access multiple databases in one query.– Interoperability through web services.