Preventing suboptimal execution plans - AIOUG · Title: Top tips for getting optimal SQL execution...
Transcript of Preventing suboptimal execution plans - AIOUG · Title: Top tips for getting optimal SQL execution...
2 Copyright © 2011, Oracle and/or its affiliates. All rights
reserved.
Preventing suboptimal execution plans
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.3
Disclaimer
The goal of this session to provide you tips on how to correct common
problems identified from a SQL execution plans
This session will not provide you with sudden enlightenment, making
you an Optimizer expert or give you the power to tune SQL statements
with the flick of your wrist!
Assumes you have basic knowledge of SQL execution and how to read
an explain plan
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.4
Agenda Combating incorrect cardinality
Wrong access path selection
Wrong join type selection
Influencing join order
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.5
Cardinality is the estimated
number rows that will be returned
by each operation
Calculated via formulas based on
optimizer statistics
total num of rows
num of distinct values
The more complex the predicates
the more complicated the
cardinality calculation
Cardinality
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.6
Causes for incorrect cardinality estimates
Causes
No statistics or stale statistics
Data skew
Multiple single column predicates on a table
Function wrapped column
Multiple columns used in a join
Complicated expression
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.7
No statistics
If no statistics are present Optimizer uses dynamic sampling
A small number of blocks are read at compile time and statistics are
estimated from that
Results from dynamic sampling are not shared across queries until 12c
Not a good replacement to standard statistics
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.8
No statisticsDetected by dynamic sampling appearing in note section of plan
Means no stats gathered strong indicator this won’t be best possible plan
Solution gather statistics
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.9
Stale statistics
Statistics are considered stale when 10% or more of the rows in the
object have changed
– Changes include, inserts, updates, deletes etc
Query dictionary to check if statistics are stale
SELECT table_name, stale_stats FROM user_tab_ statistic
Detected by checking staleness or changing cardinality estimates
Table Name Stale_stats
Sales NO
Customers YES
Product --
Note if DML has occurred very recently use dbms_stats.flush_database_monitoring_info
No means Stats are good
Yes means Stats are stale
Null means no Stats
Solution gather statistics
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.10
Stale statistics cause “Out of Range” issues
Query: SELECT Sum(total_sales)
FROM Sales
WHERE sales_date=to_date(’02-AUG-2012’,’dd-mon-yyyy’);
Statistics for sales_date column show max value as ‘31-JUL-2012’
Optimizer checks if predicate falls between the min, max value of column
“Out of Range” means value supplied is outside the min, max range
Optimizer prorates cardinality based on the distance between the predicate value and the maximum value
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.11
Example of ‘Out of Range’
Column stats for TIME_ID say Min = 01-JAN-98 Max = 31-DEC-03
SELECT /*+ gather_plan_statistics */ p.prod_name, SUM(s.quantity_sold)
FROM sales3 s, products p
WHERE s.prod_id =p.prod_id
AND s.time_id=to_date(’31-JAN-04’,’…’);
GROUP BY p.prod_name ;
Optimizer assumes there are no rows for time_id=‘31-Jan_04’ because it is out size of [MIN,MAX] range
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.12
Copies stats from source partition
to destination partition
Adjusts min & max values for
partition column at both partition &
global level
Copies statistics of the dependent
objects
Columns, local indexes etc.
Does not update global indexes
CardinalityUSE DBMS_STATS.COPY_TABLE_STATS();
Sales Table
SALES_1995
:SALES_Q4_2003
SALES_Q1_2004
DBMS_STATS.COPY_TABLE_STATS
(‘SH’,
'SALES’,
'SALES_Q4_2003’,
'SALES_Q1_2004’);
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.13
Causes for incorrect cardinality estimates
Causes
No statistics or stale statistics
Data skew
Multiple single column predicates on a table
Function wrapped column
Multiple columns used in a join
Complicated expression
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.14
Data skew
A few values are more popular in a column than any other
Optimizer assumes an even distribution of row per distinct value
Default cardinality calculation is Total number of rows
Number of Distinct values
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.15
Data skew example
SELECT * FROM Employee
WHERE Job_id = VP;
CLERK7782CLARKCLERK7788SCOTT
VPKINGCLERK7521WARDCLERK7499ALLEN
CLERK6973SMITH
Job_idEmp_idLast_name
Employee Table
8739
There are 107 rows in the Employees table &19 distinct values for job_id => 107 = 5.63
19
NAME ENUM JOB
KOCHHAR 101 AD_VP
DE HAAN 102 AD_VP
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.16
Solution
Re-gather statistics setting method_opt parameter to gather a
histogram for all columns that have a data skew
EXEC DBMS_STATS.GATHER_TABLE_STATS(
’HR’,’EMPLOYEES’,method_opt => ‘FOR ALL COLUMNS SIZE AUTO’);
SELECT …FROM user_tab_col_statistics WHERE table_name = ’ EMPLOYEES';
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.17
Data skew solution
SELECT * FROM Employee
WHERE Job_id = VP;
CLERK7782CLARKCLERK7788SCOTT
VPKINGCLERK7521WARDCLERK7499ALLEN
CLERK6973SMITH
Job_idEmp_idLast_name
Employee Table
8739
Histogram tells Optimizer there’s a skew#rows X # buckets value is in 107 X 2 2
total # of buckets 107on
NAME ENUM JOB
KOCHHAR 101 AD_VP
DE HAAN 102 AD_VP
= =
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.18
Causes for incorrect cardinality estimates
Causes
No statistics or stale statistics
Data skew
Multiple single column predicates on a table
Function wrapped column
Multiple columns used in a join
Complicated expression
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.19
Optimizer under-estimatesthe number of rows returned
Columns state and country are correlated
Optimizer has no way of knowing about correlation
Optimizer assumes each additional predicate reduces number of rows returned
Multiple single column predicates
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.20
Most of our customers come from California in the USA
Therefore there is a skew in the data in these columns
Perhaps a histogram on the state and country columns would help
Multiple single column predicates
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.21
Histograms on the state and country columns helps but the estimate is still not as accurate as it could be
Optimizer still doesn’t know about the correlationState of California is only in the US
Multiple single column predicates
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.22
Histograms on the state and country columns helps but the estimate is still not as accurate as it could be
Optimizer still doesn’t know about the correlationState of California is only in the US
Multiple single column predicates
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.23
Extended statistics were not actually used
Histograms on the state and country columns provide more information then the column group which doesn’t have a histogram
Must create a histogram on the column if we want it to be used
Multiple single column predicates
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.24
Re-gathering statistics on the Customers table will enable the histogram on the column group automatically created as the column group will have been “seen” once since the last statistics gather
Multiple single column predicates
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.25
Extended statistics are used
Accurate cardinality estimate achieved
Multiple single column predicates
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.26
Causes for incorrect cardinality estimates
Causes
No statistics or stale statistics
Data skew
Multiple single column predicates on a table
Function wrapped column
Multiple columns used in a join
Complicated expression
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.27
Function wrapped column
SELECT * FROM Customers
WHERE UPPER(CUST_LAST_NAME) = ‘SMITH’;
Optimizer doesn’t know how function affects values in the column
Optimizer guesses the cardinality to be 1% of rows
SELECT count(*) FROM customers;
COUNT(*)
55500Cardinality estimate is 1% of the rows
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.28
Function wrapped column
New Column with system generated name
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.29
Causes for incorrect cardinality estimates
Causes
No statistics or stale statistics
Data skew
Multiple single column predicates on a table
Function wrapped column
Multiple columns used in a join
Complicated expression
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.30
Multiple join columns
Query: SELECT count(*)
FROM Sales s, Sales2 s2
WHERE s.cust_id = s2.cust_id
AND s.prod_id = s2.prod_id;
Cardinality calculation
#Rows T1 X #Rows T2 X Selectivity of P1 X Selectivity of P2……
#Rows T1 X #Rows T2 X 1 X 1
max((NDV T1.c1), (NDV T2.c1)) max((NDV T1.c2), (NDV T2.c2))
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.31
Query: SELECT count(*)
FROM Sales s, Sales2 s2
WHERE s.cust_id = s2.cust_id
AND s.prod_id = s2.prod_id;
Cardinality calculation=> 918843 X 918843 X 1 X 1
Max(7059,7059 ) Max(72,72 )
=> 918843 X 918843 X 1 X 1 = 1,661,143 rows
7059 72
Multiple join columns
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.32
Example of multiple column joins
Optimizer assumes each join condition eliminate rows
– That’s not the case here so estimate is 4 X off
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.33
Solution
Create extended statistics on the prod_id & cust_id columns for both
the sales and sales2 tables
Then re-gather statistics on both tables
Select dbms_stats.create_extended_stats(Null,’SALES’,’(prod_id,cust_id)’) From dual;
Exec dbms_stats.gather_table_stats( Null, ’ SALES’);
New Column with system generated name
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.34
Causes for incorrect cardinality estimates
Causes
No statistics or stale statistics
Data skew
Multiple single column predicates on a table
Function wrapped column
Multiple columns used in a join
Complicated expression
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.35
Complex expressions
Query: SELECT count(*)
FROM sales
WHERE cust_id < 2222
AND prod_id > 5;
Optimizer won’t use the column group statistics in this case because
predicates are non-equalities
Cardinality estimate used instead is
#Rows T1 X Selectivity of P1 X Selectivity of P2……
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.36
Complex expressions
Query: SELECT count(*)
FROM sales
WHERE cust_id < 2222
AND prod_id > 5;
Estimate is 10X off
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.37
Solution
Dynamic sampling is only solution for complex expressions
Alter session set optimizer_dynamic_sampling = 4;
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.38
Solutions to incorrect cardinality estimates
Causes Solution
Stale or missing statistics DBMS_STATS
Data Skew Create a histogram
Multiple single column predicates on a table Create a column group using DBMS_STATS.CREATE_EXTENDED_STATS
Function wrapped column Create extended statistics using DBMS_STATS.CREATE_EXTENDED_STATS
Multiple columns used in a join Create a column group on join columns
using DBMS_STATS.CREATE_EXTENDED_STAT
Complicated expression Use dynamic sampling level 4 or higher
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.39
Agenda Combating incorrect cardinality
Wrong access path selection
Wrong join type selection
Influencing join order
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.40
Access paths – Getting the dataAccess Path Explanation
Full table scan Reads all rows from table & filters out those that do not meet the where clause predicates. Used when
no index, DOP set etc
Table access by Rowid Rowid specifies the datafile & data block containing the row and the location of the row in that block.
Used if rowid supplied by index or in where clause
Index unique scan Only one row will be returned. Used when stmt contains a UNIQUE or a PRIMARY KEY constraint
that guarantees that only a single row is accessed
Index range scan Accesses adjacent index entries returns ROWID values Used with equality on non-unique indexes or
range predicate on unique index (<.>, between etc)
Index skip scan Skips the leading edge of the index & uses the rest Advantageous if there are few distinct values in
the leading column and many distinct values in the non-leading column
Full index scan Processes all leaf blocks of an index, but only enough branch blocks to find 1st leaf block. Used when
all necessary columns are in index & order by clause matches index struct or if sort merge join is done
Fast full index scan Scans all blocks in index used to replace a FTS when all necessary columns are in the index. Using
multi-block IO & can going parallel
Index joins Hash join of several indexes that together contain all the table columns that are referenced in the
query. Wont eliminate a sort operation
Bitmap indexes uses a bitmap for key values and a mapping function that converts each bit position to a rowid. Can
efficiently merge indexes that correspond to several conditions in a WHERE clause
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.41
Access path example 1
Table: MY_SALES
– 2,098,400 rows
– 5510 MB in size
Index PROD_BIG_CUST_IND on columns prod_id, big, cust_id
Query: SELECT prod_id, cust_id
FROM my_sales
WHERE prod_id = 141
AND cust_id < 8938;
What access path do you think Optimizer should pick?
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.42
Access path example 1
Optimizer picks an index range scan of prod_cust_big_ind
Elapse time is 4 seconds
Alternative is a full table scan
Elapse times is 22 seconds
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.43
Access path example 1
A nightly batch job set the degree of parallelism on my_sales to
100
– Last night the batch job failed half way through and did not complete
What access path do you think the Optimizer picks now?
Now the optimizer costs the table access 100 X cheaper
‒ But elapse times is 32 seconds
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.44
Solution
Don’t set the DOP on the object in a mixed workload environment
Prevent batch jobs interfering with online users by using
– An alter session command
Alter session force parallel query parallel 100;
– A parallel hint
SELECT /*+ parallel(s,100) */ count(*)
FROM My_Sales s
WHERE …..;
– Using AUTO DOP
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.45
Common access path issues
Issue Cause
Uses a table scan instead of index DOP on table but not index, value of MBRC
Picks wrong index Stale or missing statistics
Cost of full index access is cheaper than index
look up followed by table access
Picks index that matches most # of column
Bad Cardinality estimates See section one
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.46
Agenda Combating incorrect cardinality
Wrong access path selection
Wrong join type selection
Influencing join order
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.47
Join types - Join retrieve data from tables
Access Path Explanation
Nested Loops joins For every row in the outer table, Oracle accesses all the rows in the inner table Useful when joining small
subsets of data and there is an efficient way to access the second table (index look up)
Hash Joins The smaller of two tables is scan and resulting rows are used to build a hash table on the join key in memory.
The larger table is then scan, join column of the resulting rows are hashed and the values used to probe
the hash table to find the matching rows. Useful for larger tables & if equality predicates
Sort Merge joins Consists of two steps:
1. Sort join operation: Both the inputs are sorted on the join key.
2. Merge join operation: The sorted lists are merged together.
Useful when the join condition between two tables is an inequality condition
Cartesian Joins Joins every row from one data source with every row from the other data source, creating the Cartesian
Product of the two sets. Only good if tables are very small. Only choice if there is no join condition
specified in query
Outer Joins Returns all rows that satisfy the join condition and also returns all of the rows from the table without the (+)
for which no rows from the other table satisfy the join condition
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.48
Join type example 1
Query: SELECT count(*)
FROM sales s, customers c
WHERE s.cust_id = c.cust_id
AND substr(c.cust_state_province,1,2) = ‘CA’
AND c.country_id = c.country_id+0;
Execution Plan
Nested Loop
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.49
Join type example 1
What would the cost and buffer gets be if we hinted for HJ
Query: SELECT /*+ use_hash(c s) */ count(*)
FROM sales s, customers c
WHERE s.cust_id = c.cust_id
AND substr(c.cust_state_province,1,2) = ‘CA’
AND c.country_id = c.country_id+0;
Execution Plan
Nested Loop
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.50
Why did the optimizer get it so wrong? Underestimated cardinality on the left hand side of the join
Customers is left hand side of the join & predicates are
substr(c.cust_state_province,1,2) = ‘CA’
c.country_id = c.country_id+0;
Join type example 1Nested Loop
Function wrapped column
Additional predicate doesn’t eliminate rows
Solution:
Correct cardinality underestimation on the left hand side of join
Use extended statistics
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.51
What causes the wrong join type to be selected
Issue Cause
Nested loop selected
instead of hash join
Cardinality estimate on the left side is under estimated triggers
Nested loop to be selected
Hash join selected
instead of nested loop
In case of a hash join the Optimizer doesn’t taken into
consideration the benefit of caching. Rows on the left come in a
clustered fashion or ordered so the probe on the right side is not
as expensive
Cartesian Joins Cardinality underestimation
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.52
Agenda Combating incorrect cardinality
Wrong access path selection
Wrong join type selection
Influencing join order
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.53
Join order
If the join order is not correct, check the statistics, cardinality & access methods
1
2
3
Want to start with the table that reduce the result set the most
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.54
Join order
The order in which the tables are join in a multi table stmt
Ideally start with the table that will eliminate the most rows
Strongly affected by the access paths available
Some basic rules
Joins guaranteed to produce at most one row always go first
– Joins between two row sources that have only one row each
When outer joins are used the table with the outer join operator must
come after the other table in the predicate
If view merging is not possible all tables in view will be joined first
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.55
What causes the wrong join order
Causes
Incorrect single table cardinality estimates
Incorrect join cardinality estimates
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.56
More Information
Optimizer Blog
– http://blogs.oracle.com/optimizer
Oracle.com
– http://www.oracle.com/technetwork/database/focus-areas/bi-
datawarehousing/dbbi-tech-info-optmztn-092214.html