
Document Best Practices for Analyzing Objects

Best Practices for Analyzing Objects

Date: Monday, May 16, 2011

Copyright 2007 TUSC Page 1 of 26


Document Title: Best Practices for Analyzing Objects

Document Filename: best_practices_for_dbms_stats.doc

Configuration History

Version  Date Applied  Changes           Author(s)
01.00    Feb 2008      Initial document  Brian P Michael, Sr. Consultant

Distribution History

Version  Date  Name(s)
01.00


Table of Contents

1 PURPOSE OF THIS DOCUMENT
2 SUMMARY
3 TABLE AND INDEX STATISTICS
4 COLUMN STATISTICS
  4.1 WHEN SHOULD HISTOGRAMS BE CREATED
    4.1.1 Example: Small Table
    4.1.2 Example: Unique/Primary Keys
    4.1.3 Example: MyWebSite Field
    4.1.4 Example: Age Field
    4.1.5 Example: Name Field
  4.2 EVERY APPLICATION IS DIFFERENT
5 COLUMN WHERE CLAUSE USAGE
  5.1 USING SYS.COL_USAGE$
  5.2 MAINTAINING SYS.COL_USAGE$
6 CPU COST MODELING
7 DBMS_STATS
8 SETTING DBMS_STATS PARAMETERS
  8.1 USING DEFAULTS
  8.2 GETTING PARAMS
  8.3 CONSTANTS
  8.4 ESTIMATE_PERCENT
    8.4.1 AUTO_SAMPLE_SIZE
  8.5 CASCADE
  8.6 METHOD_OPT
  8.7 DEGREE
  8.8 GRANULARITY
9 COLLECTING TABLE STATISTICS
10 COLLECTING INDEX STATS
11 COLLECTING COLUMN STATS AND HISTOGRAMS
  11.1 METHOD_OPT
  11.2 METHOD_OPT SIZE
    11.2.1 SIZE N
    11.2.2 REPEAT
    11.2.3 AUTO
    11.2.4 SKEWONLY
  11.3 METHOD_OPT EXAMPLES
    11.3.1 FOR ALL COLUMNS
    11.3.2 FOR ALL COLUMNS SIZE 1 (note: this is the default value)
    11.3.3 FOR ALL COLUMNS SIZE 254
    11.3.4 FOR ALL INDEXED COLUMNS
    11.3.5 FOR ALL INDEXED COLUMNS SIZE 1
    11.3.6 FOR ALL HIDDEN COLUMNS
    11.3.7 FOR ALL HIDDEN COLUMNS SIZE 1
    11.3.8 FOR COLUMNS COL_A, COL_B
    11.3.9 FOR COLUMNS COL_A SIZE 1, COL_B SIZE 1
    11.3.10 FOR COLUMNS COL_A SIZE 5, COL_B SIZE AUTO, COL_C SIZE 200
    11.3.11 FOR COLUMNS COL_A SIZE AUTO, COL_B SIZE AUTO
    11.3.12 FOR ALL COLUMNS SIZE AUTO
    11.3.13 FOR COLUMNS SIZE AUTO COL_A, COL_B, COL_C
    11.3.14 FOR ALL COLUMNS SIZE SKEWONLY
    11.3.15 FOR ALL COLUMNS SIZE REPEAT
12 COLLECTING DICTIONARY AND FIXED OBJECT STATS
  12.1 FIXED OBJECTS STATS
  12.2 DICTIONARY STATS - STATISTICS ON SYS, SYSTEM AND OTHER ORACLE COMPONENTS
13 COLLECTING CPU COST MODELING STATS
  13.1 HOW DO I REVIEW CPU COST MODELING STATISTICS?
  13.2 SYS.AUX_STATS$
  13.3 HOW DO I COLLECT STATS FOR CPU COST MODELING?
  13.4 WHEN DO I COLLECT NEW SYSTEM STATS?
  13.5 SAVING AND RESTORING SYSTEM STATS
  13.6 VIEWING SAVED SYSTEM STATS
14 RETENTION OF PREVIOUSLY COLLECTED STATISTICS
  14.1 BACKING UP AND RESTORING STATISTICS USING STATTAB
    14.1.1 CREATING A STATTAB TABLE
    14.1.2 SAVING OFF STATISTICS - DBMS_STATS.EXPORT_/IMPORT
      14.1.2.1 Transferring Stats to Another Schema or Database
    14.1.3 BACKING UP USAGE INFORMATION
    14.1.4 VIEWING SAVED STATISTICS USING STATTAB
      14.1.4.1 Viewing Saved Table Statistics
      14.1.4.2 Viewing Saved Column Statistics
      14.1.4.3 Viewing Saved Index Statistics
      14.1.4.4 Viewing Saved CPU Statistics
      14.1.4.5 sys.aux_stats$
  14.2 USING 10G RETENTION OF STATISTICS
    14.2.1.1 Determining How Far Back We Can Restore From
    14.2.1.2 Getting and Setting the Retention Time
    14.2.2 RESTORING STATISTICS WITH 10G AUTO RETENTION
      14.2.2.1 Restoring Table Stats
      14.2.2.2 Restoring Dictionary Stats
      14.2.2.3 Restoring Database Stats
      14.2.2.4 Restoring Schema Stats
15 AUTOMATED STATS JOB
16 LOCKING AND UNLOCKING STATISTIC COLLECTIONS
17 LIMITATIONS OF DBMS_STATS
  17.1 CHAINED ROWS
  17.2 VALIDATE STRUCTURE
18 APPENDIX
  18.1 A Note from Metalink on Automatic Undo Retention
  18.2 BIBLIOGRAPHY


1 PURPOSE OF THIS DOCUMENT

This document describes best practices for collecting statistics for the Oracle Cost Based Optimizer.

2 SUMMARY

The Oracle Cost Based Optimizer (CBO) is a core part of the Oracle technology stack and makes a significant contribution to the overall performance of the database. The technology was originally obtained from Digital Equipment Corp following Oracle's purchase of the Rdb database system in 1992, and it has been refined and extended ever since. With Oracle 10g, the original Rule Based Optimizer (RBO) is de-supported. It is expected that in future releases the RBO will disappear altogether and the CBO will be the only query optimization technology available.

The CBO relies on table and object statistics to determine the optimal path for fulfilling a user's query. In Oracle releases prior to 10g, there are two types of optimizers that Oracle uses to create execution plans for queries. The RULE based optimizer, available in release 9i and lower, applies a set of fairly understandable rules, in serial order, to arrive at an execution plan.

The Cost Based Optimizer, which has been available since Oracle version 7.3, is the only optimizer available in future releases. The CBO uses statistics captured about an object to find the execution plan with the cheapest cost.

In 9i and 10g, additional statistics about the machine, its CPUs, and its I/O patterns can also be collected using CPU Cost Modeling.

To operate properly, the CBO must have accurate statistics in order to create the best, cheapest execution path for a query. This white paper helps clarify how to collect accurate statistics and which options to use.


3 TABLE AND INDEX STATISTICS

Oracle uses a number of statistics to determine the best execution path for any given query. Statistics such as the number of rows in a table, the number of blocks, and the average row size are all calculated during a table statistics collection.
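As a minimal sketch of such a collection (the schema and table names here are placeholders, not from this document), a basic table-level gather that also collects the table's index statistics might look like:

```sql
-- Hypothetical example: gather table and index statistics for one table.
-- SOME_SCHEMA and MYWEBSPACE are placeholder names.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    OWNNAME          => 'SOME_SCHEMA',
    TABNAME          => 'MYWEBSPACE',
    ESTIMATE_PERCENT => 100,     -- sample 100% of the rows
    CASCADE          => TRUE     -- also collect statistics on the indexes
  );
END;
/
```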

4 COLUMN STATISTICS

When Oracle calculates the estimated cardinality of an execution path, it assumes that each distinct column value points to the same number of rows as any other distinct value. If the data is highly skewed in favor of one column value over another, Oracle can use this information to obtain a closer estimate of the number of rows that will be returned.

To map the skewness of a column, Oracle uses two types of histograms: frequency and height-balanced (or equi-depth). Oracle limits the number of histogram buckets for either type to 254.

A frequency histogram is a precise model, based on exactly how many rows contain each single column value. A frequency histogram can be created only when there are fewer than 255 distinct values for a column. Oracle stores frequency histograms in two forms, and each form shows up slightly differently when querying histogram$. In form #1, Oracle creates one bucket for each distinct value and places the exact row count for that value in the bucket. In form #2, Oracle uses "bucket subtraction": each bucket is numbered with the cumulative total of rows up to and including the current value, and the actual column value is stored with it. Oracle then obtains the number of rows for each distinct value by subtracting the previous bucket number from the current bucket number. You can easily identify a bucket-subtraction histogram because the bucket numbers usually go beyond 255.

A height-balanced histogram (statistically known as an equi-depth histogram) is built by distributing the distinct column values across a known number of buckets so that each bucket covers roughly the same number of rows (hence equi-depth, or height-balanced).

As the number of distinct values approaches the number of rows, and when the number of rows is large, this model becomes very inaccurate.
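To see which type of histogram (if any) Oracle chose for a column, the standard dictionary views can be queried; this sketch assumes a hypothetical table named MYWEBSPACE with an AGE column:

```sql
-- Show histogram type and bucket count per column (MYWEBSPACE is a
-- placeholder name). HISTOGRAM reports FREQUENCY, HEIGHT BALANCED, or NONE.
SELECT column_name, num_distinct, num_buckets, histogram
FROM   user_tab_col_statistics
WHERE  table_name = 'MYWEBSPACE';

-- Individual bucket endpoints for one column:
SELECT endpoint_number, endpoint_value
FROM   user_tab_histograms
WHERE  table_name = 'MYWEBSPACE'
AND    column_name = 'AGE'
ORDER  BY endpoint_number;
```

In a bucket-subtraction frequency histogram, ENDPOINT_NUMBER is the cumulative row count described above, which is why its values can exceed 255.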

4.1 WHEN SHOULD HISTOGRAMS BE CREATED

Histograms should be created only when having one in place is likely to change a potential execution plan to another plan.

Here are a few examples, based upon the following table:

Create table mywebspace (
    MyUniqueKey  number,
    name         varchar2(20),
    address      varchar2(20),
    DOB          date,
    age          number,
    city         varchar2(30),
    state        varchar2(2),
    county       varchar2(30),
    mywebsite    varchar2(400)
);


Create unique index mykey on mywebspace(myuniquekey);
Create index DOBidx on mywebspace(DOB);
Create index ageidx on mywebspace(age);

The mywebsite field always starts with “http://www.myspace.com/personalwebsites/”

4.1.1 Example: Small Table

The table is very small, say, 10 rows.

Since all the rows probably fit in one block, we would only ever do one I/O operation to retrieve this data. A histogram would never be necessary and, in point of fact, indexes probably wouldn't be either.

4.1.2 Example: Unique/Primary Keys

Never create a histogram on any UNIQUE or PRIMARY KEY column. The data is 100% evenly distributed, with exactly one row per possible value.

4.1.3 Example: MyWebSite Field

When Oracle creates a histogram on a varchar2 field, only the first 32 characters of the string are used to build the histogram.

Since all mywebsite URLs start with exactly the same 40 characters, a histogram could not be used effectively.

Note also that if UTF8 or another multibyte character set is used, the substring is 16 characters.

4.1.4 Example: Age Field

Since this table is most likely about human beings, everyone in it is under 254 years old, so every possible age fits in a frequency histogram. A histogram on AGE might very well be valuable if age varies widely overall but there is a huge number of 18-year-olds in the table.

Even though an index exists on AGE, a histogram showing that skew lets the optimizer decide to use a FULL table scan when we are looking only for 18-year-olds.

4.1.5 Example: Name field

If the name field is queried as the only column in the WHERE clause, a FULL TABLE SCAN or INDEX RANGE scan would be used regardless of any histogram in place.

Assuming an index on the name field, and assuming the name values are fairly distinct across the data, the number of distinct names will most likely track closely to the number of rows in the table, so a histogram probably wouldn't change the execution plan or the cardinality estimate.

Therefore, there is no reason to create a histogram on this field.

4.2 EVERY APPLICATION IS DIFFERENT

Regardless of the examples above, every table and every column in each application responds differently to histograms.

The only guaranteed approach to histogram creation is thorough analysis of the application and of the execution plans for each query.


5 COLUMN WHERE CLAUSE USAGE

Oracle tracks every column used in a WHERE clause, persistently across database restarts, in the SYS.COL_USAGE$ table. This table stores the object# (from dba_objects), the column# (INTCOL#, matching COLUMN_ID from dba_tab_columns), and the timestamp at which the column was last used in a WHERE clause.

The table also tracks how each column was used in the clause: as an equality (a = b), an equijoin (tablea.a = tableb.b), a non-equijoin (tablea.a <> tableb.b), a range predicate, a LIKE predicate, or an IS NULL predicate.

SQL> desc sys.col_usage$
 Name                                      Null?    Type
 ----------------------------------------- -------- ------
 OBJ#                                               NUMBER
 INTCOL#                                            NUMBER
 EQUALITY_PREDS                                     NUMBER
 EQUIJOIN_PREDS                                     NUMBER
 NONEQUIJOIN_PREDS                                  NUMBER
 RANGE_PREDS                                        NUMBER
 LIKE_PREDS                                         NUMBER
 NULL_PREDS                                         NUMBER
 TIMESTAMP                                          DATE

Querying the table looks like the following:

      OBJ#    INTCOL# EQUALITY_PREDS EQUIJOIN_PREDS NONEQUIJOIN_PREDS RANGE_PREDS LIKE_PREDS NULL_PREDS TIMESTAMP
---------- ---------- -------------- -------------- ----------------- ----------- ---------- ---------- ---------
        72          1            444              0                 0           0          0          0 17-DEC-06
        72          2            444              0                 0           0          0          0 17-DEC-06
        72          3            444              0                 0           0          0          0 17-DEC-06
        73          1            444              0                 0           0          0          0 17-DEC-06

The above output can be joined to dba_objects and dba_tab_columns to see the table and column names. The result shows how each column is used in WHERE clauses, and the last date it was used in one.
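A sketch of that join (the column aliases are my own; the dictionary views themselves are standard, and querying SYS.COL_USAGE$ requires access to SYS objects):

```sql
-- Map COL_USAGE$ rows to table and column names.
SELECT o.owner,
       o.object_name        AS table_name,
       c.column_name,
       cu.equality_preds,
       cu.range_preds,
       cu.timestamp         AS last_used
FROM   sys.col_usage$  cu,
       dba_objects     o,
       dba_tab_columns c
WHERE  o.object_id  = cu.obj#
AND    c.owner      = o.owner
AND    c.table_name = o.object_name
AND    c.column_id  = cu.intcol#
ORDER  BY o.owner, o.object_name, c.column_name;
```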

5.1 USING SYS.COL_USAGE$

There are many ways this table can be used. I use the following query to help manually determine which fields should have histograms collected, making sure not to collect histograms on any column that is UNIQUE or a PRIMARY KEY.

SELECT TABLE_NAME, COLUMN_NAME, NUM_NULLS, NUM_DISTINCT
FROM USER_TAB_COLUMNS
WHERE (TABLE_NAME, COLUMN_NAME) IN (
    SELECT DISTINCT TABLE_NAME, COLUMN_NAME
    FROM (
        SELECT O.OBJECT_NAME TABLE_NAME, C.COLUMN_NAME, CU.TIMESTAMP LAST_USED
        FROM SYS.COL_USAGE$ CU, USER_OBJECTS O, USER_TAB_COLS C
        WHERE O.OBJECT_ID = CU.OBJ#
        AND C.COLUMN_ID = CU.INTCOL#
        AND C.TABLE_NAME = O.OBJECT_NAME
        AND C.DATA_TYPE NOT LIKE '%LOB%'
        AND (cu.equality_preds + cu.equijoin_preds + cu.nonequijoin_preds +
             cu.range_preds + cu.like_preds) <> 0
        AND cu.TIMESTAMP >= SYSDATE - 60
        AND c.COLUMN_ID <> 1
    ) COLUSAGE
    WHERE NOT EXISTS (
        SELECT 1
        FROM USER_CONSTRAINTS C, USER_CONS_COLUMNS CC
        WHERE C.CONSTRAINT_TYPE IN ('P')
        AND CC.CONSTRAINT_NAME = C.CONSTRAINT_NAME
        AND CC.OWNER = C.OWNER
        AND C.TABLE_NAME = COLUSAGE.TABLE_NAME
        AND CC.TABLE_NAME = C.TABLE_NAME
        AND CC.COLUMN_NAME = COLUSAGE.COLUMN_NAME
    )
)
AND NUM_DISTINCT > 0
ORDER BY TABLE_NAME, COLUMN_NAME

5.2 MAINTAINING SYS.COL_USAGE$

To consistently get good collections using "SIZE AUTO", very old information must be purged from the SYS.COL_USAGE$ table. Oracle does not maintain this table properly, and I have personally filed an enhancement request for a dbms_stats call that would maintain it.

Although Oracle does not support direct manipulation of the SYS tables, I have found that purging old information, i.e. records whose TIMESTAMP is more than six months old, helps both the results of the SIZE AUTO option and its performance.
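A sketch of such a purge follows. To repeat the caveat above: direct changes to SYS tables are unsupported by Oracle, so treat this as illustrative only, back up the table first, and test on a non-production system:

```sql
-- Unsupported, illustrative only: remove usage rows older than ~6 months
-- (180 days used as an approximation of "six months").
DELETE FROM sys.col_usage$
WHERE  timestamp < SYSDATE - 180;

COMMIT;
```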

6 CPU COST MODELING

Without CPU statistics, the cost-based model is based purely on I/O costing: the optimizer chooses the plan with the lowest estimated number of I/Os.

Starting in 9i, the optimizer includes CPU Cost Modeling, which adds a CPU component to the CBO costing and refines the I/O costs based upon the hardware's actual single block and multiblock read times.

In 10g, CPU Cost Modeling is turned on by default, although it uses defaults set by Oracle until system statistics are gathered by the DBA.

Once CPU Cost modeling is in place, the optimizer uses the following formula for costing execution plans:

The costing model is a formula that calculates the cost of any statement:

Cost = (#SRds * sreadtim + #MRds * mreadtim + #CPUCycles / cpuspeed) / sreadtim

where:
• #SRds is the number of single block reads
• #MRds is the number of multi block reads
• #CPUCycles is the number of CPU cycles
• sreadtim is the single block read time
• mreadtim is the multi block read time
• cpuspeed is the number of CPU cycles per second


7 DBMS_STATS

DBMS_STATS is the package used to generate cost based optimizer statistics for databases. The package can be broken down into the six categories below, with their associated functions and procedures.

• GATHER - procedures to GATHER statistics:
    GATHER_DATABASE_STATS Procedures
    GATHER_DICTIONARY_STATS Procedure
    GATHER_FIXED_OBJECTS_STATS Procedure
    GATHER_INDEX_STATS Procedure
    GATHER_SCHEMA_STATS Procedures
    GATHER_SYSTEM_STATS Procedure
    GATHER_TABLE_STATS Procedure

• DELETE - procedures to DELETE generated statistics:
    DELETE_COLUMN_STATS Procedure
    DELETE_DATABASE_STATS Procedure
    DELETE_DICTIONARY_STATS Procedure
    DELETE_FIXED_OBJECTS_STATS Procedure
    DELETE_INDEX_STATS Procedure
    DELETE_SCHEMA_STATS Procedure
    DELETE_SYSTEM_STATS Procedure
    DELETE_TABLE_STATS Procedure

• RETENTION - procedures to SAVE, RESTORE and TRANSFER statistics:
    RESTORE_DICTIONARY_STATS Procedure
    RESTORE_FIXED_OBJECTS_STATS Procedure
    RESTORE_SCHEMA_STATS Procedure
    RESTORE_SYSTEM_STATS Procedure
    RESTORE_TABLE_STATS Procedure

    EXPORT_COLUMN_STATS Procedure
    EXPORT_DATABASE_STATS Procedure
    EXPORT_DICTIONARY_STATS Procedure
    EXPORT_FIXED_OBJECTS_STATS Procedure
    EXPORT_INDEX_STATS Procedure
    EXPORT_SCHEMA_STATS Procedure
    EXPORT_SYSTEM_STATS Procedure
    EXPORT_TABLE_STATS Procedure

    IMPORT_COLUMN_STATS Procedure
    IMPORT_DATABASE_STATS Procedure
    IMPORT_DICTIONARY_STATS Procedure
    IMPORT_FIXED_OBJECTS_STATS Procedure
    IMPORT_INDEX_STATS Procedure
    IMPORT_SCHEMA_STATS Procedure
    IMPORT_SYSTEM_STATS Procedure
    IMPORT_TABLE_STATS Procedure


    CREATE_STAT_TABLE Procedure
    DROP_STAT_TABLE Procedure
    PURGE_STATS Procedure
    GET_STATS_HISTORY_RETENTION Function
    GET_STATS_HISTORY_AVAILABILITY Function
    ALTER_STATS_HISTORY_RETENTION Procedure

• LOCKING - procedures to LOCK and UNLOCK statistics:
    LOCK_SCHEMA_STATS Procedure
    LOCK_TABLE_STATS Procedure

    UNLOCK_SCHEMA_STATS Procedure
    UNLOCK_TABLE_STATS Procedure

• DEFAULTS - procedures to modify package DEFAULTS:
    RESET_PARAM_DEFAULTS Procedure
    SET_PARAM Procedure
    GET_PARAM Function

• MANUAL - procedures to manually create or manipulate statistics:
    PREPARE_COLUMN_VALUES Procedures
    PREPARE_COLUMN_VALUES_NVARCHAR2 Procedure
    PREPARE_COLUMN_VALUES_ROWID Procedure

    SET_COLUMN_STATS Procedures
    SET_INDEX_STATS Procedures
    SET_SYSTEM_STATS Procedure
    SET_TABLE_STATS Procedure

    GET_COLUMN_STATS Procedures
    GET_INDEX_STATS Procedures
    GET_SYSTEM_STATS Procedure
    GET_TABLE_STATS Procedure
    GENERATE_STATS Procedure

8 SETTING DBMS_STATS PARAMETERS

In 10g, default parameters can be set for the database. When parameters are specified explicitly in a call to dbms_stats, those parameters override the previously set defaults.

Recommended defaults:

DBMS_STATS.SET_PARAM('CASCADE','TRUE');
DBMS_STATS.SET_PARAM('ESTIMATE_PERCENT','100');
DBMS_STATS.SET_PARAM('DEGREE','NULL');
DBMS_STATS.SET_PARAM('METHOD_OPT','FOR ALL COLUMNS SIZE AUTO');
DBMS_STATS.SET_PARAM('NO_INVALIDATE','FALSE');
DBMS_STATS.SET_PARAM('GRANULARITY','ALL');


DBMS_STATS.SET_PARAM('AUTOSTATS_TARGET','AUTO');

AUTOSTATS_TARGET has three possible values:

DBMS_STATS.SET_PARAM('AUTOSTATS_TARGET','ALL');
DBMS_STATS.SET_PARAM('AUTOSTATS_TARGET','ORACLE');
DBMS_STATS.SET_PARAM('AUTOSTATS_TARGET','AUTO');

8.1 USING DEFAULTS

If your 10g defaults are set, calls to dbms_stats can be made very simply. For example, to collect schema stats with defaults:

DBMS_STATS.GATHER_SCHEMA_STATS(OWNNAME => 'SOME_SCHEMA');

In any case, the full options can still be specified:

DBMS_STATS.GATHER_SCHEMA_STATS(
    OWNNAME          => 'SOME_SCHEMA',
    ESTIMATE_PERCENT => 100,
    METHOD_OPT       => 'FOR ALL COLUMNS SIZE 1',
    DEGREE           => 4,
    GRANULARITY      => 'ALL',
    CASCADE          => TRUE
);

8.2 GETTING PARAMS

You can use a select statement to get the current PARAM values.

select
    'AUTOSTATS_TARGET:',  dbms_stats.get_param('AUTOSTATS_TARGET'),
    'GRANULARITY:',       dbms_stats.get_param('GRANULARITY'),
    'CASCADE:',           dbms_stats.get_param('CASCADE'),
    'ESTIMATE_PERCENT:',  dbms_stats.get_param('ESTIMATE_PERCENT'),
    'DEGREE:',            dbms_stats.get_param('DEGREE'),
    'METHOD_OPT:',        dbms_stats.get_param('METHOD_OPT'),
    'NO_INVALIDATE:',     dbms_stats.get_param('NO_INVALIDATE')
from dual
/

8.3 CONSTANTS

Use the following constant to indicate that auto-sample size algorithms should be used:

AUTO_SAMPLE_SIZE CONSTANT NUMBER;

The constant used to determine the system default degree of parallelism, based on the initialization parameters, is:

DEFAULT_DEGREE CONSTANT NUMBER;


Use the following constant to let Oracle select the degree of parallelism based on size of the object, number of CPUs and initialization parameters:

AUTO_DEGREE CONSTANT NUMBER;

Use the following constant to let Oracle decide whether to collect statistics for indexes or not:

AUTO_CASCADE CONSTANT BOOLEAN;

Use the following constant to let Oracle decide when to invalidate dependent cursors:

AUTO_INVALIDATE CONSTANT BOOLEAN;

8.4 ESTIMATE_PERCENT

ESTIMATE_PERCENT specifies what percentage of the table should be sampled to obtain the statistics. Higher sampling percentages, up to 100%, are best, but there are many documents that say collecting statistics on only 10-30% of the table is sufficient.

For most tables, especially tables/columns with fewer than 255 distinct values, it is quite important to collect statistics based upon 100% of the data in the table.

This is very, very important for the creation of frequency-based histograms on columns with < 255 distinct values. If a value is missing from the sampled rows, it will not get mapped to a frequency distribution.

For extremely large tables, and those tables where a sampling of the data will give a very good statistical representation of the table, estimating the statistics at different values can provide a good result in a more efficient manner.

I tend to recommend a 100% estimate of the data: always on small tables, and on databases where collecting at 100% can be done during the appropriate window. For very large objects, test at different levels. Keep in mind that a table with 100 million rows of data probably won't shift statistical representation readily.

8.4.1 auto_sample_size

If this value is used, Oracle will determine the best sample size for an object's statistics while performing the collection. There are different opinions as to what is best for a database, and every application is different.

I recommend setting ESTIMATE_PERCENT to 100% on all databases where collection at 100% is possible within the scheduled job's window. When the collections cannot be finished within the window, I suggest testing AUTO_SAMPLE_SIZE to see if good statistical measures can be obtained for your application. An alternative to using AUTO_SAMPLE_SIZE is to hard-set ESTIMATE_PERCENT or to break up the collection into separate jobs.
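As a sketch of a table-level collection that lets Oracle choose the sample size (the schema and table names here are placeholders, not from this paper):

```sql
-- Hypothetical example: let Oracle pick the sample size for one table
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    OWNNAME          => 'SOME_SCHEMA',
    TABNAME          => 'SOME_TABLE',
    ESTIMATE_PERCENT => DBMS_STATS.AUTO_SAMPLE_SIZE,
    CASCADE          => TRUE);
END;
/
```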

8.5 CASCADE


When CASCADE is set to TRUE, statistics will also be collected on all indexes on the table, but parallel index statistics creation CANNOT be used.

I prefer setting CASCADE=FALSE when doing very large tables, and calling the DBMS_STATS.GATHER_INDEX_STATS specifically for those objects.
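A sketch of that two-step pattern (the schema, table, and index names are illustrative placeholders):

```sql
BEGIN
  -- Gather the large table without cascading to its indexes
  DBMS_STATS.GATHER_TABLE_STATS(
    OWNNAME          => 'SOME_SCHEMA',
    TABNAME          => 'BIG_TABLE',
    ESTIMATE_PERCENT => 100,
    CASCADE          => FALSE);
  -- Then gather each index separately, with parallelism
  DBMS_STATS.GATHER_INDEX_STATS(
    OWNNAME => 'SOME_SCHEMA',
    INDNAME => 'BIG_TABLE_IDX1',
    DEGREE  => 4);
END;
/
```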

8.6 METHOD_OPT

The METHOD_OPT parameter tells the DBMS_STATS routine whether or not to create histograms for table columns and how to go about doing it.

The default value for METHOD_OPT calculates column statistics with no histograms.

See the discussion for COLUMN STATISTICS below.

8.7 DEGREE

The DEGREE option sets the degree of parallelism for the collection. Since collections typically run in serial, set this to the number of CPUs in the system, provided the system is normally quiet during statistics collection. Otherwise, set this to a reasonable number so as not to over-parallelize the collection.

8.8 GRANULARITY

The GRANULARITY option, which applies only to partitioned and sub-partitioned tables, defines which level of statistics will be collected.

There are 6 levels: AUTO, ALL, GLOBAL, PARTITION, SUBPARTITION, GLOBAL AND PARTITION.

When using partitions, be very careful about setting METHOD_OPT.

I recommend using multiple passes of collection on large, partitioned objects:

1. Collect GLOBAL statistics on the named table with/without histograms (METHOD_OPT).
2. Loop through sub-partitions, collecting SUBPARTITION statistics with the default METHOD_OPT.
3. Loop through partitions specifically, collecting PARTITION statistics individually with the default METHOD_OPT.

By breaking the job down, I find the collections finish faster, and with fewer problems, especially with sort space.
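One possible shape for the multi-pass job, assuming a partitioned table (all object names are illustrative):

```sql
BEGIN
  -- Pass 1: GLOBAL statistics, with or without histograms via METHOD_OPT
  DBMS_STATS.GATHER_TABLE_STATS(
    OWNNAME     => 'SOME_SCHEMA',
    TABNAME     => 'PART_TAB',
    GRANULARITY => 'GLOBAL');
  -- Pass 2: each partition individually with the default METHOD_OPT
  FOR p IN (SELECT partition_name
            FROM   dba_tab_partitions
            WHERE  table_owner = 'SOME_SCHEMA'
            AND    table_name  = 'PART_TAB') LOOP
    DBMS_STATS.GATHER_TABLE_STATS(
      OWNNAME     => 'SOME_SCHEMA',
      TABNAME     => 'PART_TAB',
      PARTNAME    => p.partition_name,
      GRANULARITY => 'PARTITION');
  END LOOP;
END;
/
```

A similar loop over dba_tab_subpartitions covers the SUBPARTITION pass.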

9 COLLECTING TABLE STATISTICS

Table statistics should always be collected at the partition and sub-partition level when applicable. To collect table level statistics, use DBMS_STATS.GATHER_TABLE_STATS.

DBMS_STATS.GATHER_TABLE_STATS(
   OWNNAME          => 'ABC',
   TABNAME          => 'MY_TAB',
   ESTIMATE_PERCENT => 100,
   METHOD_OPT       => 'FOR ALL COLUMNS SIZE AUTO');

10 COLLECTING INDEX STATS


As with table statistics, options for setting DEGREE, GRANULARITY and ESTIMATE_PERCENT exist, as does the ability to gather at partition and sub-partition layers.

On very large databases, I recommend breaking the jobs down, manually calling GATHER_INDEX_STATS as appropriate.
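A minimal sketch of such a manual call (the object names are placeholders):

```sql
BEGIN
  DBMS_STATS.GATHER_INDEX_STATS(
    OWNNAME          => 'SOME_SCHEMA',
    INDNAME          => 'SOME_INDEX',
    ESTIMATE_PERCENT => 100,
    DEGREE           => 4);
END;
/
```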

11 COLLECTING COLUMN STATS AND HISTOGRAMS

11.1 METHOD_OPT

The METHOD_OPT parameter in the DBMS_STATS.GATHER_TABLE_STATS allows for refinement of histogram collection. There are many articles that tell you to use AUTO all the time, but I find that each application tends to be different.

11.2 METHOD_OPT SIZE

There are 4 size options available.

11.2.1 size N

Using SIZE with a number other than 1 will create a histogram on the column with "up to" N buckets.

11.2.2 repeat

Oracle will “refresh” the current column statistics with the same number of buckets as currently used.

11.2.3 auto

Oracle will choose the number of buckets, including NOT creating a histogram, based upon the where clause usage data stored in sys.col_usage$ AND whether this column’s data is highly skewed.

AUTO is Oracle's preferred method, although it is not the default.

See section below on Sys.col_usage$

11.2.4 skewonly

Oracle will choose the number of buckets, including NOT creating a histogram, based upon whether this column’s data is highly skewed, only. It will not look at the columns where clause usage.

The difference between AUTO and SKEWONLY is simple. SKEWONLY does not review sys.col_usage$ and only looks at skewness. AUTO investigates both skewness and usage.

11.3 METHOD_OPT EXAMPLES

Below is a description of the different combinations of how this field can be utilized. Note carefully that when a METHOD_OPT parameter is used but no SIZE value is specified, SIZE 75 is the default value.

11.3.1 FOR ALL COLUMNS

COLLECTS COLUMN STATS AND DEFAULT 75 BUCKET HISTOGRAMS FOR EVERY COLUMN IN THE TABLE


11.3.2 FOR ALL COLUMNS SIZE 1 **** Note this is the default value

COLLECTS COLUMN STATS FOR EVERY COLUMN IN THE TABLE, NO HISTOGRAMS

11.3.3 FOR ALL COLUMNS SIZE 254

COLLECTS COLUMN STATS FOR EVERY COLUMN IN THE TABLE, AND UP TO 254 BUCKET HISTOGRAMS

11.3.4 FOR ALL INDEXED COLUMNS

COLLECTS COLUMN STATS FOR INDEXED COLUMNS ONLY, 75 BUCKET HISTOGRAMS

11.3.5 FOR ALL INDEXED COLUMNS SIZE 1

COLLECTS COLUMN STATS FOR INDEXED COLUMNS ONLY, NO HISTOGRAMS

11.3.6 FOR ALL HIDDEN COLUMNS

COLLECTS STATS ON HIDDEN COLUMNS FOR FUNCTION BASED INDEXES, 75 BUCKET HISTOGRAMS

11.3.7 FOR ALL HIDDEN COLUMNS SIZE 1

COLLECTS COLUMN STATS FOR HIDDEN COLUMNS FOR FUNCTION BASED INDEXES, NO HISTOGRAMS

11.3.8 FOR COLUMNS COL_A, COL_B

COLLECTS COLUMN STATS FOR EACH COLUMN LISTED, DEFAULT 75 BUCKET HISTOGRAM

11.3.9 FOR COLUMNS COL_A SIZE 1, COL_B SIZE 1

COLLECTS COLUMN STATS FOR EACH COLUMN LISTED, NO HISTOGRAM

11.3.10 FOR COLUMN COL_A SIZE 5, COL_B SIZE AUTO, COL_C SIZE 200

COLLECTS COLUMN STATS FOR EACH COLUMN LISTED AND WITH SPECIFIC HISTOGRAM BUCKET SIZES GIVEN FOR EACH COLUMN

11.3.11 FOR COLUMNS COL_A SIZE AUTO, COL_B SIZE AUTO

COLLECTS COLUMN STATS FOR EACH COLUMN LISTED, HISTOGRAMS BASED ON USAGE AND SKEW

11.3.12 FOR ALL COLUMNS SIZE AUTO

COLLECTS COLUMN STATS FOR ALL COLUMNS IN THE TABLE, HISTOGRAMS BASED ON SKEW AND WORKLOAD

11.3.13 FOR COLUMNS SIZE AUTO COL_A, COL_B, COL_C

COLLECTS COLUMN STATS FOR EACH COLUMN LISTED, HISTOGRAMS BASED ON USAGE AND SKEW

11.3.14 FOR ALL COLUMNS SIZE SKEWONLY

COLLECTS COLUMN STATS FOR ALL COLUMNS IN THE TABLE, HISTOGRAMS CREATED FOR SKEWED DATA ONLY.


11.3.15 FOR ALL COLUMNS SIZE REPEAT

RE-COLLECT COLUMN STATISTICS ON ALL COLUMNS THAT CURRENTLY HAVE STATISTICS. RE-COLLECT HISTOGRAMS ON THOSE COLUMNS THAT CURRENTLY HAVE HISTOGRAMS, AND ALSO USE THE SAME BUCKET SIZE.

12 COLLECTING DICTIONARY AND FIXED OBJECT STATS

Oracle recommends collecting statistics on the fixed objects and dictionary statistics.

In Oracle 9i, there was much debate on whether to gather stats on the sys objects. Some said yes, some said no. My personal experience was that collecting schema level stats in 9i didn’t work so well and was a bad idea.

In 10g, it is quite necessary to collect statistics on all components.

12.1 FIXED OBJECTS STATS

To collect statistics on the X$ (fixed) objects, run the following:

DBMS_STATS.GATHER_FIXED_OBJECTS_STATS;

I suggest doing this once a database has been fully populated and any time a significant amount of schema objects have been created.

To see your fixed objects stats, join the v$fixed_table view to tab_stats$.

select b.name, a.obj#, a.rowcnt, a.blkcnt, a.analyzetime, a.samplesize
from tab_stats$ a, v$fixed_table b
where a.obj# = b.object_id;

In addition, you can delete, export and import fixed objects stats from one system to another using dbms_stats.delete_fixed_objects_stats, dbms_stats.export_fixed_objects_stats and dbms_stats.import_fixed_objects_stats.
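For example, a sketch of saving fixed object stats into a previously created STATTAB (the STATID here is an arbitrary label of your choosing, and MYSCHEMA is a placeholder):

```sql
BEGIN
  -- Assumes STATTAB was created with DBMS_STATS.CREATE_STAT_TABLE
  DBMS_STATS.EXPORT_FIXED_OBJECTS_STATS(
    STATTAB => 'STATTAB',
    STATID  => 'FIXED_BASELINE',
    STATOWN => 'MYSCHEMA');
END;
/
```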

12.2 DICTIONARY STATS - STATISTICS ON SYS, SYSTEM and OTHER ORACLE COMPONENTS

There are many schemas that ship with the database today, drsys, cmsys, mdsys, wmsys, etc.

The documentation suggests that a call to dbms_stats.gather_dictionary_stats with no arguments will collect stats on all ‘SYS’,’SYSTEM’ and other dictionary schemas as listed in the “SCHEMA” column of dba_registry.

I have NOT found this to be the case.

It appears that, even though the stated default for the "OPTIONS" parameter is "GATHER", specifically setting the "OPTIONS" parameter obtains the correct collection. Leaving "OPTIONS" set to the default does not.

The documentation also states that you can individually collect statistics on the other components by specifically giving the comp_id (component id) from dba_registry to the call to gather stats. Without specifying the “OPTIONS” parameter, this also does not work as expected.


begin
  for c1rec in (select comp_id from dba_registry) loop
    DBMS_STATS.GATHER_DICTIONARY_STATS(COMP_ID => c1rec.COMP_ID, OPTIONS => 'GATHER');
  end loop;
end;
/

If selecting the "LAST_ANALYZED" column from dba_tables shows that the date is still old, verify the schema is listed in dba_registry. If not, try performing a dbms_stats.gather_schema_stats.

13 COLLECTING CPU COST MODELING STATS

13.1 HOW DO I REVIEW CPU COST MODELING STATISTICS?

CPU Cost modeling information can be viewed using sys.aux_stats$. These elements can also be directly manipulated and I have found that to be useful for testing different scenarios.

To determine if CPU Cost Modeling is active, verify that data is populated in this table. If cpuspeed is NOT populated, but cpuspeednw IS populated, then CPU Cost Modeling is turned on, but using Oracle defaults.

For CPU Cost Modeling to function properly, workload statistics must be captured using dbms_stats.gather_system_stats.

13.2 SYS.AUX_STATS$

Each record in the sys.aux_stats$ table holds a value for the CPU statistics. The values are defined below:

• iotfrspeed - I/O transfer speed in bytes for each millisecond
• ioseektim - seek time + latency time + operating system overhead time, in milliseconds
• sreadtim - average time to read single block (random read), in milliseconds
• mreadtim - average time to read an mbrc block at once (sequential read), in milliseconds
• cpuspeed - average number of CPU cycles for each second, in millions, captured for the workload (statistics collected using 'INTERVAL' or 'START' and 'STOP' options)
• cpuspeednw - average number of CPU cycles for each second, in millions, captured for the noworkload (statistics collected using 'NOWORKLOAD' option)
• mbrc - average multiblock read count for sequential read, in blocks
• maxthr - maximum I/O system throughput, in bytes/second
• slavethr - average slave I/O throughput, in bytes/second

*From the 10g Manual


13.3 HOW DO I COLLECT STATS FOR CPU COST MODELING?

A single call to dbms_stats.gather_system_stats with an appropriate interval of a few hours during an average workload is all that is required to collect statistics.

Statistics are gathered using the DBMS_STATS.GATHER_SYSTEM_STATS call using an interval period, or manually started and stopped.

To collect statistics for a 2 hour interval, run the following:

DBMS_STATS.GATHER_SYSTEM_STATS (gathering_mode=>'INTERVAL', interval=>120);

13.4 WHEN DO I COLLECT NEW SYSTEM STATS?

CPU "System" stats should be re-collected whenever the average workload for the database shifts and whenever there is a change to CPUs and/or I/O hardware and patterns.

This includes most hardware upgrades, including adding HBA cards, NIC cards, CPUs, disk drives, and external RAID or SAN hardware and/or configuration that could have an impact on performance.

13.5 SAVING AND RESTORING SYSTEM STATS

In 9i, CPU cost modeling statistics can be exported using dbms_stats.export_system_stats and then re-imported using dbms_stats.import_system_stats.

In 10g, system stats can also be restored from recent past collections (as available), using dbms_stats.restore_system_stats.
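A sketch of the save-and-restore round trip (the STATID and owner are placeholder labels, and STATTAB must already exist):

```sql
BEGIN
  -- Save the current system stats under an identifier
  DBMS_STATS.EXPORT_SYSTEM_STATS(
    STATTAB => 'STATTAB',
    STATID  => 'SYS_BASELINE',
    STATOWN => 'MYSCHEMA');
  -- Later, bring that baseline back into the dictionary
  DBMS_STATS.IMPORT_SYSTEM_STATS(
    STATTAB => 'STATTAB',
    STATID  => 'SYS_BASELINE',
    STATOWN => 'MYSCHEMA');
END;
/
```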

13.6 VIEWING SAVED SYSTEM STATS

Collecting stats and saving different versions is convenient when workloads shift. It is quite easy to then compare different versions to see the effects of hardware or workload changes on the system stats.

Provided you have exported system stats using dbms_stats.export_system_stats, and provided a “STATID” for that collection, the following view can be used to view the contents of those stats.

CREATE OR REPLACE VIEW STATTAB_cpu_stats AS
SELECT CPU.STATID,
       CPU.C1 STATUS,
       CPU.C2 START_TIME,
       CPU.C3 STOP_TIME,
       CPU.N3 CPUSPEED,
       CPU.N11 MBRC,
       CASE CPU.C4 WHEN 'CPU_SERIO' THEN CPU.N1 END SREADTIME,
       CASE CPU.C4 WHEN 'CPU_SERIO' THEN CPU.N2 END MREADTIME,
       CASE PARIO.C4 WHEN 'PARIO' THEN PARIO.N1 END MAXTHR,
       CASE PARIO.C4 WHEN 'PARIO' THEN PARIO.N2 END SLAVTHR
FROM STATTAB CPU, STATTAB PARIO
WHERE CPU.TYPE = 'S'


AND CPU.C4 = 'CPU_SERIO'
AND CPU.STATID = PARIO.STATID
AND PARIO.C4 = 'PARIO';

14 RETENTION OF PREVIOUSLY COLLECTED STATISTICS

Time and time again, I encounter customers with problems where the database has become very, very slow. After analysis, it is revealed that someone recollected statistics last night, explain plans are not what they were yesterday, and there is no backup of yesterday's statistics. This was a major problem in 8i and 9i, but has mostly been erased with the default stats retention history in 10g.

ALWAYS, ALWAYS, ALWAYS – BACK UP YOUR STATISTICS BEFORE COLLECTION.

Oracle provides a table to store copies of statistics, and it can be populated easily using DBMS_STATS. Once in place, a simple call to DBMS_STATS.EXPORT_xxx_STATS should be used before the collection begins.

In 10g, a default 31 days of statistics history is kept. I still use EXPORT functions even in 10g.

14.1 BACKING UP AND RESTORING STATISTICS USING STATTAB

14.1.1 CREATING A STATTAB TABLE

Oracle provides a table to store copies of statistics. This is extremely convenient.

DBMS_STATS.CREATE_STAT_TABLE(
   ownname  => 'MYSCHEMA',
   stattab  => 'STATTAB',
   tblspace => 'TABLESPACE_NAME');

14.1.2 SAVING OFF STATISTICS – DBMS_STATS.EXPORT_/IMPORT

Use a simple call to DBMS_STATS.EXPORT_XXX_STATS to export either all database statistics, or a table, index, partition, etc., and even the CPU stats.
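For instance, a sketch of exporting one table's statistics before a recollection (the object names and STATID label are placeholders):

```sql
BEGIN
  DBMS_STATS.EXPORT_TABLE_STATS(
    OWNNAME => 'SOME_SCHEMA',
    TABNAME => 'SOME_TABLE',
    STATTAB => 'STATTAB',
    STATID  => 'PRE_COLLECT',
    CASCADE => TRUE);  -- include the table's index and column stats
END;
/
```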

14.1.2.1 Transferring Stats to Another Schema or Database

To move statistics to another database, export those statistics using DBMS_STATS.EXPORT_xxxx_STATS. Then copy the table to another database using exp, imp, datapump or db link. Then use DBMS_STATS.IMPORT_xxx_STATS to import those statistics to the data dictionary.

If the schema is different or some tables in the new database don't exist, YOU MUST manually manipulate the STATTAB table. To modify the schema these stats apply to, update STATTAB, setting the "C5" column as appropriate.

Delete rows for columns that do not belong, or for tables that do not belong. Use the views below to view the statistics in the STATTAB table.
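A hedged sketch of that manual manipulation, assuming the STATTAB column layout shown in the views below (the schema names and STATID are hypothetical):

```sql
-- Retarget saved stats to a different schema before importing
UPDATE myschema.stattab
   SET c5 = 'NEW_SCHEMA'
 WHERE statid = 'MY_EXPORT'
   AND c5 = 'OLD_SCHEMA';

-- Drop rows for a table that does not exist in the target database
DELETE FROM myschema.stattab
 WHERE statid = 'MY_EXPORT'
   AND c1 = 'TABLE_NOT_IN_TARGET';

COMMIT;
```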

14.1.3 BACKING UP USAGE INFORMATION

There are also 4 other tables that are worth storing: SYS.COL_USAGE$, SYS.AUX_STATS$, SYS.HISTOGRAM$ and V$OBJECT_USAGE.


It is good practice to make a backup copy of these tables on a regular basis, especially before a large collection.
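A simple sketch of such backups, using create-table-as-select with the table names as listed above (the backup schema and suffix are illustrative):

```sql
-- Hypothetical dated snapshots taken before a large collection
CREATE TABLE stats_bkp.col_usage_bkp AS SELECT * FROM sys.col_usage$;
CREATE TABLE stats_bkp.aux_stats_bkp AS SELECT * FROM sys.aux_stats$;
CREATE TABLE stats_bkp.histogram_bkp AS SELECT * FROM sys.histogram$;
CREATE TABLE stats_bkp.obj_usage_bkp AS SELECT * FROM v$object_usage;
```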

14.1.4 VIEWING SAVED STATISTICS USING STATTAB

The best way to understand and use the statistics in the STATTAB table is to use the views below.

To use the following views, add a where clause to select from the appropriate “STATID” that you wish to view.

14.1.4.1 Viewing Saved Table Statistics

CREATE OR REPLACE VIEW STATTAB_TABLE_STATS AS
SELECT STATID,
       C5 OWNER,
       C1 TABLE_NAME,
       C2 PARTITION_NAME,
       C3 SUBPART_NAME,
       N1 NUM_ROWS,
       N2 NUM_BLOCKS,
       N3 AVG_ROW_LEN,
       N4 SAMPLE_SIZE
FROM STATTAB
WHERE TYPE = 'T';

14.1.4.2 Viewing Saved Column Statistics

CREATE OR REPLACE VIEW STATTAB_COLUMN_STATS AS
SELECT STATID,
       C5 OWNER,
       C1 TABLE_NAME,
       C2 PARTITION_NAME,
       C3 SUBPART_NAME,
       C4 COLUMN_NAME,
       N1 NUM_DISTINCT,
       N2 DENSITY,
       N4 SAMPLE_SIZE,
       N5 NUM_NULLS,
       N6 LO_VAL,
       N7 HI_VAL,
       N8 AVG_COL_LEN,
       N10 ENDPOINT_NUMBER,
       N11 ENDPOINT_VALUE
FROM STATTAB
WHERE TYPE = 'C';

14.1.4.3 Viewing Saved Index Statistics

CREATE OR REPLACE VIEW STATTAB_INDEX_STATS AS
SELECT STATID,
       C5 OWNER,
       C1 INDEX_NAME,
       C2 PARTITION_NAME,
       C3 SUBPART_NAME,
       N1 NUM_ROWS,
       N2 LEAF_BLOCKS,
       N3 DISTINCT_KEYS,


       N4 LEAF_BLOCKS_PER_KEY,
       N5 DATA_BLOCKS_PER_KEY,
       N6 CLUSTERING_FACTOR,
       N7 BLEVEL,
       N8 SAMPLE_SIZE
FROM STATTAB
WHERE TYPE = 'I';

14.1.4.4 Viewing Saved CPU Statistics

CREATE OR REPLACE VIEW STATTAB_cpu_stats AS
SELECT CPU.STATID,
       CPU.C1 STATUS,
       CPU.C2 START_TIME,
       CPU.C3 STOP_TIME,
       CPU.N3 CPUSPEED,
       CPU.N11 MBRC,
       CASE CPU.C4 WHEN 'CPU_SERIO' THEN CPU.N1 END SREADTIME,
       CASE CPU.C4 WHEN 'CPU_SERIO' THEN CPU.N2 END MREADTIME,
       CASE PARIO.C4 WHEN 'PARIO' THEN PARIO.N1 END MAXTHR,
       CASE PARIO.C4 WHEN 'PARIO' THEN PARIO.N2 END SLAVTHR
FROM STATTAB CPU, STATTAB PARIO
WHERE CPU.TYPE = 'S'
AND CPU.C4 = 'CPU_SERIO'
AND CPU.STATID = PARIO.STATID
AND PARIO.C4 = 'PARIO';


14.2 USING 10G RETENTION OF STATISTICS

In 10g, backups of statistics are kept automatically.

14.2.1.1 Determining How far back we can restore from

Oracle 10g maintains availability of statistics for a default period of 31 days.

To identify the retention period and availability of statistics, queries can be run using Dbms_stats.GET_STATS_HISTORY_AVAILABILITY and Dbms_stats.GET_STATS_HISTORY_RETENTION against dual.

SQL> select dbms_stats.get_stats_history_availability from dual;

GET_STATS_HISTORY_AVAILABILITY
---------------------------------------------------------------------------
16-DEC-07 03.34.26.921000000 PM -06:00

14.2.1.2 Getting and Setting the Retention Time

The default retention time can be changed. To view the current retention time, the following query can be used:

SQL> select dbms_stats.get_stats_history_retention from dual;

GET_STATS_HISTORY_RETENTION
---------------------------
31

To modify the retention time, run the following:

SQL> exec dbms_stats.alter_stats_history_retention(# of Days);

14.2.2 RESTORING STATISTICS WITH 10G AUTO RETENTION

Below are examples of restoring statistics. Each call takes only a few parameters: mainly the object's owner, the object name and the timestamp you wish to restore from.

14.2.2.1 Restoring Table Stats

begin
  dbms_stats.restore_table_stats(
    'ESCROW1', 'ED_FILE_EXCEPTS',
    '01-JAN-08 10.00.00.000000000 AM -06:00');
end;
/


14.2.2.2 Restoring Dictionary Stats

begin
  dbms_stats.restore_dictionary_stats(
    '01-JAN-08 10.00.00.000000000 AM -06:00');
end;
/

14.2.2.3 Restoring Database Stats

begin
  dbms_stats.restore_database_stats(
    '01-JAN-08 10.00.00.000000000 AM -06:00');
end;
/

14.2.2.4 Restoring Schema Stats

begin
  dbms_stats.restore_schema_stats(
    'ESCROW1',
    '01-JAN-08 10.00.00.000000000 AM -06:00');
end;
/

15 AUTOMATED STATS JOB

In 10g+, a scheduled job exists to automatically gather and maintain statistics for the database. The script that creates this job is ORACLE_HOME/rdbms/admin/catmwin.sql.

The job name in the 10g scheduler is "GATHER_STATS_JOB". The job simply calls "dbms_stats.gather_database_stats_job_proc".

SELECT * FROM DBA_SCHEDULER_JOBS WHERE JOB_NAME = 'GATHER_STATS_JOB';

The program name that the job calls is "GATHER_STATS_PROG".

SELECT * FROM DBA_SCHEDULER_PROGRAMS WHERE PROGRAM_NAME = 'GATHER_STATS_PROG';

To see the historical start and end times of the jobs:

select * from dba_optstat_operations order by end_time;

To see the job run details:

select * from dba_scheduler_job_run_details where job_name = 'GATHER_STATS_JOB';

To see the running jobs:

select * from dba_scheduler_running_jobs where job_name = 'GATHER_STATS_JOB';

To see the job logs:

select * from dba_scheduler_job_log where job_name = 'GATHER_STATS_JOB';

To disable/enable the job, you can use the dbms_scheduler routines.


BEGIN
  DBMS_SCHEDULER.ENABLE('GATHER_STATS_JOB');
  DBMS_SCHEDULER.DISABLE('GATHER_STATS_JOB');
END;
/

To change any of the constants that the job uses, you can set the following globals:

AUTO_SAMPLE_SIZE CONSTANT NUMBER;
DEFAULT_DEGREE CONSTANT NUMBER;
AUTO_DEGREE CONSTANT NUMBER;
AUTO_CASCADE CONSTANT BOOLEAN;
AUTO_INVALIDATE CONSTANT BOOLEAN;

16 LOCKING AND UNLOCKING STATISTIC COLLECTIONS

To keep the automated job, or other users from collecting statistics on objects, schema and table stats can be locked using DBMS_STATS.LOCK_xxx_STATS and DBMS_STATS.UNLOCK_xxx_STATS.

This is very useful if you have performed specific collections and do not want the automatic scheduled job to modify those collections.
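A sketch of locking and later unlocking one table's statistics (the object names are placeholders; schema-level LOCK_SCHEMA_STATS/UNLOCK_SCHEMA_STATS calls work the same way):

```sql
BEGIN
  -- Protect a carefully tuned collection from the nightly job
  DBMS_STATS.LOCK_TABLE_STATS(
    OWNNAME => 'SOME_SCHEMA', TABNAME => 'SOME_TABLE');
  -- ... later, to allow collection again:
  DBMS_STATS.UNLOCK_TABLE_STATS(
    OWNNAME => 'SOME_SCHEMA', TABNAME => 'SOME_TABLE');
END;
/
```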

17 LIMITATIONS OF DBMS_STATS

17.1 CHAINED ROWS

Periodically, an analyze for chained rows should be run on all tables. This is especially true when statspack shows a large number of "table fetch continued row" events. To analyze for chained rows, the older ANALYZE TABLE xxxx LIST CHAINED ROWS INTO CHAINED_ROWS command should be run.

This report will give you a list of all rows in the table that are chained, and the ROWIDs of those rows. It is important to fix tables with many chained rows by rebuilding those rows.
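A short sketch of the procedure (the table names are placeholders; the CHAINED_ROWS table itself is created by the utlchain.sql script shipped with the database):

```sql
-- Populate CHAINED_ROWS with the chained/migrated rows of one table
ANALYZE TABLE some_schema.some_table LIST CHAINED ROWS INTO chained_rows;

-- Review the ROWIDs found
SELECT head_rowid
FROM   chained_rows
WHERE  table_name = 'SOME_TABLE';
```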

Chained rows statistics DO NOT AFFECT the CBO and therefore have nothing to do with DBMS_STATS.

17.2 VALIDATE STRUCTURE

To validate the structure of an object, ANALYZE TABLE ... VALIDATE STRUCTURE must still be used.

DBMS_STATS only performs statistics collections that are relevant to CBO.

18 APPENDIX

18.1 A Note from Metalink on Automatic Undo Retention

When the undo tablespace is using NON-AUTOEXTEND datafiles, V$UNDOSTAT.TUNED_UNDORETENTION may be calculated too high, preventing undo blocks from being expired and reused. In extreme cases the undo tablespace could be filled to capacity by these unexpired blocks.

If this fix is applied, an alert may be posted on DBA_ALERT_HISTORY that advises to increase the space when it is not really necessary. If the user sets their own alert thresholds for undo tablespaces, the bug may prevent alerts from being produced.

Workaround:

alter system set "_smu_debug_mode" = 33554432;

This causes v$undostat.tuned_undoretention to be calculated as the maximum of:

• maxquerylen secs + 300
• undo_retention specified in init.ora

18.2 BIBLIOGRAPHY

The following documents were consulted in the preparation of this paper:

Cost-Based Oracle Fundamentals – Jonathan Lewis

Metalink Note 114671.1 - Gathering Statistics for the Cost Based Optimizer

Metalink Note 117203.1 - How to Use DBMS_STATS to Move Statistics to a Different Database

Metalink Note 159787.1 - 9i: Import STATISTICS=SAFE

Metalink Note 175258.1 - How to Compute Statistics on Partitioned Tables and Indexes

Metalink Note 236935.1 - Global statistics - An Explanation

Metalink Note 237293.1 - How to Move from ANALYZE to DBMS_STATS - Introduction

Metalink Note 237538.1 - How to Move from ANALYZE to DBMS_STATS on Partitioned Tables

Metalink Note 237901.1 - Gathering Schema or Database Statistics Automatically – Examples

Metalink Note 1031826.6 - Histograms: An Overview
