Nested loop join technique
Nested Loop Join Technique Part 1 (Table Pre-fetching)

Background

Table pre-fetching was introduced in Oracle 9i and is enabled by default. This approach improves the Nested Loop Join (NLJ) by reducing the logical IO of the query. In 10g we can control this behavior by setting a database parameter (_table_lookup_prefetch_size). It is annoying, actually, but another improvement was introduced in 11g, and in that version we have full control of this behavior simply by using SQL hints.

The objective of these test cases is to observe all of those behaviors (normal, table pre-fetching, and the newest table batching in 11g) when we have an NLJ in our query. I am going to compare the performance of unique and non-unique indexes on sorted and unsorted data, so in total we will have 4 test cases per batch. In this Part 1 I am going to run the test cases in 10g only (for the normal and table pre-fetching techniques), and I plan to rerun them against 11g in Part 2.

I take Randolf's exercise as my reference (http://oracle-randolf.blogspot.com/2011/07/logical-ioevolution-part-1-baseline.html). Please go to his blog and read the articles; they explain things very well, but I might have missed some parts, so if you have time to read them we can share the knowledge.

For monitoring (statistics, wait events, etc.), I am going to use Snapper version 4 by Tanel Poder (http://blog.tanelpoder.com/2013/02/18/manual-before-and-after-snapshot-support-in-snapper-v4/). Go to his blog as well; this guy is a genius and he has a lot of good stuff.

Jonathan Lewis also discusses the table pre-fetching technique in his book (Cost-Based Oracle Fundamentals).
Just to recap, the normal NLJ pseudo-code looks like this:

    begin
      for r_outer in (select rows from outer_table where ...) loop
        for r_inner in (select rows from inner_table where ...) loop
          output the selected columns from both tables
        end loop
      end loop
    end;

With the above code, the output will follow the order of the outer table. In practice, however, Oracle does not guarantee that the output will be ordered according to the outer table. I am not too interested in testing this theory, but you can see one example in this blog: http://dioncho.wordpress.com/2010/08/16/batching-nljoptimization-and-ordering/

The pseudo-code of the new NLJ technique is like the following:

    begin
      for r_outer in (select rows from outer_table where ...) loop
        for r_inner in (select rows from inner_table where ...) loop
          get the relevant rowid and put it in list
        end loop
        walk through the rowid list and scan the inner_table once to get all required data;
      end loop
    end;

Test Recipes

As a starting point, I will create 5 tables with 10,000 rows each and exactly 10 rows per block, using the MINIMIZE RECORDS_PER_BLOCK command. The purpose is to get round numbers that are easy to reason about. In addition to the tables, 4 indexes will be created on the 4 inner tables (all except DRIVEN). Each index will have BLEVEL=2 (I have to use PCTFREE=99 to force it), so the index height is 3 (ROOT, BRANCH, LEAF). Later in these test cases we will create a shorter index to see the impact on the query (logical reads should drop as the index gets shorter).

1. DRIVEN, the driving (outer) table. The table name should be DRIVER or DRIVING, but I mistakenly created it as DRIVEN and was already halfway through when I realized it.
2. T_UNIQ_SORTED, inner table with a unique index on the ID column and sorted data, to show the normal NLJ.
3. T_UNIQ_UNSORTED, inner table with a unique index on the ID column and unsorted data, to show the normal NLJ (created to see the difference between sorted and unsorted data).
4. T_NON_UNIQ_SORTED, inner table with a non-unique index on the ID column and sorted data, to show the new table pre-fetching behavior.
5. T_NON_UNIQ_UNSORTED, inner table with a non-unique index on the ID column and scattered/randomly ordered data, to show the new table pre-fetching behavior (created to see the differences between these techniques).

Attachments: create_tables.LST, recreate_index.LST, other_info.LST

Test Cases and Results

To make a fair comparison, I follow these steps in this exercise. The idea is to get as many blocks as possible into the buffer cache to minimize physical IO. I am too lazy to create an automated script, so I have done all these steps manually. Sometimes, due to unwanted load in my VM environment, I have to rerun the test to get good data with acceptable variation.

1. Flush the buffer_cache.
2. Warm up the buffer by:
   a. Selecting all data from the outer table, DRIVEN (full table scan).
   b. Scanning the inner table using index access (full index scan).
3. Begin the snapper process from a separate session.
4. Execute each test case (there are 4). Turn on event 10046 to trace SQL wait events and event 10200 to dump consistent gets activity.
5. End the snapper process.

Below are the scenarios that I prepared and followed to see how the engine does its work. Please check the attached XLS file below for the detailed results.

1. Normal NLJ against unique and non-unique indexes.
2. Pre-fetch NLJ against unique and non-unique indexes.
3. Compare the performance of an index with BLEVEL=2 and BLEVEL=1.
4. Compare the performance of random and sequential data distribution (scattered data).

Attachment: DBA series - Nested Loop Join Technique.xlsx

It's Number Time

With a basic understanding of the table and index statistics below, we expect to see around 30,000 consistent gets for the index (since we need to walk from root to branch to leaf to get the rowid) and 1,000 for
the table (assuming that Oracle still holds the buffer for every 10 consecutive rows) or 10,000 consistent gets (given that we have 10,000 rows in the table).

TABLE_NAME                       NUM_ROWS     BLOCKS AVG_ROW_LEN
------------------------------ ---------- ---------- -----------
DRIVEN                              10000       1000         204
T_UNIQ_UNSORTED                     10000       1000         204
T_NON_UNIQ_SORTED                   10000       1000         204
T_UNIQ_SORTED                       10000       1000         204
T_NON_UNIQ_UNSORTED                 10000       1000         204

INDEX_NAME                 CLUSTERING_FACTOR     BLEVEL LEAF_BLOCKS DISTINCT_KEYS
-------------------------- ----------------- ---------- ----------- -------------
T_UNIQ_UNSORTED_IDX                     9993          2       10000         10000
T_NON_UNIQ_UNSORTED_IDX                 9989          2       10000         10000
T_UNIQ_SORTED_IDX                       1000          2       10000         10000
T_NON_UNIQ_SORTED_IDX                   1000          2       10000         10000

Normal NLJ, Unique and Non-Unique Index

Let's start with the most basic one. Before we start this test, we need to disable the pre-fetching feature using the command below and bounce the instance. If everything is in place, we should see the execution plans below for both the unique and non-unique versions.

    alter system set "_table_lookup_prefetch_size"=0 scope=spfile;

[Execution plan: Unique Index]
[Execution plan: Non-Unique Index]

Reading the tkprof output for the unique index version, we see 20,668 consistent gets for the index access, followed by exactly 10,000 for the inner table (T_UNIQ_SORTED). In the non-unique version, we see 30,667 consistent gets for the index access and 10,000 for the inner table (T_NON_UNIQ_SORTED). In addition, we have 1,672 visits for the outer table (DRIVEN). So these figures do not match our expectation. To find out why, we need to enable event 10200 to dump consistent gets. The output of the event 10200 dump file is provided in the tabular attachment above, and we will look into it to see what happened.

Instead of 30,000 consistent gets for the index (as we expected in the
beginning), Oracle did only 20,668 (as reported in the tkprof output and also in the event 10200 dump file). In this case Oracle makes an optimization by pinning the ROOT buffers (only 668 consistent gets out of 10,000 in the rightmost table above). That makes sense, since the ROOT and BRANCH blocks are a kind of door or gate into the index data, which is in the LEAF blocks.

Moving to the table part, here we have an extra 400 consistent gets for T_UNIQ_SORTED (we actually have 1,000 blocks and 10,000 rows) and also an extra 267 for DRIVEN, which is inconsistent with the tkprof output. What I can say from this symptom is that some buffers might be read more than once. In the worst case we could have had 10,000 consistent gets for DRIVEN (one per row, even though it has only 1,000 blocks for 10,000 rows), so the extra 267 is considered small.

And why do we get inconsistent results between the session statistics and the tkprof output? As of now, all I can say is that the tkprof output may be affected by the table and index statistics (a product of Oracle's algorithm). Of course we need to confirm this by hacking the statistics and rerunning a few test cases (I will put it on my list).

Moving on to the non-unique index, we can finally spot where the difference of 10,000 consistent gets between the two versions comes from. We have 19,999 consistent gets for LEAF blocks, which means roughly 10,000 additional consistent gets. When we look at the consistent gets hierarchy table, after Oracle visits the inner table, it goes to the next leaf to check whether that leaf has the same value as the current leaf. This is extra work for Oracle when we have a non-unique index: it has to check whether the next leaf has the same value or not.
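The index arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the totals (20,668 and 30,667), the 668 ROOT gets, and the 19,999 LEAF gets come from the text, while the split of the remainder into exactly 10,000 BRANCH gets and 10,000 unique-index LEAF gets (one per probe) is an assumption that happens to reproduce the reported totals. The function and variable names are made up for illustration.

```python
# Back-of-the-envelope model of index consistent gets in the NLJ test.
# Totals and ROOT/LEAF figures are taken from the text; the BRANCH figure
# and the unique-index LEAF figure are ASSUMED to be one get per probe.

PROBES = 10_000      # one index probe per row of the outer table (DRIVEN)
INDEX_HEIGHT = 3     # BLEVEL=2 means ROOT, BRANCH, LEAF


def naive_index_gets(probes: int, height: int) -> int:
    """Gets expected if every probe walked every index level."""
    return probes * height


def total_index_gets(root: int, branch: int, leaf: int) -> int:
    """Sum the per-level consistent gets."""
    return root + branch + leaf


expected = naive_index_gets(PROBES, INDEX_HEIGHT)

# Unique index: ROOT gets drop to 668 (from the text).
unique_total = total_index_gets(root=668, branch=10_000, leaf=10_000)

# Non-unique index: 19,999 LEAF gets (from the text); ROOT and BRANCH are
# assumed to be the same as in the unique case.
non_unique_total = total_index_gets(root=668, branch=10_000, leaf=19_999)

extra_leaf_gets = non_unique_total - unique_total

print(expected, unique_total, non_unique_total, extra_leaf_gets)
```

Under these assumptions the model gives 30,000 for the naive expectation, 20,668 for the unique index and 30,667 for the non-unique index, so the gap between the two versions is 9,999 LEAF gets, which is the "additional 10,000" mentioned above.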
The unique index does not need this extra next-leaf check.

These are some other interesting statistics and wait events to compare:

consistent gets examination: related to unique index access. According to Randolf, this is a shortcut version of consistent gets, and it can reduce the number of latches required to access a buffer (I have to rerun this test and monitor the latch activity as well, maybe later).
index fetch by rowid: index unique scan.
index scan kdiixs1: index range scan.
buffer is (not) pinned count: part of Oracle's optimization to reduce consistent gets.
rows fetched via callback: observed only in the unique index scan, but I cannot find further information.
table scan blocks gotten: why is it only 1,670 blocks when we have 2 tables with 1,000 blocks each? This is due to the warm-up activity that is executed before the NLJ, so