Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42...
Transcript of Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42...
![Page 1: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/1.jpg)
Advanced Application Indexing and Index Tuning
![Page 2: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/2.jpg)
HighHigh--dimensional Indexingdimensional Indexing
• Requirementso Fast range/window query search (range queryo Fast similarity search
Similarity range queryK-nearest neighbour query (KNN query)
![Page 3: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/3.jpg)
Complex Objects Feature Vectors
Similarity Queries
Nasdaq
Feature extraction and transformation
Index construction
Index for range/ similarity Search
Feature Base Similarity Feature Base Similarity SearchSearch
![Page 4: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/4.jpg)
E.g. Function FotoFindFotoFind in
Similarity Search based on sample image in color composition
Retrieval by ColourRetrieval by Colour
Given a sample image
![Page 5: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/5.jpg)
Window/Range query: Retrieve data points fall within a given range along each dimension.Designed to support range retrieval, facilitate joins and similarity search (if applicable).
Query RequirementQuery Requirement
![Page 6: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/6.jpg)
• Similarity queries:Similarity range and KNN queries
• Similarity range query: Given a query point, find all data points within a given distance r to the query point.
•KNN query: Given a query point,find the K nearest neighbours,in distance to the point.
r
Kth NN
Query RequirementQuery Requirement
![Page 7: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/7.jpg)
Page accessesCPU
Computation of similarityComparisons
Cost FactorsCost Factors
![Page 8: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/8.jpg)
For dimensions which are independent and of equal importance: LP metrics
L1 metric -- Manhattan distance or city block distance⌧D1 = ∑ | xi - yi |
L2 metric -- Euclidean distance⌧D2 = √ ∑ (xi -yi)2
Histogram: quadratic distance matrix to take into account similarities between similar but not identical colours by using a ‘colorsimilarity matrix’ D = (X-Y)T A (X-Y)
Similarity MeasuresSimilarity Measures
![Page 9: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/9.jpg)
For dimensions that are interdependent and varying importance: Mahalanobis metric
D = (X - Y) T C-1 (X-Y)C is the covariance matrix of feature vectors
If feature dimensions are independent:D = ∑ (xi -yi)2 /ci
Similarity MeasuresSimilarity Measures
![Page 10: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/10.jpg)
Roots of Problems
s
s
50 100 d
1
•Data space is sparsely populated.•The probability of a point lying within a range query with side s is Pr[s] = sd
•when d =100, a range query with 0.95 side only selects 0.59% of the data points
![Page 11: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/11.jpg)
• Small selectivity range query yields large range side on each dimension. e.g. selectivity 0.1% on d-dimensional, uniformly distributed, data set [0,1], range side is .
e.g. d > 9, range side > 0.5d =30, range side = 0.7943 .
Roots of ProblemsRoots of Problems
![Page 12: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/12.jpg)
Roots of Problems
50 100 d
1
•The probability of a point lying within a spherical query Sphere(q, 0.5) is Pr[s] = √πd (1/2) d
(d/2)!
q
![Page 13: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/13.jpg)
Low distance contrast: as the dimensionality increases, the distance to the nearest neighbour approaches the distance to the farthest neighbor.Differennce between the nearest data point and the farthest data point reduces greatly (can be reasoned from the distance functions).
Roots of Problems
![Page 14: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/14.jpg)
Overlap Query Performance
Curse of Dimensionality Curse of Dimensionality on Ron R--treestrees
![Page 15: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/15.jpg)
•The performance of the R-tree based structures degrades with increasing dimensionality.
•Sequential scan is much more efficient.
Dimensionality Curse !Dimensionality Curse !
![Page 16: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/16.jpg)
Approaches to HighApproaches to High--dimensional dimensional IndexingIndexing
Filter-and-Refine ApproachesData PartitioningMetric-Based IndexingDimensionality Reduction
![Page 17: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/17.jpg)
Increase Fan-Out by increasing the node size, as in X-trees. --- still have the same behaviour of R-tree, and performance degrade as the volume of data increases drastically.
Mapping high-dimensional points into single-dimensional points by using space-filling curves methods or projection.
Transformation in high-dimensional space is expensive.
The number of sub-queries generated can be very big.
Sequential scan seems like a reasonable --- but, needs to search the whole data file -- affected volume is 100%.
Approaches to Solving the ProblemApproaches to Solving the Problem
![Page 18: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/18.jpg)
Euclidean distance:
Relationship between data points:
Basic DefinitionBasic Definition
Reference point Query point
data point
![Page 19: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/19.jpg)
Basic Concept of Basic Concept of iDistanceiDistanceIndexing points based on similarity
y = i * c + dist (Si, p)
S1 S2 S3 Sk Sk+1
Reference/anchor points
S1
S2
S3
. . .
d
S1+d c
![Page 20: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/20.jpg)
Selection of Reference Points
Space Partitioning Data Partitioning0.67 1.0
0.31
0.20
0.70
0
1.0
![Page 21: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/21.jpg)
This figure illustrates how the searching region is enlarged till getting K NNs.
A range in B+-tree leaf node
KNN SearchingKNN Searching
...
S1
S2
...
![Page 22: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/22.jpg)
Main Memory Indexing
![Page 23: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/23.jpg)
Quote from Asilomar report (1998)
“…the typical computing engine may have one terabyte of main memory. ‘Hot tables’and most indexes will be main-memory resident. This will require storage to be rethought. For example, B-tress are not the optimal indexing structure for main memory data.”
![Page 24: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/24.jpg)
Memory System
CPU
Registers
L1 Cache
CPU Die
L2 Cache
Main Memory
Harddisk
![Page 25: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/25.jpg)
Memory Hierarchy
-6M cycles40-200cycles5-20 cyclesMiss Penalty
DisksMemoryL2Backing Store
-32GB256K-8MB16-64 KBSize
4-64 KB32-64 B16-32 BBlock Size
MemoryL2 CacheL1 Cache
![Page 26: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/26.jpg)
Address translation
Translation Lookahead Buffer (TLB) is an essential part of modern processor.TLB contains recently used address translations.Poor memory reference locality causes more TLB misses.Cost of TLB miss is about 100 CPU cycles.
![Page 27: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/27.jpg)
T-tree (1986)
parent ptr
left child ptr right child ptr
T-tree has the same tree structure as AVL-tree.
data1bf
……mink maxkdataK
![Page 28: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/28.jpg)
CSB+-tree (sigmod’2000)
![Page 29: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/29.jpg)
Other Basic Indexes
Hash-tablesStatic table size, with overflow chains
...
M-1
0
Θ(1 + α)
![Page 30: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/30.jpg)
Other Basic Indexes
Bitmap Indexbitmap with bit position to indicate the presence of a valueCounting bits with 1’sBit-wise operationsMore compact than trees-- amenable to the use of compression techniques
Join indexes
![Page 31: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/31.jpg)
3 - Index Tuning 31
Index Tuning
Index issuesIndexes may be better or worse than scans Multi-table joins that run on for hours, because the wrong indexes are definedConcurrency control bottlenecksIndexes that are maintained and never used
![Page 32: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/32.jpg)
Information about indexes...
Application codesV$SQLAREA -- look for the one with high # of executionsINDEX_STATS: meta information about indexesHASH_AREA_SIZEHASH_MULTIBLOCK_IO_COUNT…home work
![Page 33: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/33.jpg)
Clustered / Non clustered index
Clustered index (primary index)
A clustered index on attribute X co-locates records whose X values are near to one another.
Non-clustered index (secondary index)
A non clustered index does not constrain table organization.There might be several non-clustered indexes per table.
Records Records
![Page 34: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/34.jpg)
3 - Index Tuning 34
Dense / Sparse Index
Sparse indexPointers are associated to pages
Dense indexPointers are associated to recordsNon clustered indexes are dense
P1 PiP2 record
record record
![Page 35: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/35.jpg)
3 - Index Tuning 35
Index Implementations in some major DBMS
SQL ServerB+-Tree data structureClustered indexes are sparseIndexes maintained as updates/insertions/deletes are performed
DB2B+-Tree data structure, spatial extender for R-treeClustered indexes are denseExplicit command for index reorganization
OracleB+-tree, hash, bitmap, spatial extender for R-Treeclustered index⌧Index organized table
(unique/clustered)⌧Clusters used when
creating tables.
TimesTen (Main-memory DBMS)
T-tree
![Page 36: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/36.jpg)
3 - Index Tuning 36
Types of Queries
Point Query
SELECT balanceFROM accountsWHERE number = 1023;
Multipoint Query
SELECT balanceFROM accountsWHERE branchnum = 100;
Range Query
SELECT numberFROM accountsWHERE balance > 10000 and balance <= 20000;
Prefix Match Query
SELECT *FROM employeesWHERE name = ‘J*’ ;
![Page 37: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/37.jpg)
3 - Index Tuning 37
More Types of Queries
Extremal Query
SELECT *FROM accountsWHERE balance =
max(select balance from accounts)
Ordering Query
SELECT *FROM accountsORDER BY balance;
Grouping Query
SELECT branchnum, avg(balance)FROM accountsGROUP BY branchnum;
Join Query
SELECT distinct branch.adresseFROM accounts, branchWHERE
accounts.branchnum = branch.number
and accounts.balance > 10000;
![Page 38: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/38.jpg)
Index Tuning -- data
Settings:employees(ssnum, name, lat, long, hundreds1,
hundreds2);clustered index c on employees(hundreds1)
with fillfactor = 100;
nonclustered index nc on employees (hundreds2);index nc3 on employees (ssnum, name, hundreds2); index nc4 on employees (lat, ssnum, name);
1000000 rows ; Cold bufferDual Xeon (550MHz,512Kb), 1Gb RAM, Internal RAID controller from Adaptec (80Mb), 4x18Gb drives (10000RPM), Windows 2000.
![Page 39: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/39.jpg)
Index Tuning -- operations
Operations:Update: update employees set name = ‘XXX’ where ssnum = ?;
Insert: insert into employees values (1003505,'polo94064',97.48,84.03,4700.55,3987.2);
Multipoint query: select * from employees where hundreds1= ?; select * from employees where hundreds2= ?;
Covered query: select ssnum, name, lat from employees;
Range Query: select * from employees where long between ? and ?;
Point Query: select * from employees where ssnum = ?
![Page 40: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/40.jpg)
3 - Index Tuning 40
Clustered Index
Multipoint query that returns 100 records out of 1000000.Cold bufferClustered index is twice as fast as non-clustered index and orders of magnitude faster than a scan.
0
0.2
0.4
0.6
0.8
1
SQLServer Oracle DB2
Thro
ughp
ut ra
tio
clustered nonclustered no index
![Page 41: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/41.jpg)
Positive Points of Clustering indexes
If the index is sparse, it has less points --less I/OsGood for multipoint queries
eg. Looking up names in telephone dir.
Good for equijoin. Why?Good for range, prefix match, and ordering queries
![Page 42: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/42.jpg)
3 - Index Tuning 42
Index “Face Lifts”
Index is created with fillfactor = 100.Insertions cause page splits and extra I/O for each queryMaintenance consists in dropping and recreating the indexWith maintenance performance is constant while performance degrades significantly if no maintenance is performed.
SQLServer
020406080
100
0 20 40 60 80 100
% Increase in Table Size
Thro
ughp
ut
(que
ries/
sec)
No maintenanceMaintenance
![Page 43: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/43.jpg)
3 - Index Tuning 43
Index Maintenance
In Oracle, clustered index are approximated by an index defined on a clustered tableNo automatic physical reorganizationIndex defined with pctfree= 0Overflow pages cause performance degradation
Oracle
0
5
10
15
20
0 20 40 60 80 100% Increase in Table Size
Thro
ughp
ut
(que
ries/
sec)
Nomaintenance
![Page 44: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/44.jpg)
3 - Index Tuning 44
Covering Index - defined
Select name from employee where department = “marketing”Good covering index would be on (department, name)Index on (name, department) less useful.Index on department alone moderately useful.
![Page 45: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/45.jpg)
3 - Index Tuning 45
Covering Index - impact
Covering index performs better than clustering index when first attributes of index are in the where clause and last attributes in the select.When attributes are not in order then performance is much worse.
0
10
20
30
40
50
60
70
SQLSe rv e r
Thro
ughp
ut (q
uerie
s/se
c) cov e ring
cov e ring - notorde re dnon cluste ring
cluste ring
![Page 46: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/46.jpg)
Positive/negative points of non-clustering indexes
Eliminate the need to access the underlying table
eg. Index on (A, B, C)Select B,C From R Where A=5.
Good if each query retrieves significantly fewer records than there are pages in DBMay not be good for multipoint queries
![Page 47: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/47.jpg)
Examples:
Table T has 50-bytes records and attribute A has 20 different values which are uniformly distributed. Page size=4K. Is a nonclustering index on A any good?Now the record size is 2Kbytes.
![Page 48: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/48.jpg)
3 - Index Tuning 48
Scan Can Sometimes Win
IBM DB2 v7.1 on Windows 2000Range QueryIf a query retrieves 10% of the records or more, scanning is often better than using a non-clustering non-covering index. Crossover > 10% when records are large or table is fragmented on disk – scan cost increases.
0 5 10 15 20 25% of se le cte d re cords
Thro
ughp
ut (q
uerie
s/se
c)
scannon clustering
![Page 49: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/49.jpg)
3 - Index Tuning 49
Index on Small Tables
Small table: 100 records, i.e., a few pages.Two concurrent processes perform updates (each process works for 10ms before it commits)No index: the table is scanned for each update. No concurrent updates.A clustered index allows to take advantage of row locking.
02468
1012141618
no index index
Thro
ughp
ut (u
pdat
es/s
ec)
![Page 50: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/50.jpg)
Bitmap vs. Hash vs. B+-Tree
Settings:employees(ssnum, name, lat, long, hundreds1,
hundreds2);create cluster c_hundreds (hundreds2 number(8)) PCTFREE 0;create cluster c_ssnum(ssnum integer) PCTFREE 0 size 60;
create cluster c_hundreds(hundreds2 number(8)) PCTFREE 0 HASHKEYS 1000 size 600;
create cluster c_ssnum(ssnum integer) PCTFREE 0 HASHKEYS 1000000 SIZE 60;
create bitmap index b on employees (hundreds2);create bitmap index b2 on employees (ssnum);
1000000 rows ; Cold bufferDual Xeon (550MHz,512Kb), 1Gb RAM, Internal RAID controller from Adaptec (80Mb), 4x18Gb drives (10000RPM), Windows 2000.
![Page 51: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/51.jpg)
3 - Index Tuning 51
Multipoint query: B-Tree, Hash Tree, Bitmap
There is an overflow chain in a hash indexIn a clustered B-Tree index records are on contiguous pages.Bitmap is proportional to size of table and non-clustered for record access.
Multipoint Queries
0
5
10
15
20
25
B-Tree Hash index Bitmap index
Thro
ughp
ut (q
uerie
s/se
c)
![Page 52: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/52.jpg)
3 - Index Tuning 52
Hash indexes don’t help when evaluating range queries
Hash index outperforms B-tree on point queries
Range Queries
0
0.1
0.2
0.3
0.4
0.5
B-Tree Hash index Bitmap index
Thro
ughp
ut (q
uerie
s/se
c)
B-Tree, Hash Tree, Bitmap
Point Queries
0
10
20
30
40
50
60
B-Tree hash index
Thro
ughp
ut(q
uerie
s/se
c)
![Page 53: Advanced Application Indexing and Index Tuningalgis/dsax/IndexTuning.pdf · 3 - Index Tuning 42 Index “Face Lifts” aIndex is created with fillfactor = 100. aInsertions cause page](https://reader030.fdocuments.us/reader030/viewer/2022040612/5f04faf47e708231d410a94a/html5/thumbnails/53.jpg)
Summary
Primary means to reduce search costs (I/O and CPU)Properties: robust, concurrent, scalable, efficientMost supported indexes: Hash, B+-trees, bitmap index, and R-treesTuning: Usage, Maintenance, Drop/Rebuild, index locking in buffer...