Indexing for Multidimensional Data: An Introduction
Advanced Data Structures
Jaruloj Chongstitvatana
Applications of Multidimensional Databases
• Databases with multiple-attribute keys
• Spatial databases
• Geographic information systems (GIS)
• Computer-aided design (CAD)
• Multimedia databases
• Medical applications
Characteristics of Good Index Structures
• Dynamic
• Operations
  – Queries
    • Point queries
    • Range queries
    • Spatial queries
  – Insert
  – Delete
• Simplicity
• Performance
  – Disk accesses
  – Running time
  – Storage utilization
    • Low % of wasted space
    • Memory
    • Disk
• Scalability
  – Data size
  – Data dimension
Why Hierarchical Structures
ADVANTAGES
• Allow the search to focus on the interesting subset of the data
• Eliminate useless searching
• Clean and simple implementation
DISADVANTAGES
• Parallelism
Types of Data
• Multi-dimensional point data
  – Database with multiple-attribute key
  – Point in 2D or 3D
• Interval data
• Multi-dimensional region data
• High-dimensional point data
  – Data mining
Comparison
Binary search tree
• Binary tree
• Unbalanced
• Organizes data
• Memory-based index
  – Measured by running time
• Practical memory size
B+ tree
• N-ary tree
• Height-balanced
• Organizes data space
• Disk-based index
  – Measured by the number of disk accesses
• Disk page size
Binary search tree
[Figure: binary search tree containing the keys 10, 4, 9, 20, 6, 7]
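A minimal sketch of why a plain binary search tree can end up unbalanced, assuming the six keys on this slide were inserted in the order they are listed (the slide does not state the insertion order; `Node`, `insert`, `height`, and `inorder` are illustrative names):

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key into an unbalanced binary search tree and return the root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root


def height(root):
    """Number of nodes on the longest root-to-leaf path."""
    if root is None:
        return 0
    return 1 + max(height(root.left), height(root.right))


def inorder(root):
    """Keys in sorted order."""
    if root is None:
        return []
    return inorder(root.left) + [root.key] + inorder(root.right)


root = None
for k in [10, 4, 9, 20, 6, 7]:  # assumed insertion order
    root = insert(root, k)

print(inorder(root))  # [4, 6, 7, 9, 10, 20]
print(height(root))   # 5, although 6 keys would fit in a tree of height 3
```

No rebalancing is done on insert, so the shape of the tree depends entirely on the order in which the keys arrive.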
B+ tree
[Figure: example B+ tree; node labels as extracted: 16 31 and 6 11 14 48 19 22]
B+ tree
• N-ary tree
• Increases the breadth of the tree to decrease its height
• Used for indexing large amounts of data (stored on disk)
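A rough calculation of how breadth trades against height, assuming every node is completely full and that visiting one level costs one disk access (both are simplifying assumptions; `min_height` is an illustrative name):

```python
def min_height(n_keys, fanout):
    """Smallest number of levels h such that fanout**h >= n_keys."""
    levels, capacity = 0, 1
    while capacity < n_keys:
        capacity *= fanout
        levels += 1
    return levels


n = 1_000_000
for m in (2, 10, 100, 1000):
    print(m, min_height(n, m))
# 2 20
# 10 6
# 100 3
# 1000 2
```

Under these assumptions, a million keys need about 20 levels with a binary tree but only 2 or 3 when each node has hundreds of children.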
Example
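[Figure: example B+ tree]
Root: 12 52 78
Second level: 4 8 | 19 26 37 46 | 60 69 | 83 91
Leaves: 0 1 2 | 5 6 7 | 8 9 11 12 | 13 14 17 19 | 20 21 22 26 | 27 28 31 35 | 38 44 45 | 49 50 | 54 56 57 59 60 | 61 62 66 67 | 70 71 76 77 | 79 80 81 82 83 | 85 86 90 | 93 95 97 98 99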
Properties of B+ trees
For an M-ary B+ tree:
• The root has up to M children.
• Non-leaf nodes store up to M-1 keys and, except for the root, have between M/2 and M children.
• All data items are stored at the leaves.
• All leaves have the same depth and store between L/2 and L data items.
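The properties above bound how many data items a tree of a given height can hold. The sketch below works this out under two assumptions that the slide does not state: M/2 and L/2 are rounded up, and the root has at least 2 children. The function name and the parameter values M = 5, L = 5 are illustrative only.

```python
from math import ceil


def item_bounds(m, l, internal_levels):
    """Return (min_items, max_items) for a tree with the given number of
    internal levels above the leaf level (internal_levels >= 1)."""
    # Maximum: every internal node has m children, every leaf holds l items.
    max_leaves = m ** internal_levels
    # Minimum: the root has 2 children (assumed), every other internal
    # node has ceil(m / 2) children, every leaf holds ceil(l / 2) items.
    min_leaves = 2 * ceil(m / 2) ** (internal_levels - 1)
    return min_leaves * ceil(l / 2), max_leaves * l


print(item_bounds(5, 5, 2))  # (18, 125)
print(item_bounds(5, 5, 3))  # (54, 625)
```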
Search
[Figure: the B+ tree from the Example slide]
Search for 66
• Root 12 52 78: 66 lies between 52 and 78, so follow the third pointer to node 60 69.
• Node 60 69: 66 lies between 60 and 69, so follow the middle pointer to leaf 61 62 66 67.
• Leaf 61 62 66 67: 66 is found.
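A minimal sketch of the search on the example tree. The leaf grouping is reconstructed from the flattened figure, and the routing rule (child i holds the keys up to and including separator i) is an assumption that fits the figure. The slide's third leaf reads "8 9 11 12", and the 8 is left out below so that every leaf obeys that rule. `TREE` and `search` are illustrative names.

```python
from bisect import bisect_left

# Internal nodes are (keys, children) tuples; leaves are plain sorted lists.
# Assumed routing rule: child i holds the keys <= keys[i] (and > keys[i-1]).
TREE = (
    [12, 52, 78],
    [
        # third leaf: the slide shows "8 9 11 12"; the 8 is omitted here
        ([4, 8], [[0, 1, 2], [5, 6, 7], [9, 11, 12]]),
        ([19, 26, 37, 46], [[13, 14, 17, 19], [20, 21, 22, 26],
                            [27, 28, 31, 35], [38, 44, 45], [49, 50]]),
        ([60, 69], [[54, 56, 57, 59, 60], [61, 62, 66, 67], [70, 71, 76, 77]]),
        ([83, 91], [[79, 80, 81, 82, 83], [85, 86, 90], [93, 95, 97, 98, 99]]),
    ],
)


def search(node, key):
    """Return (found, path), where path lists the nodes visited."""
    path = []
    while isinstance(node, tuple):       # descend through internal nodes
        keys, children = node
        path.append(keys)
        node = children[bisect_left(keys, key)]
    path.append(node)                    # the leaf that was reached
    i = bisect_left(node, key)
    return i < len(node) and node[i] == key, path


found, path = search(TREE, 66)
print(found)  # True
print(path)   # [[12, 52, 78], [60, 69], [61, 62, 66, 67]]
```

Each loop iteration corresponds to reading one node, so the number of disk accesses equals the number of levels visited.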
Insert
[Figure: the B+ tree from the Example slide]
Insert 55
• Split leaf (the target leaf 54 56 57 59 60 has no room for 55)
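A sketch of a leaf split, assuming a leaf capacity of L = 5 (the fullest leaves in the example hold 5 items) and assuming that the largest key of the left half becomes the separator passed to the parent. The slide states neither rule, and `insert_into_leaf` is an illustrative name.

```python
from bisect import insort

LEAF_CAPACITY = 5  # assumed: the fullest leaves in the example hold 5 items


def insert_into_leaf(leaf, key, capacity=LEAF_CAPACITY):
    """Insert key into a sorted leaf.

    Returns (items, None, None) if the leaf still fits, otherwise
    (left, separator, right) after splitting it in half.
    """
    items = list(leaf)
    insort(items, key)
    if len(items) <= capacity:
        return items, None, None
    mid = (len(items) + 1) // 2
    left, right = items[:mid], items[mid:]
    # Assumed rule: the largest key of the left half is the new separator.
    return left, left[-1], right


print(insert_into_leaf([54, 56, 57, 59, 60], 55))
# ([54, 55, 56], 56, [57, 59, 60])
```

The separator then has to be inserted into the parent node, which may itself overflow.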
Insert
[Figure: the B+ tree from the Example slide, with the leaf 27 28 31 35 now holding 27 28 31 35 36]
• Insert 32 into the leaf: split leaf
• Insert key 31 into the parent node: split node
• Insert key 31 into the next node up
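A sketch of how a split can propagate upward, assuming M = 5 (an internal node stores at most 4 keys, as in the example) and that the middle key of an overflowing internal node is pushed up to its parent. Child pointers are left out to keep it short, and `insert_key` is an illustrative name.

```python
from bisect import bisect_left

MAX_KEYS = 4  # assumed: M = 5, so an internal node stores up to M - 1 keys


def insert_key(keys, key, max_keys=MAX_KEYS):
    """Insert a separator key into an internal node's sorted key list.

    Returns (keys, None, None) if it fits, otherwise (left, middle, right),
    where middle is the key pushed up to the parent.
    """
    keys = list(keys)
    keys.insert(bisect_left(keys, key), key)
    if len(keys) <= max_keys:
        return keys, None, None
    mid = len(keys) // 2
    return keys[:mid], keys[mid], keys[mid + 1:]


# The leaf split caused by inserting 32 sends key 31 to the node 19 26 37 46.
left, up, right = insert_key([19, 26, 37, 46], 31)
print(left, up, right)               # [19, 26] 31 [37, 46]
# That node was full, so it splits and 31 moves up again, into the root.
print(insert_key([12, 52, 78], up))  # ([12, 31, 52, 78], None, None)
```

Here the root still has room, so the tree keeps its height. Had the root overflowed as well, a new root would be created and the tree would grow by one level.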
Handling multiple attributes
• Separate index structure for each attribute
  – Update all index structures for each record update.
  – Data are scattered across many disk pages.
[Figure: separate indexes a1, a2, a3, a4 over the data on disk]
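A small sketch of the bookkeeping this approach implies. Plain dictionaries stand in for the per-attribute index structures, and the class, its methods, and the sample records are all hypothetical.

```python
from collections import defaultdict

ATTRIBUTES = ("a1", "a2", "a3", "a4")


class MultiIndexTable:
    """Records plus one independent index per attribute (illustrative)."""

    def __init__(self):
        self.records = {}                      # record id -> attribute dict
        self.indexes = {a: defaultdict(set) for a in ATTRIBUTES}

    def insert(self, rid, record):
        self.records[rid] = record
        for a in ATTRIBUTES:                   # every index is touched
            self.indexes[a][record[a]].add(rid)

    def delete(self, rid):
        record = self.records.pop(rid)
        for a in ATTRIBUTES:                   # and again on delete
            self.indexes[a][record[a]].discard(rid)

    def lookup(self, **conditions):
        """Ids of records matching all attribute=value conditions."""
        id_sets = [self.indexes[a].get(v, set()) for a, v in conditions.items()]
        return set.intersection(*id_sets) if id_sets else set(self.records)


table = MultiIndexTable()
table.insert(1, {"a1": 5, "a2": "x", "a3": 2.0, "a4": True})
table.insert(2, {"a1": 5, "a2": "y", "a3": 2.0, "a4": False})
print(table.lookup(a1=5, a2="y"))  # {2}
```

A query on several attributes has to consult several indexes and intersect the results, and every insert or delete costs one update per index.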
Handling multiple attributes
• Bit interleaving
• Attribute interleaving
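A minimal sketch of bit interleaving for two attributes (the resulting key is often called a Z-order or Morton key). It assumes both attributes are unsigned integers of at most `bits` bits. The function names are illustrative, and attribute interleaving is not covered here.

```python
def interleave_bits(x, y, bits=16):
    """Combine two unsigned integers into one key by alternating their bits.

    Bit i of x goes to position 2*i, bit i of y to position 2*i + 1.
    """
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key


def deinterleave_bits(key, bits=16):
    """Inverse of interleave_bits."""
    x = y = 0
    for i in range(bits):
        x |= ((key >> (2 * i)) & 1) << i
        y |= ((key >> (2 * i + 1)) & 1) << i
    return x, y


print(interleave_bits(3, 5))   # 39  (x = 011, y = 101 -> 100111)
print(deinterleave_bits(39))   # (3, 5)
```

The combined key is a single number, so it can be stored in an ordinary one-dimensional index such as a B+ tree.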
Multiple-attribute indexing
• Quad-tree
• k-d tree
• k-d-B tree
• Grid file
• hB-tree
Issues
• Non-linear relationship
• Distance measure
• k-nearest-neighbor queries
Spatial Indexing
• R-tree
• R*-tree
• SKD-tree
Issues
• Non-linear ordering
• Spatial queries
• High cost of determining spatial relationship
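One common way to limit the cost of determining spatial relationships, used by R-tree-style indexes, is to compare minimum bounding rectangles first and run the exact geometric test only on the survivors. A sketch of that filter step, with hypothetical object names and coordinates:

```python
from typing import NamedTuple


class Rect(NamedTuple):
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def bounding_rect(points):
    """Minimum bounding rectangle of a non-empty list of (x, y) points."""
    xs, ys = zip(*points)
    return Rect(min(xs), min(ys), max(xs), max(ys))


def intersects(a, b):
    """True if the two rectangles share at least one point."""
    return (a.xmin <= b.xmax and b.xmin <= a.xmax and
            a.ymin <= b.ymax and b.ymin <= a.ymax)


def candidates(objects, query):
    """Filter step: names of objects whose bounding rectangle meets the query.

    An exact (more expensive) geometric test would then run only on these.
    """
    return [name for name, pts in objects.items()
            if intersects(bounding_rect(pts), query)]


objects = {
    "triangle": [(0, 0), (4, 0), (0, 3)],
    "segment": [(6, 6), (9, 8)],
}
print(candidates(objects, Rect(3, 1, 7, 5)))  # ['triangle']
```

In this example the triangle passes the filter even though it does not actually touch the query rectangle (the corner (3, 1) lies just outside its hypotenuse), which is why an exact test is still needed afterwards.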
High-dimensional Indexing
• SS-tree
• TV-tree
Issues: Curse of dimensionality
• Volume grows exponentially with dimension
• Partitioning in higher dimensions is coarser
• Distance measurement in higher dimensions is not practical
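Two small calculations that illustrate these issues, assuming data in a unit hypercube. The function names and the choices of eps = 0.05 and one million points are illustrative.

```python
def boundary_fraction(dim, eps=0.05):
    """Fraction of the unit hypercube lying within eps of its surface."""
    return 1.0 - (1.0 - 2.0 * eps) ** dim


def splittable_dims(n_points, per_cell=1):
    """How many dimensions can be halved before cells outnumber the data."""
    dims = 0
    while 2 ** (dims + 1) * per_cell <= n_points:
        dims += 1
    return dims


for d in (2, 10, 50, 100):
    print(d, round(boundary_fraction(d), 4))

print(splittable_dims(1_000_000))  # 19
```

As the dimension grows, almost all of the volume ends up near the surface of the cube. With a million points, a partition can halve at most 19 dimensions before there are more cells than points, so in a 100-dimensional space most dimensions are never split at all.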