CS232A: Database System Principles

INDEXING
Indexing

Given a condition on an attribute, find the qualifying records:
• Attr = value
The condition may also be a range, e.g.:
• Attr > value
• Attr >= value
Indexing

• Data structures used for quickly locating tuples that meet a specific type of condition
– Equality condition: find Movie tuples where Director=X
– Other conditions possible, e.g., range conditions: find Employee tuples where Salary>40 AND Salary<50
• Many types of indexes. Evaluate them on:
– Access time
– Insertion time
– Deletion time
– Disk space needed (esp. as it affects access time)
Topics

• Conventional indexes
• B-trees
• Hashing schemes
Terms and Distinctions

• Primary index
– the index on the attribute (a.k.a. search key) that determines the sequencing of the table
• Secondary index
– index on any other attribute
• Dense index
– every value of the indexed attribute appears in the index
• Sparse index
– many values do not appear
[Figure: a dense primary index: entries 10, 20, 30, ..., 120, one per record of the sequential file]
Dense and Sparse Primary Indexes

[Figure: over the same sequential file (10, 20, ..., 120), a dense primary index has one entry per record, while a sparse primary index has one entry per block: 10, 30, 50, 80, 100, 140, 160, 200]
Lookup rule for a sparse index: find the index record with the largest value that is less than or equal to the value we are looking for, then search the block it points to.

Dense index:
+ can tell if a value exists without accessing the file (consider projection)
+ better access to overflow records
Sparse index:
+ less index space

(more + and − in a while)
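The lookup rule above can be sketched as a binary search for the rightmost index entry whose key is ≤ the search key. The key and block values are the slides' example; the function name is ours.

```python
from bisect import bisect_right

def block_to_search(index_keys, index_blocks, key):
    # The sparse-index rule: take the index record with the largest
    # value that is <= the key, and search the block it points to.
    i = bisect_right(index_keys, key) - 1
    if i < 0:
        return None            # key is smaller than every key in the file
    return index_blocks[i]

# Sparse primary index from the slides: one entry per data block.
keys   = [10, 30, 50, 80, 100, 140, 160, 200]
blocks = ["B0", "B1", "B2", "B3", "B4", "B5", "B6", "B7"]
```

For example, a lookup of 40 lands in the block indexed by 30, which must then be scanned for the record.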
Sparse vs. Dense Tradeoff

• Sparse: less index space per record, so more of the index can be kept in memory
• Dense: can tell if any record exists without accessing the file

(Later:
– sparse better for insertions
– dense needed for secondary indexes)
Multi-Level Indexes

• Treat the index as a file and build an index on it
• “Two levels are usually sufficient. More than three levels are rare.”
• Q: Can we build a dense second-level index for a dense index?

[Figure: a two-level sparse index over the sequential file; the second level (10, 100, 250, 400, ...) indexes the blocks of the first-level index]
A Note on Pointers

• Record pointers consist of a block pointer and the position of the record in the block
• Using the block pointer only saves space, at no extra cost in disk accesses
Representation of Duplicate Values in Primary Indexes

• Index may point to the first instance of each value only

[Figure: file 10, 10, 10, 40, 40, 70, 70, 70, 100, 120; index entries 10, 40, 70, 100, each pointing to the first record with that value]
Deletion from Dense Index

[Figure: delete 40 and 80; the freed index entries and record slots are linked into lists of available entries hanging off a header]

• Deletion from a dense primary index file with no duplicate values is handled in the same way as deletion from a sequential file
• Q: What about deletion from a dense primary index with duplicates?
Deletion from Sparse Index

• if the deleted entry does not appear in the index, do nothing

[Figure: delete 40; the sparse index (10, 30, 50, 80, 100, 140, 160, 200) is unchanged]
Deletion from Sparse Index (cont’d)

• if the deleted entry does not appear in the index, do nothing
• if the deleted entry appears in the index, replace it with the next search-key value
– comment: we could leave the deleted value in the index, assuming that no part of the system assumes it still exists without checking the block

[Figure: delete 30; the index entry 30 is replaced by 40]
Deletion from Sparse Index (cont’d)

• if the deleted entry does not appear in the index, do nothing
• if the deleted entry appears in the index, replace it with the next search-key value
• unless the next search-key value has its own index entry; in this case delete the entry

[Figure: delete 40, then 30; since 50 already has its own entry, the entry for 30 is simply deleted]
Insertion in Sparse Index

• if no new block is created, then do nothing

[Figure: insert 35 into the block holding 10, 20, 30; there is room, so the index (10, 30, 50, 80, 100, 140, 160, 200) is unchanged]
Insertion in Sparse Index (cont’d)

• if no new block is created, then do nothing
• else create an overflow record
– Reorganize periodically
– Could we claim space of the next block?
– How often do we reorganize, and how expensive is it?
– B-trees offer convincing answers

[Figure: insert 15; the block holding 10, 20, 30 overflows]
Secondary Indexes

[Figure: a file sequenced on one field, with blocks of (sequence field, secondary key) pairs (50, 30), (70, 20), (40, 80), (10, 100), (60, 90); the file is not sorted on the secondary search key]
Secondary Indexes

• A sparse index on a secondary search key does not make sense!

[Figure: sparse index entries 30, 20, 80, 100, 90, ... over the unsorted file; the entries cannot direct a search]
Secondary Indexes

• Dense index on the secondary key, with a sparse high level

[Figure: dense first level 10, 20, 30, 40, 50, 60, 70, ... pointing to the records; sparse high level 10, 50, 90, ...]

The first level has to be dense; the next levels are sparse (as usual).
Duplicate Values & Secondary Indexes

[Figure: file blocks with (sequence field, secondary key) pairs (10, 20), (40, 20), (40, 10), (40, 10), (40, 30)]
Duplicate Values & Secondary Indexes

One option: repeat each key value in the index, once per record.

[Figure: dense index 10, 10, 10, 20, 20, 30, 40, 40, 40, 40, ...]

Problem: excess overhead!
• disk space
• search time
Duplicate Values & Secondary Indexes

Another option: lists of pointers; each key appears once, followed by pointers to all records with that key.

[Figure: index entries 10, 20, 30, 40, each with a list of record pointers]

Problem: variable-size records in the index!
Duplicate Values & Secondary Indexes

Yet another idea: chain records with the same key?

[Figure: index entries 10, 20, 30, 40, 50, 60, ...; each points to the first record with that key, and records with the same key are chained, the last link being λ]

Problems:
• Need to add fields to records, messes up maintenance
• Need to follow the chain to know the records
Duplicate Values & Secondary Indexes

The standard solution: buckets. The index entry for each key points to a bucket of record pointers.

[Figure: dense index 10, 20, 30, 40, 50, 60, ... pointing to buckets of record pointers]
Why the “Bucket” Idea Is Useful

Indexes on EMP(name, dept, year, ...):
– Name: primary
– Dept: secondary
– Year: secondary

• Enables the processing of queries working with pointers only.
• Very common technique in Information Retrieval.
Advantage of Buckets: Process Queries Using Pointers Only

Find employees of the Toys dept with 4 years in the company:

SELECT Name
FROM Employee
WHERE Dept = 'Toys' AND Year = 4

[Figure: Dept index with buckets for Toys, PCs, Pens, Suits; Year index with buckets for 1, 2, 3, 4; records: Aaron (Suits, 4), Helen (Pens, 3), Jack (PCs, 4), Jim (Toys, 4), Joe (Toys, 3), Nick (PCs, 2), Walt (Toys, 5), Yannis (Pens, 1)]

Intersect the Toys bucket and the Year=4 bucket to get the set of matching EMPs.
This Idea Is Used in Text Information Retrieval

[Figure: documents “...the cat is fat...”, “...my cat and my dog like each other...”, “...Fido the dog...”; the buckets for “cat” and “dog” point to the documents containing each word]

Buckets are known as inverted lists.
Information Retrieval (IR) Queries

• Find articles with “cat” and “dog”
– Intersect inverted lists
• Find articles with “cat” or “dog”
– Union inverted lists
• Find articles with “cat” and not “dog”
– Subtract the list of dog pointers from the list of cat pointers
• Find articles with “cat” in the title
• Find articles with “cat” and “dog” within 5 words
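Since inverted lists are kept in sorted order, the Boolean queries above reduce to merges of sorted pointer lists. A sketch; the article ids are made up for illustration:

```python
def intersect(a, b):
    # "cat AND dog": advance two cursors over the sorted lists
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i]); i += 1; j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out

def union(a, b):
    # "cat OR dog"
    return sorted(set(a) | set(b))

def subtract(a, b):
    # "cat AND NOT dog": drop cat postings that also appear in dog's list
    drop = set(b)
    return [x for x in a if x not in drop]

cat = [1, 2, 5, 8]   # assumed article ids, not from the slides
dog = [2, 3, 8, 9]
```

The two-cursor intersection touches each posting once, which is why keeping lists sorted pays off.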
Common Technique: More Info in the Inverted List

[Figure: each posting in the inverted list for “cat” carries the document (d1, d2, d3), the location type of the occurrence (Title, Author, Abstract), and its position (e.g., Title 5, Title 100, Author 10, Abstract 57, Title 12); similarly for “dog”]
Posting: an entry in an inverted list. Represents an occurrence of a term in an article.

Size of a list (in postings):
– 1: rare words or misspellings
– 10^6: common words

Size of a posting: 10-15 bits (compressed)
Vector Space Model

          w1 w2 w3 w4 w5 w6 w7 ...
DOC   = < 1  0  0  1  1  0  0 ...>
Query = < 0  0  1  1  0  0  0 ...>

PRODUCT = 1 + ... = score
• Tricks to weigh scores + normalize
e.g.: a match on a common word is not as useful as a match on rare words...
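A tiny sketch of such a weighting: score each matching term by an idf-style weight log(N/df), so a word that appears in every document contributes nothing. The corpus and the exact weighting are our illustration, not the slides'.

```python
import math

# Toy corpus: "the" appears in every document, "cat"/"dog" in two each.
docs = [{"the", "cat"}, {"the", "dog"}, {"the", "cat", "dog"}]
N = len(docs)

def idf(term):
    # rare terms get a large weight; a term in every doc gets weight 0
    df = sum(term in d for d in docs)
    return math.log(N / df) if df else 0.0

def score(query, doc):
    # weighted dot product of the binary term vectors
    return sum(idf(t) for t in query if t in doc)
```

Here a match on "the" adds nothing to the score, while a match on "cat" or "dog" does.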
Summary of Indexing So Far

• Basic topics in conventional indexes
– multiple levels
– sparse/dense
– duplicate keys and buckets
– deletion/insertion similar to sequential files
• Advantages
– simple algorithms
– index is a sequential file
• Disadvantages
– eventually sequentiality is lost because of overflows; reorganizations are needed
Example: Sequential Index

[Figure: a sequential index area with blocks 10 20 30 / 40 50 60 / 70 80 90 followed by continuous free space, and an overflow area (not sequential) holding 39 31 35 36 / 32 38 34 / 33]
Outline:

• Conventional indexes
• B-Trees ⇒ NEXT
• Hashing schemes

NEXT: Another type of index
– Give up on sequentiality of the index
– Try to get “balance”
B+Tree Example (n=3)

[Figure: root with key 100; its children hold keys 30 and 120, 150, 180; the leaves hold 3 5 11 / 30 35 / 100 101 110 / 120 130 / 150 156 179 / 180 200, linked by sequence pointers]
Sample Non-Leaf Node

[Figure: keys 57, 81, 95 with four pointers: to keys < 57, to keys 57 ≤ k < 81, to keys 81 ≤ k < 95, and to keys ≥ 95]
Sample Leaf Node

[Figure: keys 57, 81, 95, each with a pointer to the record with that key; a pointer arrives from the parent non-leaf node, and a sequence pointer leads to the next leaf]
In the textbook’s notation (n=3):

Leaf: [Figure: keys 30, 35 with record pointers and a sequence pointer]
Non-leaf: [Figure: key 30 with child pointers]
Size of nodes: n+1 pointers, n keys (fixed)
Non-root nodes have to be at least half-full

• Use at least:
– Non-leaf: ⌈(n+1)/2⌉ pointers
– Leaf: ⌊(n+1)/2⌋ pointers to data
Full node vs. minimum node (n=3):

[Figure: a full non-leaf holds keys 120, 150, 180; a minimum non-leaf holds key 30. A full leaf holds keys 3, 5, 11; a minimum leaf holds keys 30, 35]
B+tree rules (tree of order n)

(1) All leaves at the same lowest level (balanced tree)
(2) Pointers in leaves point to records, except for the “sequence pointer”
(3) Number of pointers/keys for a B+tree:

                     Max ptrs  Max keys  Min ptrs (→data for leaves)  Min keys
Non-leaf (non-root)  n+1       n         ⌈(n+1)/2⌉                    ⌈(n+1)/2⌉ − 1
Leaf (non-root)      n+1       n         ⌊(n+1)/2⌋                    ⌊(n+1)/2⌋
Root                 n+1       n         1                            1
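A lookup following these rules descends one node per level, routing by the key ranges shown for the sample non-leaf node. The node classes below are an illustrative in-memory stand-in for disk blocks, not the slides' representation.

```python
from bisect import bisect_left, bisect_right

class NonLeaf:
    def __init__(self, keys, children):
        self.keys, self.children = keys, children   # len(children) == len(keys)+1

class Leaf:
    def __init__(self, keys, recs):
        self.keys, self.recs = keys, recs           # parallel lists

def lookup(node, key):
    while isinstance(node, NonLeaf):
        # child 0 holds keys < keys[0]; child j holds keys[j-1] <= k < keys[j]
        node = node.children[bisect_right(node.keys, key)]
    i = bisect_left(node.keys, key)
    return node.recs[i] if i < len(node.keys) and node.keys[i] == key else None
```

On the example tree (root key 100, children with keys 30 and 120/150/180), this finds key 156 in three node visits.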
Insert into B+tree

(a) simple case: space available in leaf
(b) leaf overflow
(c) non-leaf overflow
(d) new root
(a) Insert key = 32 (n=3)

[Figure: the leaf 30, 31 has room; it becomes 30, 31, 32; parent key 30 and root key 100 are unchanged]
(b) Insert key = 7 (n=3)

[Figure: the leaf 3, 5, 11 overflows; it splits into 3, 5 and 7, 11, and key 7 is inserted into the parent]
(c) Insert key = 160 (n=3)

[Figure: inserting 160 splits the leaf 150, 156, 179 into 150, 156 and 160, 179; key 160 propagates into the parent 120, 150, 180, which overflows and splits, pushing a key up toward the root]
(d) New root: insert 45 (n=3)

[Figure: leaves 1 2 3 / 10 12 / 20 25 / 30 32 40 under the full root 10, 20, 30; inserting 45 splits the last leaf into 30, 32 and 40, 45, the root overflows, and a new root with key 30 is created]
Deletion from B+tree

(a) Simple case: no example
(b) Coalesce with neighbor (sibling)
(c) Re-distribute keys
(d) Cases (b) or (c) at non-leaf
(b) Coalesce with sibling: delete 50 (n=4)

[Figure: leaves 10 20 30 and 40 50 under parent keys 10, 40, 100; after deleting 50, the leaf holding 40 is below minimum and is coalesced with its sibling; key 40 is removed from the parent]
(c) Redistribute keys: delete 50 (n=4)

[Figure: leaves 10 20 30 35 and 40 50 under parent keys 10, 40, 100; after deleting 50, key 35 moves from the left sibling to the right leaf, and the parent key 40 is replaced by 35]
(d) Non-leaf coalesce: delete 37 (n=4)

[Figure: leaves 1 3 / 10 14 / 20 22 / 25 26 / 30 37 / 40 45 under non-leaf nodes 10 20 30 and 40; deleting 37 triggers coalescing at the leaf level and then at the non-leaf level; key 25 moves up and a new root is created]
B+tree deletions in practice

– Often, coalescing is not implemented
– Too hard and not worth it!
Comparison: B-trees vs. static indexed sequential file

Ref #1: Held & Stonebraker, “B-Trees Re-examined,” CACM, Feb. 1978
Ref #1 claims:
– Concurrency control is harder in B-trees
– The B-tree consumes more space

For their comparison:
– block = 512 bytes
– key = pointer = 4 bytes
– 4 data records per block
Example: 1-block static index

127 keys: (127+1) × 4 = 512 bytes
→ pointers in the index are implicit! Up to 127 contiguous data blocks.

[Figure: one index block of keys k1, k2, k3, ... mapping implicitly to contiguous data blocks]
Example: 1-block B-tree

63 keys: 63 × (4+4) + 8 = 512 bytes
→ pointers are needed in the B-tree. Up to 63 data blocks, because the index and data blocks are not contiguous.

[Figure: one B-tree node of keys k1 ... k63 with explicit pointers to data blocks, plus a next pointer]
Size comparison (Ref. #1)

Static index                      B-tree
# data blocks         height      # data blocks          height
2 -> 127              2           2 -> 63                2
128 -> 16,129         3           64 -> 3,968            3
16,130 -> 2,048,383   4           3,969 -> 250,047       4
                                  250,048 -> 15,752,961  5
Ref. #1 analysis claims:

• For an 8,000-block file, after 32,000 inserts and 16,000 lookups,
⇒ the static index saves enough accesses to allow for reorganization

Ref. #1 conclusion: Static index better!!
Ref #2: M. Stonebraker, “Retrospective on a database system,” TODS, June 1980

Ref. #2 conclusion: B-trees better!!
• The DBA does not know when to reorganize
– Self-administration is an important target
• The DBA does not know how full to load the pages of a new index

Ref. #2 conclusion: B-trees better!!
• Buffering
– B-tree: has fixed buffer requirements
– Static index: large & variable-size buffers needed due to overflow

Ref. #2 conclusion: B-trees better!!
• Speaking of buffering... Is LRU a good policy for B+tree buffers?
→ Of course not!
→ Should try to keep the root in memory at all times (and perhaps some nodes from the second level)
Interesting problem:

For a B+tree, how large should n be?
(n is the number of keys per node)
Assumptions

• You have the right to set the disk page size for the disk where the B-tree will reside.
• Compute the optimum n assuming that:
– The items are 4 bytes long and the pointers are also 4 bytes long.
– The time to read a node from disk is 12 + .003n.
– The time to process a block in memory is unimportant.
– The B+tree is full (i.e., every page has the maximum number of items and pointers).
➸ Can get: f(n) = time to find a record

[Figure: f(n) plotted against n, with a minimum at n_opt]
➸ Find n_opt by solving f′(n) = 0
The answer should be n_opt = “a few hundred”

➸ What happens to n_opt as:
• the disk gets faster?
• the CPU gets faster?
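One way to see the “few hundred” answer: a lookup reads one node per level, each read costing 12 + .003n, and a full tree with N records has height about log_n N. The file size N = 10^6 below is our assumption for illustration, not the slides'.

```python
import math

def f(n, N=10**6):
    # (levels) x (time to read one node); height ~ ln(N)/ln(n) for a full tree
    return (12 + 0.003 * n) * math.log(N) / math.log(n)

# Minimize numerically instead of solving f'(n) = 0 by hand.
n_opt = min(range(2, 5000), key=f)
```

With these numbers the minimum lands at a few hundred keys per node; shrinking the fixed 12 (a faster disk) pulls n_opt toward smaller nodes.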
Variation on the B+tree: the B-tree (no +)

• Idea:
– Avoid duplicate keys
– Have record pointers in non-leaf nodes
[Figure: a B-tree non-leaf node K1 P1 K2 P2 K3 P3; each Pi points to the record with key Ki, and the child pointers lead to keys < K1, K1 < x < K2, K2 < x < K3, and > K3]
B-tree example (n=2)

[Figure: root 65, 125; second-level nodes 25 45 / 85 105 / 145 165; leaves 10 20 / 30 40 / 50 60 / 70 80 / 90 100 / 110 120 / 130 140 / 150 160 / 170 180]

• Sequence pointers are not useful now! (but keep the space for simplicity)
So, for B-trees:

                    MAX                        MIN
                    Tree ptrs  Rec ptrs  Keys  Tree ptrs  Rec ptrs       Keys
Non-leaf, non-root  n+1        n         n     ⌈(n+1)/2⌉  ⌈(n+1)/2⌉ − 1  ⌈(n+1)/2⌉ − 1
Leaf, non-root      1          n         n     1          ⌊(n+1)/2⌋      ⌊(n+1)/2⌋
Root, non-leaf      n+1        n         n     2          1              1
Root, leaf          1          n         n     1          1              1
Tradeoffs:

☺ B-trees have marginally faster average lookup than B+trees (assuming the height does not change)
☹ In a B-tree, non-leaf & leaf nodes have different sizes
☹ Smaller fan-out
☹ In a B-tree, deletion is more complicated

➨ B+trees preferred!
Example:
– Pointers: 4 bytes
– Keys: 4 bytes
– Blocks: 100 bytes (just an example)
– Look at a full 2-level tree
B-tree:

Root has 8 keys + 8 record pointers + 9 son pointers
= 8×4 + 8×4 + 9×4 = 100 bytes

Each of the 9 sons: 12 record pointers (+ 12 keys)
= 12×(4+4) + 4 = 100 bytes

2-level B-tree, max # records = 12×9 + 8 = 116
B+tree:

Root has 12 keys + 13 son pointers
= 12×4 + 13×4 = 100 bytes

Each of the 13 sons: 12 record pointers (+ 12 keys)
= 12×(4+4) + 4 = 100 bytes

2-level B+tree, max # records = 13×12 = 156
So...

B+tree: 156 records in the leaves
B-tree: 108 records in the leaves + 8 in the root = 116 total

• Conclusion:
– For a fixed block size,
– the B+tree is better because it is bushier
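The byte arithmetic above can be replayed mechanically for 100-byte blocks with 4-byte keys and pointers:

```python
BLOCK, KEY, PTR = 100, 4, 4

# B-tree root: k keys + k record pointers + (k+1) child pointers per block
k_b = (BLOCK - PTR) // (KEY + 2 * PTR)        # 8 keys in the root
son_recs = (BLOCK - PTR) // (KEY + PTR)       # 12 record pointers per son
btree_max = (k_b + 1) * son_recs + k_b        # 9 sons x 12, plus 8 in the root

# B+tree root: k keys + (k+1) child pointers, no record pointers
k_bp = (BLOCK - PTR) // (KEY + PTR)           # 12 keys in the root
bptree_max = (k_bp + 1) * son_recs            # 13 sons x 12
```

The B+tree root spends no space on record pointers, so it fans out to 13 sons instead of 9, which is exactly the “bushier” advantage.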
Outline/summary

• Conventional indexes
– sparse vs. dense
– primary vs. secondary
• B-trees
– B+trees vs. B-trees
– B+trees vs. indexed sequential
• Hashing schemes → Next
Hashing

• A hash function h(key) returns the address of a bucket or record
• For a secondary index, buckets are required
• If the keys for a specific hash value do not fit into one page, the bucket is a linked list of pages

[Figure: key → h(key) → record directly, or key → h(key) → bucket of pointers → records]
Example hash function

• Key = ‘x1 x2 ... xn’, an n-byte character string
• Have b buckets
• h: add x1 + x2 + ... + xn, and compute the sum modulo b
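The slide's function in code; as the next slide warns, this is not a great hash function in practice (anagrams collide), but it shows the idea.

```python
def h(key: str, b: int) -> int:
    # add up the byte values of the key and take the sum modulo b
    return sum(key.encode()) % b
```

For example, h('abc', 10) sums 97 + 98 + 99 = 294 and returns 4; any permutation of the same characters hashes to the same bucket.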
➽ This may not be the best function...
➽ Read Knuth Vol. 3 if you really need to select a good function.

Good hash function: the expected number of keys/bucket is the same for all buckets.
Within a bucket:

• Do we keep keys sorted?
• Yes, if CPU time is critical & inserts/deletes are not too frequent
Next: an example to illustrate inserts, overflows, and deletes
EXAMPLE: 2 records/bucket

INSERT: h(a)=1, h(b)=2, h(c)=1, h(d)=0

[Figure: bucket 0 holds d; bucket 1 holds a, c; bucket 2 holds b; bucket 3 is empty]

Then h(e)=1: bucket 1 is full, so e goes to an overflow page chained to bucket 1.
EXAMPLE: deletion

Delete e, f:

[Figure: bucket 1 holds b, c with an overflow page holding e; another bucket holds f, g. Deleting e frees the overflow page; after deleting f, maybe move “g” up into the primary page]
Rule of thumb:

• Try to keep space utilization between 50% and 80%
Utilization = (# keys used) / (total # keys that fit)
• If < 50%, we are wasting space
• If > 80%, overflows become significant
(depends on how good the hash function is & on # keys/bucket)
How do we cope with growth?

• Overflows and reorganizations
• Dynamic hashing
– Extensible
– Linear
Extensible hashing: two ideas

(a) Use i of the b bits output by the hash function
h(K) → 00110101... (b bits; use the first i bits, where i grows over time)
(b) Use a directory

h(K)[first i bits] → directory entry → bucket
Example: h(k) gives 4 bits; 2 keys/bucket

[Figure: i=1; the 0-bucket holds 0001, the 1-bucket holds 1001, 1100. Insert 1010: the 1-bucket overflows, so the directory doubles to i=2 (entries 00, 01, 10, 11); 1001 and 1010 end up in the 10-bucket, 1100 in the 11-bucket]
Example continued

Insert 0111, then 0000:

[Figure: i=2. 0111 joins 0001 in the bucket shared by prefixes 00 and 01. Inserting 0000 overflows that bucket, which splits into a 00-bucket (0000, 0001) and a 01-bucket (0111)]
Example continued

Insert 1001:

[Figure: the 10-bucket (1001, 1010) overflows; its local depth equals i=2, so the directory doubles to i=3 (entries 000 ... 111) and the bucket splits into a 100-bucket (1001, 1001) and a 101-bucket (1010)]
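The three insert slides can be replayed with a small sketch of extensible hashing: buckets carry a local depth, the directory has 2^i entries indexed by the first i bits of the hash value, and a full bucket either splits or first forces the directory to double. Class and method names are ours; keys are given directly as 4-bit hash strings, as in the slides.

```python
BUCKET_SIZE = 2

class Bucket:
    def __init__(self, depth):
        self.depth = depth    # local depth: # of leading hash bits this bucket fixes
        self.keys = []

class ExtensibleHash:
    def __init__(self):
        self.i = 1                               # global depth
        self.dir = [Bucket(1), Bucket(1)]        # indexed by first i bits of h(k)

    def _slot(self, hbits):
        return int(hbits[: self.i], 2)

    def lookup(self, hbits):
        return hbits in self.dir[self._slot(hbits)].keys

    def insert(self, hbits):
        b = self.dir[self._slot(hbits)]
        if len(b.keys) < BUCKET_SIZE:
            b.keys.append(hbits)
            return
        if b.depth == self.i:                    # no room to split: double directory
            self.i += 1
            self.dir = [self.dir[j >> 1] for j in range(2 ** self.i)]
        # split the full bucket on its next hash bit
        b0, b1 = Bucket(b.depth + 1), Bucket(b.depth + 1)
        for k in b.keys:
            (b1 if k[b.depth] == "1" else b0).keys.append(k)
        for j, entry in enumerate(self.dir):     # repoint directory entries
            if entry is b:
                bit = (j >> (self.i - b.depth - 1)) & 1
                self.dir[j] = b1 if bit else b0
        self.insert(hbits)                       # retry; may split again
```

Running the slides' insert sequence reproduces the final state: global depth 3, with 1001, 1001 under prefix 100 and 1010 under prefix 101.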
Extensible hashing: deletion

• Option 1: no merging of blocks
• Option 2: merge blocks and cut the directory, if possible (reverse the insert procedure)
Deletion example:

• Run through the insert example in reverse!
Extensible hashing: summary

+ Can handle growing files
– with less wasted space
– with no full reorganizations
− Indirection (not bad if the directory fits in memory)
− Directory doubles in size (now it fits, now it does not)
Linear hashing

• Another dynamic hashing scheme

Two ideas:
(a) Use the i low-order bits of the hash
h(K) → ...01110101 (b bits; the last i bits are used, and i grows over time)
(b) The file grows linearly
Example: b=4 bits, i=2, 2 keys/bucket

[Figure: bucket 00 holds 0000, 1010; bucket 01 holds 0101, 1111; buckets 10 and 11 are future-growth buckets; m = 01 (max used block)]

Rule: if h(k)[last i bits] ≤ m, look at bucket h(k)[i]; else look at bucket h(k)[i] − 2^(i−1)

• Insert 0101: bucket 01 is full, so we get an overflow chain (overflow chains are possible!)
Example: b=4 bits, i=2, 2 keys/bucket (continued)

[Figure: after inserting 0101 the file grows: bucket 10 is created and 1010 moves into it from bucket 00; then bucket 11 is created and 1111, 0101 move into it from bucket 01]
Example continued: how to grow beyond this?

[Figure: i=2, m=11, all four buckets in use. To keep growing, increase i to 3; buckets 100, 101, 110, 111 are added one at a time, splitting buckets 00, 01, 10, 11 in turn (e.g., the 0101s from bucket 01 move to bucket 101)]
☛ When do we expand the file?

• Keep track of U = (# used slots) / (total # of slots)
• If U > threshold, then increase m (and maybe i)
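The address rule above in code; `hval` is the integer hash value, `i` the number of low-order bits in use, and `m` the highest bucket number currently in use (the function name is ours):

```python
def bucket_for(hval: int, i: int, m: int) -> int:
    b = hval & ((1 << i) - 1)        # h(k)[last i bits]
    if b <= m:
        return b
    return b - (1 << (i - 1))        # bucket b not created yet: drop the top bit
```

With i=2 and m=01 as in the example, key 1010 maps to bucket 10 = 2, which does not exist yet, so it is stored in bucket 10 − 2^1 = 00.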
Linear hashing: summary

+ Can handle growing files
– with less wasted space
– with no full reorganizations
+ No indirection, unlike extensible hashing
− Can still have overflow chains
Example: BAD CASE

[Figure: one bucket is very full while another is very empty; m would need to move past the empty buckets to split the full one, wasting space along the way]
Summary: Hashing

– How it works
– Dynamic hashing
– Extensible
– Linear
Next:

• Indexing vs. hashing
• Index definition in SQL
• Multiple-key access
Indexing vs. Hashing

• Hashing is good for probes given a key:
e.g., SELECT ... FROM R WHERE R.A = 5
Indexing vs. Hashing

• Indexing (including B-trees) is good for range searches:
e.g., SELECT ... FROM R WHERE R.A > 5
Index definition in SQL

• CREATE INDEX name ON rel (attr)
• CREATE UNIQUE INDEX name ON rel (attr)
– defines a candidate key
• DROP INDEX name
Note

You CANNOT specify the type of index (e.g., B-tree, hashing, ...) or its parameters (e.g., load factor, size of hash, ...)
... at least in SQL ...
Note

An attribute list ⇒ a multikey index (next)
e.g., CREATE INDEX foo ON R(A,B,C)
Multi-key Index

Motivation: find records where DEPT = “Toy” AND SAL > 50k
Strategy I:

• Use one index, say on Dept
• Get all Dept = “Toy” records and check their salary
Strategy II:

• Use 2 indexes; manipulate pointers

[Figure: intersect the pointer bucket for Toy with the pointer bucket for Sal > 50k]
Strategy III:

• Multi-key index. One idea:

[Figure: a top-level index I1 on Dept whose entries point to per-department indexes (I2, I3, ...) on Salary]
Example

[Figure: example record Name=Joe, DEPT=Sales, SAL=15k. The Dept index has entries Art, Sales, Toy; each points to a Salary index for that department (e.g., 10k, 15k, 17k, 21k and 12k, 15k, 15k, 19k), whose entries point to the records]
For which queries is this index good?

• Find RECs with Dept = “Sales” AND SAL = 20k
• Find RECs with Dept = “Sales” AND SAL > 20k
• Find RECs with Dept = “Sales”
• Find RECs with SAL = 20k
Interesting application:

• Geographic data

DATA: <X1, Y1, attributes>
      <X2, Y2, attributes>
      ...
Queries:

• What city is at <Xi, Yi>?
• What is within 5 miles of <Xi, Yi>?
• Which is the closest point to <Xi, Yi>?
Example

[Figure: points a through o scattered on a plane with axes up to 40; the space is partitioned into regions, and the points of each region are stored together. Queries: search points near f; search points near b]
Queries

• Find points with Yi > 20
• Find points with Xi < 5
• Find points “close” to i = <12, 38>
• Find points “close” to b = <7, 24>
• Many types of geographic index structures have been suggested
– Quad trees
– R-trees
Two more types of multi-key indexes

• Grid
• Partitioned hash
Grid Index

[Figure: a two-dimensional array with Key 1 values V1 ... Vn as rows and Key 2 values X1 ... Xn as columns; the cell at (V3, X2) leads to the records with key1 = V3, key2 = X2]
CLAIM

• Can quickly find records with
– key 1 = Vi ∧ key 2 = Xj
– key 1 = Vi
– key 2 = Xj
• And also ranges...
– E.g., key 1 ≥ Vi ∧ key 2 < Xj
☛ But there is a catch with grid indexes!

• How is the grid index stored on disk?
Like an array: row V1 (X1 X2 X3 X4), row V2 (X1 X2 X3 X4), row V3 (X1 X2 X3 X4), ...

Problem:
• Need regularity so we can compute the position of the <Vi, Xj> entry
Solution: use indirection

[Figure: the grid cells for V1 ... V4 × X1 ... X3 contain only pointers to buckets; several cells can share a bucket]
With indirection:

• The grid can be regular without wasting space
• But we do pay the price of the indirection
Can also index the grid on value ranges

Example linear scales:
– Key 1 (Dept): Toy = 1, Sales = 2, Personnel = 3
– Key 2 (Salary): 0-20K = 1, 20K-50K = 2, 50K- = 3
Grid files

+ Good for multiple-key search
− Space, management overhead (nothing is free)
− Need partitioning ranges that evenly split keys
Partitioned hash function

Idea: hash each key separately and concatenate the bits into one bucket address:

h1(Key1) = 010110, h2(Key2) = 1110010 → bucket 010110 1110010
EX: h1(toy)=0, h1(sales)=1, h1(art)=1, ...
    h2(10k)=01, h2(20k)=11, h2(30k)=01, h2(40k)=00, ...

Insert <Fred, toy, 10k>, <Joe, sales, 10k>, <Sally, art, 30k>:

[Figure: buckets 000 ... 111; Fred goes to bucket 001 (0 then 01), while Joe and Sally go to bucket 101 (1 then 01)]
• Find Emp. with Dept = Sales ∧ Sal = 40k
– h1(sales) = 1 and h2(40k) = 00, so look only in bucket 100

[Figure: buckets 000 ... 111 holding Fred, Joe, Jan, Mary, Sally, Tom, Bill, Andy]
• Find Emp. with Sal = 30k
– h2(30k) = 01, so look in buckets 001 and 101 (both values of the Dept bit)
• Find Emp. with Dept = Sales
– h1(sales) = 1, so look in buckets 100, 101, 110, 111 (all values of the Salary bits)
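The three searches above can be generated mechanically: a query that fixes only some of the keys enumerates all values of the unconstrained bits. The hash tables below are the slides' toy values; the helper names are ours.

```python
from itertools import product

H1 = {"toy": 0b0, "sales": 0b1, "art": 0b1}                 # 1 bit for Dept
H2 = {"10k": 0b01, "20k": 0b11, "30k": 0b01, "40k": 0b00}   # 2 bits for Salary

def bucket(dept, sal):
    # bucket address = h1(dept) followed by h2(sal)
    return (H1[dept] << 2) | H2[sal]

def buckets_to_search(dept=None, sal=None):
    # enumerate every value of each unconstrained hash field
    d_bits = [H1[dept]] if dept is not None else [0, 1]
    s_bits = [H2[sal]] if sal is not None else list(range(4))
    return sorted({(d << 2) | s for d, s in product(d_bits, s_bits)})
```

Dept=Sales ∧ Sal=40k probes only bucket 100; Sal=30k probes 001 and 101; Dept=Sales probes 100 through 111.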
Summary: post-hashing discussion

– Indexing vs. hashing
– SQL index definition
– Multiple-key access
– Multi-key index; variations: grid, geo data
– Partitioned hash
Reading: Chapter 5

• Skim the following sections:
– 5.3.6, 5.3.7, 5.3.8
– 5.4.2, 5.4.3, 5.4.4
• Read the rest
The BIG picture...

• Chapters 2 & 3: Storage, records, blocks...
• Chapters 4 & 5: Access mechanisms
– Indexes
– B-trees
– Hashing
– Multi-key
• Chapters 6 & 7: Query processing ⇒ NEXT