Data Organization: B-trees - Brown University
Database System Concepts
Data organization and retrieval
File organization can improve data retrieval time.
SELECT * FROM depositors WHERE bname = 'Downtown'
Setup: 100 blocks, 200 recs/block; the query returns 150 records.
Heap file: Mianus A215, Perry A218, Downtown A101, ...
Ordered file: Brighton A217, Downtown A101, Downtown A110, ...
Searching a heap: must search all blocks (100 blocks)
OR
Searching an ordered file:
1. Binary search for the 1st tuple in the answer: ⌈log2 100⌉ = 7 block accesses
2. Scan the blocks holding the answer: no more than 2
Total <= 9 block accesses
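The arithmetic on this slide can be checked with a short sketch. The function names are illustrative, and the scan bound assumes the worst case where the run of matching records starts in the last slot of a block.

```python
import math

def heap_scan_cost(num_blocks: int) -> int:
    """A heap file has no order, so every block must be read."""
    return num_blocks

def ordered_file_cost(num_blocks: int, recs_per_block: int, result_recs: int) -> int:
    """Upper bound: binary search for the first match, then scan the answer blocks."""
    search = math.ceil(math.log2(num_blocks))
    # Worst case (assumption): the first match sits in the last slot of a block,
    # so the remaining result_recs - 1 records spill into the following blocks.
    scan = 1 + math.ceil((result_recs - 1) / recs_per_block)
    return search + scan

print(heap_scan_cost(100))               # 100
print(ordered_file_cost(100, 200, 150))  # 7 + 2 = 9
```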
Data organization and retrieval
But... a file can only be ordered on one search key:
Ordered file (bname): Brighton A217, Downtown A101, Downtown A110, ...
Ex.: SELECT * FROM depositors WHERE acct_no = 'A110'
Requires a linear scan (100 block accesses)
Solution: Indexes! Auxiliary data structures over relations that can improve the search time
A simple index
depositors:
Brighton A217 700
Downtown A101 500
Downtown A110 600
Mianus A215 700
Perry A102 400
...
Index file (index of depositors on acct_no): A101, A102, A110, A215, A217, ...
Index records: <search key value, pointer (block, offset or slot#)>
To answer a query for “acct_no=A110” we:
1. Do a binary search on the index file, searching for A110
2. “Chase” the pointer of the index record
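The two lookup steps can be sketched as follows. This is a minimal in-memory model, not a storage engine: the (block, slot) pointer values and the single-block data file are assumptions made up for the example.

```python
from bisect import bisect_left

# Data file as a list of blocks; each block is a list of records.
data_file = [[
    ("Brighton", "A217", 700),
    ("Downtown", "A101", 500),
    ("Downtown", "A110", 600),
    ("Mianus", "A215", 700),
    ("Perry", "A102", 400),
]]

# Index records: (search key value, pointer), sorted on the search key.
# The pointer is a hypothetical (block number, slot number) pair.
index = [
    ("A101", (0, 1)),
    ("A102", (0, 4)),
    ("A110", (0, 2)),
    ("A215", (0, 3)),
    ("A217", (0, 0)),
]
index_keys = [key for key, _ in index]

def lookup(acct_no):
    """Return the record with this acct_no, or None if it is not indexed."""
    i = bisect_left(index_keys, acct_no)      # step 1: binary search on the index
    if i == len(index_keys) or index_keys[i] != acct_no:
        return None
    block, slot = index[i][1]                 # step 2: "chase" the pointer
    return data_file[block][slot]

print(lookup("A110"))  # ('Downtown', 'A110', 600)
```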
Index Choices
1. Primary: index search key = physical (sort) order search key
vs. Secondary: all other indexes
Q: How many primary indexes per relation?
2. Dense: index entry for every search key value
vs. Sparse: some search key values not in the index
3. Single-level vs. multilevel (index on the indexes)
Measuring ‘goodness’
On what basis do we compare different indices?
1. Access type: what type of queries can be answered: selection queries (ssn = 123)? range queries (100 <= ssn <= 200)?
2. Access time: the cost of evaluating queries, measured in # of block accesses
3. Maintenance overhead: cost of insertion/deletion (also in # of block accesses)
4. Space overhead: # of blocks needed to store the index, relative to the real data
Indexing
Primary (or clustering) index on SSN
STUDENT
Ssn Name Address
123 smith main str
234 jones forbes ave
345 smith forbes ave
… … …
Index entries: 123, 234, 345, 456, 567
Indexing
Primary/sparse index on ssn (primary key)
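[Figure: sparse index with entries 123, 456, …; each entry points to the data block whose keys start at that value (>= 123, >= 456, …)]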
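A sparse-index lookup can be sketched as below: find the last index entry whose key is <= the target, then scan inside that block. The block size of 3 and the sample rows are assumptions chosen for illustration.

```python
from bisect import bisect_right

# Data file sorted on ssn and split into blocks (block size 3 is arbitrary here).
blocks = [
    [(123, "smith", "main str"), (234, "jones", "forbes ave"), (345, "tomson", "main str")],
    [(456, "stevens", "forbes ave"), (567, "smith", "forbes ave")],
]

# Sparse index: only the first ssn of each block is indexed.
sparse_keys = [block[0][0] for block in blocks]   # [123, 456]

def lookup(ssn):
    """Return the record with this ssn, or None."""
    i = bisect_right(sparse_keys, ssn) - 1   # last entry with key <= ssn
    if i < 0:
        return None                          # smaller than every indexed key
    for record in blocks[i]:                 # lookup + scan inside the block
        if record[0] == ssn:
            return record
    return None

print(lookup(234))  # (234, 'jones', 'forbes ave')
```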
Indexing
Secondary (or nonclustering) index: duplicates may exist
• Can have many secondary indices
• but only one primary index
Address index over:
STUDENT
Ssn Name Address
123 smith main str
234 jones forbes ave
345 tomson main str
456 stevens forbes ave
567 smith forbes ave
Indexing
Secondary index: typically with ‘postings lists’, if not on a candidate key.
STUDENT
Ssn Name Address
123 smith main str
234 jones forbes ave
345 tomson main str
456 stevens forbes ave
567 smith forbes ave
Postings lists (Address index):
forbes ave → records 234, 456, 567
main str → records 123, 345
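A postings list can be modelled as a mapping from each search key value to the positions of all records that carry it. The sketch below uses list positions as stand-ins for record pointers, which is an assumption of the example.

```python
from collections import defaultdict

students = [
    (123, "smith", "main str"),
    (234, "jones", "forbes ave"),
    (345, "tomson", "main str"),
    (456, "stevens", "forbes ave"),
    (567, "smith", "forbes ave"),
]

def build_secondary_index(records, field):
    """Map each value of the given field to the positions of its records."""
    postings = defaultdict(list)
    for position, record in enumerate(records):
        postings[record[field]].append(position)
    return dict(sorted(postings.items()))     # keep search key values ordered

address_index = build_secondary_index(students, field=2)
print(address_index)  # {'forbes ave': [1, 3, 4], 'main str': [0, 2]}
```

A query on Address first finds the postings list, then fetches each listed record, which is why a secondary index costs an extra list access compared with a primary one.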
Indexing
Secondary / dense index
Secondary on a candidate key: no duplicates, no need for postings lists
STUDENT (file not sorted on Ssn)
Ssn Name Address
345 tomson main str
234 jones forbes ave
123 smith main str
567 smith forbes ave
456 stevens forbes ave
Index entries (sorted): 123, 234, 345, 456, 567
Primary vs Secondary
1. Access type:
Primary: SELECTION, RANGE
Secondary: SELECTION, RANGE, but the index must point to postings lists (if not on a candidate key)
2. Access time: primary is faster than secondary for range queries (no list access, all results clustered together)
3. Maintenance overhead: primary has greater overhead (must alter index + file)
4. Space overhead: secondary has more (postings lists)
Dense vs Sparse
1. Access type: both: selection, range (if primary)
2. Access time:
Dense: requires lookup for 1st result
Sparse: requires lookup + scan for 1st result
3. Maintenance overhead:
Dense: must change index entries
Sparse: may not have to change index entries
4. Space overhead:
Dense: 1 entry per search key value
Sparse: < 1 entry per block
Summary
Primary: dense is rare, sparse is usual
Secondary: dense is usual
• All combinations are possible
• At most one sparse/clustering index
• As many dense indices as desired
• Usually: one primary index (probably sparse) and a few secondary indices (nonclustering)
• Secondary / sparse: which keys to use? Hot items?
ISAM
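What if the index is too large to search in memory?
2nd-level sparse index on the values of the 1st level
[Figure: the STUDENT file is stored in blocks; a 1st-level sparse index (entries 123, 456, …) points to the blocks holding keys >= 123, >= 456, …; a 2nd-level sparse index (entries 123, 3,423, …) points to the blocks of the 1st-level index]
STUDENT
Ssn Name Address
123 smith main str
234 jones forbes ave
345 tomson main str
456 stevens forbes ave
567 smith forbes ave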
ISAM observations
What about insertions/deletions?
Example: insert the record <124; peterson; fifth ave.>
[Figure: same two-level index over the STUDENT file; the new record belongs in the block for keys >= 123, and is placed in an overflow block chained to that block]
Problems?
• Overflow chains may become very long. What to do? Thus:
• shut down & reorganize
• start with ~80% utilization
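The overflow behaviour can be sketched with a toy model. It is not a faithful ISAM implementation: the class name, the block capacity, and the rule that each overflow record costs one extra access are assumptions of the example, and the input is assumed to be a non-empty sorted list of keys.

```python
from bisect import bisect_right

class IsamFile:
    """Static sparse index over fixed primary blocks, with per-block overflow chains."""

    def __init__(self, sorted_keys, capacity=4, fill=0.8):
        per_block = max(1, int(capacity * fill))   # leave slack, e.g. ~80% utilization
        self.capacity = capacity
        self.primary = [sorted_keys[i:i + per_block]
                        for i in range(0, len(sorted_keys), per_block)]
        self.index = [block[0] for block in self.primary]   # never changes after build
        self.overflow = [[] for _ in self.primary]

    def _block_for(self, key):
        return max(0, bisect_right(self.index, key) - 1)

    def insert(self, key):
        b = self._block_for(key)
        if len(self.primary[b]) < self.capacity:
            self.primary[b].append(key)
            self.primary[b].sort()
        else:
            self.overflow[b].append(key)   # the chain grows; the index is not updated

    def search(self, key):
        """Return (found, accesses); each overflow record is modelled as one access."""
        b = self._block_for(key)
        accesses = 1
        if key in self.primary[b]:
            return True, accesses
        for k in self.overflow[b]:
            accesses += 1
            if k == key:
                return True, accesses
        return False, accesses

f = IsamFile([123, 234, 345, 456, 567])
for ssn in (124, 125, 126):
    f.insert(ssn)
print(f.search(123))  # (True, 1)
print(f.search(126))  # (True, 3): found at the end of the overflow chain
```

The first insert lands in the slack left by the ~80% initial fill; later inserts into the same block lengthen the chain, and search cost grows with it until the file is reorganized.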
So far
… indices (like ISAM) suffer in the presence of frequent updates
Alternative indexing structure: B-trees
B-trees
Most successful family of index schemes (B-trees, B+-trees, B*-trees)
Can be used for primary/secondary, clustering/nonclustering index.
Balanced “n-way” search trees
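B-trees
E.g., a B-tree of order 3:
Root: (6, 9)
• keys < 6: leaf (1, 3)
• keys > 6 and < 9: leaf (7)
• keys > 9: leaf (13)
Each key carries a pointer to its record.
• Key values appear once.
• Record pointers accompany keys.
• For simplicity, we will not show records and record pointers.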
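B-tree nodes
Keys: v1, v2, …, vn-1
Pointers: p1, …, pn
p1 leads to values v < v1; the pointer between v1 and v2 leads to values v1 < v < v2; …; pn leads to values v > vn-1
Key values are ordered
MAXIMUM: n pointer values
MINIMUM: ⌈n/2⌉ pointer values
(Exception: root’s minimum = 2)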
Properties
“Block aware” nodes: each node → one disk page
O(log_B(N)) for everything! (ins/del/search)
N is the number of records
B is the branching factor (= number of pointers)
Typically, if B = 50 to 100, then 2-3 levels
Utilization >= 50%, guaranteed; on average 69%
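To see where "2-3 levels" comes from, the rough sketch below counts how many levels are needed before B^h reaches N. It is an estimate under the assumption of completely full nodes, not an exact B-tree capacity formula, and the function name is illustrative.

```python
def levels_needed(num_records: int, branching: int) -> int:
    """Smallest h with branching**h >= num_records (integer arithmetic, no float log)."""
    h, reach = 1, branching
    while reach < num_records:
        reach *= branching
        h += 1
    return h

print(levels_needed(10_000, 100))     # 2
print(levels_needed(1_000_000, 100))  # 3
print(levels_needed(100_000, 50))     # 3
```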
Queries
Algorithm for exact match query? (e.g., ssn=8? ssn=7?)
Root: (6, 9)
• keys < 6: leaf (1, 3)
• keys > 6 and < 9: leaf (7)
• keys > 9: leaf (13)
Start at the root. In each node, either the key is there, or we follow the pointer whose range contains it; the search stops at a leaf.
• ssn=7: 7 lies between 6 and 9, so follow the middle pointer and find 7 in the leaf.
• ssn=8: same path, but the leaf does not hold 8, so the search fails.
Height of tree = H
(= # disk accesses)
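The exact-match walk can be sketched as a short recursive function. The `Node` class is a hypothetical in-memory stand-in for a disk page, and the access counter simply counts visited nodes.

```python
from bisect import bisect_left
from dataclasses import dataclass, field

@dataclass
class Node:
    keys: list
    children: list = field(default_factory=list)   # empty for a leaf

def search(node, key, accesses=1):
    """Return (found, number of nodes visited)."""
    i = bisect_left(node.keys, key)          # first position with keys[i] >= key
    if i < len(node.keys) and node.keys[i] == key:
        return True, accesses
    if not node.children:                    # reached a leaf without finding the key
        return False, accesses
    return search(node.children[i], key, accesses + 1)

# The order-3 example tree from the slides.
t0 = Node([6, 9], [Node([1, 3]), Node([7]), Node([13])])
print(search(t0, 7))  # (True, 2)
print(search(t0, 8))  # (False, 2)
print(search(t0, 6))  # (True, 1)
```

The number of visited nodes never exceeds the height of the tree, which is the bound on disk accesses stated above.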
Queries
What about range queries? (e.g., 5 < salary < 8)
Proximity/nearest-neighbor searches? (e.g., salary ~ 8)
Root: (6, 9)
• keys < 6: leaf (1, 3)
• keys > 6 and < 9: leaf (7)
• keys > 9: leaf (13)
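One way to answer a range query on such a tree is an in-order traversal that skips subtrees which cannot contain qualifying keys. The sketch below is one possible approach, not necessarily the one intended by the slides; `Node` is again a hypothetical in-memory node.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    keys: list
    children: list = field(default_factory=list)   # empty for a leaf

def range_query(node, lo, hi, out=None):
    """Collect all keys k with lo < k < hi, in ascending order."""
    if out is None:
        out = []
    for i, k in enumerate(node.keys):
        # Subtree i holds keys smaller than k: visit it only if some can exceed lo.
        if node.children and k > lo:
            range_query(node.children[i], lo, hi, out)
        if lo < k < hi:
            out.append(k)
        if k >= hi:
            return out               # everything further right is too large
    if node.children:
        range_query(node.children[-1], lo, hi, out)
    return out

t0 = Node([6, 9], [Node([1, 3]), Node([7]), Node([13])])
print(range_query(t0, 5, 8))    # [6, 7]
```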
How do you maintain B-trees?
Must insert/delete keys in the tree such that the B-tree rules are obeyed.
Do this on every insert/delete.
Incur a little bit of overhead on each update, but avoid the problem of catastrophic reorganization (à la ISAM).
B-trees: Insertion
Insert in leaf, if room exists
On overflow (no more room):
• Split: create a new internal node
• Redistribute keys so that the B-tree properties are preserved
• Push middle key up (recursively)
B-trees
Easy case: Tree T0; insert ‘8’
T0:
Root: (6, 9)
• keys < 6: leaf (1, 3)
• keys > 6 and < 9: leaf (7)
• keys > 9: leaf (13)
After the insert (the leaf holding 7 has room):
Root: (6, 9)
• keys < 6: leaf (1, 3)
• keys > 6 and < 9: leaf (7, 8)
• keys > 9: leaf (13)
B-trees
Hard case: Tree T0; insert ‘2’
1. Overflow: 2 belongs in leaf (1, 3), which becomes (1, 2, 3), one key too many for order 3.
2. Split the leaf into (1) and (3); push middle key 2 up into the root.
3. Overflow again: the root becomes (2, 6, 9). Split it into (2) and (9); push middle key 6 up into a new root.
Final state:
Root: (6)
• keys < 6: node (2), with leaves (1) and (3)
• keys > 6: node (9), with leaves (7) and (13)
B-trees: Insertion
Q: What if there are two middles? (e.g., order 4)
A: Either one is fine
B-trees: Insertion
Insert in leaf; on overflow, push middle key up (recursively: ‘propagate split’)
Split: preserves all B-tree properties (!!)
Notice how it grows: height increases when root overflows & splits
Automatic, incremental reorganization (contrast with ISAM!)
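The insert-and-split procedure can be sketched in a few lines. This is a minimal in-memory version under stated assumptions: `Node` is a hypothetical class, `order` is the maximum number of pointers per node, keys are assumed to be distinct, and with two middles the sketch picks the upper one.

```python
from bisect import bisect_left, insort
from dataclasses import dataclass, field

@dataclass
class Node:
    keys: list
    children: list = field(default_factory=list)   # empty for a leaf

def _insert(node, key, order):
    """Insert key below node. Return None, or (middle_key, new_right_node) on a split."""
    if not node.children:
        insort(node.keys, key)                     # leaf: place the key in sorted order
    else:
        i = bisect_left(node.keys, key)
        split = _insert(node.children[i], key, order)
        if split is not None:                      # child split: absorb the pushed-up key
            mid, right = split
            node.keys.insert(i, mid)
            node.children.insert(i + 1, right)
    if len(node.keys) < order:                     # at most order - 1 keys: no overflow
        return None
    m = len(node.keys) // 2                        # middle key
    mid = node.keys[m]
    right = Node(node.keys[m + 1:], node.children[m + 1:])
    node.keys = node.keys[:m]
    node.children = node.children[:m + 1]
    return mid, right

def insert(root, key, order=3):
    """Insert key and return the (possibly new) root."""
    split = _insert(root, key, order)
    if split is None:
        return root
    mid, right = split
    return Node([mid], [root, right])              # root split: the tree grows one level

t0 = Node([6, 9], [Node([1, 3]), Node([7]), Node([13])])
root = insert(t0, 2)
print(root.keys)                         # [6]
print([c.keys for c in root.children])   # [[2], [9]]
```

Inserting 2 into the example tree reproduces the hard case: the leaf splits, the root splits in turn, and a new root holding 6 appears.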
Overview
• Primary / secondary indices
• Multilevel (ISAM)
• B-trees: definition, search, insertion, deletion
• B+-trees
• Hashing
Deletion
Rough outline of algorithm:
• Delete key; on underflow, may need to merge
In practice, some implementers just allow underflows to happen…
B-trees – Deletion
Easiest case: Tree T0; delete ‘3’
Before:
Root: (6, 9)
• keys < 6: leaf (1, 3)
• keys > 6 and < 9: leaf (7)
• keys > 9: leaf (13)
After:
Root: (6, 9)
• keys < 6: leaf (1)
• keys > 6 and < 9: leaf (7)
• keys > 9: leaf (13)
B-trees – Deletion
• Case 1: delete a key at a leaf – no underflow
• Case 2: delete nonleaf key – no underflow
• Case 3: delete leaf key; underflow, and ‘rich sibling’
• Case 4: delete leaf key; underflow, and ‘poor sibling’
B-trees – Deletion
Case 1: delete a key at a leaf – no underflow (delete 3 from T0)
The leaf (1, 3) simply becomes (1); it still holds the minimum of one key, so nothing else changes.
B-trees – Deletion
Case 2: delete a key at a nonleaf – no underflow (e.g., delete 6 from T0)
Delete & promote:
1. Remove 6 from the root, leaving a hole next to 9.
2. Promote 3 from the leaf (1, 3) into the root.
FINAL TREE:
Root: (3, 9)
• keys < 3: leaf (1)
• keys > 3 and < 9: leaf (7)
• keys > 9: leaf (13)
Q: How to promote?
A: Pick the largest key from the left subtree (or the smallest from the right subtree)
B-trees – Deletion
Case 3: underflow & ‘rich sibling’ (delete 7 from T0)
Delete & borrow:
1. Deleting 7 leaves its leaf empty (underflow).
2. The sibling (1, 3) is ‘rich’: it can give a key without underflowing.
3. ‘Borrowing’ a key goes THROUGH the PARENT! Moving 3 directly into the empty leaf is wrong (NO!!), because 3 is not > 6. Instead, 6 moves down from the parent into the empty leaf, and 3 moves up to take its place.
FINAL TREE:
Root: (3, 9)
• keys < 3: leaf (1)
• keys > 3 and < 9: leaf (6)
• keys > 9: leaf (13)
Btrees – DeletionCase 4
Underflow & ‘poor sibling’
Delete 13 from T0
• Merge, by pulling a key from the parent • Exact reversal from insertion:
‘split and push up’, vs. ‘merge and pull down’
1 3
6
7
9
13
< 6
>6 < 9 >9
![Page 63: Data Organization Btrees - Brown University · PDF fileDatabase System Concepts 11.2 Data organization and retrieval File organization can improve data retrieval time SELECT *](https://reader033.fdocuments.us/reader033/viewer/2022042708/5a9dd04e7f8b9aee528cbb4d/html5/thumbnails/63.jpg)
11.64Database System Concepts
Btrees – Deletion
1 3
6
7
< 6
> 6
A: merge w/ ‘poor’ sibling
9
Case 4
Underflow & ‘poor sibling’
Delete 13 from T0
Btrees – Deletion
Case 4: underflow & ‘poor sibling’
Delete 13 from T0
FINAL TREE: root [6]; leaves [1 3] (< 6), [7 9] (> 6)
Btrees – Deletion
Case 4: underflow & ‘poor sibling’: ‘pull key from parent, and merge’
Q: What if the parent underflows?
A: Repeat recursively
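A minimal Python sketch of ‘merge and pull down’ follows. It is an illustration under stated assumptions, not the slides' code: T0 is taken to be root [6 9] over leaves [1 3], [7], [13], nodes are plain lists, and the helper name `merge_with_left` and the `min_keys` parameter are hypothetical. The return value signals whether the parent has itself underflowed, which is where the recursive repeat would start.

```python
def merge_with_left(parent_keys, children, i, min_keys=1):
    """Merge underflowed child i into its 'poor' left sibling.

    Returns True if the parent now holds fewer than min_keys keys, in which
    case the caller would repeat borrow-or-merge one level up.
    """
    left, node = children[i - 1], children[i]
    left.append(parent_keys.pop(i - 1))  # pull the separator down
    left.extend(node)                    # absorb whatever the node still holds
    del children[i]                      # the underflowed node disappears
    return len(parent_keys) < min_keys


# Assumed T0: root [6, 9] over leaves [1, 3], [7], [13]
root = [6, 9]
leaves = [[1, 3], [7], [13]]

leaves[2].remove(13)  # delete 13: the last leaf is now empty
parent_underflows = merge_with_left(root, leaves, 2)
print(root, leaves, parent_underflows)  # [6] [[1, 3], [7, 9]] False
```

Here the parent keeps one key, so no further step is needed; with a larger tree, a True result would trigger the same procedure on the parent.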
Btrees in practice
In practice:
Tree: root [6 9]; leaves [1 3] (< 6), [7] (> 6, < 9), [13] (> 9)
FILE: records (Ssn, …, …) with Ssn = 3, 7, 6, 9, 1
Btrees in practice
In practice, the formats are:
leaf nodes: (v1, rp1, v2, rp2, … vn, rpn)
Nonleaf nodes: (p1, v1, rp1, p2, v2, rp2, …)
Tree: root [6 9]; leaves [1 3] (< 6), [7] (> 6, < 9), [13] (> 9)
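One way to model these node formats is sketched below. The reading of the symbols is an assumption, since the slide does not spell them out: v is taken as a key value, rp as a pointer to the record in the file, and p as a pointer to a child node. The class name `BTreeNode`, the function `search`, and the string record pointers are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class BTreeNode:
    keys: List[int]          # v1 .. vn
    record_ptrs: List[Any]   # rp1 .. rpn, one per key (assumed meaning)
    children: List["BTreeNode"] = field(default_factory=list)  # p1 .., empty in a leaf


def search(node: BTreeNode, key: int) -> Optional[Any]:
    """Return the record pointer stored with key, or None if key is absent."""
    i = 0
    while i < len(node.keys) and key > node.keys[i]:
        i += 1
    if i < len(node.keys) and node.keys[i] == key:
        return node.record_ptrs[i]  # in a B-tree a hit can occur at any level
    if not node.children:
        return None
    return search(node.children[i], key)


def make_leaf(*keys: int) -> BTreeNode:
    return BTreeNode(list(keys), [f"rec{k}" for k in keys])


root = BTreeNode([6, 9], ["rec6", "rec9"],
                 [make_leaf(1, 3), make_leaf(7), make_leaf(13)])
print(search(root, 9), search(root, 7), search(root, 8))  # rec9 rec7 None
```

Because nonleaf nodes carry record pointers too, the lookup for 9 stops at the root without visiting a leaf.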
Overview
primary / secondary indices
multilevel (ISAM)
B-trees
B+ trees
hashing
B+ trees Motivation
Btree – print keys in sorted order:
Tree: root [6 9]; leaves [1 3] (< 6), [7] (> 6, < 9), [13] (> 9)
B+ trees Motivation
Btree needs backtracking – how to avoid it?
Tree: root [6 9]; leaves [1 3] (< 6), [7] (> 6, < 9), [13] (> 9)
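The backtracking can be seen in a short in-order traversal. This is a sketch with an assumed representation: each node is a `(keys, children)` tuple, leaves have an empty child list, and the function name `inorder` is illustrative. Between any two children the traversal has to return to the parent to emit the separator key.

```python
def inorder(node):
    """Yield the keys of a B-tree in sorted order."""
    keys, children = node
    if not children:          # leaf: keys are already in order
        yield from keys
        return
    for i, key in enumerate(keys):
        yield from inorder(children[i])  # go down into child i
        yield key                        # come back up for the separator
    yield from inorder(children[-1])     # rightmost child


# Assumed tree: root [6, 9] over leaves [1, 3], [7], [13]
tree = ([6, 9], [([1, 3], []), ([7], []), ([13], [])])
print(list(inorder(tree)))  # [1, 3, 6, 7, 9, 13]
```

On disk, each return to the parent is a potential extra block access, which is the cost the next slides set out to avoid.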
Solution: B+ trees
Facilitate sequential ops
String all leaf nodes together
AND
replicate keys from nonleaf nodes, to make sure every key appears at the leaf level
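A sketch of why chained leaves help sequential operations: once every key is present at the leaf level and each leaf points to its right sibling, a sorted scan never revisits an internal node. The class `Leaf`, the helpers `chain` and `scan_from`, and the leaf contents are illustrative assumptions.

```python
class Leaf:
    def __init__(self, keys):
        self.keys = keys
        self.next = None  # pointer to the right sibling leaf


def chain(leaves):
    """Link the leaves left to right and return the first one."""
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return leaves[0]


def scan_from(leaf, low):
    """Yield every key >= low, walking only the leaf chain."""
    while leaf is not None:
        for key in leaf.keys:
            if key >= low:
                yield key
        leaf = leaf.next


first = chain([Leaf([3, 4]), Leaf([6, 7]), Leaf([9, 13])])
print(list(scan_from(first, 0)))       # [3, 4, 6, 7, 9, 13]
print(list(scan_from(first.next, 7)))  # [7, 9, 13]
```

In a range query, the tree is descended once to find the starting leaf and the rest of the answer comes from following `next` pointers.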
B+trees
B+tree of order 3:
root (internal node): [6 9]
leaf nodes: [3 4] (< 6), [6 7] (≥ 6, < 9), [9 13] (≥ 9)
Data File: (3, Joe, 23) (3, Bob, 23) (4, John, 23) …
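A lookup in this structure can be sketched as follows. The dictionary representation and the name `find_leaf` are assumptions made for illustration. Because pointers are labelled ‘<’ on the left and ‘≥’ on the right, a key equal to a separator is found in the right-hand subtree, which is what `bisect_right` selects.

```python
from bisect import bisect_right


def find_leaf(node, key):
    """Descend from the root to the leaf that would hold key."""
    while "children" in node:
        # Index of the first separator greater than key; keys equal to a
        # separator go right, matching the '>=' pointer labels.
        i = bisect_right(node["keys"], key)
        node = node["children"][i]
    return node


# Assumed tree: root [6, 9] over leaves [3, 4], [6, 7], [9, 13]
root = {
    "keys": [6, 9],
    "children": [{"keys": [3, 4]}, {"keys": [6, 7]}, {"keys": [9, 13]}],
}
print(find_leaf(root, 6)["keys"])   # [6, 7]
print(find_leaf(root, 5)["keys"])   # [3, 4]
print(find_leaf(root, 13)["keys"])  # [9, 13]
```

Unlike the B-tree case, the search always continues to a leaf, even when the key also appears in an internal node.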
B+ tree insertion
INSERTION OF KEY ’K’
  find the leaf node ’L’ where ’K’ belongs;
  insert searchkey value to ’L’ such that the keys are in order;
  if (’L’ overflows) {
    split ’L’;
    insert (i.e., COPY) smallest searchkey value of new node to parent node ’P’;
    if (’P’ overflows) {
      repeat the Btree split procedure recursively;
      /* Notice: the BTREE split; NOT the B+ tree */
    }
  }
B+tree insertion – cont’d
ATTENTION:
A split at the LEAF level is handled by COPYING the middle key up;
A split at a higher level is handled by PUSHING the middle key up
Remember: Leaf nodes must be complete – all keys
Interior nodes need not be complete
B+ trees insertion
Insert ‘8’
Tree: root [6 9]; leaves [1 3] (< 6), [6 7] (≥ 6, < 9), [9 13] (≥ 9)
B+ trees insertion
Insert ‘8’
Tree: root [6 9]; leaves [1 3] (< 6), [6 7] (≥ 6, < 9), [9 13] (≥ 9); 8 goes to leaf [6 7]
B+ trees insertion
Eg., insert ‘8’
Tree: root [6 9]; leaves [1 3] (< 6), [6 7] (≥ 6, < 9), [9 13] (≥ 9); 8 is added to leaf [6 7], which overflows
COPY middle (=7) upstairs; 7 stays in the leaf as well, alongside 8
B+ trees insertion
Eg., insert ‘8’
COPY middle upstairs and split
7 and 8 remain in leaves since all keys are present there.
Tree: root [6 7 9]; leaves [1 3] (< 6), [6] (≥ 6), [7 8] (< 9), [9 13] (≥ 9)
B+ trees insertion
Insert ‘8’
Tree: root [6 7 9] (overflow); leaves [1 3] (< 6), [6] (≥ 6), [7 8] (< 9), [9 13] (≥ 9)
COPY middle upstairs again
Nonleaf overflow – just PUSH the middle
B+ trees – insertion
Insert ‘8’
FINAL TREE: root [7]; internal nodes [6] (< 7) and [9] (≥ 7); leaves [1 3] (< 6) and [6] (≥ 6) under [6]; leaves [7 8] (< 9) and [9 13] (≥ 9) under [9]
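The walkthrough can be replayed with a small Python sketch of B+ tree insertion. It is a minimal illustration under stated assumptions, not a definitive implementation: the starting tree is taken to be root [6 9] over leaves [1 3], [6 7], [9 13], each node holds at most two keys (`MAX_KEYS`), and the names `Node`, `insert`, `_insert`, and `dump` are hypothetical. Leaf chaining is left out to keep the split logic visible.

```python
MAX_KEYS = 2  # assumption: order-3 tree, at most 2 keys per node


class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children  # None marks a leaf


def _insert(node, key):
    """Insert key below node; return (separator, new_right_node) on a split."""
    if node.children is None:
        node.keys.append(key)
        node.keys.sort()
        if len(node.keys) <= MAX_KEYS:
            return None
        mid = len(node.keys) // 2
        right = Node(node.keys[mid:])
        node.keys = node.keys[:mid]
        return right.keys[0], right  # COPY: the key also stays in the new leaf

    i = 0
    while i < len(node.keys) and key >= node.keys[i]:
        i += 1
    split = _insert(node.children[i], key)
    if split is None:
        return None
    separator, right = split
    node.keys.insert(i, separator)
    node.children.insert(i + 1, right)
    if len(node.keys) <= MAX_KEYS:
        return None
    mid = len(node.keys) // 2
    pushed = node.keys[mid]
    new_right = Node(node.keys[mid + 1:], node.children[mid + 1:])
    node.keys = node.keys[:mid]
    node.children = node.children[:mid + 1]
    return pushed, new_right  # PUSH: the key leaves this level entirely


def insert(root, key):
    """Insert key and return the (possibly new) root."""
    split = _insert(root, key)
    if split is None:
        return root
    separator, right = split
    return Node([separator], [root, right])  # the tree grows one level


def dump(node):
    """Nested view: a key list for a leaf, (keys, children) otherwise."""
    if node.children is None:
        return node.keys
    return (node.keys, [dump(child) for child in node.children])


root = Node([6, 9], [Node([1, 3]), Node([6, 7]), Node([9, 13])])
root = insert(root, 8)
print(dump(root))
# ([7], [([6], [[1, 3], [6]]), ([9], [[7, 8], [9, 13]])])
```

After the leaf split, 7 appears both in the leaf [7 8] and one level up (copy); after the root split, 7 appears only in the new root and not in either internal child (push).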