CS 4604: Introduction to Database Management...
![Page 1: CS 4604: Introduction to Database Management Systemscourses.cs.vt.edu/~cs4604/Fall18/lectures/lecture-8.pdf§ Extra office hours till midterm – Check Piazza post Prakash 2018 VT](https://reader033.fdocuments.us/reader033/viewer/2022042010/5e71fef9aa21a56404626eaa/html5/thumbnails/1.jpg)
CS 4604: Introduction to Database Management Systems
B. Aditya Prakash
Lecture #8: Storing Data and Indexes
Announcements
§ Extra office hours till midterm
– Check Piazza post
Prakash2018 VTCS4604 2
STORING DATA
DBMS Layers:
(Figure: layered architecture, with Queries entering at the top and the DB at the bottom; a “TODAY →” marker points into the stack.)
Query Optimization and Execution
Relational Operators
Files and Access Methods
Buffer Management
Disk Space Management
Leverage OS for disk/file management?
§ Layers of abstraction are good… but:
– Unfortunately, the OS often gets in the way of the DBMS
§ DBMS wants/needs to do things “its own way”
– Specialized prefetching
– Control over buffer replacement policy
• LRU not always best (sometimes worst!!)
– Control over thread/process scheduling
• “Convoy problem”: arises when OS scheduling conflicts with DBMS locking
– Control over flushing data to disk
• WAL protocol requires flushing log entries to disk
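The remark that LRU is "sometimes worst" can be illustrated with a repeated sequential scan over slightly more pages than the buffer holds. The sketch below is only a toy simulation: the function name `count_misses`, the frame count, and the access pattern are assumptions made for illustration, and MRU is used merely as a contrasting policy, not as something the lecture prescribes.

```python
from collections import OrderedDict

def count_misses(policy, num_frames, accesses):
    """Count buffer misses under 'LRU' or 'MRU' replacement (toy model)."""
    frames = OrderedDict()  # pages ordered from least to most recently used
    misses = 0
    for page in accesses:
        if page in frames:
            frames.move_to_end(page)  # mark as most recently used
            continue
        misses += 1
        if len(frames) >= num_frames:
            # last=True evicts the most recently used page, last=False the least
            frames.popitem(last=(policy == "MRU"))
        frames[page] = None
    return misses

# Assumed workload: 4 full scans over 5 pages with only 4 buffer frames.
scan = list(range(5)) * 4
lru_misses = count_misses("LRU", 4, scan)
mru_misses = count_misses("MRU", 4, scan)
print(lru_misses, mru_misses)
```

With this pattern LRU always evicts exactly the page that will be needed next, so every access misses, which is the kind of behavior a DBMS can avoid when it controls its own replacement policy.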
Disks and Files
§ DBMS stores information on disks.
– but: disks are (relatively) VERY slow!
§ Major implications for DBMS design:
– READ: disk -> main memory (RAM).
– WRITE: reverse
– Both are high-cost operations, relative to in-memory operations, so must be planned carefully!
Why Not Store It All in Main Memory?
§ Costs too much.
– disk: ~$1/GB; memory: ~$100/GB
– High-end databases today are in the 10-100 TB range.
– Approx. 60% of the cost of a production system is in the disks.
§ Main memory is volatile.
§ Note: some specialized systems do store the entire database in main memory.
The Storage Hierarchy
– Main memory (RAM) for currently used data.
– Disk for the main database (secondary storage).
– Tapes for archiving older versions of the data (tertiary storage).
(Figure: hierarchy from “Smaller, Faster” to “Bigger, Slower”: Registers, L1 Cache, Main Memory, Magnetic Disk, Magnetic Tape, ...)
Jim Gray’s Storage Latency Analogy: How Far Away is the Data?
(Figure: relative access time of each storage level, with a distance analogy)
Registers: 1 (My Head, 1 min)
On Chip Cache: 2 (This Room)
On Board Cache: 10 (This Building, 10 min)
Memory: 100 (Boston, 1.5 hr)
Disk: 10^6 (Pluto, 2 Years)
Tape: 10^9 (Andromeda, 2,000 Years)
Disks
§ Secondary storage device of choice.
§ Main advantage over tapes: random access vs. sequential.
§ Data is stored and retrieved in units called disk blocks or pages.
§ Unlike RAM, time to retrieve a disk page varies depending upon location on disk.
– relative placement of pages on disk is important!
Anatomy of a Disk
(Figure labels: Spindle, Platters, Tracks, Sector, Disk head, Arm assembly, Arm movement)
• Sector
• Track
• Cylinder
• Platter
• Block size = multiple of sector size (which is fixed)
Accessing a Disk Page
§ Time to access (read/write) a disk block:
– seek time: moving arms to position disk head on track
– rotational delay: waiting for block to rotate under head
– transfer time: actually moving data to/from disk surface
§ Relative times?
– seek time: about 1 to 20 msec
– rotational delay: 0 to 10 msec
– transfer time: < 1 msec per 4KB page
(Figure: timeline of one access: Seek, Rotate, Transfer)
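As a worked example of the three components, the sketch below adds them up for one page. The specific numbers (a 6000 RPM disk, 10 msec seek, 0.5 msec transfer) are assumptions picked from inside the ranges on the slide, not measurements of any particular drive.

```python
def avg_rotational_delay_ms(rpm):
    """Average rotational delay: half a revolution, in milliseconds."""
    ms_per_revolution = 60_000 / rpm
    return ms_per_revolution / 2

def access_time_ms(seek_ms, rotational_ms, transfer_ms):
    """Total time to access one disk page."""
    return seek_ms + rotational_ms + transfer_ms

# Assumed values: 6000 RPM gives a 10 msec revolution (0 to 10 msec delay).
rotation = avg_rotational_delay_ms(6000)
total = access_time_ms(seek_ms=10.0, rotational_ms=rotation, transfer_ms=0.5)
mechanical_share = (10.0 + rotation) / total
print(rotation, total, round(mechanical_share, 3))
```

Under these assumptions the seek and rotational delay together account for well over 90% of the access time.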
Seek time & rotational delay dominate
§ Key to lower I/O cost: reduce seek/rotation delays!
§ Also note: For shared disks, much time is spent waiting in queue for access to arm/controller
Arranging Pages on Disk
§ “Next” block concept:
– blocks on same track, followed by
– blocks on same cylinder, followed by
– blocks on adjacent cylinder
§ Accessing ‘next’ block is cheap
§ A useful optimization: pre-fetching
– See textbook page 323
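Why accessing the "next" block is cheap can be seen with a rough cost model: a sequential read pays the seek and rotational delay once, while a random read pays them for every page. The constants below are assumed illustrative values, and the model ignores track and cylinder switches.

```python
# Assumed per-operation costs in milliseconds (illustrative only).
SEEK_MS = 10.0
ROTATE_MS = 5.0
TRANSFER_MS = 0.5

def random_read_ms(num_pages):
    """Every page pays seek + rotational delay + transfer."""
    return num_pages * (SEEK_MS + ROTATE_MS + TRANSFER_MS)

def sequential_read_ms(num_pages):
    """One seek and one rotational delay, then pages stream off the track."""
    return SEEK_MS + ROTATE_MS + num_pages * TRANSFER_MS

pages = 100
print(random_read_ms(pages), sequential_read_ms(pages))
```

The exact ratio depends entirely on the assumed constants; the point is only that the gap grows with the number of pages read.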
Rules of thumb…
1. Memory access is much faster than disk I/O (~1000x)
2. “Sequential” I/O is faster than “random” I/O (~10x)
Conclusions: Storing
§ Memory hierarchy
§ Disks: (>1000x slower), thus
– pack info in blocks
– try to fetch nearby blocks (sequentially)
TREE INDEXES
Declaring Indexes
§ No standard!
§ Typical syntax:
CREATE INDEX StudentsInd ON Students(ID);
CREATE INDEX CoursesInd ON Courses(Number, DeptName);
Types of Indexes
§ Primary: index on a key
– Used to enforce constraints
§ Secondary: index on non-key attribute
§ Clustering: order of the rows in the data pages corresponds to the order of the rows in the index
– Only one clustered index can exist in a given table
– Useful for range predicates
§ Non-clustering: physical order not the same as index order
Using Indexes (1): Equality Searches
§ Given a value v, the index takes us to only those tuples that have v in the attribute(s) of the index.
§ E.g. (use CoursesInd index)
SELECT Enrollment FROM Courses WHERE Number = '4604' AND DeptName = 'CS';
§ Can use hashes, but see next
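The index declaration and the equality query can be tried end to end with Python's built-in `sqlite3` module. This is a minimal sketch: the column types and the sample rows (including the enrollment numbers) are made up for illustration, and SQLite is just one convenient engine, not the system used in the course.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Courses (Number TEXT, DeptName TEXT, Enrollment INTEGER)"
)
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO Courses VALUES (?, ?, ?)",
    [("4604", "CS", 80), ("2114", "CS", 150), ("4604", "MATH", 25)],
)
conn.execute("CREATE INDEX CoursesInd ON Courses(Number, DeptName)")

query = "SELECT Enrollment FROM Courses WHERE Number = ? AND DeptName = ?"
rows = conn.execute(query, ("4604", "CS")).fetchall()

# Ask SQLite how it would run the query; the wording of the plan text
# varies between SQLite versions, so it is only printed here.
plan = conn.execute("EXPLAIN QUERY PLAN " + query, ("4604", "CS")).fetchall()
index_names = [
    r[0] for r in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
]
print(rows, index_names)
print(plan)
```

Note that SQL string literals use single quotes; the placeholders (`?`) sidestep quoting altogether.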
Using Indexes (2): Range Searches
§ “Find all students with gpa > 3.0”
§ may be slow, even on sorted file
§ Hashes not a good idea!
§ What to do?
§ Solution: Create an ‘index’ file.
(Figure: an Index File with entries k1, k2, …, kN, each pointing to one page of the Data File: Page 1, Page 2, Page 3, …, Page N)
§ More details:
§ if index file is small, do binary search there
§ Otherwise??
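The "small index file plus binary search" idea can be sketched as follows. The page contents are hypothetical gpa values, the index file is modeled as the list of each page's first key, and `range_search_greater` is a name introduced only for this example.

```python
from bisect import bisect_right

# Hypothetical sorted data file: each inner list is one page of gpa values.
data_pages = [[2.0, 2.4, 2.9], [3.0, 3.1, 3.3], [3.5, 3.7, 3.8], [3.9, 4.0]]

# Index file: the first key of every page (k1 .. kN in the figure).
index_file = [page[0] for page in data_pages]

def range_search_greater(threshold):
    """Return all values > threshold, starting at the page found by binary search."""
    # Last page whose first key is <= threshold; earlier pages hold
    # only values that are <= threshold, so they can be skipped.
    start = max(bisect_right(index_file, threshold) - 1, 0)
    result = []
    for page in data_pages[start:]:
        result.extend(v for v in page if v > threshold)
    return result

print(range_search_greater(3.0))
```

If the index file itself grows too large for a binary search to be cheap, the same trick can be applied to the index file again, which is the intuition behind tree-structured indexes.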
B-trees
§ the most successful family of index schemes (B-trees, B+-trees, B*-trees)
§ Can be used for primary/secondary, clustering/non-clustering index.
§ balanced “n-way” search trees
§ Original Paper: R. Bayer and E. M. McCreight. Organization and Maintenance of Large Ordered Indexes. Acta Informatica 1, 173-189, 1972.
B-trees
§ E.g., B-tree of order d=1:
(Figure: root node with keys 6 and 9; three children: [1 3] for keys <6, [7] for keys >6 and <9, [13] for keys >9)
B-tree properties:
§ each node, in a B-tree of order d:
– Key order
– at most n=2d keys
– at least d keys (except root, which may have just 1 key)
– all leaves at the same level
– if number of pointers is k, then node has exactly k-1 keys
– (leaves are empty)
(Figure: node layout with keys v1 v2 … vn-1 and pointers p1 … pn)
Properties
§ “block aware” nodes: each node is a disk page
§ O(log(N)) for everything! (ins/del/search)
§ typically, if d = 50-100, then 2-3 levels
§ utilization >= 50%, guaranteed; on average 69%
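The "2-3 levels" claim follows from simple arithmetic. A completely full B-tree of order d has 2d keys and 2d+1 children per node, so with h levels it holds (2d+1)^h - 1 keys. The sketch below computes this best case; `max_keys` and `levels_needed` are names introduced for the example, and real trees with partly filled nodes may need one more level.

```python
def max_keys(d, levels):
    """Keys held by a completely full B-tree of order d with the given number of levels."""
    return (2 * d + 1) ** levels - 1

def levels_needed(d, num_keys):
    """Smallest number of levels whose full tree can hold num_keys keys."""
    levels = 1
    while max_keys(d, levels) < num_keys:
        levels += 1
    return levels

print(max_keys(50, 3))                # 1030300
print(max_keys(100, 3))               # 8120600
print(levels_needed(50, 1_000_000))   # 3
```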
Queries
§ Algo for exact match query? (e.g., ssn=8?)
(Figure: B-tree of order d=1: root [6 9]; children [1 3] for keys <6, [7] for keys >6 and <9, [13] for keys >9)
§ Java animation: http://slady.net/java/bt/
§ H steps (= disk accesses)
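An exact match search on the example tree can be sketched as below. The `Node` class and the `search` function are illustrative names; each visited node stands for one disk access, so the count returned is at most the height H of the tree.

```python
class Node:
    """A B-tree node: sorted keys plus child pointers (none for a leaf)."""
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []

def search(node, key):
    """Return (found, nodes_visited) for an exact match query."""
    visited = 0
    while node is not None:
        visited += 1
        i = 0
        # Skip every key smaller than the search key.
        while i < len(node.keys) and key > node.keys[i]:
            i += 1
        if i < len(node.keys) and node.keys[i] == key:
            return True, visited
        # Descend into the subtree between keys[i-1] and keys[i], if any.
        node = node.children[i] if node.children else None
    return False, visited

# The order d=1 example tree from the slides.
root = Node([6, 9], [Node([1, 3]), Node([7]), Node([13])])
print(search(root, 8))   # (False, 2)
print(search(root, 13))  # (True, 2)
```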
Queries
§ what about range queries? (e.g., 5<salary<8)
§ Proximity/nearest neighbor searches? (e.g., salary ~ 8)
(Figure: B-tree of order d=1: root [6 9]; children [1 3] for keys <6, [7] for keys >6 and <9, [13] for keys >9)
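One way to answer a range query on a plain B-tree is an in-order traversal that skips subtrees lying entirely outside the range. The sketch below is an assumption about how this could be coded, not the lecture's own algorithm; note how the traversal has to move between parent and children to emit keys in order.

```python
class Node:
    """A B-tree node: sorted keys plus child pointers (none for a leaf)."""
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []

def range_query(node, low, high, out=None):
    """Collect all keys with low < key < high, in sorted order."""
    if out is None:
        out = []
    for i, key in enumerate(node.keys):
        # The subtree left of `key` holds only smaller keys; visit it
        # only if some of them could still exceed `low`.
        if node.children and low < key:
            range_query(node.children[i], low, high, out)
        if low < key < high:
            out.append(key)
        if key >= high:
            return out  # everything further right is too large
    if node.children:
        range_query(node.children[-1], low, high, out)
    return out

root = Node([6, 9], [Node([1, 3]), Node([7]), Node([13])])
print(range_query(root, 5, 8))  # [6, 7]
```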
Variations
§ How could we do even better than the B-trees above?
B+ trees - Motivation
§ B-tree – print keys in sorted order:
§ B-tree needs back-tracking – how to avoid it?
§ Stronger reason: for clustering index, data records are scattered:
(Figure: B-tree of order d=1: root [6 9]; children [1 3] for keys <6, [7] for keys >6 and <9, [13] for keys >9)
Solution: B+-trees
§ facilitate sequential ops
§ They string all leaf nodes together
§ AND
§ replicate keys from non-leaf nodes, to make sure every key appears at the leaf level
§ (vital, for clustering index!)
B+ trees
(Figure: Index Pages: root with keys 6 and 9. Data Pages: leaves linked left to right: [1 3] for keys <6, [6 7] for keys >=6 and <9, [9 13] for keys >=9)
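With every key present at the leaf level and the leaves chained together, a range scan needs one descent followed by a walk along the leaves, with no back-tracking. The classes `Leaf` and `Index` and the function names below are hypothetical, chosen only to mirror the figure.

```python
class Leaf:
    """A data page: sorted keys plus a pointer to the next leaf."""
    def __init__(self, keys):
        self.keys = keys
        self.next = None

class Index:
    """An index page: separator keys and child pointers."""
    def __init__(self, keys, children):
        self.keys = keys
        self.children = children

def find_leaf(node, key):
    """Descend from the root to the leaf that would contain key."""
    while isinstance(node, Index):
        i = 0
        # Keys >= a separator live to its right (the '>=6', '>=9' labels).
        while i < len(node.keys) and key >= node.keys[i]:
            i += 1
        node = node.children[i]
    return node

def range_scan(root, low, high):
    """Return all keys with low <= key <= high by following the leaf chain."""
    leaf = find_leaf(root, low)
    out = []
    while leaf is not None:
        for k in leaf.keys:
            if k > high:
                return out
            if k >= low:
                out.append(k)
        leaf = leaf.next
    return out

# The example B+ tree from the slides.
leaves = [Leaf([1, 3]), Leaf([6, 7]), Leaf([9, 13])]
for left, right in zip(leaves, leaves[1:]):
    left.next = right
root = Index([6, 9], leaves)
print(range_scan(root, 5, 9))  # [6, 7, 9]
```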
B+ trees
§ More details: next (and textbook)
§ In short: on split
– at leaf level: COPY middle key upstairs
– at non-leaf level: push middle key upstairs (as in plain B-tree)
Example B+ Tree
§ Search begins at root, and key comparisons direct it to a leaf
§ Search for 5*, 15*, all data entries >= 24* ...
(Figure: root with keys 13, 17, 24, 30; leaves [2* 3* 5* 7*], [14* 16*], [19* 20* 22*], [24* 27* 29*], [33* 34* 38* 39*])
Based on the search for 15*, we know it is not in the tree!
Inserting a Data Entry into a B+ Tree
§ Find correct leaf L.
§ Put data entry onto L.
– If L has enough space, done!
– Else, must split L (into L and a new node L2)
• Redistribute entries evenly, copy up middle key.
§ parent node may overflow
– but then: push up middle key. Splits “grow” tree; root split increases height.
Example B+ Tree – Inserting 30*
(Figure: root with keys 13, 17, 24; leaves [2* 3* 5* 7*], [14* 16*], [19* 20* 22*], [24* 27* 29*]; new entries shown: 23*, 30*)
Example B+ Tree - Inserting 8*
(Figure, before: root with keys 13, 17, 24; leaves [2* 3* 5* 7*], [14* 16*], [19* 20* 22* 23*], [24* 27* 29*])
No Space in leaf [2* 3* 5* 7*]. So Split! The leaf becomes [2* 3*] and [5* 7* 8*].
And then push middle UP (key 5; because this is a leaf split, 5* also stays in the leaf).
(Figure, Final State: root with keys 5, 13, 17, 24; leaves [2* 3*] for keys <5, [5* 7* 8*] for keys >=5, [14* 16*], [19* 20* 22* 23*], [24* 27* 29*])
Example B+ Tree - Inserting 21*
(Figure, before: root with keys 5, 13, 17, 24; leaves [2* 3*], [5* 7* 8*], [14* 16*], [19* 20* 22* 23*], [24* 27* 29*])
(Figure, after the leaf split: leaves [19* 20*] and [21* 22* 23*]; root keys would become 5, 13, 17, 21, 24)
Root is Full, so split recursively
Example B+ Tree: Recursive split
• Notice that root was also split, increasing height.
(Figure, after: root with key 17; index nodes [5 13] and [21 24]; leaves [2* 3*], [5* 7* 8*], [14* 16*], [19* 20*], [21* 22* 23*], [24* 27* 29*])
Example: Data vs. Index Page Split
§ leaf: ‘copy’
§ non-leaf: ‘push’
§ why not ‘copy’ @ non-leaves?
(Figure, Data Page Split: leaf [2* 3* 5* 7*] plus 8* becomes [2* 3*] and [5* 7* 8*]; key 5 is copied up and 5* stays in the leaf)
(Figure, Index Page Split: index keys 5, 13, 17, 21, 24 become [5 13] and [21 24]; key 17 is pushed up and appears only in the parent)
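The difference between "copy" and "push" can be made concrete with two small helper functions. They work on plain sorted lists of keys and leave out child pointers and the leaf chain, so they are a sketch of the key movement only; `split_leaf` and `split_index` are names introduced for this example.

```python
def split_leaf(entries):
    """Split an overfull, sorted leaf. The middle key is COPIED up:
    it stays in the right half and is also returned for the parent."""
    mid = len(entries) // 2
    left, right = entries[:mid], entries[mid:]
    return left, right, right[0]

def split_index(keys):
    """Split an overfull, sorted index node. The middle key is PUSHED up:
    it is removed from both halves and returned for the parent."""
    mid = len(keys) // 2
    return keys[:mid], keys[mid + 1:], keys[mid]

# Data page split from the slide: leaf [2 3 5 7] receives 8.
print(split_leaf(sorted([2, 3, 5, 7] + [8])))   # ([2, 3], [5, 7, 8], 5)
# Index page split from the slide: keys 5, 13, 17, 21, 24.
print(split_index([5, 13, 17, 21, 24]))         # ([5, 13], [21, 24], 17)
```

A plausible answer to the question on the slide: a leaf key must stay at the leaf level because that is where the data entry lives, whereas an index key only routes searches, so keeping a second copy of it one level down would waste space without helping any lookup.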
![Page 69: CS 4604: Introduction to Database Management Systemscourses.cs.vt.edu/~cs4604/Fall18/lectures/lecture-8.pdf§ Extra office hours till midterm – Check Piazza post Prakash 2018 VT](https://reader033.fdocuments.us/reader033/viewer/2022042010/5e71fef9aa21a56404626eaa/html5/thumbnails/69.jpg)
Same Inserting 21*: The Deferred Split
Root: [5 | 13 | 17 | 24]
Leaves: [2* 3*] [5* 7* 8*] [14* 16*] [19* 20* 22* 23*] [24* 27* 29*]
Note: the sibling leaf [24* 27* 29*] has free space. So…
![Page 70: CS 4604: Introduction to Database Management Systemscourses.cs.vt.edu/~cs4604/Fall18/lectures/lecture-8.pdf§ Extra office hours till midterm – Check Piazza post Prakash 2018 VT](https://reader033.fdocuments.us/reader033/viewer/2022042010/5e71fef9aa21a56404626eaa/html5/thumbnails/70.jpg)
Inserting 21*: The Deferred Split
Before:
Root: [5 | 13 | 17 | 24]
Leaves: [2* 3*] [5* 7* 8*] [14* 16*] [19* 20* 22* 23*] [24* 27* 29*]
LEND keys to sibling, through PARENT!
After:
Root: [5 | 13 | 17 | 23]
Leaves: [2* 3*] [5* 7* 8*] [14* 16*] [19* 20* 21* 22*] [23* 24* 27* 29*]
![Page 71: CS 4604: Introduction to Database Management Systemscourses.cs.vt.edu/~cs4604/Fall18/lectures/lecture-8.pdf§ Extra office hours till midterm – Check Piazza post Prakash 2018 VT](https://reader033.fdocuments.us/reader033/viewer/2022042010/5e71fef9aa21a56404626eaa/html5/thumbnails/71.jpg)
Inserting 21*: The Deferred Split
Shorter, more packed, faster tree
Root: [5 | 13 | 17 | 23]
Leaves: [2* 3*] [5* 7* 8*] [14* 16*] [19* 20* 21* 22*] [23* 24* 27* 29*]
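A rough Python sketch of the deferred split follows. It rests on several assumptions that are not in the slides: leaves are sorted Python lists, `capacity` is the maximum number of entries per leaf (4 in this example), only the right sibling is tried, and the function name `insert_with_deferred_split` is hypothetical.

```python
import bisect


def insert_with_deferred_split(leaf, right_sibling, key, capacity):
    """Insert key into leaf; if the leaf overflows, lend its largest entry
    to the right sibling instead of splitting.

    Returns the separator the parent should now hold for the sibling,
    or None if the sibling is full and a regular split is needed.
    """
    if len(right_sibling) >= capacity:
        return None                      # no free space next door: caller must split
    bisect.insort(leaf, key)             # keep the leaf sorted
    if len(leaf) > capacity:
        right_sibling.insert(0, leaf.pop())   # largest entry moves to the sibling
    return right_sibling[0]              # parent separator = smallest key on the right


leaf, sibling = [19, 20, 22, 23], [24, 27, 29]
new_separator = insert_with_deferred_split(leaf, sibling, 21, capacity=4)
print(leaf, sibling, new_separator)
```

With the values from the example, the parent's separator changes from 24 to 23 and no new node is created, which is what keeps the tree shorter and more densely packed.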
![Page 72: CS 4604: Introduction to Database Management Systemscourses.cs.vt.edu/~cs4604/Fall18/lectures/lecture-8.pdf§ Extra office hours till midterm – Check Piazza post Prakash 2018 VT](https://reader033.fdocuments.us/reader033/viewer/2022042010/5e71fef9aa21a56404626eaa/html5/thumbnails/72.jpg)
Insertion examples for you to try
Root index entries: 5, 13, 20, 30, … (not shown)
Leaves: [2* 3*] [5* 7* 8* 11*] [14* 16*] [21* 22* 23*]
Insert the following data entries (in order): 28*, 6*, 25*
![Page 73: CS 4604: Introduction to Database Management Systemscourses.cs.vt.edu/~cs4604/Fall18/lectures/lecture-8.pdf§ Extra office hours till midterm – Check Piazza post Prakash 2018 VT](https://reader033.fdocuments.us/reader033/viewer/2022042010/5e71fef9aa21a56404626eaa/html5/thumbnails/73.jpg)
Answer…
After inserting 28*, 6*
Index entries: 5, 7, 13, 20, 30, …
Leaves: [2* 3*] [5* 6*] [7* 8* 11*] [14* 16*] [21* 22* 23* 28*]
After inserting 25*
![Page 74: CS 4604: Introduction to Database Management Systemscourses.cs.vt.edu/~cs4604/Fall18/lectures/lecture-8.pdf§ Extra office hours till midterm – Check Piazza post Prakash 2018 VT](https://reader033.fdocuments.us/reader033/viewer/2022042010/5e71fef9aa21a56404626eaa/html5/thumbnails/74.jpg)
Answer…
After inserting 25*
Root: [13]
Index nodes: [5 | 7] [20 | 23 | 30 | …]
Leaves: [2* 3*] [5* 6*] [7* 8* 11*] [14* 16*] [21* 22*] [23* 25* 28*]
![Page 75: CS 4604: Introduction to Database Management Systemscourses.cs.vt.edu/~cs4604/Fall18/lectures/lecture-8.pdf§ Extra office hours till midterm – Check Piazza post Prakash 2018 VT](https://reader033.fdocuments.us/reader033/viewer/2022042010/5e71fef9aa21a56404626eaa/html5/thumbnails/75.jpg)
Deleting a Data Entry from a B+ Tree
§ Start at root, find leaf L where entry belongs.
§ Remove the entry.
  – If L is at least half-full, done!
  – If L underflows
    • Try to re-distribute, borrowing from sibling (adjacent node with same parent as L).
    • If re-distribution fails, merge L and sibling.
      – update parent
      – and possibly merge, recursively
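The leaf-level part of these steps can be sketched as below. This is only an illustration under assumptions of my own: leaves are sorted lists, the sibling is always the right one, `min_entries` is the half-full threshold d, and the function name and return tags are hypothetical. Updating the parent and merging recursively are left to the caller.

```python
def delete_from_leaf(leaf, right_sibling, key, min_entries):
    """Delete key from leaf, then fix an underflow if one occurs.

    Returns (action, new_separator):
      ("done", None)            leaf is still at least half-full
      ("redistributed", k)      borrowed from the sibling; parent separator becomes k
      ("merged", None)          sibling emptied into leaf; parent must drop a separator
    """
    leaf.remove(key)
    if len(leaf) >= min_entries:
        return "done", None
    if len(right_sibling) > min_entries:
        leaf.append(right_sibling.pop(0))    # borrow the sibling's smallest entry
        return "redistributed", right_sibling[0]
    leaf.extend(right_sibling)               # re-distribution fails: merge
    right_sibling.clear()
    return "merged", None


# Leaf values from this lecture's deletion example, with d = 2.
leaf, sibling = [19, 20, 22], [24, 27, 29]
for key in (19, 20, 24):
    print(key, delete_from_leaf(leaf, sibling, key, min_entries=2), leaf, sibling)
```

A "merged" result is the case where the parent may itself underflow, so a full implementation would repeat the same check one level up.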
![Page 76: CS 4604: Introduction to Database Management Systemscourses.cs.vt.edu/~cs4604/Fall18/lectures/lecture-8.pdf§ Extra office hours till midterm – Check Piazza post Prakash 2018 VT](https://reader033.fdocuments.us/reader033/viewer/2022042010/5e71fef9aa21a56404626eaa/html5/thumbnails/76.jpg)
Deletion from B+ Tree
(1)
Root: [17]
Index nodes: [5 | 13] [24 | 30]
Leaves: [2* 3*] [5* 7* 8*] [14* 16*] [19* 20* 22*] [24* 27* 29*] [33* 34* 38* 39*]
![Page 77: CS 4604: Introduction to Database Management Systemscourses.cs.vt.edu/~cs4604/Fall18/lectures/lecture-8.pdf§ Extra office hours till midterm – Check Piazza post Prakash 2018 VT](https://reader033.fdocuments.us/reader033/viewer/2022042010/5e71fef9aa21a56404626eaa/html5/thumbnails/77.jpg)
Example: Delete 19* & 20*
(1) Before:
Root: [17]
Index nodes: [5 | 13] [24 | 30]
Leaves: [2* 3*] [5* 7* 8*] [14* 16*] [19* 20* 22*] [24* 27* 29*] [33* 34* 38* 39*]
(2) Deleting 19* is easy: the leaf becomes [20* 22*].
(3) • Deleting 20* -> re-distribution (notice: 27 copied up)
Root: [17]
Index nodes: [5 | 13] [27 | 30]
Leaves: [2* 3*] [5* 7* 8*] [14* 16*] [22* 24*] [27* 29*] [33* 34* 38* 39*]
![Page 78: CS 4604: Introduction to Database Management Systemscourses.cs.vt.edu/~cs4604/Fall18/lectures/lecture-8.pdf§ Extra office hours till midterm – Check Piazza post Prakash 2018 VT](https://reader033.fdocuments.us/reader033/viewer/2022042010/5e71fef9aa21a56404626eaa/html5/thumbnails/78.jpg)
... And Then Deleting 24*
(3) Before:
Root: [17]
Index nodes: [5 | 13] [27 | 30]
Leaves: [2* 3*] [5* 7* 8*] [14* 16*] [22* 24*] [27* 29*] [33* 34* 38* 39*]
• Must merge leaves: OPPOSITE of insert
(4) After:
Root: [17]
Index nodes: [5 | 13] [30]
Leaves: [2* 3*] [5* 7* 8*] [14* 16*] [22* 27* 29*] [33* 34* 38* 39*]
![Page 79: CS 4604: Introduction to Database Management Systemscourses.cs.vt.edu/~cs4604/Fall18/lectures/lecture-8.pdf§ Extra office hours till midterm – Check Piazza post Prakash 2018 VT](https://reader033.fdocuments.us/reader033/viewer/2022042010/5e71fef9aa21a56404626eaa/html5/thumbnails/79.jpg)
... And Then Deleting 24*
• Must merge leaves: OPPOSITE of insert
(4) After:
Root: [17]
Index nodes: [5 | 13] [30]
Leaves: [2* 3*] [5* 7* 8*] [14* 16*] [22* 27* 29*] [33* 34* 38* 39*]
… but are we done??
![Page 80: CS 4604: Introduction to Database Management Systemscourses.cs.vt.edu/~cs4604/Fall18/lectures/lecture-8.pdf§ Extra office hours till midterm – Check Piazza post Prakash 2018 VT](https://reader033.fdocuments.us/reader033/viewer/2022042010/5e71fef9aa21a56404626eaa/html5/thumbnails/80.jpg)
... Merge Non-Leaf Nodes, Shrink Tree
(4) Before:
Root: [17]
Index nodes: [5 | 13] [30]
Leaves: [2* 3*] [5* 7* 8*] [14* 16*] [22* 27* 29*] [33* 34* 38* 39*]
(5) After:
Root: [5 | 13 | 17 | 30]
Leaves: [2* 3*] [5* 7* 8*] [14* 16*] [22* 27* 29*] [33* 34* 38* 39*]
![Page 81: CS 4604: Introduction to Database Management Systemscourses.cs.vt.edu/~cs4604/Fall18/lectures/lecture-8.pdf§ Extra office hours till midterm – Check Piazza post Prakash 2018 VT](https://reader033.fdocuments.us/reader033/viewer/2022042010/5e71fef9aa21a56404626eaa/html5/thumbnails/81.jpg)
Example of Non-leaf Re-distribution
§ Tree is shown below during deletion of 24*.
§ Now, we can re-distribute keys
Root: [22]
Index nodes: [5 | 13 | 17 | 20] [30]
Leaves: [2* 3*] [5* 7* 8*] [14* 16*] [17* 18*] [20* 21*] [22* 27* 29*] [33* 34* 38* 39*]
![Page 82: CS 4604: Introduction to Database Management Systemscourses.cs.vt.edu/~cs4604/Fall18/lectures/lecture-8.pdf§ Extra office hours till midterm – Check Piazza post Prakash 2018 VT](https://reader033.fdocuments.us/reader033/viewer/2022042010/5e71fef9aa21a56404626eaa/html5/thumbnails/82.jpg)
After Re-distribution
§ need only re-distribute ‘20’; did ‘17’, too
§ why would we want to re-distribute more keys? Ans: reduces likelihood of split (see Book, pg. 356)
Root: [17]
Index nodes: [5 | 13] [20 | 22 | 30]
Leaves: [2* 3*] [5* 7* 8*] [14* 16*] [17* 18*] [20* 21*] [22* 27* 29*] [33* 34* 38* 39*]
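Re-distribution between index nodes goes through the parent: the parent's separator comes down into the needy node and a key from the donor goes up in its place. The sketch below shows only that key rotation. It assumes keys are held in sorted lists, ignores the child pointers that would move along with the keys, and uses a hypothetical function name.

```python
def redistribute_index(left, separator, right, moves):
    """Rotate `moves` keys from the left index node to the right one
    through the parent's separator key. Returns the three new values."""
    left, right = list(left), list(right)    # work on copies
    for _ in range(moves):
        right.insert(0, separator)           # parent's key comes down on the right
        separator = left.pop()               # left node's largest key goes up
    return left, separator, right


# Index nodes from the example: [5 13 17 20] and [30], with 22 in the root.
print(redistribute_index([5, 13, 17, 20], 22, [30], moves=1))
print(redistribute_index([5, 13, 17, 20], 22, [30], moves=2))
```

One move is enough to repair the underflow; the second move is the optional extra step that evens out the two nodes.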
![Page 83: CS 4604: Introduction to Database Management Systemscourses.cs.vt.edu/~cs4604/Fall18/lectures/lecture-8.pdf§ Extra office hours till midterm – Check Piazza post Prakash 2018 VT](https://reader033.fdocuments.us/reader033/viewer/2022042010/5e71fef9aa21a56404626eaa/html5/thumbnails/83.jpg)
Main observations for deletion
§ If a key value appears twice (leaf + nonleaf), the above algorithms delete it from the leaf, only
§ why not non-leaf, too?
![Page 84: CS 4604: Introduction to Database Management Systemscourses.cs.vt.edu/~cs4604/Fall18/lectures/lecture-8.pdf§ Extra office hours till midterm – Check Piazza post Prakash 2018 VT](https://reader033.fdocuments.us/reader033/viewer/2022042010/5e71fef9aa21a56404626eaa/html5/thumbnails/84.jpg)
Main observations for deletion
§ If a key value appears twice (leaf + nonleaf), the above algorithms delete it from the leaf, only
§ why not non-leaf, too?
§ ‘lazy deletions’ - in fact, some vendors just mark entries as deleted (~ underflow),
  – and reorganize/compact later
![Page 85: CS 4604: Introduction to Database Management Systemscourses.cs.vt.edu/~cs4604/Fall18/lectures/lecture-8.pdf§ Extra office hours till midterm – Check Piazza post Prakash 2018 VT](https://reader033.fdocuments.us/reader033/viewer/2022042010/5e71fef9aa21a56404626eaa/html5/thumbnails/85.jpg)
Recap: main ideas
§ on overflow, split (and ‘push’, or ‘copy’)
  – or consider deferred split
§ on underflow, borrow keys; or merge
  – or let it underflow...
![Page 86: CS 4604: Introduction to Database Management Systemscourses.cs.vt.edu/~cs4604/Fall18/lectures/lecture-8.pdf§ Extra office hours till midterm – Check Piazza post Prakash 2018 VT](https://reader033.fdocuments.us/reader033/viewer/2022042010/5e71fef9aa21a56404626eaa/html5/thumbnails/86.jpg)
B+ Trees in Practice
§ Typical order: 100. Typical fill-factor: 67%.
  – average fanout = 2*100*0.67 = 134
§ Typical capacities:
  – Height 4: 134^4 = 322,417,936 entries
  – Height 3: 134^3 = 2,406,104 entries
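The capacity figures follow directly from the fanout. A small check in Python, assuming a fanout of 134 at every level and using integer arithmetic for the 67% fill factor to avoid floating-point rounding (the constant names are my own):

```python
ORDER = 100          # d: a node holds between d and 2d entries
FILL_PERCENT = 67    # typical fill factor, in percent

fanout = 2 * ORDER * FILL_PERCENT // 100    # 2 * 100 * 0.67 = 134

for height in (3, 4):
    print(f"height {height}: {fanout ** height:,} entries")
```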
![Page 87: CS 4604: Introduction to Database Management Systemscourses.cs.vt.edu/~cs4604/Fall18/lectures/lecture-8.pdf§ Extra office hours till midterm – Check Piazza post Prakash 2018 VT](https://reader033.fdocuments.us/reader033/viewer/2022042010/5e71fef9aa21a56404626eaa/html5/thumbnails/87.jpg)
B+ Trees in Practice
§ Can often keep top levels in buffer pool:
  – Level 1 = 1 page = 8 KB
  – Level 2 = 134 pages = 1 MB
  – Level 3 = 17,956 pages = 140 MB
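The page counts per level are powers of the fanout, and the sizes are those counts times the page size. A short sketch, assuming 8 KB pages and a fanout of 134 as on the slide (the helper name `pages_at_level` is hypothetical):

```python
PAGE_KB = 8      # page size in KB
FANOUT = 134     # average number of children per index page


def pages_at_level(level):
    """Number of pages at a given level, counting the root as level 1."""
    return FANOUT ** (level - 1)


for level in (1, 2, 3):
    pages = pages_at_level(level)
    print(f"level {level}: {pages:,} pages = {pages * PAGE_KB:,} KB")
```

The slide's 1 MB and 140 MB are these KB totals rounded, so the top three levels together need roughly 141 MB of buffer pool.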
![Page 88: CS 4604: Introduction to Database Management Systemscourses.cs.vt.edu/~cs4604/Fall18/lectures/lecture-8.pdf§ Extra office hours till midterm – Check Piazza post Prakash 2018 VT](https://reader033.fdocuments.us/reader033/viewer/2022042010/5e71fef9aa21a56404626eaa/html5/thumbnails/88.jpg)
Bulk Loading of a B+ Tree
§ In an empty tree, insert many keys
§ Why not one-at-a-time?
  – Too slow!
![Page 89: CS 4604: Introduction to Database Management Systemscourses.cs.vt.edu/~cs4604/Fall18/lectures/lecture-8.pdf§ Extra office hours till midterm – Check Piazza post Prakash 2018 VT](https://reader033.fdocuments.us/reader033/viewer/2022042010/5e71fef9aa21a56404626eaa/html5/thumbnails/89.jpg)
Bulk Loading of a B+ Tree
§ Initialization: Sort all data entries
§ scan list; whenever enough for a page, pack
§ <repeat for upper level>
Root: (no entries yet)
Sorted pages of data entries; not yet in B+ tree:
[3* 4*] [6* 9*] [10* 11*] [12* 13*] [20* 22*] [23* 31*] [35* 36*] [38* 41*] [44*]
![Page 90: CS 4604: Introduction to Database Management Systemscourses.cs.vt.edu/~cs4604/Fall18/lectures/lecture-8.pdf§ Extra office hours till midterm – Check Piazza post Prakash 2018 VT](https://reader033.fdocuments.us/reader033/viewer/2022042010/5e71fef9aa21a56404626eaa/html5/thumbnails/90.jpg)
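Bulk Loading of a B+ Tree
Stage 1:
Root: [10 | 20]
Index nodes: [6] [12] [23 | 35]
Data entry pages: [3* 4*] [6* 9*] [10* 11*] [12* 13*] [20* 22*] [23* 31*] [35* 36*] [38* 41*] [44*]
Data entry pages not yet in B+ tree: [38* 41*] [44*]
Stage 2:
Root: [20]
Index nodes: [10] [35]
Index nodes (level below): [6] [12] [23] [38]
Data entry pages: [3* 4*] [6* 9*] [10* 11*] [12* 13*] [20* 22*] [23* 31*] [35* 36*] [38* 41*] [44*]
Data entry pages not yet in B+ tree: [44*]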
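The sort, pack, and repeat-for-upper-level recipe can be sketched as follows. This is a simplified illustration, not the exact procedure in the figures: it packs every page completely full, one level at a time, whereas the figures grow the index by inserting into the rightmost index page and splitting it. The name `bulk_load` and the list-of-levels representation are assumptions made for the sketch.

```python
def bulk_load(keys, page_size):
    """Build a B+ tree bottom-up. Returns a list of levels, leaves first;
    each level is a list of pages, each page a list of keys."""
    entries = sorted(keys)                                   # initialization: sort
    level = [entries[i:i + page_size]                        # pack full leaf pages
             for i in range(0, len(entries), page_size)]
    lows = [page[0] for page in level]                       # smallest key under each page
    levels = [level]
    fanout = page_size + 1                                   # k keys separate k + 1 children
    while len(level) > 1:                                    # repeat for the upper level
        parents, parent_lows = [], []
        for i in range(0, len(level), fanout):
            group = lows[i:i + fanout]
            parents.append(group[1:])       # separator = low key of each child but the first
            parent_lows.append(group[0])
        level, lows = parents, parent_lows
        levels.append(level)
    return levels


data = [3, 4, 6, 9, 10, 11, 12, 13, 20, 22, 23, 31, 35, 36, 38, 41, 44]
for depth, pages in enumerate(bulk_load(data, page_size=2)):
    print(depth, pages)
```

Because pages are packed full, the resulting tree differs in shape from the one in the figures, and a trailing group with a single child yields an index page with no keys; a real loader would leave some free space per page and rebalance the last pages of each level.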
![Page 91: CS 4604: Introduction to Database Management Systemscourses.cs.vt.edu/~cs4604/Fall18/lectures/lecture-8.pdf§ Extra office hours till midterm – Check Piazza post Prakash 2018 VT](https://reader033.fdocuments.us/reader033/viewer/2022042010/5e71fef9aa21a56404626eaa/html5/thumbnails/91.jpg)
A Note on ‘Order’
§ Order (d) concept replaced by physical space criterion in practice (‘at least half-full’).
§ Many real systems are even sloppier than this: they allow underflow, and only reclaim space when a page is completely empty.
§ (what are the benefits of such ‘sloppiness’?)
![Page 92: CS 4604: Introduction to Database Management Systemscourses.cs.vt.edu/~cs4604/Fall18/lectures/lecture-8.pdf§ Extra office hours till midterm – Check Piazza post Prakash 2018 VT](https://reader033.fdocuments.us/reader033/viewer/2022042010/5e71fef9aa21a56404626eaa/html5/thumbnails/92.jpg)
Conclusions
§ B+ tree is the prevailing indexing method
§ Excellent, O(log N) worst-case performance for ins/del/search; (~3-4 disk accesses in practice)
§ guaranteed 50% space utilization; avg 69%
![Page 93: CS 4604: Introduction to Database Management Systemscourses.cs.vt.edu/~cs4604/Fall18/lectures/lecture-8.pdf§ Extra office hours till midterm – Check Piazza post Prakash 2018 VT](https://reader033.fdocuments.us/reader033/viewer/2022042010/5e71fef9aa21a56404626eaa/html5/thumbnails/93.jpg)
Conclusions
§ Can be used for any type of index: primary/secondary, sparse (clustering), or dense (non-clustering)
§ Several fine extensions on the basic algorithm
  – deferred split;
  – bulk-loading