B-Trees And B+-Trees

Jay Yim

CS 157B

Dr. Lee

Preview

• B-Tree Indexing

• B-Tree

• B-Tree Characteristics

• B-Tree Example

• B+-Tree

• B+-Tree Characteristics

• B+-Tree Example

B-Tree Index

• The standard index used in relational databases is a B-Tree index.

• Allows rapid searching by traversing an upside-down tree structure.

• Reading a single record from a very large table using a B-Tree index can often require only a few block reads, even when the index and table are millions of blocks in size.

• Any index structure other than a B-Tree index is subject to overflow.

  – Overflow is where changes made to tables do not have records added into the original index structure, but rather tacked on the end.

What is a B-Tree?

• A B-tree is a specialized multiway tree designed especially for use on disk.

• A B-Tree consists of a root node, branch nodes and leaf nodes, containing the indexed field values in the ending (or leaf) nodes of the tree.

B-Tree Characteristics

• In a B-tree each node may contain a large number of keys.

• A B-tree is designed to branch out in a large number of directions and to contain a lot of keys in each node so that the height of the tree is relatively small.

• Constraints ensure that the tree is always balanced.

• Space wasted by deletion, if any, never becomes excessive.

• Insertions and deletions are simple processes.

  – Complicated only under special circumstances: insertion into a node that is already full, or a deletion from a node that makes it less than half full.

Characteristics of a B-Tree of Order P

• Within each node, $K_1 < K_2 < \dots < K_{p-1}$.

• Each node has at most p tree pointers.

• Each node, except the root and leaf nodes, has at least ceil(p/2) tree pointers. The root node has at least two tree pointers unless it is the only node in the tree.

• All leaf nodes are at the same level. Leaf nodes have the same structure as internal nodes except that all of their tree pointers $P_i$ are null.

B-Tree Insertion

1) A B-tree starts with a single root node (which is also a leaf node) at level 0.

2) Once the root node is full with p - 1 search key values and an attempt is made to insert another entry in the tree, the root node splits into two nodes at level 1.

3) Only the middle value is kept in the root node, and the rest of the values are split evenly between the other two nodes.

4) When a nonroot node is full and a new entry is inserted into it, that node is split into two nodes at the same level, and the middle entry is moved to the parent node along with two pointers to the new split nodes.

5) If the parent node is full, it is also split.

6) Splitting can propagate all the way to the root node, creating a new level if the root is split.
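
Steps 2 to 4 all rely on one operation: splitting an overfull node around its middle key. The following is a minimal Python sketch of that single step, assuming the keys of a node are held in a sorted list and ignoring child pointers. The function name `split_overfull` is hypothetical and used only for illustration.

```python
def split_overfull(keys):
    """Split a sorted key list that has overflowed (p keys for order p).

    Returns (left_keys, median, right_keys). The median is the key that
    moves up to the parent, as described in steps 3 and 4.
    """
    mid = len(keys) // 2
    return keys[:mid], keys[mid], keys[mid + 1:]


# Order 5: a node holds at most 4 keys, so a fifth key forces a split.
left, median, right = split_overfull(["A", "C", "G", "H", "N"])
print(left, median, right)  # ['A', 'C'] G ['H', 'N']
```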

B-Tree Deletion

1) If deletion of a value causes a node to be less than half full, it is combined with its neighboring nodes, and this can also propagate all the way to the root.

   - Can reduce the number of tree levels.

* It has been shown by analysis and simulation that, after numerous random insertions and deletions on a B-tree, the nodes are approximately 69 percent full when the number of values in the tree stabilizes. If this happens, node splitting and combining will occur only rarely, so insertion and deletion become quite efficient.

B-tree of Order 5 Example

• All internal nodes other than the root node have at least ceil(5 / 2) = ceil(2.5) = 3 children (and hence at least 2 keys).

• The maximum number of children that a node can have is 5 (so that 4 is the maximum number of keys).

• Each leaf node must contain at least 2 keys.
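
The same limits can be computed for any order. This is a small Python sketch of the arithmetic above. The helper name `btree_limits` is an assumption made for illustration, and the minimums apply to nodes other than the root.

```python
import math


def btree_limits(p):
    """Per-node limits for a B-tree of order p (minimums exclude the root)."""
    return {
        "max_children": p,
        "max_keys": p - 1,
        "min_children": math.ceil(p / 2),
        "min_keys": math.ceil(p / 2) - 1,
    }


print(btree_limits(5))
# {'max_children': 5, 'max_keys': 4, 'min_children': 3, 'min_keys': 2}
```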

B-Tree Order 5 Insertion

• Originally we have an empty B-tree of order 5.

• Want to insert C N G A H E K Q M F W L T Z D P R X Y S.

• Order 5 means that a node can have a maximum of 5 children and 4 keys.

• All nodes other than the root must have a minimum of 2 keys.

• The first 4 letters get inserted into the same node.

B-Tree Order 5 Insertion Cont.

• When we try to insert the H, we find no room in this node, so we split it into 2 nodes, moving the median item G up into a new root node.

B-Tree Order 5 Insertion Cont.

• Inserting E, K, and Q proceeds without requiring any splits.

B-Tree Order 5 Insertion Cont.

• Inserting M requires a split.

B-Tree Order 5 Insertion Cont.

• The letters F, W, L, and T are then added without needing any split.

B-Tree Order 5 Insertion Cont.

• When Z is added, the rightmost leaf must be split. The median item T is moved up into the parent node.

B-Tree Order 5 Insertion Cont.

• The insertion of D causes the leftmost leaf to be split. D happens to be the median key and so is the one moved up into the parent node.

• The letters P, R, X, and Y are then added without any need of splitting.

B-Tree Order 5 Insertion Cont.

• Finally, when S is added, the node with N, P, Q, and R splits, sending the median Q up to the parent.

• The parent node is full, so it splits, sending the median M up to form a new root node.
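
The whole insertion sequence can be replayed in code. The following is a minimal Python sketch, not the presenter's implementation: it assumes an in-memory tree, inserts the key first and splits a node afterwards if it overflows, and uses hypothetical names (`Node`, `insert`, `levels`).

```python
import bisect

ORDER = 5  # at most 5 children and 4 keys per node


class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []  # an empty list marks a leaf


def _insert(node, key):
    """Insert key below node; return (median, right_node) if node split."""
    i = bisect.bisect_left(node.keys, key)
    if not node.children:                  # leaf: place the key here
        node.keys.insert(i, key)
    else:                                  # internal node: descend
        promoted = _insert(node.children[i], key)
        if promoted:
            median, right = promoted
            node.keys.insert(i, median)
            node.children.insert(i + 1, right)
    if len(node.keys) < ORDER:             # still at most ORDER - 1 keys
        return None
    mid = len(node.keys) // 2              # overflow: split around the median
    median = node.keys[mid]
    right = Node(node.keys[mid + 1:], node.children[mid + 1:])
    node.keys = node.keys[:mid]
    node.children = node.children[:mid + 1]
    return median, right


def insert(root, key):
    """Insert key and return the (possibly new) root."""
    promoted = _insert(root, key)
    if promoted:
        median, right = promoted
        return Node([median], [root, right])
    return root


def levels(root):
    """Return the keys of each node, grouped level by level."""
    out, frontier = [], [root]
    while frontier:
        out.append(["".join(n.keys) for n in frontier])
        frontier = [c for n in frontier for c in n.children]
    return out


root = Node()
for letter in "CNGAHEKQMFWLTZDPRXYS":
    root = insert(root, letter)
for level in levels(root):
    print(level)
# ['M']
# ['DG', 'QT']
# ['AC', 'EF', 'HKL', 'NP', 'RS', 'WXYZ']
```

The printed levels match the walk-through: M is the new root, Q and T sit in the right-hand internal node, and every leaf holds between 2 and 4 keys.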

B-Tree Order 5 Deletion

• Initial B-Tree

B-Tree Order 5 Deletion Cont.

• Delete H.

• Since H is in a leaf and the leaf has more than the minimum number of keys, we just remove it.

B-Tree Order 5 Deletion Cont.

• Delete T.

• Since T is not in a leaf, we find its successor (the next item in ascending order), which happens to be W.

• Move W up to replace the T. That way, what we really have to do is to delete W from the leaf.

B+-Tree Characteristics

• Data records are only stored in the leaves.

• Internal nodes store just keys.

• Keys are used for directing a search to the proper leaf.

• If a target key is less than a key in an internal node, then the pointer just to its left is followed.

• If a target key is greater than or equal to the key in the internal node, then the pointer to its right is followed.

• B+ Tree combines features of ISAM (Indexed Sequential Access Method) and B Trees.
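
The left/right rule for internal nodes maps directly onto a binary search. This is a minimal Python sketch, assuming an internal node keeps its keys in a sorted list with one more child pointer than keys. The function name `child_index` and the sample keys are hypothetical.

```python
import bisect


def child_index(keys, target):
    """Index of the child pointer to follow in an internal B+ tree node.

    Child i covers targets below keys[i]; a target equal to keys[i]
    goes right, to child i + 1.
    """
    return bisect.bisect_right(keys, target)


keys = [10, 20, 30]            # hypothetical internal node with 4 children
print(child_index(keys, 5))    # 0: less than 10, follow the leftmost pointer
print(child_index(keys, 20))   # 2: equal to 20, follow the pointer to its right
print(child_index(keys, 35))   # 3: greater than every key, follow the last pointer
```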

B+-Tree Characteristics Cont.

• When implemented on disk, it is likely that the leaves contain (key, pointer) pairs where the pointer field points to the record of data associated with the key.

  – This allows the data file to exist separately from the B+ tree, which functions as an "index" giving an ordering to the data in the data file.

B+-Tree Characteristics Cont.

• Very fast searching.

• Insertion and deletion are expensive.

Formula: n-order B+ tree with a height of h

• Maximum number of keys is $n^h$

• Minimum number of keys is $2(n/2)^{h-1}$
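
To get a feel for the two bounds, they can be evaluated for small values. The Python sketch below simply evaluates the expressions as stated on the slide. The function names and the sample values n = 4, h = 3 are assumptions chosen for illustration, and n is assumed to be even so that n/2 is a whole number.

```python
def max_keys(n, h):
    """Maximum number of keys, n**h, as stated on the slide."""
    return n ** h


def min_keys(n, h):
    """Minimum number of keys, 2 * (n/2)**(h - 1); n is assumed even."""
    return 2 * (n // 2) ** (h - 1)


n, h = 4, 3  # hypothetical small values, for illustration only
print(max_keys(n, h), min_keys(n, h))  # 64 8
```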

B+ tree of order 200 Example

• Leaves can each contain up to 199 keys.

• Assume that the root node has at least 100 children.

• A 2 level B+ tree that meets these assumptions can store about 9,900 records, since there are at least 100 leaves, each containing at least 99 keys.

• A 3 level B+ tree of this type can store about 1 million keys. A 4 level B+ tree can store about 100 million keys.
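
The figures above follow from multiplying the fan-out at each level by the keys per leaf. Here is a short Python sketch of that arithmetic, assuming every internal node (root included) has 100 children and every leaf holds 99 keys. The function name `leaf_keys` is hypothetical.

```python
def leaf_keys(levels, fanout=100, keys_per_leaf=99):
    """Keys stored in the leaves of a tree with the given number of levels."""
    return fanout ** (levels - 1) * keys_per_leaf


for levels in (2, 3, 4):
    print(levels, f"{leaf_keys(levels):,}")
# 2 9,900
# 3 990,000
# 4 99,000,000
```

The 3 and 4 level results, 990,000 and 99,000,000, are the "about 1 million" and "about 100 million" quoted on the slide.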

B+-Tree Structure

B+-Tree Order 3 Insertion

• Insert values 5, 8, 1, 7.

• Inserting value 5

• Since the node is empty, the value must be placed in the leaf node.

B+-Tree Insertion Cont.

• Inserting value 8

• Since the node has room, we insert the new value.

B+-Tree Insertion Cont.

• Inserting value 1

• Since the node is full, it must be split into two nodes.

• Each node is half full.

B+-Tree Insertion Cont.

• Inserting value 7

B+-Tree Deletion

• Initial Tree

B+-Tree Deletion Cont.

• Deleting value 9

• Since the node is not less than half full, the tree is correct.

B+-Tree Deletion Cont.

• Deleting value 8

• The node is now less than half full, so values are redistributed from the node on the left, because that node is full.

• The parent node is adjusted to reflect the change.
