Chapter 10: 2-3-4 Trees and External Storage
John Urrutia 2014, All Rights Reserved
Slide 2
2-3-4 Trees
Binary tree
  Each parent node may have up to 2 children.
  Each node holds only 1 data item.
Multiway tree (2-3-4)
  Each parent node must have 2 to 4 children.
  The maximum number of children is called the order of the tree.
  Each node holds at least 1 data item and can hold up to 3.
2-3-4 trees are self-balancing, just like red-black trees.
Slide 3
2-3-4 Trees (the Rules)
Leaf nodes
  Leaf nodes have no children.
  All leaf nodes are always at the same level.
  All leaf nodes must have at least 1 data item but may have as many as 3.
[Figure: example 2-3-4 tree]
  Root:     50
  Level 1:  30   60|70|80
  Leaves:   10|20   40   55   62|64|66   75   83|86
Slide 4
2-3-4 Trees (the Rules)
Non-leaf nodes
  The number of data items in the node dictates the number of children:
    1 data item: exactly 2 children
    2 data items: exactly 3 children
    3 data items: exactly 4 children
  This relationship sets the structure of the tree.
  Empty nodes are not allowed.
Slide 5
2-3-4 Trees (the Rules)
Nodes with:
  2 links are called 2-nodes
  3 links are called 3-nodes
  4 links are called 4-nodes
Unlike binary trees, 2-3-4 trees do not have nodes with only 1 link.
Slide 6
2-3-4 Tree Organization
Data items are numbered 0, 1, 2 and are stored in ascending sequence.
Links are numbered 0, 1, 2, 3.
  All data in the child at link 0 have values < data item 0.
  All data in the child at link 1 have values > data item 0 but < data item 1.
  All data in the child at link 2 have values > data item 1 but < data item 2.
  All data in the child at link 3 have values > data item 2.
Slide 7
2-3-4 Tree Organization
[Figure: a 4-node and its four children]
  Node:      50|75|95                      (data items 0, 1, 2)
  Children:  30|35   55   78   100|105     (links 0, 1, 2, 3)
Slide 8
2-3-4 Tree Organization
  All data in the child at link 0 have values < data item 0.
  All data in the child at link 1 have values > data item 0 but < data item 1.
  All data in the child at link 2 have values > data item 1 but < data item 2.
  All data in the child at link 3 have values > data item 2.
Duplicate values are normally not permitted.
Slide 9
Keys & Children
For a node with keys A, B, C (in ascending order), the four children hold:
  Link 0: keys < A
  Link 1: A < keys < B
  Link 2: B < keys < C
  Link 3: keys > C
Slide 10
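Searching 2-3-4 Trees
Example: search for the value 64.
  Search for the value in the current node, starting at the root.
  If it is not there, select the link that precedes the first data item greater than 64, or the last link if no item is greater (at the root this is link 1).
  Follow that link and repeat as necessary until the value is found or a leaf node is reached.
  Path for 64: root 50, link 1 to 60|70|80, link 1 to 62|64|66 (found).
[Figure: example 2-3-4 tree]
  Root:     50
  Level 1:  30   60|70|80
  Leaves:   10|20   40   55   62|64|66   75   83|86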
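The search steps above can be sketched in C# as follows. This is only an illustration under stated assumptions: SearchNode, Search234, Contains, and ExampleTree are hypothetical names and are not the Node and Tree classes listed later in the deck. Each node is assumed to hold its keys in ascending order, with one more child link than keys unless it is a leaf.

```csharp
using System;

// Hypothetical minimal node for illustration; not the deck's Node class.
class SearchNode
{
    public long[] Keys;            // 1 to 3 keys in ascending order
    public SearchNode[] Children;  // null for a leaf, else Keys.Length + 1 links

    public SearchNode(long[] keys, SearchNode[] children = null)
    {
        Keys = keys;
        Children = children;
    }

    public bool IsLeaf => Children == null;
}

static class Search234
{
    // Returns true if key is stored in the subtree rooted at node.
    public static bool Contains(SearchNode node, long key)
    {
        while (node != null)
        {
            int link = 0;
            // Skip every key that is smaller than the search key.
            while (link < node.Keys.Length && key > node.Keys[link])
                link++;
            if (link < node.Keys.Length && node.Keys[link] == key)
                return true;             // found in this node
            if (node.IsLeaf)
                return false;            // reached a leaf without finding it
            node = node.Children[link];  // link that precedes the first larger key
        }
        return false;
    }

    // Builds the example tree used on the searching slide.
    public static SearchNode ExampleTree()
    {
        var left = new SearchNode(new long[] { 30 }, new[] {
            new SearchNode(new long[] { 10, 20 }),
            new SearchNode(new long[] { 40 }) });
        var right = new SearchNode(new long[] { 60, 70, 80 }, new[] {
            new SearchNode(new long[] { 55 }),
            new SearchNode(new long[] { 62, 64, 66 }),
            new SearchNode(new long[] { 75 }),
            new SearchNode(new long[] { 83, 86 }) });
        return new SearchNode(new long[] { 50 }, new[] { left, right });
    }
}
```

Calling Search234.Contains(Search234.ExampleTree(), 64) follows link 1 from the root, then link 1 again, and finds 64 in the leaf 62|64|66. A value such as 65 ends at the same leaf without a match.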
Slide 11
Inserting 2-3-4 Trees
Insertion always occurs in a leaf node.
  Starting at the root, search for the value to insert and select the link that precedes the first data item > the insert value (or the last link if there is none).
  Navigate to the link.
    If the node it leads to is full, split it.
    If not, follow the link to the next level.
  Repeat as necessary until the appropriate leaf is found.
    If the leaf is full, split the leaf into two and insert the value.
    If not, insert the value.
Slide 12
Inserting 2-3-4 Trees
The simple process:
  Find the leaf node that should contain the new value.
  If the node isn't full, simply insert the value.
[Figure: inserting 18 into a leaf that is not full]
  Root:     28|55
  Level 1:  11   42   74
  Leaves:   05|09   13|23   30   44|47   63|67|72   97
  The leaf 13|23 becomes 13|18|23.
Slide 13
Inserting 2-3-4 Trees
Splitting a full node: Insert 25
Create a new node.
[Figure]
  Parent:        10
  Full node:     40|50|60
  Its children:  39   41   52   63
  New node:      (empty)
Slide 14
Inserting 2-3-4 Trees
Splitting a full node: Insert 25
Move 50 to the parent.
[Figure]
  Parent:        10|50
  Split node:    40| |60
  Its children:  39   41   52   63
  New node:      (empty)
Slide 15
Inserting 2-3-4 Trees
Splitting a full node: Insert 25
Move 60 to the new node, along with its two children.
[Figure]
  Parent:      10|50
  Split node:  40   (children 39 and 41)
  New node:    60   (children 52 and 63)
Slide 16
Inserting 2-3-4 Trees
Splitting the root node:
  Create 2 new nodes, 1 each for the left and right children.
  The middle item becomes the new root.
[Figure: before the split]
  Root:      10|50|90
  Children:  9   41   52   91
Slide 17
Inserting 2-3-4 Trees
Splitting the root node:
  Create 2 new nodes, 1 each for the left and right children.
  The middle item becomes the new root.
[Figure: the two new nodes are created]
  Root:       10|50|90
  Children:   9   41   52   91
  New nodes:  10   90
Slide 18
Inserting 2-3-4 Trees
Splitting the root node:
  Create 2 new nodes, 1 each for the left and right children.
  The middle item becomes the new root.
[Figure: after the split]
  Root:     50
  Level 1:  10   90
  Leaves:   9   41   52   91
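The insertion and split steps from the preceding slides can be sketched in C# as follows. This is a minimal sketch, not the deck's code: Node234, Tree234Sketch, LinkFor, SplitChild, and InOrder are hypothetical names, and lists stand in for the fixed-size arrays of the Node class. It assumes splits happen on the way down, so a parent always has room for the key moving up. The old root is reused as the left child when the root splits, and duplicates are not checked.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; these names do not come from the deck's Node and Tree classes.
class Node234
{
    public List<long> Keys = new List<long>();            // up to 3 keys, ascending
    public List<Node234> Children = new List<Node234>();  // empty for a leaf
    public bool IsFull => Keys.Count == 3;
    public bool IsLeaf => Children.Count == 0;
}

class Tree234Sketch
{
    public Node234 Root = new Node234();

    public void Insert(long value)
    {
        if (Root.IsFull)
        {
            // Splitting the root: the middle key becomes the only key of a new root.
            var newRoot = new Node234();
            newRoot.Children.Add(Root);
            SplitChild(newRoot, 0);
            Root = newRoot;
        }
        Node234 cur = Root;
        while (!cur.IsLeaf)
        {
            int link = LinkFor(cur, value);
            if (cur.Children[link].IsFull)
            {
                SplitChild(cur, link);       // middle key moves up into cur
                link = LinkFor(cur, value);  // choose the link again after the split
            }
            cur = cur.Children[link];
        }
        cur.Keys.Insert(LinkFor(cur, value), value);  // the leaf has room by construction
    }

    // Index of the first key that is not smaller than value (Keys.Count if none).
    static int LinkFor(Node234 node, long value)
    {
        int i = 0;
        while (i < node.Keys.Count && value > node.Keys[i]) i++;
        return i;
    }

    // Splits the full child at parent.Children[link] into two 2-nodes.
    static void SplitChild(Node234 parent, int link)
    {
        Node234 full = parent.Children[link];
        var right = new Node234();
        right.Keys.Add(full.Keys[2]);               // largest key goes to the new node
        if (!full.IsLeaf)
        {
            right.Children.Add(full.Children[2]);   // with the two rightmost children
            right.Children.Add(full.Children[3]);
            full.Children.RemoveRange(2, 2);
        }
        parent.Keys.Insert(link, full.Keys[1]);     // middle key moves up
        parent.Children.Insert(link + 1, right);
        full.Keys.RemoveRange(1, 2);                // the old node keeps the smallest key
    }

    // In-order traversal, used here only to inspect the result.
    public static void InOrder(Node234 n, List<long> output)
    {
        for (int i = 0; i < n.Keys.Count; i++)
        {
            if (!n.IsLeaf) InOrder(n.Children[i], output);
            output.Add(n.Keys[i]);
        }
        if (!n.IsLeaf) InOrder(n.Children[n.Keys.Count], output);
    }
}
```

Inserting 50, 40, 60 fills the root. Inserting 30 next splits it, leaving 50 in the new root with 30|40 and 60 as its children.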
Slide 19
2-3-4 DataItem Class
  using System;

  class DataItem
  {
      public long dData;                        // the data value (key)

      public DataItem(long dd) { dData = dd; }

      // Writes the item as "/value" with no newline.
      public void displayItem() { Console.Write("/" + dData); }
  }
Slide 20
2-3-4 Node Class
Data
  class Node
  {
      private const int ORDER = 4;                              // maximum number of children
      private int numItems;                                     // data items currently in the node
      private Node parent;
      private Node[] childArray = new Node[ORDER];              // up to 4 links
      private DataItem[] itemArray = new DataItem[ORDER - 1];   // up to 3 data items
      // methods are listed on the following slides
  }
Slide 21
2-3-4 Node Class
Node Methods
  public void connectChild(int childNum, Node child)
  public Node disconnectChild(int childNum)
  public Node getChild(int childNum)
  public Node getParent()
Slide 22
2-3-4 Node Class
Data Methods
  public DataItem getItem(int index)
  public int insertItem(DataItem newItem)
  public DataItem removeItem()
Slide 23
2-3-4 Node Class
Utility Methods
  public Boolean isFull()
  public Boolean isLeaf()
  public int getNumItems()
  public int findItem(long key)
  public void displayNode()
Slide 24
2-3-4 Tree Class
  private Node root = new Node();
  public int find(long key)
  public void insert(long dValue)
  public void split(Node thisNode)
  public Node getNextChild(Node theNode, long Value)
  public void displayTree()
  private void recDisplayTree(Node thisNode, int level, int childNumber)
Slide 25
2-3-4 Tree Class
Code walk-through
Slide 26
2-3-4 Trees & Red-Black Trees
2-3-4 trees don't look like red-black trees, or do they?
Red-black trees were developed after 2-3-4 trees.
We can transform a 2-3-4 tree into a red-black tree because they are isomorphic, using these rules:
  Transform any 2-node in the 2-3-4 tree into a black node in the red-black tree.
  Transform any 3-node into a child node and a parent node (the parent is black, the child is red).
  Transform any 4-node into a parent and two children (the parent is black, the children are red).
Slide 27
2-3-4 Trees & Red-Black Trees
[Figure: 2-node]
  The 2-node 41| | becomes the single node 41.
Slide 28
2-3-4 Trees & Red-Black Trees
[Figure: 3-node]
  The 3-node 41|52| becomes either parent 41 with child 52, or parent 52 with child 41.
  Either is okay.
Slide 29
2-3-4 Trees & Red-Black Trees
[Figure: 4-node]
  The 4-node 41|52|63 becomes parent 52 with children 41 and 63.
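The three transformation rules can be sketched in C# as a recursive conversion. This is an illustration only: MultiNode, RbNode, and RedBlackConverter are hypothetical types, not part of the deck's classes. Because either orientation of a 3-node is acceptable, the sketch arbitrarily puts the smaller key on top.

```csharp
using System;

// Hypothetical types for illustration only.
class MultiNode
{
    public long[] Keys;           // 1 to 3 keys in ascending order
    public MultiNode[] Children;  // null for a leaf, else Keys.Length + 1 links

    public MultiNode(long[] keys, MultiNode[] children = null)
    {
        Keys = keys;
        Children = children;
    }
}

class RbNode
{
    public long Key;
    public bool IsRed;
    public RbNode Left, Right;

    public RbNode(long key, bool isRed, RbNode left, RbNode right)
    {
        Key = key;
        IsRed = isRed;
        Left = left;
        Right = right;
    }
}

static class RedBlackConverter
{
    // Converts a 2-3-4 subtree into the corresponding red-black subtree.
    public static RbNode Convert(MultiNode n)
    {
        if (n == null) return null;

        // Convert the child subtrees first; for a leaf they all stay null.
        var c = new RbNode[4];
        if (n.Children != null)
            for (int i = 0; i < n.Children.Length; i++)
                c[i] = Convert(n.Children[i]);

        long[] k = n.Keys;
        switch (k.Length)
        {
            case 1:  // 2-node: a single black node
                return new RbNode(k[0], false, c[0], c[1]);
            case 2:  // 3-node: black parent with one red child (smaller key on top here)
                return new RbNode(k[0], false, c[0],
                                  new RbNode(k[1], true, c[1], c[2]));
            case 3:  // 4-node: black middle key with two red children
                return new RbNode(k[1], false,
                                  new RbNode(k[0], true, c[0], c[1]),
                                  new RbNode(k[2], true, c[2], c[3]));
            default:
                throw new ArgumentException("a 2-3-4 node holds 1 to 3 keys");
        }
    }
}
```

Converting the 4-node 41|52|63 yields a black 52 with red children 41 and 63. Converting the 3-node 41|52 yields a black 41 with a red right child 52.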
Slide 30
2-3-4 Trees & Red-Black Trees
Color flips
  Are the same as a 4-node split.
Rotations
  Correspond to switching a 3-node between its two possible orientations (slants).
  A right rotation turns a left slant into a right slant.
  A left rotation turns a right slant into a left slant.
Efficiency
  With some slight differences, the two trees are roughly the same.
Slide 31
2-3 Trees
Created by J. E. Hopcroft in 1970.
Similar to 2-3-4 trees, except that a node can hold at most 2 data items and can have 0 to 3 children.
The split process is similar but cannot happen on the way down to the insertion point.
After insertion, splits percolate up the tree to maintain balance.
Slide 32
External Storage
Processor speed is rated in clock speed (gigahertz) or in instructions per second (MIPS or FLOPS).
  2.67 gigahertz = 2,670,000,000 ticks per sec.
  Approx. 333,000,000 instructions per sec.
The most expensive operation a system performs is I/O.
  Approx. 1,100,000 bytes per sec.
  Transferring one byte takes about 300 times as long as an average instruction (333,000,000 / 1,100,000 is roughly 300).
Slide 33
External Storage
Slide 34
Disk Organization
Data Terms: Block, Buffer, Cylinder, Sector, Track, Partition
Operation Terms: Seek, Read, Write, Transfer
Slide 35
External Storage
Data Terms
  Block: the amount of data transferred in one I/O.
  Buffer: RAM that stores one or more blocks of data, usually sized in multiples of the sector size (4, 8, 16, or 32 KB).
  Cluster: the set of blocks that matches the I/O buffer size and is read or written together.
  Cylinder: the set of tracks simultaneously accessible by the read/write heads.
  Sector: the physical area on a platter that holds one block.
  Track: the circle scribed by the read/write head.
  Partition: a logical division of a disk drive.
Slide 36
External Storage
Disk Organization: Data Terms
[Figure: a platter showing a block/sector and a track]
Slide 37
External Storage
Disk Organization: Data Terms
[Figure: a cylinder]
Slide 38
External Storage
Operation Terms
  Seek: the physical movement of the read/write head to a particular cylinder on the platter.
  Read: the process of retrieving data from the drive.
  Write: the process of storing data on the drive.
  Transfer: the movement of data to or from the drive.
Slide 39
External Storage
Disk Specifications
  Manufacturer                          Seagate Technology
  Model                                 ST9250410AS
  Spindle speed                         7200 rpm
  Avg. latency                          4.17 msec
  I/O data transfer rate                3.0 Gbits/sec (max)
  Track-to-track seek time (read)       1.5 msec
  Avg. seek (read)                      11.0 msec
  Avg. seek (write)                     13.0 msec
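The specifications can be combined into a rough estimate of what one random block read costs. The C# sketch below is an assumption-laden illustration: DiskTiming and its members are hypothetical names, and the 3.0 Gbits/sec figure is the interface maximum, so the transfer term is a lower bound rather than a measured value.

```csharp
using System;

static class DiskTiming
{
    // Figures taken from the specification slide.
    public const double AvgSeekReadMs = 11.0;
    public const double AvgLatencyMs = 4.17;
    public const double MaxTransferBitsPerSec = 3.0e9;

    // Average rotational latency is the time for half a revolution, in msec.
    public static double AvgLatencyFromRpm(double rpm)
    {
        double msPerRevolution = 60000.0 / rpm;
        return msPerRevolution / 2.0;
    }

    // Rough time in msec to read one block at a random position:
    // average seek + average latency + transfer at the maximum interface rate.
    public static double RandomReadMs(int blockBytes)
    {
        double transferMs = blockBytes * 8.0 / MaxTransferBitsPerSec * 1000.0;
        return AvgSeekReadMs + AvgLatencyMs + transferMs;
    }
}
```

DiskTiming.AvgLatencyFromRpm(7200) gives about 4.17 msec, which matches the listed average latency. For a 4 KB block, the seek and latency terms account for nearly all of the estimated 15.2 msec, which is why the number of I/Os matters far more than the amount of data moved per I/O.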
Slide 40
External Storage
Disk Organization
  Bytes/Sector       512
  Sectors/Track      63
  Size               232.88 GB (250,056,737,280 bytes)
  Total Cylinders    30,401
  Total Sectors      488,392,065
  Total Tracks       7,752,255
  Tracks/Cylinder    255
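The geometry figures are related by simple arithmetic, which the following C# sketch works through. DiskGeometry and its members are hypothetical names. The only inputs are the bytes per sector, sectors per track, total sectors, and total cylinders from the slide, and the size is assumed to be quoted in binary gigabytes (2^30 bytes).

```csharp
using System;

static class DiskGeometry
{
    // Inputs taken from the disk organization slide.
    public const long BytesPerSector = 512;
    public const long SectorsPerTrack = 63;
    public const long TotalSectors = 488392065;
    public const long TotalCylinders = 30401;

    // Capacity in bytes: every sector holds BytesPerSector bytes.
    public static long TotalBytes() => TotalSectors * BytesPerSector;

    // Number of tracks: sectors grouped SectorsPerTrack at a time.
    public static long TotalTracks() => TotalSectors / SectorsPerTrack;

    // Tracks stacked in each cylinder.
    public static long TracksPerCylinder() => TotalTracks() / TotalCylinders;

    // Capacity expressed in units of 2^30 bytes.
    public static double SizeInBinaryGB() => TotalBytes() / (1024.0 * 1024.0 * 1024.0);
}
```

These figures reproduce the slide's 250,056,737,280 bytes and 7,752,255 tracks, and they imply 255 tracks per cylinder (7,752,255 / 30,401).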
Slide 41
External Storage
File System Organization
Sequential access
  A stream of bytes blocked together.
  Must be read in sequential order, from beginning to end or vice versa.
  Data can be added only at either end of the file.
  Records can't be deleted without copying the entire file.
Direct (random) access
  Data is organized into record blocks based on a key value.
  Can be read sequentially or randomly by record.
  Records can be added or deleted anywhere in the file, provided there is room.
Slide 42
B-Trees and I/O
We structure our B-tree so that the data in each node corresponds to the size of a disk cluster.
We use the key values to designate the cluster that contains the data.
This gives access to any record in about log N steps, where the base of the logarithm is the number of children for each node in the tree and N is the number of records in the dataset.
Each level in the tree requires 1 I/O when searching for a prospective record.
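The effect of node fan-out on the number of I/Os can be sketched with a small C# calculation. BTreeIo and LevelsNeeded are hypothetical names, and the model is idealized: it assumes every node has the same number of children and that each level visited costs exactly one I/O.

```csharp
using System;

static class BTreeIo
{
    // Smallest number of levels such that childrenPerNode^levels >= records,
    // that is, the logarithm of records to the base childrenPerNode, rounded up.
    public static int LevelsNeeded(long records, int childrenPerNode)
    {
        if (records < 1)
            throw new ArgumentOutOfRangeException(nameof(records));
        if (childrenPerNode < 2)
            throw new ArgumentOutOfRangeException(nameof(childrenPerNode));

        int levels = 0;
        long reach = 1;  // records reachable with the levels counted so far
        while (reach < records)
        {
            levels++;
            if (reach > long.MaxValue / childrenPerNode)
                break;   // the next power would exceed any long value
            reach *= childrenPerNode;
        }
        return levels;
    }
}
```

Under these assumptions, 1,000,000 records need 20 levels with 2 children per node, 10 levels with 4 children per node, and only 3 levels with 100 children per node. That difference is the reason a B-tree node is sized to fill a whole disk block.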
Slide 43
Summary
A multiway tree has more keys and children than a binary tree.
A 2-3-4 tree is a multiway tree with up to three keys and four children per node.
In a multiway tree, the keys in a node are arranged in ascending order.
In a 2-3-4 tree, all insertions are made in leaf nodes, and all leaf nodes are on the same level.
Slide 44
Summary
Three kinds of nodes are possible in a 2-3-4 tree:
  A 2-node has one key and two children.
  A 3-node has two keys and three children.
  A 4-node has three keys and four children.
There is no 1-node in a 2-3-4 tree.
In a search in a 2-3-4 tree, the keys are examined at each node. If the search key is not found, the next node will be:
  Child 0 if the search key is less than key 0
  Child 1 if the search key is between key 0 and key 1
  Child 2 if the search key is between key 1 and key 2
  Child 3 if the search key is greater than key 2
Slide 45
Summary
Insertion in a 2-3-4 tree requires that any full node be split on the way down the tree, during the search for the insertion point.
Splitting the root creates two new nodes.
Splitting any other node creates one new node.
The height of a 2-3-4 tree increases only when the root is split.
Slide 46
Summary
There is a one-to-one correspondence between a 2-3-4 tree and a red-black tree.
To transform a 2-3-4 tree into a red-black tree:
  Make each 2-node into a black node.
  Make each 3-node into a black parent with a red child.
  Make each 4-node into a black parent with two red children.
Slide 47
Summary
When a 3-node is transformed into a parent and child, either node can become the parent.
Splitting a node in a 2-3-4 tree is the same as performing a color flip in a red-black tree.
A rotation in a red-black tree corresponds to changing between the two possible orientations (slants) when transforming a 3-node.
Slide 48
Summary
The height of a 2-3-4 tree is less than log2 N, where N is the number of data items.
Search times are proportional to the height.
The 2-3-4 tree wastes space because many nodes are not even half full.
Slide 49
Summary
A 2-3 tree is similar to a 2-3-4 tree, except that it can have only one or two data items and one, two, or three children.
Insertion in a 2-3 tree involves finding the appropriate leaf and then performing splits from the leaf upward, until a non-full node is found.
Slide 50
Summary
External storage means storing data outside of main memory, usually on a disk.
External storage is larger, cheaper (per byte), and slower than main memory.
Data in external storage is typically transferred to and from main memory a block at a time.
Data can be arranged in external storage in sequential key order. This gives fast search times but slow insertion (and deletion) times.
Slide 51
Summary
A B-tree is a multiway tree in which each node may have dozens or hundreds of keys and children.
There is always one more child than there are keys in a node.
For the best performance, a B-tree is typically organized so that a node holds one block of data.
If the search criteria involve many keys, a sequential search of all the records in a file may be the most practical approach.