
CS 310 – Data Structures

All figures labeled with “Figure X.Y”

Copyright © 2006 Pearson Addison-Wesley. All rights reserved. photo ©Oregon Scenics used with permission

Trees Part II


Balanced binary search trees

• Binary search trees with an extra condition
  – Left and right subtrees must have the same height.
  – In practice, we use the condition that left and right subtrees can differ in height by no more than one.
  – Note: It is convenient to denote the height of an empty tree as -1.
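The balance condition above can be checked directly. The following is a minimal sketch, not code from the course: the names `BstNode`, `BalanceCheck`, `height`, and `isBalanced` are illustrative assumptions.

```java
// Minimal binary tree node; class and field names are illustrative assumptions.
class BstNode {
    int key;
    BstNode left, right;

    BstNode(int key, BstNode left, BstNode right) {
        this.key = key;
        this.left = left;
        this.right = right;
    }
}

class BalanceCheck {
    // The height of an empty tree is -1, so a single node has height 0.
    static int height(BstNode t) {
        return t == null ? -1 : 1 + Math.max(height(t.left), height(t.right));
    }

    // True if, at every node, the left and right subtree heights differ by at most one.
    static boolean isBalanced(BstNode t) {
        if (t == null) return true;
        return Math.abs(height(t.left) - height(t.right)) <= 1
                && isBalanced(t.left)
                && isBalanced(t.right);
    }
}
```

Using -1 for the empty tree means the height formula needs no special case for leaves. This version recomputes heights at every node for clarity; a real implementation would more likely store the height in each node.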


• Keeping the tree balanced results in logarithmic search time.

AVL trees

• Adelson-Velskii & Landis

• First balanced tree – 1962


Unbalanced tree

insert(tree, 1) – results in an unbalanced tree


Which nodes to rebalance?

Nodes along the path from the insertion point to the root might need rebalancing.


How can we get into trouble?

Insertion into left subtree of left child

Insertion into right subtree of left child


Where else?

• Symmetric problems when inserting into the right subtree
  – Insertion in the left subtree of the right child of the node in question.
  – Insertion in the right subtree of the right child of the node in question.


The outside cases

• When the imbalance comes from inserting on the outside of the tree, we can fix the problem with one rotation.

[Figure: single rotation between nodes k1 and k2, before and after]


Single rotation


Concrete example


Symmetric case for single rotation
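The single rotation and its mirror image can be sketched as follows. The method names `rotateWithLeftChild` and `rotateWithRightChild` match the ones used later in these slides; the node class `RotNode` and the wrapper class are assumptions made for this sketch.

```java
// Illustrative node type; not taken from the slides.
class RotNode {
    int key;
    RotNode left, right;

    RotNode(int key) {
        this.key = key;
    }
}

class SingleRotation {
    // Outside case on the left: k2's left child k1 becomes the new subtree root.
    static RotNode rotateWithLeftChild(RotNode k2) {
        RotNode k1 = k2.left;
        k2.left = k1.right;   // k1's right subtree lies between k1 and k2
        k1.right = k2;
        return k1;
    }

    // Symmetric outside case on the right: k1's right child k2 becomes the root.
    static RotNode rotateWithRightChild(RotNode k1) {
        RotNode k2 = k1.right;
        k1.right = k2.left;   // k2's left subtree lies between k1 and k2
        k2.left = k1;
        return k2;
    }
}
```

Each rotation changes only three links and returns the new subtree root, which the caller must store back into the parent's child link.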


The inside cases

• Rotations on the inside are more difficult.


Double rotation

[Figure: nodes 25, 15, and 19 with subtrees A, B, C, and D, before and after rotation 1]

Rotation 1: swap child and grandchild


Double Rotation

[Figure: nodes 25, 15, and 19 with subtrees A, B, C, and D, before and after rotation 2]

Rotation 2: rotation between grandparent and new parent
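The two steps of the double rotation can be sketched as a pair of single rotations. This is an illustrative sketch: `DrNode`, `DoubleRotation`, and `doubleWithLeftChild` are assumed names, and the usage note assumes 15 is the left child of 25 and 19 is the right child of 15, as the BST ordering of those keys requires for a left-side inside case.

```java
// Illustrative node type; not taken from the slides.
class DrNode {
    int key;
    DrNode left, right;

    DrNode(int key) {
        this.key = key;
    }
}

class DoubleRotation {
    static DrNode rotateWithLeftChild(DrNode k2) {
        DrNode k1 = k2.left;
        k2.left = k1.right;
        k1.right = k2;
        return k1;
    }

    static DrNode rotateWithRightChild(DrNode k1) {
        DrNode k2 = k1.right;
        k1.right = k2.left;
        k2.left = k1;
        return k2;
    }

    // Inside case: the tall subtree hangs off the right child of k3's left child.
    static DrNode doubleWithLeftChild(DrNode k3) {
        // Rotation 1: swap child and grandchild.
        k3.left = rotateWithRightChild(k3.left);
        // Rotation 2: rotate between the grandparent and its new child.
        return rotateWithLeftChild(k3);
    }
}
```

With 25 as the grandparent, 15 as its left child, and 19 as the right child of 15, `doubleWithLeftChild` returns 19 as the new subtree root with 15 on its left and 25 on its right.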


Practice makes perfect

• Practice with http://webpages.ull.es/users/jriera/Docencia/AVL/AVL%20tree%20applet.htm


AVL implementation

• Implementation difficult

• Basic idea
  – For insertion into tree T, insert into the appropriate left or right subtree Tlr.
  – If the height of Tlr remains the same, all done.
  – Otherwise, we need to see if T has become unbalanced. If so, perform appropriate repairs with root T.
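The basic idea above can be sketched as a recursive insertion that repairs each node on the way back up. This is a simplified sketch under stated assumptions, not the textbook implementation: the names `AvlNode` and `AvlSketch` are invented here, duplicates are ignored, and the sketch rechecks balance at every node on the path instead of stopping early when the height of Tlr is unchanged.

```java
// Illustrative AVL node; each node caches the height of its subtree.
class AvlNode {
    int key;
    int height;            // a leaf has height 0
    AvlNode left, right;

    AvlNode(int key) {
        this.key = key;
    }
}

class AvlSketch {
    static int height(AvlNode t) {
        return t == null ? -1 : t.height;
    }

    static void update(AvlNode t) {
        t.height = 1 + Math.max(height(t.left), height(t.right));
    }

    static AvlNode rotateWithLeftChild(AvlNode k2) {
        AvlNode k1 = k2.left;
        k2.left = k1.right;
        k1.right = k2;
        update(k2);
        update(k1);
        return k1;
    }

    static AvlNode rotateWithRightChild(AvlNode k1) {
        AvlNode k2 = k1.right;
        k1.right = k2.left;
        k2.left = k1;
        update(k1);
        update(k2);
        return k2;
    }

    // Insert into the appropriate subtree, then repair with root t if needed.
    static AvlNode insert(AvlNode t, int key) {
        if (t == null) return new AvlNode(key);
        if (key < t.key) t.left = insert(t.left, key);
        else if (key > t.key) t.right = insert(t.right, key);
        else return t;                               // duplicate: nothing to do
        return rebalance(t);
    }

    static AvlNode rebalance(AvlNode t) {
        if (height(t.left) - height(t.right) > 1) {
            if (height(t.left.left) < height(t.left.right))
                t.left = rotateWithRightChild(t.left);   // inside case: first rotation
            t = rotateWithLeftChild(t);
        } else if (height(t.right) - height(t.left) > 1) {
            if (height(t.right.right) < height(t.right.left))
                t.right = rotateWithLeftChild(t.right);  // inside case: first rotation
            t = rotateWithRightChild(t);
        }
        update(t);
        return t;
    }
}
```

Inserting the keys 1 through 7 in increasing order with this sketch yields a perfectly balanced tree rooted at 4, whereas an unbalanced binary search tree would degenerate into a chain.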


AVL

• In practice, requires two passes through the tree
  – down: insertion
  – up: repair

• Better schemes have been proposed.


Red-Black trees

• Single top down pass for insertion & deletion

• Binary search trees with:
  – Colored nodes: red or black (null nodes treated as black)
  – Root is always black
  – If a node is red, its children are black
  – Constant black depth: every path from a node to a null link has the same number of black nodes.
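The coloring rules above can be verified mechanically. The following is a hedged sketch of such a checker; `RbNode`, `RedBlackCheck`, `blackHeight`, and `isRedBlack` are names assumed for illustration, null links stand in for black leaves, and the binary search tree ordering itself is not checked.

```java
// Illustrative colored node; null links are treated as black.
class RbNode {
    int key;
    boolean red;
    RbNode left, right;

    RbNode(int key, boolean red, RbNode left, RbNode right) {
        this.key = key;
        this.red = red;
        this.left = left;
        this.right = right;
    }
}

class RedBlackCheck {
    static boolean isRed(RbNode t) {
        return t != null && t.red;
    }

    // Number of black nodes on every path from t to a null link,
    // or -1 if the subtree rooted at t breaks a coloring rule.
    static int blackHeight(RbNode t) {
        if (t == null) return 0;
        if (t.red && (isRed(t.left) || isRed(t.right))) return -1; // red node with a red child
        int l = blackHeight(t.left);
        int r = blackHeight(t.right);
        if (l < 0 || r < 0 || l != r) return -1;                   // black depth not constant
        return l + (t.red ? 0 : 1);
    }

    // Root must be black and the whole tree must satisfy the coloring rules.
    static boolean isRedBlack(RbNode root) {
        return !isRed(root) && blackHeight(root) >= 0;
    }
}
```

A checker like this is mainly useful in tests of an insertion or deletion routine, where it can be run after every operation.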


Sentinels

• Many implementations of red-black trees create special nodes called sentinels.

• Sentinel nodes are used in place of the null link to indicate leaves. In red-black trees, sentinels are always colored black.


Properties

• If there are B black nodes along each of the paths, the tree must have at least 2^B - 1 black nodes.

• As there are never two consecutive red nodes:
  – height is at most 2 log2(N+1)
  – which implies logarithmic search
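The height bound follows from the two properties in a few steps. In this sketch, N is the number of nodes, B is the number of black nodes on each root-to-null path, and h is the number of nodes on the longest such path; treating h as a node count is an assumption about the convention intended here.

```latex
\begin{align*}
N &\ge 2^{B} - 1
  && \text{(the tree has at least } 2^{B} - 1 \text{ black nodes)} \\
B &\le \log_2 (N + 1) \\
h &\le 2B \le 2 \log_2 (N + 1)
  && \text{(the root is black and no red node has a red child)}
\end{align*}
```

The last step uses the fact that at most every other node on a path can be red, so a path with B black nodes has at most 2B nodes in total.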


Sample red-black tree


Insertion

• New nodes are always inserted as leaves.

• What color?
  – black? Other paths will no longer have the same number of black nodes.
  – red?

• If parent is black, then we are okay:


Inserting when the parent is red

• What color is the parent’s sibling?
  – black and inserted leaf is an outside child relative to grandparent: single rotation & recolor

Color change ensures we do not have two consecutive red nodes.

Note: Figure does not assume X is a leaf.


Red parent & black sibling

• We saw that outside children need a single rotation

• Inside children need a double rotation


Inserting when the parent is red

• Parent’s sibling is also red?


Insertions with red parent & sibling

insert 95

[Figure: subtree with nodes 60, 70, 80, 85, 90, and 95 in two panels: before rotations, and after a single rotation]

• Problems
  – consecutive red nodes
  – recoloring doesn’t help


Insertions with red parent & sibling

• Double rotation

[Figure: subtree with nodes 60, 70, 79, 80, 85, and 90 in three panels: insert 79, after first rotation, after second rotation]

What if 80’s parent (originally node 70’s parent) had been red?

Remember: This is a subtree; you could not construct a tree that looked like this.


What if 80’s parent had been red?

• We could try to propagate this up the tree, applying the rotations to the next higher level.

• Unfortunately, this puts us in the same situation as the AVL tree which requires two traversals.

[Figure: subtree with nodes 60, 70, 79, 80, 85, and 90, and a node marked “?”]


Red-black tree: top-down insertion

• We only get into trouble when the parent of the inserted node has a red sibling.

• By recognizing this, we can prevent it from happening.


Swapping colors

• On our way down, if we see two red children:

• we swap the parent and child colors:


Swapping colors

• Is this all right?
  – black depth preserved
  – What if the new red node is the root?
    • It is a problem, but we can just recolor it black.
  – What if the parent is red?
    • let us think about this…

[Figure: nodes labeled P and PS, with a “?”]


Swapping colors when the parent is also red

• Parent’s sibling is black?
  – Swap colors.
  – Repair with
    • single rotation (slide 28, “Insertions with red parent & sibling”) or
    • double rotation (slide 29, same title)

• Parent’s sibling is red?
  – Can’t happen!
  – Why not?

[Figure: four panels showing nodes labeled P and PS]


Top-down red-black tree

• Implementation is complicated by some special cases.

• Two tricks to ease implementation:
  – Instead of null links, we have a sentinel node which is always black.
  – The root pointer points to a pseudoroot node
    • Contains a value smaller than any other value (-∞).
    • Right pointer points to real root.


[Figure: tree containing 50, 60, 65, 70, 80, 82, 85, and 90]


insert 76

while (compare(item, current) != 0) {
    great = grand;
    grand = parent;
    parent = current;
    current = compare(item, current) < 0 ? current.left : current.right;
    // Check if two red children; fix if so
    if (current.left.color == RED && current.right.color == RED)
        handleReorient(item);
}

[Figure: tree containing 25, 50, 65, 75, 78, 80, and 82]


insert 76

private void handleReorient(T item) {
    // flip color
    current.color = RED;
    current.left.color = BLACK;
    current.right.color = BLACK;
    …
}

[Figure: tree containing 25, 50, 65, 75, 78, 80, and 82]


insert 76

private void handleReorient(T item) {
    … // slightly rewritten from Weiss
    boolean leftOfGrand = compare(item, grand) < 0;
    boolean leftOfParent = compare(item, parent) < 0;
    if (parent.color == RED) {
        grand.color = RED;
        if (leftOfGrand != leftOfParent) {
            // double rotation
            parent = rotate(item, grand);
        }
        current = rotate(item, great);
        current.color = BLACK;
    }
    header.right.color = BLACK;
}

[Figure: tree containing 25, 50, 65, 75, 78, 80, and 82]


insert 76

private void handleReorient(T item) {
    … // slightly rewritten from Weiss
    boolean leftOfGrand = false;   // item 76 lies to the right of grand
    boolean leftOfParent = false;  // item 76 lies to the right of parent
    if (parent.color == RED) {
        grand.color = RED;
        if (leftOfGrand != leftOfParent) {
            // double rotation
            parent = rotate(item, grand);
        }
        current = rotate(item, great);
        current.color = BLACK;
    }
    header.right.color = BLACK;
}

[Figure: tree containing 25, 50, 65, 75, 78, 80, and 82]


insert 76

rotate(T item, RedBlackNode<T> parent) {
    if (compare(item, parent) < 0) {
        …
    } else {
        return parent.right = compare(item, parent.right) < 0 ?
            rotateWithLeftChild(parent.right) :
            rotateWithRightChild(parent.right);
    }
}

[Figure: tree containing 25, 50, 65, 75, 78, 80, and 82]


insert 76

rotate(T item, RedBlackNode<T> parent) {
    if (compare(item, parent) < 0) {
        …
    } else {
        return parent.right = compare(item, parent.right) < 0 ?
            rotateWithLeftChild(parent.right) :
            rotateWithRightChild(parent.right);
    }
}

[Figure: tree containing 25, 50, 65, 75, 78, 80, and 82, before and after the rotation]


insert 76

private void handleReorient(T item) {
    … // slightly rewritten from Weiss
    boolean leftOfGrand = false;   // item 76 lies to the right of grand
    boolean leftOfParent = false;  // item 76 lies to the right of parent
    if (parent.color == RED) {
        grand.color = RED;
        if (leftOfGrand != leftOfParent) {
            // double rotation
            parent = rotate(item, grand);
        }
        current = rotate(item, great);
        current.color = BLACK;
    }
    header.right.color = BLACK;
}

[Figure: tree containing 25, 50, 65, 75, 78, 80, and 82, after the rotation]


insert 76

Insert continues until current is the sentinel, and now we can insert 76 as a red node.

[Figure: tree containing 25, 50, 65, 75, 78, 80, and 82, with the parent node labeled]
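The fragments traced above can be assembled into one runnable sketch of top-down insertion. It follows the structure shown on the slides (`handleReorient`, `rotate`, `rotateWithLeftChild`, `rotateWithRightChild`, `header`, and the `current`, `parent`, `grand`, `great` references), but several details are assumptions made for this sketch: keys are plain `int` values, colors are booleans, the sentinel is called `nullNode`, duplicates are silently ignored, and `contains` and `height` are helpers added only for checking the result.

```java
// Sketch of top-down red-black insertion for int keys; duplicates are ignored.
class TopDownRedBlack {
    static final boolean RED = true, BLACK = false;

    static class Node {
        int element;
        Node left, right;
        boolean color = BLACK;
        Node(int element, Node left, Node right) {
            this.element = element;
            this.left = left;
            this.right = right;
        }
    }

    final Node nullNode;  // sentinel standing in for null links; always black
    final Node header;    // pseudoroot; header.right is the real root
    Node current, parent, grand, great;

    TopDownRedBlack() {
        nullNode = new Node(0, null, null);
        nullNode.left = nullNode.right = nullNode;
        header = new Node(Integer.MIN_VALUE, nullNode, nullNode);
    }

    // The header acts as minus infinity: every item compares greater than it.
    int compare(int item, Node t) {
        return t == header ? 1 : Integer.compare(item, t.element);
    }

    void insert(int item) {
        current = parent = grand = header;
        nullNode.element = item;  // guarantees the search stops at the sentinel
        while (compare(item, current) != 0) {
            great = grand;
            grand = parent;
            parent = current;
            current = compare(item, current) < 0 ? current.left : current.right;
            if (current.left.color == RED && current.right.color == RED)
                handleReorient(item);  // two red children: flip on the way down
        }
        if (current != nullNode) return;  // item already present
        current = new Node(item, nullNode, nullNode);
        if (compare(item, parent) < 0) parent.left = current;
        else parent.right = current;
        handleReorient(item);  // colors the new leaf red and repairs a red parent
    }

    void handleReorient(int item) {
        current.color = RED;
        current.left.color = BLACK;
        current.right.color = BLACK;
        if (parent.color == RED) {  // two consecutive red nodes: rotate
            grand.color = RED;
            if ((compare(item, grand) < 0) != (compare(item, parent) < 0))
                parent = rotate(item, grand);  // inside case: extra rotation
            current = rotate(item, great);
            current.color = BLACK;
        }
        header.right.color = BLACK;  // the root is always black
    }

    Node rotate(int item, Node p) {
        if (compare(item, p) < 0)
            return p.left = compare(item, p.left) < 0
                    ? rotateWithLeftChild(p.left) : rotateWithRightChild(p.left);
        return p.right = compare(item, p.right) < 0
                ? rotateWithLeftChild(p.right) : rotateWithRightChild(p.right);
    }

    static Node rotateWithLeftChild(Node k2) {
        Node k1 = k2.left;
        k2.left = k1.right;
        k1.right = k2;
        return k1;
    }

    static Node rotateWithRightChild(Node k1) {
        Node k2 = k1.right;
        k1.right = k2.left;
        k2.left = k1;
        return k2;
    }

    boolean contains(int item) {
        nullNode.element = item;
        Node t = header.right;
        while (t.element != item) t = item < t.element ? t.left : t.right;
        return t != nullNode;
    }

    // Number of nodes on the longest path from t down to the sentinel.
    int height(Node t) {
        return t == nullNode ? 0 : 1 + Math.max(height(t.left), height(t.right));
    }
}
```

Because all color flips and rotations happen during the single downward pass, no second traversal back up the tree is needed. Inserting 1 through 1000 in increasing order should leave a tree whose height stays within the 2 log2(N+1) bound.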


Red/Black tree applets

• http://gauss.ececs.uc.edu/RedBlack/redblack.html
• or http://webpages.ull.es/users/jriera/Docencia/AVL/AVL%20tree%20applet.htm


Red/Black tree deletion

• More complicated than insertion.

• We will cover the basic ideas and omit the implementation:
  – Deleting black nodes causes problems, so make sure we delete red nodes.
  – We replace the values in internal nodes.


Red/Black tree deletion

• X – current node

• S – sibling

• P – parent

• Assume the root sentinel (the pseudoroot) is red

• Consider what we can do when X’s children are black.


Case 1

• If S has black children, we can perform a color flip.

[Figure: nodes P, X, and S, before and after the color flip]


Case 2

• If sibling’s outer child is red, perform a single rotation.

[Figure: nodes P, X, S, and R, before and after the single rotation]


Case 3

• If sibling’s inner child is red, perform a double rotation.

[Figure: nodes P, X, S, and L, before and after the double rotation]


Deletion when X is a leaf

• Recall sentinels are colored black.

• Therefore, we can consider X to have two black children and use the 3 cases just described.


X has a red child?

• In all three cases, X was colored red, so the color flip and rotations are inappropriate.

• Without covering the specifics, we can move to the next level and perform an operation to make X one of the following:
  – red
  – a leaf node: use one of the three cases
  – or X has a single child:
    • red child: delete X, make child black
    • black child: use one of the three cases


b-trees

• Log N search seems less convincing when some operations are orders of magnitude slower than others.

• A fast hard drive in 2006 can find a disk block in about 7.5 ms, or about 133 uncorrelated accesses per s.

• In contrast, an AMD Sempron 3600+ (top of the value line late 2006) can execute over 3,000 MFLOPS.

• Successful search in a 10 million record balanced binary search tree requires about 25 comparisons.
  – Trivial in RAM
  – About 0.2 s on disk, assuming nobody else is using the system.
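The arithmetic behind these figures can be reproduced with a short calculation. This sketch assumes one uncorrelated disk access per comparison, which is the pessimistic model the bullets imply; the class and method names are illustrative.

```java
class DiskSearchCost {
    // Approximate comparisons for a search in a balanced binary search tree: log2(n).
    static double comparisons(long n) {
        return Math.log(n) / Math.log(2);
    }

    // Seconds spent if each access costs msPerAccess milliseconds.
    static double seconds(double accesses, double msPerAccess) {
        return accesses * msPerAccess / 1000.0;
    }

    // Uncorrelated accesses that fit in one second.
    static double accessesPerSecond(double msPerAccess) {
        return 1000.0 / msPerAccess;
    }
}
```

For 10 million records, `comparisons` gives a little over 23, which the slide rounds up to about 25 to allow for a tree that is not perfectly balanced. `seconds(25, 7.5)` gives 0.1875, the "about 0.2 s" figure, and `accessesPerSecond(7.5)` gives roughly 133.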


M-ary b-trees

• Data stored in leaves
• Nonleaf nodes
  – store up to M-1 keys
  – ith key is the smallest key in the (i+1)th subtree
• Root either
  – is a leaf or
  – has between 2 and M children.
• All interior nodes (except root) have ceil(M/2) to M children.
• All leaves are at the same depth with ceil(L/2) to L data items (L is the maximum number of items; a root leaf may have fewer)
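The role of the keys in nonleaf nodes is easiest to see in a search routine. The sketch below is an assumption-laden illustration, not the course's implementation: the classes `BNode`, `BLeaf`, `BInterior`, and `BTreeSearch` are invented here, items are integers, and the limits M and L are not enforced.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative node types for a b-tree whose data lives only in the leaves.
abstract class BNode { }

class BLeaf extends BNode {
    final List<Integer> items = new ArrayList<>();    // sorted; at most L items
}

class BInterior extends BNode {
    final List<Integer> keys = new ArrayList<>();     // sorted; at most M-1 keys
    final List<BNode> children = new ArrayList<>();   // always keys.size() + 1 children
}

class BTreeSearch {
    static boolean contains(BNode node, int item) {
        while (node instanceof BInterior) {
            BInterior in = (BInterior) node;
            int i = 0;
            // Key i is the smallest item in subtree i+1,
            // so keep moving right while item >= key i.
            while (i < in.keys.size() && item >= in.keys.get(i)) i++;
            node = in.children.get(i);
        }
        // Only leaves hold data items.
        return Collections.binarySearch(((BLeaf) node).items, item) >= 0;
    }
}
```

Each loop iteration corresponds to reading one interior node, so the number of disk accesses is the depth of the tree, not the number of key comparisons.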


Sample b-tree


Selecting M & L

• Optimal choice of M and L depends upon the minimum amount of information that can be addressed on a disk.

• While disks typically operate on small blocks of bytes (512), many operating systems group the blocks into a larger unit called a cluster.


Selecting M & L

• We typically choose M & L based upon the cluster size.

• M = floor(cluster size / key size)

• L = floor(cluster size / record size)

• In the worst case, we will access about log_{M/2}(N) clusters.
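These formulas can be evaluated for concrete sizes. The sketch below applies them exactly as stated, so it ignores the space that child pointers take up in an interior node; the cluster, key, and record sizes in the usage note are hypothetical values chosen only for illustration.

```java
class BTreeParams {
    // M = floor(cluster size / key size); integer division floors for positive values.
    static int branching(int clusterBytes, int keyBytes) {
        return clusterBytes / keyBytes;
    }

    // L = floor(cluster size / record size)
    static int leafCapacity(int clusterBytes, int recordBytes) {
        return clusterBytes / recordBytes;
    }

    // Approximate worst-case cluster accesses: log base M/2 of N.
    static double worstCaseAccesses(long n, int m) {
        return Math.log(n) / Math.log(m / 2.0);
    }
}
```

With a hypothetical 8192-byte cluster, 32-byte keys, and 256-byte records, this gives M = 256 and L = 32. For 10 million records the worst case is then between 3 and 4 cluster accesses, compared with about 25 for a balanced binary search tree.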


Adding to a b-tree

• Very easy when there’s room in the leaf node: insert 57


Adding to a full node

• Split into two leaves if possible: insert 55


When the interior node is full

• insert 40: cannot split easily


Splitting an interior node

• interior node covering 40: 8, 18, 26, and 35

• leaf node: 35, 36, 37, 38, 39

• Split leaf: [35, 36, 37], and [38, 39, 40]

• Split the interior node to: [8, 18], promote 26 to parent, [35, 38]
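The leaf split in this example can be sketched as a small helper. This is only an illustration of the splitting step, assuming a full leaf stored as a sorted list; the names `LeafSplit` and `insertAndSplit` are invented here, and updating the parent's keys is left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class LeafSplit {
    // Inserts item into a full sorted leaf and splits the result into two leaves.
    static List<List<Integer>> insertAndSplit(List<Integer> leaf, int item) {
        List<Integer> all = new ArrayList<>(leaf);
        int pos = Collections.binarySearch(all, item);
        if (pos < 0) pos = -pos - 1;          // convert to the insertion point
        all.add(pos, item);
        int mid = (all.size() + 1) / 2;       // left leaf takes the extra item if the count is odd
        List<List<Integer>> halves = new ArrayList<>();
        halves.add(new ArrayList<>(all.subList(0, mid)));
        halves.add(new ArrayList<>(all.subList(mid, all.size())));
        return halves;
    }
}
```

Calling `insertAndSplit` on the leaf 35, 36, 37, 38, 39 with item 40 produces the two leaves [35, 36, 37] and [38, 39, 40]. The smallest item of the new right leaf, 38, is the key that has to be added to the parent, which is what forces the interior node to split in this example.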


Splitting an interior node


b-tree insertion

• If parent is full, the process can be repeated.


b-tree deletion

• Just remove the item when at least ceil(L/2) items will still remain.

• When the number of items is too small, merge nodes.