Binary Search Trees Cormen (cap 12, Edition 3) Estruturas de Dados e seus Algoritmos (Cap 4)


Binary Search Trees

Cormen (ch. 12, 3rd edition); Estruturas de Dados e seus Algoritmos (ch. 4)


Dictionary Data Structures

Goal:

– Design a data structure to store a small set of keys S = {k1, k2, ..., kn} from a large universe U.

– It shall efficiently support

– Query(x): determine whether a key x is in S or not

– Insert(x): Add x to the set S if x is not there

– Delete(x): Remove x from S if x is there

Additional Goals

– Low memory consumption

– Efficient construction


Dictionary Data Structures

Linked Lists

Query(x): O(n) time

Insert(x): Insert at the beginning of the list: O(1) time

Delete(x): Find and then remove, O(n) time

Construction time: O(n) time

Space consumption: O(n)

Binary Search Trees

A binary search tree (BST) T for a list K = (k1 < ··· < kn) of n keys is a rooted tree that satisfies the following properties:

– It has n internal nodes and n+1 leaves

– Each internal node of T is associated with a distinct key

– [Search Property] If node v is associated with key ki then:

all nodes at the left subtree of v are associated with keys smaller than ki

all nodes at the right subtree of v are associated with keys larger than ki
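The search property can be checked mechanically. The following is a minimal sketch, assuming a hypothetical `Node` class in which `None` stands for an empty leaf; the function name `satisfies_search_property` is illustrative and not from the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None    # None plays the role of an empty leaf
    right: Optional["Node"] = None


def satisfies_search_property(node, low=None, high=None):
    """Return True if all keys in the subtree lie strictly between low and
    high and the search property holds at every node of the subtree."""
    if node is None:
        return True
    if low is not None and node.key <= low:
        return False
    if high is not None and node.key >= high:
        return False
    # Keys on the left must be smaller than node.key, keys on the right larger.
    return (satisfies_search_property(node.left, low, node.key)
            and satisfies_search_property(node.right, node.key, high))


good = Node(2, Node(1), Node(4, Node(3), Node(5)))
bad = Node(2, Node(1), Node(4, Node(1), Node(5)))  # a 1 sits to the right of 2

print(satisfies_search_property(good), satisfies_search_property(bad))
```

Passing the bounds down the recursion matters: comparing each node only with its direct children would wrongly accept the second tree.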

How does a BST work?

Search property:

[Figure: a node with key x; every key y in its left subtree satisfies y < x, and every key z in its right subtree satisfies x < z.]

Binary Search Trees

[Figure: example BST with root k2; children k1 and k4; k4 has children k3 and k5.]

Binary Search Trees

Keys are elements from a totally ordered set U

– U can be the set of integers

– U can be the set of students from a university (ordered, for example, by student ID)

Binary Search Trees

Additional Properties

– The key with minimum value is stored in the leftmost node of the tree

– The key with maximum value is stored in the rightmost node of the tree

Binary Search Trees

Basic Operations

– Query(x): Determine whether x belongs to T or not

– Insert(x): if x is not in T, then insert x in T.

– Delete(x): If x in T, then remove x from T

BST: Query(x)

Algorithm Query(x, T)
  If root(T) is a leaf then
    Return "element was not found"
  End If
  If x = root(T).key then
    Return "element was found"
  Else if x < root(T).key then
    Return Query(x, left subtree of T)
  Else
    Return Query(x, right subtree of T)
  End If
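The query procedure can be sketched in a few lines. This is an illustrative version, assuming a hypothetical `Node` class where `None` represents a leaf; it walks down iteratively instead of recursing.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left      # None plays the role of an empty leaf
        self.right = right


def query(node, x):
    """Return True if key x is stored in the subtree rooted at node."""
    while node is not None:
        if x == node.key:
            return True
        # The search property tells us which subtree can contain x.
        node = node.left if x < node.key else node.right
    return False              # reached a leaf: x is not in the tree


# Tree shaped like the k2 / k1, k4 / k3, k5 example, with keys 1..5.
root = Node(2, Node(1), Node(4, Node(3), Node(5)))
print(query(root, 3), query(root, 6))
```

Each iteration moves one level down, so the number of comparisons is bounded by the height of the tree.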

[Figure: Query(x) on the example BST with root k2; children k1 and k4; k4 has children k3 and k5.]

Inserting a new key

Add the new element to the tree at the correct position, so that the search property is preserved.

Algorithm Insert(x, T)
  If root(T) is a leaf then
    Associate the leaf with x
    Return
  End If
  If x = root(T).key then
    Return 'x is already in T'
  End If
  If x < root(T).key then
    Insert(x, left subtree of T)
  Else
    Insert(x, right subtree of T)
  End If
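A minimal sketch of the insertion procedure, assuming a hypothetical `Node` class with `None` as the empty leaf. The helper `inorder` is added only to inspect the result.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None      # None plays the role of an empty leaf
        self.right = None


def insert(node, x):
    """Insert x into the subtree rooted at node and return the subtree root."""
    if node is None:
        return Node(x)        # the leaf becomes a node holding x
    if x < node.key:
        node.left = insert(node.left, x)
    elif x > node.key:
        node.right = insert(node.right, x)
    # x == node.key: x is already in the tree, nothing to do
    return node


def inorder(node):
    """Return the keys of the subtree in increasing order."""
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)


root = None
for key in [50, 20, 39, 8, 79, 26]:
    root = insert(root, key)

print(inorder(root))
```

An in-order traversal of a BST lists the keys in sorted order, which gives a quick sanity check that the search property was preserved.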

Example: Insert(50), Insert(20), Insert(39), Insert(8), Insert(79), Insert(26)

[Figure: the resulting BST. Root 50 with children 20 and 79; 20 has children 8 and 39; 39 has left child 26.]

Removing a node in a BST

Cases:

– Removing a leaf

– Removing an internal node with a single child

– Removing an internal node with two children

Removing a Leaf

[Figure: an example BST with keys 1, 2, 3, 4, 6, 8, shown before and after removing the leaf 3.]

Removing an internal node with a single child

The parent's pointer must be corrected so that it "jumps" over the removed node: the only child of the removed node becomes the right (left) child of the removed node's parent.

[Figure: an example BST with keys 1, 2, 3, 4, 6, 8, shown before and after removing node 4, which has a single child.]

Removing an internal node with two children

• Find the element that immediately precedes the element to be removed in key order (this is the rightmost element of the left subtree)

• Swap the contents of the node to be removed with those of the node found

• Remove the node found, which now holds the key to be deleted and has at most one child
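The three removal cases can be combined into one routine. The sketch below is illustrative: the `Node` class, the helper names and the insertion order used to build the example tree are assumptions, chosen so that the tree holds the keys 1, 2, 3, 4, 6, 8 used in the figures.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None


def insert(node, x):
    if node is None:
        return Node(x)
    if x < node.key:
        node.left = insert(node.left, x)
    elif x > node.key:
        node.right = insert(node.right, x)
    return node


def delete(node, x):
    """Remove x from the subtree rooted at node and return the new root."""
    if node is None:
        return None                        # x is not in the tree
    if x < node.key:
        node.left = delete(node.left, x)
    elif x > node.key:
        node.right = delete(node.right, x)
    elif node.left is None:
        return node.right                  # leaf, or only a right child
    elif node.right is None:
        return node.left                   # only a left child
    else:
        pred = node.left                   # predecessor: rightmost node
        while pred.right is not None:      # of the left subtree
            pred = pred.right
        node.key = pred.key                # copy the predecessor's key here
        node.left = delete(node.left, pred.key)
    return node


def inorder(node):
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)


root = None
for key in [6, 2, 8, 1, 4, 3]:             # assumed insertion order
    root = insert(root, key)
root = delete(root, 6)                     # node 6 has two children
print(root.key, inorder(root))
```

Copying the predecessor's key and then deleting the predecessor has the same effect as the swap described above; the predecessor has no right child, so its removal always falls into one of the two simpler cases.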

[Figure: removing node 6, which has two children, from the example BST with keys 1, 2, 3, 4, 6, 8: its contents are swapped with those of its predecessor 4, and the node that held 4 is then removed.]

Binary Search Trees: Operations Complexity

Basic Operations

– Query(x): determine whether x belongs to T or not. Number of operations = O(height of T)

– Insert(x): if x is not in T, then insert x in T. Number of operations = O(height of T)

– Delete(x): if x is in T, then remove x from T. Number of operations = O(height of T)

– Max(T) and Min(T). Number of operations = O(height of T)


Shallow trees are desirable

Binary Search Trees: Construction

• Simple approach: let k1, ..., kn be the keys, not necessarily sorted. Proceed as follows:

insert(k1), insert(k2), ..., insert(kn)

Example: 50, 20, 39, 8, 79, 26, 58, 15, 88, 4, 85, 96, 71, 42, 53.

[Figure: the resulting BST. Root 50 with children 20 and 79; 20 has children 8 and 39; 79 has children 58 and 88; on the last level, 8 has children 4 and 15, 39 has children 26 and 42, 58 has children 53 and 71, and 88 has children 85 and 96.]

If the keys are inserted in sorted order, the resulting tree is a path, so its height is Θ(n).

For a random permutation of the first n integers, the expected height is O(log n) (Cormen, 12.4)
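The effect of the insertion order on the height can be observed directly. The sketch below is illustrative only: the `Node` class, the iterative `insert`, the choice n = 200 and the fixed random seed are assumptions, and height is counted here as the number of nodes on the longest root-to-bottom path.

```python
import math
import random


class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None


def insert(root, key):
    """Iterative BST insertion; returns the root of the tree."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
                return root
            node = node.left
        elif key > node.key:
            if node.right is None:
                node.right = Node(key)
                return root
            node = node.right
        else:
            return root                    # key already present


def height(node):
    """Number of nodes on the longest downward path (0 for an empty tree)."""
    return 0 if node is None else 1 + max(height(node.left), height(node.right))


def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root


n = 200
sorted_height = height(build(range(1, n + 1)))     # keys arrive in order

keys = list(range(1, n + 1))
random.Random(0).shuffle(keys)                     # one fixed random order
shuffled_height = height(build(keys))

print(sorted_height, shuffled_height)
```

With sorted input every new key becomes the right child of the previous one, so the tree is a path with n nodes. The shuffled run is a single sample, not a proof of the expected-height result.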

Binary Search Trees: Construction

• Another approach: sort the keys and build the tree recursively around the median key. Call BST(1, n):

BST(i, j)
  If i > j then Return an empty tree (a leaf)
  m <- ⌊(i + j)/2⌋
  root(T) <- km (the median key of ki, ..., kj)
  left(root) <- BST(i, m-1)
  right(root) <- BST(m+1, j)
  Return T
End
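A minimal sketch of the median-based construction, assuming a hypothetical `Node` class and 0-based list indices. The function name `build_balanced` is illustrative.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def build_balanced(keys, lo=0, hi=None):
    """Build a BST from the sorted list keys[lo..hi] (inclusive bounds)."""
    if hi is None:
        hi = len(keys) - 1
    if lo > hi:
        return None                        # empty range: a leaf
    mid = (lo + hi) // 2                   # the median goes to the root
    return Node(keys[mid],
                build_balanced(keys, lo, mid - 1),
                build_balanced(keys, mid + 1, hi))


def height(node):
    """Number of nodes on the longest downward path (0 for an empty tree)."""
    return 0 if node is None else 1 + max(height(node.left), height(node.right))


keys = sorted([50, 20, 39, 8, 79, 26, 58, 15, 88, 4, 85, 96, 71, 42, 53])
root = build_balanced(keys)
print(root.key, height(root))
```

The median is excluded from both recursive calls, so every key is placed exactly once and the two halves differ in size by at most one. After sorting, the construction itself takes linear time.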

Relation between #nodes and height of a binary tree

The number of nodes can at most double from one level to the next, so a binary tree with height h (counted in levels of nodes) has at most

$2^0 + 2^1 + 2^2 + \dots + 2^{h-1} = 2^h - 1$ nodes.

Or equivalently: the height of every binary search tree with n nodes is at least $\log_2 n$.

The tree may become unbalanced

[Figure: the example BST with keys 1, 2, 3, 4, 6, 8 after removing node 8 and then node 1, leaving the keys 6, 2, 3, 4.]

The binary tree may become degenerate after a sequence of insertions and removals, turning into a list, for example.

Balanced Trees

Cormen (ch. 13, 3rd edition); Estruturas de Dados e seus Algoritmos (ch. 5)

AVL TREES

(Adelson-Velskii and Landis 1962)

BSTs that keep the tree reasonably balanced at all times.

Key idea: if an insertion or deletion puts the tree out of balance, fix it immediately

All operations (insert, delete, ...) can be done on an AVL tree with N nodes in O(log N) time (worst case)

AVL Tree Property: a BST in which the heights of the left and right subtrees of the root differ by at most 1, and in which the left and right subtrees are themselves AVL trees

Height: length of the longest path from the root to a leaf.
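The recursive definition translates into a recursive balance check. The sketch below is illustrative: it assumes a hypothetical `Node` class, counts the height of an empty subtree as 0, and checks only the balance condition, not the search property.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def avl_height(node):
    """Return the height of the subtree if it is balanced at every node,
    or -1 if some node has subtrees whose heights differ by more than 1."""
    if node is None:
        return 0
    lh = avl_height(node.left)
    rh = avl_height(node.right)
    if lh < 0 or rh < 0 or abs(lh - rh) > 1:
        return -1
    return 1 + max(lh, rh)


# Tree with root 44, children 17 and 78, as in the examples of this section.
tree = Node(44,
            Node(17, None, Node(32)),
            Node(78, Node(50, Node(48), Node(62)), Node(88)))
before = avl_height(tree)

tree.right.left.right.left = Node(54)      # hang 54 below 62
after = avl_height(tree)
print(before, after)
```

Returning the height and the verdict in a single pass keeps the check linear in the number of nodes.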

AVL TREES: Example

An example of an AVL tree, with the height of each node in parentheses:

[Figure: root 44 (4) with children 17 (2) and 78 (3); 17 has right child 32 (1); 78 has children 50 (2) and 88 (1); 50 has children 48 (1) and 62 (1).]

Relation between #nodes and height of an AVL tree

Let r be the root of an AVL tree T of height h, with left subtree Te and right subtree Td.

Let $N_h$ denote the minimum number of nodes in an AVL tree of height h.

One of the subtrees of r has height h-1, and the other has height h-1 or h-2. Hence

$N_h \ge 1 + N_{h-1} + N_{h-2}$

$N_h$ grows faster than the Fibonacci sequence: $N_h \ge 1.618^{h-2}$

Taking logarithms, the height of an AVL tree with N nodes is at most $\log_{1.618} N + 2 \approx 1.44 \log_2 N + 2$.
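The recurrence can be evaluated numerically to check both bounds. This sketch assumes base cases N_1 = 1 and N_2 = 2, i.e. height counted in nodes so that a single node has height 1 (as in the example figure with heights next to the nodes); the function name `min_nodes` is illustrative.

```python
import math


def min_nodes(h):
    """Minimum number of nodes of an AVL tree of height h, using
    N_1 = 1, N_2 = 2 and N_h = 1 + N_(h-1) + N_(h-2)."""
    a, b = 1, 2                # N_1, N_2
    if h == 1:
        return a
    for _ in range(h - 2):
        a, b = b, 1 + a + b
    return b


for h in range(1, 31):
    n_h = min_nodes(h)
    assert n_h >= 1.618 ** (h - 2)            # growth bound
    assert h <= 1.44 * math.log2(n_h) + 2     # resulting height bound

print([min_nodes(h) for h in range(1, 8)])
```

Trees with the minimum number of nodes for their height are the worst case for the height bound, so checking the bound on N_h covers every AVL tree of that height.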

Height of AVL Tree

The height of the tree is O(log N), where N is the number of elements contained in the tree.

This implies that the tree search operations Query(), Max() and Min() take O(log N) time.

Insertion in an AVL Tree

Insertion is as in a binary search tree (always done by expanding an external node)

Example: insert node 54

[Figure: the AVL tree with root 44; children 17 and 78; 17 has right child 32; 78 has children 50 and 88; 50 has children 48 and 62. After 54 is inserted as the left child of 62, node 78 has height 4, with a left subtree of height 3 and a right subtree of height 1. Unbalanced!]

How does the AVL tree work?

After an insertion or deletion, we examine the tree structure and check whether any node violates the AVL tree property.

If the AVL property is violated at node x, the heights of left(x) and right(x) differ by exactly 2.

If the property is violated, we modify the tree structure using "rotations" to restore the AVL tree property.

Rotations

Two types of rotations:

– Single rotations: two nodes are "rotated"

– Double rotations: three nodes are "rotated"

Localizing the problem

Two principles:

• Imbalance will only occur on the path from the inserted/deleted node to the root (only these nodes have had their subtrees altered - local problem)

• Rebalancing should occur at the deepest unbalanced node (local solution too)

Single Rotation (Right): Case I

• Rotate x with left child y

• x and y satisfy the AVL property after the rotation

Single Rotation (Left): Case II

• Rotate x with right child y

• x and y satisfy the AVL property after the rotation
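Both single rotations only redirect a couple of pointers. The sketch below is illustrative: it assumes a hypothetical `Node` class that stores its own height (an empty subtree has height 0), and the names `rotate_right` and `rotate_left` are not from the slides.

```python
def height(node):
    return node.height if node is not None else 0


class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right
        self.height = 1 + max(height(left), height(right))


def update(node):
    node.height = 1 + max(height(node.left), height(node.right))


def rotate_right(x):
    """Case I: rotate x with its left child y; y becomes the subtree root."""
    y = x.left
    x.left = y.right           # middle subtree moves from y to x
    y.right = x
    update(x)                  # x is now below y, so update it first
    update(y)
    return y


def rotate_left(x):
    """Case II: mirror image, rotate x with its right child y."""
    y = x.right
    x.right = y.left
    y.left = x
    update(x)
    update(y)
    return y


chain = Node(3, Node(2, Node(1)))          # left-leaning path 3 -> 2 -> 1
root = rotate_right(chain)
print(root.key, root.left.key, root.right.key, root.height)
```

The caller must store the returned node in place of x, since the root of the rotated subtree changes. The in-order sequence of keys is the same before and after a rotation, so the search property is preserved.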

Single Rotation - Example

[Figure: an AVL tree whose subtree heights are h and h+1. After node 02 is added, the heights are h and h+2, so the tree violates the AVL definition and a rotation is performed. The unbalanced part consists of nodes x and y with subtrees A, B and C; the same nodes and subtrees are shown again after the rotation.]

Single Rotation

Sometimes a single rotation fails to solve the problem

[Figure: a single rotation between k2 and k1, with subtrees X, Y and Z, that leaves the tree unbalanced.]

In such cases, we need to use a double rotation

Double Rotations: Case IV

[Figure: an AVL tree whose subtree heights are h and h+1. After node 94 is deleted, the heights are h and h+2 and the AVL property is violated. The unbalanced part consists of nodes x, y and z with subtrees A, B1, B2 and C; after the double rotation, z is the root of this part, with x and y as its children above the subtrees A, B1, B2 and C.]

Insertion

We store the height of each node x in order to check the AVL property

Part 1. Perform normal BST insertion

Part 2. Check the AVL property and restore it if necessary.

– To check whether the AVL property still holds, we only need to check the nodes on the path from the new leaf to the root of the BST, because the balance of the other nodes is not affected

– Node x is balanced if the heights of left(x) and right(x) differ by at most 1

– We update the heights of the visited nodes along the way, using the identity

Height(x) = 1 + max { Height(left(x)), Height(right(x)) }

Insertion: Part 2 Detailed

For each x in the path from the inserted leaf towards the root:

  If the heights of left(x) and right(x) differ by at most 1
    Do nothing
  Else (one of the subtrees of x has height h and the other h+2)
    If the height of left(x) is h+2 then
      – If the height of left(left(x)) is h+1, we single rotate with left child (case 1)
      – Otherwise, the height of right(left(x)) is h+1 and we double rotate with left child (case 3)
    Otherwise, the height of right(x) is h+2
      – If the height of right(right(x)) is h+1, then we single rotate with right child (case 2)
      – Otherwise, the height of left(right(x)) is h+1 and we double rotate with right child (case 4)
    Break For
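The four cases can be put together in a compact recursive insertion. The sketch below is an illustration under stated assumptions: the `Node` class with a stored height, the helper names, and the choice to call `rebalance` at every node on the way back up (instead of stopping after the first rotation) are not from the slides.

```python
def height(node):
    return node.height if node is not None else 0


class Node:
    def __init__(self, key):
        self.key, self.left, self.right, self.height = key, None, None, 1


def update(node):
    node.height = 1 + max(height(node.left), height(node.right))


def rotate_right(x):
    y = x.left
    x.left, y.right = y.right, x
    update(x)
    update(y)
    return y


def rotate_left(x):
    y = x.right
    x.right, y.left = y.left, x
    update(x)
    update(y)
    return y


def rebalance(x):
    update(x)
    if height(x.left) - height(x.right) == 2:
        if height(x.left.left) < height(x.left.right):     # case 3
            x.left = rotate_left(x.left)
        return rotate_right(x)                             # case 1
    if height(x.right) - height(x.left) == 2:
        if height(x.right.right) < height(x.right.left):   # case 4
            x.right = rotate_right(x.right)
        return rotate_left(x)                              # case 2
    return x


def insert(node, key):
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    return rebalance(node)     # fix heights and balance on the way up


def inorder(node):
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)


root = None
for key in [44, 17, 78, 32, 50, 88, 48, 62, 54]:
    root = insert(root, key)
print(root.key, root.right.key, root.height)
```

Inserting 54 last unbalances node 78 (left subtree of height 3, right subtree of height 1) with the taller grandchild on the inside, so a double rotation is applied and 62 becomes the root of that subtree. A double rotation is implemented here as two single rotations.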

Insertion: Correctness

• Let x be the deepest node that does not satisfy the AVL property.

Assume that case 2 occurs (the new element is inserted in tree C):

• x and y satisfy the property after the rotation.

• The ancestors of x satisfy the property because height(x) before the insertion is h+2 and height(y) after the rotation is also h+2.

• The nodes in the path between the new element and y also satisfy the AVL property, due to the assumption that x is the deepest node for which the AVL property does not hold.

• Nodes that are not in the path from the root to the new element are not affected.

Assume that case 4 occurs (the new element is inserted in tree B1):

• x, y and z satisfy the property after the rotation.

• The ancestors of x are balanced after the rotation because the height of x is h+2 before the insertion and the height of z is h+2 after the rotation.

• The remaining nodes in the path between the new element and x also satisfy the property, due to the assumption that x is the deepest node that does not satisfy the AVL property.

• The nodes that are not in the path between the new element and x are not affected.

Insertion: Complexity

• Performing a rotation takes O(1) time, since we just update a few pointers

• Finding a node that violates the AVL property takes time proportional to the height of the tree, which is O(log N)

Deletion

Perform normal BST deletion

Perform checks similar to those used for insertion to restore the AVL property (unlike insertion, a deletion may require rotations at several nodes on the path to the root)

Summary AVL Trees

– Maintains a balanced tree

– Modifies the insertion and deletion routines

– Performs single or double rotations to restore the structure

– Guarantees that the height of the tree is O(log n)

This guarantee directly implies that find(), min() and max() run in O(log n) time

Other Balanced Trees

• Red-Black Trees (Cormen ch. 13, Jayme ch. 6)

• 2-3 Trees (Hopcroft)

Dictionary Problem: non-uniform access probabilities

We want to keep a data structure that supports a sequence of INSERT, QUERY and DELETE operations

– Some elements are accessed much more often than others: the access probabilities are non-uniform

Consider the following AVL tree:

[Figure: root 44; children 17 and 78; 17 has right child 32; 78 has children 50 and 88; 50 has children 48 and 62.]

Suppose we want to search for the following sequence of elements: 48, 48, 48, 48, 62, 62, 62, 48, 62.

In this case, is this a good structure? The keys 48 and 62 sit on the deepest level, so every search in the sequence follows a longest path of the tree.

[Figure: a BST on the same keys with root 48; children 32 and 62; 32 has children 17 and 44; 62 has children 50 and 78; 78 has right child 88.]

This structure is much better!

Dictionary Problem: non-uniform access probabilities

Application: building inverted indexes

Given a text T, we want to design an inverted index S for T, that is, a structure that maintains, for every word x of T, the list of positions where x occurs.

T          ALO ALO MEU AMIGO …. ALO AMIGO MEU
Positions  123456789... 12 ... 30 34 40

ALO    1, 4, 30
AMIGO  12, 34
MEU    9, 40

We do not know the list of words beforehand; some words may occur much more frequently than others

Dictionary Problem: non-uniform access probabilities

Static case: the access probability distribution is known beforehand

• Lists

• Optimal Binary Search Trees

Dynamic case: the access probability distribution is not known beforehand

• Self-adjusting lists

• Self-adjusting binary search trees
  • Splay trees

Dictionary Problem with non-uniform access probabilities

Problem

Given a sequence K = k1 < k2 < ··· < kn of n sorted keys, with a search probability pi for each key ki.

– We assume that we always search for an element that belongs to K. This assumption can be easily removed.

We want to design a data structure with minimum expected search cost.

Actual cost = # of items examined.

– For key ki, the cost is the number of elements accessed until ki is found

Optimal Binary Search Trees

Cormen (ch. 15.5, 3rd edition); Estruturas de Dados e seus Algoritmos (ch. 4)

Dictionary Problem: non-uniform access probabilities

Approach 1: Linked lists

Put the elements with the highest access probabilities at the beginning of the list

Keys (1, 2, 3, 4, 5); p = (0.1, 0.3, 0.2, 0.05, 0.15)

Best possible linked list:

2 3 5 1 4

Expected cost of accessing an element = 1×0.3 + 2×0.2 + 3×0.15 + 4×0.1 + 5×0.05 = 1.8
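The ordering and the expected cost above can be reproduced with a short computation. This is a sketch using the probabilities exactly as listed on the slide (as given, they sum to 0.8 rather than 1); the variable names are illustrative.

```python
p = {1: 0.1, 2: 0.3, 3: 0.2, 4: 0.05, 5: 0.15}

# Most frequently accessed keys go to the front of the list.
order = sorted(p, key=lambda k: p[k], reverse=True)

# Finding the key at position pos (1-based) examines pos elements.
expected_cost = sum(pos * p[k] for pos, k in enumerate(order, start=1))

print(order, round(expected_cost, 2))
```

Sorting by decreasing probability is optimal for a static list: if a more likely key sat behind a less likely one, swapping the two would lower the expected cost.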

Dictionary Problem with non-uniform access probabilities

Approach 2: Binary Search Tree

Given a sequence K = k1 < k2 < ··· < kn of n sorted keys, with a search probability pi for each key ki.

We want to build a binary search tree (BST) with minimum expected search cost.

Actual cost = # of items examined.

For key ki, cost = depthT(ki) + 1,

where depthT(ki) = depth of ki in BST T (the root is at depth 0).

Expected Search Cost

$$
\begin{aligned}
E[\text{search cost in } T] &= \sum_{i=1}^{n} (\mathrm{depth}_T(k_i) + 1)\, p_i \\
&= \sum_{i=1}^{n} \mathrm{depth}_T(k_i)\, p_i + \sum_{i=1}^{n} p_i \\
&= 1 + \sum_{i=1}^{n} \mathrm{depth}_T(k_i)\, p_i \qquad \text{(Identity (1))}
\end{aligned}
$$

The last step uses the fact that the sum of the probabilities is 1.

Example

Consider 5 keys with these search probabilities:

p1 = 0.25, p2 = 0.2, p3 = 0.05, p4 = 0.2, p5 = 0.3.

[Figure: BST with root k2; children k1 and k4; k4 has children k3 and k5.]

i      depthT(ki)   depthT(ki) · pi
1      1            0.25
2      0            0
3      2            0.1
4      1            0.2
5      2            0.6
Total               1.15

Therefore, E[search cost] = 2.15.
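The expected search cost of a given tree can be computed by a traversal that tracks the depth. The sketch below is illustrative: trees are encoded as nested tuples `(left, i, right)` with `None` for an empty subtree, an encoding chosen here for brevity. It evaluates the tree above and the alternative tree with root k2, children k1 and k5, k4 below k5 and k3 below k4.

```python
def expected_cost(tree, p, depth=0):
    """Sum of (depth + 1) * p[i] over all keys k_i stored in the tree."""
    if tree is None:
        return 0.0
    left, i, right = tree
    return ((depth + 1) * p[i]
            + expected_cost(left, p, depth + 1)
            + expected_cost(right, p, depth + 1))


def leaf(i):
    return (None, i, None)


p = {1: 0.25, 2: 0.2, 3: 0.05, 4: 0.2, 5: 0.3}

tree_a = (leaf(1), 2, (leaf(3), 4, leaf(5)))          # k4 has children k3, k5
tree_b = (leaf(1), 2, ((leaf(3), 4, None), 5, None))  # k5 above k4 above k3

print(round(expected_cost(tree_a, p), 2), round(expected_cost(tree_b, p), 2))
```

The second tree is deeper but cheaper, because the frequently accessed key k5 moves one level closer to the root.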

Example

p1 = 0.25, p2 = 0.2, p3 = 0.05, p4 = 0.2, p5 = 0.3.

[Figure: BST with root k2; children k1 and k5; k5 has left child k4; k4 has left child k3.]

i      depthT(ki)   depthT(ki) · pi
1      1            0.25
2      0            0
3      3            0.15
4      2            0.4
5      1            0.3
Total               1.10

Therefore, E[search cost] = 2.10.

This tree turns out to be optimal for this set of keys.

Example

Observations:

Optimal BST may not have smallest height.

Optimal BST may not have highest-probability key at root.

Build by exhaustive checking?

Construct each n-node BST.

For each, assign keys and compute expected search cost.

But there are (4n/n3/2) different BSTs with n nodes.

Page 108: Binary Search Trees Cormen (cap 12, Edition 3) Estruturas de Dados e seus Algoritmos (Cap 4)

Optimal Substructure

Any subtree of a BST contains keys in a contiguous range ki, ..., kj

for some 1 ≤ i ≤ j ≤ n.

If T is an optimal BST and T contains a subtree T’ with keys ki, ..., kj, then

T’ must be an optimal BST for keys ki, ..., kj.

Proof: Otherwise, we could obtain a tree better than T by replacing T’ with an optimal BST for keys ki, ..., kj.

[Figure: tree T containing the subtree T’.]

Page 109: Binary Search Trees Cormen (cap 12, Edition 3) Estruturas de Dados e seus Algoritmos (Cap 4)

Optimal Substructure

One of the keys in ki, …,kj, say kr, where i ≤ r ≤ j,

must be the root of an optimal subtree for these keys.

Left subtree of kr contains ki, ..., kr-1.

Right subtree of kr contains kr+1, ..., kj.

To find an optimal BST:

Examine all candidate roots kr, for i ≤ r ≤ j.

Determine all optimal BSTs containing ki, ..., kr-1 and containing kr+1, ..., kj.

[Figure: root kr with a left subtree spanning ki, ..., kr-1 and a right subtree spanning kr+1, ..., kj.]

Page 110: Binary Search Trees Cormen (cap 12, Edition 3) Estruturas de Dados e seus Algoritmos (Cap 4)

Recursive Solution

When the OPT subtree becomes a subtree of a node, the depth of every node in the OPT subtree goes up by 1, so the expected search cost increases by

\[
w(i, j) = \sum_{l=i}^{j} p_l \qquad \text{(from Identity (1))}
\]
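As a sketch of why the increase is exactly w(i, j): write T’ for the OPT subtree on keys ki, ..., kj (the name T’ is an assumption borrowed from the optimal-substructure slide) and cost(T’) for its expected search cost before it is attached. After attaching, every depth grows by 1:

```latex
\[
\sum_{l=i}^{j} \bigl(\mathrm{depth}_{T'}(k_l) + 1 + 1\bigr)\, p_l
= \sum_{l=i}^{j} \bigl(\mathrm{depth}_{T'}(k_l) + 1\bigr)\, p_l + \sum_{l=i}^{j} p_l
= \mathrm{cost}(T') + w(i, j)
\]
```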

Page 111: Binary Search Trees Cormen (cap 12, Edition 3) Estruturas de Dados e seus Algoritmos (Cap 4)

Recursive Solution

[Figure: the tree rooted at k2 (children k1 and k4; k4 has children k3 and k5), shown alone and then attached as a subtree of a new node k0, where the depth of each of its nodes is 1 larger.]

Page 112: Binary Search Trees Cormen (cap 12, Edition 3) Estruturas de Dados e seus Algoritmos (Cap 4)

Recursive Solution

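e[i, j]: cost of the optimal BST for ki, ..., kj.

If kr is the root of an optimal BST for ki, ..., kj:

e[i, j] = pr + (e[i, r-1] + w(i, r-1)) + (e[r+1, j] + w(r+1, j))

        = e[i, r-1] + e[r+1, j] + w(i, j),

since w(i, j) = w(i, r-1) + pr + w(r+1, j).

But we don’t know kr. Hence,

\[
e[i, j] =
\begin{cases}
0 & \text{if } j = i - 1 \\
\min_{i \le r \le j} \{\, e[i, r-1] + e[r+1, j] + w(i, j) \,\} & \text{if } i \le j
\end{cases}
\]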
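The recurrence can be transcribed almost literally as a memoized recursion. This is a minimal sketch, not the table-based algorithm of the following slides; the function name `optimal_cost` and the prefix-sum helper used for w(i, j) are assumptions made for the example.

```python
from functools import lru_cache


def optimal_cost(p):
    """Return e[1, n] for probabilities p[1..n] (p[0] is unused)."""
    n = len(p) - 1

    # prefix[k] = p[1] + ... + p[k], so w(i, j) = prefix[j] - prefix[i - 1].
    prefix = [0] * (n + 1)
    for k in range(1, n + 1):
        prefix[k] = prefix[k - 1] + p[k]

    def w(i, j):
        return prefix[j] - prefix[i - 1]

    @lru_cache(maxsize=None)
    def e(i, j):
        if j == i - 1:  # empty key range
            return 0
        # Try every candidate root r and keep the cheapest combination.
        return w(i, j) + min(e(i, r - 1) + e(r + 1, j) for r in range(i, j + 1))

    return e(1, n)
```

For the five-key example, `optimal_cost([0, 0.25, 0.2, 0.05, 0.2, 0.3])` should agree with the 2.10 found above, up to floating-point rounding.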

Page 113: Binary Search Trees Cormen (cap 12, Edition 3) Estruturas de Dados e seus Algoritmos (Cap 4)

Computing an Optimal Solution

For each subproblem (i,j), store:

expected search cost in a table e[1 .. n+1, 0 .. n]

Will use only entries e[i, j], where j ≥ i-1.

root[i, j] = root of subtree with keys ki, ..., kj, for 1 ≤ i ≤ j ≤ n.

w[1 .. n+1, 0 .. n] = sum of probabilities

w[i, i-1] = 0 for 1 ≤ i ≤ n+1.

w[i, j] = w[i, j-1] + pj for 1 ≤ i ≤ j ≤ n.
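The w table can be filled directly from the two rules above. A minimal sketch, assuming a 1-indexed probability list with an unused slot at index 0; the function name `weight_table` is hypothetical.

```python
def weight_table(p):
    """Build w[1..n+1][0..n] with w[i][i-1] = 0 and w[i][j] = w[i][j-1] + p[j]."""
    n = len(p) - 1
    # Rows 0..n+1 and columns 0..n; row 0 is unused. Zero-initialisation
    # already provides the base case w[i][i-1] = 0.
    w = [[0] * (n + 1) for _ in range(n + 2)]
    for i in range(1, n + 1):
        for j in range(i, n + 1):
            w[i][j] = w[i][j - 1] + p[j]
    return w


# Probabilities of the running example, in hundredths.
w = weight_table([0, 25, 20, 5, 20, 30])
```

Filling the table this way takes O(n^2) time, so it does not dominate the cost of the main algorithm.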

Page 114: Binary Search Trees Cormen (cap 12, Edition 3) Estruturas de Dados e seus Algoritmos (Cap 4)

Pseudo-code

OPTIMAL-BST(p, n)
    for i ← 1 to n + 1
        do e[i, i-1] ← 0
           w[i, i-1] ← 0
    for len ← 1 to n                         ▹ consider all subtrees with len keys
        do for i ← 1 to n - len + 1          ▹ fix the first key
            do j ← i + len - 1               ▹ fix the last key
               w[i, j] ← w[i, j-1] + pj
               e[i, j] ← ∞
               for r ← i to j                ▹ determine the root of the optimal (sub)tree
                   do t ← e[i, r-1] + e[r+1, j] + w[i, j]
                      if t < e[i, j]
                          then e[i, j] ← t
                               root[i, j] ← r
    return e and root

Time: O(n^3)

Space: O(n^2)
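A runnable sketch of the pseudo-code above, under the assumption that p is a 1-indexed list (p[0] unused). The helper `build_tree`, which turns the root table into nested tuples, is a hypothetical addition for inspecting the result.

```python
import math


def optimal_bst(p):
    """Cubic-time DP. Returns the tables e and root, indexed [1..n+1][0..n]."""
    n = len(p) - 1
    e = [[0] * (n + 1) for _ in range(n + 2)]     # e[i][i-1] = 0 by initialisation
    w = [[0] * (n + 1) for _ in range(n + 2)]     # w[i][i-1] = 0 by initialisation
    root = [[0] * (n + 1) for _ in range(n + 2)]
    for length in range(1, n + 1):                # subtrees with `length` keys
        for i in range(1, n - length + 2):        # first key
            j = i + length - 1                    # last key
            w[i][j] = w[i][j - 1] + p[j]
            e[i][j] = math.inf
            for r in range(i, j + 1):             # candidate roots
                t = e[i][r - 1] + e[r + 1][j] + w[i][j]
                if t < e[i][j]:
                    e[i][j] = t
                    root[i][j] = r
    return e, root


def build_tree(root, i, j):
    """Rebuild the optimal BST on keys i..j as (key, left, right); None if empty."""
    if i > j:
        return None
    r = root[i][j]
    return (r, build_tree(root, i, r - 1), build_tree(root, r + 1, j))


# Running example with probabilities in hundredths, to keep arithmetic exact.
e, root = optimal_bst([0, 25, 20, 5, 20, 30])
tree = build_tree(root, 1, 5)
```

Because the comparison is strict, ties between candidate roots are resolved in favour of the smallest index.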

Page 115: Binary Search Trees Cormen (cap 12, Edition 3) Estruturas de Dados e seus Algoritmos (Cap 4)

Speeding up the Algorithm

Knuth principle: Let kr be the root of an optimal BST for the set of keys ki < ... < kj. Furthermore, let kj+1 > kj and ki-1 < ki. Then,

(i) there is an optimal BST for the set of keys ki-1, ki, ..., kj with root smaller than or equal to kr

(ii) there is an optimal BST for the set of keys ki, ki+1, ..., kj+1 with root larger than or equal to kr

Page 116: Binary Search Trees Cormen (cap 12, Edition 3) Estruturas de Dados e seus Algoritmos (Cap 4)

Knuth principle: Example

p1 = 0.25, p2 = 0.2, p3 = 0.05, p4 = 0.2, p5 = 0.3.

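• Let k0 be a key with probability p0. Then there is an optimal BST for the set (k0, ..., k5) with root smaller than or equal to k2.

[Figure: the optimal BST for k1, ..., k5: root k2 with children k1 and k5; k4 is the left child of k5 and k3 is the left child of k4.]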

Page 117: Binary Search Trees Cormen (cap 12, Edition 3) Estruturas de Dados e seus Algoritmos (Cap 4)

Knuth principle: Example

p1 = 0.25, p2 = 0.2, p3 = 0.05, p4 = 0.2, p5 = 0.3.

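• Let k6 be a key with probability p6. Then there is an optimal BST for the set (k1, ..., k6) with root larger than or equal to k2.

[Figure: the optimal BST for k1, ..., k5: root k2 with children k1 and k5; k4 is the left child of k5 and k3 is the left child of k4.]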

Page 118: Binary Search Trees Cormen (cap 12, Edition 3) Estruturas de Dados e seus Algoritmos (Cap 4)

Speeding up the Algorithm

OPTIMAL-BST-Revised(p, n)
    for i ← 1 to n + 1
        do e[i, i-1] ← 0
           w[i, i-1] ← 0
    for i ← 1 to n                           ▹ single-key subtrees
        do w[i, i] ← pi
           e[i, i] ← pi
           root[i, i] ← i
    for len ← 2 to n                         ▹ consider all subtrees with len keys; O(n) work per value of len
        do for i ← 1 to n - len + 1
            do j ← i + len - 1
               w[i, j] ← w[i, j-1] + pj
               e[i, j] ← ∞
               for r ← root[i, j-1] to root[i+1, j]   ▹ optimization: restricted root range
                   do t ← e[i, r-1] + e[r+1, j] + w[i, j]
                      if t < e[i, j]
                          then e[i, j] ← t
                               root[i, j] ← r
    return e and root

Time: O(n^2)

Space: O(n^2)
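A sketch of the revised algorithm in Python, again assuming a 1-indexed probability list. Single-key subtrees are initialised explicitly, since the restricted range root[i, j-1] .. root[i+1, j] is only defined once both shorter subproblems have a root. The function name `optimal_bst_knuth` is hypothetical.

```python
import math


def optimal_bst_knuth(p):
    """Quadratic-time DP using the restricted root range. p[0] is unused."""
    n = len(p) - 1
    e = [[0] * (n + 1) for _ in range(n + 2)]
    w = [[0] * (n + 1) for _ in range(n + 2)]
    root = [[0] * (n + 1) for _ in range(n + 2)]
    for i in range(1, n + 1):                     # subtrees with a single key
        w[i][i] = p[i]
        e[i][i] = p[i]
        root[i][i] = i
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            w[i][j] = w[i][j - 1] + p[j]
            e[i][j] = math.inf
            # Only roots between the two neighbouring optimal roots are tried.
            for r in range(root[i][j - 1], root[i + 1][j] + 1):
                t = e[i][r - 1] + e[r + 1][j] + w[i][j]
                if t < e[i][j]:
                    e[i][j] = t
                    root[i][j] = r
    return e, root


# Running example with probabilities in hundredths.
e_fast, root_fast = optimal_bst_knuth([0, 25, 20, 5, 20, 30])
```

On the running example this reaches the same cost and the same root as the cubic version while trying fewer candidate roots.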

Page 119: Binary Search Trees Cormen (cap 12, Edition 3) Estruturas de Dados e seus Algoritmos (Cap 4)

Speeding up the Algorithm

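• To compute an optimal BST for keys ki, ..., kj, it is enough to search for the root in the interval [root[i, j-1], root[i+1, j]].

• Therefore, the cost of finding the root of the optimal BST for the set of keys ki, ..., kj is proportional to root[i+1, j] - root[i, j-1] + 1.

• For a fixed len ≥ 2 (so that j = i + len - 1), the sum telescopes and the cost is

\[
\sum_{i=1}^{n-len+1} \bigl(root[i+1,\, i+len-1] - root[i,\, i+len-2] + 1\bigr)
= root[n-len+2,\, n] - root[1,\, len-1] + (n - len + 1) = O(n)
\]

• Adding over all possible values of len, we obtain O(n^2).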

Page 120: Binary Search Trees Cormen (cap 12, Edition 3) Estruturas de Dados e seus Algoritmos (Cap 4)

Lower Bound on the expected search cost

• Let l1, ..., ln be the levels of the leaves of a ternary tree. Then, the following inequality (Kraft's inequality) holds:

\[
\sum_{i=1}^{n} 3^{-l_i} \le 1
\]

Proof: Induction on the size of the tree

Page 121: Binary Search Trees Cormen (cap 12, Edition 3) Estruturas de Dados e seus Algoritmos (Cap 4)

Lower Bound on the expected search cost

• Let h1,..., hn be the levels of the nodes of a BST. Then, the

following inequality holds

Proof:

• Transform the BST into a ternary tree so that the depth of each node increases by at most one unit

• Use the previous equation

Page 122: Binary Search Trees Cormen (cap 12, Edition 3) Estruturas de Dados e seus Algoritmos (Cap 4)

Lower Bound on the expected search cost

• Let T be an optimal BST for the set of keys k1, ..., kn with probabilities p1, ..., pn. Then,

ExpectedCost(T) >= -1 +

Proof: ExpectedCost(T) >= z*,

where z* =

subject to

Using analytical methods one can prove that

z*= -1 +