David Luebke 1 5/22/2015 ITCS 6114 Universal Hashing Dynamic Order Statistics.

27
David Luebke 1 11/01/2 ITCS 6114 Universal Hashing Dynamic Order Statistics

Transcript of David Luebke 1 5/22/2015 ITCS 6114 Universal Hashing Dynamic Order Statistics.

Page 1: David Luebke 1 5/22/2015 ITCS 6114 Universal Hashing Dynamic Order Statistics.

David Luebke 1 04/18/23

ITCS 6114

Universal Hashing

Dynamic Order Statistics

Page 2: David Luebke 1 5/22/2015 ITCS 6114 Universal Hashing Dynamic Order Statistics.

David Luebke 2 04/18/23

Choosing A Hash Function

● Choosing the hash function well is crucial■ Bad hash function puts all elements in same slot■ A good hash function:

○ Should distribute keys uniformly into slots○ Should not depend on patterns in the data

● We discussed three methods:■ Division method■ Multiplication method■ Universal hashing

Page 3: David Luebke 1 5/22/2015 ITCS 6114 Universal Hashing Dynamic Order Statistics.

David Luebke 3 04/18/23

Universal Hashing

● When attempting to foil an malicious adversary, randomize the algorithm

● Universal hashing: pick a hash function randomly when the algorithm begins (not upon every insert!)■ Guarantees good performance on average, no

matter what keys adversary chooses■ Need a family of hash functions to choose from

Page 4: David Luebke 1 5/22/2015 ITCS 6114 Universal Hashing Dynamic Order Statistics.

David Luebke 4 04/18/23

Universal Hashing

● A family of hash functions is said to be universal if:■ With a random hash function from , the chance of a

collision between x and y is exactly 1/m (x y) ● We can use this to get good expected performance:

■ Choose h from a universal family of hash functions■ Hash n keys into a table of m slots, n m■ Then the expected number of collisions involving a

particular key x is less than 1

Page 5: David Luebke 1 5/22/2015 ITCS 6114 Universal Hashing Dynamic Order Statistics.

David Luebke 5 04/18/23

A Universal Hash Function

● Choose table size m to be prime● Decompose key x into r+1 bytes, so that

x = {x0, x1, …, xr}■ Only requirement is that max value of byte < m

■ Let a = {a0, a1, …, ar} denote a sequence of r+1 elements chosen randomly from {0, 1, …, m - 1}

■ Define corresponding hash function ha :

■ With this definition, has mr+1 members

r

iiia mxaxh

0

mod

Page 6: David Luebke 1 5/22/2015 ITCS 6114 Universal Hashing Dynamic Order Statistics.

David Luebke 6 04/18/23

A Universal Hash Function

● is a universal collection of hash functions (Theorem 12.4)

● How to use:■ Pick r based on m and the range of keys in U■ Pick a hash function by (randomly) picking the a’s■ Use that hash function on all keys

Page 7: David Luebke 1 5/22/2015 ITCS 6114 Universal Hashing Dynamic Order Statistics.

David Luebke 7 04/18/23

Order Statistic Trees

● OS Trees augment red-black trees: ■ Associate a size field with each node in the tree■ x->size records the size of subtree rooted at x,

including x itself:M8

C5

P2

Q1

A1

F3

D1

H1

Page 8: David Luebke 1 5/22/2015 ITCS 6114 Universal Hashing Dynamic Order Statistics.

David Luebke 8 04/18/23

Selection On OS Trees

M8

C5

P2

Q1

A1

F3

D1

H1

How can we use this property to select the ith element of the set?

Page 9: David Luebke 1 5/22/2015 ITCS 6114 Universal Hashing Dynamic Order Statistics.

David Luebke 9 04/18/23

OS-Select

OS-Select(x, i)

{

r = x->left->size + 1;

if (i == r)

return x;

else if (i < r)

return OS-Select(x->left, i);

else

return OS-Select(x->right, i-r);

}

Page 10: David Luebke 1 5/22/2015 ITCS 6114 Universal Hashing Dynamic Order Statistics.

David Luebke 10 04/18/23

OS-Select Example

● Example: show OS-Select(root, 5):

M8

C5

P2

Q1

A1

F3

D1

H1

OS-Select(x, i)

{

r = x->left->size + 1;

if (i == r)

return x;

else if (i < r)

return OS-Select(x->left, i);

else

return OS-Select(x->right, i-r);

}

Page 11: David Luebke 1 5/22/2015 ITCS 6114 Universal Hashing Dynamic Order Statistics.

David Luebke 11 04/18/23

OS-Select Example

● Example: show OS-Select(root, 5):

M8

C5

P2

Q1

A1

F3

D1

H1

OS-Select(x, i)

{

r = x->left->size + 1;

if (i == r)

return x;

else if (i < r)

return OS-Select(x->left, i);

else

return OS-Select(x->right, i-r);

}

i = 5r = 6

Page 12: David Luebke 1 5/22/2015 ITCS 6114 Universal Hashing Dynamic Order Statistics.

David Luebke 12 04/18/23

OS-Select Example

● Example: show OS-Select(root, 5):

M8

C5

P2

Q1

A1

F3

D1

H1

OS-Select(x, i)

{

r = x->left->size + 1;

if (i == r)

return x;

else if (i < r)

return OS-Select(x->left, i);

else

return OS-Select(x->right, i-r);

}

i = 5r = 6

i = 5r = 2

Page 13: David Luebke 1 5/22/2015 ITCS 6114 Universal Hashing Dynamic Order Statistics.

David Luebke 13 04/18/23

OS-Select Example

● Example: show OS-Select(root, 5):

M8

C5

P2

Q1

A1

F3

D1

H1

OS-Select(x, i)

{

r = x->left->size + 1;

if (i == r)

return x;

else if (i < r)

return OS-Select(x->left, i);

else

return OS-Select(x->right, i-r);

}

i = 5r = 6

i = 5r = 2

i = 3r = 2

Page 14: David Luebke 1 5/22/2015 ITCS 6114 Universal Hashing Dynamic Order Statistics.

David Luebke 14 04/18/23

OS-Select Example

● Example: show OS-Select(root, 5):

M8

C5

P2

Q1

A1

F3

D1

H1

OS-Select(x, i)

{

r = x->left->size + 1;

if (i == r)

return x;

else if (i < r)

return OS-Select(x->left, i);

else

return OS-Select(x->right, i-r);

}

i = 5r = 6

i = 5r = 2

i = 3r = 2

i = 1r = 1

Page 15: David Luebke 1 5/22/2015 ITCS 6114 Universal Hashing Dynamic Order Statistics.

David Luebke 15 04/18/23

OS-Select: A Subtlety

OS-Select(x, i)

{

r = x->left->size + 1;

if (i == r)

return x;

else if (i < r)

return OS-Select(x->left, i);

else

return OS-Select(x->right, i-r);

}

● What happens at the leaves?● How can we deal elegantly with this?

Oops…

Page 16: David Luebke 1 5/22/2015 ITCS 6114 Universal Hashing Dynamic Order Statistics.

David Luebke 16 04/18/23

OS-Select

OS-Select(x, i)

{

r = x->left->size + 1;

if (i == r)

return x;

else if (i < r)

return OS-Select(x->left, i);

else

return OS-Select(x->right, i-r);

}

● What will be the running time?

Page 17: David Luebke 1 5/22/2015 ITCS 6114 Universal Hashing Dynamic Order Statistics.

David Luebke 17 04/18/23

Determining The Rank Of An Element

M8

C5

P2

Q1

A1

F3

D1

H1

What is the rank of this element?

Page 18: David Luebke 1 5/22/2015 ITCS 6114 Universal Hashing Dynamic Order Statistics.

David Luebke 18 04/18/23

Determining The Rank Of An Element

M8

C5

P2

Q1

A1

F3

D1

H1

Of this one? Why?

Page 19: David Luebke 1 5/22/2015 ITCS 6114 Universal Hashing Dynamic Order Statistics.

David Luebke 19 04/18/23

Determining The Rank Of An Element

M8

C5

P2

Q1

A1

F3

D1

H1

Of the root? What’s the pattern here?

Page 20: David Luebke 1 5/22/2015 ITCS 6114 Universal Hashing Dynamic Order Statistics.

David Luebke 20 04/18/23

Determining The Rank Of An Element

M8

C5

P2

Q1

A1

F3

D1

H1

What about the rank of this element?

Page 21: David Luebke 1 5/22/2015 ITCS 6114 Universal Hashing Dynamic Order Statistics.

David Luebke 21 04/18/23

Determining The Rank Of An Element

M8

C5

P2

Q1

A1

F3

D1

H1

This one? What’s the pattern here?

Page 22: David Luebke 1 5/22/2015 ITCS 6114 Universal Hashing Dynamic Order Statistics.

David Luebke 22 04/18/23

OS-Rank

OS-Rank(T, x)

{

r = x->left->size + 1;

y = x;

while (y != T->root)

if (y == y->p->right)

r = r + y->p->left->size + 1;

y = y->p;

return r;

}

● What will be the running time?

Page 23: David Luebke 1 5/22/2015 ITCS 6114 Universal Hashing Dynamic Order Statistics.

David Luebke 23 04/18/23

OS-Trees: Maintaining Sizes

● So we’ve shown that with subtree sizes, order statistic operations can be done in O(lg n) time

● Next step: maintain sizes during Insert() and Delete() operations■ How would we adjust the size fields during

insertion on a plain binary search tree?

Page 24: David Luebke 1 5/22/2015 ITCS 6114 Universal Hashing Dynamic Order Statistics.

David Luebke 24 04/18/23

OS-Trees: Maintaining Sizes

● So we’ve shown that with subtree sizes, order statistic operations can be done in O(lg n) time

● Next step: maintain sizes during Insert() and Delete() operations■ How would we adjust the size fields during

insertion on a plain binary search tree?■ A: increment sizes of nodes traversed during search

Page 25: David Luebke 1 5/22/2015 ITCS 6114 Universal Hashing Dynamic Order Statistics.

David Luebke 25 04/18/23

OS-Trees: Maintaining Sizes

● So we’ve shown that with subtree sizes, order statistic operations can be done in O(lg n) time

● Next step: maintain sizes during Insert() and Delete() operations■ How would we adjust the size fields during

insertion on a plain binary search tree?■ A: increment sizes of nodes traversed during search■ Why won’t this work on red-black trees?

Page 26: David Luebke 1 5/22/2015 ITCS 6114 Universal Hashing Dynamic Order Statistics.

David Luebke 26 04/18/23

Maintaining Size Through Rotation

● Salient point: rotation invalidates only x and y● Can recalculate their sizes in constant time

■ Why?

y19

x11

x19

y12

rightRotate(y)

leftRotate(x)

6 4

7 6

4 7

Page 27: David Luebke 1 5/22/2015 ITCS 6114 Universal Hashing Dynamic Order Statistics.

David Luebke 27 04/18/23

Augmenting Data Structures: Methodology

● Choose underlying data structure■ E.g., red-black trees

● Determine additional information to maintain■ E.g., subtree sizes

● Verify that information can be maintained for operations that modify the structure

■ E.g., Insert(), Delete() (don’t forget rotations!)

● Develop new operations■ E.g., OS-Rank(), OS-Select()