David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

39
David Luebke 1 06/29/22 CS 332: Algorithms Augmenting Data Structures

description

David Luebke 3 3/19/2016 Review: Hash Tables l More formally: n Given a table T and a record x, with key (= symbol) and satellite data, we need to support: u Insert (T, x) u Delete (T, x) u Search(T, x) n Don’t care about sorting the records l Hash tables support all the above in O(1) expected time

Transcript of David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

Page 1: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 1 05/14/23

CS 332: Algorithms

Augmenting Data Structures

Page 2: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 2 05/14/23

Administrivia

Midterm is postponed until Thursday, Oct 26 Reminder: homework 3 due today

In the CS front office Due at 5 PM (but don’t risk being there at 4:59!) Check your e-mail for some clarifications & hints

Page 3: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 3 05/14/23

Review: Hash Tables

More formally: Given a table T and a record x, with key (= symbol)

and satellite data, we need to support: Insert (T, x) Delete (T, x) Search(T, x)

Don’t care about sorting the records Hash tables support all the above in

O(1) expected time

Page 4: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 4 05/14/23

Review: Direct Addressing

Suppose: The range of keys is 0..m-1 Keys are distinct

The idea: Use key itself as the address into the table Set up an array T[0..m-1] in which

T[i] = x if x T and key[x] = i T[i] = NULL otherwise

This is called a direct-address table

Page 5: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 5 05/14/23

Review: Hash Functions

Next problem: collisionT

0

m - 1

h(k1)h(k4)

h(k2) = h(k5)

h(k3)

k4

k2 k3

k1

k5

U(universe of keys)

K(actualkeys)

Page 6: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 6 05/14/23

Review: Resolving Collisions

How can we solve the problem of collisions? Open addressing

To insert: if slot is full, try another slot, and another, until an open slot is found (probing)

To search, follow same sequence of probes as would be used when inserting the element

Chaining Keep linked list of elements in slots Upon collision, just add new element to list

Page 7: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 7 05/14/23

Review: Chaining

Chaining puts elements that hash to the same slot in a linked list:

——

——

——————

——T

k4

k2k3

k1

k5

U(universe of keys)

K(actualkeys)

k6k8

k7

k1 k4 ——

k5 k2

k3

k8 k6 ————

k7 ——

Page 8: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 8 05/14/23

Review: Analysis Of Hash Tables

Simple uniform hashing: each key in table is equally likely to be hashed to any slot

Load factor = n/m = average # keys per slot Average cost of unsuccessful search = O(1+α) Successful search: O(1+ α/2) = O(1+ α) If n is proportional to m, α = O(1)

So the cost of searching = O(1) if we size our table appropriately

Page 9: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 9 05/14/23

Review: Choosing A Hash Function

Choosing the hash function well is crucial Bad hash function puts all elements in same slot A good hash function:

Should distribute keys uniformly into slots Should not depend on patterns in the data

We discussed three methods: Division method Multiplication method Universal hashing

Page 10: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 10 05/14/23

Review: The Division Method

h(k) = k mod m In words: hash k into a table with m slots using the

slot given by the remainder of k divided by m Elements with adjacent keys hashed to

different slots: good If keys bear relation to m: bad Upshot: pick table size m = prime number not

too close to a power of 2 (or 10)

Page 11: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 11 05/14/23

Review: The Multiplication Method

For a constant A, 0 < A < 1: h(k) = m (kA - kA)

Upshot: Choose m = 2P

Choose A not too close to 0 or 1 Knuth: Good choice for A = (5 - 1)/2

Fractional part of kA

Page 12: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 12 05/14/23

Review: Universal Hashing

When attempting to foil an malicious adversary, randomize the algorithm

Universal hashing: pick a hash function randomly when the algorithm begins (not upon every insert!) Guarantees good performance on average, no

matter what keys adversary chooses Need a family of hash functions to choose from

Page 13: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 13 05/14/23

Review: Universal Hashing

Let be a (finite) collection of hash functions …that map a given universe U of keys… …into the range {0, 1, …, m - 1}.

If is universal if: for each pair of distinct keys x, y U,

the number of hash functions h for which h(x) = h(y) is ||/m

In other words: With a random hash function from , the chance of a collision

between x and y (x y) is exactly 1/m

Page 14: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 14 05/14/23

Review: A Universal Hash Function

Choose table size m to be prime Decompose key x into r+1 bytes, so that

x = {x0, x1, …, xr} Only requirement is that max value of byte < m Let a = {a0, a1, …, ar} denote a sequence of r+1 elements

chosen randomly from {0, 1, …, m - 1} Define corresponding hash function ha :

With this definition, has mr+1 members mxaxh ii

r

ia mod

0

Page 15: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 15 05/14/23

Augmenting Data Structures

This course is supposed to be about design and analysis of algorithms

So far, we’ve only looked at one design technique (What is it?)

Page 16: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 16 05/14/23

Augmenting Data Structures

This course is supposed to be about design and analysis of algorithms

So far, we’ve only looked at one design technique: divide and conquer

Next up: augmenting data structures Or, “One good thief is worth ten good scholars”

Page 17: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 17 05/14/23

Dynamic Order Statistics

We’ve seen algorithms for finding the ith element of an unordered set in O(n) time

Next, a structure to support finding the ith element of a dynamic set in O(lg n) time What operations do dynamic sets usually support? What structure works well for these? How could we use this structure for order statistics? How might we augment it to support efficient extraction

of order statistics?

Page 18: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 18 05/14/23

Order Statistic Trees

OS Trees augment red-black trees: Associate a size field with each node in the tree x->size records the size of subtree rooted at x,

including x itself:M8

C5

P2

Q1

A1

F3

D1

H1

Page 19: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 19 05/14/23

Selection On OS Trees

M8

C5

P2

Q1

A1

F3

D1

H1

How can we use this property to select the ith element of the set?

Page 20: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 20 05/14/23

OS-Select

OS-Select(x, i){ r = x->left->size + 1; if (i == r) return x; else if (i < r) return OS-Select(x->left, i); else return OS-Select(x->right, i-r);}

Page 21: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 21 05/14/23

OS-Select Example

Example: show OS-Select(root, 5):

M8

C5

P2

Q1

A1

F3

D1

H1

OS-Select(x, i){ r = x->left->size + 1; if (i == r) return x; else if (i < r) return OS-Select(x->left, i); else return OS-Select(x->right, i-r);}

Page 22: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 22 05/14/23

OS-Select Example

Example: show OS-Select(root, 5):

M8

C5

P2

Q1

A1

F3

D1

H1

OS-Select(x, i){ r = x->left->size + 1; if (i == r) return x; else if (i < r) return OS-Select(x->left, i); else return OS-Select(x->right, i-r);}

i = 5r = 6

Page 23: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 23 05/14/23

OS-Select Example

Example: show OS-Select(root, 5):

M8

C5

P2

Q1

A1

F3

D1

H1

OS-Select(x, i){ r = x->left->size + 1; if (i == r) return x; else if (i < r) return OS-Select(x->left, i); else return OS-Select(x->right, i-r);}

i = 5r = 6

i = 5r = 2

Page 24: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 24 05/14/23

OS-Select Example

Example: show OS-Select(root, 5):

M8

C5

P2

Q1

A1

F3

D1

H1

OS-Select(x, i){ r = x->left->size + 1; if (i == r) return x; else if (i < r) return OS-Select(x->left, i); else return OS-Select(x->right, i-r);}

i = 5r = 6

i = 5r = 2

i = 3r = 2

Page 25: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 25 05/14/23

OS-Select Example

Example: show OS-Select(root, 5):

M8

C5

P2

Q1

A1

F3

D1

H1

OS-Select(x, i){ r = x->left->size + 1; if (i == r) return x; else if (i < r) return OS-Select(x->left, i); else return OS-Select(x->right, i-r);}

i = 5r = 6

i = 5r = 2

i = 3r = 2

i = 1r = 1

Page 26: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 26 05/14/23

OS-Select: A Subtlety

OS-Select(x, i){ r = x->left->size + 1; if (i == r) return x; else if (i < r) return OS-Select(x->left, i); else return OS-Select(x->right, i-r);}

What happens at the leaves? How can we deal elegantly with this?

Page 27: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 27 05/14/23

OS-Select

OS-Select(x, i){ r = x->left->size + 1; if (i == r) return x; else if (i < r) return OS-Select(x->left, i); else return OS-Select(x->right, i-r);}

What will be the running time?

Page 28: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 28 05/14/23

Determining The Rank Of An Element

M8

C5

P2

Q1

A1

F3

D1

H1

What is the rank of this element?

Page 29: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 29 05/14/23

Determining The Rank Of An Element

M8

C5

P2

Q1

A1

F3

D1

H1

Of this one? Why?

Page 30: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 30 05/14/23

Determining The Rank Of An Element

M8

C5

P2

Q1

A1

F3

D1

H1

Of the root? What’s the pattern here?

Page 31: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 31 05/14/23

Determining The Rank Of An Element

M8

C5

P2

Q1

A1

F3

D1

H1

What about the rank of this element?

Page 32: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 32 05/14/23

Determining The Rank Of An Element

M8

C5

P2

Q1

A1

F3

D1

H1

This one? What’s the pattern here?

Page 33: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 33 05/14/23

OS-Rank

OS-Rank(T, x){ r = x->left->size + 1; y = x; while (y != T->root) if (y == y->p->right) r = r + y->p->left->size + 1; y = y->p; return r;} What will be the running time?

Page 34: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 34 05/14/23

OS-Trees: Maintaining Sizes

So we’ve shown that with subtree sizes, order statistic operations can be done in O(lg n) time

Next step: maintain sizes during Insert() and Delete() operations How would we adjust the size fields during

insertion on a plain binary search tree?

Page 35: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 35 05/14/23

OS-Trees: Maintaining Sizes

So we’ve shown that with subtree sizes, order statistic operations can be done in O(lg n) time

Next step: maintain sizes during Insert() and Delete() operations How would we adjust the size fields during

insertion on a plain binary search tree? A: increment sizes of nodes traversed during search

Page 36: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 36 05/14/23

OS-Trees: Maintaining Sizes

So we’ve shown that with subtree sizes, order statistic operations can be done in O(lg n) time

Next step: maintain sizes during Insert() and Delete() operations How would we adjust the size fields during

insertion on a plain binary search tree? A: increment sizes of nodes traversed during search Why won’t this work on red-black trees?

Page 37: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 37 05/14/23

Maintaining Size Through Rotation

Salient point: rotation invalidates only x and y Can recalculate their sizes in constant time

Why?

y19

x11

x19

y12

rightRotate(y)

leftRotate(x)

6 4

7 6

4 7

Page 38: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 38 05/14/23

Augmenting Data Structures: Methodology

Choose underlying data structure E.g., red-black trees

Determine additional information to maintain E.g., subtree sizes

Verify that information can be maintained for operations that modify the structure

E.g., Insert(), Delete() (don’t forget rotations!) Develop new operations

E.g., OS-Rank(), OS-Select()

Page 39: David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

David Luebke 39 05/14/23

The End

Up next: Interval trees Review for midterm