Hash Tables
Many applications require a dynamic set that only supports the dictionary operations: Insert, Delete, Search
A hash table is an efficient implementation of a dictionary
Worst-case search time is the same as for a linked list: O(n)
Under reasonable assumptions, the expected search time is O(1)
Direct-address Tables
Assumptions:
The universe U of keys is reasonably small: U = { 0, 1, 2, …, m-1 }, for some small m
No two elements have the same key
Implementation:
Allocate an array of size m
Insert the element with key k into slot k of the array
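The implementation described above can be sketched in a few lines of Python. This is only an illustration under the slide's assumptions (integer keys in { 0, …, m-1 }, no duplicate keys); the class and method names are hypothetical.

```python
class DirectAddressTable:
    """Direct-address table for integer keys in the universe {0, ..., m-1}."""

    def __init__(self, m):
        self.slots = [None] * m  # one slot per possible key

    def insert(self, key, value):
        self.slots[key] = value  # O(1): the key itself is the array index

    def search(self, key):
        return self.slots[key]  # None means "no element with this key"

    def delete(self, key):
        self.slots[key] = None


table = DirectAddressTable(m=10)
table.insert(3, "three")
table.insert(7, "seven")
table.delete(7)
```

Every operation is a single array access, which is where the O(1) bound comes from. The cost is that the array has one slot for every key in U, whether or not it is used.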
Advantage: O(1) time for all operations
Disadvantages:
Wasteful if the number of elements actually inserted is significantly smaller than the size of the universe (m)
Only applicable for small values of m, i.e. a limited range of keys
Hash Tables
Performance is close to that of a direct-address table, but without its limitations
The universe U may be very large
The storage requirement is O(|K|), where K is the set of keys actually used
Disadvantage – O(1) performance is now average case, not worst case
We need a hash function to map keys from the universe U into the slots of the hash table: $h : U \to \{0, 1, \dots, m-1\}$
For each key k, the hash function computes a hash value h(k)
If two distinct keys hash to the same value, h(k1) = h(k2), we call this a collision
Collisions
Can we avoid collisions altogether?
No. Since |U| > m, some keys must have the same hash value
A good hash function will be as ‘random’ as possible
Still, collisions must be resolved
Collision Resolution
Chaining (also called open hashing):
Elements are stored in their ‘correct’ slot
Collisions are resolved by creating linked lists
Open addressing (also called closed hashing):
All elements are stored inside the table itself
An element may be rehashed to another slot if its own slot is full
Collision resolution – Chaining
All keys that have the same hash value are placed in a linked list
Insertion can be done at the beginning of the list in O(1) time
Search time is proportional to the length of the list; with a good hash function and a table that is not overloaded, it is also O(1) on average
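A minimal sketch of chaining, assuming integer keys and the hash function h(k) = k mod m. A `collections.deque` stands in for the linked list, because it supports O(1) insertion at the front; all names here are illustrative.

```python
from collections import deque


class ChainedHashTable:
    """Hash table with chaining; each slot holds a chain of colliding keys."""

    def __init__(self, m):
        self.m = m
        self.slots = [deque() for _ in range(m)]  # one (initially empty) chain per slot

    def _h(self, key):
        return key % self.m  # assumed hash function

    def insert(self, key):
        # Insert at the head of the chain in O(1); assumes the key is not already present.
        self.slots[self._h(key)].appendleft(key)

    def search(self, key):
        # Scans only the chain of the key's slot, so time is proportional to its length.
        return key in self.slots[self._h(key)]

    def delete(self, key):
        chain = self.slots[self._h(key)]
        if key in chain:
            chain.remove(key)


chained = ChainedHashTable(m=9)
for key in (43, 34, 25):  # all three keys hash to slot 7
    chained.insert(key)
```

After these insertions the chain in slot 7 holds 25, 34, 43 in that order, since each new key goes to the front.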
Hash Function Requirements
A hash function must be deterministic – the hash value generated for each key cannot change during the life of the hash table
Equal keys must always be mapped to the same hash value
Hash Function Properties
Properties of a good hash function:
Easy to evaluate: h(x) can be computed very quickly (not only in O(1), but also with a small constant)
Uniform distribution over all the table slots
Different keys are mapped to different slots (as much as possible)
Simple Uniform Hashing
The quality of the hash function strongly influences the efficiency of the hash table
Simple Uniform Hashing assumption: The hash function will hash any key into any slot with equal probability
It is possible to define hash functions that almost satisfy this assumption
Analysis
The load factor of a hash table is defined as the number of elements stored in the table, n, divided by the total number of slots, m: $\alpha = n/m$
With chaining, a search takes $\Theta(1 + \alpha)$ time on average under the assumption of simple uniform hashing
Therefore, if $n = O(m)$, so that $\alpha = O(1)$, all hash table operations can be performed in O(1) time on average
Computing Key Values
The first step is to represent the key as a natural number
For example, if the key is a string, we can interpret it as an integer using the following formula, where the characters are indexed from 0: $\sum_{i=0}^{\text{keylength}-1} 128^{i} \cdot \text{char}(\text{key}[i])$
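The conversion of a string key to an integer might look as follows in Python. This sketch assumes ASCII keys (character codes below 128), so that the string can be read as a radix-128 number; the function name `string_to_int` is hypothetical.

```python
def string_to_int(key):
    """Interpret an ASCII string as a radix-128 number.

    The character at index i contributes 128**i * (its character code).
    """
    return sum(128 ** i * ord(ch) for i, ch in enumerate(key))


value_ab = string_to_int("ab")  # 97 * 128**0 + 98 * 128**1
value_ba = string_to_int("ba")  # 98 * 128**0 + 97 * 128**1
```

Because each position has its own weight, strings that contain the same characters in a different order map to different integers.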
The Division Method
Key k is mapped into one of m slots by taking the remainder of k divided by m: $h(k) = k \bmod m$
Choosing the value of m:
Preferably prime
Not too close to a power of 2
Example – the Division Method
Consider a hash table of 9 slots with h(k) = k mod 9. Insert the elements: 6, 43, 23, 62, 1, 13, 34, 55, 25
h(6) = 6 mod 9 = 6
h(43) = 43 mod 9 = 7
h(23) = 23 mod 9 = 5
h(62) = 62 mod 9 = 8
h(1) = 1 mod 9 = 1
h(13) = 13 mod 9 = 4
h(34) = 34 mod 9 = 7
h(55) = 55 mod 9 = 1
h(25) = 25 mod 9 = 7
Keys 43, 34 and 25 collide in slot 7, and keys 1 and 55 collide in slot 1
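The arithmetic of this example can be checked with a short Python sketch. The helper `division_hash` and the dictionary `slots` are illustrative names; keys that share a slot are simply grouped in insertion order.

```python
def division_hash(k, m):
    """Division method: h(k) = k mod m."""
    return k % m


keys = [6, 43, 23, 62, 1, 13, 34, 55, 25]
m = 9

# Group the keys by the slot they hash to, keeping insertion order within a slot.
slots = {}
for k in keys:
    slots.setdefault(division_hash(k, m), []).append(k)
```

Here `slots[7]` ends up holding 43, 34 and 25, and `slots[1]` holds 1 and 55, so nine keys occupy only six distinct slots.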
Open Addressing
Each element occupies a single slot in the hash table; no chaining is done
To insert an element, we probe the table according to the hash function until an empty slot is found
The hash function is now a function of both the key and the number of attempts in the insertion process: $h : U \times \{0, 1, \dots, m-1\} \to \{0, 1, \dots, m-1\}$
Linear Probing
A hash value is computed using any hash function h’, and then the number of the current attempt i is added to it: $h(k, i) = (h'(k) + i) \bmod m$
Slots are examined sequentially, wrapping around at the end of the table, until an empty one is found
Easy to implement, but suffers from primary clustering
Clusters tend to grow:
If an empty slot is preceded by i full slots, the probability that it will be the next one filled is (i+1)/m
If an empty slot is preceded by another empty slot, the probability is only 1/m
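To make the clustering argument concrete, the sketch below computes, for a given set of occupied slots, the probability that each empty slot is the next one filled. It assumes that h’ sends the next key to each of the m slots with equal probability and that probing wraps around the end of the table; the function name and the sample occupancy are hypothetical.

```python
from fractions import Fraction


def next_fill_probabilities(occupied, m):
    """For each empty slot, return the probability that it is filled next.

    An empty slot preceded by a run of i full slots is reached from i + 1
    starting positions, so its probability is (i + 1) / m.
    """
    probs = {}
    for slot in range(m):
        if slot in occupied:
            continue
        run = 0  # number of consecutive full slots immediately before `slot`
        while (slot - 1 - run) % m in occupied:
            run += 1
        probs[slot] = Fraction(run + 1, m)
    return probs


probs = next_fill_probabilities(occupied={0, 4, 5, 6, 9, 10}, m=11)
```

In this sample table, slots 1 and 7 each sit directly after a cluster of three full slots and receive probability 4/11, while slots 2, 3 and 8 receive only 1/11 each. The larger a cluster is, the more likely it is to grow.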
Exercise
You are given a hash table H with m = 11 slots
Demonstrate inserting the following elements, in order, using linear probing with the auxiliary hash function h’(k) = k mod m:
10, 22, 31, 4, 15, 28, 17, 88, 59
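Solution
h(10, 0) = (10 mod 11 + 0) mod 11 = 10
h(22, 0) = (22 mod 11 + 0) mod 11 = 0
h(31, 0) = (31 mod 11 + 0) mod 11 = 9
h(4, 0) = (4 mod 11 + 0) mod 11 = 4
h(15, 0) = (15 mod 11 + 0) mod 11 = 4 (slot occupied)
h(15, 1) = (15 mod 11 + 1) mod 11 = 5
h(28, 0) = (28 mod 11 + 0) mod 11 = 6
h(17, 0) = (17 mod 11 + 0) mod 11 = 6 (slot occupied)
Table after inserting 10, 22, 31, 4, 15, 28:

| Slot | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|------|----|---|---|---|---|----|----|---|---|----|----|
| Key | 22 | | | | 4 | 15 | 28 | | | 31 | 10 |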
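h(17, 1) = (17 mod 11 + 1) mod 11 = 7
h(88, 0) = (88 mod 11 + 0) mod 11 = 0 (slot occupied)
h(88, 1) = (88 mod 11 + 1) mod 11 = 1
h(59, 0) = (59 mod 11 + 0) mod 11 = 4 (slot occupied)
h(59, 1) = (59 mod 11 + 1) mod 11 = 5 (slot occupied)
h(59, 2) = (59 mod 11 + 2) mod 11 = 6 (slot occupied)
h(59, 3) = (59 mod 11 + 3) mod 11 = 7 (slot occupied)
h(59, 4) = (59 mod 11 + 4) mod 11 = 8
Final table:

| Slot | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|------|----|----|---|---|---|----|----|----|----|----|----|
| Key | 22 | 88 | | | 4 | 15 | 28 | 17 | 59 | 31 | 10 |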
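The solution can be reproduced with a short Python sketch of insertion by linear probing. It assumes h’(k) = k mod m, as in the exercise, and uses `None` for an empty slot; the function name is illustrative.

```python
def linear_probe_insert(table, key):
    """Insert `key` using h(k, i) = (k mod m + i) mod m and return its slot."""
    m = len(table)
    for i in range(m):  # at most m probes: i = 0, 1, ..., m-1
        slot = (key % m + i) % m
        if table[slot] is None:  # empty slot found
            table[slot] = key
            return slot
    raise RuntimeError("hash table is full")


table = [None] * 11
for key in [10, 22, 31, 4, 15, 28, 17, 88, 59]:
    linear_probe_insert(table, key)
```

Note how key 59 needs five probes (slots 4, 5, 6, 7, 8) because of the cluster that has built up in slots 4 to 7.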
Quadratic Probing
In this case, the probe offset is a quadratic function of the attempt number i, for constants $c_1$ and $c_2 \neq 0$: $h(k, i) = (h'(k) + c_1 i + c_2 i^2) \bmod m$
Tries to avoid primary clustering
However, suffers from secondary clustering
The entire probing sequence is determined by the initial probe: $h(k_1, i) = h(k_2, i) \Rightarrow h(k_1, i+1) = h(k_2, i+1)$
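Secondary clustering can be seen in a small Python sketch that lists the probe sequence of a key. The constants c1 = 1 and c2 = 3, the table size m = 11 and the auxiliary hash h’(k) = k mod m are assumptions chosen only for illustration.

```python
def quadratic_probe_sequence(k, m, c1=1, c2=3):
    """Return the first m probes of h(k, i) = (k mod m + c1*i + c2*i*i) mod m."""
    return [(k % m + c1 * i + c2 * i * i) % m for i in range(m)]


seq_4 = quadratic_probe_sequence(4, 11)
seq_15 = quadratic_probe_sequence(15, 11)  # 15 mod 11 = 4, same initial probe as key 4
```

Both keys start at slot 4 and then follow exactly the same sequence (4, 8, 7, 1, …), so keys with the same initial probe still compete for the same slots.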
Double Hashing
Given two hash functions $h_1, h_2$, the probe sequence is: $h(k, i) = (h_1(k) + i \, h_2(k)) \bmod m$
One of the best methods for open addressing collision resolution
The probe sequences behave almost like random permutations
For the entire hash table to be searched, m and $h_2(k)$ must be relatively prime
Possible selections of $h_2(k)$:
Select m to be a power of 2, and design $h_2(k)$ to produce odd numbers
Select m to be prime and m’ to be m-1, with: $h_1(k) = k \bmod m$ and $h_2(k) = 1 + (k \bmod m')$
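The second selection can be sketched in Python as follows. The table size m = 13 and the sample keys are assumptions for illustration; with m prime and m’ = m-1, the step h2(k) always lies between 1 and m-1, so it is relatively prime to m.

```python
def double_hash_sequence(k, m):
    """Return the m probes of h(k, i) = (h1(k) + i * h2(k)) mod m.

    Assumes m is prime, with h1(k) = k mod m and h2(k) = 1 + (k mod (m - 1)).
    """
    m_prime = m - 1
    h1 = k % m
    h2 = 1 + (k % m_prime)  # in the range 1 .. m-1, never 0
    return [(h1 + i * h2) % m for i in range(m)]


seq_14 = double_hash_sequence(14, 13)  # h1 = 1, h2 = 3
seq_27 = double_hash_sequence(27, 13)  # h1 = 1, h2 = 4
```

The sequence for key 14 starts 1, 4, 7, 10, 0 and visits all 13 slots. Key 27 has the same initial probe but a different step, so its sequence diverges immediately (1, 5, 9, …), unlike in linear or quadratic probing.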
Issues in Open Addressing
Search may fail if items are deleted: emptying a slot breaks the probe sequence of any key that was inserted past it
Solution: Mark deleted items with a special symbol
Search treats this symbol as full, while insert treats it as empty
The table may fill up
Solution: Rehashing (copy all elements into a larger table)
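The deletion marker can be sketched in Python as follows. This is a minimal illustration that assumes linear probing with h’(k) = k mod m and distinct integer keys; the names `DELETED` and `LinearProbingTable` are hypothetical, and rehashing into a larger table is left out.

```python
DELETED = object()  # special marker for a slot whose element was deleted


class LinearProbingTable:
    def __init__(self, m):
        self.slots = [None] * m  # None marks a slot that was never used

    def _probe(self, key):
        m = len(self.slots)
        for i in range(m):
            yield (key % m + i) % m

    def insert(self, key):
        # Insert treats a DELETED slot as empty and reuses it.
        for slot in self._probe(key):
            if self.slots[slot] is None or self.slots[slot] is DELETED:
                self.slots[slot] = key
                return slot
        raise RuntimeError("table is full; rehash into a larger table")

    def search(self, key):
        # Search treats a DELETED slot as full and keeps probing past it.
        for slot in self._probe(key):
            if self.slots[slot] is None:
                return None  # a never-used slot ends the probe sequence
            if self.slots[slot] == key:
                return slot
        return None

    def delete(self, key):
        slot = self.search(key)
        if slot is not None:
            self.slots[slot] = DELETED


t = LinearProbingTable(11)
for key in (4, 15, 59):  # all hash to slot 4, so they occupy slots 4, 5, 6
    t.insert(key)
t.delete(15)  # slot 5 now holds the DELETED marker
```

After the deletion, a search for 59 still succeeds because it probes past the marker in slot 5. Had slot 5 been reset to empty instead, the search would have stopped there and missed the key.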