hashing - Technical University of Denmark · 2020-04-10 · • Chained hashing. • Maintain array...
Philip Bille
Hashing
• Dictionaries
• Chained Hashing
• Linear Probing
• Hash Functions
Dictionaries
• Dictionaries. Maintain dynamic set S of elements supporting the following operations. Each element x has a key x.key from a universe U and satellite data x.data.
  • SEARCH(k): determine if element with key k exists. If so, return it.
  • INSERT(x): add x to S (we assume x is not already in S).
  • DELETE(x): remove x from S.
• U = {0, ..., 99}
• key(S) = {1, 13, 16, 41, 54, 66, 96}
[Figure: the key set key(S) inside the universe U.]
• Applications.
  • Basic data structure for representing a set.
  • Used in numerous algorithms and data structures.
• Challenge. How can we solve the problem with the techniques we already have?
• Solution 1: linked list. Maintain S as a linked list.
  • SEARCH(k): linear search for key k.
  • INSERT(x): insert x at the front of the list.
  • DELETE(x): remove x from the list.
• Time.
  • SEARCH in O(n) time.
  • INSERT and DELETE in O(1) time (DELETE is O(1) when we are given a reference to x in a doubly linked list).
• Space.
  • O(n).
[Figure: S stored as a linked list with a head pointer and nodes 66, 54, 1, 96, 16, 41, 13.]
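Solution 1 can be sketched in Python as follows. This is a minimal illustration, not code from the slides: the class names `Element` and `LinkedListDictionary` are made up here, and the list is assumed to be doubly linked so that DELETE(x) runs in O(1) when given the element itself.

```python
from typing import Any, Optional


class Element:
    """An element x with x.key, x.data and list pointers (illustrative names)."""

    def __init__(self, key: int, data: Any = None) -> None:
        self.key = key
        self.data = data
        self.prev: Optional["Element"] = None
        self.next: Optional["Element"] = None


class LinkedListDictionary:
    def __init__(self) -> None:
        self.head: Optional[Element] = None

    def search(self, k: int) -> Optional[Element]:
        # Linear search from the head: O(n) in the worst case.
        x = self.head
        while x is not None and x.key != k:
            x = x.next
        return x

    def insert(self, x: Element) -> None:
        # Insert at the front: O(1). Assumes x is not already in S.
        x.prev = None
        x.next = self.head
        if self.head is not None:
            self.head.prev = x
        self.head = x

    def delete(self, x: Element) -> None:
        # Unlink x: O(1), because we hold a reference to x itself.
        if x.prev is not None:
            x.prev.next = x.next
        else:
            self.head = x.next
        if x.next is not None:
            x.next.prev = x.prev
        x.prev = x.next = None


S = LinkedListDictionary()
for key in [1, 13, 16, 41, 54, 66, 96]:
    S.insert(Element(key))
S.delete(S.search(16))
print(S.search(16) is None)  # 16 is no longer in S
```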
• Solution 2: direct addressing.
  • Maintain S in array A of size |U|.
  • Store element x at A[x.key].
• SEARCH(k): return A[k].
• INSERT(x): set A[x.key] = x.
• DELETE(x): set A[x.key] = null.
• Time.
  • SEARCH, INSERT and DELETE in O(1) time.
• Space.
  • O(|U|).
[Figure: the first entries A[0..19] of array A, with the elements with keys 1, 13 and 16 stored at their own indices.]
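A minimal Python sketch of direct addressing, assuming keys are integers in {0, ..., |U|-1}. The names `Item` and `DirectAddressTable` are illustrative and do not come from the slides; `None` plays the role of null.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class Item:
    key: int
    data: Any = None


class DirectAddressTable:
    def __init__(self, universe_size: int) -> None:
        # One slot per possible key: O(|U|) space regardless of |S|.
        self.A: List[Optional[Item]] = [None] * universe_size

    def search(self, k: int) -> Optional[Item]:
        return self.A[k]            # O(1)

    def insert(self, x: Item) -> None:
        self.A[x.key] = x           # O(1)

    def delete(self, x: Item) -> None:
        self.A[x.key] = None        # O(1)


T = DirectAddressTable(100)         # U = {0, ..., 99}
for key in [1, 13, 16, 41, 54, 66, 96]:
    T.insert(Item(key))
print(T.search(13), T.search(14))
```

The table always allocates 100 slots here even though only 7 are used, which is the space cost the slide points out.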
• Challenge. Can we do significantly better?

| Data structure | SEARCH | INSERT | DELETE | Space |
|---|---|---|---|---|
| linked list | O(n) | O(1) | O(1) | O(n) |
| direct addressing | O(1) | O(1) | O(1) | O(\|U\|) |
Chained Hashing
• Idea. Find a hash function h : U → {0, ..., m-1}, where m = Θ(n). The hash function should spread the keys from S approximately evenly over {0, ..., m-1}.
• Chained hashing.
  • Maintain array A[0..m-1] of linked lists.
  • Store element x in linked list at A[h(x.key)].
• Collision.
  • x and y collide if h(x.key) = h(y.key).
• SEARCH(k): linear search in A[h(k)] for key k.
• INSERT(x): insert x in front of list A[h(x.key)].
• DELETE(x): remove x from list A[h(x.key)].
• Example. U = {0, ..., 99}, key(S) = {1, 13, 16, 41, 54, 66, 96}, m = 10, h(k) = k mod 10.
[Figure: array A[0..9] of linked lists; keys 1 and 41 share the list at A[1], 13 is at A[3], 54 at A[4], and 66, 16, 96 share the list at A[6].]
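The three operations can be sketched in Python as below. This is an illustrative sketch under some assumptions: `Item` and `ChainedHashTable` are invented names, the default hash function is h(k) = k mod m as in the example, and each chain is a `collections.deque` rather than a hand-written linked list. With a deque, DELETE scans the chain; the O(1) bound for DELETE needs a doubly linked list and a reference to the node.

```python
from collections import deque
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Item:
    key: int
    data: Any = None


class ChainedHashTable:
    def __init__(self, m: int, h: Optional[Callable[[int], int]] = None) -> None:
        self.m = m
        # Assumed default: h(k) = k mod m, as in the slide example.
        self.h = h if h is not None else (lambda k: k % m)
        self.A = [deque() for _ in range(m)]   # one chain per slot

    def search(self, k: int) -> Optional[Item]:
        # Linear search in the chain A[h(k)].
        for x in self.A[self.h(k)]:
            if x.key == k:
                return x
        return None

    def insert(self, x: Item) -> None:
        # Insert at the front of the chain: O(1).
        self.A[self.h(x.key)].appendleft(x)

    def delete(self, x: Item) -> None:
        # Scans the chain here; see the note above about O(1) deletion.
        self.A[self.h(x.key)].remove(x)


T = ChainedHashTable(10)
for key in [1, 13, 16, 41, 54, 66, 96]:
    T.insert(Item(key))
print([x.key for x in T.A[6]])   # keys that collide in slot 6
```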
[Figure sequence: the keys 50, 87, 75, 15, 7, 17, 22 are inserted one at a time into a chained hash table with slots 0..6 and h(k) = k mod 7. The keys hash to slots 1, 3, 5, 1, 0, 3, 1, so the final lists are A[0]: 7; A[1]: 22, 15, 50; A[3]: 17, 87; A[5]: 75.]
• Exercise. Insert the sequence of keys K = 5, 28, 19, 15, 20, 33, 12, 17, 10 into an initially empty hash table of size 9 using chained hashing with hash function h(k) = k mod 9.
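A hand-computed answer to an exercise like this can be checked with a small helper. The sketch below uses a hypothetical function `chained_insert_trace`, which is not from the slides; it assumes h(k) = k mod m with insertion at the front of each list. It is shown on the slides' own example sequence 50, 87, 75, 15, 7, 17, 22 with m = 7. Calling it with the exercise keys and m = 9 prints the corresponding steps.

```python
from typing import List


def chained_insert_trace(keys: List[int], m: int) -> List[List[int]]:
    """Insert keys one at a time with h(k) = k mod m, printing each step.

    Returns the final table as a list of chains (front of list first).
    """
    table: List[List[int]] = [[] for _ in range(m)]
    for k in keys:
        slot = k % m
        table[slot].insert(0, k)          # front insertion
        print(f"insert {k:>3} -> slot {slot}: {table[slot]}")
    return table


final = chained_insert_trace([50, 87, 75, 15, 7, 17, 22], 7)
for slot, chain in enumerate(final):
    print(slot, chain)
```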
• Time.
  • SEARCH in O(length of list) time.
  • INSERT and DELETE in O(1) time.
  • Length of list depends on hash function.
• Space.
  • O(m + n) = O(n), since m = Θ(n).
• Def. Load factor α = average length of lists = n/m = O(1), since m = Θ(n).
• Simple uniform hashing. Assume that every key is mapped uniformly at random to {0, ..., m-1}.
  • Expected length of list = α.
  • ⇒ expected time for SEARCH is O(1 + α) = O(1).
• Time (assuming simple uniform hashing).
  • SEARCH in O(1) expected time.
  • INSERT and DELETE in O(1) time.
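The claim that the expected list length equals α follows from linearity of expectation. The short derivation below is a sketch that is not spelled out on the slides; the slot index j and the indicator variables X_x are notation introduced here.

```latex
% Fix a slot j in {0, ..., m-1}. For each element x in S define the indicator
% X_x = 1 if h(x.key) = j, and X_x = 0 otherwise.
\begin{align*}
\mathbb{E}\bigl[\text{length of list } A[j]\bigr]
  &= \mathbb{E}\Bigl[\sum_{x \in S} X_x\Bigr]
   = \sum_{x \in S} \Pr\bigl[h(x.\mathrm{key}) = j\bigr] \\
  &= n \cdot \frac{1}{m}
   = \alpha .
\end{align*}
```

A search computes h(k) in O(1) time and then scans one list, so its expected time is O(1 + α), which is O(1) when m = Θ(n).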
Dictionaries

| Data structure | SEARCH | INSERT | DELETE | Space |
|---|---|---|---|---|
| linked list | O(n) | O(1) | O(1) | O(n) |
| direct addressing | O(1) | O(1) | O(1) | O(\|U\|) |
| chained hashing | O(1)† | O(1) | O(1) | O(n) |

† = expected time assuming simple uniform hashing
Linear Probing
• Linear probing.
  • Maintain S in array A of size m.
  • Element x is stored in A[h(x.key)] or in the cluster to the right of A[h(x.key)].
  • Cluster = consecutive (cyclic) sequence of non-empty entries.
• SEARCH(k): linear search from A[h(k)] through the cluster to the right of A[h(k)].
• INSERT(x): insert x at A[h(x.key)]. If that entry is non-empty, insert x at the next empty entry to the right (cyclically).
• DELETE(x): remove x from its entry. Re-insert all elements to the right of x in the cluster.
[Figure: array A[0..9] with h(k) = k mod 10 holding the keys 41, 1, 11, 13, 54, 98.]
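The linear probing operations can be sketched in Python as follows. Several things here are assumptions for illustration: the class name `LinearProbingTable`, storing bare integer keys instead of elements, the default h(k) = k mod m, and the insertion order 41, 1, 11, 13, 54, 98 used in the usage lines (the slides show these keys but not the order they were inserted in).

```python
from typing import Callable, List, Optional


class LinearProbingTable:
    def __init__(self, m: int, h: Optional[Callable[[int], int]] = None) -> None:
        self.m = m
        self.h = h if h is not None else (lambda k: k % m)  # assumed default
        self.A: List[Optional[int]] = [None] * m
        self.n = 0                                          # number of stored keys

    def _find(self, k: int) -> Optional[int]:
        """Index of key k, or None. Scans the cluster starting at A[h(k)]."""
        i = self.h(k)
        for _ in range(self.m):
            if self.A[i] is None:        # end of cluster: k is not present
                return None
            if self.A[i] == k:
                return i
            i = (i + 1) % self.m         # move right, cyclically
        return None

    def _place(self, k: int) -> None:
        # Put k in the first empty entry at or to the right of A[h(k)].
        i = self.h(k)
        while self.A[i] is not None:
            i = (i + 1) % self.m
        self.A[i] = k

    def search(self, k: int) -> bool:
        return self._find(k) is not None

    def insert(self, k: int) -> None:
        if self.n == self.m:
            raise OverflowError("table is full")
        self._place(k)
        self.n += 1

    def delete(self, k: int) -> None:
        i = self._find(k)
        if i is None:
            return
        self.A[i] = None
        self.n -= 1
        # Collect the rest of the cluster to the right of k, then re-insert it.
        tail = []
        j = (i + 1) % self.m
        while self.A[j] is not None:
            tail.append(self.A[j])
            self.A[j] = None
            j = (j + 1) % self.m
        for moved in tail:
            self._place(moved)


T = LinearProbingTable(10)
for key in [41, 1, 11, 13, 54, 98]:
    T.insert(key)
print(T.A)
T.delete(1)
print(T.A)
```

Re-inserting the tail of the cluster keeps every remaining key reachable from its own hash position, which is why DELETE cannot simply clear one entry.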
[Figure: linear probing example with array A[0..10] and h(k) = k mod 11; keys shown: 5, 1, 32, 54, 11, 19, 27.]
• Theorem. Simple uniform hashing ⟹ expected O(1) time for linear probing operations.
• Caching. Linear probing is cache-efficient, since a probe sequence scans consecutive array entries.
• Variants.
  • Quadratic probing.
  • Double hashing.
Dictionaries

| Data structure | SEARCH | INSERT | DELETE | Space |
|---|---|---|---|---|
| linked list | O(n) | O(1) | O(1) | O(n) |
| direct addressing | O(1) | O(1) | O(1) | O(\|U\|) |
| chained hashing | O(1)† | O(1) | O(1) | O(n) |
| linear probing | O(1)† | O(1)† | O(1)† | O(n) |

† = expected time assuming simple uniform hashing
Hash Functions
• Simple hash functions.
  • h(k) = k mod m. Typically, m is prime.
  • h(k) = ⌊m(kZ − ⌊kZ⌋)⌋, for constant Z, 0 < Z < 1.
• Universal hash functions.
  • Choose the hash function randomly from a family of hash functions.
  • Designed to have strong guarantees on collision probabilities.
  • ⇒ Dictionaries with constant expected time performance.
  • The expectation is over the random choice of hash function, independent of the input set.
• Other hash functions.
  • Tabulation hashing, MurmurHash, SHA-xxx, FNV, ...
• Applications.
  • Cryptography, similarity, coding, ...
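The two simple hash functions, and one way of drawing a random function from a family, can be sketched in Python. Everything named here is an assumption for illustration: the function names, the default Z = (√5 − 1)/2 (the slides only require 0 < Z < 1), and the family h(k) = ((a·k + b) mod p) mod m with p a prime larger than every key. The slides do not say which universal family they have in mind; this is one standard textbook choice.

```python
import math
import random
from typing import Callable


def division_hash(k: int, m: int) -> int:
    """h(k) = k mod m."""
    return k % m


def multiplication_hash(k: int, m: int, Z: float = (math.sqrt(5) - 1) / 2) -> int:
    """h(k) = floor(m * (kZ - floor(kZ))) for a constant 0 < Z < 1."""
    frac = (k * Z) % 1.0               # fractional part of kZ, in [0, 1)
    return math.floor(m * frac)


def random_universal_hash(m: int, p: int, rng=random) -> Callable[[int], int]:
    """Draw h(k) = ((a*k + b) mod p) mod m with random a in 1..p-1, b in 0..p-1.

    Assumes p is prime and larger than every key in the universe.
    """
    a = rng.randrange(1, p)
    b = rng.randrange(0, p)
    return lambda k: ((a * k + b) % p) % m


keys = [1, 13, 16, 41, 54, 66, 96]
print([division_hash(k, 10) for k in keys])
print([multiplication_hash(k, 10) for k in keys])
h = random_universal_hash(10, 101)     # 101 is a prime above U = {0, ..., 99}
print([h(k) for k in keys])            # varies from run to run
```

With the random family, the collision guarantee comes from the random choice of a and b rather than from any assumption about the keys, matching the slide's remark that the expectation is independent of the input set.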