2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When...
Transcript of 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When...
Problem Solving2-12 Hashing
MA Jun
Institute of Computer Software
May 14, 2020
. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .
Contents
1 Hash-table: Basic Idea
2 Hash Functions
3 Collision Resolution
. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Applications of Hashing
There are many applications of hashing (not limited to hash table),including modern day cryptography hash functions. Some of theseapplications are listed below:
Message DigestPassword VerificationData Structures (Programming Languages)Compiler OperationRabin-Karp AlgorithmLinking File Name and Path Together
https://www.geeksforgeeks.org/applications-of-hashing/
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 1 / 34
Contents
1 Hash-table: Basic Idea
2 Hash Functions
3 Collision Resolution
. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
Many applications require a Dynamic Set that supports only thedictionary operations Insert, Search, and Delete.
Searching for an element in a hash table can take as long as searching foran element in a linked list Θ(n) time in the worst case.
Under reasonable assumptions, the average time to search for an elementin a hash table is Θ(1).
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 2 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
Many applications require a Dynamic Set that supports only thedictionary operations Insert, Search, and Delete.
Searching for an element in a hash table can take as long as searching foran element in a linked list Θ(n) time in the worst case.
Under reasonable assumptions, the average time to search for an elementin a hash table is Θ(1).
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 2 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
Many applications require a Dynamic Set that supports only thedictionary operations Insert, Search, and Delete.
Searching for an element in a hash table can take as long as searching foran element in a linked list Θ(n) time in the worst case.
Under reasonable assumptions, the average time to search for an elementin a hash table is Θ(1).
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 2 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
Hashing: the idea
Q : When should we use hash table?What situations is the hash table suitable for?
Key Space
x
Very large, but only a smallpart is used in an applica-tion at a certain time.
· · ·
· · ·
E [0]E [1]
E [m − 1]
Feasible size
Hash FunctionE [k]
Index distributionCollision handling
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 3 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
Hashing: the idea
Q : When should we use hash table?What situations is the hash table suitable for?
Key Space
x
Very large, but only a smallpart is used in an applica-tion at a certain time.
· · ·
· · ·
E [0]E [1]
E [m − 1]
Feasible size
Hash FunctionE [k]
Index distributionCollision handling
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 3 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
Hashing: the idea
Q : When should we use hash table?What situations is the hash table suitable for?
Key Space
x
Very large, but only a smallpart is used in an applica-tion at a certain time.
· · ·
· · ·
E [0]E [1]
E [m − 1]
Feasible size
Hash FunctionE [k]
Index distributionCollision handling
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 3 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
Hashing: the idea
Q : When should we use hash table?What situations is the hash table suitable for?
Key Space
x
Very large, but only a smallpart is used in an applica-tion at a certain time.
· · ·
· · ·
E [0]E [1]
E [m − 1]
Feasible size
Hash FunctionE [k]
Index distributionCollision handling
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 3 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
Hashing: the idea
Q : When should we use hash table?What situations is the hash table suitable for?
Key Space
x
Very large, but only a smallpart is used in an applica-tion at a certain time.
· · ·
· · ·
E [0]E [1]
E [m − 1]
Feasible size
Hash FunctionE [k]
Index distributionCollision handling
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 3 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
Hashing: the idea
Q : When should we use hash table?What situations is the hash table suitable for?
Key Space
x
Very large, but only a smallpart is used in an applica-tion at a certain time.
· · ·
· · ·
E [0]E [1]
E [m − 1]
Feasible size
Hash FunctionE [k]
Index distributionCollision handling
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 3 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
Hashing: the idea
Q : What is a Collision? When does it take place?
Hash FunctionKey Space
· · ·
· · ·
E [0]E [1]
E [m − 1]
xyE [k]
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 4 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
Hashing: the idea
Q : What is a Collision? When does it take place?
Hash FunctionKey Space
· · ·
· · ·
E [0]E [1]
E [m − 1]
xyE [k]
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 4 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
Hashing: the idea
Q : What is a Collision? When does it take place?
Hash FunctionKey Space
· · ·
· · ·
E [0]E [1]
E [m − 1]
xy
E [k]
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 4 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
Hashing: the idea
Q : What is a Collision? When does it take place?
Hash FunctionKey Space
· · ·
· · ·
E [0]E [1]
E [m − 1]
xyE [k]
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 4 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
Collision
Q : Given n keys and m slots, what is the expected number of keyshashed into the same slot?
Hash FunctionKey Space
xy
· · ·
· · ·
E [0]E [1]
E [m − 1]
E [k]
It depends ...
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 5 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
Collision
Q : Given n keys and m slots, what is the expected number of keyshashed into the same slot?
Hash FunctionKey Space
xy
· · ·
· · ·
E [0]E [1]
E [m − 1]
E [k]
It depends ...
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 5 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
Model & Assumption
ModelInserting n keys: sequence of n s independent trails.Each insertion corresponds to a value from {0, 1, · · · , m − 1}.
AssumptionFor each insertion, the result is Uniformly Distributed, i.e. each valuefrom {0, 1, · · · , m − 1} is equally picked.
In hashing n items into a hash table of size m, the expected number of itemsthat hash to any one location is α = n/m (α: loading factor)
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 6 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
Model & Assumption
ModelInserting n keys: sequence of n s independent trails.Each insertion corresponds to a value from {0, 1, · · · , m − 1}.
AssumptionFor each insertion, the result is Uniformly Distributed, i.e. each valuefrom {0, 1, · · · , m − 1} is equally picked.
In hashing n items into a hash table of size m, the expected number of itemsthat hash to any one location is α = n/m (α: loading factor)
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 6 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
Model & Assumption
ModelInserting n keys: sequence of n s independent trails.Each insertion corresponds to a value from {0, 1, · · · , m − 1}.
AssumptionFor each insertion, the result is Uniformly Distributed, i.e. each valuefrom {0, 1, · · · , m − 1} is equally picked.
In hashing n items into a hash table of size m, the expected number of itemsthat hash to any one location is α = n/m (α: loading factor)
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 6 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
Empty location
After inserting n items into m locations
Q : What is the probability of a given location is empty?
(1 − 1m )n
Q : What is the expected number of empty locations?
Xi ={
1 , location i is empty0 , otherwise
X =∑
1≤i≤mXi
E (X ) = E (∑
1≤i≤mXi) =
∑1≤i≤m
E (Xi) =∑
1≤i≤m(1 − 1/m)n =
m(1 − 1/m)n
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 7 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
Empty location
After inserting n items into m locations
Q : What is the probability of a given location is empty?
(1 − 1m )n
Q : What is the expected number of empty locations?
Xi ={
1 , location i is empty0 , otherwise
X =∑
1≤i≤mXi
E (X ) = E (∑
1≤i≤mXi) =
∑1≤i≤m
E (Xi) =∑
1≤i≤m(1 − 1/m)n =
m(1 − 1/m)n
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 7 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
Empty location
After inserting n items into m locations
Q : What is the probability of a given location is empty?
(1 − 1m )n
Q : What is the expected number of empty locations?
Xi ={
1 , location i is empty0 , otherwise
X =∑
1≤i≤mXi
E (X ) = E (∑
1≤i≤mXi) =
∑1≤i≤m
E (Xi) =∑
1≤i≤m(1 − 1/m)n =
m(1 − 1/m)n
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 7 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
Empty location
After inserting n items into m locations
Q : What is the probability of a given location is empty?
(1 − 1m )n
Q : What is the expected number of empty locations?
Xi ={
1 , location i is empty0 , otherwise
X =∑
1≤i≤mXi
E (X ) = E (∑
1≤i≤mXi) =
∑1≤i≤m
E (Xi) =∑
1≤i≤m(1 − 1/m)n =
m(1 − 1/m)n
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 7 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
Empty location
After inserting n items into m locations
Q : What is the probability of a given location is empty?
(1 − 1m )n
Q : What is the expected number of empty locations?
Xi ={
1 , location i is empty0 , otherwise
X =∑
1≤i≤mXi
E (X ) = E (∑
1≤i≤mXi) =
∑1≤i≤m
E (Xi) =∑
1≤i≤m(1 − 1/m)n =
m(1 − 1/m)n
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 7 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
Empty location
After inserting n items into m locations
Q : What is the probability of a given location is empty?
(1 − 1m )n
Q : What is the expected number of empty locations?
Xi ={
1 , location i is empty0 , otherwise
X =∑
1≤i≤mXi
E (X ) = E (∑
1≤i≤mXi) =
∑1≤i≤m
E (Xi) =∑
1≤i≤m(1 − 1/m)n =
m(1 − 1/m)n
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 7 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
Empty location
After inserting n items into m locations
Q : What is the probability of a given location is empty?
(1 − 1m )n
Q : What is the expected number of empty locations?
Xi ={
1 , location i is empty0 , otherwise
X =∑
1≤i≤mXi
E (X ) = E (∑
1≤i≤mXi) =
∑1≤i≤m
E (Xi) =∑
1≤i≤m(1 − 1/m)n =
m(1 − 1/m)n
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 7 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
After inserting n items into a hashtable with n locations
Q : What is the expected number of items in a given location?
Yi ={
1 , the ith item in the given location0 , otherwise
Y =∑
1≤i≤nYi
E (Y ) = E (∑
1≤i≤nYi) =
∑1≤i≤n
E (Yi) =∑
1≤i≤n1/n = 1
Expected number of empty locations: n(1 − 1/n)n = n/e ≈ 0.368n
PARADOX?collisions!
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 8 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
After inserting n items into a hashtable with n locations
Q : What is the expected number of items in a given location?
Yi ={
1 , the ith item in the given location0 , otherwise
Y =∑
1≤i≤nYi
E (Y ) = E (∑
1≤i≤nYi) =
∑1≤i≤n
E (Yi) =∑
1≤i≤n1/n = 1
Expected number of empty locations: n(1 − 1/n)n = n/e ≈ 0.368n
PARADOX?collisions!
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 8 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
After inserting n items into a hashtable with n locations
Q : What is the expected number of items in a given location?
Yi ={
1 , the ith item in the given location0 , otherwise
Y =∑
1≤i≤nYi
E (Y ) = E (∑
1≤i≤nYi) =
∑1≤i≤n
E (Yi) =∑
1≤i≤n1/n = 1
Expected number of empty locations: n(1 − 1/n)n = n/e ≈ 0.368n
PARADOX?collisions!
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 8 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
After inserting n items into a hashtable with n locations
Q : What is the expected number of items in a given location?
Yi ={
1 , the ith item in the given location0 , otherwise
Y =∑
1≤i≤nYi
E (Y ) = E (∑
1≤i≤nYi) =
∑1≤i≤n
E (Yi) =∑
1≤i≤n1/n = 1
Expected number of empty locations: n(1 − 1/n)n = n/e ≈ 0.368n
PARADOX?collisions!
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 8 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
After inserting n items into a hashtable with n locations
Q : What is the expected number of items in a given location?
Yi ={
1 , the ith item in the given location0 , otherwise
Y =∑
1≤i≤nYi
E (Y ) = E (∑
1≤i≤nYi) =
∑1≤i≤n
E (Yi) =∑
1≤i≤n1/n = 1
Expected number of empty locations: n(1 − 1/n)n = n/e ≈ 0.368n
PARADOX?collisions!
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 8 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
After inserting n items into a hashtable with n locations
Q : What is the expected number of items in a given location?
Yi ={
1 , the ith item in the given location0 , otherwise
Y =∑
1≤i≤nYi
E (Y ) = E (∑
1≤i≤nYi) =
∑1≤i≤n
E (Yi) =∑
1≤i≤n1/n = 1
Expected number of empty locations: n(1 − 1/n)n = n/e ≈ 0.368n
PARADOX?collisions!
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 8 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
After inserting n items into a hashtable with n locations
Q : What is the expected number of items in a given location?
Yi ={
1 , the ith item in the given location0 , otherwise
Y =∑
1≤i≤nYi
E (Y ) = E (∑
1≤i≤nYi) =
∑1≤i≤n
E (Yi) =∑
1≤i≤n1/n = 1
Expected number of empty locations: n(1 − 1/n)n = n/e ≈ 0.368n
PARADOX?
collisions!
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 8 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
After inserting n items into a hashtable with n locations
Q : What is the expected number of items in a given location?
Yi ={
1 , the ith item in the given location0 , otherwise
Y =∑
1≤i≤nYi
E (Y ) = E (∑
1≤i≤nYi) =
∑1≤i≤n
E (Yi) =∑
1≤i≤n1/n = 1
Expected number of empty locations: n(1 − 1/n)n = n/e ≈ 0.368n
PARADOX?collisions!
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 8 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
Expected number of collisions
After inserting n items into m locations
Q : What is the expected number of collisions?
E (collisions) = n − E (occupied locations)
= n − (m − E (empty locations))
= n − (m − m (1 − 1/m)n)
= n − m + m (1 − 1/m)n
Example:When n = 100, m = 100, then E (collision) ≈ 37
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 9 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
Expected number of collisions
After inserting n items into m locations
Q : What is the expected number of collisions?
E (collisions) = n − E (occupied locations)
= n − (m − E (empty locations))
= n − (m − m (1 − 1/m)n)
= n − m + m (1 − 1/m)n
Example:When n = 100, m = 100, then E (collision) ≈ 37
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 9 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
Expected number of collisions
After inserting n items into m locations
Q : What is the expected number of collisions?
E (collisions) = n − E (occupied locations)
= n − (m − E (empty locations))
= n − (m − m (1 − 1/m)n)
= n − m + m (1 − 1/m)n
Example:When n = 100, m = 100, then E (collision) ≈ 37
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 9 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
Expected number of collisions
After inserting n items into m locations
Q : What is the expected number of collisions?
E (collisions) = n − E (occupied locations)
= n − (m − E (empty locations))
= n − (m − m (1 − 1/m)n)
= n − m + m (1 − 1/m)n
Example:When n = 100, m = 100, then E (collision) ≈ 37
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 9 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
Expected number of collisions
After inserting n items into m locations
Q : What is the expected number of collisions?
E (collisions) = n − E (occupied locations)
= n − (m − E (empty locations))
= n − (m − m (1 − 1/m)n)
= n − m + m (1 − 1/m)n
Example:When n = 100, m = 100, then E (collision) ≈ 37
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 9 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
After inserting n items into m locations
Q : What is the expected number of items needed to fullfill all mlocations?
Xi : the number of items to be added to increase thenumber of occupied locations from i − 1 to i .Given i − 1 occupied locations,
the success probabilityfor each trial is (m − i + 1)/m
Thus, E (Xi) = m/(m − i + 1)
E (X ) =m∑
i=1E (Xi) =
m∑i=1
m/(m − i + 1)
= mm∑
i=11/(m − i + 1)
= mm∑
j=11/j = Θ(m lg m)
· · ·
# = i − 1
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 10 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
After inserting n items into m locations
Q : What is the expected number of items needed to fullfill all mlocations?
Xi : the number of items to be added to increase thenumber of occupied locations from i − 1 to i .
Given i − 1 occupied locations,
the success probabilityfor each trial is (m − i + 1)/m
Thus, E (Xi) = m/(m − i + 1)
E (X ) =m∑
i=1E (Xi) =
m∑i=1
m/(m − i + 1)
= mm∑
i=11/(m − i + 1)
= mm∑
j=11/j = Θ(m lg m)
· · ·
# = i − 1
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 10 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
After inserting n items into m locations
Q : What is the expected number of items needed to fullfill all mlocations?
Xi : the number of items to be added to increase thenumber of occupied locations from i − 1 to i .Given i − 1 occupied locations,
the success probabilityfor each trial is (m − i + 1)/mThus, E (Xi) = m/(m − i + 1)
E (X ) =m∑
i=1E (Xi) =
m∑i=1
m/(m − i + 1)
= mm∑
i=11/(m − i + 1)
= mm∑
j=11/j = Θ(m lg m)
· · ·
# = i − 1
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 10 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
After inserting n items into m locations
Q : What is the expected number of items needed to fullfill all mlocations?
Xi : the number of items to be added to increase thenumber of occupied locations from i − 1 to i .Given i − 1 occupied locations,
the success probabilityfor each trial is (m − i + 1)/mThus, E (Xi) = m/(m − i + 1)
E (X ) =m∑
i=1E (Xi) =
m∑i=1
m/(m − i + 1)
= mm∑
i=11/(m − i + 1)
= mm∑
j=11/j = Θ(m lg m)
· · ·
# = i − 1
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 10 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
After inserting n items into m locations
Q : What is the expected number of items needed to fullfill all mlocations?
Xi : the number of items to be added to increase thenumber of occupied locations from i − 1 to i .Given i − 1 occupied locations, the success probabilityfor each trial is (m − i + 1)/m
Thus, E (Xi) = m/(m − i + 1)
E (X ) =m∑
i=1E (Xi) =
m∑i=1
m/(m − i + 1)
= mm∑
i=11/(m − i + 1)
= mm∑
j=11/j = Θ(m lg m)
· · ·
# = i − 1
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 10 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
After inserting n items into m locations
Q : What is the expected number of items needed to fullfill all mlocations?
Xi : the number of items to be added to increase thenumber of occupied locations from i − 1 to i .Given i − 1 occupied locations, the success probabilityfor each trial is (m − i + 1)/mThus, E (Xi) = m/(m − i + 1)
E (X ) =m∑
i=1E (Xi) =
m∑i=1
m/(m − i + 1)
= mm∑
i=11/(m − i + 1)
= mm∑
j=11/j = Θ(m lg m)
· · ·
# = i − 1
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 10 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash-table: Basic Idea
After inserting n items into m locations
Q : What is the expected number of items needed to fullfill all mlocations?
Xi : the number of items to be added to increase thenumber of occupied locations from i − 1 to i .Given i − 1 occupied locations, the success probabilityfor each trial is (m − i + 1)/mThus, E (Xi) = m/(m − i + 1)
E (X ) =m∑
i=1E (Xi) =
m∑i=1
m/(m − i + 1)
= mm∑
i=11/(m − i + 1)
= mm∑
j=11/j = Θ(m lg m)
· · ·
# = i − 1
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 10 / 34
Contents
1 Hash-table: Basic Idea
2 Hash Functions
3 Collision Resolution
. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash Functions
Hashing: the idea
Hash Function
Index distributionCollision handling
Key Space
Very large, but only a smallpart is used in an applica-tion at a certain time.
x· · ·
· · ·
E [0]E [1]
E [m − 1]
E [k]
In feasible size
Hash Function is IMPORTANT!
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 11 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash Functions
Hashing: the idea
Hash Function
Index distributionCollision handling
Key Space
Very large, but only a smallpart is used in an applica-tion at a certain time.
x· · ·
· · ·
E [0]E [1]
E [m − 1]
E [k]
In feasible size
Hash Function is IMPORTANT!
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 11 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash Functions
Two typical hash functions
Division Method: map a key k into one of m slots by taking theremainder of k divided by m.
h(k) = k mod m
Multiplication Method:Step-1: multiply the k by a constant A in the range 0 < A < 1 andextract the fractional part of kA.Step-2: multiply the fractional part by m.
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 12 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash Functions
Two typical hash functions
Division Method: map a key k into one of m slots by taking the remainderof k divided by m:
h(k) = k mod m
Q : Why should we avoid m = 2p?Since if m = 2p, then h(k) is just the p lowest-order bits of k.Unless we know that all low-order p-bit patterns are equally likely, weare better off designing the hash function to depend on all the bits ofthe key.
A prime not too close to an exact power of 2 is often a good choice for m.
Exercise 11.3-3
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 13 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash Functions
Two typical hash functions
Division Method: map a key k into one of m slots by taking the remainderof k divided by m:
h(k) = k mod m
Q : Why should we avoid m = 2p?
Since if m = 2p, then h(k) is just the p lowest-order bits of k.Unless we know that all low-order p-bit patterns are equally likely, weare better off designing the hash function to depend on all the bits ofthe key.
A prime not too close to an exact power of 2 is often a good choice for m.
Exercise 11.3-3
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 13 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash Functions
Two typical hash functions
Division Method: map a key k into one of m slots by taking the remainderof k divided by m:
h(k) = k mod m
Q : Why should we avoid m = 2p?Since if m = 2p, then h(k) is just the p lowest-order bits of k.
Unless we know that all low-order p-bit patterns are equally likely, weare better off designing the hash function to depend on all the bits ofthe key.
A prime not too close to an exact power of 2 is often a good choice for m.
Exercise 11.3-3
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 13 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash Functions
Two typical hash functions
Division Method: map a key k into one of m slots by taking the remainderof k divided by m:
h(k) = k mod m
Q : Why should we avoid m = 2p?Since if m = 2p, then h(k) is just the p lowest-order bits of k.Unless we know that all low-order p-bit patterns are equally likely, weare better off designing the hash function to depend on all the bits ofthe key.
A prime not too close to an exact power of 2 is often a good choice for m.
Exercise 11.3-3
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 13 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash Functions
Two typical hash functions
Division Method: map a key k into one of m slots by taking the remainderof k divided by m:
h(k) = k mod m
Q : Why should we avoid m = 2p?Since if m = 2p, then h(k) is just the p lowest-order bits of k.Unless we know that all low-order p-bit patterns are equally likely, weare better off designing the hash function to depend on all the bits ofthe key.
A prime not too close to an exact power of 2 is often a good choice for m.
Exercise 11.3-3
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 13 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash Functions
Two typical hash functions
Multiplication Method:Step-1: multiply the k by a constant A in the range 0 < A < 1 andextract the fractional part of kA.Step-2: multiply the fractional part by m
h(k) = ⌊m(kA mod 1)⌋
Q : Why do we usually choose m = 2p?
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 14 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash Functions
Two typical hash functions
Multiplication Method:Step-1: multiply the k by a constant A in the range 0 < A < 1 andextract the fractional part of kA.Step-2: multiply the fractional part by m
h(k) = ⌊m(kA mod 1)⌋
Q : Why do we usually choose m = 2p?
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 14 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Hash Functions
Two typical hash functions
Multiplication Method:Step-1: multiply the k by a constant A in the range 0 < A < 1 andextract the fractional part of kA.Step-2: multiply the fractional part by m
h(k) = ⌊m(kA mod 1)⌋
Q : Why do we usually choose m = 2p?
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 14 / 34
Contents
1 Hash-table: Basic Idea
2 Hash Functions
3 Collision Resolution
. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution
Collision
Hash FunctionKey Space
xy
· · ·
· · ·
E [0]E [1]
E [m − 1]
E [k]
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 15 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution
Collision Resolution
Chaining (Closed Addressing) Open Addressing
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 16 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Chaining
Collision Resolution by Chaining
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 17 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Chaining
Collision Resolution by Chaining
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 17 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Chaining
Collision Resolution by Chaining: Unsuccessful SearchQ : What is the average cost of an unsuccessful search?
For j = 0, 1, 2, ..., m − 1, the average length of thelist at E [j] is n/m = α.
Any key that is not in the table is equally likely to hash to any of them addresses.The average cost to determine that the key is not in the list E [h(k)]is the cost to search to the end of the list, which is α.Total cost is Θ(1 + α)
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 18 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Chaining
Collision Resolution by Chaining: Unsuccessful SearchQ : What is the average cost of an unsuccessful search?
For j = 0, 1, 2, ..., m − 1, the average length of thelist at E [j] is n/m = α.
Any key that is not in the table is equally likely to hash to any of them addresses.The average cost to determine that the key is not in the list E [h(k)]is the cost to search to the end of the list, which is α.Total cost is Θ(1 + α)
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 18 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Chaining
Collision Resolution by Chaining: Unsuccessful SearchQ : What is the average cost of an unsuccessful search?
For j = 0, 1, 2, ..., m − 1, the average length of thelist at E [j] is n/m = α.
Any key that is not in the table is equally likely to hash to any of them addresses.The average cost to determine that the key is not in the list E [h(k)]is the cost to search to the end of the list, which is α.Total cost is Θ(1 + α)
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 18 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Chaining
Collision Resolution by Chaining: Unsuccessful SearchQ : What is the average cost of an unsuccessful search?
For j = 0, 1, 2, ..., m − 1, the average length of thelist at E [j] is n/m = α.
Any key that is not in the table is equally likely to hash to any of them addresses.
The average cost to determine that the key is not in the list E [h(k)]is the cost to search to the end of the list, which is α.Total cost is Θ(1 + α)
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 18 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Chaining
Collision Resolution by Chaining: Unsuccessful SearchQ : What is the average cost of an unsuccessful search?
For j = 0, 1, 2, ..., m − 1, the average length of thelist at E [j] is n/m = α.
Any key that is not in the table is equally likely to hash to any of them addresses.The average cost to determine that the key is not in the list E [h(k)]is the cost to search to the end of the list, which is α.
Total cost is Θ(1 + α)
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 18 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Chaining
Collision Resolution by Chaining: Unsuccessful SearchQ : What is the average cost of an unsuccessful search?
For j = 0, 1, 2, ..., m − 1, the average length of thelist at E [j] is n/m = α.
Any key that is not in the table is equally likely to hash to any of them addresses.The average cost to determine that the key is not in the list E [h(k)]is the cost to search to the end of the list, which is α.Total cost is Θ(1 + α)
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 18 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Chaining
Collision Resolution by Chaining: Successful SearchQ : What is the average cost of an successful search?
xi : the ith element inserted into the table.The probability of that xi is searched is 1/n.For a specific xi , the number of elements examined in a successfulsearch is t + 1I t: the number of elements inserted into the same list as xi , after xi
has been inserted.I The probability of that xj is inserted into the same list of xi is 1/m.I The expected number of elements examined for a successful search
of xi is (1 +n∑
j=i+1
1m )
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 19 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Chaining
Collision Resolution by Chaining: Successful SearchQ : What is the average cost of an successful search?
xi : the ith element inserted into the table.The probability of that xi is searched is 1/n.For a specific xi , the number of elements examined in a successfulsearch is t + 1I t: the number of elements inserted into the same list as xi , after xi
has been inserted.I The probability of that xj is inserted into the same list of xi is 1/m.I The expected number of elements examined for a successful search
of xi is (1 +n∑
j=i+1
1m )
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 19 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Chaining
Collision Resolution by Chaining: Successful SearchQ : What is the average cost of an successful search?
xi : the ith element inserted into the table.
The probability of that xi is searched is 1/n.For a specific xi , the number of elements examined in a successfulsearch is t + 1I t: the number of elements inserted into the same list as xi , after xi
has been inserted.I The probability of that xj is inserted into the same list of xi is 1/m.I The expected number of elements examined for a successful search
of xi is (1 +n∑
j=i+1
1m )
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 19 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Chaining
Collision Resolution by Chaining: Successful SearchQ : What is the average cost of an successful search?
xi : the ith element inserted into the table.The probability of that xi is searched is 1/n.
For a specific xi , the number of elements examined in a successfulsearch is t + 1I t: the number of elements inserted into the same list as xi , after xi
has been inserted.I The probability of that xj is inserted into the same list of xi is 1/m.I The expected number of elements examined for a successful search
of xi is (1 +n∑
j=i+1
1m )
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 19 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Chaining
Collision Resolution by Chaining: Successful SearchQ : What is the average cost of an successful search?
xi : the ith element inserted into the table.The probability of that xi is searched is 1/n.For a specific xi , the number of elements examined in a successfulsearch is t + 1
I t: the number of elements inserted into the same list as xi , after xihas been inserted.
I The probability of that xj is inserted into the same list of xi is 1/m.I The expected number of elements examined for a successful search
of xi is (1 +n∑
j=i+1
1m )
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 19 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Chaining
Collision Resolution by Chaining: Successful SearchQ : What is the average cost of an successful search?
xi : the ith element inserted into the table.The probability of that xi is searched is 1/n.For a specific xi , the number of elements examined in a successfulsearch is t + 1I t: the number of elements inserted into the same list as xi , after xi
has been inserted.
I The probability of that xj is inserted into the same list of xi is 1/m.I The expected number of elements examined for a successful search
of xi is (1 +n∑
j=i+1
1m )
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 19 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Chaining
Collision Resolution by Chaining: Successful SearchQ : What is the average cost of an successful search?
xi : the ith element inserted into the table.The probability of that xi is searched is 1/n.For a specific xi , the number of elements examined in a successfulsearch is t + 1I t: the number of elements inserted into the same list as xi , after xi
has been inserted.I The probability of that xj is inserted into the same list of xi is 1/m.
I The expected number of elements examined for a successful searchof xi is (1 +
n∑j=i+1
1m )
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 19 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Chaining
Collision Resolution by Chaining: Successful SearchQ : What is the average cost of an successful search?
xi : the ith element inserted into the table.The probability of that xi is searched is 1/n.For a specific xi , the number of elements examined in a successfulsearch is t + 1I t: the number of elements inserted into the same list as xi , after xi
has been inserted.I The probability of that xj is inserted into the same list of xi is 1/m.I The expected number of elements examined for a successful search
of xi is (1 +n∑
j=i+1
1m )
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 19 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Chaining
Collision Resolution by Chaining: Successful SearchQ : What is the average cost of a successful search?
Average number of elements examined for an successful search is
1n
n∑i=1
(1 +n∑
j=i+11m )
= 1 + 1nm (
n∑j=i+1
1)
= 1 + 1nm (
n∑i=1
n − i)
= 1 + 1nm (n2 −
n∑i=1
i)
= 1 + 1nm (n2 − n(n+1)
2 )
= 1 + n2m − n
2nm
= 1 + α − α2n
Average cost for a successful search is 2 + α − α2n = Θ(1 + α)
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 20 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Chaining
Collision Resolution by Chaining: Successful SearchQ : What is the average cost of a successful search?
Average number of elements examined for an successful search is
1n
n∑i=1
(1 +n∑
j=i+11m ) = 1 + 1
nm (n∑
j=i+11)
= 1 + 1nm (
n∑i=1
n − i)
= 1 + 1nm (n2 −
n∑i=1
i)
= 1 + 1nm (n2 − n(n+1)
2 )
= 1 + n2m − n
2nm
= 1 + α − α2n
Average cost for a successful search is 2 + α − α2n = Θ(1 + α)
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 20 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Chaining
Collision Resolution by Chaining: Successful SearchQ : What is the average cost of a successful search?
Average number of elements examined for an successful search is
1n
n∑i=1
(1 +n∑
j=i+11m ) = 1 + 1
nm (n∑
j=i+11)
= 1 + 1nm (
n∑i=1
n − i)
= 1 + 1nm (n2 −
n∑i=1
i)
= 1 + 1nm (n2 − n(n+1)
2 )
= 1 + n2m − n
2nm
= 1 + α − α2n
Average cost for a successful search is 2 + α − α2n = Θ(1 + α)
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 20 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Chaining
Collision Resolution by Chaining: Successful SearchQ : What is the average cost of a successful search?
Average number of elements examined for an successful search is
1n
n∑i=1
(1 +n∑
j=i+11m ) = 1 + 1
nm (n∑
j=i+11)
= 1 + 1nm (
n∑i=1
n − i)
= 1 + 1nm (n2 −
n∑i=1
i)
= 1 + 1nm (n2 − n(n+1)
2 )
= 1 + n2m − n
2nm
= 1 + α − α2n
Average cost for a successful search is 2 + α − α2n = Θ(1 + α)
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 20 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Chaining
Collision Resolution by Chaining: Successful SearchQ : What is the average cost of a successful search?
Average number of elements examined for an successful search is
1n
n∑i=1
(1 +n∑
j=i+11m ) = 1 + 1
nm (n∑
j=i+11)
= 1 + 1nm (
n∑i=1
n − i)
= 1 + 1nm (n2 −
n∑i=1
i)
= 1 + 1nm (n2 − n(n+1)
2 )
= 1 + n2m − n
2nm
= 1 + α − α2n
Average cost for a successful search is 2 + α − α2n = Θ(1 + α)
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 20 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Chaining
Collision Resolution by Chaining: Successful SearchQ : What is the average cost of a successful search?
Average number of elements examined for an successful search is
1n
n∑i=1
(1 +n∑
j=i+11m ) = 1 + 1
nm (n∑
j=i+11)
= 1 + 1nm (
n∑i=1
n − i)
= 1 + 1nm (n2 −
n∑i=1
i)
= 1 + 1nm (n2 − n(n+1)
2 )
= 1 + n2m − n
2nm
= 1 + α − α2n
Average cost for a successful search is 2 + α − α2n = Θ(1 + α)
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 20 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Chaining
Collision Resolution by Chaining: Example
JAVA 7 HashMap
1 static int hash( int h) {2 h^= (h>>>20)^(h>>>12);3 return h^(h>>>7)^(h>>>4);4 }
In Java 7, after calculating hash from hashfunction, if more then one element has samehash than they are searched by linear search, soit’s complexity is O(n).In Java 8, that search is performed by binarysearch so the complexity will become log n.
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 21 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Open Addressing
All elements are stored in the hash table, no linked list isused. So, α ≤ 1.
Collision is settled by “rehashing”:I A function used to get a new hashing address for each
collided addressI The hash table slots are probed successively, until a valid
location is found.The probing sequence can be seen as a permutation of(0, 1, 2, · · · , m − 1)→⟨h(k, 0), h(k, 1), · · · , h(k, m − 1)⟩
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 22 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Open Addressing
All elements are stored in the hash table, no linked list isused. So, α ≤ 1.Collision is settled by “rehashing”:
I A function used to get a new hashing address for eachcollided address
I The hash table slots are probed successively, until a validlocation is found.
The probing sequence can be seen as a permutation of(0, 1, 2, · · · , m − 1)→⟨h(k, 0), h(k, 1), · · · , h(k, m − 1)⟩
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 22 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Open Addressing
All elements are stored in the hash table, no linked list isused. So, α ≤ 1.Collision is settled by “rehashing”:I A function used to get a new hashing address for each
collided address
I The hash table slots are probed successively, until a validlocation is found.
The probing sequence can be seen as a permutation of(0, 1, 2, · · · , m − 1)→⟨h(k, 0), h(k, 1), · · · , h(k, m − 1)⟩
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 22 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Open Addressing
All elements are stored in the hash table, no linked list isused. So, α ≤ 1.Collision is settled by “rehashing”:I A function used to get a new hashing address for each
collided addressI The hash table slots are probed successively, until a valid
location is found.
The probing sequence can be seen as a permutation of(0, 1, 2, · · · , m − 1)→⟨h(k, 0), h(k, 1), · · · , h(k, m − 1)⟩
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 22 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Open Addressing
All elements are stored in the hash table, no linked list isused. So, α ≤ 1.Collision is settled by “rehashing”:I A function used to get a new hashing address for each
collided addressI The hash table slots are probed successively, until a valid
location is found.The probing sequence can be seen as a permutation of(0, 1, 2, · · · , m − 1)→⟨h(k, 0), h(k, 1), · · · , h(k, m − 1)⟩
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 22 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Open Addressing: Commonly Used Probings
Linear Probing: (primary clustering may occur)Given an auxiliary hash function h′, the hash function is:
h(k, i) = (h′(k) + i) mod m, (i = 0, 1, ..., m − 1)
Quadratic Probing: (secondary clustering may occur)Given auxiliary function h′ and nonzero auxiliary constant c1 and c2, thehash function is:
h(k, i) = (h′(k) + c1i + c2i2) mod m, (i = 0, 1, ..., m − 1)
Double Hashing:Given auxiliary functions h1 and h2, the hash function is:
h(k, i) = (h1(k) + ih2(k)) mod m, (i = 0, 1, ..., m − 1)
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 23 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Open Addressing: Commonly Used Probings
Linear Probing: (primary clustering may occur)Given an auxiliary hash function h′, the hash function is:
h(k, i) = (h′(k) + i) mod m, (i = 0, 1, ..., m − 1)
Quadratic Probing: (secondary clustering may occur)Given auxiliary function h′ and nonzero auxiliary constant c1 and c2, thehash function is:
h(k, i) = (h′(k) + c1i + c2i2) mod m, (i = 0, 1, ..., m − 1)
Double Hashing:Given auxiliary functions h1 and h2, the hash function is:
h(k, i) = (h1(k) + ih2(k)) mod m, (i = 0, 1, ..., m − 1)
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 23 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Open Addressing: Commonly Used Probings
Linear Probing: (primary clustering may occur)Given an auxiliary hash function h′, the hash function is:
h(k, i) = (h′(k) + i) mod m, (i = 0, 1, ..., m − 1)
Quadratic Probing: (secondary clustering may occur)Given auxiliary function h′ and nonzero auxiliary constant c1 and c2, thehash function is:
h(k, i) = (h′(k) + c1i + c2i2) mod m, (i = 0, 1, ..., m − 1)
Double Hashing:Given auxiliary functions h1 and h2, the hash function is:
h(k, i) = (h1(k) + ih2(k)) mod m, (i = 0, 1, ..., m − 1)MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 23 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Linear Probing: an Example
0 17761
1776
2
1776
3
1776
4
1776
5
1776
6
1776
7
1776
10551492
1918
Hasing function: h(x) = 5x mod 8
Rehasing function: rh(j) = (j + 1) mod 8
1812hashing
1812rehashing
1812
1945
hashing
1945rehashing
1945
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 24 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Linear Probing: an Example
0 17761
1776
2
1776
3
1776
4
1776
5
1776
6
1776
7
1776
10551492
1918
Hasing function: h(x) = 5x mod 8
Rehasing function: rh(j) = (j + 1) mod 8
1812
hashing
1812rehashing
1812
1945
hashing
1945rehashing
1945
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 24 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Linear Probing: an Example
0 17761
1776
2
1776
3
1776
4
1776
5
1776
6
1776
7
1776
10551492
1918
Hasing function: h(x) = 5x mod 8
Rehasing function: rh(j) = (j + 1) mod 8
1812hashing
1812rehashing
1812
1945
hashing
1945rehashing
1945
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 24 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Linear Probing: an Example
0 17761
1776
2
1776
3
1776
4
1776
5
1776
6
1776
7
1776
10551492
1918
Hasing function: h(x) = 5x mod 8
Rehasing function: rh(j) = (j + 1) mod 8
1812hashing
1812rehashing
1812
1945
hashing
1945rehashing
1945
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 24 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Linear Probing: an Example
0 17761
1776
2
1776
3
1776
4
1776
5
1776
6
1776
7
1776
10551492
1918
Hasing function: h(x) = 5x mod 8
Rehasing function: rh(j) = (j + 1) mod 8
1812hashing
1812rehashing
1812
1945
hashing
1945rehashing
1945
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 24 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Linear Probing: an Example
0 17761
1776
2
1776
3
1776
4
1776
5
1776
6
1776
7
1776
10551492
1918
Hasing function: h(x) = 5x mod 8
Rehasing function: rh(j) = (j + 1) mod 8
1812hashing
1812rehashing
1812
1945
hashing
1945rehashing
1945
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 24 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Linear Probing: an Example
0 17761
1776
2
1776
3
1776
4
1776
5
1776
6
1776
7
1776
10551492
1918
Hasing function: h(x) = 5x mod 8
Rehasing function: rh(j) = (j + 1) mod 8
1812hashing
1812rehashing
1812
1945
hashing
1945rehashing
1945
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 24 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Linear Probing: an Example
0 17761
1776
2
1776
3
1776
4
1776
5
1776
6
1776
7
1776
10551492
1918
Hasing function: h(x) = 5x mod 8
Rehasing function: rh(j) = (j + 1) mod 8
1812hashing
1812rehashing
1812
1945
hashing
1945
rehashing1945
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 24 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Linear Probing: an Example
0 17761
1776
2
1776
3
1776
4
1776
5
1776
6
1776
7
1776
10551492
1918
Hasing function: h(x) = 5x mod 8
Rehasing function: rh(j) = (j + 1) mod 8
1812hashing
1812rehashing
1812
1945
hashing
1945rehashing
1945
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 24 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Linear Probing: an Example
0 17761
1776
2
1776
3
1776
4
1776
5
1776
6
1776
7
1776
10551492
1918
Hasing function: h(x) = 5x mod 8
Rehasing function: rh(j) = (j + 1) mod 8
1812hashing
1812rehashing
1812
1945
hashing
1945rehashing
1945
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 24 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Open Addressing: Commonly Used Probings
Q : How to evaluate the goodness of a probing?
AssumptionEach key is equally likely to have any of the m! permutationsof (1, 2, · · · , m − 1) as its probe sequence.
Both linear and quadratic probing have only m distinct probesequences, as determined by the first probe.
h(k, i) = (h′(k) + i) mod m, (i = 0, 1, ..., m − 1)h(k, i) = (h′(k) + c1i + c2i2) mod m, (i = 0, 1, ..., m − 1)
Double hashing improves over linear or quadratic probing in thatΘ(m2) probe sequences are used.
h(k, i) = (h1(k) + ih2(k)) mod m, (i = 0, 1, ..., m − 1)
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 25 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Open Addressing: Commonly Used Probings
Q : How to evaluate the goodness of a probing?
AssumptionEach key is equally likely to have any of the m! permutationsof (1, 2, · · · , m − 1) as its probe sequence.
Both linear and quadratic probing have only m distinct probesequences, as determined by the first probe.
h(k, i) = (h′(k) + i) mod m, (i = 0, 1, ..., m − 1)h(k, i) = (h′(k) + c1i + c2i2) mod m, (i = 0, 1, ..., m − 1)
Double hashing improves over linear or quadratic probing in thatΘ(m2) probe sequences are used.
h(k, i) = (h1(k) + ih2(k)) mod m, (i = 0, 1, ..., m − 1)
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 25 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Open Addressing: Commonly Used Probings
Q : How to evaluate the goodness of a probing?
AssumptionEach key is equally likely to have any of the m! permutationsof (1, 2, · · · , m − 1) as its probe sequence.
Both linear and quadratic probing have only m distinct probesequences, as determined by the first probe.
h(k, i) = (h′(k) + i) mod m, (i = 0, 1, ..., m − 1)h(k, i) = (h′(k) + c1i + c2i2) mod m, (i = 0, 1, ..., m − 1)
Double hashing improves over linear or quadratic probing in thatΘ(m2) probe sequences are used.
h(k, i) = (h1(k) + ih2(k)) mod m, (i = 0, 1, ..., m − 1)
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 25 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Open Addressing: Commonly Used Probings
Q : How to evaluate the goodness of a probing?
AssumptionEach key is equally likely to have any of the m! permutationsof (1, 2, · · · , m − 1) as its probe sequence.
Both linear and quadratic probing have only m distinct probesequences, as determined by the first probe.
h(k, i) = (h′(k) + i) mod m, (i = 0, 1, ..., m − 1)h(k, i) = (h′(k) + c1i + c2i2) mod m, (i = 0, 1, ..., m − 1)
Double hashing improves over linear or quadratic probing in thatΘ(m2) probe sequences are used.
h(k, i) = (h1(k) + ih2(k)) mod m, (i = 0, 1, ..., m − 1)
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 25 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Open Addressing: Deleting Element
Q : Why Open Addressing is not suitable for situations where itemsmight be deleted?
The probing chain might be broken if an item is deleted.How to deal with it?
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 26 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Open Addressing: Deleting Element
Q : Why Open Addressing is not suitable for situations where itemsmight be deleted?
The probing chain might be broken if an item is deleted.
How to deal with it?
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 26 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Open Addressing: Deleting Element
Q : Why Open Addressing is not suitable for situations where itemsmight be deleted?
The probing chain might be broken if an item is deleted.How to deal with it?
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 26 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Open Addressing: Unsuccessful SearchQ : Assuming uniform hashing, what is the average number of probesin an unsuccessful search?
X : the number of probes made in an unsuccessful search
E (X ) =∞∑
i=0i · Pr(X = i)
=∞∑
i=0i · (Pr(X ≥ i) − Pr(X ≥ i + 1))
=∞∑
i=1Pr(X ≥ i)
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 27 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Open Addressing: Unsuccessful SearchQ : Assuming uniform hashing, what is the average number of probesin an unsuccessful search?
X : the number of probes made in an unsuccessful search
E (X ) =∞∑
i=0i · Pr(X = i)
=∞∑
i=0i · (Pr(X ≥ i) − Pr(X ≥ i + 1))
=∞∑
i=1Pr(X ≥ i)
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 27 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Open Addressing: Unsuccessful SearchQ : Assuming uniform hashing, what is the average number of probesin an unsuccessful search?
X : the number of probes made in an unsuccessful search
E (X ) =∞∑
i=0i · Pr(X = i)
=∞∑
i=0i · (Pr(X ≥ i) − Pr(X ≥ i + 1))
=∞∑
i=1Pr(X ≥ i)
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 27 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Open Addressing: Unsuccessful SearchQ : Assuming uniform hashing, what is the average number of probesin an unsuccessful search?
X : the number of probes made in an unsuccessful search
E (X ) =∞∑
i=0i · Pr(X = i)
=∞∑
i=0i · (Pr(X ≥ i) − Pr(X ≥ i + 1))
=∞∑
i=1Pr(X ≥ i)
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 27 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Open Addressing: Unsuccessful Search
Q : How to compute Pr(X ≥ i)?
Ai : the ith probe occurs at an occupied slotX ≥ i = A1 ∩ A2 ∩ · · · ∩ Ai−1
Pr(X ≥ i) = Pr(A1 ∩ A2 ∩ · · · ∩ Ai−1)= Pr(A1) · Pr(A2|A1) · Pr(A3|A1 ∩ A2) · · · Pr(Ai−1|A1 ∩ A2 ∩ · · · ∩ Ai−2)
= nm · n−1
m−1 · · · n−i+2m−i+2
≤ ( nm )i−1
= αi−1
Then, E (x) =∞∑
i=1Pr(X ≥ i) ≤
∞∑i=1
αi−1 =≤∞∑
i=0αi = 1
1−α
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 28 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Open Addressing: Unsuccessful Search
Q : How to compute Pr(X ≥ i)?Ai : the ith probe occurs at an occupied slot
X ≥ i = A1 ∩ A2 ∩ · · · ∩ Ai−1
Pr(X ≥ i) = Pr(A1 ∩ A2 ∩ · · · ∩ Ai−1)= Pr(A1) · Pr(A2|A1) · Pr(A3|A1 ∩ A2) · · · Pr(Ai−1|A1 ∩ A2 ∩ · · · ∩ Ai−2)
= nm · n−1
m−1 · · · n−i+2m−i+2
≤ ( nm )i−1
= αi−1
Then, E (x) =∞∑
i=1Pr(X ≥ i) ≤
∞∑i=1
αi−1 =≤∞∑
i=0αi = 1
1−α
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 28 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Open Addressing: Unsuccessful Search
Q : How to compute Pr(X ≥ i)?Ai : the ith probe occurs at an occupied slotX ≥ i = A1 ∩ A2 ∩ · · · ∩ Ai−1
Pr(X ≥ i) = Pr(A1 ∩ A2 ∩ · · · ∩ Ai−1)= Pr(A1) · Pr(A2|A1) · Pr(A3|A1 ∩ A2) · · · Pr(Ai−1|A1 ∩ A2 ∩ · · · ∩ Ai−2)
= nm · n−1
m−1 · · · n−i+2m−i+2
≤ ( nm )i−1
= αi−1
Then, E (x) =∞∑
i=1Pr(X ≥ i) ≤
∞∑i=1
αi−1 =≤∞∑
i=0αi = 1
1−α
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 28 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Open Addressing: Unsuccessful Search
Q : How to compute Pr(X ≥ i)?Ai : the ith probe occurs at an occupied slotX ≥ i = A1 ∩ A2 ∩ · · · ∩ Ai−1
Pr(X ≥ i) = Pr(A1 ∩ A2 ∩ · · · ∩ Ai−1)= Pr(A1) · Pr(A2|A1) · Pr(A3|A1 ∩ A2) · · · Pr(Ai−1|A1 ∩ A2 ∩ · · · ∩ Ai−2)
= nm · n−1
m−1 · · · n−i+2m−i+2
≤ ( nm )i−1
= αi−1
Then, E (x) =∞∑
i=1Pr(X ≥ i) ≤
∞∑i=1
αi−1 =≤∞∑
i=0αi = 1
1−α
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 28 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Open Addressing: Unsuccessful Search
Q : How to compute Pr(X ≥ i)?Ai : the ith probe occurs at an occupied slotX ≥ i = A1 ∩ A2 ∩ · · · ∩ Ai−1
Pr(X ≥ i) = Pr(A1 ∩ A2 ∩ · · · ∩ Ai−1)= Pr(A1) · Pr(A2|A1) · Pr(A3|A1 ∩ A2) · · · Pr(Ai−1|A1 ∩ A2 ∩ · · · ∩ Ai−2)
= nm · n−1
m−1 · · · n−i+2m−i+2
≤ ( nm )i−1
= αi−1
Then, E (x) =∞∑
i=1Pr(X ≥ i) ≤
∞∑i=1
αi−1 =≤∞∑
i=0αi = 1
1−α
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 28 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Open Addressing: Inserting an Element
Q : What is the average cost for inserting an element into a tablewith load factor α?
Proof.An element is inserted only if there is room in the table, and thusα < 1.Inserting a key requires an unsuccessful search followed by placing thekey into the first empty slot found.Thus, the expected number of probes is at most 1/(1 − α)
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 29 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Open Addressing: Inserting an Element
Q : What is the average cost for inserting an element into a tablewith load factor α?
Proof.An element is inserted only if there is room in the table, and thusα < 1.Inserting a key requires an unsuccessful search followed by placing thekey into the first empty slot found.Thus, the expected number of probes is at most 1/(1 − α)
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 29 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Open Addressing: Insertion/Unsuccessful Search
11−α
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 30 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Open Addressing: Successful Search
Q : What is the average number of probes in a successful search ofan element in a table with load factor α?
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 31 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Open Addressing: Successful Search
Q : What is the average number of probes in a successful search ofan element in a table with load factor α?
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 31 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Open Addressing: Successful SearchQ : what is the average number of probes in a successful search foran element in a table with load factor α?
Proof.
A search for a key k reproduces the same probe sequence as whenthe element with key k was inserted.If k is the (i+1)st key inserted into the table, the expected number ofprobes in a search for k is at most 1/(1 − i/m) = m/(m − i)The expected number of probes is:
1n
n−1∑i=0
mm−i = m
nn−1∑i=0
1m−i = 1
α
m∑k=m−n+1
1k
≤ 1α
∫ mm−n
1x dx = 1
α ln mm−n
= 1α ln 1
1−α
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 32 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Open Addressing: Successful SearchQ : what is the average number of probes in a successful search foran element in a table with load factor α?
Proof.A search for a key k reproduces the same probe sequence as whenthe element with key k was inserted.
If k is the (i+1)st key inserted into the table, the expected number ofprobes in a search for k is at most 1/(1 − i/m) = m/(m − i)The expected number of probes is:
1n
n−1∑i=0
mm−i = m
nn−1∑i=0
1m−i = 1
α
m∑k=m−n+1
1k
≤ 1α
∫ mm−n
1x dx = 1
α ln mm−n
= 1α ln 1
1−α
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 32 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Open Addressing: Successful SearchQ : what is the average number of probes in a successful search foran element in a table with load factor α?
Proof.A search for a key k reproduces the same probe sequence as whenthe element with key k was inserted.If k is the (i+1)st key inserted into the table, the expected number ofprobes in a search for k is at most 1/(1 − i/m) = m/(m − i)
The expected number of probes is:1n
n−1∑i=0
mm−i = m
nn−1∑i=0
1m−i = 1
α
m∑k=m−n+1
1k
≤ 1α
∫ mm−n
1x dx = 1
α ln mm−n
= 1α ln 1
1−α
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 32 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Open Addressing: Successful SearchQ : what is the average number of probes in a successful search foran element in a table with load factor α?
Proof.A search for a key k reproduces the same probe sequence as whenthe element with key k was inserted.If k is the (i+1)st key inserted into the table, the expected number ofprobes in a search for k is at most 1/(1 − i/m) = m/(m − i)The expected number of probes is:
1n
n−1∑i=0
mm−i = m
nn−1∑i=0
1m−i = 1
α
m∑k=m−n+1
1k
≤ 1α
∫ mm−n
1x dx = 1
α ln mm−n
= 1α ln 1
1−α
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 32 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Open Addressing: Successful Search
1α ln 1
1−α
For your reference:α = 0.5: 1.387α = 0.9: 2.559
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 33 / 34
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Collision Resolution Open Addressing
Thank You!Questions?
Office [email protected]
MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 34 / 34