2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When...

132
Problem Solving 2-12 Hashing MA Jun Institute of Computer Software May 14, 2020 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Transcript of 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When...

Page 1: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

Problem Solving2-12 Hashing

MA Jun

Institute of Computer Software

May 14, 2020

. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .

Page 2: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

Contents

1 Hash-table: Basic Idea

2 Hash Functions

3 Collision Resolution

. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .

Page 3: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Applications of Hashing

There are many applications of hashing (not limited to hash table),including modern day cryptography hash functions. Some of theseapplications are listed below:

Message DigestPassword VerificationData Structures (Programming Languages)Compiler OperationRabin-Karp AlgorithmLinking File Name and Path Together

https://www.geeksforgeeks.org/applications-of-hashing/

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 1 / 34

Page 4: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

Contents

1 Hash-table: Basic Idea

2 Hash Functions

3 Collision Resolution

. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .

Page 5: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

Many applications require a Dynamic Set that supports only thedictionary operations Insert, Search, and Delete.

Searching for an element in a hash table can take as long as searching foran element in a linked list Θ(n) time in the worst case.

Under reasonable assumptions, the average time to search for an elementin a hash table is Θ(1).

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 2 / 34

Page 6: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

Many applications require a Dynamic Set that supports only thedictionary operations Insert, Search, and Delete.

Searching for an element in a hash table can take as long as searching foran element in a linked list Θ(n) time in the worst case.

Under reasonable assumptions, the average time to search for an elementin a hash table is Θ(1).

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 2 / 34

Page 7: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

Many applications require a Dynamic Set that supports only thedictionary operations Insert, Search, and Delete.

Searching for an element in a hash table can take as long as searching foran element in a linked list Θ(n) time in the worst case.

Under reasonable assumptions, the average time to search for an elementin a hash table is Θ(1).

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 2 / 34

Page 8: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

Hashing: the idea

Q : When should we use hash table?What situations is the hash table suitable for?

Key Space

x

Very large, but only a smallpart is used in an applica-tion at a certain time.

· · ·

· · ·

E [0]E [1]

E [m − 1]

Feasible size

Hash FunctionE [k]

Index distributionCollision handling

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 3 / 34

Page 9: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

Hashing: the idea

Q : When should we use hash table?What situations is the hash table suitable for?

Key Space

x

Very large, but only a smallpart is used in an applica-tion at a certain time.

· · ·

· · ·

E [0]E [1]

E [m − 1]

Feasible size

Hash FunctionE [k]

Index distributionCollision handling

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 3 / 34

Page 10: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

Hashing: the idea

Q : When should we use hash table?What situations is the hash table suitable for?

Key Space

x

Very large, but only a smallpart is used in an applica-tion at a certain time.

· · ·

· · ·

E [0]E [1]

E [m − 1]

Feasible size

Hash FunctionE [k]

Index distributionCollision handling

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 3 / 34

Page 11: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

Hashing: the idea

Q : When should we use hash table?What situations is the hash table suitable for?

Key Space

x

Very large, but only a smallpart is used in an applica-tion at a certain time.

· · ·

· · ·

E [0]E [1]

E [m − 1]

Feasible size

Hash FunctionE [k]

Index distributionCollision handling

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 3 / 34

Page 12: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

Hashing: the idea

Q : When should we use hash table?What situations is the hash table suitable for?

Key Space

x

Very large, but only a smallpart is used in an applica-tion at a certain time.

· · ·

· · ·

E [0]E [1]

E [m − 1]

Feasible size

Hash FunctionE [k]

Index distributionCollision handling

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 3 / 34

Page 13: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

Hashing: the idea

Q : When should we use hash table?What situations is the hash table suitable for?

Key Space

x

Very large, but only a smallpart is used in an applica-tion at a certain time.

· · ·

· · ·

E [0]E [1]

E [m − 1]

Feasible size

Hash FunctionE [k]

Index distributionCollision handling

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 3 / 34

Page 14: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

Hashing: the idea

Q : What is a Collision? When does it take place?

Hash FunctionKey Space

· · ·

· · ·

E [0]E [1]

E [m − 1]

xyE [k]

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 4 / 34

Page 15: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

Hashing: the idea

Q : What is a Collision? When does it take place?

Hash FunctionKey Space

· · ·

· · ·

E [0]E [1]

E [m − 1]

xyE [k]

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 4 / 34

Page 16: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

Hashing: the idea

Q : What is a Collision? When does it take place?

Hash FunctionKey Space

· · ·

· · ·

E [0]E [1]

E [m − 1]

xy

E [k]

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 4 / 34

Page 17: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

Hashing: the idea

Q : What is a Collision? When does it take place?

Hash FunctionKey Space

· · ·

· · ·

E [0]E [1]

E [m − 1]

xyE [k]

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 4 / 34

Page 18: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

Collision

Q : Given n keys and m slots, what is the expected number of keyshashed into the same slot?

Hash FunctionKey Space

xy

· · ·

· · ·

E [0]E [1]

E [m − 1]

E [k]

It depends ...

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 5 / 34

Page 19: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

Collision

Q : Given n keys and m slots, what is the expected number of keyshashed into the same slot?

Hash FunctionKey Space

xy

· · ·

· · ·

E [0]E [1]

E [m − 1]

E [k]

It depends ...

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 5 / 34

Page 20: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

Model & Assumption

ModelInserting n keys: sequence of n s independent trails.Each insertion corresponds to a value from {0, 1, · · · , m − 1}.

AssumptionFor each insertion, the result is Uniformly Distributed, i.e. each valuefrom {0, 1, · · · , m − 1} is equally picked.

In hashing n items into a hash table of size m, the expected number of itemsthat hash to any one location is α = n/m (α: loading factor)

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 6 / 34

Page 21: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

Model & Assumption

ModelInserting n keys: sequence of n s independent trails.Each insertion corresponds to a value from {0, 1, · · · , m − 1}.

AssumptionFor each insertion, the result is Uniformly Distributed, i.e. each valuefrom {0, 1, · · · , m − 1} is equally picked.

In hashing n items into a hash table of size m, the expected number of itemsthat hash to any one location is α = n/m (α: loading factor)

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 6 / 34

Page 22: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

Model & Assumption

ModelInserting n keys: sequence of n s independent trails.Each insertion corresponds to a value from {0, 1, · · · , m − 1}.

AssumptionFor each insertion, the result is Uniformly Distributed, i.e. each valuefrom {0, 1, · · · , m − 1} is equally picked.

In hashing n items into a hash table of size m, the expected number of itemsthat hash to any one location is α = n/m (α: loading factor)

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 6 / 34

Page 23: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

Empty location

After inserting n items into m locations

Q : What is the probability of a given location is empty?

(1 − 1m )n

Q : What is the expected number of empty locations?

Xi ={

1 , location i is empty0 , otherwise

X =∑

1≤i≤mXi

E (X ) = E (∑

1≤i≤mXi) =

∑1≤i≤m

E (Xi) =∑

1≤i≤m(1 − 1/m)n =

m(1 − 1/m)n

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 7 / 34

Page 24: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

Empty location

After inserting n items into m locations

Q : What is the probability of a given location is empty?

(1 − 1m )n

Q : What is the expected number of empty locations?

Xi ={

1 , location i is empty0 , otherwise

X =∑

1≤i≤mXi

E (X ) = E (∑

1≤i≤mXi) =

∑1≤i≤m

E (Xi) =∑

1≤i≤m(1 − 1/m)n =

m(1 − 1/m)n

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 7 / 34

Page 25: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

Empty location

After inserting n items into m locations

Q : What is the probability of a given location is empty?

(1 − 1m )n

Q : What is the expected number of empty locations?

Xi ={

1 , location i is empty0 , otherwise

X =∑

1≤i≤mXi

E (X ) = E (∑

1≤i≤mXi) =

∑1≤i≤m

E (Xi) =∑

1≤i≤m(1 − 1/m)n =

m(1 − 1/m)n

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 7 / 34

Page 26: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

Empty location

After inserting n items into m locations

Q : What is the probability of a given location is empty?

(1 − 1m )n

Q : What is the expected number of empty locations?

Xi ={

1 , location i is empty0 , otherwise

X =∑

1≤i≤mXi

E (X ) = E (∑

1≤i≤mXi) =

∑1≤i≤m

E (Xi) =∑

1≤i≤m(1 − 1/m)n =

m(1 − 1/m)n

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 7 / 34

Page 27: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

Empty location

After inserting n items into m locations

Q : What is the probability of a given location is empty?

(1 − 1m )n

Q : What is the expected number of empty locations?

Xi ={

1 , location i is empty0 , otherwise

X =∑

1≤i≤mXi

E (X ) = E (∑

1≤i≤mXi) =

∑1≤i≤m

E (Xi) =∑

1≤i≤m(1 − 1/m)n =

m(1 − 1/m)n

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 7 / 34

Page 28: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

Empty location

After inserting n items into m locations

Q : What is the probability of a given location is empty?

(1 − 1m )n

Q : What is the expected number of empty locations?

Xi ={

1 , location i is empty0 , otherwise

X =∑

1≤i≤mXi

E (X ) = E (∑

1≤i≤mXi) =

∑1≤i≤m

E (Xi) =∑

1≤i≤m(1 − 1/m)n =

m(1 − 1/m)n

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 7 / 34

Page 29: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

Empty location

After inserting n items into m locations

Q : What is the probability of a given location is empty?

(1 − 1m )n

Q : What is the expected number of empty locations?

Xi ={

1 , location i is empty0 , otherwise

X =∑

1≤i≤mXi

E (X ) = E (∑

1≤i≤mXi) =

∑1≤i≤m

E (Xi) =∑

1≤i≤m(1 − 1/m)n =

m(1 − 1/m)n

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 7 / 34

Page 30: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

After inserting n items into a hashtable with n locations

Q : What is the expected number of items in a given location?

Yi ={

1 , the ith item in the given location0 , otherwise

Y =∑

1≤i≤nYi

E (Y ) = E (∑

1≤i≤nYi) =

∑1≤i≤n

E (Yi) =∑

1≤i≤n1/n = 1

Expected number of empty locations: n(1 − 1/n)n = n/e ≈ 0.368n

PARADOX?collisions!

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 8 / 34

Page 31: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

After inserting n items into a hashtable with n locations

Q : What is the expected number of items in a given location?

Yi ={

1 , the ith item in the given location0 , otherwise

Y =∑

1≤i≤nYi

E (Y ) = E (∑

1≤i≤nYi) =

∑1≤i≤n

E (Yi) =∑

1≤i≤n1/n = 1

Expected number of empty locations: n(1 − 1/n)n = n/e ≈ 0.368n

PARADOX?collisions!

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 8 / 34

Page 32: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

After inserting n items into a hashtable with n locations

Q : What is the expected number of items in a given location?

Yi ={

1 , the ith item in the given location0 , otherwise

Y =∑

1≤i≤nYi

E (Y ) = E (∑

1≤i≤nYi) =

∑1≤i≤n

E (Yi) =∑

1≤i≤n1/n = 1

Expected number of empty locations: n(1 − 1/n)n = n/e ≈ 0.368n

PARADOX?collisions!

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 8 / 34

Page 33: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

After inserting n items into a hashtable with n locations

Q : What is the expected number of items in a given location?

Yi ={

1 , the ith item in the given location0 , otherwise

Y =∑

1≤i≤nYi

E (Y ) = E (∑

1≤i≤nYi) =

∑1≤i≤n

E (Yi) =∑

1≤i≤n1/n = 1

Expected number of empty locations: n(1 − 1/n)n = n/e ≈ 0.368n

PARADOX?collisions!

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 8 / 34

Page 34: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

After inserting n items into a hashtable with n locations

Q : What is the expected number of items in a given location?

Yi ={

1 , the ith item in the given location0 , otherwise

Y =∑

1≤i≤nYi

E (Y ) = E (∑

1≤i≤nYi) =

∑1≤i≤n

E (Yi) =∑

1≤i≤n1/n = 1

Expected number of empty locations: n(1 − 1/n)n = n/e ≈ 0.368n

PARADOX?collisions!

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 8 / 34

Page 35: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

After inserting n items into a hashtable with n locations

Q : What is the expected number of items in a given location?

Yi ={

1 , the ith item in the given location0 , otherwise

Y =∑

1≤i≤nYi

E (Y ) = E (∑

1≤i≤nYi) =

∑1≤i≤n

E (Yi) =∑

1≤i≤n1/n = 1

Expected number of empty locations: n(1 − 1/n)n = n/e ≈ 0.368n

PARADOX?collisions!

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 8 / 34

Page 36: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

After inserting n items into a hashtable with n locations

Q : What is the expected number of items in a given location?

Yi ={

1 , the ith item in the given location0 , otherwise

Y =∑

1≤i≤nYi

E (Y ) = E (∑

1≤i≤nYi) =

∑1≤i≤n

E (Yi) =∑

1≤i≤n1/n = 1

Expected number of empty locations: n(1 − 1/n)n = n/e ≈ 0.368n

PARADOX?

collisions!

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 8 / 34

Page 37: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

After inserting n items into a hashtable with n locations

Q : What is the expected number of items in a given location?

Yi ={

1 , the ith item in the given location0 , otherwise

Y =∑

1≤i≤nYi

E (Y ) = E (∑

1≤i≤nYi) =

∑1≤i≤n

E (Yi) =∑

1≤i≤n1/n = 1

Expected number of empty locations: n(1 − 1/n)n = n/e ≈ 0.368n

PARADOX?collisions!

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 8 / 34

Page 38: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

Expected number of collisions

After inserting n items into m locations

Q : What is the expected number of collisions?

E (collisions) = n − E (occupied locations)

= n − (m − E (empty locations))

= n − (m − m (1 − 1/m)n)

= n − m + m (1 − 1/m)n

Example:When n = 100, m = 100, then E (collision) ≈ 37

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 9 / 34

Page 39: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

Expected number of collisions

After inserting n items into m locations

Q : What is the expected number of collisions?

E (collisions) = n − E (occupied locations)

= n − (m − E (empty locations))

= n − (m − m (1 − 1/m)n)

= n − m + m (1 − 1/m)n

Example:When n = 100, m = 100, then E (collision) ≈ 37

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 9 / 34

Page 40: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

Expected number of collisions

After inserting n items into m locations

Q : What is the expected number of collisions?

E (collisions) = n − E (occupied locations)

= n − (m − E (empty locations))

= n − (m − m (1 − 1/m)n)

= n − m + m (1 − 1/m)n

Example:When n = 100, m = 100, then E (collision) ≈ 37

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 9 / 34

Page 41: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

Expected number of collisions

After inserting n items into m locations

Q : What is the expected number of collisions?

E (collisions) = n − E (occupied locations)

= n − (m − E (empty locations))

= n − (m − m (1 − 1/m)n)

= n − m + m (1 − 1/m)n

Example:When n = 100, m = 100, then E (collision) ≈ 37

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 9 / 34

Page 42: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

Expected number of collisions

After inserting n items into m locations

Q : What is the expected number of collisions?

E (collisions) = n − E (occupied locations)

= n − (m − E (empty locations))

= n − (m − m (1 − 1/m)n)

= n − m + m (1 − 1/m)n

Example:When n = 100, m = 100, then E (collision) ≈ 37

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 9 / 34

Page 43: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

After inserting n items into m locations

Q : What is the expected number of items needed to fullfill all mlocations?

Xi : the number of items to be added to increase thenumber of occupied locations from i − 1 to i .Given i − 1 occupied locations,

the success probabilityfor each trial is (m − i + 1)/m

Thus, E (Xi) = m/(m − i + 1)

E (X ) =m∑

i=1E (Xi) =

m∑i=1

m/(m − i + 1)

= mm∑

i=11/(m − i + 1)

= mm∑

j=11/j = Θ(m lg m)

· · ·

# = i − 1

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 10 / 34

Page 44: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

After inserting n items into m locations

Q : What is the expected number of items needed to fullfill all mlocations?

Xi : the number of items to be added to increase thenumber of occupied locations from i − 1 to i .

Given i − 1 occupied locations,

the success probabilityfor each trial is (m − i + 1)/m

Thus, E (Xi) = m/(m − i + 1)

E (X ) =m∑

i=1E (Xi) =

m∑i=1

m/(m − i + 1)

= mm∑

i=11/(m − i + 1)

= mm∑

j=11/j = Θ(m lg m)

· · ·

# = i − 1

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 10 / 34

Page 45: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

After inserting n items into m locations

Q : What is the expected number of items needed to fullfill all mlocations?

Xi : the number of items to be added to increase thenumber of occupied locations from i − 1 to i .Given i − 1 occupied locations,

the success probabilityfor each trial is (m − i + 1)/mThus, E (Xi) = m/(m − i + 1)

E (X ) =m∑

i=1E (Xi) =

m∑i=1

m/(m − i + 1)

= mm∑

i=11/(m − i + 1)

= mm∑

j=11/j = Θ(m lg m)

· · ·

# = i − 1

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 10 / 34

Page 46: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

After inserting n items into m locations

Q : What is the expected number of items needed to fullfill all mlocations?

Xi : the number of items to be added to increase thenumber of occupied locations from i − 1 to i .Given i − 1 occupied locations,

the success probabilityfor each trial is (m − i + 1)/mThus, E (Xi) = m/(m − i + 1)

E (X ) =m∑

i=1E (Xi) =

m∑i=1

m/(m − i + 1)

= mm∑

i=11/(m − i + 1)

= mm∑

j=11/j = Θ(m lg m)

· · ·

# = i − 1

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 10 / 34

Page 47: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

After inserting n items into m locations

Q : What is the expected number of items needed to fullfill all mlocations?

Xi : the number of items to be added to increase thenumber of occupied locations from i − 1 to i .Given i − 1 occupied locations, the success probabilityfor each trial is (m − i + 1)/m

Thus, E (Xi) = m/(m − i + 1)

E (X ) =m∑

i=1E (Xi) =

m∑i=1

m/(m − i + 1)

= mm∑

i=11/(m − i + 1)

= mm∑

j=11/j = Θ(m lg m)

· · ·

# = i − 1

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 10 / 34

Page 48: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

After inserting n items into m locations

Q : What is the expected number of items needed to fullfill all mlocations?

Xi : the number of items to be added to increase thenumber of occupied locations from i − 1 to i .Given i − 1 occupied locations, the success probabilityfor each trial is (m − i + 1)/mThus, E (Xi) = m/(m − i + 1)

E (X ) =m∑

i=1E (Xi) =

m∑i=1

m/(m − i + 1)

= mm∑

i=11/(m − i + 1)

= mm∑

j=11/j = Θ(m lg m)

· · ·

# = i − 1

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 10 / 34

Page 49: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash-table: Basic Idea

After inserting n items into m locations

Q : What is the expected number of items needed to fullfill all mlocations?

Xi : the number of items to be added to increase thenumber of occupied locations from i − 1 to i .Given i − 1 occupied locations, the success probabilityfor each trial is (m − i + 1)/mThus, E (Xi) = m/(m − i + 1)

E (X ) =m∑

i=1E (Xi) =

m∑i=1

m/(m − i + 1)

= mm∑

i=11/(m − i + 1)

= mm∑

j=11/j = Θ(m lg m)

· · ·

# = i − 1

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 10 / 34

Page 50: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

Contents

1 Hash-table: Basic Idea

2 Hash Functions

3 Collision Resolution

. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .

Page 51: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash Functions

Hashing: the idea

Hash Function

Index distributionCollision handling

Key Space

Very large, but only a smallpart is used in an applica-tion at a certain time.

x· · ·

· · ·

E [0]E [1]

E [m − 1]

E [k]

In feasible size

Hash Function is IMPORTANT!

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 11 / 34

Page 52: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash Functions

Hashing: the idea

Hash Function

Index distributionCollision handling

Key Space

Very large, but only a smallpart is used in an applica-tion at a certain time.

x· · ·

· · ·

E [0]E [1]

E [m − 1]

E [k]

In feasible size

Hash Function is IMPORTANT!

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 11 / 34

Page 53: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash Functions

Two typical hash functions

Division Method: map a key k into one of m slots by taking theremainder of k divided by m.

h(k) = k mod m

Multiplication Method:Step-1: multiply the k by a constant A in the range 0 < A < 1 andextract the fractional part of kA.Step-2: multiply the fractional part by m.

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 12 / 34

Page 54: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash Functions

Two typical hash functions

Division Method: map a key k into one of m slots by taking the remainderof k divided by m:

h(k) = k mod m

Q : Why should we avoid m = 2p?Since if m = 2p, then h(k) is just the p lowest-order bits of k.Unless we know that all low-order p-bit patterns are equally likely, weare better off designing the hash function to depend on all the bits ofthe key.

A prime not too close to an exact power of 2 is often a good choice for m.

Exercise 11.3-3

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 13 / 34

Page 55: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash Functions

Two typical hash functions

Division Method: map a key k into one of m slots by taking the remainderof k divided by m:

h(k) = k mod m

Q : Why should we avoid m = 2p?

Since if m = 2p, then h(k) is just the p lowest-order bits of k.Unless we know that all low-order p-bit patterns are equally likely, weare better off designing the hash function to depend on all the bits ofthe key.

A prime not too close to an exact power of 2 is often a good choice for m.

Exercise 11.3-3

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 13 / 34

Page 56: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash Functions

Two typical hash functions

Division Method: map a key k into one of m slots by taking the remainderof k divided by m:

h(k) = k mod m

Q : Why should we avoid m = 2p?Since if m = 2p, then h(k) is just the p lowest-order bits of k.

Unless we know that all low-order p-bit patterns are equally likely, weare better off designing the hash function to depend on all the bits ofthe key.

A prime not too close to an exact power of 2 is often a good choice for m.

Exercise 11.3-3

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 13 / 34

Page 57: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash Functions

Two typical hash functions

Division Method: map a key k into one of m slots by taking the remainderof k divided by m:

h(k) = k mod m

Q : Why should we avoid m = 2p?Since if m = 2p, then h(k) is just the p lowest-order bits of k.Unless we know that all low-order p-bit patterns are equally likely, weare better off designing the hash function to depend on all the bits ofthe key.

A prime not too close to an exact power of 2 is often a good choice for m.

Exercise 11.3-3

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 13 / 34

Page 58: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash Functions

Two typical hash functions

Division Method: map a key k into one of m slots by taking the remainderof k divided by m:

h(k) = k mod m

Q : Why should we avoid m = 2p?Since if m = 2p, then h(k) is just the p lowest-order bits of k.Unless we know that all low-order p-bit patterns are equally likely, weare better off designing the hash function to depend on all the bits ofthe key.

A prime not too close to an exact power of 2 is often a good choice for m.

Exercise 11.3-3

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 13 / 34

Page 59: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash Functions

Two typical hash functions

Multiplication Method:Step-1: multiply the k by a constant A in the range 0 < A < 1 andextract the fractional part of kA.Step-2: multiply the fractional part by m

h(k) = ⌊m(kA mod 1)⌋

Q : Why do we usually choose m = 2p?

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 14 / 34

Page 60: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash Functions

Two typical hash functions

Multiplication Method:Step-1: multiply the k by a constant A in the range 0 < A < 1 andextract the fractional part of kA.Step-2: multiply the fractional part by m

h(k) = ⌊m(kA mod 1)⌋

Q : Why do we usually choose m = 2p?

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 14 / 34

Page 61: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Hash Functions

Two typical hash functions

Multiplication Method:Step-1: multiply the k by a constant A in the range 0 < A < 1 andextract the fractional part of kA.Step-2: multiply the fractional part by m

h(k) = ⌊m(kA mod 1)⌋

Q : Why do we usually choose m = 2p?

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 14 / 34

Page 62: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

Contents

1 Hash-table: Basic Idea

2 Hash Functions

3 Collision Resolution

. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .

Page 63: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution

Collision

Hash FunctionKey Space

xy

· · ·

· · ·

E [0]E [1]

E [m − 1]

E [k]

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 15 / 34

Page 64: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution

Collision Resolution

Chaining (Closed Addressing) Open Addressing

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 16 / 34

Page 65: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Chaining

Collision Resolution by Chaining

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 17 / 34

Page 66: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Chaining

Collision Resolution by Chaining

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 17 / 34

Page 67: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Chaining

Collision Resolution by Chaining: Unsuccessful SearchQ : What is the average cost of an unsuccessful search?

For j = 0, 1, 2, ..., m − 1, the average length of thelist at E [j] is n/m = α.

Any key that is not in the table is equally likely to hash to any of them addresses.The average cost to determine that the key is not in the list E [h(k)]is the cost to search to the end of the list, which is α.Total cost is Θ(1 + α)

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 18 / 34

Page 68: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Chaining

Collision Resolution by Chaining: Unsuccessful SearchQ : What is the average cost of an unsuccessful search?

For j = 0, 1, 2, ..., m − 1, the average length of thelist at E [j] is n/m = α.

Any key that is not in the table is equally likely to hash to any of them addresses.The average cost to determine that the key is not in the list E [h(k)]is the cost to search to the end of the list, which is α.Total cost is Θ(1 + α)

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 18 / 34

Page 69: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Chaining

Collision Resolution by Chaining: Unsuccessful SearchQ : What is the average cost of an unsuccessful search?

For j = 0, 1, 2, ..., m − 1, the average length of thelist at E [j] is n/m = α.

Any key that is not in the table is equally likely to hash to any of them addresses.The average cost to determine that the key is not in the list E [h(k)]is the cost to search to the end of the list, which is α.Total cost is Θ(1 + α)

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 18 / 34

Page 70: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Chaining

Collision Resolution by Chaining: Unsuccessful SearchQ : What is the average cost of an unsuccessful search?

For j = 0, 1, 2, ..., m − 1, the average length of thelist at E [j] is n/m = α.

Any key that is not in the table is equally likely to hash to any of them addresses.

The average cost to determine that the key is not in the list E [h(k)]is the cost to search to the end of the list, which is α.Total cost is Θ(1 + α)

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 18 / 34

Page 71: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Chaining

Collision Resolution by Chaining: Unsuccessful SearchQ : What is the average cost of an unsuccessful search?

For j = 0, 1, 2, ..., m − 1, the average length of thelist at E [j] is n/m = α.

Any key that is not in the table is equally likely to hash to any of them addresses.The average cost to determine that the key is not in the list E [h(k)]is the cost to search to the end of the list, which is α.

Total cost is Θ(1 + α)

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 18 / 34

Page 72: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Chaining

Collision Resolution by Chaining: Unsuccessful SearchQ : What is the average cost of an unsuccessful search?

For j = 0, 1, 2, ..., m − 1, the average length of thelist at E [j] is n/m = α.

Any key that is not in the table is equally likely to hash to any of them addresses.The average cost to determine that the key is not in the list E [h(k)]is the cost to search to the end of the list, which is α.Total cost is Θ(1 + α)

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 18 / 34

Page 73: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Chaining

Collision Resolution by Chaining: Successful SearchQ : What is the average cost of an successful search?

xi : the ith element inserted into the table.The probability of that xi is searched is 1/n.For a specific xi , the number of elements examined in a successfulsearch is t + 1I t: the number of elements inserted into the same list as xi , after xi

has been inserted.I The probability of that xj is inserted into the same list of xi is 1/m.I The expected number of elements examined for a successful search

of xi is (1 +n∑

j=i+1

1m )

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 19 / 34

Page 74: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Chaining

Collision Resolution by Chaining: Successful SearchQ : What is the average cost of an successful search?

xi : the ith element inserted into the table.The probability of that xi is searched is 1/n.For a specific xi , the number of elements examined in a successfulsearch is t + 1I t: the number of elements inserted into the same list as xi , after xi

has been inserted.I The probability of that xj is inserted into the same list of xi is 1/m.I The expected number of elements examined for a successful search

of xi is (1 +n∑

j=i+1

1m )

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 19 / 34

Page 75: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Chaining

Collision Resolution by Chaining: Successful SearchQ : What is the average cost of an successful search?

xi : the ith element inserted into the table.

The probability of that xi is searched is 1/n.For a specific xi , the number of elements examined in a successfulsearch is t + 1I t: the number of elements inserted into the same list as xi , after xi

has been inserted.I The probability of that xj is inserted into the same list of xi is 1/m.I The expected number of elements examined for a successful search

of xi is (1 +n∑

j=i+1

1m )

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 19 / 34

Page 76: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Chaining

Collision Resolution by Chaining: Successful SearchQ : What is the average cost of an successful search?

xi : the ith element inserted into the table.The probability of that xi is searched is 1/n.

For a specific xi , the number of elements examined in a successfulsearch is t + 1I t: the number of elements inserted into the same list as xi , after xi

has been inserted.I The probability of that xj is inserted into the same list of xi is 1/m.I The expected number of elements examined for a successful search

of xi is (1 +n∑

j=i+1

1m )

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 19 / 34

Page 77: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Chaining

Collision Resolution by Chaining: Successful SearchQ : What is the average cost of an successful search?

xi : the ith element inserted into the table.The probability of that xi is searched is 1/n.For a specific xi , the number of elements examined in a successfulsearch is t + 1

I t: the number of elements inserted into the same list as xi , after xihas been inserted.

I The probability of that xj is inserted into the same list of xi is 1/m.I The expected number of elements examined for a successful search

of xi is (1 +n∑

j=i+1

1m )

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 19 / 34

Page 78: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Chaining

Collision Resolution by Chaining: Successful SearchQ : What is the average cost of an successful search?

xi : the ith element inserted into the table.The probability of that xi is searched is 1/n.For a specific xi , the number of elements examined in a successfulsearch is t + 1I t: the number of elements inserted into the same list as xi , after xi

has been inserted.

I The probability of that xj is inserted into the same list of xi is 1/m.I The expected number of elements examined for a successful search

of xi is (1 +n∑

j=i+1

1m )

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 19 / 34

Page 79: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Chaining

Collision Resolution by Chaining: Successful SearchQ : What is the average cost of an successful search?

xi : the ith element inserted into the table.The probability of that xi is searched is 1/n.For a specific xi , the number of elements examined in a successfulsearch is t + 1I t: the number of elements inserted into the same list as xi , after xi

has been inserted.I The probability of that xj is inserted into the same list of xi is 1/m.

I The expected number of elements examined for a successful searchof xi is (1 +

n∑j=i+1

1m )

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 19 / 34

Page 80: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Chaining

Collision Resolution by Chaining: Successful SearchQ : What is the average cost of an successful search?

xi : the ith element inserted into the table.The probability of that xi is searched is 1/n.For a specific xi , the number of elements examined in a successfulsearch is t + 1I t: the number of elements inserted into the same list as xi , after xi

has been inserted.I The probability of that xj is inserted into the same list of xi is 1/m.I The expected number of elements examined for a successful search

of xi is (1 +n∑

j=i+1

1m )

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 19 / 34

Page 81: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Chaining

Collision Resolution by Chaining: Successful SearchQ : What is the average cost of a successful search?

Average number of elements examined for an successful search is

1n

n∑i=1

(1 +n∑

j=i+11m )

= 1 + 1nm (

n∑j=i+1

1)

= 1 + 1nm (

n∑i=1

n − i)

= 1 + 1nm (n2 −

n∑i=1

i)

= 1 + 1nm (n2 − n(n+1)

2 )

= 1 + n2m − n

2nm

= 1 + α − α2n

Average cost for a successful search is 2 + α − α2n = Θ(1 + α)

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 20 / 34

Page 82: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Chaining

Collision Resolution by Chaining: Successful SearchQ : What is the average cost of a successful search?

Average number of elements examined for an successful search is

1n

n∑i=1

(1 +n∑

j=i+11m ) = 1 + 1

nm (n∑

j=i+11)

= 1 + 1nm (

n∑i=1

n − i)

= 1 + 1nm (n2 −

n∑i=1

i)

= 1 + 1nm (n2 − n(n+1)

2 )

= 1 + n2m − n

2nm

= 1 + α − α2n

Average cost for a successful search is 2 + α − α2n = Θ(1 + α)

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 20 / 34

Page 83: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Chaining

Collision Resolution by Chaining: Successful SearchQ : What is the average cost of a successful search?

Average number of elements examined for an successful search is

1n

n∑i=1

(1 +n∑

j=i+11m ) = 1 + 1

nm (n∑

j=i+11)

= 1 + 1nm (

n∑i=1

n − i)

= 1 + 1nm (n2 −

n∑i=1

i)

= 1 + 1nm (n2 − n(n+1)

2 )

= 1 + n2m − n

2nm

= 1 + α − α2n

Average cost for a successful search is 2 + α − α2n = Θ(1 + α)

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 20 / 34

Page 84: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Chaining

Collision Resolution by Chaining: Successful SearchQ : What is the average cost of a successful search?

Average number of elements examined for an successful search is

1n

n∑i=1

(1 +n∑

j=i+11m ) = 1 + 1

nm (n∑

j=i+11)

= 1 + 1nm (

n∑i=1

n − i)

= 1 + 1nm (n2 −

n∑i=1

i)

= 1 + 1nm (n2 − n(n+1)

2 )

= 1 + n2m − n

2nm

= 1 + α − α2n

Average cost for a successful search is 2 + α − α2n = Θ(1 + α)

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 20 / 34

Page 85: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Chaining

Collision Resolution by Chaining: Successful SearchQ : What is the average cost of a successful search?

Average number of elements examined for an successful search is

1n

n∑i=1

(1 +n∑

j=i+11m ) = 1 + 1

nm (n∑

j=i+11)

= 1 + 1nm (

n∑i=1

n − i)

= 1 + 1nm (n2 −

n∑i=1

i)

= 1 + 1nm (n2 − n(n+1)

2 )

= 1 + n2m − n

2nm

= 1 + α − α2n

Average cost for a successful search is 2 + α − α2n = Θ(1 + α)

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 20 / 34

Page 86: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Chaining

Collision Resolution by Chaining: Successful SearchQ : What is the average cost of a successful search?

Average number of elements examined for an successful search is

1n

n∑i=1

(1 +n∑

j=i+11m ) = 1 + 1

nm (n∑

j=i+11)

= 1 + 1nm (

n∑i=1

n − i)

= 1 + 1nm (n2 −

n∑i=1

i)

= 1 + 1nm (n2 − n(n+1)

2 )

= 1 + n2m − n

2nm

= 1 + α − α2n

Average cost for a successful search is 2 + α − α2n = Θ(1 + α)

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 20 / 34

Page 87: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Chaining

Collision Resolution by Chaining: Example

JAVA 7 HashMap

1 static int hash( int h) {2 h^= (h>>>20)^(h>>>12);3 return h^(h>>>7)^(h>>>4);4 }

In Java 7, after calculating hash from hashfunction, if more then one element has samehash than they are searched by linear search, soit’s complexity is O(n).In Java 8, that search is performed by binarysearch so the complexity will become log n.

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 21 / 34

Page 88: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Open Addressing

All elements are stored in the hash table, no linked list isused. So, α ≤ 1.

Collision is settled by “rehashing”:I A function used to get a new hashing address for each

collided addressI The hash table slots are probed successively, until a valid

location is found.The probing sequence can be seen as a permutation of(0, 1, 2, · · · , m − 1)→⟨h(k, 0), h(k, 1), · · · , h(k, m − 1)⟩

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 22 / 34

Page 89: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Open Addressing

All elements are stored in the hash table, no linked list isused. So, α ≤ 1.Collision is settled by “rehashing”:

I A function used to get a new hashing address for eachcollided address

I The hash table slots are probed successively, until a validlocation is found.

The probing sequence can be seen as a permutation of(0, 1, 2, · · · , m − 1)→⟨h(k, 0), h(k, 1), · · · , h(k, m − 1)⟩

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 22 / 34

Page 90: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Open Addressing

All elements are stored in the hash table, no linked list isused. So, α ≤ 1.Collision is settled by “rehashing”:I A function used to get a new hashing address for each

collided address

I The hash table slots are probed successively, until a validlocation is found.

The probing sequence can be seen as a permutation of(0, 1, 2, · · · , m − 1)→⟨h(k, 0), h(k, 1), · · · , h(k, m − 1)⟩

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 22 / 34

Page 91: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Open Addressing

All elements are stored in the hash table, no linked list isused. So, α ≤ 1.Collision is settled by “rehashing”:I A function used to get a new hashing address for each

collided addressI The hash table slots are probed successively, until a valid

location is found.

The probing sequence can be seen as a permutation of(0, 1, 2, · · · , m − 1)→⟨h(k, 0), h(k, 1), · · · , h(k, m − 1)⟩

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 22 / 34

Page 92: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Open Addressing

All elements are stored in the hash table, no linked list isused. So, α ≤ 1.Collision is settled by “rehashing”:I A function used to get a new hashing address for each

collided addressI The hash table slots are probed successively, until a valid

location is found.The probing sequence can be seen as a permutation of(0, 1, 2, · · · , m − 1)→⟨h(k, 0), h(k, 1), · · · , h(k, m − 1)⟩

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 22 / 34

Page 93: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Open Addressing: Commonly Used Probings

Linear Probing: (primary clustering may occur)Given an auxiliary hash function h′, the hash function is:

h(k, i) = (h′(k) + i) mod m, (i = 0, 1, ..., m − 1)

Quadratic Probing: (secondary clustering may occur)Given auxiliary function h′ and nonzero auxiliary constant c1 and c2, thehash function is:

h(k, i) = (h′(k) + c1i + c2i2) mod m, (i = 0, 1, ..., m − 1)

Double Hashing:Given auxiliary functions h1 and h2, the hash function is:

h(k, i) = (h1(k) + ih2(k)) mod m, (i = 0, 1, ..., m − 1)

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 23 / 34

Page 94: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Open Addressing: Commonly Used Probings

Linear Probing: (primary clustering may occur)Given an auxiliary hash function h′, the hash function is:

h(k, i) = (h′(k) + i) mod m, (i = 0, 1, ..., m − 1)

Quadratic Probing: (secondary clustering may occur)Given auxiliary function h′ and nonzero auxiliary constant c1 and c2, thehash function is:

h(k, i) = (h′(k) + c1i + c2i2) mod m, (i = 0, 1, ..., m − 1)

Double Hashing:Given auxiliary functions h1 and h2, the hash function is:

h(k, i) = (h1(k) + ih2(k)) mod m, (i = 0, 1, ..., m − 1)

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 23 / 34

Page 95: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Open Addressing: Commonly Used Probings

Linear Probing: (primary clustering may occur)Given an auxiliary hash function h′, the hash function is:

h(k, i) = (h′(k) + i) mod m, (i = 0, 1, ..., m − 1)

Quadratic Probing: (secondary clustering may occur)Given auxiliary function h′ and nonzero auxiliary constant c1 and c2, thehash function is:

h(k, i) = (h′(k) + c1i + c2i2) mod m, (i = 0, 1, ..., m − 1)

Double Hashing:Given auxiliary functions h1 and h2, the hash function is:

h(k, i) = (h1(k) + ih2(k)) mod m, (i = 0, 1, ..., m − 1)MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 23 / 34

Page 96: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Linear Probing: an Example

0 17761

1776

2

1776

3

1776

4

1776

5

1776

6

1776

7

1776

10551492

1918

Hasing function: h(x) = 5x mod 8

Rehasing function: rh(j) = (j + 1) mod 8

1812hashing

1812rehashing

1812

1945

hashing

1945rehashing

1945

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 24 / 34

Page 97: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Linear Probing: an Example

0 17761

1776

2

1776

3

1776

4

1776

5

1776

6

1776

7

1776

10551492

1918

Hasing function: h(x) = 5x mod 8

Rehasing function: rh(j) = (j + 1) mod 8

1812

hashing

1812rehashing

1812

1945

hashing

1945rehashing

1945

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 24 / 34

Page 98: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Linear Probing: an Example

0 17761

1776

2

1776

3

1776

4

1776

5

1776

6

1776

7

1776

10551492

1918

Hasing function: h(x) = 5x mod 8

Rehasing function: rh(j) = (j + 1) mod 8

1812hashing

1812rehashing

1812

1945

hashing

1945rehashing

1945

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 24 / 34

Page 99: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Linear Probing: an Example

0 17761

1776

2

1776

3

1776

4

1776

5

1776

6

1776

7

1776

10551492

1918

Hasing function: h(x) = 5x mod 8

Rehasing function: rh(j) = (j + 1) mod 8

1812hashing

1812rehashing

1812

1945

hashing

1945rehashing

1945

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 24 / 34

Page 100: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Linear Probing: an Example

0 17761

1776

2

1776

3

1776

4

1776

5

1776

6

1776

7

1776

10551492

1918

Hasing function: h(x) = 5x mod 8

Rehasing function: rh(j) = (j + 1) mod 8

1812hashing

1812rehashing

1812

1945

hashing

1945rehashing

1945

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 24 / 34

Page 101: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Linear Probing: an Example

0 17761

1776

2

1776

3

1776

4

1776

5

1776

6

1776

7

1776

10551492

1918

Hasing function: h(x) = 5x mod 8

Rehasing function: rh(j) = (j + 1) mod 8

1812hashing

1812rehashing

1812

1945

hashing

1945rehashing

1945

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 24 / 34

Page 102: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Linear Probing: an Example

0 17761

1776

2

1776

3

1776

4

1776

5

1776

6

1776

7

1776

10551492

1918

Hasing function: h(x) = 5x mod 8

Rehasing function: rh(j) = (j + 1) mod 8

1812hashing

1812rehashing

1812

1945

hashing

1945rehashing

1945

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 24 / 34

Page 103: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Linear Probing: an Example

0 17761

1776

2

1776

3

1776

4

1776

5

1776

6

1776

7

1776

10551492

1918

Hasing function: h(x) = 5x mod 8

Rehasing function: rh(j) = (j + 1) mod 8

1812hashing

1812rehashing

1812

1945

hashing

1945

rehashing1945

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 24 / 34

Page 104: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Linear Probing: an Example

0 17761

1776

2

1776

3

1776

4

1776

5

1776

6

1776

7

1776

10551492

1918

Hasing function: h(x) = 5x mod 8

Rehasing function: rh(j) = (j + 1) mod 8

1812hashing

1812rehashing

1812

1945

hashing

1945rehashing

1945

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 24 / 34

Page 105: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Linear Probing: an Example

0 17761

1776

2

1776

3

1776

4

1776

5

1776

6

1776

7

1776

10551492

1918

Hasing function: h(x) = 5x mod 8

Rehasing function: rh(j) = (j + 1) mod 8

1812hashing

1812rehashing

1812

1945

hashing

1945rehashing

1945

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 24 / 34

Page 106: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Open Addressing: Commonly Used Probings

Q : How to evaluate the goodness of a probing?

AssumptionEach key is equally likely to have any of the m! permutationsof (1, 2, · · · , m − 1) as its probe sequence.

Both linear and quadratic probing have only m distinct probesequences, as determined by the first probe.

h(k, i) = (h′(k) + i) mod m, (i = 0, 1, ..., m − 1)h(k, i) = (h′(k) + c1i + c2i2) mod m, (i = 0, 1, ..., m − 1)

Double hashing improves over linear or quadratic probing in thatΘ(m2) probe sequences are used.

h(k, i) = (h1(k) + ih2(k)) mod m, (i = 0, 1, ..., m − 1)

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 25 / 34

Page 107: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Open Addressing: Commonly Used Probings

Q : How to evaluate the goodness of a probing?

AssumptionEach key is equally likely to have any of the m! permutationsof (1, 2, · · · , m − 1) as its probe sequence.

Both linear and quadratic probing have only m distinct probesequences, as determined by the first probe.

h(k, i) = (h′(k) + i) mod m, (i = 0, 1, ..., m − 1)h(k, i) = (h′(k) + c1i + c2i2) mod m, (i = 0, 1, ..., m − 1)

Double hashing improves over linear or quadratic probing in thatΘ(m2) probe sequences are used.

h(k, i) = (h1(k) + ih2(k)) mod m, (i = 0, 1, ..., m − 1)

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 25 / 34

Page 108: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Open Addressing: Commonly Used Probings

Q : How to evaluate the goodness of a probing?

AssumptionEach key is equally likely to have any of the m! permutationsof (1, 2, · · · , m − 1) as its probe sequence.

Both linear and quadratic probing have only m distinct probesequences, as determined by the first probe.

h(k, i) = (h′(k) + i) mod m, (i = 0, 1, ..., m − 1)h(k, i) = (h′(k) + c1i + c2i2) mod m, (i = 0, 1, ..., m − 1)

Double hashing improves over linear or quadratic probing in thatΘ(m2) probe sequences are used.

h(k, i) = (h1(k) + ih2(k)) mod m, (i = 0, 1, ..., m − 1)

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 25 / 34

Page 109: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Open Addressing: Commonly Used Probings

Q : How to evaluate the goodness of a probing?

AssumptionEach key is equally likely to have any of the m! permutationsof (1, 2, · · · , m − 1) as its probe sequence.

Both linear and quadratic probing have only m distinct probesequences, as determined by the first probe.

h(k, i) = (h′(k) + i) mod m, (i = 0, 1, ..., m − 1)h(k, i) = (h′(k) + c1i + c2i2) mod m, (i = 0, 1, ..., m − 1)

Double hashing improves over linear or quadratic probing in thatΘ(m2) probe sequences are used.

h(k, i) = (h1(k) + ih2(k)) mod m, (i = 0, 1, ..., m − 1)

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 25 / 34

Page 110: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Open Addressing: Deleting Element

Q : Why Open Addressing is not suitable for situations where itemsmight be deleted?

The probing chain might be broken if an item is deleted.How to deal with it?

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 26 / 34

Page 111: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Open Addressing: Deleting Element

Q : Why Open Addressing is not suitable for situations where itemsmight be deleted?

The probing chain might be broken if an item is deleted.

How to deal with it?

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 26 / 34

Page 112: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Open Addressing: Deleting Element

Q : Why Open Addressing is not suitable for situations where itemsmight be deleted?

The probing chain might be broken if an item is deleted.How to deal with it?

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 26 / 34

Page 113: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Open Addressing: Unsuccessful SearchQ : Assuming uniform hashing, what is the average number of probesin an unsuccessful search?

X : the number of probes made in an unsuccessful search

E (X ) =∞∑

i=0i · Pr(X = i)

=∞∑

i=0i · (Pr(X ≥ i) − Pr(X ≥ i + 1))

=∞∑

i=1Pr(X ≥ i)

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 27 / 34

Page 114: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Open Addressing: Unsuccessful SearchQ : Assuming uniform hashing, what is the average number of probesin an unsuccessful search?

X : the number of probes made in an unsuccessful search

E (X ) =∞∑

i=0i · Pr(X = i)

=∞∑

i=0i · (Pr(X ≥ i) − Pr(X ≥ i + 1))

=∞∑

i=1Pr(X ≥ i)

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 27 / 34

Page 115: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Open Addressing: Unsuccessful SearchQ : Assuming uniform hashing, what is the average number of probesin an unsuccessful search?

X : the number of probes made in an unsuccessful search

E (X ) =∞∑

i=0i · Pr(X = i)

=∞∑

i=0i · (Pr(X ≥ i) − Pr(X ≥ i + 1))

=∞∑

i=1Pr(X ≥ i)

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 27 / 34

Page 116: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Open Addressing: Unsuccessful SearchQ : Assuming uniform hashing, what is the average number of probesin an unsuccessful search?

X : the number of probes made in an unsuccessful search

E (X ) =∞∑

i=0i · Pr(X = i)

=∞∑

i=0i · (Pr(X ≥ i) − Pr(X ≥ i + 1))

=∞∑

i=1Pr(X ≥ i)

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 27 / 34

Page 117: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Open Addressing: Unsuccessful Search

Q : How to compute Pr(X ≥ i)?

Ai : the ith probe occurs at an occupied slotX ≥ i = A1 ∩ A2 ∩ · · · ∩ Ai−1

Pr(X ≥ i) = Pr(A1 ∩ A2 ∩ · · · ∩ Ai−1)= Pr(A1) · Pr(A2|A1) · Pr(A3|A1 ∩ A2) · · · Pr(Ai−1|A1 ∩ A2 ∩ · · · ∩ Ai−2)

= nm · n−1

m−1 · · · n−i+2m−i+2

≤ ( nm )i−1

= αi−1

Then, E (x) =∞∑

i=1Pr(X ≥ i) ≤

∞∑i=1

αi−1 =≤∞∑

i=0αi = 1

1−α

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 28 / 34

Page 118: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Open Addressing: Unsuccessful Search

Q : How to compute Pr(X ≥ i)?Ai : the ith probe occurs at an occupied slot

X ≥ i = A1 ∩ A2 ∩ · · · ∩ Ai−1

Pr(X ≥ i) = Pr(A1 ∩ A2 ∩ · · · ∩ Ai−1)= Pr(A1) · Pr(A2|A1) · Pr(A3|A1 ∩ A2) · · · Pr(Ai−1|A1 ∩ A2 ∩ · · · ∩ Ai−2)

= nm · n−1

m−1 · · · n−i+2m−i+2

≤ ( nm )i−1

= αi−1

Then, E (x) =∞∑

i=1Pr(X ≥ i) ≤

∞∑i=1

αi−1 =≤∞∑

i=0αi = 1

1−α

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 28 / 34

Page 119: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Open Addressing: Unsuccessful Search

Q : How to compute Pr(X ≥ i)?Ai : the ith probe occurs at an occupied slotX ≥ i = A1 ∩ A2 ∩ · · · ∩ Ai−1

Pr(X ≥ i) = Pr(A1 ∩ A2 ∩ · · · ∩ Ai−1)= Pr(A1) · Pr(A2|A1) · Pr(A3|A1 ∩ A2) · · · Pr(Ai−1|A1 ∩ A2 ∩ · · · ∩ Ai−2)

= nm · n−1

m−1 · · · n−i+2m−i+2

≤ ( nm )i−1

= αi−1

Then, E (x) =∞∑

i=1Pr(X ≥ i) ≤

∞∑i=1

αi−1 =≤∞∑

i=0αi = 1

1−α

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 28 / 34

Page 120: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Open Addressing: Unsuccessful Search

Q : How to compute Pr(X ≥ i)?Ai : the ith probe occurs at an occupied slotX ≥ i = A1 ∩ A2 ∩ · · · ∩ Ai−1

Pr(X ≥ i) = Pr(A1 ∩ A2 ∩ · · · ∩ Ai−1)= Pr(A1) · Pr(A2|A1) · Pr(A3|A1 ∩ A2) · · · Pr(Ai−1|A1 ∩ A2 ∩ · · · ∩ Ai−2)

= nm · n−1

m−1 · · · n−i+2m−i+2

≤ ( nm )i−1

= αi−1

Then, E (x) =∞∑

i=1Pr(X ≥ i) ≤

∞∑i=1

αi−1 =≤∞∑

i=0αi = 1

1−α

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 28 / 34

Page 121: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Open Addressing: Unsuccessful Search

Q : How to compute Pr(X ≥ i)?Ai : the ith probe occurs at an occupied slotX ≥ i = A1 ∩ A2 ∩ · · · ∩ Ai−1

Pr(X ≥ i) = Pr(A1 ∩ A2 ∩ · · · ∩ Ai−1)= Pr(A1) · Pr(A2|A1) · Pr(A3|A1 ∩ A2) · · · Pr(Ai−1|A1 ∩ A2 ∩ · · · ∩ Ai−2)

= nm · n−1

m−1 · · · n−i+2m−i+2

≤ ( nm )i−1

= αi−1

Then, E (x) =∞∑

i=1Pr(X ≥ i) ≤

∞∑i=1

αi−1 =≤∞∑

i=0αi = 1

1−α

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 28 / 34

Page 122: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Open Addressing: Inserting an Element

Q : What is the average cost for inserting an element into a tablewith load factor α?

Proof.An element is inserted only if there is room in the table, and thusα < 1.Inserting a key requires an unsuccessful search followed by placing thekey into the first empty slot found.Thus, the expected number of probes is at most 1/(1 − α)

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 29 / 34

Page 123: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Open Addressing: Inserting an Element

Q : What is the average cost for inserting an element into a tablewith load factor α?

Proof.An element is inserted only if there is room in the table, and thusα < 1.Inserting a key requires an unsuccessful search followed by placing thekey into the first empty slot found.Thus, the expected number of probes is at most 1/(1 − α)

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 29 / 34

Page 124: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Open Addressing: Insertion/Unsuccessful Search

11−α

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 30 / 34

Page 125: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Open Addressing: Successful Search

Q : What is the average number of probes in a successful search ofan element in a table with load factor α?

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 31 / 34

Page 126: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Open Addressing: Successful Search

Q : What is the average number of probes in a successful search ofan element in a table with load factor α?

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 31 / 34

Page 127: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Open Addressing: Successful SearchQ : what is the average number of probes in a successful search foran element in a table with load factor α?

Proof.

A search for a key k reproduces the same probe sequence as whenthe element with key k was inserted.If k is the (i+1)st key inserted into the table, the expected number ofprobes in a search for k is at most 1/(1 − i/m) = m/(m − i)The expected number of probes is:

1n

n−1∑i=0

mm−i = m

nn−1∑i=0

1m−i = 1

α

m∑k=m−n+1

1k

≤ 1α

∫ mm−n

1x dx = 1

α ln mm−n

= 1α ln 1

1−α

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 32 / 34

Page 128: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Open Addressing: Successful SearchQ : what is the average number of probes in a successful search foran element in a table with load factor α?

Proof.A search for a key k reproduces the same probe sequence as whenthe element with key k was inserted.

If k is the (i+1)st key inserted into the table, the expected number ofprobes in a search for k is at most 1/(1 − i/m) = m/(m − i)The expected number of probes is:

1n

n−1∑i=0

mm−i = m

nn−1∑i=0

1m−i = 1

α

m∑k=m−n+1

1k

≤ 1α

∫ mm−n

1x dx = 1

α ln mm−n

= 1α ln 1

1−α

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 32 / 34

Page 129: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Open Addressing: Successful SearchQ : what is the average number of probes in a successful search foran element in a table with load factor α?

Proof.A search for a key k reproduces the same probe sequence as whenthe element with key k was inserted.If k is the (i+1)st key inserted into the table, the expected number ofprobes in a search for k is at most 1/(1 − i/m) = m/(m − i)

The expected number of probes is:1n

n−1∑i=0

mm−i = m

nn−1∑i=0

1m−i = 1

α

m∑k=m−n+1

1k

≤ 1α

∫ mm−n

1x dx = 1

α ln mm−n

= 1α ln 1

1−α

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 32 / 34

Page 130: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Open Addressing: Successful SearchQ : what is the average number of probes in a successful search foran element in a table with load factor α?

Proof.A search for a key k reproduces the same probe sequence as whenthe element with key k was inserted.If k is the (i+1)st key inserted into the table, the expected number ofprobes in a search for k is at most 1/(1 − i/m) = m/(m − i)The expected number of probes is:

1n

n−1∑i=0

mm−i = m

nn−1∑i=0

1m−i = 1

α

m∑k=m−n+1

1k

≤ 1α

∫ mm−n

1x dx = 1

α ln mm−n

= 1α ln 1

1−α

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 32 / 34

Page 131: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Open Addressing: Successful Search

1α ln 1

1−α

For your reference:α = 0.5: 1.387α = 0.9: 2.559

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 33 / 34

Page 132: 2-12 Hashing MA Juncslabcms.nju.edu.cn/.../f/f7/2019-2-12_Hashing.pdf · Hashing: the idea Q: When should we use hash table? What situations is the hash table suitable for? Key Space

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Collision Resolution Open Addressing

Thank You!Questions?

Office [email protected]

MA Jun (Institute of Computer Software) Problem Solving May 14, 2020 34 / 34