1 Hashing by Adlane Habed School of Computer Science University of Windsor May 6, 2005.


Page 1: Hashing

by Adlane Habed
School of Computer Science
University of Windsor
May 6, 2005

Page 2: Review

Page 3: Review

- Arrays, lists, queues, stacks and trees are used to store and retrieve records.
- Each record has a key value:

Student #: 999999999
Name: Adelson-Velskii
Grade: A+
Other information: avl

Page 4: …Review

Binary search: key = 13
1 3 5 7 9 11 13 15 17 19 21 (3 comparisons)

Sequential search: key = 13
1 3 5 7 9 11 13 15 17 19 21 (7 comparisons)
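The two comparison counts can be reproduced with a short sketch. The function names are illustrative, and the binary search counts one comparison per element examined, which is an assumption about how the slide counts.

```python
def sequential_search(items, key):
    """Return (index, comparisons); index is -1 if the key is absent."""
    comparisons = 0
    for index, value in enumerate(items):
        comparisons += 1
        if value == key:
            return index, comparisons
    return -1, comparisons


def binary_search(items, key):
    """Return (index, probes) for a sorted list; one probe per element examined."""
    low, high = 0, len(items) - 1
    probes = 0
    while low <= high:
        mid = (low + high) // 2
        probes += 1
        if items[mid] == key:
            return mid, probes
        if items[mid] < key:
            low = mid + 1
        else:
            high = mid - 1
    return -1, probes


data = [1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21]
print(sequential_search(data, 13))  # (6, 7)
print(binary_search(data, 13))      # (6, 3)
```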

Page 5: …Review

Retrieve key = 13 in a balanced Binary Search Tree:

Level 1: 15
Level 2: 7, 23
Level 3: 3, 11, 19, 27
Level 4: 1, 5, 9, 13, 17, 21, 25, 29

Search path: 15, 7, 11, 13 (4 comparisons)

Page 6: …Review

Data structure        O(log n)                  O(n)
Sorted array          search                    insert, delete
Sorted linked-list                              search, insert, delete
Balanced BST          search, insert, delete

Page 7: Agenda

- What is hashing?
- Hash functions
- Collision-resolution strategies
- Analysis
- Problems to think about

Page 8: What is hashing?

1. Basic idea
2. Definitions
3. Perfect hashing
4. Collisions
5. Open-addressing vs. chaining

Page 9: Basic idea

- A data structure that allows insertion, deletion and search in O(1) on average.
- A data structure that requires limited or no search in order to find a record.
- The location of the record is calculated from the value of its key.
- No order in the stored records.
- No findMin or findMax.

Page 10: …Basic idea

- Consider records with integer key values: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
- Create a table of 10 cells, with the index of each cell in the range [0..9].
- Each record is stored in the cell whose index corresponds to its key value.

Table indices: 0 1 2 3 4 5 6 7 8 9
Example: the record with key 2 is stored in cell 2, and the record with key 8 in cell 8.
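As a minimal sketch of this idea, the table can be a plain list indexed directly by the key. The names `TABLE_SIZE`, `insert` and `search` are illustrative, and the keys are assumed to be integers in 0..9.

```python
TABLE_SIZE = 10  # assumption: every key is an integer in the range 0..9


def make_table():
    """Create an empty table; None marks an unused cell."""
    return [None] * TABLE_SIZE


def insert(table, key, record):
    """Store the record in the cell whose index equals the key."""
    table[key] = record


def search(table, key):
    """Return the record stored under the key, or None if the cell is empty."""
    return table[key]


table = make_table()
insert(table, 2, "record with key 2")
insert(table, 8, "record with key 8")
print(search(table, 8))  # record with key 8
```

No search is needed: the key itself is the position, so each operation touches a single cell.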

Page 11: Definitions

Hashing
The process of accessing a record, stored in a table, by mapping the value of its key to a position in the table.

Hash function
A function that maps key values to table positions.

Hash table
The array where the records are stored.

Hash value
The value returned by the hash function. It usually corresponds to a position in the hash table.

Page 12: Perfect hashing

Hash function: H(key) = key

Hash table: cells 0 1 2 3 4 5 6 7 8 9

Key 8: H(8) = 8, so the record is stored in cell 8.
Key 2: H(2) = 2, so the record is stored in cell 2.

Page 13: …Perfect hashing

- Each key value maps to a different position in the table.
- All the keys need to be known before the table is created.
- Problem: what if the keys are neither contiguous nor in the range of the indices of the table?
- Solution: find a hash function that allows perfect hashing! Is this always possible?

Page 14: …Perfect hashing

- Example: a company has 100 employees. The Social Insurance Number (SIN) is used as the key for each record.
- Given a 9-digit SIN, should we create a table of 1,000,000,000 cells for only 100 employees?
- Knowing the SINs of all 100 employees in advance does not guarantee that a perfect hash function can be found.

Page 15: …Perfect hashing

The birthday paradox:

How many people need to be together in a room for it to be more likely than not that two of them have the same date of birth (month/day)?

Answer: only 23 people.
Hint: calculate p, the probability that no two persons have the same date of birth.
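The hint can be followed numerically. The sketch below assumes 365 equally likely birthdays (no leap day), which the slide does not state, and the function names are illustrative.

```python
def probability_all_distinct(people, days=365):
    """Probability p that no two of `people` persons share a birthday."""
    p = 1.0
    for i in range(people):
        p *= (days - i) / days
    return p


def smallest_group(days=365):
    """Smallest group size for which a shared birthday is more likely than not."""
    n = 1
    while 1.0 - probability_all_distinct(n, days) <= 0.5:
        n += 1
    return n


print(smallest_group())  # 23
```

For hashing, the same reasoning suggests that collisions become likely long before the table is full.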

Page 16: …Perfect hashing

- Hash functions that allow perfect hashing are so rare that it is worth looking for them only in special circumstances.
- In addition, the collection of records is often not known in advance.

Page 17: Collisions

- What if we cannot find a perfect hash function?
- Collision: more than one key maps to the same location in the table!
- Can we avoid collisions? No, except in the case of perfect hashing (rare).
- Solution: select a “good” hash function and use a collision-resolution strategy.

Page 18: …Collisions

- Example: the keys are integers and the hash function is hashValue = key mod tableSize
- If tableSize = 10, all records whose keys have the same rightmost digit have the same hash value.
- Insert 13 and 23: both hash to position 3, so they collide.

Table indices: 0 1 2 3 4 5 6 7 8 9

Page 19: Open-addressing vs. chaining

- Open-addressing: the records are stored directly in the table. Collisions are dealt with using collision-resolution strategies.
- Chaining: each cell of the hash table points to a linked list.

Page 20: …Chaining

H(key) = key mod tableSize

Table indices: 0 1 2 3 4 5 6 7 8 9

Insert 13: stored in the list at cell 3.
Insert 23: collides with 13 and joins the list at cell 3.
Insert 18: stored in the list at cell 8.

Collision is resolved by inserting the elements in a linked list.
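A minimal sketch of chaining follows. It uses a Python list per cell in place of a linked list, which is a simplification, and the class and method names are illustrative.

```python
class ChainedHashTable:
    """Hash table with chaining; each cell holds a list that stands in for a linked list."""

    def __init__(self, table_size=10):
        self.table_size = table_size
        self.cells = [[] for _ in range(table_size)]

    def _hash(self, key):
        # H(key) = key mod tableSize
        return key % self.table_size

    def insert(self, key):
        chain = self.cells[self._hash(key)]
        if key not in chain:  # ignore duplicate keys
            chain.append(key)

    def contains(self, key):
        return key in self.cells[self._hash(key)]


table = ChainedHashTable()
for key in (13, 23, 18):
    table.insert(key)
print(table.cells[3], table.cells[8])  # [13, 23] [18]
```

Appending to the end of the chain is an arbitrary choice here; inserting at the head works equally well.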

Page 21: Hash functions

1. Hash functions
2. Division
3. Digits selection
4. Mid-square
5. Folding
6. String keys

Page 22: Hash functions

- Can we have a hash function that avoids collisions?
- Collisions are nearly unavoidable! If we are careful when selecting the hash function, the number of collisions will be small.
- Exception: the hash function is selected for a specific set of records (perfect hashing).

Page 23: …Hash functions

A poor hash function:
- Maps keys non-uniformly into table locations, or maps a set of contiguous keys into clusters.

An ideal hash function:
- Maps keys uniformly and randomly onto the entire range of table locations.
- Each location is equally likely to be used for a randomly chosen key.
- Fast computation.

Page 24: Hash functions: division

Division:
H(key) = key mod tableSize
0 ≤ key mod tableSize ≤ tableSize-1

Empirical studies have shown that this function gives very good results.

Page 25: …division

- Assume H(key) = key mod tableSize
- All keys such that key mod tableSize = 0 map into position 0 in the table.
- All keys such that key mod tableSize = 1 map into position 1 in the table.
- This phenomenon is unavoidable for positions 0 and 1: we wish to avoid this phenomenon when possible.

Page 26: …division

- Assume tableSize = 25
- All keys that are multiples of 5 will map into positions 0, 5, 10, 15 and 20 in the table!
- Why? Because key and tableSize have 5 as a common factor:
  There exists an integer m such that key = m×5
  Therefore, key mod 25 = 5×(m mod 5) is a multiple of 5

Page 27: …division

- Choose tableSize as a prime number.
- Example: tableSize = 29 (a prime number)
  5 mod 29 = 5, 10 mod 29 = 10, 15 mod 29 = 15, 20 mod 29 = 20, 25 mod 29 = 25, 30 mod 29 = 1, 35 mod 29 = 6, 40 mod 29 = 11, …
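The effect of the table size can be checked with a small sketch. The helper name `positions_used` and the choice of keys (multiples of 5 below 1000) are illustrative.

```python
def positions_used(keys, table_size):
    """Sorted list of distinct table positions hit by H(key) = key mod tableSize."""
    return sorted({key % table_size for key in keys})


multiples_of_5 = range(0, 1000, 5)
print(positions_used(multiples_of_5, 25))       # [0, 5, 10, 15, 20]
print(len(positions_used(multiples_of_5, 29)))  # 29
```

With tableSize = 25 only five cells are ever used, while with the prime size 29 the same keys spread over every cell.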

Page 28: Hash functions: digit selection

- Digit(s) selection: key = d1 d2 d3 d4 d5 d6 d7 d8 d9
- If the collection of records is known, how to choose the digit(s)?
- Analysis of the occurrence of each digit.

Page 29: Digit selection: analysis

Assume 10 records are to be stored:

[Figure: two bar charts, "d5 distribution" and "d1 distribution". Horizontal axis: digit value (0 to 9); vertical axis: occurrence (0 to 10).]

Page 30: …Digit selection: analysis

Assume 100 records are to be stored:

[Figure: two bar charts, "d1 distribution" (non-uniform distribution) and "d5 distribution" (uniform distribution). Horizontal axis: digit value (0 to 9); vertical axis: occurrence (0 to 100).]

Page 31: …Digit selection: analysis

- Consider the hash function: H(d1 d2 d3 d4 d5 d6 d7 d8 d9) = d5 d7
- d5 and d7 are uniformly distributed
- …but d5 = 3 and d7 = 8 appear very often together!
- 38 is the only position used in the range 30...39, increasing the chances of collisions.
- Analysis of correlation is required.

Page 32: Hash functions: mid-square

Mid-square: consider key = d1 d2 d3 d4 d5

        d1 d2 d3 d4 d5
      × d1 d2 d3 d4 d5
  ------------------------------
  r1 r2 r3 r4 r5 r6 r7 r8 r9 r10

Select middle digits, for example r4 r5 r6

Why the middle digits and not the leftmost or rightmost digits?

Page 33: Mid-square: example

       54321
     × 54321
  ----------
       54321
     108642
    162963
   217284
  271605
  ----------
  2950771041

Only the digits 321 contribute to the 3 rightmost digits (041) of the multiplication result.
A similar remark applies to the leftmost digits.
All key digits contribute to the middle digits of the multiplication result.
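The mid-square step can be sketched as follows. The function name and its parameters (`start`, `length`, `width`) are illustrative; the defaults pick r4 r5 r6 from a 10-digit square, as on the previous slide.

```python
def mid_square(key, start=3, length=3, width=10):
    """Square the key, pad the result to `width` digits, and keep `length` middle digits.

    start=3 and length=3 select r4 r5 r6 (0-based positions 3..5).
    """
    squared = str(key * key).zfill(width)
    return int(squared[start:start + length])


print(54321 * 54321)      # 2950771041
print(mid_square(54321))  # 77
```

The selected digits are 0, 7, 7, so the hash value is 77; a real table would still reduce it to the table's index range.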

Page 34: Hash functions: folding

- Folding: consider key = d1 d2 d3 d4 d5
- Combine portions of the key to form a smaller result.
- In general, folding is used in conjunction with other functions.
- Example: H(key) = d1 + d2 + d3 + d4 + d5 ≤ 45
  or, H(key) = d1 + d2d3 + d4d5 ≤ 207

Page 35: Folding: example

- Consider a computer with 16-bit registers, i.e. integers < 2^16 = 65536
- Assume the 9-digit SIN is used as a key.
- The SIN requires folding before it is used: d1 + d2d3d4d5 + d6d7d8d9 ≤ 20007
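The folding of a 9-digit SIN can be sketched like this. The function name `fold_sin` is illustrative, and the SIN is assumed to be given as a non-negative integer of at most 9 digits.

```python
def fold_sin(sin):
    """Fold a 9-digit SIN as d1 + d2d3d4d5 + d6d7d8d9."""
    digits = f"{sin:09d}"  # keep leading zeros so there are always 9 digits
    return int(digits[0]) + int(digits[1:5]) + int(digits[5:9])


print(fold_sin(999999999))  # 20007
print(fold_sin(123456789))  # 9135
```

The largest possible result, 9 + 9999 + 9999 = 20007, fits comfortably below 65536.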

Page 36: The key is a string

- When the key is a string, the ASCII code of each character in the string is considered.
- The ASCII code is an integer value in the range 0…127.
- String to decimal conversion: consider key = “data”
  hashValue = (‘a’ + ‘t’×128 + ‘a’×128^2 + ‘d’×128^3) mod tableSize

Page 37: …The key is a string

- This method generates huge numbers that the machine might not store correctly.
- Goal: reduce the number of arithmetic operations and generate relatively small numbers.

hashValue = ‘d’ mod tableSize
hashValue = (hashValue×128 + ‘a’) mod tableSize
hashValue = (hashValue×128 + ‘t’) mod tableSize
hashValue = (hashValue×128 + ‘a’) mod tableSize
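Both ways of hashing a string can be compared in a short sketch. The function names are illustrative, and the keys are assumed to contain only ASCII characters. Python integers do not overflow, so the direct version works here, but the step-by-step version keeps every intermediate value small.

```python
def hash_string_direct(key, table_size):
    """Sum of ASCII codes weighted by powers of 128, reduced once at the end."""
    total = 0
    for position, char in enumerate(reversed(key)):
        total += ord(char) * 128 ** position
    return total % table_size


def hash_string_stepwise(key, table_size):
    """Same value, computed one character at a time with a mod at every step."""
    hash_value = 0
    for char in key:
        hash_value = (hash_value * 128 + ord(char)) % table_size
    return hash_value


print(hash_string_direct("data", 29) == hash_string_stepwise("data", 29))  # True
```

Starting from 0, the first step yields ‘d’ mod tableSize, which matches the first line of the slide.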

Page 38: Collision-resolution strategies in open addressing

1. Linear probing: the problem of clustering
2. Quadratic probing

Page 39: Linear probing

If H(key) is already occupied:
- Search sequentially (wrapping around the table if necessary) until an empty position is found.

Example: H(key) = key mod tableSize, with tableSize = 10

Table indices: 0 1 2 3 4 5 6 7 8 9

Insert 89: position 9.
Insert 18: position 8.
Insert 49: position 9 is occupied; wraps around to position 0.
Insert 58: positions 8, 9 and 0 are occupied; stored at position 1.
Insert 9: positions 9, 0 and 1 are occupied; stored at position 2.

Page 40: …Linear probing

hashValue = H(key)
Probe table positions (hashValue + i) mod tableSize with i = 1, 2, …, tableSize-1
until an empty position is found in the table, or all positions have been checked.
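A minimal sketch of insertion with linear probing follows. The function name is illustrative, None marks an empty cell, and H(key) = key mod tableSize is assumed. The loop starts at i = 0 so that the home position is tried first.

```python
def linear_probe_insert(table, key):
    """Insert key using linear probing; return the position where it was stored."""
    table_size = len(table)
    hash_value = key % table_size
    for i in range(table_size):  # i = 0 is the home position
        position = (hash_value + i) % table_size
        if table[position] is None:
            table[position] = key
            return position
    raise OverflowError("hash table is full")


table = [None] * 10
positions = [linear_probe_insert(table, key) for key in (89, 18, 49, 58, 9)]
print(positions)  # [9, 8, 0, 1, 2]
```

The last three keys end up next to each other at positions 0, 1 and 2, even though they hash to positions 9, 8 and 9.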

Page 41: Primary clustering

- Linear probing causes many items to be stored in a few areas, creating clusters. This is known as primary clustering.
- Contiguous keys are mapped into contiguous table locations.
- Consequence: slow search even when the table’s load factor λ is small:
  λ = (number of occupied locations) / tableSize

Page 42: Quadratic probing

- Collision-resolution strategy that eliminates primary clustering.
- It works as follows:
  hashValue = H(key)
  if table[hashValue] is occupied, probe table positions
  (hashValue + i^2) mod tableSize, i = 1, 2, 3, ...
  until an empty position is found.

Page 43: …Quadratic probing

With H(key) = key mod tableSize and tableSize = 10:

Table indices: 0 1 2 3 4 5 6 7 8 9

Insert 89: position 9.
Insert 18: position 8.
Insert 49: position 9 is occupied; stored at (9 + 1) mod 10 = 0.
Insert 58: positions 8 and (8 + 1) mod 10 = 9 are occupied; stored at (8 + 4) mod 10 = 2.
Insert 9: positions 9 and (9 + 1) mod 10 = 0 are occupied; stored at (9 + 4) mod 10 = 3.

Quadratic probing creates spaces between the inserted elements hashing to the same position: it eliminates primary clustering.
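A minimal sketch of insertion with quadratic probing follows. The function name is illustrative, None marks an empty cell, and H(key) = key mod tableSize is assumed. A table of 10 cells is used only to match the example.

```python
def quadratic_probe_insert(table, key):
    """Insert key using quadratic probing; return the position where it was stored."""
    table_size = len(table)
    hash_value = key % table_size
    for i in range(table_size):  # i = 0 is the home position
        position = (hash_value + i * i) % table_size
        if table[position] is None:
            table[position] = key
            return position
    raise OverflowError("no empty cell found on the probe sequence")


table = [None] * 10
positions = [quadratic_probe_insert(table, key) for key in (89, 18, 49, 58, 9)]
print(positions)  # [9, 8, 0, 2, 3]
```

Unlike linear probing, the probe sequence does not visit every cell, so this sketch can fail to find an empty cell even when one exists.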

Page 44: …Quadratic probing

Very important result:
If quadratic probing is used, tableSize is prime and the table is at least half empty, then the insertion of a new element is guaranteed and no cell is probed twice.

Page 45: Analysis

Page 46: Analysis

We calculate the average number of comparisons needed for a successful search (S) and for an unsuccessful search (U) for a record, given the load factor of the table.

Page 47: Analysis

U = unsuccessful search
S = successful search
H is uniform

Linear probing:
U = (1 + 1/(1-λ)^2)/2
S = (1 + 1/(1-λ))/2

Quadratic probing:
U = 1/(1-λ)
S = -(1/λ) ln(1-λ)

Chaining:
U = λ
S = 1 + λ/2

Page 48: Comparison

Strategy             λ      U        S
Linear probing       0.1    1.11     1.05
                     0.5    2.50     1.5
                     0.9    50.5     5.5
Quadratic probing    0.1    1.11     1.05
                     0.5    2.00     1.38
                     0.9    10.00    2.55
Chaining             0.1    0.1      1.05
                     0.5    0.5      1.25
                     0.9    0.9      1.45
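The formulas for U and S can be evaluated for the three load factors with a short sketch. The function names are illustrative, and each function returns the pair (U, S).

```python
import math


def linear_probing(load):
    """(U, S) for linear probing at load factor `load`."""
    u = (1 + 1 / (1 - load) ** 2) / 2
    s = (1 + 1 / (1 - load)) / 2
    return u, s


def quadratic_probing(load):
    """(U, S) for quadratic probing at load factor `load`."""
    u = 1 / (1 - load)
    s = -(1 / load) * math.log(1 - load)
    return u, s


def chaining(load):
    """(U, S) for chaining at load factor `load`."""
    return load, 1 + load / 2


strategies = (("linear", linear_probing),
              ("quadratic", quadratic_probing),
              ("chaining", chaining))
for load in (0.1, 0.5, 0.9):
    for name, formula in strategies:
        u, s = formula(load)
        print(f"{name:9s} load={load}: U={u:.2f} S={s:.2f}")
```

The printed values are rounded to two decimals, so the last digit can differ slightly from the figures in the table.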

Page 49: Problems to think about

Page 50: Proofs

- Proof of the birthday paradox.
- In quadratic probing: pos_i = (H(key) + i^2) mod tableSize
  Show that: pos_i = (pos_(i-1) + 2i - 1) mod tableSize
  What is the advantage of this result?
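Before attempting the proof, the recurrence can be checked numerically. This sketch is a sanity check rather than a proof; the function names and the sample values (hash value 9, table size 29) are illustrative.

```python
def positions_direct(hash_value, table_size, count):
    """Probe positions computed as (H(key) + i^2) mod tableSize for i = 0..count-1."""
    return [(hash_value + i * i) % table_size for i in range(count)]


def positions_incremental(hash_value, table_size, count):
    """Same positions, each derived from the previous one by adding 2i - 1."""
    positions = [hash_value % table_size]
    for i in range(1, count):
        positions.append((positions[-1] + 2 * i - 1) % table_size)
    return positions


print(positions_direct(9, 29, 8) == positions_incremental(9, 29, 8))  # True
```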

Page 51: Implementation issues

- Implementation of hash tables.
- Deletion in the case of open-addressing.
- How to keep a table at least half empty?
- Empirical evaluation of different hash functions for a particular problem.
- Empirical evaluation of probing strategies for a particular problem.

Page 52: Other questions

- What is the relationship between the number of probes for an insertion and an unsuccessful search?