COSC 1030 Lecture 10

Page 1: COSC 1030 Lecture 10

COSC 1030 Lecture 10

Hash Table

Page 2: COSC 1030 Lecture 10

Topics

– Table
– Hash Concept
– Hash Function
– Resolve collision
– Complexity Analysis

Page 3: COSC 1030 Lecture 10

Table

Table
– A collection of entries
– Entry: <key, info>
– Insert, search and delete
– Update and retrieve

Array representation
– Indexed
– Maps key to index

Page 4: COSC 1030 Lecture 10

Hash Table

Hash Table
– A table
– Key range >> table size
– Many-to-one mapping (hashing)
– Indexed: hash code as index

Tabbed Address Book
– Maps names to tabs A:Z
– Multiple names start with the same letter
  Same tab, sequential slots

Page 5: COSC 1030 Lecture 10

Hash Table ADT

interface Hashtable {
    void insert(Item anItem);
    Item search(Key aKey);
    boolean remove(Key aKey);
    boolean isFull();
    boolean isEmpty();
}

Page 6: COSC 1030 Lecture 10

Hash Function

Maps keys to indices evenly
For any n in N,
    hash(n) = n mod M
where M is the size of the hash table.
    hash(k*M + n) = n, where 0 <= n < M and k is an integer
Map the key to an integer first if it is not an integer
– A:Z -> 0:25
– String s -> h(s[0]) + h(s[1])*26 + … + h(s[n-1])*26^(n-1)
– String s -> h(s[0])*26^(n-1) + … + h(s[n-1])

Page 7: COSC 1030 Lecture 10

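Hash Function

String s -> h(s[0])*26^(n-1) + … + h(s[n-1])

int toInt(String s) {
    assert(s != null);
    int c = 0;
    // Horner's rule: the first character ends up with the highest power of 26.
    for (int i = 0; i < s.length(); i++) {
        // toInt(char) is the letter mapping A:Z -> 0:25.
        c = c*26 + toInt(s.charAt(i));
    }
    return c; // may overflow int for long strings
}

int hash(String s) { return hash(toInt(s)); }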
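The slide's snippet relies on a `toInt(char)` overload and a `hash(int)` function that are not shown. A minimal self-contained sketch follows, assuming a table size of 7, a case-insensitive letter mapping, and a hypothetical class name `StringHashSketch`; none of these choices come from the lecture itself.

```java
// Sketch: Horner-style string-to-integer conversion followed by division hashing.
final class StringHashSketch {
    static final int M = 7; // assumed table size

    // Maps 'A'..'Z' (case-insensitive) to 0..25; other characters are rejected.
    static int toInt(char c) {
        char u = Character.toUpperCase(c);
        if (u < 'A' || u > 'Z') {
            throw new IllegalArgumentException("letter expected: " + c);
        }
        return u - 'A';
    }

    // h(s[0])*26^(n-1) + ... + h(s[n-1]), evaluated left to right.
    static int toInt(String s) {
        int c = 0;
        for (int i = 0; i < s.length(); i++) {
            c = c * 26 + toInt(s.charAt(i));
        }
        return c;
    }

    // floorMod keeps the index non-negative even if toInt overflowed.
    static int hash(int n) { return Math.floorMod(n, M); }

    static int hash(String s) { return hash(toInt(s)); }

    public static void main(String[] args) {
        System.out.println(toInt("BAD")); // 1*676 + 0*26 + 3 = 679
        System.out.println(hash("BAD"));  // 679 mod 7 = 0
    }
}
```

Using `Math.floorMod` rather than `%` is a defensive choice here: Java's `%` returns a negative remainder for a negative left operand, which an overflowed `toInt` result could produce.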

Page 8: COSC 1030 Lecture 10

Example

Table[7] – HASHTABLE_SIZE = 7
Insert ‘B2’, ‘H7’, ‘M12’, ‘D4’, ‘Z26’ into the table
– Hash codes: 2, 0, 5, 4, 5
Collision
– The slot indexed by the hash code is already occupied
A simple solution
– Sequentially decrease the index until an empty slot is found or the table is full
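The example can be replayed in a few lines. This sketch assumes that the numeric part of each label is its key (so ‘B2’ has key 2), which is what the listed hash codes suggest, and it uses -1 as a hypothetical marker for an empty slot.

```java
import java.util.Arrays;

// Sketch: insert keys at key % size, stepping the index down by one on collision.
final class LinearProbeExample {
    static int[] place(int[] keys, int size) {
        int[] table = new int[size];
        Arrays.fill(table, -1); // -1 marks an empty slot (keys assumed non-negative)
        for (int key : keys) {
            int index = key % size;
            int probes = 0;
            // Walk downwards with wrap-around until a free slot or a full pass.
            while (table[index] != -1 && probes < size) {
                index = (index - 1 + size) % size;
                probes++;
            }
            if (table[index] == -1) {
                table[index] = key; // a key is skipped only if the table is full
            }
        }
        return table;
    }

    public static void main(String[] args) {
        int[] keys = {2, 7, 12, 4, 26};
        System.out.println(Arrays.toString(place(keys, 7)));
        // prints [7, -1, 2, 26, 4, 12, -1]
    }
}
```

Key 26 hashes to slot 5, which 12 already holds; slot 4 holds 4, so 26 lands in slot 3.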

Page 9: COSC 1030 Lecture 10

Collision Possibility

How often may a collision occur?
Insert 100 random numbers into a table of 200 slots
    P(at least one collision) = 1 – Π (200 – i)/200, i = 0:99
                              = 1 – 6.66E-14 > 0.99999999999993
Load factor
– 100/200 = 0.5 = 50% -> 0.99999999999993
– 20/200 = 0.1 = 10% -> 0.63
– 10/200 = 0.05 = 5% -> 0.2
Default load factor is 75% in Java Hashtable
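The figures above can be checked numerically. The sketch below assumes keys hash uniformly and independently into the slots; the class and method names are illustrative only.

```java
// Sketch: probability of at least one collision when n keys go into m slots.
final class CollisionProbability {
    static double atLeastOneCollision(int n, int m) {
        double noCollision = 1.0;
        // The i-th insertion avoids the i slots already taken with probability (m - i)/m.
        for (int i = 0; i < n; i++) {
            noCollision *= (double) (m - i) / m;
        }
        return 1.0 - noCollision;
    }

    public static void main(String[] args) {
        for (int n : new int[] {100, 20, 10}) {
            System.out.printf("n=%d, load factor=%.2f, P(collision)=%.4f%n",
                    n, n / 200.0, atLeastOneCollision(n, 200));
        }
    }
}
```

Even at a 5% load factor roughly one run in five sees a collision, which is why a collision strategy is needed regardless of table size.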

Page 10: COSC 1030 Lecture 10

Primary Cluster

The biggest solid block in the hash table
Neighbouring clusters join as they grow
The bigger the primary cluster is, the more easily it grows
Distribute keys evenly to avoid a primary cluster

Page 11: COSC 1030 Lecture 10

Probe Method

What can we do when a collision occurs?
– A consistent way of searching for an empty slot
– Probe
Linear probe – decrease the index by 1, wrap around when it drops below 0
Double hash – use the quotient to calculate the decrement
– max(1, (key / M) % M)
Separate chaining – linked list to store colliding items
Hash tree – link to another hash table (A4)

Page 12: COSC 1030 Lecture 10

Probe sequence coverage

Ensure the probe sequence covers the whole table
– Utilizes the whole table
– Even distribution
– M and the probe decrement are relatively prime
  No common factor except 1
– Make M a prime number
  M and any decrement (< M) are relatively prime
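The coverage claim is easy to observe directly. This sketch counts the distinct slots a probe sequence reaches for a given table size and decrement; the names are illustrative, not from the lecture.

```java
// Sketch: how many distinct slots a fixed-decrement probe sequence visits.
final class ProbeCoverage {
    static int slotsVisited(int m, int start, int dec) {
        boolean[] seen = new boolean[m];
        int count = 0;
        int index = start;
        // Stop as soon as the sequence returns to a slot it has already probed.
        while (!seen[index]) {
            seen[index] = true;
            count++;
            index = Math.floorMod(index - dec, m);
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(slotsVisited(8, 3, 2)); // 4: only half the table, since 8 and 2 share the factor 2
        System.out.println(slotsVisited(7, 3, 2)); // 7: prime size, every slot is reached
    }
}
```

In general the sequence reaches M / gcd(M, dec) slots, so a prime M guarantees full coverage for every decrement from 1 to M-1.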

Page 13: COSC 1030 Lecture 10

Probe Method

void insert(Item item) {
    if (!isFull()) {
        int index = probe(item.key);
        assert(index >= 0 && index < M);
        table[index] = item;
        count++;
    }
}

Page 14: COSC 1030 Lecture 10

Linear Probe Method

int probe(int key) {
    int hashcode = key % HASHTABLE_SIZE;
    if (table[hashcode] == null) {
        return hashcode;
    } else {
        int index = hashcode;
        do {
            index--;
            if (index < 0) index += HASHTABLE_SIZE;
        } while (index != hashcode && table[index] != null);
        if (index == hashcode) return -1;
        else return index;
    }
}

Page 15: COSC 1030 Lecture 10

Double Hash Probe Method

int probe(int key) {
    int hashcode = key % HASHTABLE_SIZE;
    if (table[hashcode] == null) {
        return hashcode;
    } else {
        int index = hashcode;
        int dec = (key / HASHTABLE_SIZE) % HASHTABLE_SIZE;
        dec = Math.max(1, dec);
        do {
            index -= dec;
            if (index < 0) index += HASHTABLE_SIZE;
        } while (index != hashcode && table[index] != null);
        if (index == hashcode) return -1;
        else return index;
    }
}

Page 16: COSC 1030 Lecture 10

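Search Method

Item search(int key) {
    int hashcode = key % HASHTABLE_SIZE;
    int dec = Math.max(1, (key / HASHTABLE_SIZE) % HASHTABLE_SIZE);
    // Follow the same probe sequence as insert; stop after one full pass
    // so that a full table without the key cannot loop forever.
    for (int i = 0; i < HASHTABLE_SIZE && table[hashcode] != null; i++) {
        if (table[hashcode].key == key) return table[hashcode];
        hashcode -= dec;
        if (hashcode < 0) hashcode += HASHTABLE_SIZE; // wrap around as in probe()
    }
    return null; // reached an empty slot, or probed every slot
}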

Page 17: COSC 1030 Lecture 10

Delete Method

Difficulty with delete under open addressing
– Emptying a slot destroys the probe chain that passes through it
Solution
– Set a deleted flag
– Search treats a flagged slot as occupied
– Insert treats a flagged slot as free
– Flagged slots still form primary clusters
Separate chaining
– Move one up from the chained structure
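The deleted-flag idea can be sketched for an open-addressing table of non-negative int keys. The sentinel values `EMPTY` and `DELETED`, the class name, and the choice to store bare keys rather than items are all assumptions made to keep the example short; the probe step follows the double-hash decrement used earlier.

```java
import java.util.Arrays;

// Sketch: open addressing with double hashing and a "deleted" marker (tombstone).
final class TombstoneTable {
    private static final int EMPTY = -1;   // slot never used
    private static final int DELETED = -2; // slot whose key was removed
    private final int[] table;
    private final int m;

    TombstoneTable(int size) {
        m = size;
        table = new int[m];
        Arrays.fill(table, EMPTY);
    }

    private int step(int key) { return Math.max(1, (key / m) % m); }

    // Returns the slot holding key, or -1. Tombstones count as occupied,
    // so the probe chain continues past them.
    int find(int key) {
        int index = key % m;
        int dec = step(key);
        for (int probes = 0; probes < m && table[index] != EMPTY; probes++) {
            if (table[index] == key) return index;
            index = Math.floorMod(index - dec, m);
        }
        return -1;
    }

    // Stores key in the first empty or tombstoned slot on its probe sequence.
    boolean insert(int key) {
        if (find(key) >= 0) return true; // already present
        int index = key % m;
        int dec = step(key);
        for (int probes = 0; probes < m; probes++) {
            if (table[index] == EMPTY || table[index] == DELETED) {
                table[index] = key;
                return true;
            }
            index = Math.floorMod(index - dec, m);
        }
        return false; // no free slot on the probe sequence
    }

    // Marks the slot as deleted instead of emptying it.
    boolean remove(int key) {
        int index = find(key);
        if (index < 0) return false;
        table[index] = DELETED;
        return true;
    }
}
```

With a table of size 7, inserting 12 and then 26 puts 26 in slot 2 because slot 5 is taken. After removing 12, `find(26)` still succeeds because the tombstone in slot 5 keeps the chain intact, and a later insert of 19 reuses slot 5.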

Page 18: COSC 1030 Lecture 10

Efficiency

Successful search
– Best case – first hit, one comparison
– Average
  Half of the average length of the probe sequence
  Load factor dependent
  O(1) if load factor < 0.5
– Worst case – longest probe sequence
  Load factor dependent
Unsuccessful search
– Average – average length of the probe sequence
– Worst case – longest probe sequence

Page 19: COSC 1030 Lecture 10

Advanced Topics

Choosing Hash Functions
– Generate hash codes randomly and uniformly
– Use all bits of the key
– Assume K = b0b1b2b3
– Division
  h(k) = k % M; p(k) = max(1, (k / M) % M)
– Folding
  h(k) = (b1 ^ b3) % M; p(k) = (b0 ^ b2) % M; // ^ is XOR
– Middle squaring
  h(k) = (b1b2)^2 // here ^2 means squaring
– Truncating
  h(k) = b3;
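The four choices can be compared side by side under one reading of the notation. This sketch assumes that b0..b3 are the four bytes of a 32-bit key with b0 the most significant, that M = 101, and that the middle-square and truncation results are reduced mod M so they can index the table; the slide states none of these details.

```java
// Sketch: division, folding, middle squaring and truncation on a 32-bit key.
final class HashChoices {
    static final int M = 101; // assumed prime table size

    // Byte i of the key, b0 being the most significant (assumed reading of K = b0b1b2b3).
    static int b(int key, int i) {
        return (key >>> (8 * (3 - i))) & 0xFF;
    }

    static int division(int key) {
        return Math.floorMod(key, M);
    }

    // XOR two of the bytes together, then reduce to a table index.
    static int folding(int key) {
        return (b(key, 1) ^ b(key, 3)) % M;
    }

    // Square the 16-bit value formed by the two middle bytes.
    static int middleSquare(int key) {
        long mid = (b(key, 1) << 8) | b(key, 2);
        return (int) ((mid * mid) % M);
    }

    // Keep only the lowest byte.
    static int truncating(int key) {
        return b(key, 3) % M;
    }

    public static void main(String[] args) {
        int key = 0x12345678;
        System.out.println(division(key));
        System.out.println(folding(key));      // (0x34 ^ 0x78) % 101 = 76
        System.out.println(middleSquare(key));
        System.out.println(truncating(key));   // 0x78 % 101 = 19
    }
}
```

Folding and truncating ignore some bytes of the key, which is why the slide lists "use all bits of the key" as a goal: keys that differ only in the ignored bytes always collide.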

Page 20: COSC 1030 Lecture 10

Advanced Topics

Hash Tree
– Separate chained collision resolution
– Recursively hashing the key

[Figure: a hash table whose slots link to further hash tables, forming a tree]

Page 21: COSC 1030 Lecture 10

Hash Tree

void insert(int key, Item item) {
    int h = h(key);
    int k = g(key); // one-to-one mapping Key -> Key
    if (table[h] == null) {
        table[h] = item;
    } else {
        if (table[h].link == null) table[h].link = new HashTree();
        table[h].link.insert(k, item);
    }
}
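A runnable version of the hash tree needs concrete choices that the slide leaves open. The sketch below assumes h(k) = k % M and g(k) = k / M for non-negative keys, a table size of 7 at every level, and a hypothetical `Node` type that remembers the original key so that search can recognise it at any depth.

```java
// Sketch: a hash tree in which each occupied slot may link to a child table.
final class HashTreeSketch {
    static final int M = 7; // assumed size of every table in the tree

    static final class Node {
        final int key;       // original, unreduced key
        String info;
        HashTreeSketch link; // child table for keys that collide in this slot
        Node(int key, String info) { this.key = key; this.info = info; }
    }

    private final Node[] table = new Node[M];

    static int h(int k) { return k % M; }
    static int g(int k) { return k / M; } // assumed: pass the quotient down a level

    void insert(int key, String info) { insert(key, key, info); }

    private void insert(int reduced, int key, String info) {
        int slot = h(reduced);
        if (table[slot] == null) {
            table[slot] = new Node(key, info);
        } else if (table[slot].key == key) {
            table[slot].info = info; // same key: overwrite
        } else {
            if (table[slot].link == null) table[slot].link = new HashTreeSketch();
            table[slot].link.insert(g(reduced), key, info);
        }
    }

    String search(int key) { return search(key, key); }

    // Follows the same slot-by-slot path that insert took for this key.
    private String search(int reduced, int key) {
        Node n = table[h(reduced)];
        if (n == null) return null;
        if (n.key == key) return n.info;
        return n.link == null ? null : n.link.search(g(reduced), key);
    }
}
```

Keys 5, 12 and 19 all hash to slot 5 of the root table; 5 stays in the root, while 12 and 19 go to slots 1 and 2 of the child table linked from that slot.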