COSC 1030 Lecture 10


Hash Table

Topics

Table
Hash Concept
Hash Function
Resolve collision
Complexity Analysis

Table

Table
– A collection of entries
– Entry: <key, info>
– Insert, search and delete
– Update and retrieve

Array representation
– Indexed
– Maps key to index

Hash Table

Hash Table
– A table
– Key range >> table size
– Many-to-one mapping (hashing)
– Indexed: hash code as index

Tabbed Address Book
– Map names to A:Z
– Multiple names start with same letter
  Same tab, sequential slots
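The tabbed address book can be sketched in a few lines of Java. This is only an illustration of the many-to-one mapping: the class name `TabbedAddressBook` and its methods are hypothetical, and names are assumed to start with an ASCII letter.

```java
class TabbedAddressBook {
    // 26 tabs, one per letter; names under the same tab sit in sequential slots.
    private final java.util.List<java.util.List<String>> tabs = new java.util.ArrayList<>();

    TabbedAddressBook() {
        for (int i = 0; i < 26; i++) tabs.add(new java.util.ArrayList<>());
    }

    // Many-to-one mapping: every name starting with the same letter gets the same tab.
    static int tabOf(String name) {
        return Character.toUpperCase(name.charAt(0)) - 'A';   // A:Z -> 0:25
    }

    void add(String name) {
        tabs.get(tabOf(name)).add(name);
    }

    java.util.List<String> tab(char letter) {
        return tabs.get(Character.toUpperCase(letter) - 'A');
    }
}
```

Adding "Bob" and "Beth" puts both names under tab B, which is exactly the collision situation a hash table has to resolve.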

Hash Table ADT

interface Hashtable {
    void insert(Item anItem);
    Item search(Key aKey);
    boolean remove(Key aKey);
    boolean isFull();
    boolean isEmpty();
}

Hash Function

Maps key to index evenly
For any n in N,
    hash(n) = n mod M
where M is the size of hash table.
    hash(k*M + n) = n, where n < M, k: integer
Map to integer first if key is not an integer
– A:Z → 0:25
– String s → h(s[0]) + h(s[1])*26 + … + h(s[n-1])*26^(n-1)
– String s → h(s[0])*26^(n-1) + … + h(s[n-1])
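The identity hash(k*M + n) = n can be checked directly. The sketch below is a hypothetical helper, not part of the lecture code; it uses `Math.floorMod` so that negative keys also land in 0..M-1, which is an assumption beyond the slide (the slide only considers n in N).

```java
class DivisionHash {
    // hash(n) = n mod M; floorMod keeps the result in 0..m-1 even for negative n.
    static int hash(int n, int m) {
        return Math.floorMod(n, m);
    }

    public static void main(String[] args) {
        int m = 7;
        // 3, 10 and 17 are all of the form k*7 + 3, so they share hash code 3.
        System.out.println(hash(3, m) + " " + hash(10, m) + " " + hash(17, m));
    }
}
```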

Hash Function

String s → h(s[0])*26^(n-1) + … + h(s[n-1])

int toInt(String s) {
    assert(s != null);
    int c = 0;
    for (int i = 0; i < s.length(); i++) {
        c = c*26 + toInt(s.charAt(i));   // toInt(char) maps A:Z to 0:25
    }
    return c;
}

int hash(String s) { return hash(toInt(s)); }
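A self-contained variant of the string conversion, as a sketch under a few assumptions: the class name `StringHash` is hypothetical, keys contain only the letters A to Z, and the accumulator is a `long` reduced with `Math.floorMod` so that an overflowing sum cannot produce a negative index.

```java
class StringHash {
    // A:Z -> 0:25 (lower-case letters are folded to upper case).
    static int toInt(char ch) {
        return Character.toUpperCase(ch) - 'A';
    }

    // Horner's rule for h(s[0])*26^(n-1) + ... + h(s[n-1]).
    static long toInt(String s) {
        long c = 0;
        for (int i = 0; i < s.length(); i++) {
            c = c * 26 + toInt(s.charAt(i));
        }
        return c;
    }

    // Reduce to a table index in 0..m-1.
    static int hash(String s, int m) {
        return (int) Math.floorMod(toInt(s), (long) m);
    }
}
```

For example, "BAD" maps to 1*26^2 + 0*26 + 3 = 679, and 679 mod 7 = 0.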

Example

Table[7] – HASHTABLE_SIZE = 7
Insert ‘B2’, ‘H7’, ‘M12’, ‘D4’, ‘Z26’ into the table
Hash codes: 2, 0, 5, 4, 5

Collision
– The slot indexed by hash code is already occupied (‘Z26’ hashes to 5, which already holds ‘M12’)

A simple solution
– Sequentially decrease the index until an empty slot is found or the table is full
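The example can be replayed in code. This is a minimal sketch that stores only the integer keys 2, 7, 12, 4, 26 (the letters are dropped), assumes non-negative keys, and uses the hypothetical name `LinearProbeExample`.

```java
class LinearProbeExample {
    // Inserts each key at key % size, stepping the index down by 1 on collision.
    static Integer[] fill(int size, int... keys) {
        Integer[] table = new Integer[size];
        for (int key : keys) {
            int index = key % size;                  // home slot
            int probes = 0;
            while (table[index] != null && probes < size) {
                index = (index - 1 + size) % size;   // decrease, wrap below 0
                probes++;
            }
            if (table[index] == null) {
                table[index] = key;                  // otherwise the table is full
            }
        }
        return table;
    }

    public static void main(String[] args) {
        // prints [7, null, 2, 26, 4, 12, null]
        System.out.println(java.util.Arrays.toString(fill(7, 2, 7, 12, 4, 26)));
    }
}
```

Key 26 finds slot 5 taken by 12 and slot 4 taken by 4, so it settles in slot 3.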

Collision Possibility

How often may a collision occur?
Insert 100 random numbers into a table of 200 slots
    P(at least one collision) = 1 – Π (200 – i)/200, i = 0:99
                              = 1 – 6.66E-14 > 0.99999999999993
Load factor
– 100/200 = 0.5 = 50% → 0.99999999999993
– 20/200 = 0.1 = 10% → 0.63
– 10/200 = 0.05 = 5% → 0.2
Default load factor is 75% in Java Hashtable
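The three probabilities can be reproduced numerically. The sketch below assumes keys are hashed uniformly and independently; `CollisionOdds` is a hypothetical name.

```java
class CollisionOdds {
    // Probability that at least two of n uniformly hashed keys share one of m slots:
    // 1 - product over i = 0..n-1 of (m - i) / m.
    static double atLeastOne(int n, int m) {
        double noCollision = 1.0;
        for (int i = 0; i < n; i++) {
            noCollision *= (double) (m - i) / m;
        }
        return 1.0 - noCollision;
    }

    public static void main(String[] args) {
        System.out.println(atLeastOne(100, 200));  // load factor 50%
        System.out.println(atLeastOne(20, 200));   // load factor 10%
        System.out.println(atLeastOne(10, 200));   // load factor 5%
    }
}
```

Even at a 5% load factor a collision occurs about one time in five, so a collision resolution strategy is always needed.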

Primary Cluster

The biggest solid block of occupied slots in the hash table
Neighbouring clusters join into bigger ones
The bigger the primary cluster is, the more easily it grows
Distribute keys evenly to avoid primary clustering
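One way to observe clustering is to measure the longest run of occupied slots. The helper below is a hypothetical sketch; it treats the table as circular, since probing wraps around.

```java
class LargestCluster {
    // Length of the longest run of non-null slots, treating the table as circular.
    static int largestCluster(Integer[] table) {
        int m = table.length;
        int best = 0;
        int run = 0;
        // Two passes over the table let a run continue across the wrap-around point.
        for (int i = 0; i < 2 * m; i++) {
            if (table[i % m] != null) {
                run++;
                best = Math.max(best, run);
            } else {
                run = 0;
            }
        }
        return Math.min(best, m);   // a completely full table is one cluster of size m
    }
}
```

For the table [7, null, 2, 26, 4, 12, null], slots 2 to 5 form the primary cluster of length 4.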

Probe Method

What can we do when a collision occurs?
– A consistent way of searching for an empty slot
– Probe
Linear probe – decrease index by 1, wrap around when the index drops below 0
Double hash – use quotient to calculate decrement
– Max(1, (Key / M) % M)
Separate chaining – linked list to store collision items
Hash tree – link to another hash table (A4)

Probe sequence coverage

Ensure the probe sequence covers the whole table
– Utilizes the whole table
– Even distribution
– M and probe decrement are relatively prime
  No common factor except 1
– Make M a prime number
  M and any decrement (< M) are relatively prime
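The effect of a common factor is easy to see by counting how many distinct slots a probe sequence reaches. This is a minimal sketch with a hypothetical name, assuming 1 <= dec < m.

```java
class ProbeCoverage {
    // Counts distinct slots reached by repeatedly stepping down by dec from slot 0.
    static int slotsVisited(int m, int dec) {
        boolean[] seen = new boolean[m];
        int index = 0;
        int count = 0;
        while (!seen[index]) {
            seen[index] = true;
            count++;
            index = (index - dec + m) % m;   // decrement with wrap-around
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(slotsVisited(7, 3));  // 7 is prime: all 7 slots are reached
        System.out.println(slotsVisited(8, 2));  // common factor 2: only 4 slots are reached
    }
}
```

With M = 8 and decrement 2 the probe only ever sees slots 0, 6, 4, 2, so half the table is unusable for that key.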

Probe Method

void insert(Item item) {
    if(!isFull()) {
        int index = probe(item.key);
        assert(index >= 0 && index < M);
        table[index] = item;
        count++;
    }
}

Linear Probe Method

int probe(int key) {
    int hashcode = key % HASHTABLE_SIZE;
    if(table[hashcode] == null) {
        return hashcode;
    } else {
        int index = hashcode;
        do {
            index--;
            if(index < 0) index += HASHTABLE_SIZE;
        } while (index != hashcode && table[index] != null);
        if(index == hashcode) return -1;
        else return index;
    }
}

Double Hash Probe Method

int probe(int key) {
    int hashcode = key % HASHTABLE_SIZE;
    if(table[hashcode] == null) {
        return hashcode;
    } else {
        int index = hashcode;
        int dec = (key / HASHTABLE_SIZE) % HASHTABLE_SIZE;
        dec = Math.max(1, dec);
        do {
            index -= dec;
            if(index < 0) index += HASHTABLE_SIZE;
        } while (index != hashcode && table[index] != null);
        if(index == hashcode) return -1;
        else return index;
    }
}
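To see why double hashing spreads colliding keys apart, it helps to list the full probe sequence for a key. The sketch below uses the same hash code and decrement as the probe method, assumes non-negative keys, and uses the hypothetical name `DoubleHashSequence`.

```java
class DoubleHashSequence {
    // Returns the m slots visited for key, in probe order.
    static int[] sequence(int key, int m) {
        int index = key % m;                     // hash code
        int dec = Math.max(1, (key / m) % m);    // decrement from the quotient
        int[] seq = new int[m];
        for (int i = 0; i < m; i++) {
            seq[i] = index;
            index -= dec;
            if (index < 0) index += m;           // wrap around
        }
        return seq;
    }

    public static void main(String[] args) {
        // Keys 12 and 26 share hash code 5 but follow different paths.
        System.out.println(java.util.Arrays.toString(sequence(12, 7)));  // [5, 4, 3, 2, 1, 0, 6]
        System.out.println(java.util.Arrays.toString(sequence(26, 7)));  // [5, 2, 6, 3, 0, 4, 1]
    }
}
```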

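Search Method

Item search(int key) {
    int hashcode = key % HASHTABLE_SIZE;
    int dec = Math.max(1, (key / HASHTABLE_SIZE) % HASHTABLE_SIZE);
    int index = hashcode;
    while(table[index] != null) {
        if(table[index].key == key) break;
        index -= dec;
        if(index < 0) index += HASHTABLE_SIZE;   // wrap around
        if(index == hashcode) return null;       // probed the whole table
    }
    return table[index];                         // null if not found
}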

Delete Method

Difficulty with delete when open addressing
– Destroys the hash probe chain
Solution
– Set a deleted flag
– Search takes it as occupied
– Insert takes it as deleted (free to reuse)
– Flagged slots still form primary clusters
Separate chaining
– Move the next item up from the chained structure
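The deleted-flag scheme can be sketched as follows. This is an illustration rather than the lecture's implementation: the class `TombstoneTable` is hypothetical, it stores bare `int` keys, uses the linear probe, and assumes a key is never inserted while it is already present.

```java
class TombstoneTable {
    private static final int EMPTY = 0, OCCUPIED = 1, DELETED = 2;
    private final int[] keys;
    private final int[] state;   // all slots start as EMPTY
    private final int m;

    TombstoneTable(int size) {
        m = size;
        keys = new int[m];
        state = new int[m];
    }

    // Insert treats a DELETED slot as free and reuses it.
    boolean insert(int key) {
        int index = Math.floorMod(key, m);
        for (int probes = 0; probes < m; probes++) {
            if (state[index] != OCCUPIED) {
                keys[index] = key;
                state[index] = OCCUPIED;
                return true;
            }
            index = (index - 1 + m) % m;
        }
        return false;   // table is full
    }

    boolean contains(int key) {
        return find(key) >= 0;
    }

    // Delete only flags the slot, so probe chains passing through it stay intact.
    boolean remove(int key) {
        int index = find(key);
        if (index < 0) return false;
        state[index] = DELETED;
        return true;
    }

    // Search treats a DELETED slot as occupied and keeps probing past it.
    private int find(int key) {
        int index = Math.floorMod(key, m);
        for (int probes = 0; probes < m; probes++) {
            if (state[index] == EMPTY) return -1;   // end of the probe chain
            if (state[index] == OCCUPIED && keys[index] == key) return index;
            index = (index - 1 + m) % m;
        }
        return -1;
    }
}
```

With a table of size 7, inserting 12 and 26 places them in slots 5 and 4. After removing 12, a search for 26 still succeeds because slot 5 is flagged rather than emptied.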

Efficiency

Successful search
– Best case: first hit, one comparison
– Average
  Half of average length of probe sequence
  Load factor dependent
  O(1) if load factor < 0.5
– Worst case: longest probe sequence
  Load factor dependent
Unsuccessful search
– Average: average length of probe sequence
– Worst case: longest probe sequence
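The load factor dependence can be made concrete with the classical estimates for linear probing attributed to Knuth. These formulas are not from the slides; they are approximations for uniformly random keys, where a is the load factor (0 <= a < 1), and the class name is hypothetical.

```java
class ProbeEstimates {
    // Expected probes for a successful search with linear probing: (1 + 1/(1-a)) / 2.
    static double successful(double a) {
        return 0.5 * (1.0 + 1.0 / (1.0 - a));
    }

    // Expected probes for an unsuccessful search: (1 + 1/(1-a)^2) / 2.
    static double unsuccessful(double a) {
        return 0.5 * (1.0 + 1.0 / ((1.0 - a) * (1.0 - a)));
    }

    public static void main(String[] args) {
        System.out.println(successful(0.5));    // 1.5
        System.out.println(unsuccessful(0.5));  // 2.5
    }
}
```

At a load factor of 0.5 both figures are small constants, which matches the O(1) claim; they grow without bound as the load factor approaches 1.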

Advanced Topics

Choosing Hash Functions
– Generate hash code randomly and uniformly
– Use all bits of the key
– Assume K = b0b1b2b3
– Division
  h(k) = k % M; p(k) = max(1, (k / M) % M)
– Folding
  h(k) = (b1^b3) % M; p(k) = (b0^b2) % M; // ^ is XOR
– Middle squaring
  h(k) = (b1b2)^2 // square of the middle part, not XOR
– Truncating
  h(k) = b3;
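The byte-based schemes can be sketched in Java under one assumption that the slides do not state: K is a 32-bit `int` whose bytes are b0 (most significant) to b3 (least significant). The class and method names are hypothetical.

```java
class ByteHashes {
    // Extracts byte i of the key, where b0 is the most significant byte (assumed layout).
    static int b(int key, int i) {
        return (key >>> (8 * (3 - i))) & 0xFF;
    }

    // Folding: XOR two bytes together, then reduce to a table index.
    static int folding(int key, int m) {
        return (b(key, 1) ^ b(key, 3)) % m;
    }

    // Middle squaring: square the middle 16 bits b1b2 (long avoids int overflow).
    static long middleSquare(int key) {
        long mid = (key >>> 8) & 0xFFFF;
        return mid * mid;
    }

    // Truncating: keep only the lowest byte.
    static int truncating(int key) {
        return b(key, 3);
    }
}
```

For key 0x12345678 the bytes are 0x12, 0x34, 0x56, 0x78, so folding with M = 7 gives (0x34 ^ 0x78) % 7 = 76 % 7 = 6, and truncating gives 0x78 = 120. The results of middle squaring and truncating still need a final reduction such as % M before they can index a table.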

Advanced Topics

Hash Tree
– Separate chained collision resolution
– Recursively hashing the key

[Figure: a hash table whose slots link to further hash tables]

Hash Tree

void insert(int key, Item item) {
    int h = h(key);
    int k = g(key); // one-to-one mapping Key → Key
    if(table[h] == null) {
        table[h] = item;
    } else {
        if(table[h].link == null) table[h].link = new HashTree();
        table[h].link.insert(k, item);
    }
}
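A runnable version of the hash tree idea, as a sketch under stated assumptions: h(key) = key % M and g(key) = key / M are choices made here (the slides do not define g), keys are non-negative, values are plain strings, and the original key is stored in each slot so that search can compare it. The name `HashTreeSketch` is hypothetical.

```java
class HashTreeSketch {
    static final int M = 7;                       // slots per node (assumed)
    private final boolean[] used = new boolean[M];
    private final int[] keys = new int[M];        // original keys, kept for comparison
    private final String[] values = new String[M];
    private final HashTreeSketch[] link = new HashTreeSketch[M];

    void insert(int key, String value) { insert(key, key, value); }

    String search(int key) { return search(key, key); }

    // k is the rehashed key that selects the slot at this level.
    private void insert(int k, int key, String value) {
        int h = k % M;
        if (!used[h]) {
            used[h] = true;
            keys[h] = key;
            values[h] = value;
        } else if (keys[h] == key) {
            values[h] = value;                    // same key: overwrite
        } else {
            if (link[h] == null) link[h] = new HashTreeSketch();
            link[h].insert(k / M, key, value);    // g(k) = k / M, then recurse
        }
    }

    private String search(int k, int key) {
        int h = k % M;
        if (!used[h]) return null;
        if (keys[h] == key) return values[h];
        return link[h] == null ? null : link[h].search(k / M, key);
    }
}
```

Keys 5, 12 and 26 all hash to slot 5 at the root. The first stays there, while 12 and 26 move into the child table, where 12 / 7 = 1 and 26 / 7 = 3 place them in different slots.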
