Data Structures
Motivating Quotation
“Every program depends on algorithms and data structures, but few programs depend on the invention of brand new ones.”
-- Kernighan & Pike
“Programming in the Large” Steps
Design & Implement
• Program & programming style (done)
• Common data structures and algorithms <-- we are here
• Modularity
• Building techniques & tools (done)
Debug
• Debugging techniques & tools (done)
Test
• Testing techniques (done)
Maintain
• Performance improvement techniques & tools
Goals of this Lecture
Help you learn (or refresh your memory) about:
• Common data structures: linked lists and hash tables
Why? Deep motivation:
• Common data structures serve as “high level building blocks”
• A power programmer:
  • Rarely creates programs from scratch
  • Often creates programs using high level building blocks
Why? Shallow motivation:
• Provide background pertinent to Assignment 3
• … esp. for those who have not taken COS 226
Common Task
Maintain a collection of key/value pairs
• Each key is a string; each value is an int
• Unknown number of key/value pairs
Examples
• (student name, grade)
  • (“john smith”, 84), (“jane doe”, 93), (“bill clinton”, 81)
• (baseball player, number)
  • (“Ruth”, 3), (“Gehrig”, 4), (“Mantle”, 7)
• (variable name, value)
  • (“maxLength”, 2000), (“i”, 7), (“j”, -10)
Agenda
Linked lists
Hash tables
Hash table issues
Linked List Data Structure
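struct Node
{
   const char *key;
   int value;
   struct Node *next;
};

struct List
{
   struct Node *first;
};

[Diagram: a struct List whose first field points to a chain of two struct Nodes, holding ("Gehrig", 4) and ("Ruth", 3); the last node's next field is NULL.]
Note: each key field really holds the address at which the string (e.g. “Ruth”) resides.
Your Assignment 3 data structures will be more elaborate.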
Linked List Algorithms
Create
• Allocate List structure; set first to NULL
• Performance: O(1) => fast
Add (no check for duplicate key required)
• Insert new node containing key/value pair at front of list
• Performance: O(1) => fast
Add (check for duplicate key required)
• Traverse list to check for node with duplicate key
• Insert new node containing key/value pair into list
• Performance: O(n) => slow
Linked List Algorithms
Search
• Traverse the list, looking for given key
• Stop when key found, or reach end
• Performance: O(n) => slow
Free
• Free Node structures while traversing
• Free List structure
• Performance: O(n) => slow
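The linked list operations described on these two slides can be sketched in C roughly as follows. This is a minimal sketch, not the Assignment 3 interface: the function names (List_new, List_add, List_addUnique, List_search, List_free) and the return-value conventions are assumptions made for illustration, and keys are stored as the caller's pointers without copying.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct Node
{
   const char *key;
   int value;
   struct Node *next;
};

struct List
{
   struct Node *first;
};

/* Create: allocate the List and mark it empty. O(1).
   Returns NULL if memory is exhausted. */
struct List *List_new(void)
{
   struct List *list = malloc(sizeof *list);
   if (list != NULL)
      list->first = NULL;
   return list;
}

/* Search: walk the list until the key matches or the end is reached.
   O(n). Returns 1 and stores the value in *value if found, else 0. */
int List_search(const struct List *list, const char *key, int *value)
{
   const struct Node *p;
   assert(list != NULL && key != NULL && value != NULL);
   for (p = list->first; p != NULL; p = p->next)
      if (strcmp(p->key, key) == 0)
      {
         *value = p->value;
         return 1;
      }
   return 0;
}

/* Add, no duplicate check: insert at the front. O(1).
   Returns 1 on success, 0 if memory is exhausted. */
int List_add(struct List *list, const char *key, int value)
{
   struct Node *node;
   assert(list != NULL && key != NULL);
   node = malloc(sizeof *node);
   if (node == NULL)
      return 0;
   node->key = key;   /* stores the caller's pointer, not a copy */
   node->value = value;
   node->next = list->first;
   list->first = node;
   return 1;
}

/* Add, with duplicate check: traverse first, then insert. O(n).
   Returns 1 if added, 0 if the key exists or memory is exhausted. */
int List_addUnique(struct List *list, const char *key, int value)
{
   int ignored;
   if (List_search(list, key, &ignored))
      return 0;
   return List_add(list, key, value);
}

/* Free: release every Node while traversing, then the List. O(n). */
void List_free(struct List *list)
{
   struct Node *p;
   struct Node *next;
   assert(list != NULL);
   for (p = list->first; p != NULL; p = next)
   {
      next = p->next;   /* save the link before freeing the node */
      free(p);
   }
   free(list);
}
```

Calling List_addUnique with a key that is already present leaves the list unchanged and returns 0, which is where the O(n) cost of the duplicate check comes from.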
Would it be better to keep the nodes sorted by key?
Agenda
Linked lists
Hash tables
Hash table issues
Hash Table Data Structure
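enum {BUCKET_COUNT = 1024};

struct Binding
{
   const char *key;
   int value;
   struct Binding *next;
};

struct Table
{
   struct Binding *buckets[BUCKET_COUNT];   /* array of linked lists */
};

[Diagram: a struct Table holding an array of linked lists, indexed 0 through 1023. buckets[23] points to a struct Binding ("Ruth", 3) and buckets[806] points to a struct Binding ("Gehrig", 4); each binding's next field is NULL, and the empty buckets shown are NULL.]
Note: each key field really holds the address at which the string (e.g. “Ruth”) resides.
Your Assignment 3 data structures will be more elaborate.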
Hash Table Data Structure
Hash function maps given key to an integer
Mod integer by BUCKET_COUNT to determine proper bucket
[Diagram: an array of buckets indexed 0 through BUCKET_COUNT-1; each bucket is a linked list of bindings.]
Hash Table Example
Example: BUCKET_COUNT = 7
Add (if not already present) bindings with these keys:
• the, cat, in, the, hat
Hash Table Example (cont.)
First key: “the”
• hash(“the”) = 965156977; 965156977 % 7 = 1
Search buckets[1] for binding with key “the”; not found
[Diagram: buckets 0 to 6, all empty]
Hash Table Example (cont.)
Add binding with key “the” and its value to buckets[1]
[Diagram: buckets 0 to 6; buckets[1]: the]
Hash Table Example (cont.)
Second key: “cat”
• hash(“cat”) = 3895848756; 3895848756 % 7 = 2
Search buckets[2] for binding with key “cat”; not found
[Diagram: buckets 0 to 6; buckets[1]: the]
Hash Table Example (cont.)
Add binding with key “cat” and its value to buckets[2]
[Diagram: buckets 0 to 6; buckets[1]: the; buckets[2]: cat]
Hash Table Example (cont.)
Third key: “in”
• hash(“in”) = 6888005; 6888005 % 7 = 5
Search buckets[5] for binding with key “in”; not found
[Diagram: buckets 0 to 6; buckets[1]: the; buckets[2]: cat]
Hash Table Example (cont.)
Add binding with key “in” and its value to buckets[5]
[Diagram: buckets 0 to 6; buckets[1]: the; buckets[2]: cat; buckets[5]: in]
Hash Table Example (cont.)
Fourth key: “the”
• hash(“the”) = 965156977; 965156977 % 7 = 1
Search buckets[1] for binding with key “the”; found it!
• Don’t change hash table
[Diagram: buckets 0 to 6; buckets[1]: the; buckets[2]: cat; buckets[5]: in]
Hash Table Example (cont.)
Fifth key: “hat”
• hash(“hat”) = 865559739; 865559739 % 7 = 2
Search buckets[2] for binding with key “hat”; not found
[Diagram: buckets 0 to 6; buckets[1]: the; buckets[2]: cat; buckets[5]: in]
Hash Table Example (cont.)
Add binding with key “hat” and its value to buckets[2]
• At front or back? Doesn’t matter
• Inserting at the front is easier, so add at the front
[Diagram: buckets 0 to 6; buckets[1]: the; buckets[2]: hat, then cat; buckets[5]: in]
Hash Table Algorithms
Create
• Allocate Table structure; set each bucket to NULL
• Performance: O(1) => fast
Add
• Hash the given key
• Mod by BUCKET_COUNT to determine proper bucket
• Traverse proper bucket to make sure no duplicate key
• Insert new binding containing key/value pair into proper bucket
• Performance: O(1) => fast
Is the add performance always fast?
Hash Table Algorithms
Search
• Hash the given key
• Mod by BUCKET_COUNT to determine proper bucket
• Traverse proper bucket, looking for binding with given key
• Stop when key found, or reach end
• Performance: O(1) => fast
Free
• Traverse each bucket, freeing bindings
• Free Table structure
• Performance: O(n) => slow
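The hash table operations listed on these two slides might look like the following in C. This is a sketch under stated assumptions rather than the Assignment 3 interface: the names Table_new, Table_search and Table_free, the 1/0 return conventions, and the particular hash function (a weighted character sum with multiplier 65599) are chosen for illustration, and keys are stored as the caller's pointers without copying.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum {BUCKET_COUNT = 1024};

struct Binding
{
   const char *key;
   int value;
   struct Binding *next;
};

struct Table
{
   struct Binding *buckets[BUCKET_COUNT];
};

/* Assumed hash function: weighted sum of the characters of s,
   reduced mod bucketCount to give a bucket index. */
static unsigned int hash(const char *s, unsigned int bucketCount)
{
   unsigned int h = 0U;
   size_t i;
   for (i = 0; s[i] != '\0'; i++)
      h = h * 65599U + (unsigned char)s[i];
   return h % bucketCount;
}

/* Create: allocate the Table and set each bucket to NULL.
   Returns NULL if memory is exhausted. */
struct Table *Table_new(void)
{
   int i;
   struct Table *t = malloc(sizeof *t);
   if (t == NULL)
      return NULL;
   for (i = 0; i < BUCKET_COUNT; i++)
      t->buckets[i] = NULL;
   return t;
}

/* Search: hash the key, then traverse only the proper bucket.
   Returns 1 and stores the value in *value if found, else 0. */
int Table_search(const struct Table *t, const char *key, int *value)
{
   const struct Binding *p;
   assert(t != NULL && key != NULL && value != NULL);
   for (p = t->buckets[hash(key, BUCKET_COUNT)]; p != NULL; p = p->next)
      if (strcmp(p->key, key) == 0)
      {
         *value = p->value;
         return 1;
      }
   return 0;
}

/* Add: reject duplicates, then insert at the front of the bucket.
   Returns 1 if added, 0 if the key exists or memory is exhausted. */
int Table_add(struct Table *t, const char *key, int value)
{
   struct Binding *p;
   unsigned int b;
   int ignored;
   assert(t != NULL && key != NULL);
   if (Table_search(t, key, &ignored))
      return 0;
   p = malloc(sizeof *p);
   if (p == NULL)
      return 0;
   b = hash(key, BUCKET_COUNT);
   p->key = key;   /* stores the caller's pointer, not a copy */
   p->value = value;
   p->next = t->buckets[b];
   t->buckets[b] = p;
   return 1;
}

/* Free: traverse each bucket freeing bindings, then free the Table. */
void Table_free(struct Table *t)
{
   int i;
   struct Binding *p;
   struct Binding *next;
   assert(t != NULL);
   for (i = 0; i < BUCKET_COUNT; i++)
      for (p = t->buckets[i]; p != NULL; p = next)
      {
         next = p->next;
         free(p);
      }
   free(t);
}
```

Add and search touch only one bucket, so their cost depends on how long that bucket is, not directly on the total number of bindings.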
Is the search performance always fast?
Agenda
Linked lists
Hash tables
Hash table issues
How Many Buckets?
Many!
• Too few => large buckets => slow add, slow search
But not too many!
• Too many => memory is wasted
This is OK:
[Diagram: buckets 0 through BUCKET_COUNT-1]
What Hash Function?
Should distribute bindings across the buckets well
• Distribute bindings over the range 0, 1, …, BUCKET_COUNT-1
• Distribute bindings evenly to avoid very long buckets
This is not so good:
[Diagram: buckets 0 through BUCKET_COUNT-1]
What would be the worst possible hash function?
How to Hash Strings?
Simple hash schemes don’t distribute the keys evenly enough
• Number of characters, mod BUCKET_COUNT
• Sum the numeric codes of all characters, mod BUCKET_COUNT
• …
A reasonably good hash function:
• Weighted sum of characters $s_i$ in the string s
  • $(\sum_i a^i s_i)$ mod BUCKET_COUNT
• Best if a and BUCKET_COUNT are relatively prime
  • E.g., a = 65599, BUCKET_COUNT = 1024
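One way to see why the simple schemes fall short: any two strings made of the same letters in a different order have the same character sum, so they always land in the same bucket. The sketch below compares the two schemes; the function names hashSum and hashWeighted are hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simple scheme: sum of the character codes, mod bucketCount. */
unsigned int hashSum(const char *s, unsigned int bucketCount)
{
   unsigned int h = 0U;
   size_t i;
   assert(s != NULL && bucketCount > 0U);
   for (i = 0; s[i] != '\0'; i++)
      h += (unsigned char)s[i];
   return h % bucketCount;
}

/* Weighted scheme: each character is scaled by a power of 65599,
   so the position of a character affects the result. */
unsigned int hashWeighted(const char *s, unsigned int bucketCount)
{
   unsigned int h = 0U;
   size_t i;
   assert(s != NULL && bucketCount > 0U);
   for (i = 0; s[i] != '\0'; i++)
      h = h * 65599U + (unsigned char)s[i];
   return h % bucketCount;
}
```

With 1024 buckets, hashSum sends both "cat" and "act" to bucket 312, while hashWeighted sends them to buckets 822 and 178 respectively (assuming ASCII character codes).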
How to Hash Strings?
Potentially expensive to compute $\sum_i a^i s_i$
So let’s do some algebra
• (by example, for string s of length 5, a=65599):
$h = \sum_{i=0}^{4} 65599^i s_i$
$h = 65599^0 s_0 + 65599^1 s_1 + 65599^2 s_2 + 65599^3 s_3 + 65599^4 s_4$
Direction of traversal of s doesn’t matter, so…
$h = 65599^0 s_4 + 65599^1 s_3 + 65599^2 s_2 + 65599^3 s_1 + 65599^4 s_0$
$h = 65599^4 s_0 + 65599^3 s_1 + 65599^2 s_2 + 65599^1 s_3 + 65599^0 s_4$
$h = ((((s_0 \cdot 65599 + s_1) \cdot 65599 + s_2) \cdot 65599 + s_3) \cdot 65599) + s_4$
How to Hash Strings?
Yielding this function
unsigned int hash(const char *s, int bucketCount)
{
   int i;
   unsigned int h = 0U;
   /* Nested form of the weighted sum: one multiply and one add per character. */
   for (i = 0; s[i] != '\0'; i++)
      h = h * 65599U + (unsigned int)s[i];
   return h % bucketCount;
}
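To check the algebra, the nested form can be compared against a direct evaluation of the weighted sum. The sketch below is illustrative only: hashPowers and hashHorner are hypothetical names, both return the raw hash before the mod step, and uint32_t is used on the assumption that unsigned int is 32 bits wide, as it is on typical platforms.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Direct form: sum of 65599^(n-1-i) * s[i], recomputing each power.
   Arithmetic wraps mod 2^32. */
uint32_t hashPowers(const char *s)
{
   size_t n;
   size_t i;
   size_t j;
   uint32_t h = 0U;
   assert(s != NULL);
   n = strlen(s);
   for (i = 0; i < n; i++)
   {
      uint32_t weight = 1U;
      for (j = i + 1; j < n; j++)   /* weight = 65599^(n-1-i) */
         weight *= 65599U;
      h += weight * (uint32_t)(unsigned char)s[i];
   }
   return h;
}

/* Nested form: the same sum with one multiply per character. */
uint32_t hashHorner(const char *s)
{
   size_t i;
   uint32_t h = 0U;
   assert(s != NULL);
   for (i = 0; s[i] != '\0'; i++)
      h = h * 65599U + (uint32_t)(unsigned char)s[i];
   return h;
}
```

Both forms give 965156977 for "the", 6888005 for "in" and 865559739 for "hat", matching the values in the earlier BUCKET_COUNT = 7 example; the direct form simply does more multiplications to get there.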
How to Protect Keys?
Suppose Table_add() function contains this code:
void Table_add(struct Table *t, const char *key, int value)
{
   …
   struct Binding *p = (struct Binding*)malloc(sizeof(struct Binding));
   p->key = key;   /* stores the caller's pointer, not a copy */
   …
}
How to Protect Keys?
Problem: Consider this calling code:
struct Table *t;
char k[100] = "Ruth";
…
Table_add(t, k, 3);

[Diagram: table t with buckets 0 through 1023; one bucket holds a binding with value 3 and next NULL, whose key field points at the caller's array k, which contains "Ruth\0".]
How to Protect Keys?
Problem: Consider this calling code:
struct Table *t;
char k[100] = "Ruth";
…
Table_add(t, k, 3);
strcpy(k, "Gehrig");

What happens if the client searches t for “Ruth”? For “Gehrig”?
[Diagram: table t with buckets 0 through 1023; the binding with value 3 still points at the caller's array k, which now contains "Gehrig\0".]
How to Protect Keys?
Solution: Table_add() saves a defensive copy of the given key
void Table_add(struct Table *t, const char *key, int value)
{
   …
   struct Binding *p = (struct Binding*)malloc(sizeof(struct Binding));
   p->key = (const char*)malloc(strlen(key) + 1);
   strcpy((char*)p->key, key);
   …
}

Why add 1?
How to Protect Keys?
Now consider same calling code:
struct Table *t;
char k[100] = "Ruth";
…
Table_add(t, k, 3);

[Diagram: table t with buckets 0 through 1023; the binding with value 3 points at the table's own copy of the key, "Ruth\0"; the caller's array k separately contains "Ruth\0".]
How to Protect Keys?
Now consider same calling code:
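struct Table *t;
char k[100] = "Ruth";
…
Table_add(t, k, 3);
strcpy(k, "Gehrig");

[Diagram: table t with buckets 0 through 1023; the binding with value 3 still points at the table's own copy, "Ruth\0"; the caller's array k now contains "Gehrig\0".]
Hash table is not corrupted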
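The difference between the two versions of Table_add() can be reproduced at the level of a single binding. In this sketch, Binding_newShared and Binding_newCopied are hypothetical helpers, not part of the lecture's code: the first stores the caller's pointer, the second stores a defensive copy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct Binding
{
   const char *key;
   int value;
   struct Binding *next;
};

/* Hypothetical helper: keeps the caller's pointer as the key, so the
   key changes whenever the caller changes its buffer. */
struct Binding *Binding_newShared(const char *key, int value)
{
   struct Binding *p;
   assert(key != NULL);
   p = malloc(sizeof *p);
   if (p == NULL)
      return NULL;
   p->key = key;
   p->value = value;
   p->next = NULL;
   return p;
}

/* Hypothetical helper: keeps a private copy of the key, so later
   changes to the caller's buffer have no effect on the binding. */
struct Binding *Binding_newCopied(const char *key, int value)
{
   struct Binding *p;
   char *copy;
   assert(key != NULL);
   p = malloc(sizeof *p);
   copy = malloc(strlen(key) + 1);   /* + 1 for the trailing '\0' */
   if (p == NULL || copy == NULL)
   {
      free(p);
      free(copy);
      return NULL;
   }
   strcpy(copy, key);
   p->key = copy;
   p->value = value;
   p->next = NULL;
   return p;
}
```

If the caller builds both bindings from char k[100] = "Ruth" and then runs strcpy(k, "Gehrig"), the shared binding's key reads "Gehrig" while the copied binding's key still reads "Ruth".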
Who Owns the Keys?
If Table_add() saves a defensive copy of each key, then the hash table owns its keys
• That is, the hash table owns the memory in which its keys reside
• Hash_free() function must free the memory in which the key resides
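A sketch of what key ownership implies for freeing, shown for a single bucket. The names Bucket_push and Bucket_free are hypothetical and stand in for whatever the table's add and free functions do per bucket; the point is only that each owned key is freed together with its binding.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct Binding
{
   const char *key;
   int value;
   struct Binding *next;
};

/* Hypothetical helper: insert a binding that owns a private copy of
   key at the front of *bucket. Returns 1 on success, 0 on failure. */
int Bucket_push(struct Binding **bucket, const char *key, int value)
{
   struct Binding *p;
   char *copy;
   assert(bucket != NULL && key != NULL);
   p = malloc(sizeof *p);
   copy = malloc(strlen(key) + 1);
   if (p == NULL || copy == NULL)
   {
      free(p);
      free(copy);
      return 0;
   }
   strcpy(copy, key);
   p->key = copy;
   p->value = value;
   p->next = *bucket;
   *bucket = p;
   return 1;
}

/* Hypothetical helper: free every binding in the bucket. Because the
   bucket owns its keys, each key is freed before its binding.
   Returns the number of bindings freed. */
size_t Bucket_free(struct Binding *first)
{
   size_t count = 0;
   struct Binding *p;
   struct Binding *next;
   for (p = first; p != NULL; p = next)
   {
      next = p->next;
      free((char *)p->key);   /* cast away const: the copy is ours */
      free(p);
      count++;
   }
   return count;
}
```

If the keys were not copies (as in the first version of Table_add()), freeing them here would be a bug, because that memory belongs to the caller.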
Summary
Common data structures and associated algorithms
• Linked list
  • (Maybe) fast add
  • Slow search
• Hash table
  • (Potentially) fast add
  • (Potentially) fast search
  • Very common
Hash table issues
• Hashing algorithms
• Defensive copies
• Key ownership