CS 315 Data Structures B. Ravikumar Office: 116 I Darwin Hall Phone: 664 3335 E-mail:...
-
date post
20-Dec-2015 -
Category
Documents
-
view
216 -
download
1
Transcript of CS 315 Data Structures B. Ravikumar Office: 116 I Darwin Hall Phone: 664 3335 E-mail:...
![Page 1: CS 315 Data Structures B. Ravikumar Office: 116 I Darwin Hall Phone: 664 3335 E-mail: ravi93@gmail.edu Course Web site: .](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d4c5503460f94a29ed7/html5/thumbnails/1.jpg)
CS 315 Data Structures
B. Ravikumar Office: 116 I Darwin Hall
Phone: 664 3335E-mail: [email protected]
Course Web site: http://ravi.cs.sonoma.edu/cs315sp08
![Page 2: CS 315 Data Structures B. Ravikumar Office: 116 I Darwin Hall Phone: 664 3335 E-mail: ravi93@gmail.edu Course Web site: .](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d4c5503460f94a29ed7/html5/thumbnails/2.jpg)
Textbook for the course:
Data Structures and Algorithm Analysis in C++ by Mark Allen Weiss
Lab: T 1 to 2:50 PM, Darwin Hall # 28
![Page 3: CS 315 Data Structures B. Ravikumar Office: 116 I Darwin Hall Phone: 664 3335 E-mail: ravi93@gmail.edu Course Web site: .](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d4c5503460f94a29ed7/html5/thumbnails/3.jpg)
Course Goals • Learn to use fundamental data structures:
• arrays, linked lists, stacks and queues• hash table• priority queue• binary trees• others
• Enhance your skill in programming in c++• recursion, classes, algorithm implementation• build projects using different data structures
• Analytical and experimental analysis• quantitative reasoning about the performance of algorithms• comparing different data structures
![Page 4: CS 315 Data Structures B. Ravikumar Office: 116 I Darwin Hall Phone: 664 3335 E-mail: ravi93@gmail.edu Course Web site: .](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d4c5503460f94a29ed7/html5/thumbnails/4.jpg)
Goals for today’s lecture
• Course outline
• Discuss Course work• lab assignments• projects• tests, final exam
• If any time left, we will start discussion of recursion. (will continue it in the lab).
![Page 5: CS 315 Data Structures B. Ravikumar Office: 116 I Darwin Hall Phone: 664 3335 E-mail: ravi93@gmail.edu Course Web site: .](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d4c5503460f94a29ed7/html5/thumbnails/5.jpg)
Data Structures – key to software design
• Data structures play a key role in every type of software.
•Data structure deals with how to store the data internally while solving a problem in order to
•Optimize the overall running time of a program•Optimize the response time (for queries)•Optimize the memory requirements•Optimize other resources (e.g. network)• Simplify software design• make solution extendible, more robust
![Page 6: CS 315 Data Structures B. Ravikumar Office: 116 I Darwin Hall Phone: 664 3335 E-mail: ravi93@gmail.edu Course Web site: .](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d4c5503460f94a29ed7/html5/thumbnails/6.jpg)
Abstract vs. concrete data structures
Abstract data structure (sometimes called ADT) is a collection of data with a set of operations supported to manipulate the structure
Examples:• stack, queue• priority queue• Dictionary
Concrete data structures are the implementations of abstract data structures: • Arrays, linked lists, trees, heaps, hash table
Main emphasis of the course: Find the best mapping between abstract and concrete data structures.
![Page 7: CS 315 Data Structures B. Ravikumar Office: 116 I Darwin Hall Phone: 664 3335 E-mail: ravi93@gmail.edu Course Web site: .](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d4c5503460f94a29ed7/html5/thumbnails/7.jpg)
Abstract Data Structure (ADT)container supporting operations
•Dictionary•search •insert primary operations•Delete
•deleteMin•Range search•Successor secondary operations•Merge
•Priority queue•Insert•deleteMin •Merge, split etc. Secondary operations
primary operations
![Page 8: CS 315 Data Structures B. Ravikumar Office: 116 I Darwin Hall Phone: 664 3335 E-mail: ravi93@gmail.edu Course Web site: .](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d4c5503460f94a29ed7/html5/thumbnails/8.jpg)
Linear data structures
• key properties of the (1-d) array:
• a sequence of items are stored in consecutive memory locations. • array provides a constant time access to k-th element for any k. (access the element by: Element[k].)• inserting at the end is easy. if the current size is S, then we can add x at the end using the single instruction: Element[S++] = x;• deleting at the end is also easy.• inserting or deleting at any other position is expensive. • Even searching is expensive (unless sorted).
![Page 9: CS 315 Data Structures B. Ravikumar Office: 116 I Darwin Hall Phone: 664 3335 E-mail: ravi93@gmail.edu Course Web site: .](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d4c5503460f94a29ed7/html5/thumbnails/9.jpg)
Images are stored in 2-d arrays
We will work on 2-d arrays by manipulating images:• Each pixel is represented by a blue value, a red value and a
green value (any color is a combination of these colors). (255, 255, 255) represents white, (255, 0, 0) represents red etc.• pic(i , j)-> Blue represents the blue component of the i-th
row, j-th column pixel of pic and so on.• Some basic operations on images:
• open, read, write• rotate, mirror image • filter (remove blemishes)• extract features (identify where buildings are in an
aerial photograph)
![Page 10: CS 315 Data Structures B. Ravikumar Office: 116 I Darwin Hall Phone: 664 3335 E-mail: ravi93@gmail.edu Course Web site: .](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d4c5503460f94a29ed7/html5/thumbnails/10.jpg)
Linked lists
Linked lists:• Storing a sequence of items in non-
consecutive locations of the memory.• Not easy to search for a key (even if
sorted).• Inserting next to a given item is easy.• In doubly linked list, inserting before or
after a given item is easy.• Don’t need to know the number of items
in advance. (dynamic memory)
order is important
![Page 11: CS 315 Data Structures B. Ravikumar Office: 116 I Darwin Hall Phone: 664 3335 E-mail: ravi93@gmail.edu Course Web site: .](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d4c5503460f94a29ed7/html5/thumbnails/11.jpg)
stacks and queues
• stacks:• insert and delete at the same end.• equivalently, last element inserted will be the first one to be deleted.• very useful to solve many problems
•Processing arithmetic expressions
• queues:• insert at one end, deletion at the other end.• equivalently, first element inserted is the first one to be deleted.
![Page 12: CS 315 Data Structures B. Ravikumar Office: 116 I Darwin Hall Phone: 664 3335 E-mail: ravi93@gmail.edu Course Web site: .](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d4c5503460f94a29ed7/html5/thumbnails/12.jpg)
Non-linear data structures
Various versions of trees• Binary search trees• Height-balanced trees etc.
Lptr key Rptr
15
Main purpose of a binary search tree supportsdictionary operations efficiently
![Page 13: CS 315 Data Structures B. Ravikumar Office: 116 I Darwin Hall Phone: 664 3335 E-mail: ravi93@gmail.edu Course Web site: .](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d4c5503460f94a29ed7/html5/thumbnails/13.jpg)
Priority queue Max priority key is the one that gets deleted
next.• Equivalently, support for the following
operations:• insert• deleteMin
Useful in solving many problems• fast sorting (heap-sorting)• shortest-path, minimum spanning tree,
scheduling etc.
![Page 14: CS 315 Data Structures B. Ravikumar Office: 116 I Darwin Hall Phone: 664 3335 E-mail: ravi93@gmail.edu Course Web site: .](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d4c5503460f94a29ed7/html5/thumbnails/14.jpg)
Hashing • Supports dictionary operations very efficiently (most of the time).
• Main advantages:•Simple to design, implement• on average very fast• not good in the worst-case.
![Page 15: CS 315 Data Structures B. Ravikumar Office: 116 I Darwin Hall Phone: 664 3335 E-mail: ravi93@gmail.edu Course Web site: .](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d4c5503460f94a29ed7/html5/thumbnails/15.jpg)
What data structure to use?
Example 1: There are more than 1 billion web pages. When you type on google something like:
You get instantaneous response. What kind of data structure is used here?
• The details are quite complicated, but the main data structure used is quite simple.
![Page 16: CS 315 Data Structures B. Ravikumar Office: 116 I Darwin Hall Phone: 664 3335 E-mail: ravi93@gmail.edu Course Web site: .](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d4c5503460f94a29ed7/html5/thumbnails/16.jpg)
Data structure used - inverted index
Array of lists – each array entry contains a word and a pointer to all the web pages that contain that word:
Data structure38 97 145 297
This list is kept sorted
876
Question: How do we access the array index from key word?Hashing or some other clever data structure is needed.
![Page 17: CS 315 Data Structures B. Ravikumar Office: 116 I Darwin Hall Phone: 664 3335 E-mail: ravi93@gmail.edu Course Web site: .](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d4c5503460f94a29ed7/html5/thumbnails/17.jpg)
Example 2: The entire landscape of the world is being digitized (there is a whole new branch that combines information technology and geography called GIS – Geographic Information System). What kind of data structure should be used to store all this information?
SnapshotfromGoogleearth
![Page 18: CS 315 Data Structures B. Ravikumar Office: 116 I Darwin Hall Phone: 664 3335 E-mail: ravi93@gmail.edu Course Web site: .](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d4c5503460f94a29ed7/html5/thumbnails/18.jpg)
Some general issues related to GIS
• How much memory do we need? Can this be stored in one computer? (or need a distributed data base?)
Building the database is done in the background (off-line processing)
• How fast can the queries be answered?
Response to query is called the on-line processing
• Suppose each square mile is represented by a 1024 by 1024 pixel image, how much storage do we need to store the terrain of United States?
![Page 19: CS 315 Data Structures B. Ravikumar Office: 116 I Darwin Hall Phone: 664 3335 E-mail: ravi93@gmail.edu Course Web site: .](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d4c5503460f94a29ed7/html5/thumbnails/19.jpg)
Calculate the memory needed
Very rough estimate of the memory needed:
•Area of USA is 4 x 106 sq miles (roughly)
•Each square mile needs 106 pixels (roughly)
•Each pixel requires 32 bits usually.
Thus the total memory needed = 4 x 106 x 32 x 106 = 168 x 1012 = 168000 Giga bits
(A standard desk top has ~ 200 Giga bits of memory.)
Need about 800 such computers to store the data
![Page 20: CS 315 Data Structures B. Ravikumar Office: 116 I Darwin Hall Phone: 664 3335 E-mail: ravi93@gmail.edu Course Web site: .](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d4c5503460f94a29ed7/html5/thumbnails/20.jpg)
What data structure to store the images?
• each 1024 x 1024 image can be stored in a two-dimensional array. (standard way to store all kinds of images – bmp, jpg, png etc.) The actual images are stored in a secondary memory (hard disks on several servers either in a central location or distributed).
• The number of images would be roughly 4 x 106. A set of pointers to these images can be stored in a 1 (or 2) dimensional array.
• When you click on a point on the map, its index in the array is calculated.
• From that index, the image is accessed and sent by a network to the requesting client.
![Page 21: CS 315 Data Structures B. Ravikumar Office: 116 I Darwin Hall Phone: 664 3335 E-mail: ravi93@gmail.edu Course Web site: .](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d4c5503460f94a29ed7/html5/thumbnails/21.jpg)
Overview of the projects:
1) Generate all the poker hands More generally, given a set of N items and a
number k<= N, generate all possible combinations or permutations of k items.
(concept: recursion, arrays, lists)
2) Image manipulation: (concept: arrays, library, analysis of algorithm)
1) After filtering
![Page 22: CS 315 Data Structures B. Ravikumar Office: 116 I Darwin Hall Phone: 664 3335 E-mail: ravi93@gmail.edu Course Web site: .](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d4c5503460f94a29ed7/html5/thumbnails/22.jpg)
3) Spelling checker: Given a text file T, identify all the misspelled words in T.
Idea: build a hash table H of all the words in a dictionary, and search for each word of the text T in the table H.
(concept: hashing, string processing)
4) Scheduling problem: A set of tasks should be assigned to several identical machines. What is the maximum number of jobs that can be assigned?
Concept: priority queue (heap)
![Page 23: CS 315 Data Structures B. Ravikumar Office: 116 I Darwin Hall Phone: 664 3335 E-mail: ravi93@gmail.edu Course Web site: .](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d4c5503460f94a29ed7/html5/thumbnails/23.jpg)
5) Geometric computation problem – given a set of rectangles, determine the total area covered by them. Also report all the intersections between them.
Concept: binary search tree.