Introduction to Computation and Problem Solving Class 33: Active Learning: Sorting Prof. Steven R. Lerman and Dr. V. Judson Harward

Why Is Sorting Interesting?
• Sorting is an operation that occurs as part of many larger programs.
• There are many ways to sort, and computer scientists have devoted much research to devising efficient sorting algorithms.
• Sorting algorithms illustrate a number of important principles of algorithm design, some of them counterintuitive.
• We will examine a series of sorting algorithms in this session and try to discover the tradeoffs between them.

Sorting Order
• Sorting implies that the items to be sorted are ordered. We shall use the same approach that we employed for binary search trees.
• That is, we will sort Objects, and whenever we invoke a sort,
  – we shall either supply a Comparator to order the elements to be sorted, or
  – the sort routine can assume that the objects to be sorted belong to a class like Integer that implements Comparable and possesses a native ordering.
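
The two ordering options can be folded into a single comparison helper. The sketch below is only an illustration: the class name OrderingSketch and its compare method are hypothetical and are not taken from the lecture code. It assumes that a null Comparator means "fall back to the native Comparable ordering".

```java
import java.util.Comparator;

// Hypothetical helper, not part of the lecture code.
class OrderingSketch {
    // null means: use the native ordering of the elements (Comparable).
    private final Comparator<Object> comparator;

    OrderingSketch() { this.comparator = null; }

    OrderingSketch(Comparator<Object> c) { this.comparator = c; }

    // Negative if a precedes b, zero if they tie, positive if a follows b.
    @SuppressWarnings("unchecked")
    int compare(Object a, Object b) {
        if (comparator != null) {
            return comparator.compare(a, b);
        }
        // Assumes a's class implements Comparable, as Integer does.
        return ((Comparable<Object>) a).compareTo(b);
    }

    public static void main(String[] args) {
        OrderingSketch natural = new OrderingSketch();
        System.out.println(natural.compare(3, 7) < 0); // 3 precedes 7 natively

        // Assumed example ordering: shorter printed form first.
        OrderingSketch byLength = new OrderingSketch(
            (a, b) -> a.toString().length() - b.toString().length());
        System.out.println(byLength.compare("pear", "fig") > 0);
    }
}
```

A sort routine built this way never needs to know which of the two orderings it was given; it only calls compare.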

The Sort Interface
• There are times when one wants to sort only part of a collection.
• We define an interface for sorts with this in mind. A Sort must supply two methods, one to sort an entire array of Objects and one to sort a portion of an array specified by start and end elements (a true closed interval, not half open):

    public interface Sort {
        public void sort(Object[] d, int start, int end);
        public void sort(Object[] d);
    }

Using the Sort Interface
• This interface turns sorting algorithms into classes.
• We must create an instance of the class before we can use it to perform a sort.
• Each sort class will have two constructors. The default constructor will sort elements that have a native order. The other constructor will take a Comparator as the only argument.
• Here is an example code fragment that creates an instance of InsertionSort to sort an array of Integers:

    Integer[] iArray = new Integer[100];
    . . . // initialize iArray
    Sort iSort = new InsertionSort();
    iSort.sort(iArray);

Insertion Sort
• Insertion sorting is a formalized version of the way most of us sort cards. In an insertion sort, items are selected one at a time from an unsorted list and inserted into a sorted list.
• It is a good example of an incremental or iterative algorithm. It is simple and intuitive, but we can still do a little to optimize it.
• To save memory and unnecessary copying we sort the array in place.
• We do this by using the low end of the array to grow the sorted list against the unsorted upper end of the array.
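
As a rough illustration of the in-place scheme described above, here is a minimal sketch. It is not the class shipped with the simulation: InsertionSortSketch is a hypothetical name, the Sort interface is redeclared without the public modifier so that the example fits in one file, and a null Comparator is assumed to mean "use the native ordering".

```java
import java.util.Arrays;
import java.util.Comparator;

// Redeclared here only to keep the sketch self-contained.
interface Sort {
    void sort(Object[] d, int start, int end); // closed interval [start, end]
    void sort(Object[] d);
}

// Hypothetical class name; a sketch, not the lecture's InsertionSort.
class InsertionSortSketch implements Sort {
    private final Comparator<Object> comparator; // null: native ordering

    InsertionSortSketch() { this(null); }

    InsertionSortSketch(Comparator<Object> c) { comparator = c; }

    @SuppressWarnings("unchecked")
    private int compare(Object a, Object b) {
        return comparator != null
            ? comparator.compare(a, b)
            : ((Comparable<Object>) a).compareTo(b);
    }

    public void sort(Object[] d) { sort(d, 0, d.length - 1); }

    public void sort(Object[] d, int start, int end) {
        // d[start..i-1] is the sorted list; d[i..end] is still unsorted.
        for (int i = start + 1; i <= end; i++) {
            Object key = d[i];
            int j = i;
            // Shift larger sorted elements up one slot to open a gap for key.
            while (j > start && compare(d[j - 1], key) > 0) {
                d[j] = d[j - 1];
                j--;
            }
            d[j] = key;
        }
    }

    public static void main(String[] args) {
        Integer[] iArray = {5, 2, 9, 1, 5, 6};
        Sort iSort = new InsertionSortSketch();
        iSort.sort(iArray);
        System.out.println(Arrays.toString(iArray)); // prints [1, 2, 5, 5, 6, 9]
    }
}
```

The inner loop shifts elements instead of swapping them, which is one of the small optimizations available: each shift is a single assignment, and the selected element is written only once.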

Insertion Sort Diagram

Download Simulations
• Go to the Lecture page on the class website and download Sorting.zip from the Supporting Files directory using Internet Explorer, not Netscape.
• Double click the downloaded file and click the Extract button. Use the file dialog box to choose a place to unpack the simulations we will be using today.

Run the InsertionSort Simulation
• Navigate to where you unpacked the zip file and double click the InsertionSort.jar file in Windows Explorer, not Forte. It should bring up the simulation that you will use to examine the algorithm.
• Type in a series of numbers each followed by a return. These are the numbers to sort.
• Click start and then stepInto to single step through the code.
• reset restarts the current sort.
• new allows you to specify a new sort.

InsertionSort Questions
Use the simulator to explore the following questions and let us know when you think you know the answers:
– How many elements have to be moved in the inner for loop if you run InsertionSort on an already sorted list? Does it run in O(1), O(n), or O(n²) time?
– What order of elements will produce the worst performance? Does this case run in O(1), O(n), or O(n²) time? Why?
– Does the average case have more in common with the best case or the worst case?
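
If you want to check what you see in the simulator against larger inputs, one option is to count the inner-loop moves in code. The sketch below is an assumption-laden stand-in: InsertionMoveCounter and countMoves are hypothetical names, and it sorts plain int values rather than Objects to keep the counting logic visible.

```java
// Hypothetical instrumented insertion sort; not part of the simulation.
class InsertionMoveCounter {
    // Returns the number of element moves made by the inner loop
    // while sorting a copy of data (the argument is left untouched).
    static long countMoves(int[] data) {
        int[] d = data.clone();
        long moves = 0;
        for (int i = 1; i < d.length; i++) {
            int key = d[i];
            int j = i;
            while (j > 0 && d[j - 1] > key) {
                d[j] = d[j - 1]; // one move in the inner loop
                j--;
                moves++;
            }
            d[j] = key;
        }
        return moves;
    }

    public static void main(String[] args) {
        // Try several sizes and watch how each count grows with n.
        for (int n = 8; n <= 64; n *= 2) {
            int[] sorted = new int[n];
            int[] reversed = new int[n];
            for (int i = 0; i < n; i++) {
                sorted[i] = i;
                reversed[i] = n - i;
            }
            System.out.println("n=" + n
                + " sorted=" + countMoves(sorted)
                + " reversed=" + countMoves(reversed));
        }
    }
}
```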

QuickSort
• Quicksort is a classic and subtle algorithm originally invented by C. A. R. Hoare in 1962.
• There are now endless variations on the basic algorithm, and it is widely considered the best overall sorting algorithm for industrial-strength applications.

QuickSort Strategy
• QuickSort is a divide-and-conquer algorithm. It sorts recursively by performing an operation called partitioning on smaller and smaller portions of the array.
• The essential idea of quicksort is to choose an element of the portion of the array to be sorted called the pivot. Then the algorithm partitions the array with respect to the pivot.
• Partitioning means separating the array into two subarrays, the left one containing elements that are less than or equal to the pivot and the right one containing elements greater than or equal to the pivot. The algorithm swaps elements to achieve this condition.

QuickSort Strategy, 2
• Note that these subarrays are not sorted!
• In the version we are using today (but not all QuickSort implementations), we move the pivot between the two partitions.
• Quicksort is then recursively applied to the subarrays.
• The algorithm terminates because subarrays of one element are trivially sorted.

QuickSort Single Partition

1. Select the last element of the current array segment to be the pivot.
2. Swap elements to fulfill the partition condition.
3. Swap the pivot with the first element of the upper half of the partition.

Partitioning
• Partitioning is the key step in quicksort.
• In our first version of quicksort, the pivot is chosen to be the last element of the (sub)array to be sorted.
• We scan the (sub)array from the left end using index low looking for an element >= the pivot.
• When we find one we scan from the right end using index high looking for an element <= the pivot.
• If low <= high, we swap the two elements and start scanning for another pair of swappable elements.
• If low > high, we are done and we swap the element at low with the pivot, which now stands between the two partitions.
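
The scan-and-swap procedure above can be sketched as follows. This is an illustration under stated assumptions, not the code behind the simulation: QuickSortSketch, partition and swap are hypothetical names, the sketch sorts int values instead of Objects with a Comparator, and it leaves the scanning loop as soon as low meets or passes high (a swap at equal indices would change nothing).

```java
import java.util.Arrays;

// Hypothetical sketch of quicksort with a last-element pivot.
class QuickSortSketch {
    static void sort(int[] d) { sort(d, 0, d.length - 1); }

    // Sorts the closed interval d[start..end].
    static void sort(int[] d, int start, int end) {
        if (start >= end) return; // zero or one element: trivially sorted
        int p = partition(d, start, end);
        sort(d, start, p - 1);    // the pivot at p is already in place
        sort(d, p + 1, end);
    }

    // Partitions d[start..end] around d[end]; returns the pivot's final index.
    static int partition(int[] d, int start, int end) {
        int pivot = d[end];
        int low = start - 1;
        int high = end;
        while (true) {
            // Scan from the left for an element >= pivot.
            // The pivot itself at d[end] stops this scan at the latest.
            while (d[++low] < pivot) { }
            // Scan from the right for an element <= pivot, staying in bounds.
            while (high > start && d[--high] > pivot) { }
            if (low >= high) break; // indices met or crossed: done scanning
            swap(d, low, high);
        }
        swap(d, low, end); // the pivot now stands between the two partitions
        return low;
    }

    private static void swap(int[] d, int i, int j) {
        int tmp = d[i];
        d[i] = d[j];
        d[j] = tmp;
    }

    public static void main(String[] args) {
        int[] data = {5, 1, 7, 3, 9, 4};
        sort(data);
        System.out.println(Arrays.toString(data)); // prints [1, 3, 4, 5, 7, 9]
    }
}
```

Both scans stop on elements equal to the pivot, so such elements are swapped even though that looks unnecessary; the effect on an array of identical keys is worth tracing by hand.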

Recursive Partitioning

Quicksort Simulation
• Double click the QuickSort.jar file.
• It works the same way as the InsertionSort simulation.

QuickSort Questions
Use the simulator to explore the following questions and let us know when you think you know the answers:
– Why do the low and high indices stay within the subarray?
– How can we be sure that when the procedure terminates, the subarray is legally partitioned? Can low stop at an out of place element without a swap occurring? What if low has passed high? Why won’t low be out of place?

QuickSort Questions, 2
– What happens when the pivot is the largest or smallest element in the subarray?
– Is quicksort more or less efficient if the pivot is consistently the smallest or the largest element of the (sub)array? What input data could make this happen?
– Why swap high and low when they are both equal to the pivot? Isn’t this unnecessary? What will happen if you try to sort an array where all the elements are equal?

A Better Quicksort
• The choice of pivot is crucial to quicksort's performance.
• The ideal pivot is the median of the subarray, that is, the middle member of the sorted array. But we can't find the median without sorting first.
• It turns out that the median of the first, middle and last element of each subarray is a good substitute for the median. It guarantees that each part of the partition will have at least two elements, provided that the array has at least four, but its performance is usually much better. And there are no natural cases that will produce worst case behavior.
• Try running MedianQuickSort from MedianQuickSort.jar.
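
Median-of-three pivot selection might look like the sketch below. The names MedianOfThreeSketch, medianIndex and movePivotToEnd are hypothetical, and MedianQuickSort.jar may organize this step differently. The sketch assumes int keys and assumes that the chosen pivot is swapped into the last position so that a last-element partition can be reused unchanged.

```java
// Hypothetical sketch of median-of-three pivot selection.
class MedianOfThreeSketch {
    // Returns the index (start, middle, or end) that holds the median
    // of the three values d[start], d[middle], d[end].
    static int medianIndex(int[] d, int start, int end) {
        int mid = start + (end - start) / 2;
        int a = d[start], b = d[mid], c = d[end];
        if ((a <= b && b <= c) || (c <= b && b <= a)) return mid;
        if ((b <= a && a <= c) || (c <= a && a <= b)) return start;
        return end;
    }

    // Swaps the median-of-three into d[end], where a last-element
    // partition expects to find its pivot.
    static void movePivotToEnd(int[] d, int start, int end) {
        int m = medianIndex(d, start, end);
        int tmp = d[m];
        d[m] = d[end];
        d[end] = tmp;
    }

    public static void main(String[] args) {
        int[] sorted = {1, 2, 3, 4, 5, 6, 7, 8, 9};
        // On sorted input the last element is the largest, a poor pivot;
        // the median of first, middle and last is the middle element.
        System.out.println(sorted[medianIndex(sorted, 0, sorted.length - 1)]); // prints 5
    }
}
```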

CountingSort
• InsertionSort and Quicksort both sort by comparing elements. Is there any other way to sort?
• Assume that you are sorting data with a limited set of integer keys that range from 0 to range.
• CountingSort sorts the data into a temporary array by taking a “census” of the keys and laying out a directory of where each key must go.

Steps in the CountingSort Algorithm
1. Copy the numbers to be sorted to a temporary array.
2. Initialize an array indexed by key values (a histogram of keys) to 0.
3. Iterate over the array to be sorted counting the frequency of each key.
4. Now calculate the cumulative histogram for each key value, k. The first element, k=0, is the same as the frequency of key 0. The second, k=1, is the sum of the frequencies for k=0 and k=1. The third is the sum of the second plus the frequency for k=2. And so on.

Steps in the CountingSort Algorithm, 2
5. The first element of the cumulative histogram contains the number of elements of the original array with values <= 0. The second, those <= 1. They lay out blocks of values in the sorted array.
6. Starting with the last element in the original array and working back to the first, look up its key in the cumulative histogram to find its destination in the sorted array. It will be the histogram value - 1.
7. Decrement the entry in the cumulative histogram so the next element with the same key is not stored on top of the first.
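
Steps 1 through 7 can be sketched in a few lines. This is a minimal illustration, not the code behind CountingSort.jar: CountingSortSketch is a hypothetical name, the keys are assumed to be the int values themselves, and every key is assumed to lie between 0 and range inclusive (the sketch does not check this).

```java
import java.util.Arrays;

// Hypothetical sketch of counting sort on int keys in 0..range.
class CountingSortSketch {
    // Sorts d in place; every value in d must lie in 0..range inclusive.
    static void sort(int[] d, int range) {
        int[] temp = d.clone();            // step 1: copy the numbers
        int[] hist = new int[range + 1];   // step 2: Java zero-fills new arrays
        for (int key : temp) {
            hist[key]++;                   // step 3: frequency of each key
        }
        for (int k = 1; k <= range; k++) {
            hist[k] += hist[k - 1];        // step 4: cumulative histogram
        }
        // Steps 6 and 7: walk the copy from last to first. The destination
        // is the histogram value minus 1; the pre-decrement also updates
        // the histogram so the next equal key lands one slot lower.
        for (int i = temp.length - 1; i >= 0; i--) {
            int key = temp[i];
            d[--hist[key]] = key;
        }
    }

    public static void main(String[] args) {
        int[] data = {4, 3, 2, 3, 4};
        sort(data, 4);
        System.out.println(Arrays.toString(data)); // prints [2, 3, 3, 4, 4]
    }
}
```

The method makes a fixed number of passes: two over the data and one over the histogram, with no comparisons between elements.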

CountingSort

• There are no keys 0, 1.
• Key 2 occurs 1 time and should occupy position 0.
• Key 3 occurs 2 times and should occupy positions 1 and 2.
• Key 4 occurs 2 times and should occupy positions 3 and 4.
• Etc.

CountingSort Questions
• Double click the CountingSort.jar file.
• Note that since CountingSort requires the specification of a range (or two passes over the data), it can’t be implemented using our Sort interface.
• Experiment with the simulation. Type the numbers to be sorted first. Then enter the range in the range field and press the start button. Use stepInto to trace the code.
• Does counting sort have a best case or worst case?
• Does it sort in O(1), O(n), or O(n log n) time? Why?