CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with...
Transcript of CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with...
![Page 1: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/1.jpg)
CS 137 Part 8Merge Sort, Quick Sort, Binary Search
![Page 2: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/2.jpg)
This Week
• We’re going to see two more complicated sorting algorithmsthat will be our first introduction to O(n log n) sortingalgorithms.
• The first of which is Merge Sort.
• Basic Idea:
1. Divide array in half2. Sort each half recursively3. Merge the results
![Page 3: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/3.jpg)
Example
https://upload.wikimedia.org/wikipedia/commons/e/e6/
Merge_sort_algorithm_diagram.svg
![Page 4: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/4.jpg)
Merge Sort
void sort(int a[], int n) {
int *t = malloc(n*sizeof(a[0]));
assert(t);
merge_sort(a, t, n);
free(t);
}
int main (void){
int a[] = {-10,2,14,-7,11,38};
int n = sizeof(a)/ sizeof(a[0]);
sort(a,n);
for (int i = 0; i < n; i++) {
printf("%d, ", a[i]);
}
printf("\n");
return 0;
}
![Page 5: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/5.jpg)
void merge_sort (int a[], int t[], int n) {
if (n <= 1) return;
int middle = n / 2;
int *lower = a;
int *upper = a + middle;
merge_sort(lower , t, middle );
merge_sort(upper , t, n - middle );
int i = 0; // lower index
int j = middle; // upper index
int k = 0; // temp index
while (i < middle && j < n) {
if (a[i] <= a[j]) t[k++] = a[i++];
else t[k++] = a[j++];
}
while (i < middle) t[k++] = a[i++];
while (j < n) t[k++] = a[j++];
for (i = 0; i < n; i++) a[i] = t[i];
}
![Page 6: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/6.jpg)
Runtime of Merge Sort
![Page 7: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/7.jpg)
Analysis
How much work is done by each instance?
• Two function calls of O(1)
• Copy the left and right into t[] is O(k) where k is the size ofthe current instance
• Copy t[] into a[] which is also O(k).
• Therefore, each instance of a merge sort is O(k).
![Page 8: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/8.jpg)
Continuing
So each instance is O(k) but how many instances are there at eachlevel?
• The number of bubbles per row is n/k . This follows since afterk = n/2` halves, we’ve created 2` bubbles and this is n/k .
• Thus, for each level, the total amount of work done isO(n/k · k) = O(n).
![Page 9: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/9.jpg)
Wrapping Up
• Finally, how many levels are there?
• To answer this, we are looking for a number m such thatn2m = 1.
• Solving gives m = log2(n).
• Hence, the total time is O(n log n).
• This analysis applies for the best, worst and average cases!
![Page 10: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/10.jpg)
Quick Sort
• Created by Tony Hoare in 1959 (or 1960)
• Basic Idea:
1. Pick a pivot element p in the array2. Partition the array into
< p p ≥ p
3. Recursively sort the two partitions.
• Key benefit: No temporary array!
![Page 11: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/11.jpg)
Tricky Point
• How do we pick the pivot and partition?
• We will discuss two such ways, the Lomuto Partition andchoosing a median of medians.
• The Lomuto Partition is usually easier to implement butdoesn’t preform as well in the worst case.
• The median of medians enhancement vastly improves theworst case runtime
• Note that Hoare’s original partitioning scheme has similarruntimes to Lomuto’s but is slightly harder to implement sowe won’t do so.
![Page 12: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/12.jpg)
Lomuto Partition
• Swaps left to right.
• Pivot is first element (Could by last element withmodifications below if desired).
• Have two index variables:
1. m which represents the last index in the partition < p2. i which represents the first index of the unpartitioned part.3. In other words
So elements with indices between 1 and m inclusive are < pand elements with indices
![Page 13: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/13.jpg)
Lomuto Partition Continued
There are two cases:
• If x < p, then increment m and swap a[m] with a[i] thenincrement i
• If x ≥ p, then just increment i .
• End case: When i = n, then we swap a[0] and a[m] so thatthe partition is in the middle.
Let’s see this in action
![Page 14: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/14.jpg)
Quick Sort Example
![Page 15: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/15.jpg)
Quick Sort Main
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
// Code here is on next slides
int main(void){
int a[] = {-10,2,14,-7,11,38};
const int n = sizeof(a)/ sizeof(a[0]);
quick_sort(a,n);
for (int i = 0; i < n; i++) {
printf("%d, ", a[i]);
}
printf("\n");
return 0;
}
![Page 16: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/16.jpg)
Swapping
void swap(int *a, int *b) {
int t = *a;
*a = *b;
*b = t;
}
![Page 17: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/17.jpg)
Quick Sort
void quick_sort(int a[], int n) {
if (n <= 1) return;
int m = 0;
for (int i = 1; i < n; i++) {
if (a[i] < a[0]) {
m++;
swap(&a[m],&a[i]);
}
}
// put pivot between partitions
// i.e, swap a[0] and a[m]
swap(&a[0],&a[m]);
quick_sort(a, m);
quick_sort(a + m + 1, n - m - 1);
}
![Page 18: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/18.jpg)
Time Complexity Analysis
Best Case:
• Ideally, each partition is split into (roughly) equal halves.
• Going down the recursion tree, we notice that the analysishere is almost identical to merge sort.
• There are log n levels and at each level we have O(n) work.
• Therefore, in the best case, the runtime is O(n log n).
![Page 19: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/19.jpg)
Time Complexity Analysis
Average Case:
• Again this isn’t quite easy to quantify but we’ll suppose ateach level, the pivot is between the 25th and 75th percentiles.
• Then, in this case, the worst partition is 3:1.
• Now, there are log4/3 n levels (solve (3/4)mn = 1) and ateach level we still have O(n) work.
• Therefore, in the average case, the runtime is O(n log n).
![Page 20: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/20.jpg)
Time Complexity Analysis
Worst Case:
• Worst case turns out to be very poor.
• What if the array were sorted (or reverse sorted)?
• Then the array is always partitioned into an array of size 1and an array of size len-1.
• At each level we still have O(n) work.
• Unfortunately, there are now n levels for a total runtime ofO(n2).
• To be slightly more accurate, note that at each level, we havea constant times
(n − 1) + (n − 2) + ... + 2 + 1 =n(n − 1)
2= O(n2)
amount of work.
![Page 21: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/21.jpg)
Well if that’s the case...
• ... then why is quick sort one of more frequently used sortingalgorithms?
• One of the reasons why is that we have other schemes thatcan help ensure that even in the worst case, we get O(n log n)as our runtime.
• The key idea is picking our pivot intelligently.
• Turns out while this does theoretically reduces our worst caseruntime, often in practice this isn’t done because it increasesour overhead of choosing a pivot.
![Page 22: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/22.jpg)
Idea
• What we’ll do is group the array into groups of five (a total ofn/5 such groups)
• Then we’ll take the median of each of those groups
• Now of these n/5 medians, repeat until you eventually findthe median of these medians. Call this m.
• Thus, with this m, we know that m is bigger than n/10medians and each of those medians was at least the size of 3other numbers (including itself) and so in total, we know thatm is bigger than 3n/10 of the numbers and similarly that it issmaller than 3n/10 of the numbers.
• In the worst case then, we split the array into an array with3n/10 elements and one with 7n/10 elements. The analysis issimilar to the average case from before.
• For a more formal proof take CS 341 (Algorithms)!
![Page 23: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/23.jpg)
Picture for 25 elements
a b c d ef g h i jk l m n op q r s tu v w x y
![Page 24: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/24.jpg)
Space Requirements for Quick Sort
• Partition uses only a constant number of variables (m, i andpossibly swapping)
• However recursive calls add to the stack.
• In the best case, we have 50/50 splits and in this case, thestack only has size log n
• In the worst case, we have 1 and n − 1 splits (where nchanges with the length of the array we are considering) andso we have a stack with size n.
![Page 25: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/25.jpg)
Picture of the splits
Best Case
quicksort(a,1)...
...quicksort(a,n/4)
quicksort(a,n/2)
quicksort(a,n)
stack
Worst Case
quicksort(a,1)...
...quicksort(a,n-5)
quicksort(a,n-4)
quicksort(a,n-3)
quicksort(a,n-2)
quicksort(a,n-1)
quicksort(a,n)
stack
![Page 26: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/26.jpg)
Optimizations
Tail Recursion
This is when the recursive call is the last action of the function
For example,
void quicksort(int a[], int n){
// Commands here ... no recursion
quicksort(a+m+1,n-m-1);
}
![Page 27: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/27.jpg)
Tail Call Elimination
• When the recursive call returns, its return is simply passed on.
• Therefore, the activation record of the caller can be reused bythe callee and thus the stack doesn’t grow.
• This is guaranteed by some languages like Scheme
• It can be enabled in gcc by using the -O2 optimization flag.
• With this and sorting the smallest partition first, the stackdepth in the worst case space complexity of quick sort isO(log n).
![Page 28: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/28.jpg)
Built in Sorting
• In practice, people almost never create their own sortingalgorithms as there is usually one built into the languagealready.
• For C, <stdlib.h> contains a library function called qsort
(Note: This isn’t necessarily quick sort!)
void qsort(void *base , size_t n, size_t ,size ,
int (* compare )( const void *a, const void *b));
![Page 29: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/29.jpg)
Description of Parameters
• base is the beginning of the array
• n is the length of the array
• size is the size of a byte in the array
• *compare is a function pointer to a comparison criteria.
![Page 30: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/30.jpg)
Compare Function
#include <stdlib.h>
int compare(const void *a, const void *b) {
int p = *(int *)a;
int q = *(int *)b;
if (p < q) return -1;
else if (p == q) return 0;
else return 1;
}
void sort(int a[], int n) {
qsort(a,n,sizeof(int),compare );
}
![Page 31: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/31.jpg)
Alternatively
int compare(const void *a, const void *b) {
int p = *(int *)a;
int q = *(int *)b;
return (p < q) ? -1 : ((q > p) ? 1 : 0);
}
![Page 32: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/32.jpg)
Binary Search
• Recall we started with linear searching which we said had aO(n) runtime.
• If we already have a sorted array, linear searching seems likewe’re not effectively using our information.
• We’ll discuss binary searching which will allow us to searchthrough a sorted array.
![Page 33: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/33.jpg)
Binary Searching
Basic Idea
• Check the middle element a[m]
• If we match with value, return this element
• If a[m] > value then search the lower half
• If a[m] < value then search the upper half
• Stop when lo > hi where lo is the lower index and hi is theupper index (start and end in the next graphic)
![Page 34: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/34.jpg)
Binary SearchLet’s find 6 in the following sorted array
https://puzzle.ics.hut.fi/ICS-A1120/2016/notes/
round-efficiency--binarysearch.html
![Page 35: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/35.jpg)
Binary Search
#include <stdio.h>
int search(int a[], int n, int value) {
int lo = 0, hi = n-1;
while (hi >= lo) {
//Note int m = (hi +lo)/2 is equivalent
//but may overflow.
//Same with (hi+lo)>>1;
int m = lo+(hi -lo)/2;
if (a[m] == value) return m;
if (a[m] < value) lo = m+1;
if (a[m] > value) hi = m-1;
}
return -1;
}
![Page 36: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/36.jpg)
Binary Searching
int main() {
int a[] = {-10,-7,0,2,11,14,38,42};
const int n = sizeof(a)/ sizeof(a[0]);
printf("%d\n", search(a,n ,10));
printf("%d\n", search(a,n ,11));
printf("%d\n", search(a,n, -100));
return 0;
}
![Page 37: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/37.jpg)
Tracing
Let’s trace the code with value = 10, 11 and -100
![Page 38: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/38.jpg)
Time Complexity of Binary Search
Worst Case (and Average Case) Analysis
• If we don’t find the element, then at each step we search onlyin half as many elements as before and searching takes onlyO(1) work.
• We can cut the array in half a total of O(log n) times.
• Thus, the total runtime is O(log n).
Best Case Analysis
• We find the element immediately and return taking only O(1)time.
Note this is extremely fast - even with 1 billion integers, we onlyneed at worst 30 probes.
![Page 39: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/39.jpg)
Sorting Summary
Selection sort
• Find the smallest and swap with the first
• Runtime for best, average and worst case: O(n2)
Insertion sort
• Find where element i goes and shift the rest to make room forelement i .
• Runtime for best case is O(n) and for average and worst case:O(n2)
![Page 40: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/40.jpg)
Sorting Summary
Merge sort
• Divide and conquer - Split in halves, recurse then merge.
• Runtime for best, average and worst case: O(n log n) but hasO(n) extra space.
Quick Sort
• Pick pivot, split elements into smaller and large than pivot,repeat.
• Runtime for best, average and worst case: O(n log n)
![Page 41: CS 137 Part 8cbruni/CS137Resources/lectures/CS137Part… · Binary Search Recall we started with linear searching which we said had a O(n) runtime. If we already have a sorted array,](https://reader033.fdocuments.us/reader033/viewer/2022053100/605c7600d000813d047e9546/html5/thumbnails/41.jpg)
Searching Summary
Linear Search
• Scan one by one until found
• Runtime for best case is O(1) and for, average and worst case:O(n)
Binary Search
• Probe middle and repeat (requires sorted array)
• Runtime for best case is O(1) and for, average and worst case:O(log n)