Searching and Sorting - Tiziana Ligorio · Searching and Sorting Tiziana Ligorio...

Post on 13-Jun-2020

15 views 0 download

Transcript of Searching and Sorting - Tiziana Ligorio · Searching and Sorting Tiziana Ligorio...

Searching and SortingTiziana Ligorio

tligorio@hunter.cuny.edu

!1

Today’s Plan

Recap

Searching algorithms and their analysis

Sorting algorithms and their analysis

!2

Announcements

Questions?

!3

Searching

!4

Looking for something!In this discussion we will assume

searching for an element in an array

Linear searchMost intuitiveStart at first position and keep looking until you find it

int linearSearch(int a[], int size, int value){

for (int i = 0; i < size; i++) { if (a[i] == value) { return i; } } return-1;}

!5

How long does linear search take?

If you assume value is in the array and probability of finding it at any location is uniform, on average n/2

If value is not in the array (worst case) n

Either way it’s O(n)

!6

What if you know array is sorted? Can you do better than linear search?

!7

Lecture Activity

You are given a sorted array of integers.

How would you search for 115? ( try to do it in fewer than n steps: don’t search sequentially)

You can write pseudocode or succinctly explain your algorithm

!8

We have done this before! When?

!9

!10

Look in ?

Binary Search

!11

14 43 76 100 108 158 195 200 274 523 543 5993

Binary Search

!12

14 43 76 100 108 158 195 200 274 523 543 5993

Binary Search

!13

14 43 76 100 108 158 195 200 274 523 543 5993

Binary Search

!14

14 43 76 100 108 158 195 200 274 523 543 5993

Binary Search

!15

14 43 76 100 108 158 195 200 274 523 543 5993

Binary Search

!16

14 43 76 100 108 158 195 200 274 523 543 5993

Binary Search

!17

14 43 76 100 108 158 195 200 274 523 543 5993

Binary Search

What is happening here?

!18

Binary Search

What is happening here?

Size of search is cut in half at each step

!19

Binary Search

What is happening here?

Size of search is cut in half at each step

Let T(n) be the running time and assume n = 2kT(n) = T(n/2) + 1

!20

Simplification: assume n is a power of 2 so it can be

evenly divided in two partsThe running time

One comparison

Search lower OR upper half

Binary Search

What is happening here?

Size of search is cut in half at each step

Let T(n) be the running time and assume n = 2kT(n) = T(n/2) + 1 T(n/2) = T(n/4) +1

!21

One comparison

Search lower OR upper half of n/2

Binary Search

What is happening here?

Size of search is cut in half at each step

Let T(n) be the running time and assume n = 2kT(n) = T(n/2) + 1 T(n/2) = T(n/4) +1 T(n) = T(n/4) + 1 + 1

!22

Binary Search

What is happening here?

Size of search is cut in half at each step

Let T(n) be the running time and assume n = 2kT(n) = T(n/2) + 1 T(n) = T(n/4) + 2 . . .

!23

22 2

211

Binary Search

What is happening here?

Size of search is cut in half at each step

Let T(n) be the running time and assume n = 2kT(n) = T(n/2) + 1 T(n) = T(n/4) + 2 . . . T(n) = T(n/2k) + k

!24

Binary Search

What is happening here?

Size of search is cut in half at each step

Let T(n) be the running time and assume n = 2kT(n) = T(n/2) + 1 T(n) = T(n/4) + 2 . . . T(n) = T(n/2k) + k T(n) = T(1) + log2(n)

!25n/n = 1

The number to which I need to raise 2 to get n

And we said n = 2k

Binary Search

What is happening here?

Size of search is cut in half at each step

Let T(n) be the running time and assume n = 2kT(n) = T(n/2) + 1 T(n) = T(n/4) + 2 . . . T(n) = T(n/2k) + k T(n) = T(1) + log2(n)

!26

Binary search is O(log(n))

Sorting

!27

Rearranging a sequence into increasing (decreasing) order!

Several approaches

Can do it in may ways

What is the best way?

Let’s find out using Big-O

!28

Lecture Activity

Write pseudocode to sort an array.

!29

144376 100108158 195200 274523543 5993

There are many approaches to sorting We will look at some comparison-

based approaches here

!30

Selection Sort

!31

Selection Sort

!32

Find smallest element and move it at lowest position

Unsorted

Sorted

1st Pass

Selection Sort

!33

Find smallest element and move it at lowest position

Unsorted

Sorted

Swap

1st Pass

Selection Sort

!34

Find smallest element and move it at lowest position

Unsorted

Sorted

1st Pass

Selection Sort

!35

Find smallest element and move it at lowest position

Unsorted

Sorted

Unsorted

2nd Pass

Selection Sort

!36

Find smallest element and move it at lowest position

Unsorted

Sorted

Swap

2nd Pass

Selection Sort

!37

Find smallest element and move it at lowest position

Unsorted

Sorted

2nd Pass

Selection Sort

!38

Find smallest element and move it at lowest position

Unsorted

Sorted

3rd Pass

Selection Sort

!39

Find smallest element and move it at lowest position

Unsorted

Sorted

3rd Pass

Selection Sort

!40

Find smallest element and move it at lowest position

Unsorted

Sorted

4th Pass

Selection Sort

!41

Find smallest element and move it at lowest position

Unsorted

Sorted

4th Pass

Selection Sort

!42

Find smallest element and move it at lowest position

Unsorted

Sorted

5th Pass

Selection Sort

!43

Find smallest element and move it at lowest position

Unsorted

Sorted

Swap

5th Pass

Selection Sort

!44

Find smallest element and move it at lowest position

Unsorted

Sorted

5th Pass

Selection Sort

!45

Find smallest element and move it at lowest position

Unsorted

Sorted

6th Pass

Selection Sort

!46

Find smallest element and move it at lowest position

Unsorted

Sorted

Selection Sort

Find the smallest item and move it at position 1

Find the next-smallest item and move it at position 2

. . .

!47

Selection Sort Analysis

How much work?

Find smallest: look at n elements

!48

Selection Sort Analysis

How much work?

Find smallest: look at n elements

Find second smallest: look at n-1 elements

!49

Selection Sort Analysis

How much work?

Find smallest: look at n elements

Find second smallest: look at n-1 elements

Find third smallest: look at n-2 elements. . .

!50

Selection Sort Analysis

How much work?

Find smallest: look at n elements

Find second smallest: look at n-1 elements

Find third smallest: look at n-2 elements. . .

Total work: n + (n-1) + (n-2) + . . . +1

!51

!52

n + (n-1) + ... + 2 + 1

n

n + 1

= n(n+1) / 2

Selection Sort Analysis

T(n) = n(n+1) / 2 comparisons + n data moves = O( )?

!53

Selection Sort Analysis

T(n) = n(n+1) / 2 comparisons + n data moves = O( )?

T(n) = (n2+n) / 2 + n = O( )?

!54

Selection Sort Analysis

T(n) = n(n+1) / 2 comparisons + n data moves = O( )?

T(n) = (n2+n) / 2 + n = O( )?

!55

Ignore constant

Ignore non-dominant terms

Selection Sort Analysis

T(n) = n(n+1) / 2 comparisons + n data moves = O( )?

T(n) = (n2+n) / 2 + n = O( n2)

!56

Ignore constant

Ignore non-dominant terms

Selection Sort Analysis

T(n) = n(n+1) / 2 comparisons + n data moves = O( )?

T(n) = (n2+n) / 2 + n = O( n2)

Selection Sort run time is O( n2)

!57

template<class T>void selectionSort(T the_array[], int n) { // last = index of the last item in the subarray of items yet // to be sorted; // largest = index of the largest item found for (int last = n - 1; last >= 1; last--) { // At this point, the_array[last+1..n-1] is sorted, and its // entries are greater than those in the_array[0..last]. // Select the largest entry in the_array[0..last] int largest = findIndexOfLargest(the_array, last+1); // Swap the largest entry, the_array[largest], with // the_array[last] std::swap(the_array[largest], the_array[last]); } // end for } // end selectionSort

!58

template<class T>void selectionSort(T the_array[], int n) { // last = index of the last item in the subarray of items yet // to be sorted; // largest = index of the largest item found for (int last = n - 1; last >= 1; last--) { // At this point, the_array[last+1..n-1] is sorted, and its // entries are greater than those in the_array[0..last]. // Select the largest entry in the_array[0..last] int largest = findIndexOfLargest(the_array, last+1); // Swap the largest entry, the_array[largest], with // the_array[last] std::swap(the_array[largest], the_array[last]); } // end for } // end selectionSort

!59

PassO(n)

O(n)

O( n2)

Stability

A sorting algorithm is Stable if elements that are equal remain is same order relative to each other after sorting

!60

Selection Sort

!61

Find smallest element and move it at lowest position

Unsorted

Sorted

Selection Sort

!62

Find smallest element and move it at lowest position

Unsorted

Sorted

Selection Sort

!63

Find smallest element and move it at lowest position

Unsorted

Sorted

Swap

Selection Sort

!64

Find smallest element and move it at lowest position

Unsorted

Sorted

Unstable

Selection Sort Analysis

Execution time DOES NOT depend on initial arrangement of data => ALWAYS O( n2)

O( n2) comparisons

Good choice for small n and/or data moves are costly ( O( n ) data moves )

Unstable

!65

Understanding O(n2)

!66

14 43100 200 2743

T(n)

Understanding O(n2)

!67

14 43

76

100

108 158195

200 274

523 599

3

14 43100 200 2743

T(2n) ≈ 4T(n)

T(n)

(2n)2 = 4n2

Understanding O(n2)

!68

14 43

76

100

108 158195

200 274

523 599

3

14 43100 200 2743 11260 5642 932

T(n)

T(3n) ≈ 9T(n)

(3n)2 = 9n2

Understanding O(n2) on large input

If size of input increases by factor of 100Execution time increases by factor of 10,000 T(100n) = 10,000T(n)

!69

Understanding O(n2) on large input

If size of input increases by factor of 100Execution time increases by factor of 10,000 T(100n) = 10,000T(n)

Assume n = 100,000 and T(n) = 17 seconds Sorting 10,000,000 takes 10,000 longer

!70

Understanding O(n2) on large input

If size of input increases by factor of 100Execution time increases by factor of 10,000 T(100n) = 10,000T(n)

Assume n = 100,000 and T(n) = 17 seconds Sorting 10,000,000 takes 10,000 longer

Sorting 10,000,000 entries takes ≈ 2 days

Multiplying input by 100 to go from 17sec to 2 days!!!

!71

Raise your hand if you had Selection Sort

!72

Bubble Sort

!73

Bubble Sort

!74

Compare adjacent elements and if necessary swap them

Unsorted

Sorted

Bubble Sort

!75

Compare adjacent elements and if necessary swap them 1st Pass

Unsorted

Sorted

Bubble Sort

!76

Compare adjacent elements and if necessary swap them 1st Pass

Unsorted

Sorted

Swap

Bubble Sort

!77

Compare adjacent elements and if necessary swap them 1st Pass

Unsorted

Sorted

Bubble Sort

!78

Compare adjacent elements and if necessary swap them 1st Pass

Unsorted

Sorted

Swap

Bubble Sort

!79

Compare adjacent elements and if necessary swap them 1st Pass

Unsorted

Sorted

Bubble Sort

!80

Compare adjacent elements and if necessary swap them 1st Pass

Unsorted

Sorted

Swap

Bubble Sort

!81

Compare adjacent elements and if necessary swap them 1st Pass

Unsorted

Sorted

Bubble Sort

!82

Compare adjacent elements and if necessary swap them 1st Pass

Unsorted

Sorted

Bubble Sort

!83

Compare adjacent elements and if necessary swap them 1st Pass

Unsorted

Sorted

Swap

Bubble Sort

!84

Compare adjacent elements and if necessary swap them

End of1st Pass: Not sorted, but largest has “bubbled up” to its proper

position

Bubble Sort

!85

Compare adjacent elements and if necessary swap them

2nd Pass: Sort n-1

Bubble Sort

!86

Compare adjacent elements and if necessary swap them 2nd Pass

Unsorted

Sorted

Bubble Sort

!87

Compare adjacent elements and if necessary swap them 2nd Pass

Unsorted

Sorted

Bubble Sort

!88

Compare adjacent elements and if necessary swap them

Swap

2nd Pass

Unsorted

Sorted

Bubble Sort

!89

Compare adjacent elements and if necessary swap them 2nd Pass

Unsorted

Sorted

Bubble Sort

!90

Compare adjacent elements and if necessary swap them 2nd Pass

Unsorted

Sorted

Bubble Sort

!91

Compare adjacent elements and if necessary swap them

3rd Pass: Sort n-2

Bubble Sort

!92

Compare adjacent elements and if necessary swap them 3rd Pass

Unsorted

Sorted

Bubble Sort

!93

Compare adjacent elements and if necessary swap them 3rd Pass

SwapArray is sorted

But our algorithm doesn’t know It keeps on going

Unsorted

Sorted

Bubble Sort

!94

Compare adjacent elements and if necessary swap them 3rd Pass

Unsorted

Sorted

Bubble Sort

!95

Compare adjacent elements and if necessary swap them 3rd Pass

Unsorted

Sorted

Bubble Sort

!96

Compare adjacent elements and if necessary swap them

4th Pass: Sort n-3

Bubble Sort

!97

Compare adjacent elements and if necessary swap them 4th Pass

Unsorted

Sorted

Bubble Sort

!98

Compare adjacent elements and if necessary swap them 4th Pass

Unsorted

Sorted

Bubble Sort

!99

Compare adjacent elements and if necessary swap them

5th Pass: Sort n-4

Bubble Sort

!100

Compare adjacent elements and if necessary swap them 5th Pass

Unsorted

Sorted

Bubble Sort

!101

Compare adjacent elements and if necessary swap them Done!

Unsorted

Sorted

Bubble Sort Analysis

How much work?

First pass: n-1 comparisons and at most n-1 swaps

Second pass: n-2 comparisons and at most n-2 swaps

Third pass: n-3 comparisons and at most n-3 swaps. . .

Total work: (n-1) + (n-2) + . . . +1

!102

!103

n + (n-1) + ... + 2 + 1

n

n + 1

= n(n+1) / 2 (n-1) + (n-2) + . . . + 2 + 1 = n(n-1)/2

(n-1)

n

Bubble Sort Analysis

T(n) = n(n-1) / 2 comparisons + n(n-1) / 2 swaps = O( )?

A swap is usually more than one operation but this simplification does not change the analysis

T(n) = 2( n(n-1) / 2 )= O( )?

!104

Bubble Sort Analysis

T(n) = n(n-1) / 2 comparisons + n(n-1) / 2 swaps = O( )?

A swap is usually more than one operation but this simplification does not change the analysis

T(n) = 2( n(n-1) / 2 )= O( )?

T(n) = 2( (n2-n) / 2 )= O( )?

!105

Bubble Sort Analysis

T(n) = n(n-1) / 2 comparisons + n(n-1) / 2 swaps = O( )?

A swap is usually more than one operation but this simplification does not change the analysis

T(n) = 2( n(n-1) / 2 )= O( )?

T(n) = 2( (n2-n) / 2 )= O( )?

T(n) = n2-n = O( )?

!106Ignore non-dominant terms

Bubble Sort Analysis

T(n) = n(n-1) / 2 comparisons + n(n-1) / 2 swaps = O( )?

A swap is usually more than one operation but this simplification does not change the analysis

T(n) = 2( n(n-1) / 2 )= O( )?

T(n) = 2( (n2-n) / 2 )= O( )?

T(n) = n2-n = O( n2)

Bubble Sort run time is O(n2)!107

Optimize!

Easy to check: if there are no swaps in any given pass stop because it is sorted

!108

Bubble Sort

!109

Compare adjacent elements and if necessary swap them

Bubble Sort

!110

Compare adjacent elements and if necessary swap them

Bubble Sort

!111

Compare adjacent elements and if necessary swap them

Bubble Sort

!112

Compare adjacent elements and if necessary swap them

Bubble Sort

!113

Compare adjacent elements and if necessary swap them

Bubble Sort

!114

Compare adjacent elements and if necessary swap them

Bubble Sort

!115

Compare adjacent elements and if necessary swap them

template<class T> void bubbleSort(T the_array[], int n) { bool sorted = false; // False when swaps occur int pass = 1; while (!sorted && (pass < n)) { // At this point, the_array[n+1-pass..n-1] is sorted // and all of its entries are > the entries in the_array[0..n-pass] sorted = true; // Assume sorted for (int index = 0; index < n - pass; index++) { // At this point, all entries in the_array[0..index-1] // are <= the_array[index] int nextIndex = index + 1; if (the_array[index] > the_array[nextIndex]) { // Exchange entries std::swap(the_array[index], the_array[nextIndex]); sorted = false; // Signal exchange } // end if } // end for // Assertion: the_array[0..n-pass-1] < the_array[n-pass] pass++; } // end while } // end bubbleSort

!116

template<class T> void bubbleSort(T the_array[], int n) { bool sorted = false; // False when swaps occur int pass = 1; while (!sorted && (pass < n)) { // At this point, the_array[n+1-pass..n-1] is sorted // and all of its entries are > the entries in the_array[0..n-pass] sorted = true; // Assume sorted for (int index = 0; index < n - pass; index++) { // At this point, all entries in the_array[0..index-1] // are <= the_array[index] int nextIndex = index + 1; if (the_array[index] > the_array[nextIndex]) { // Exchange entries std::swap(the_array[index], the_array[nextIndex]); sorted = false; // Signal exchange } // end if } // end for // Assertion: the_array[0..n-pass-1] < the_array[n-pass] pass++; } // end while } // end bubbleSort

!117

PassO(n)

O(n)

O( n2)

Bubble Sort Analysis

Execution time DOES depend on initial arrangement of data

Worst case: O(n2) comparisons and data moves

Best case: O(n) comparisons and data moves

Stable

If array is already sorted bubble sort will stop after first pass and no swaps => good choice for small n and data likely somewhat sorted

!118

Raise your hand if you had Bubble Sort

!119

https://www.youtube.com/watch?v=lyZQPjUT5B4

!120

Insertion Sort

!121

Insertion Sort

!122

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

1st Pass

Insertion Sort

!123

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

1st Pass

Insertion Sort

!124

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

Swap

1st Pass

Insertion Sort

!125

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

1st Pass

Insertion Sort

!126

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

Insertion Sort

!127

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

2nd Pass

Insertion Sort

!128

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

Swap

2nd Pass

Insertion Sort

!129

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

2nd Pass

Insertion Sort

!130

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

2nd Pass

Insertion Sort

!131

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

Insertion Sort

!132

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

3rd Pass

Insertion Sort

!133

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

Swap

3rd Pass

Insertion Sort

!134

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

3rd Pass

Insertion Sort

!135

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

Swap

3rd Pass

Insertion Sort

!136

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

3rd Pass

Insertion Sort

!137

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

Swap

3rd Pass

Insertion Sort

!138

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

3rd Pass

Insertion Sort

!139

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

Insertion Sort

!140

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

4th Pass

Insertion Sort

!141

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

4th Pass

Insertion Sort

!142

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

Insertion Sort

!143

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

5th Pass

Insertion Sort

!144

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

Swap

5th Pass

Insertion Sort

!145

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

5th Pass

Insertion Sort

!146

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

5th Pass

Insertion Sort

!147

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

Insertion Sort Analysis

How much work?

First pass: 1 comparison and at most 1 swap

Second pass: at most 2 comparisons and at most 2 swaps

Third pass: at most 3 comparisons and at most 3 swaps. . .

Total work: 1 + 2 + 3 + . . . + (n-1)

!148

!149

n + (n-1) + ... + 2 + 1

n

n + 1

= n(n+1) / 2 1 + 2 + . . . (n-2) + (n-1) = n(n-1)/2

(n-1)

n

Insertion Sort Analysis

T(n) = n(n-1) / 2 comparisons + n(n-1) / 2 swaps = O( )?

T(n) = 2( (n2-n) / 2 )= O( )?

T(n) = n2-n = O( n2)

Insertion Sort run time is O(n2)

!150

Insertion Sort

!151

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

Insertion Sort

!152

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

Insertion Sort

!153

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

Insertion Sort

!154

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

Insertion Sort

!155

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

Insertion Sort

!156

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

Insertion Sort

!157

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

Insertion Sort

!158

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

Insertion Sort

!159

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

Insertion Sort

!160

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

Insertion Sort

!161

Pick first element in unsorted region and put it in right place

in sorted region

Unsorted

Sorted

Insertion Sort Analysis

Execution time DOES depend on initial arrangement of data

Worst case: O( n2) comparisons and data moves

Best case: O(n) comparisons and data moves

Stable

If array is already sorted Insertion sort will do only n comparisons and no swaps => good choice for small n and data likely somewhat sorted

!162

template<class T> void insertionSort(T the_array[], int n) { // unsorted = first index of the unsorted region, // loc = index of insertion in the sorted region, // next_item = next item in the unsorted region. // Initially, sorted region is the_array[0], // unsorted region is the_array[1..n-1]. // In general, sorted region is the_array[0..unsorted-1], // unsorted region the_array[unsorted..n-1] for (int unsorted = 1; unsorted < n; unsorted++) { // At this point, the_array[0..unsorted-1] is sorted. // Find the right position (loc) in the_array[0..unsorted] // for the_array[unsorted], which is the first entry in the // unsorted region; shift, if necessary, to make room T next_item = the_array[unsorted]; int loc = unsorted; while ((loc > 0) && (the_array[loc - 1] > next_item)) { // Shift the_array[loc - 1] to the right the_array[loc] = the_array[loc - 1]; loc--; } // end while // At this point, the_array[loc] is where next_item belongs the_array[loc] = next_item; // Insert next_item into sorted region } // end for } // end insertionSort

!163

template<class T> void insertionSort(T the_array[], int n) { // unsorted = first index of the unsorted region, // loc = index of insertion in the sorted region, // next_item = next item in the unsorted region. // Initially, sorted region is the_array[0], // unsorted region is the_array[1..n-1]. // In general, sorted region is the_array[0..unsorted-1], // unsorted region the_array[unsorted..n-1] for (int unsorted = 1; unsorted < n; unsorted++) { // At this point, the_array[0..unsorted-1] is sorted. // Find the right position (loc) in the_array[0..unsorted] // for the_array[unsorted], which is the first entry in the // unsorted region; shift, if necessary, to make room T next_item = the_array[unsorted]; int loc = unsorted; while ((loc > 0) && (the_array[loc - 1] > next_item)) { // Shift the_array[loc - 1] to the right the_array[loc] = the_array[loc - 1]; loc--; } // end while // At this point, the_array[loc] is where next_item belongs the_array[loc] = next_item; // Insert next_item into sorted region } // end for } // end insertionSort

!164

PassO(n)

O(n)

O( n2)

Raise your hand if you had Insertion Sort

!165

What we have so far

!166

Worst Case Best Case

Selection Sort O( n2 ) O( n2 )

Bubble Sort O( n2 ) O( n )

Insertion Sort O( n2 ) O( n )

Lecture Activity

Sort the array using Insertion SortShow the entire array after each comparison/swap operation and at each step mark clearly the division between the sorted and unsorted portions of the array

!167

5 8 3 4 9

5 8 3 4 9

Pick first element in unsorted region and put it in right place in sorted region

2 7

2 7

!168

5 8 3 4 9

5 8 3 4 9

2 7

2 7

5 3 8 4 9

3 5 8 4 9

2 7

2 7

3 5 4 8 9 2 7

3 4 5 8 9 2 7

3 4 5 8 9 2 7

3 4 5 8 2 9 7

3 4 5 2 8 9 7

3 4 2 5 8 9 7

3 2 4 5 8 9 7

2 3 4 5 8 9 7

2 3 4 5 8 7 9

2 3 4 5 7 8 9

!169

https://www.toptal.com/developers/sorting-algorithms

Can we do better?

!170

Can we do better?

!171

Divide and Conquer!!!

Merge Sort

!172

Understanding O(n2)

!173

76108 158195523 59914 43100 200 2743 11260 5642 932

T(n)

Understanding O(n2)

!174

76108 158195523 59914 43100 200 2743 11260 5642 932

T(n)

10852314 43100 200 2743 76 158195 599 11260 5642 932

Understanding O(n2)

!175

76108 158195523 59914 43100 200 2743 11260 5642 932

T(n)

10852314 43100 200 2743 76 158195 599 11260 5642 932

T(1/2n) T(1/2n)

Understanding O(n2)

!176

76108 158195523 59914 43100 200 2743 11260 5642 932

T(n)

10852314 43100 200 2743 76 158195 599 11260 5642 932

T(1/2n) T(1/2n)

(n/2)2 = n2/4

Understanding O(n2)

!177

76108 158195523 59914 43100 200 2743 11260 5642 932

T(n)

10852314 43100 200 2743 76 158195 599 11260 5642 932

(n/2)2 = n2/4

T(1/2n) ≈ 1/4 T(n) T(1/2n) ≈ 1/4 T(n)

Understanding O(n2)

!178

76108 158195523 59914 43100 200 2743 11260 5642 932

T(n)

108 52314 43 100 200 2743 76 158 195 59911 2605 642 932

(n/2)2 = n2/4

T(1/2n) ≈ 1/4 T(n) T(1/2n) ≈ 1/4 T(n)

Key Insight: Merge is linear

!179

2 1 34 57 68 910

Key Insight: Merge is linear

!180

2 1 34 57 68 910

Key Insight: Merge is linear

!181

2 34 57 68 910

1

Key Insight: Merge is linear

!182

2 34 57 68 910

1

Key Insight: Merge is linear

!183

34 57 68 910

21

Key Insight: Merge is linear

!184

34 57 68 910

21

Key Insight: Merge is linear

!185

4 57 68 910

21 3

Key Insight: Merge is linear

!186

4 57 68 910

21 3

Key Insight: Merge is linear

!187

57 68 910

21 3 4

Key Insight: Merge is linear

!188

57 68 910

21 3 4

Key Insight: Merge is linear

!189

7 68 910

21 3 4 5

Key Insight: Merge is linear

!190

7 68 910

21 3 4 5

Key Insight: Merge is linear

!191

7 8 910

21 3 4 5 6

Key Insight: Merge is linear

!192

7 8 910

21 3 4 5 6

Key Insight: Merge is linear

!193

8 910

21 3 4 5 76

Key Insight: Merge is linear

!194

8 910

21 3 4 5 76

Key Insight: Merge is linear

!195

910

21 3 4 5 76 8

Key Insight: Merge is linear

!196

910

21 3 4 5 76 8

Key Insight: Merge is linear

!197

10

21 3 4 5 76 8 9

Key Insight: Merge is linear

!198

10

21 3 4 5 76 8 9

Key Insight: Merge is linear

!199

21 3 4 5 76 8 9 10

Key Insight: Merge is linear

!200

21 3 4 5 76 8 9 10

Each step makes one comparison and reduces the number of elements to

be merged by 1. If there are n total elements to be

merged, merging is O(n)

Divide and Conquer

!201

76108 158195523 59914 43100 200 2743 11260 5642 932

T(n)

108 52314 43 100 200 27476 158 195 59911 26064 932

T(1/2n) ≈ 1/4 T(n) T(1/2n) ≈ 1/4 T(n)

Divide and Conquer

!202

76108 158195523 59914 43100 200 2743 11260 5642 932

T(n)

108 52314 43 100 200 2743 76 158 195 59911 2605 642 932

Speed up insertion sort by a factor of two by splitting in half, sorting separately and merging results!

T(1/2n) ≈ 1/4 T(n) T(1/2n) ≈ 1/4 T(n)

108 52314 43 100 200 27476 158 195 59911 26064 932

T(n) ≈ 1/2 T(n) + n

Divide and Conquer

Splitting in two gives 2x improvement.

!203

Divide and Conquer

Splitting in two gives 2x improvement.

Splitting in four gives 4x improvement.

!204

Divide and Conquer

Splitting in two gives 2x improvement.

Splitting in four gives 4x improvement.

Splitting in eight gives 8x improvement.

!205

Divide and Conquer

Splitting in two gives 2x improvement.

Splitting in four gives 4x improvement.

Splitting in eight gives 8x improvement.

What if we never stop splitting?

!206

Merge Sort

!207

76108 158195523 59914 43 200 2743 11260 642 932

7610852314 43 200 2743 158195 599 11260 642 932

14 43 2003 76108523274 158195 599 2 11260 64 932

14 3 43 200 523274 76108 195 599 158 2 11260 64 932

14 3 43 200 523274 76108 195 599 158 2 11260 64 932

Merge Sort

!208

76108 158195523 59914 43 200 2743 11260 642 932

7610852314 43 200 2743 158195 599 11260 642 932

14 43 2003 76108523274 158195 599 2 11260 64 932

14 3 43 200 523274 76108 195 599 158 2 11260 64 932

14 3 43 200 523274 76108 195 599 158 2 11260 64 932

Merge Sort

!209

76108 158195523 59914 43 200 2743 11260 642 932

7610852314 43 200 2743 158195 599 11260 642 932

14 43 2003 76108523274 158195 599 2 11260 64 932

3 14 43 200 523274 10876 195 599 2 158 2611 64 932

14 3 43 200 523274 76108 195 599 158 2 11260 64 932

Merge Sort

!210

76108 158195523 59914 43 200 2743 11260 642 932

7610852314 43 200 2743 158195 599 11260 642 932

3 43 20014 52327410876 1952 158 599 2611 64 932

3 14 43 200 523274 10876 195 599 2 158 2611 64 932

14 3 43 200 523274 76108 195 599 158 2 11260 64 932

Merge Sort

!211

76108 158195523 59914 43 200 2743 11260 642 932

5232742003 43 76 10814 262 11 195158 59964 932

3 43 20014 52327410876 1952 158 599 2611 64 932

3 14 43 200 523274 10876 195 599 2 158 2611 64 932

14 3 43 200 523274 76108 195 599 158 2 11260 64 932

Merge Sort

!212

7664 19510843 1582 11 14 263 523274 599200 932

5232742003 43 76 10814 262 11 195158 59964 932

3 43 20014 52327410876 1952 158 599 2611 64 932

3 14 43 200 523274 10876 195 599 2 158 2611 64 932

14 3 43 200 523274 76108 195 599 158 2 11260 64 932

Merge Sort

!213

7664 19510843 1582 11 14 263 523274 599200 932

5232742003 43 76 10814 262 11 195158 59964 932

3 43 20014 52327410876 1952 158 599 2611 64 932

3 14 43 200 523274 10876 195 599 2 158 2611 64 932

14 3 43 200 523274 76108 195 599 158 2 11260 64 932

Merge Sort Analysis

!214

7664 19510843 1582 11 14 263 523274 599200 932

5232742003 43 76 10814 262 11 195158 59964 932

3 43 20014 52327410876 1952 158 599 2611 64 932

3 14 43 200 523274 10876 195 599 2 158 2611 64 932

14 3 43 200 523274 76108 195 599 158 2 11260 64 932

O(n)

O(n)

O(n)

O(n)

O(n)

Merge Sort Analysis

!215

7664 19510843 1582 11 14 263 523274 599200 932

5232742003 43 76 10814 262 11 195158 59964 932

3 43 20014 52327410876 1952 158 599 2611 64 932

3 14 43 200 523274 10876 195 599 2 158 2611 64 932

14 3 43 200 523274 76108 195 599 158 2 11260 64 932

Merge how many times?

n

n/2

n/4

. . .

n/2k

Merge Sort Analysis

!216

7664 19510843 1582 11 14 263 523274 599200 932

5232742003 43 76 10814 262 11 195158 59964 932

3 43 20014 52327410876 1952 158 599 2611 64 932

3 14 43 200 523274 10876 195 599 2 158 2611 64 932

14 3 43 200 523274 76108 195 599 158 2 11260 64 932

Merge how may times? n/2k = 1 n = 2k

log2 n = k

n

n/2

n/4

. . .

n/2k

Merge Sort Analysis

!217

7664 19510843 1582 11 14 263 523274 599200 932

5232742003 43 76 10814 262 11 195158 59964 932

3 43 20014 52327410876 1952 158 599 2611 64 932

3 14 43 200 523274 10876 195 599 2 158 2611 64 932

14 3 43 200 523274 76108 195 599 158 2 11260 64 932

n

n/2

n/4

. . .

n/2k

Merge n elements log2 n times

Merge Sort Analysis

!218

7664 19510843 1582 11 14 263 523274 599200 932

5232742003 43 76 10814 262 11 195158 59964 932

3 43 20014 52327410876 1952 158 599 2611 64 932

3 14 43 200 523274 10876 195 599 2 158 2611 64 932

14 3 43 200 523274 76108 195 599 158 2 11260 64 932

O(n log n )

O(n)

O(n)

O(n)

O(n)

O(n)

Merge Sort

How would you code this?

!219

Merge Sort

How would you code this?

Hint: Divide and Conquer!!!

!220

Merge Sort

void mergeSort(array){ if array size <= 1 return //base case split array into left_array and right_array mergeSort(left_array) mergeSort(right_array) merge(left_array, right_array, sorted_array)}

!221

Merge Sort Analysis

Execution time does NOT depend on initial arrangement of data

Worst Case: O( n log n) comparisons and data moves

Best Case: O( n log n) comparisons and data moves

Stable

Best we can do with comparison-based sorting that does not rely on a data structure in the worst case => can’t beat O( n log n)

Space overhead: auxiliary array at each merge step

!222

What we have so far

!223

Worst Case Best Case

Selection Sort O( n2 ) O( n2 )

Insertion Sort O( n2 ) O( n )

Bubble Sort O( n2 ) O( n )

Merge Sort O( n log n ) O( n log n )

Quick Sort

!224

Quick Sort

!225

Select a pivot. Arrange other entries s.t. entries in left partition are ≤ pivot

and entries in right partition are > pivot

<= pivot

> pivot

Partition

Quick Sort

!226

pivot

<= pivot

> pivot

Partition

Quick Sort

!227

pivot

Select a pivot. Arrange other entries s.t. entries in left partition are ≤ pivot

and entries in right partition are > pivot

<= pivot

> pivot

Partition

Quick Sort

!228

pivot

Select a pivot. Arrange other entries s.t. entries in left partition are ≤ pivot

and entries in right partition are > pivot

<= pivot

> pivot

Partition

Quick Sort

!229

pivot

Select a pivot. Arrange other entries s.t. entries in left partition are ≤ pivot

and entries in right partition are > pivot

<= pivot

> pivot

swap

Partition

Quick Sort

!230

pivot

Select a pivot. Arrange other entries s.t. entries in left partition are ≤ pivot

and entries in right partition are > pivot

<= pivot

> pivot

Partition

Quick Sort

!231

pivot

Select a pivot. Arrange other entries s.t. entries in left partition are ≤ pivot

and entries in right partition are > pivot

<= pivot

> pivot

Partition

Quick Sort

!232

pivot

Select a pivot. Arrange other entries s.t. entries in left partition are ≤ pivot

and entries in right partition are > pivot

<= pivot

> pivot

Partition

Quick Sort

!233

pivot

Select a pivot. Arrange other entries s.t. entries in left partition are ≤ pivot

and entries in right partition are > pivot

<= pivot

> pivot

swap

Partition

Quick Sort

!234

pivot

Select a pivot. Arrange other entries s.t. entries in left partition are ≤ pivot

and entries in right partition are > pivot

<= pivot

> pivot

Partition

Quick Sort

!235

pivot

Select a pivot. Arrange other entries s.t. entries in left partition are ≤ pivot

and entries in right partition are > pivot

<= pivot

> pivot

Partition

Quick Sort

!236

pivot

Select a pivot. Arrange other entries s.t. entries in left partition are ≤ pivot

and entries in right partition are > pivot

<= pivot

> pivot

Partition

Quick Sort

!237

pivot

Select a pivot. Arrange other entries s.t. entries in left partition are ≤ pivot

and entries in right partition are > pivot

<= pivot

> pivot

Partition

Quick Sort

!238

pivot

Select a pivot. Arrange other entries s.t. entries in left partition are ≤ pivot

and entries in right partition are > pivot

<= pivot

> pivot

swap

Partition

Quick Sort

!239

pivot

Select a pivot. Arrange other entries s.t. entries in left partition are ≤ pivot

and entries in right partition are > pivot

<= pivot

> pivot

Partition

Quick Sort

!240

Select a pivot. Arrange other entries s.t. entries in left partition are ≤ pivot

and entries in right partition are > pivot

<= pivot

> pivot

≤ pivot > pivotquickSort( ) quickSort( )

Partition

Quick Sort Analysis

Divide and Conquer

n comparisons for each partition

How many subproblems? => Depends on pivot selection

Ideally partition divides problem into two n/2 subproblems for logn recursive calls (Best case)

Possibly (though unlikely) each partition has 1 empty subarray for n recursive calls (Worst case)

!241

template<class T> void quickSort(T the_array[], int first, int last) { if (last - first + 1 < MIN_SIZE) { insertionSort(the_array, first, last); } else { // Create the partition: S1 | Pivot | S2 int pivot_index = partition(the_array, first, last); // Sort subarrays S1 and S2 quickSort(the_array, first, pivot_index - 1); quickSort(the_array, pivotIndex + 1, last); } // end if } // end quickSort

!242

How to select pivot?

!243

How to select pivot?

Ideally median Need to sort array to find median

Other ideas?

!244

How to select pivot?

Ideally median Need to sort array to find median

Other ideas? Pick first, middle, last position and order them making middle the pivot

!245

695 13

How to select pivot?

Ideally median Need to sort array to find median

Other ideas? Pick first, middle, last position and order them making middle the pivot

!246

136 95

pivot

Quick Sort AnalysisExecution time DOES depend on initial arrangement of data AND on PIVOT SELECTION (luck?) => on random data can be faster than Merge Sort

Optimization (e.g. smart pivot selection, speed up base case, iterative instead of recursive implementation) can improve actual runtime -> fastest comparison-based sorting algorithm on average

Worst Case: O( n2) comparisons and data moves

Best Case: O( n log n) comparisons and data moves

Unstable

!247

!248

Worst Case Best Case

Selection Sort O( n2 ) O( n2 )

Insertion Sort O( n2 ) O( n )

Bubble Sort O( n2 ) O( n )

Merge Sort O( n log n ) O( n log n )

Quick Sort O( n2 ) O( n log n )

!249

https://www.toptal.com/developers/sorting-algorithms

!250

https://www.youtube.com/watch?v=kPRA0W1kECg