Internal Sorting 2 (users.encs.concordia.ca/~sthiel/coen352/03a_Internal_Sorting.pdf)
Internal Sorting 2
S. Thiel
Sorting
Quicksort
Mergesort
Heapsort
Radix Sort
References
1/31
Internal Sorting 2
S. Thiel
Department of Computer Science & Software Engineering, Concordia University
July 11, 2018
Outline
- Sorting
- Quicksort
- Mergesort
- Heapsort
- Radix Sort
- References
Sorting
- Three general-purpose sorting algorithms
- They sort by comparing
- Quicksort
- Mergesort
- Heapsort
Quicksort
- Pick a pivot
- Partition around the pivot
  - Items that are smaller/equal go before
  - Items that are larger go after
- Recursively apply Quicksort on each partition
Quicksort 2
- Quicksort works by partitioning an input in two, then sorting each half recursively
- The partition is made around a chosen "pivot"
- The left partition only has elements smaller than the "pivot"
- The right partition only has elements bigger than the "pivot"
- Each "partition" step takes Θ(N) operations
- How many "partition" steps are needed?
- Actually, each "partition" step takes gradually fewer operations... why?
Quicksort Analysis
- In the best case: Θ(n log n)
- In the average case: Θ(n log n)
- In the worst case: Θ(n²)
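The best and worst cases above can be read off the usual recurrences. This is a sketch that assumes one partition step over n items costs cn for some constant c (the constant c and the function T are notation introduced here, not taken from the slides).

```latex
% Best case: the pivot splits the input into two equal halves.
T(n) = 2\,T(n/2) + cn \;\Rightarrow\; T(n) = \Theta(n \log n)

% Worst case: the pivot is always the smallest or largest item,
% so one side is empty and the other has n - 1 items.
T(n) = T(n-1) + cn
     = c \sum_{i=1}^{n} i
     = c\,\frac{n(n+1)}{2}
     = \Theta(n^2)
```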
Quicksort Properties 1
- Quicksort...
  - is a divide and conquer algorithm
  - works best with good pivot selection
  - is recursive
  - puts a pivot in place every pass
  - is in-place
Quicksort Properties 2
- Is it stable?
- Some neat optimizations make it fast and stable with many duplicates
  - ... but it might be a bit slower otherwise
Quicksort flavors
- There are two "partitioning schemes"
  - Hoare (my preference)
    - two scanning indices
    - move towards each other till they swap or cross
    - swap when both point at an element on the wrong side (inversion)
  - Lomuto
    - two indices, but only one is scanning
    - swap out-of-place items to the beginning
    - less efficient (Internet says 3x as slow, let's see why)
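The Hoare scheme is the one used in the standard implementation later in the deck, so here is a minimal sketch of the Lomuto scheme for comparison. The class name `LomutoQuicksort`, the choice of the last element as pivot, and the index names `store` and `scan` are assumptions made for illustration.

```java
class LomutoQuicksort {
    // Lomuto partition: 'scan' is the only index that walks the range;
    // 'store' marks where the next smaller-than-pivot item should go.
    static int partition(int[] a, int lo, int hi) {
        int pivot = a[hi];               // assumption: last element is the pivot
        int store = lo;
        for (int scan = lo; scan < hi; scan++) {
            if (a[scan] < pivot) {
                swap(a, store, scan);    // swaps even when store == scan
                store++;
            }
        }
        swap(a, store, hi);              // put the pivot in its final place
        return store;
    }

    static void sort(int[] a, int lo, int hi) {
        if (lo >= hi) return;            // zero or one item: already sorted
        int p = partition(a, lo, hi);
        sort(a, lo, p - 1);
        sort(a, p + 1, hi);
    }

    static void swap(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }
}
```

Calling `LomutoQuicksort.sort(data, 0, data.length - 1)` sorts `data` in place. Note that every item smaller than the pivot triggers a swap, including items already on the correct side, which is one commonly cited reason this scheme tends to do more work than Hoare's.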
Quicksort Standard Approaches [1, p.242]
- Picking a pivot often uses median of three
- Diversion to Insertion Sort for small partitions
- Diversion to Heapsort if it looks like Θ(n²) is happening (Introsort, Musser)
- Tail recursion
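Three of the approaches above can be combined in a few lines. The following is only a sketch: the class name `TunedQuicksort`, the cutoff of 8, and the use of a Lomuto-style partition are assumptions chosen for brevity, and the Introsort fallback to Heapsort is left out.

```java
class TunedQuicksort {
    static final int CUTOFF = 8;   // assumed threshold for small partitions

    static void sort(int[] a) { quicksort(a, 0, a.length - 1); }

    static void quicksort(int[] a, int lo, int hi) {
        // Loop instead of making a second recursive call (tail recursion removed):
        // recurse into the smaller side, keep iterating on the larger one.
        while (hi - lo + 1 > CUTOFF) {
            int p = partition(a, lo, hi);
            if (p - lo < hi - p) { quicksort(a, lo, p - 1); lo = p + 1; }
            else                 { quicksort(a, p + 1, hi); hi = p - 1; }
        }
        insertionSort(a, lo, hi);  // small partitions go to Insertion Sort
    }

    static int partition(int[] a, int lo, int hi) {
        int mid = lo + (hi - lo) / 2;
        // Median of three: order a[lo], a[mid], a[hi] so the median sits at mid.
        if (a[mid] < a[lo]) swap(a, mid, lo);
        if (a[hi] < a[lo]) swap(a, hi, lo);
        if (a[hi] < a[mid]) swap(a, hi, mid);
        swap(a, mid, hi);          // move the median to the end as the pivot
        int pivot = a[hi], store = lo;
        for (int scan = lo; scan < hi; scan++) {
            if (a[scan] < pivot) swap(a, store++, scan);
        }
        swap(a, store, hi);        // pivot lands in its final position
        return store;
    }

    static void insertionSort(int[] a, int lo, int hi) {
        for (int i = lo + 1; i <= hi; i++) {
            int key = a[i], j = i - 1;
            while (j >= lo && a[j] > key) { a[j + 1] = a[j]; j--; }
            a[j + 1] = key;
        }
    }

    static void swap(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }
}
```

Recursing only into the smaller side keeps the recursion depth at O(log n) even when the pivots are bad, although the running time can still degrade to Θ(n²) without the Heapsort fallback.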
Standard Implementation 1
    void qsort(int[] A, int i, int j) { // Quicksort
      int pivotindex = findpivot(A, i, j);
      swap(A, pivotindex, j); // Stick pivot at end
      // k will be the first position in the right
      // subarray
      int k = partition(A, i-1, j, A[j]);
      swap(A, k, j); // Put pivot in place
      if ((k-i) > 1) qsort(A, i, k-1); // Sort left
      if ((j-k) > 1) qsort(A, k+1, j); // Sort right
    }
Standard Implementation 2
    int partition(int[] A, int l, int r, int pivot) {
      do { // Move bounds inward until they meet
        while (A[++l] < pivot);
        while ((r != 0) && (A[--r] > pivot));
        swap(A, l, r); // Swap out-of-place values
      } while (l < r); // Stop when they cross
      swap(A, l, r); // Reverse last, wasted swap
      return l; // Return first position in right partition
    }
Mergesort
- Splits input into halves repeatedly
- Merges halves back together
Mergesort Analysis
- In the best case: Θ(n log n)
- In the average case: Θ(n log n)
- In the worst case: Θ(n log n)
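All three cases coincide because Mergesort always splits in the middle, whatever the input looks like. A sketch of the recurrence, assuming that merging n items costs cn for some constant c (c and T are notation introduced here):

```latex
T(n) = 2\,T(n/2) + cn, \qquad T(1) = c

% Unrolling: there are \log_2 n levels of splitting,
% and the merges on each level touch all n items once.
T(n) = cn \log_2 n + cn = \Theta(n \log n)
```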
Figure: A Mergesort example [2].
Standard Implementation 1
    // Needs java.util.List and java.util.LinkedList.
    List<Integer> mergesort(List<Integer> inlist) {
      if (inlist.size() <= 1) return inlist;
      int mid = inlist.size() / 2;
      // Copy each half: subList returns a view, and merge empties its arguments
      List<Integer> L1 = new LinkedList<>(inlist.subList(0, mid));
      List<Integer> L2 = new LinkedList<>(inlist.subList(mid, inlist.size())); // end index is exclusive
      return merge(mergesort(L1), mergesort(L2));
    }
Standard Implementation 2
    // Needs java.util.List and java.util.LinkedList.
    List<Integer> merge(List<Integer> L1, List<Integer> L2) {
      List<Integer> L = new LinkedList<>();
      while (!L1.isEmpty() && !L2.isEmpty()) {
        if (L1.get(0) <= L2.get(0)) // Take from L1 on ties, which keeps the sort stable
          L.add(L1.remove(0));
        else L.add(L2.remove(0));
      }
      if (L1.isEmpty()) L.addAll(L2); // Append whichever list still has items
      if (L2.isEmpty()) L.addAll(L1);
      return L;
    }
Mergesort with Lists
- The code above looks nice with Lists
- Finding the halfway point of a List is costly
  - How costly? I say 2n. Why?
- We can alternate and skip finding halfway points for 1n
- We can use a List-of-Lists and just start merging sublists depth-first?
- We can use a List-of-Lists approach, finding inherent structure first?
  - Is the cost of merging lists of varying sizes worth it?
Mergesort with Arrays
- Using two arrays solves most Array-related issues
- Using two arrays implements pretty smoothly
- There is no cost to finding the halfway point
- You need an empty array, unlike Quicksort/Mergesort w/ lists
- Can you benefit from sorting existing runs?
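A minimal sketch of the two-array approach follows. The class name `ArrayMergesort` and the array name `temp` are assumptions for illustration; the second array is allocated once and reused by every merge.

```java
class ArrayMergesort {
    static void sort(int[] a) {
        int[] temp = new int[a.length];       // the extra "empty array"
        sort(a, temp, 0, a.length - 1);
    }

    private static void sort(int[] a, int[] temp, int lo, int hi) {
        if (lo >= hi) return;                 // zero or one item
        int mid = lo + (hi - lo) / 2;         // halfway point is plain index arithmetic
        sort(a, temp, lo, mid);
        sort(a, temp, mid + 1, hi);

        // Copy the two sorted halves out, then merge them back into a.
        for (int k = lo; k <= hi; k++) temp[k] = a[k];
        int i = lo, j = mid + 1;
        for (int k = lo; k <= hi; k++) {
            if (i > mid)                a[k] = temp[j++];  // left half used up
            else if (j > hi)            a[k] = temp[i++];  // right half used up
            else if (temp[j] < temp[i]) a[k] = temp[j++];
            else                        a[k] = temp[i++];  // ties come from the left: stable
        }
    }
}
```

`ArrayMergesort.sort(data)` sorts `data` in place from the caller's point of view, at the price of Θ(n) extra memory for `temp`.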
Mergesort with existing runs
- Does the best case change?
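One way to explore the question is to look at how a run-aware ("natural") Mergesort variant would start: with a single scan that finds the existing ascending runs. The slides do not name a specific variant, so the class `RunCounter` and its method are hypothetical helpers for illustration only.

```java
class RunCounter {
    // Counts maximal non-decreasing runs using a.length - 1 comparisons.
    static int countRuns(int[] a) {
        if (a.length == 0) return 0;
        int runs = 1;
        for (int i = 1; i < a.length; i++) {
            if (a[i] < a[i - 1]) runs++;   // a descent starts a new run
        }
        return runs;
    }
}
```

On already-sorted input the scan reports a single run after n − 1 comparisons and there is nothing left to merge, which suggests that such a variant would have a Θ(n) best case.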
Heapsort
- Build a heap in Θ(n)
- Take the value off the heap in Θ(1)
- Re-settle the heap in Θ(log n)
Heapsort Analysis
- In the best case: Θ(n log n)
- In the average case: Θ(n log n)
- In the worst case: Θ(n log n)
Heapsort Properties
- Can be done in-place
- Generally considered slower than Quicksort
- Effective when only the first few values in a list are needed
- Shaffer and most others show Top-Down Heapsort
- Bottom-Up Heapsort is twice as fast, faster with a bit of extra memory
- Works well when you have more data than fits in memory
Heap navigation
- We can use math to navigate the heap
  - iParent(i) = floor((i − 1)/2)
  - iLeftChild(i) = 2 ∗ i + 1
  - iRightChild(i) = 2 ∗ i + 2
- http://faculty.simpson.edu/lydia.sinapova/www/cmsc250/LN250_Weiss/L13-HeapSortEx.htm
- https://en.wikipedia.org/wiki/Heapsort
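The index formulas above are enough to write an in-place Heapsort over a plain array. This is a minimal sketch using a max-heap; the class name `Heapsort` and the helper `siftDown` are names chosen here for illustration.

```java
class Heapsort {
    static void sort(int[] a) {
        int n = a.length;
        // Build a max-heap: start at the parent of the last item,
        // iParent(n - 1) = (n - 2) / 2, and sift each node down.
        for (int i = (n - 2) / 2; i >= 0; i--) siftDown(a, i, n);

        // Repeatedly move the maximum to the end and re-settle the rest.
        for (int end = n - 1; end > 0; end--) {
            swap(a, 0, end);         // take the top value off the heap
            siftDown(a, 0, end);     // heap now covers a[0..end-1]
        }
    }

    static void siftDown(int[] a, int i, int size) {
        while (true) {
            int left = 2 * i + 1;    // iLeftChild(i)
            int right = 2 * i + 2;   // iRightChild(i)
            int largest = i;
            if (left < size && a[left] > a[largest]) largest = left;
            if (right < size && a[right] > a[largest]) largest = right;
            if (largest == i) return;   // heap property holds here
            swap(a, i, largest);
            i = largest;
        }
    }

    static void swap(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }
}
```

Stopping the second loop after k iterations leaves the k largest values, in order, at the end of the array, which is why the heap approach is attractive when only the first few values are needed.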
Two Types of Radix Sort
- Most Significant Digit looks good, but can actually be bad
- Least Significant Digit looks weird, but is actually good
- LSD Radix sort is how most of us sort cards
LSD Radix Sort Analysis
- In the best case: Θ(n)
- In the average case: Θ(n)
- In the worst case: Θ(n)
MSD Radix Sort Analysis
- In the best case: Θ(n)
- In the average case: Θ(L)
- In the worst case: Θ(L)
- What the heck is L?
LSD Radix Sort Code
Algorithm 1 LSD Radix Sort
Require: input, val()
Ensure: sorted input
    /* Initialization */
    buf ← initArray[input.length]
    counts ← initArray[passes][bucketCount]
    /* perform initial counting pass */
    for i in input do
        for j = 0 to passes − 1 do
            counts[j][bitsFor(val(i), j)] ← counts[j][bitsFor(val(i), j)] + 1
        end for
    end for
    /* convert counts to indices */
    for j = 0 to passes − 1 do
        convertToIndices(counts[j])
    end for
    /* deal to buffer based on current radix */
    for j = 0 to passes − 1 do
        for i in input do
            b ← bitsFor(val(i), j)
            buf[counts[j][b]] ← i
            counts[j][b] ← counts[j][b] + 1
        end for
        swap(input, buf)
    end for
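The algorithm above can be sketched in Java as follows. Several details are assumptions made for illustration: the keys are non-negative `int` values, each pass looks at 8 bits (so 4 passes and 256 buckets), `val()` is the identity, and `convertToIndices` is written inline as an exclusive prefix sum.

```java
class LsdRadixSort {
    static final int BITS = 8;                 // assumed radix: 8 bits per pass
    static final int BUCKETS = 1 << BITS;      // 256 buckets
    static final int PASSES = 32 / BITS;       // 4 passes cover a 32-bit int

    // Digit of v examined in the given pass (pass 0 is least significant).
    static int bitsFor(int v, int pass) {
        return (v >>> (pass * BITS)) & (BUCKETS - 1);
    }

    // Sorts non-negative ints in place (negative values are not handled here).
    static void sort(int[] input) {
        int[] buf = new int[input.length];
        int[][] counts = new int[PASSES][BUCKETS];

        // Initial counting pass: one scan fills the histograms for every pass.
        for (int v : input) {
            for (int j = 0; j < PASSES; j++) counts[j][bitsFor(v, j)]++;
        }

        // Convert counts to starting indices (exclusive prefix sums).
        for (int j = 0; j < PASSES; j++) {
            int sum = 0;
            for (int b = 0; b < BUCKETS; b++) {
                int c = counts[j][b];
                counts[j][b] = sum;
                sum += c;
            }
        }

        // Deal items into the other array, one digit per pass.
        int[] src = input, dst = buf;
        for (int j = 0; j < PASSES; j++) {
            for (int v : src) dst[counts[j][bitsFor(v, j)]++] = v;
            int[] t = src; src = dst; dst = t;  // swap roles of the two arrays
        }
        // PASSES is even, so the final pass wrote into 'input'.
    }
}
```

Each pass keeps items with equal digits in their previous relative order, and that stability is what lets the least significant digit go first.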
LSD Radix Sort Example
Figure: A Radix Sort Example.
LSD Radix Sort Analysis
- If the key size grows with n, not so good
  - we go back to Θ(n log n)
- But! If that is the case, then comparison sorts become:
  - Θ(n log n log n). Why?
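One way to reason about both claims, as a sketch: assume keys of k bits, a radix sort that examines r bits per pass, and a key comparison that costs time proportional to the key length in the worst case (k and r are symbols introduced here, not taken from the slides).

```latex
% LSD radix sort: \lceil k/r \rceil passes, each costing \Theta(n + 2^r).
T_{\text{radix}}(n) = \Theta\!\left(\frac{k}{r}\,(n + 2^r)\right)

% n distinct keys need at least \log_2 n bits, so k \ge \log_2 n.
% With r fixed and k = \Theta(\log n):
T_{\text{radix}}(n) = \Theta(n \log n)

% Comparison sort: \Theta(n \log n) comparisons, each costing \Theta(k) = \Theta(\log n).
T_{\text{cmp}}(n) = \Theta(n \log n \cdot \log n)
```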
References I
[1] Clifford A. Shaffer. Data Structures and Algorithm Analysis in Java. 2013.
[2] Wikipedia. Mergesort. https://en.wikipedia.org/wiki/Merge_sort, May 2017.