CSC 2300 Data Structures & Algorithms March 23, 2007 Chapter 7. Sorting.
Quicksort – Algorithm
1. If the number of elements in S is 0 or 1, then return.
2. Pick any element v in S. This is called the pivot.
3. Partition S – {v} into two disjoint groups:
S1 = { x ∈ S – {v} | x ≤ v}
and
S2 = { x ∈ S – {v} | x ≥ v}.
4. Return { quicksort(S1) followed by v followed by quicksort(S2)}.
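The four steps above can be sketched directly in Python. This is a minimal illustration rather than the in-place version developed on the following slides. Two choices here are assumptions, not part of the slide: the first element is used as the pivot, and elements equal to the pivot are sent to S2.

```python
def quicksort(s):
    """Return a new sorted list, following steps 1-4 of the slide."""
    if len(s) <= 1:                          # step 1: nothing to sort
        return list(s)
    v = s[0]                                 # step 2: pick a pivot (assumption: first element)
    rest = s[1:]                             # S - {v}
    s1 = [x for x in rest if x < v]          # step 3: elements smaller than the pivot
    s2 = [x for x in rest if x >= v]         # assumption: ties go to S2
    return quicksort(s1) + [v] + quicksort(s2)   # step 4


print(quicksort([8, 1, 4, 9, 6, 3, 5, 2, 7, 0]))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Building new lists at every level costs extra memory, which is one reason to partition in place instead.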
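Quicksort – Partition Strategy
Example. Input: 8, 1, 4, 9, 6, 3, 5, 2, 7, 0. Say 6 is chosen as pivot.
8 1 4 9 0 3 5 2 7 6   (pivot 6 swapped to the end; i at 8, j at 7)
8 1 4 9 0 3 5 2 7 6   (i stays at 8, j moves left to 2)
2 1 4 9 0 3 5 8 7 6   (8 and 2 swapped)
2 1 4 9 0 3 5 8 7 6   (i moves right to 9, j moves left to 5)
2 1 4 5 0 3 9 8 7 6   (9 and 5 swapped)
2 1 4 5 0 3 9 8 7 6   (i moves right to 9, j moves left to 3; j and i have crossed)
2 1 4 5 0 3 6 8 7 9   (pivot 6 swapped with the element at i)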
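The partition example can be traced with a short Python sketch. The function name partition and its pivot_index parameter are hypothetical, chosen for illustration. The sketch assumes a non-empty list, and both pointers stop on elements equal to the pivot.

```python
def partition(a, pivot_index):
    """Partition list a in place around a[pivot_index]; return the pivot's final index."""
    last = len(a) - 1
    a[pivot_index], a[last] = a[last], a[pivot_index]   # park the pivot at the end
    pivot = a[last]
    i, j = 0, last - 1
    while True:
        while i < last and a[i] < pivot:    # i skips elements smaller than the pivot
            i += 1
        while j > 0 and a[j] > pivot:       # j skips elements larger than the pivot
            j -= 1
        if i >= j:                          # pointers have met or crossed
            break
        a[i], a[j] = a[j], a[i]             # both elements are on the wrong side: swap
        i += 1
        j -= 1
    a[i], a[last] = a[last], a[i]           # move the pivot into its final position
    return i


a = [8, 1, 4, 9, 6, 3, 5, 2, 7, 0]
print(partition(a, 4), a)   # 6 [2, 1, 4, 5, 0, 3, 6, 8, 7, 9]
```

The final array agrees with the last line of the example: everything left of 6 is smaller and everything right of it is larger.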
Choices of Pivot
Four suggestions: First element of array; Larger of first two distinct elements of array; Middle element of array; Randomly.
What do you think about these choices? All bad choices. Why?
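One way to see why the first element is a poor pivot is to count comparisons on input that is already sorted. The sketch below is an illustration only: count_comparisons is a hypothetical helper built on a simple list-based quicksort, and the input size 200 and the shuffle seed are arbitrary.

```python
import random


def count_comparisons(s):
    """Count element-vs-pivot comparisons when the first element is always the pivot."""
    if len(s) <= 1:
        return 0
    v, rest = s[0], s[1:]
    s1 = [x for x in rest if x < v]
    s2 = [x for x in rest if x >= v]
    # Each of the len(rest) elements is compared with the pivot once.
    return len(rest) + count_comparisons(s1) + count_comparisons(s2)


n = 200
sorted_input = list(range(n))
shuffled_input = sorted_input[:]
random.Random(0).shuffle(shuffled_input)     # fixed seed so the run is repeatable

print(count_comparisons(sorted_input))       # 19900, which is n(n-1)/2
print(count_comparisons(shuffled_input))     # far fewer; exact value depends on the shuffle
```

On sorted input S1 is always empty, so every recursive call removes only the pivot.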
Good Choice of Pivot
Best choice: median of array. Disadvantage? Practical choice: Median of Three. What is it? Median of left, right, and center elements. Example: 8, 1, 4, 9, 6, 3, 5, 2, 7, 0. Median of 8, 6, and 0.
Example
Example: 8, 1, 4, 9, 6, 3, 5, 2, 7, 0. Pivot = Median of 8, 6, and 0. What should new array look like?
Recall what we have done:
8 1 4 9 0 3 5 2 7 6   (markers, left to right: i, j, pivot)
Can we do better?
0 1 4 9 6 3 5 2 7 8   (markers, left to right: i, pivot, j)
Where should we move pivot?
0 1 4 9 7 3 5 2 6 8   (markers, left to right: i, j, pivot)
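The two rearrangements in this example can be reproduced with a small median-of-three helper. This is a sketch under assumptions: the name median3 and its (a, left, right) parameters are illustrative, and the pivot is parked in the second-to-last position, as in the final array of the example.

```python
def median3(a, left, right):
    """Order a[left], a[center], a[right], then park the median at a[right - 1]."""
    center = (left + right) // 2
    if a[center] < a[left]:
        a[left], a[center] = a[center], a[left]
    if a[right] < a[left]:
        a[left], a[right] = a[right], a[left]
    if a[right] < a[center]:
        a[center], a[right] = a[right], a[center]
    # Now a[left] <= a[center] <= a[right]; a[center] is the median of the three.
    a[center], a[right - 1] = a[right - 1], a[center]   # move the pivot out of the way
    return a[right - 1]


a = [8, 1, 4, 9, 6, 3, 5, 2, 7, 0]
print(median3(a, 0, len(a) - 1), a)   # 6 [0, 1, 4, 9, 7, 3, 5, 2, 6, 8]
```

After this step the first element is no larger than the pivot and the last is no smaller, so the scan only needs to cover the positions strictly between the first element and the pivot.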
Quicksort – Analysis
Quicksort is recursive. We thus get a recurrence formula:
T(0) = T(1) = 1,
T(N) = T(i) + T(N – i – 1) + cN,
where i denotes the number of elements in S1. What value of i gives worst case? What value of i gives best case?
Worst Case Analysis
We have i = 0, always. What does that say about the pivot? Always the smallest element. Recurrence becomes
T(N) = T(0) + T(N – 1) + cN.
Ignore T(0), and get
T(N) = T(N – 1) + cN.
Hence
T(N – 1) = T(N – 2) + c(N – 1),
T(N – 2) = T(N – 3) + c(N – 2),
…
T(2) = T(1) + c(2).
Adding these equations, we get
T(N) = T(1) + c ∑_{i=2}^{N} i = 1 + c [ N(N+1)/2 – 1] = O(N^2).
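The worst-case closed form can be checked numerically. The sketch below iterates T(N) = T(N – 1) + cN from T(1) = 1 and compares the result with 1 + c[N(N+1)/2 – 1]; the function name worst_case and the choice c = 1 are assumptions made for illustration.

```python
def worst_case(n, c=1):
    """Iterate T(N) = T(N - 1) + c*N starting from T(1) = 1."""
    t = 1
    for k in range(2, n + 1):
        t += c * k
    return t


for n in (1, 2, 10, 100, 1000):
    closed_form = 1 + (n * (n + 1) // 2 - 1)    # 1 + c[N(N+1)/2 - 1] with c = 1
    assert worst_case(n) == closed_form
    print(n, worst_case(n))
```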
Best Case Analysis
We have i = N/2, always. What does that say about the pivot? Always the median. Recurrence becomes
T(N) = T(N/2) + T(N/2) + cN = 2 T(N/2) + cN.
Do you remember how to solve this recurrence? Divide by N to get
T(N)/N = T(N/2)/(N/2) + c.
Thus,
T(N/2)/(N/2) = T(N/4)/(N/4) + c,
T(N/4)/(N/4) = T(N/8)/(N/8) + c,
…
T(2)/2 = T(1)/1 + c.
Adding these log N equations, we get
T(N)/N = T(1)/1 + c log N,
and so
T(N) = N + c N log N = O(N log N).
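The best-case solution can be checked the same way when N is a power of 2. The sketch below evaluates T(N) = 2T(N/2) + cN with T(1) = 1 and compares it with N + cN log N, taking the logarithm base 2; the function name best_case and c = 1 are assumptions for illustration.

```python
def best_case(n, c=1):
    """Evaluate T(N) = 2*T(N/2) + c*N with T(1) = 1, for N a power of 2."""
    if n == 1:
        return 1
    return 2 * best_case(n // 2, c) + c * n


for k in range(11):
    n = 2 ** k                        # so log2(n) == k exactly
    assert best_case(n) == n + n * k  # N + c*N*log N with c = 1
    print(n, best_case(n))
```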
Average Case Analysis
Always much harder than worst and best cases. What can we assume about the pivot? Assume that each of the sizes for S1 is equally likely and thus has probability 1/N.
The average value of T(i) is thus (1/N) ∑_{j=0}^{N–1} T(j). What can we say about the value of T(N – i – 1)? It has the same average. Recurrence becomes
T(N) = (2/N) ∑_{j=0}^{N–1} T(j) + cN.
Does this recurrence look familiar? Yes: we saw it when we did an internal path length analysis in Chapter 4 (Trees).
Average Case Analysis
Recurrence:
T(N) = (2/N) ∑_{j=0}^{N–1} T(j) + cN.
How can we solve this recurrence? Divide by N? No, multiply by N! We get this recurrence:
N T(N) = 2 ∑_{j=0}^{N–1} T(j) + cN^2.
How do we get rid of the ∑ T(j)? We use this recurrence:
(N – 1)T(N – 1) = 2 ∑_{j=0}^{N–2} T(j) + c(N – 1)^2.
Subtracting one recurrence from the other, we get
NT(N) – (N – 1)T(N – 1) = 2 T(N – 1) + c(2N – 1).
Simplifying and dropping the insignificant –c term, we get
NT(N) = (N+1) T(N – 1) + 2cN.
Recurrence
Recurrence:
NT(N) = (N+1) T(N – 1) + 2cN.
How can we solve this recurrence? Divide by N? Divide by N+1? No, divide by N(N+1)! We get this recurrence:
T(N)/(N+1) = T(N – 1)/N + 2c/(N+1).
What to do now? We can telescope:
T(N – 1)/N = T(N – 2)/(N – 1) + 2c/N,
T(N – 2)/(N – 1) = T(N – 3)/(N – 2) + 2c/(N – 1),
…
T(2)/3 = T(1)/2 + 2c/3.
We get this solution:
T(N)/(N+1) = T(1)/2 + 2c ∑_{i=3}^{N+1} (1/i).
What does ∑_{i=3}^{N+1} (1/i) equal? Approximately ln(N+1) + γ – 3/2, where γ ≈ 0.577 is Euler's constant. So T(N)/(N+1) = O(log N), and we get T(N) = O(N log N).
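The average-case derivation can be sanity-checked numerically. The sketch below computes T(N) exactly from the full recurrence T(N) = (2/N) ∑ T(j) + cN, with j running from 0 to N – 1 and T(0) = T(1) = 1. It then confirms the subtraction step for N ≥ 3 and prints T(N)/(N ln N), which should stay bounded if T(N) = O(N log N). The function name average_case, the choice c = 1, and the upper limit 300 are assumptions for illustration.

```python
from fractions import Fraction
import math


def average_case(n_max, c=1):
    """Return [T(0), ..., T(n_max)] for T(N) = (2/N) * sum_{j<N} T(j) + c*N."""
    t = [Fraction(1), Fraction(1)]           # T(0) = T(1) = 1
    running_sum = t[0] + t[1]                # sum of T(0..N-1) for the next N
    for n in range(2, n_max + 1):
        t_n = Fraction(2, n) * running_sum + c * n
        t.append(t_n)
        running_sum += t_n
    return t


t = average_case(300)

# Subtraction step: N*T(N) - (N-1)*T(N-1) = 2*T(N-1) + c(2N - 1), valid for N >= 3.
for n in range(3, 301):
    assert n * t[n] - (n - 1) * t[n - 1] == 2 * t[n - 1] + (2 * n - 1)

# The ratio to N ln N stays bounded, as O(N log N) predicts.
for n in (10, 100, 300):
    print(n, float(t[n]) / (n * math.log(n)))
```

Exact fractions are used so that the identity can be tested with equality rather than a floating-point tolerance.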