Algorithms and Data Structures (INF1)
Transcript of Lecture 7/15
Hua Lu
Department of Computer Science, Aalborg University
Fall 2007
This Lecture
• Merge sort
• Quick sort
• Radix sort
• Summary
• We will see more complex techniques than those used by Bubble Sort, Selection Sort and Insertion Sort.
Divide and Conquer
• Divide-and-conquer is an important algorithm design technique for large-size problems.
• If the problem size is small enough to solve the problem in a straightforward manner, solve it.
• Otherwise:
  Divide: divide the problem into two or more disjoint sub-problems.
  Conquer: use divide-and-conquer recursively to solve the sub-problems.
  Combine: take the solutions to the sub-problems and combine them into a solution for the original problem.
Merge Sort
• Problem: sort an array A into non-descending order.
• Divide: if A has at least two elements (nothing needs to be done if A has zero or one element), remove all the elements from A and put them into two arrays, A1 and A2, each containing about half of the elements of A (i.e. A1 contains the first ⌈n/2⌉ elements and A2 contains the remaining ⌊n/2⌋ elements).
• Conquer: sort arrays A1 and A2 using Merge Sort.
• Combine: put the elements back into A by merging the sorted arrays A1 and A2 into one sorted array.
Merge Sort Algorithm

  Merge-Sort(A, p, r)
    if p < r then
      q ← ⌊(p+r)/2⌋
      Merge-Sort(A, p, q)
      Merge-Sort(A, q+1, r)
      Merge(A, p, q, r)

  Merge(A, p, q, r)
    Repeatedly take the smaller of the two front elements of the sorted runs A[p..q] and A[q+1..r] and put it into an additional array; when one run is exhausted, copy over the remainder of the other. Finally, copy the additional array into A[p..r].

  Merge(A, B) merges two sorted arrays into one; its time complexity is O(|A|+|B|).
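The pseudocode above can be sketched in Python — a minimal, hypothetical translation to 0-based lists; `merge` uses an auxiliary list, as the slides describe:

```python
def merge(a, p, q, r):
    """Merge the sorted runs a[p..q] and a[q+1..r] (inclusive) using an extra list."""
    left, right = a[p:q + 1], a[q + 1:r + 1]
    out = []
    i = j = 0
    while i < len(left) and j < len(right):   # repeatedly take the smaller front element
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:] or right[j:])         # copy the remainder of the unexhausted run
    a[p:r + 1] = out                          # copy the merged result back into a[p..r]

def merge_sort(a, p, r):
    """Merge-Sort(A, p, r) from the slides, on a 0-based Python list."""
    if p < r:
        q = (p + r) // 2          # q = floor((p+r)/2)
        merge_sort(a, p, q)       # sort the left half  a[p..q]
        merge_sort(a, q + 1, r)   # sort the right half a[q+1..r]
        merge(a, p, q, r)         # combine the two sorted halves

data = [5, 2, 4, 7, 1, 3, 2, 6]
merge_sort(data, 0, len(data) - 1)
print(data)  # → [1, 2, 2, 3, 4, 5, 6, 7]
```

Note that, unlike the in-place partition of quicksort later in this lecture, the merge step genuinely needs the additional array.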
MergeSort (Example)
Trace of the recursive calls on an 8-element array A[1..8] (the original slides step through these calls with diagrams of the array contents; Merge arguments follow the pseudocode convention Merge(A, p, q, r)):

MergeSort(A, 1, 8)
  MergeSort(A, 1, 4)
    MergeSort(A, 1, 2)
      MergeSort(A, 1, 1) returns
      MergeSort(A, 2, 2) returns
      Merge(A, 1, 1, 2)
    MergeSort(A, 1, 2) returns
    MergeSort(A, 3, 4)
      MergeSort(A, 3, 3) returns
      MergeSort(A, 4, 4) returns
      Merge(A, 3, 3, 4)
    MergeSort(A, 3, 4) returns
    Merge(A, 1, 2, 4)
  MergeSort(A, 1, 4) returns
  MergeSort(A, 5, 8)
    (the same pattern is repeated on the right half)
    Merge(A, 5, 6, 8)
  MergeSort(A, 5, 8) returns
  Merge(A, 1, 4, 8)
Merge Sort Summarized
• To sort n numbers:
  if n = 1, done!
  recursively sort two sequences of ⌊n/2⌋ and ⌈n/2⌉ elements
  merge the two sorted lists in O(n) time
• Strategy:
  break the problem into similar (smaller) subproblems
  recursively solve the subproblems
  combine the solutions into an answer
Running Time of MergeSort
• Again the running time can be expressed as a recurrence:

  T(n) = solving_trivial_problem                                              if n = 1
  T(n) = num_pieces · T(n / subproblem_size_factor) + dividing + combining    if n > 1

• For merge sort:

  T(n) = 1              if n = 1
  T(n) = 2T(n/2) + n    if n > 1
Repeated Substitution Method
• Let's find the running time of merge sort (assume that n = 2^b, for some b):

  T(n) = 1              if n = 1
  T(n) = 2T(n/2) + n    if n > 1

  T(n) = 2T(n/2) + n                      substitute
       = 2(2T(n/4) + n/2) + n             expand
       = 2²T(n/4) + 2n                    substitute
       = 2²(2T(n/8) + n/4) + 2n           expand
       = 2³T(n/8) + 3n                    observe the pattern

  T(n) = 2^i T(n/2^i) + i·n

• Let 2^i = n, i.e. i = lg n:

  T(n) = 2^(lg n) T(n/n) + n·lg n = n·T(1) + n·lg n = n + n·lg n = O(n lg n)
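The closed form can be checked mechanically. A small sketch that evaluates the recurrence directly and compares it with n + n·lg n for powers of two:

```python
def T(n):
    """Merge-sort recurrence from the slide: T(1) = 1, T(n) = 2*T(n/2) + n."""
    return 1 if n == 1 else 2 * T(n // 2) + n

# For n = 2^b the repeated-substitution result is T(n) = n + n*lg n (here lg n = b).
for b in range(11):
    n = 2 ** b
    assert T(n) == n + n * b
print("T(n) = n + n*lg n holds for n = 1, 2, 4, ..., 1024")
```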
Quick Sort
• A divide-and-conquer algorithm:
  Divide (the core of quick sort!): pick an element, called a pivot, from the array. Reorder the array so that all elements less than the pivot come before it and all elements greater than the pivot come after it (equal values can go either way). After this partitioning, the pivot is in its final position. This is called the partition operation.
  Conquer: recursively sort the two subarrays.
  Combine: trivial, since sorting is done in place.
• Characteristics:
  Its divide-and-conquer nature is like merge sort, but it does not require an additional array — it sorts in place.
  Very practical: average performance O(n log n) with small constant factors, but worst case O(n²).
Partitioning: Key Step in Quicksort
• We choose some (any) element p in the array as a pivot.
• We then partition the array into three parts based on the pivot: the left part (elements less than p), the pivot p itself, and the right part (elements greater than or equal to p). Partition returns the final position/index of p.
• Quicksort is then recursively executed on both the left part and the right part:

  Quicksort(A, l, r)
    if l < r then
      i := Partition(A, l, r)
      Quicksort(A, l, i-1)
      Quicksort(A, i+1, r)
Partition Algorithm
• Choose an array value (say, the first) to use as the pivot.
• Starting from the left end, find the first element that is greater than or equal to the pivot.
• Searching backward from the right end, find the first element that is less than the pivot.
• Interchange (swap) these two elements.
• Repeat, searching from where we left off, until the two scans meet.

  Partition(A, left, right)
    p := A[left]; l := left+1; r := right
    repeat
      while l ≤ right and A[l] < p do l := l+1
      while r > left and A[r] ≥ p do r := r-1
      if l < r then swap(A, l, r)
    until l ≥ r
    A[left] := A[r]; A[r] := p
    return r
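A Python sketch of this partition scheme and of Quicksort — an assumed translation to 0-based lists, with the first element as pivot and two inward scans; explicit bound guards are added so the scans cannot run off the ends:

```python
def partition(a, left, right):
    """Partition a[left..right] around the pivot p = a[left].

    Afterwards, elements < p precede p and elements >= p follow it;
    returns the pivot's final index."""
    p = a[left]
    l, r = left + 1, right
    while True:
        while l <= right and a[l] < p:   # scan right for an element >= pivot
            l += 1
        while r > left and a[r] >= p:    # scan left for an element < pivot
            r -= 1
        if l < r:
            a[l], a[r] = a[r], a[l]      # swap the out-of-place pair
        else:
            break                        # the scans have met or crossed
    a[left], a[r] = a[r], p              # put the pivot into its final slot
    return r

def quicksort(a, left, right):
    if left < right:
        i = partition(a, left, right)
        quicksort(a, left, i - 1)        # sort elements less than the pivot
        quicksort(a, i + 1, right)       # sort elements >= the pivot

data = [4, 3, 6, 9, 2, 4, 3, 1, 2, 1, 8, 9, 3, 5, 6]
quicksort(data, 0, len(data) - 1)
print(data)  # → [1, 1, 2, 2, 3, 3, 3, 4, 4, 5, 6, 6, 8, 9, 9]
```

The input here is the array used in the partitioning example on the next slide; the first call to `partition` reproduces that trace.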
Example of Partitioning
• choose pivot:    4 3 6 9 2 4 3 1 2 1 8 9 3 5 6
• search:          4 3 6 9 2 4 3 1 2 1 8 9 3 5 6
• swap:            4 3 3 9 2 4 3 1 2 1 8 9 6 5 6
• search:          4 3 3 9 2 4 3 1 2 1 8 9 6 5 6
• swap:            4 3 3 1 2 4 3 1 2 9 8 9 6 5 6
• search:          4 3 3 1 2 4 3 1 2 9 8 9 6 5 6
• swap:            4 3 3 1 2 2 3 1 4 9 8 9 6 5 6
• search:          4 3 3 1 2 2 3 1 4 9 8 9 6 5 6   (the scans have crossed: l > r)
• swap with pivot: 1 3 3 1 2 2 3 4 4 9 8 9 6 5 6
Best Case of Quicksort
• Suppose each partition operation divides the array of size n nearly in half.
• Then the depth of the recursion is log₂ n: that's how many times we can halve n until we reach subarrays of size 1.
• At each level of the recursion, all the partitions at that level do work that is linear in n:
  each partition is linear over its sub-array, and all the partitions at one level together cover the array.
• Hence in the best case, quicksort has time complexity O(log₂ n) · O(n) = O(n log₂ n).
• The average case also has this complexity (details are omitted in this course).
Best Case Partitioning
• If we are lucky, Partition splits the array evenly at every level: O(lg n) · O(n) = O(n lg n)
Worst Case of Quicksort
• In the worst case, partitioning always divides the size-n array into these three parts:
  a length-one part, containing the pivot itself,
  a length-zero part, and
  a length-(n-1) part, containing everything else.
• We don't recurse on the zero-length part.
• Recursing on the length-(n-1) part requires (in the worst case) recursing to depth n-1; the recursion may be n levels deep.
• But the partitioning work done at each level is still O(n).
• So the worst case for Quicksort is O(n) · O(n) = O(n²).
• When does this happen? When the array is sorted to begin with!
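Under the simple cost model that partitioning a size-k subarray costs k steps (an assumption for illustration), the worst case satisfies T(n) = T(n-1) + n, which a short sketch can compare against the quadratic closed form:

```python
def worst_case_cost(n):
    """Worst-case quicksort cost: one linear partition per level, sizes n, n-1, ..., 2."""
    return 0 if n <= 1 else n + worst_case_cost(n - 1)

# n + (n-1) + ... + 2 = n(n+1)/2 - 1, i.e. Theta(n^2)
for n in range(2, 50):
    assert worst_case_cost(n) == n * (n + 1) // 2 - 1
print("worst-case cost grows as n(n+1)/2 - 1")
```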
Worst Case Partitioning
• Every level peels off only the pivot, so: O(n) · O(n) = O(n²)
Picking a Better Pivot
• Before, we picked the first element of each sub-array to use as a pivot.
  If the array is already sorted, this results in O(n²) behavior; it's no better if we pick the last element.
• We could do an optimal quicksort (guaranteed O(n log n)) if we always picked a pivot value that exactly cuts the array in half.
  Such a value is called a median: half of the values in the array are larger, half are smaller.
  Ironically, the easiest way to find the median is to sort the array and pick the value in the middle (!)
• Alternative: a random pivot, giving a randomized partitioning algorithm.
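As a sketch of the random-pivot idea — a compact illustrative variant, not in-place like the partition algorithm earlier in this lecture:

```python
import random

def randomized_quicksort(a):
    """Quicksort with a uniformly random pivot; the three-way split handles duplicates."""
    if len(a) <= 1:
        return a
    p = random.choice(a)   # a random pivot defeats the sorted-input worst case (in expectation)
    return (randomized_quicksort([x for x in a if x < p])
            + [x for x in a if x == p]
            + randomized_quicksort([x for x in a if x > p]))

print(randomized_quicksort([3, 1, 4, 1, 5, 9, 2, 6]))  # → [1, 1, 2, 3, 4, 5, 6, 9]
```

With a random pivot the expected running time is O(n log n) for every input, although any particular run can still be unlucky.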
More Comments on Quicksort
• Quicksort is the fastest known general-purpose sorting algorithm in practice; most real-world sorting is done by Quicksort.
• For optimum efficiency, the pivot must be chosen carefully.
• Fortunately, Quicksort is usually O(n log₂ n).
• However, no matter what you do, there will be some cases where Quicksort runs in O(n²) time:
  with a first-element pivot, an array that is sorted to begin with; and it is possible to construct other bad cases.
Radix Sort
• Every integer k can be represented by at most d digits in base r (the radix).
  All digits can be stored in an array A[1..d]: k becomes A[1]A[2]…A[d], where the A[i] are digits in base r.
  A[1] is the most significant digit; A[d] is the least significant digit.
• Example: decimal system, d = 3, r = 10, numbers 015, 155, 008, 319, 325, 100, 111.
  For 015: A[1] = 0, A[2] = 1, A[3] = 5.
• Idea: sort by the Least Significant Digit first (LSD).
  After the first pass, the numbers ending in 0 precede the numbers ending in 1, which precede those ending in 2, and so forth.
  Then sort by the next least significant digit, and continue until the numbers have been sorted on all d digits.
  One can also sort by the Most Significant Digit first (MSD).
Radix Sort Algorithm

  RadixSort(q: Queue): Queue
    for j := 0 to r-1 do
      A[j] := Queue.make()        // one queue per digit value, e.g. 0, 1, …, 9
    for i := d downto 1 do        // loop over the digit positions, least significant first
      while NOT q.isEmpty() do
        x := q.dequeue()
        j := i'th digit of x.value    // i = d first, so the last digit is used in pass 1
        A[j].enqueue(x)
      for j := 0 to r-1 do
        q.append(A[j])
        A[j] := Queue.make()      // fresh queue for the next pass
    return q

• We use queues for simplicity of operation: all numbers are stored in a queue q, and intermediate queues are used.
  In each pass, put each number into the queue corresponding to its i'th digit, then reassemble all numbers back into the original queue. They are now sorted w.r.t. the digits from position i rightwards.
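The queue-based pseudocode can be sketched in Python with `collections.deque`; extracting the i'th digit arithmetically (rather than from a digit array) is an implementation choice here:

```python
from collections import deque

def radix_sort(nums, d=3, r=10):
    """LSD radix sort as in the slides: d passes over base-r digits, using queues."""
    q = deque(nums)
    for i in range(d):                             # pass i uses the i-th digit from the right
        buckets = [deque() for _ in range(r)]      # one queue per digit value 0..r-1
        while q:                                   # distribute the numbers by digit
            x = q.popleft()
            buckets[(x // r ** i) % r].append(x)
        for b in buckets:                          # reassemble in digit order (stable)
            q.extend(b)
    return list(q)

print(radix_sort([275, 87, 426, 61, 509, 170, 677, 503]))
# → [61, 87, 170, 275, 426, 503, 509, 677]
```

Run on the numbers from the example on the next slide, the intermediate queue contents after each pass match the slide exactly.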
Radix Sort Example (1)
• Least significant digit first.
• Numbers to sort: 275, 087, 426, 061, 509, 170, 677, 503
• After the first pass: 170 061 503 275 426 087 677 509
• The while-loop puts all the numbers into different queues corresponding to their last digits; the for-loop then combines all queues back into the original single queue q, in increasing order of digit value.
Radix Sort Example (2)
• After the first pass (last digit):    170 061 503 275 426 087 677 509
• After the second pass (middle digit): 503 509 426 061 170 275 677 087
• After the third pass (first digit):   061 087 170 275 426 503 509 677
Radix Sort Analysis
• Loop invariant of the main for-loop (for i := d downto 1): at the start of the iteration for digit i, the queue elements are sorted according to their last d-i digits.
• Increasing the base r decreases the number of passes, since d becomes smaller.
• Running time (input size n):
  There are d passes over the numbers, and each pass takes O(|q| + r) time: the while-loop empties the queue and the inner for-loop visits the r digit queues.
  Total: O((|q| + r)·d); in the worst case |q| = O(n), giving O((n + r)·d), which is O(n) when r and d are constants.
• Remarks
  Radix sort is not based on comparisons; the digit values are used as array indices when locating the corresponding queues.
  Radix sort is good for sorting long sequences of small numbers: large n, fixed (small) d and r.
Summary of Sorting Algorithms

                   Average Case   Worst Case    Remarks
  Bubble Sort      —              O(n²)         Don't use it!
  Selection Sort   O(n²)          O(n²)
  Insertion Sort   O(n²)          O(n²)         Favors small or sorted inputs
  Merge Sort       O(n lg n)      O(n lg n)     *
  Quick Sort       O(n lg n)      O(n²)         **
  Radix Sort       O((n+r)·d)     O((n+r)·d)    ***

• * Good for sorting on external devices with sequential access.
• ** The fastest general-purpose sorting algorithm.
• *** Not based on comparison; good for sorting long sequences of small numbers.
Next Lecture
• Trees
  Basics
  Rooted trees
  Binary trees, balanced binary trees
  Tree traversal