Sorting
RHS – SOC 2
Sorting
• Searching in sorted data is much faster, O(log(n)), than searching in unsorted data, O(n)
• Being able to sort data efficiently is therefore important
• But how fast can we sort data…?
RHS – SOC 3
Selection sort
• A very simple algorithm for sorting an array of n integers works like this:
– Search the array from element 0 to element (n-1), to find the smallest element
– If the smallest element is element i, then swap element 0 and element i
– Now repeat the process from element 1 to element (n-1)
– …and so on…
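The steps above can be sketched in Java roughly as follows. This is a minimal illustration, not the code from the course project; the class name SelectionSortSketch and the sample values are assumptions made for the example.

```java
import java.util.Arrays;

class SelectionSortSketch {
    // Sorts the array in place, in ascending order.
    static void sort(int[] a) {
        for (int start = 0; start < a.length - 1; start++) {
            // Find the index of the smallest element in a[start..n-1]
            int smallest = start;
            for (int i = start + 1; i < a.length; i++) {
                if (a[i] < a[smallest]) {
                    smallest = i;
                }
            }
            // Swap it into position 'start'
            int tmp = a[start];
            a[start] = a[smallest];
            a[smallest] = tmp;
        }
    }

    public static void main(String[] args) {
        int[] data = {10, 56, 26, 4, 82}; // illustrative values only
        sort(data);
        System.out.println(Arrays.toString(data)); // [4, 10, 26, 56, 82]
    }
}
```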
RHS – SOC 4
Selection sort
[Animation, slides 4-11: selection sort applied step by step to an array of ten integers, ending with the sorted array 4 10 18 26 34 40 56 60 76 82]
RHS – SOC 12
Selection sort
• How fast is selection sort?
• We scan for the smallest element n times
– In scan 1, we examine n elements
– In scan 2, we examine (n-1) elements
– …and so on
• A total of n + (n-1) + (n-2) + … + 2 + 1 examinations
• The sum is n(n + 1)/2
RHS – SOC 13
Selection sort
• The total number of examinations is equal to n(n + 1)/2 = (n^2 + n)/2
• The run-time complexity of selection sort is therefore O(n^2), since the n^2 term dominates for large n
• O(n^2) grows fairly fast…
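As a quick sanity check of the sum, the sketch below counts the examinations directly, following the counting model used above (scan 1 looks at n elements, scan 2 at n-1, and so on), and prints the count next to n(n + 1)/2. The class and method names are invented for this illustration.

```java
class ExaminationCount {
    // Counts one examination per element looked at, where scan number
    // 'start' looks at elements start..n-1 (n scans in total).
    static long countExaminations(int n) {
        long count = 0;
        for (int start = 0; start < n; start++) {
            for (int i = start; i < n; i++) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int[] sizes = {2, 5, 20, 50, 200};
        for (int n : sizes) {
            long formula = (long) n * (n + 1) / 2;
            System.out.println(n + ": counted " + countExaminations(n)
                    + ", formula " + formula);
        }
    }
}
```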
RHS – SOC 14
Selection sort
n      n^2
2      4
5      25
20     400
50     2500
200    40000
RHS – SOC 15
Exercise 1
• Download the project selectionSortInJava from the PSL website
• Examine the code – see how selection sort is implemented in Java
• The project contains two helper classes: ArrayUtil (generates a random array of integers) and StopWatch (can measure the time needed to execute some code). Using these classes, the program measures how long it takes to sort an array using selection sort
• Try to run the program with various array sizes. For each run, write down the array size and the elapsed time. Make sure to try some array sizes that take several seconds to complete
• Enter the data into an Excel spreadsheet, plot a curve from the data, and see how the run time behaves when the array size increases
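If the project is not at hand, a measurement of this kind might look roughly like the sketch below. It does not use the project's ArrayUtil and StopWatch classes, whose methods are not shown here; java.util.Random and System.nanoTime() are stand-ins, and all class and method names are assumptions made for the example.

```java
import java.util.Random;

class SortTimingSketch {
    // Stand-in for ArrayUtil: an array of n pseudo-random integers.
    static int[] randomArray(int n, long seed) {
        Random rnd = new Random(seed);
        int[] a = new int[n];
        for (int i = 0; i < n; i++) {
            a[i] = rnd.nextInt(1_000_000);
        }
        return a;
    }

    // Plain selection sort, repeated here so the sketch is self-contained.
    static void selectionSort(int[] a) {
        for (int start = 0; start < a.length - 1; start++) {
            int smallest = start;
            for (int i = start + 1; i < a.length; i++) {
                if (a[i] < a[smallest]) smallest = i;
            }
            int tmp = a[start];
            a[start] = a[smallest];
            a[smallest] = tmp;
        }
    }

    // Stand-in for StopWatch: elapsed time in milliseconds for one sort.
    static double timeSortMillis(int n) {
        int[] a = randomArray(n, 42L);
        long begin = System.nanoTime();
        selectionSort(a);
        return (System.nanoTime() - begin) / 1_000_000.0;
    }

    public static void main(String[] args) {
        int[] sizes = {1000, 2000, 4000, 8000};
        for (int n : sizes) {
            System.out.println(n + " elements: " + timeSortMillis(n) + " ms");
        }
    }
}
```

Doubling the array size each time makes the quadratic growth easy to spot: the elapsed time should roughly quadruple from one line to the next, although individual timings will vary between runs and machines.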
RHS – SOC 16
Merge sort
• Selection sort is conceptually very simple, but not very efficient…
• A different algorithm for sorting is merge sort
• Merge sort is an example of a divide-and-conquer algorithm
• It is also a recursive algorithm
RHS – SOC 17
Merge sort
• The principle in merge sort is to merge two already sorted arrays:
[Figure, slides 17-18: two sorted integer arrays being merged into a single sorted array]
RHS – SOC 19
Merge sort
• Merging two sorted arrays is pretty simple, but how did the arrays get sorted…?
• Recursion to the rescue!
• Sort the two arrays simply by applying merge sort to them…
• If the array has length 1 (or 0), it is sorted
RHS – SOC 20
Merge sort
public void sort()                                    // Sort the array a
{
    if (a.length <= 1) return;                        // Base case
    int[] a1 = new int[a.length / 2];                 // Create two new
    int[] a2 = new int[a.length - a1.length];         // arrays to sort
    System.arraycopy(a, 0, a1, 0, a1.length);         // Copy data to
    System.arraycopy(a, a1.length, a2, 0, a2.length); // the new arrays
    MergeSorter ms1 = new MergeSorter(a1);            // Create two new
    MergeSorter ms2 = new MergeSorter(a2);            // sorter objects
    ms1.sort();                                       // Sort the two
    ms2.sort();                                       // new arrays
    merge(a1, a2);                                    // Merge the arrays
}
RHS – SOC 21
Merge sort
• All that is left is the method for merging two arrays
• A little tedious, but straightforward…
• The time needed to merge two arrays is proportional to the total length of the arrays, i.e. to n
• We can now analyse the run-time complexity of merge sort
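One way the merge step could look is sketched below. Unlike the merge(a1, a2) call in the sort() method, which presumably writes into the sorter's own array a, this version returns a new array so that it can stand alone; the class name and sample values are assumptions made for the example.

```java
import java.util.Arrays;

class MergeSketch {
    // Merges two arrays that are each already sorted in ascending order.
    // Every element is copied exactly once, so the work is proportional
    // to the total length of the two arrays.
    static int[] merge(int[] a1, int[] a2) {
        int[] result = new int[a1.length + a2.length];
        int i = 0; // next unused element of a1
        int j = 0; // next unused element of a2
        int k = 0; // next free position in result
        while (i < a1.length && j < a2.length) {
            if (a1[i] <= a2[j]) {
                result[k++] = a1[i++];
            } else {
                result[k++] = a2[j++];
            }
        }
        // One of the arrays is used up; copy the rest of the other
        while (i < a1.length) result[k++] = a1[i++];
        while (j < a2.length) result[k++] = a2[j++];
        return result;
    }

    public static void main(String[] args) {
        int[] left = {10, 26, 34, 56, 82};  // illustrative values only
        int[] right = {4, 18, 40, 60, 76};
        System.out.println(Arrays.toString(merge(left, right)));
        // [4, 10, 18, 26, 34, 40, 56, 60, 76, 82]
    }
}
```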
RHS – SOC 22
Merge sort
• Merge sort of an array of length n requires
– Two merge sorts of arrays of length n/2
– Merging two arrays of length n/2
• The running time T(n) then becomes:
T(n) = 2×T(n/2) + n
RHS – SOC 23
Merge sort
• If we re-insert the expression for T(n) into itself m times, we get
T(n) = 2^m×T(n/2^m) + m×n
• If we choose m such that n = 2^m, i.e. m = log2(n), and take T(1) = 1, we get
T(n) = n×T(1) + m×n = n + n×log2(n)
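The closed form can be checked numerically. The sketch below evaluates the recurrence directly and compares it with n + n×log2(n) for powers of two, assuming T(1) = 1 as the unit cost of the base case; the class and method names are invented for this illustration.

```java
class MergeSortRecurrence {
    // T(n) = 2*T(n/2) + n, with T(1) = 1 (assumed base-case cost).
    static long t(long n) {
        if (n <= 1) return 1;
        return 2 * t(n / 2) + n;
    }

    // n + n*log2(n); only meaningful when n is a power of two.
    static long closedForm(long n) {
        int m = Long.numberOfTrailingZeros(n); // m = log2(n) when n = 2^m
        return n + n * m;
    }

    public static void main(String[] args) {
        for (int m = 0; m <= 10; m++) {
            long n = 1L << m;
            System.out.println("n = " + n + ": recurrence " + t(n)
                    + ", closed form " + closedForm(n));
        }
    }
}
```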
RHS – SOC 24
Merge sort
• The run-time complexity of merge sort is therefore O(n log(n))
• Many other sorting algorithms have this run-time complexity
• No sorting algorithm based on comparing elements can do better than O(n log(n)) in the worst case; faster sorting is only possible under very special circumstances
• Much better than O(n^2)…
RHS – SOC 25
Merge sort
n      n log2(n)   n^2
2      2           4
5      12          25
20     86          400
50     282         2500
200    1529        40000
RHS – SOC 26
Exercise 2
• Download the project mergeSortInJava from the PSL website
• Examine the code – see how merge sort is implemented in Java (the project contains the same helper classes as the selectionSortInJava project – ArrayUtil and StopWatch)
• Try to run the program with various array sizes. For each run, write down the array size and the elapsed time. Make sure to try some array sizes that take several seconds to complete
• Enter the data into an Excel spreadsheet, plot a curve from the data, and see how the run time behaves when the array size increases
• Compare the results with the results obtained for selection sort – when do the curves for run time cross each other (if at all)?
RHS – SOC 27
Sorting in practice
• It does matter which sorting algorithm you use…
• …but do I have to code sorting algorithms myself?
• No! You can – and should – use sorting algorithms found in the Java library
RHS – SOC 28
Sorting in practice
• Sorting an array:
Car[] cars = new Car[n];
…
Arrays.sort(cars);
RHS – SOC 29
Sorting in practice
• Sorting an arraylist:
ArrayList<Car> cars =
new ArrayList<Car>();
…
Collections.sort(cars);
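A self-contained sketch of the two library calls is shown below. It uses int and String elements rather than Car, since sorting Car objects requires the class to be comparable; the class and method names are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class LibrarySortSketch {
    // Returns a sorted copy of an int array, using Arrays.sort.
    static int[] sortedCopy(int[] values) {
        int[] copy = Arrays.copyOf(values, values.length);
        Arrays.sort(copy);
        return copy;
    }

    // Returns a sorted copy of a list of strings, using Collections.sort.
    static List<String> sortedCopy(List<String> values) {
        List<String> copy = new ArrayList<>(values);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        int[] numbers = {34, 4, 82, 10};
        System.out.println(Arrays.toString(sortedCopy(numbers))); // [4, 10, 34, 82]

        List<String> names = Arrays.asList("Volvo", "Audi", "Fiat");
        System.out.println(sortedCopy(names)); // [Audi, Fiat, Volvo]
    }
}
```

Both Arrays.sort and Collections.sort sort in place; the copies here are only made so that the caller's data is left untouched.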
RHS – SOC 30
Sorting in practice
• Why not code my own sorting algorithms?
• Sorting algorithms in the Java library are better than anything you are likely to produce…
– Carefully debugged
– Highly optimised
– Used by thousands
• You cannot beat them
RHS – SOC 31
Sorting in practice
• In order to sort an array of data, we need to be able to compare the elements
• "Larger than" should make sense for the elements in the array
• Easy for numeric types (>)
• What about types we define ourselves…?
RHS – SOC 32
Sorting in practice
• If a class T implements the Comparable interface, objects of type T can be compared:
public interface Comparable<T>
{
int compareTo(T other);
}
RHS – SOC 33
Sorting in practice
• In the interface definition, T is a type parameter
• It is used the same way as the type parameter of an arraylist
• ArrayList<Car>: an arraylist holding elements of type Car
RHS – SOC 34
Sorting in practice
• In order for the sorting algorithms to work properly, an implementation of the interface must obey these rules:
• The call a.compareTo(b) must return:
– A negative number if a < b
– Zero if a = b
– A positive number if a > b
RHS – SOC 35
Sorting in practice
• The implementation of compareTo must define a so-called total ordering:
– Antisymmetric: If a.compareTo(b) ≤ 0, then b.compareTo(a) ≥ 0
– Reflexive: a.compareTo(a) = 0
– Transitive: If a.compareTo(b) ≤ 0 and b.compareTo(c) ≤ 0, then a.compareTo(c) ≤ 0
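These rules can be spot-checked on a handful of sample objects. The sketch below is one possible way to do so, with invented class and method names; it cannot prove that a compareTo implementation is correct, only expose violations among the samples given. For antisymmetry it uses the slightly stricter sign rule from the Java documentation of Comparable: the sign of a.compareTo(b) must be the opposite of the sign of b.compareTo(a).

```java
import java.util.Arrays;
import java.util.List;

class CompareToContractCheck {
    // Returns false if any of the three rules is violated by the samples.
    static <T extends Comparable<T>> boolean obeysContract(List<T> samples) {
        for (T a : samples) {
            if (a.compareTo(a) != 0) return false;              // reflexive
            for (T b : samples) {
                int ab = a.compareTo(b);
                int ba = b.compareTo(a);
                if (Integer.signum(ab) != -Integer.signum(ba)) {
                    return false;                               // antisymmetric
                }
                for (T c : samples) {
                    if (ab <= 0 && b.compareTo(c) <= 0 && a.compareTo(c) > 0) {
                        return false;                           // transitive
                    }
                }
            }
        }
        return true;
    }

    // Deliberately broken example: claims to be smaller than everything,
    // including itself.
    static class AlwaysSmaller implements Comparable<AlwaysSmaller> {
        @Override
        public int compareTo(AlwaysSmaller other) {
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println(obeysContract(Arrays.asList(3, 1, 2, 2)));           // true
        System.out.println(obeysContract(Arrays.asList(new AlwaysSmaller()))); // false
    }
}
```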
RHS – SOC 36
Sorting in practice
public class Car implements Comparable<Car>
{
    ...
    // Here using weight as ordering criterion
    public int compareTo(Car other)
    {
        if (getWeight() < other.getWeight()) return -1;
        if (getWeight() == other.getWeight()) return 0;
        return 1;
    }
    ...
}
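To see the comparison at work, the sketch below fills in the elided parts of the class with a hypothetical name and an int weight (the real Car class may look quite different) and sorts an array with Arrays.sort. Everything except the compareTo body is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CarSortSketch {
    // Hypothetical minimal Car: only a name and a weight.
    static class Car implements Comparable<Car> {
        private final String name;
        private final int weight;

        Car(String name, int weight) {
            this.name = name;
            this.weight = weight;
        }

        String getName() { return name; }
        int getWeight() { return weight; }

        // Same ordering criterion as above: lighter cars come first
        @Override
        public int compareTo(Car other) {
            if (getWeight() < other.getWeight()) return -1;
            if (getWeight() == other.getWeight()) return 0;
            return 1;
        }
    }

    // Sorts a copy of the array and returns the names in sorted order.
    static List<String> namesByWeight(Car[] cars) {
        Car[] copy = Arrays.copyOf(cars, cars.length);
        Arrays.sort(copy); // uses Car.compareTo
        List<String> names = new ArrayList<>();
        for (Car c : copy) {
            names.add(c.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        Car[] cars = {
            new Car("van", 2100),
            new Car("mini", 1100),
            new Car("sedan", 1500)
        };
        System.out.println(namesByWeight(cars)); // [mini, sedan, van]
    }
}
```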
RHS – SOC 37
Exercises
• Programming P14.12, P14.14
• For P14.14, read about the Comparator interface in Advanced Topic 14.5, pages 657-658