Sorting

RHS – SOC 2

Sorting

• Searching in sorted data is much faster (O(log(n))) than searching in unsorted data (O(n)).

• Being able to sort data efficiently is therefore an important ability

• But how fast can we sort data…?

RHS – SOC 3

Selection sort

• A very simple algorithm for sorting an array of n integers works like this:

– Search the array from element 0 to element (n-1) to find the smallest element

– If the smallest element is element i, then swap element 0 and element i

– Now repeat the process from element 1 to element (n-1)

– …and so on…
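The steps above can be sketched in Java roughly as follows. This is a minimal illustration, not the code from the course project: the class name SelectionSortSketch and the sample values are made up for the example.

```java
import java.util.Arrays;

class SelectionSortSketch {

    // Sorts the array in place, in ascending order.
    static void selectionSort(int[] a) {
        for (int start = 0; start < a.length - 1; start++) {
            // Search a[start..n-1] for the index of the smallest element.
            int minIndex = start;
            for (int i = start + 1; i < a.length; i++) {
                if (a[i] < a[minIndex]) {
                    minIndex = i;
                }
            }
            // Swap the smallest element into position 'start'.
            int tmp = a[start];
            a[start] = a[minIndex];
            a[minIndex] = tmp;
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample data, chosen only for illustration.
        int[] data = {34, 10, 56, 26, 4, 82, 76, 18, 60, 40};
        selectionSort(data);
        System.out.println(Arrays.toString(data));
        // prints [4, 10, 18, 26, 34, 40, 56, 60, 76, 82]
    }
}
```

The outer loop stops at n-2 because once all other elements are in place, the last element must be the largest.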

RHS – SOC 4

Selection sort

[Animation, slides 4–11: selection sort applied step by step to a ten-element array holding the values 4, 10, 18, 26, 34, 40, 56, 60, 76 and 82; in the last frame the array is sorted]

RHS – SOC 12

Selection sort

• How fast is selection sort?

• We scan for the smallest element n times

– In scan 1, we examine n elements

– In scan 2, we examine (n-1) elements

– …and so on

• A total of n + (n-1) + (n-2) + … + 2 + 1 examinations

• The sum is n(n + 1)/2
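One way to check the closed form is to write the sum twice, once in each direction, and add the two rows term by term. The symbol S for the sum is introduced here only for the derivation.

```latex
\begin{align*}
S  &= n + (n-1) + (n-2) + \dots + 2 + 1 \\
S  &= 1 + 2 + 3 + \dots + (n-1) + n \\
2S &= \underbrace{(n+1) + (n+1) + \dots + (n+1)}_{n \text{ terms}} = n(n+1) \\
S  &= \frac{n(n+1)}{2}
\end{align*}
```

For example, n = 4 gives 4 + 3 + 2 + 1 = 10, which matches 4(4 + 1)/2 = 10.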

RHS – SOC 13

Selection sort

• The total number of examinations is equal to n(n + 1)/2 = (n^2 + n)/2

• For large n the n^2 term dominates, so the run-time complexity of selection sort is O(n^2)

• O(n^2) grows fairly fast…

RHS – SOC 14

Selection sort

n      n^2
2      4
5      25
20     400
50     2500
200    40000

RHS – SOC 15

Exercise 1

• Download the project selectionSortInJava from the PSL website

• Examine the code – see how selection sort is implemented in Java

• The project contains two helper classes: ArrayUtil (generates a random array of integers) and StopWatch (can measure the time needed to execute some code). Using these classes, the program measures how long it takes to sort an array using selection sort

• Try to run the program with various array sizes. For each run, write down the array size and the elapsed time. Make sure to try some array sizes that take several seconds to complete

• Enter the data into an Excel spreadsheet, plot a curve from the data, and see how the run time behaves when the array size increases

RHS – SOC 16

Merge sort

• Selection sort is conceptually very simple, but not very efficient…

• A different algorithm for sorting is merge sort

• Merge sort is an example of a divide-and-conquer algorithm

• It is also a recursive algorithm

RHS – SOC 17

Merge sort

• The principle in merge sort is to merge two already sorted arrays:

[Figure, slides 17–18: two already sorted arrays being merged into one sorted array; the values shown are 4, 10, 18, 26, 34, 40, 56, 60, 76 and 82]

RHS – SOC 19

Merge sort

• Merging two sorted arrays is pretty simple, but how did the arrays get sorted…?

• Recursion to the rescue!

• Sort the two arrays simply by applying merge sort to them…

• If the array has length 1 (or 0), it is already sorted

RHS – SOC 20

Merge sort

public void sort() // Sort the array a
{
    if (a.length <= 1) return;                        // Base case

    int[] a1 = new int[a.length / 2];                 // Create two new
    int[] a2 = new int[a.length - a1.length];         // arrays to sort
    System.arraycopy(a, 0, a1, 0, a1.length);         // Copy data to
    System.arraycopy(a, a1.length, a2, 0, a2.length); // the new arrays

    MergeSorter ms1 = new MergeSorter(a1);            // Create two new
    MergeSorter ms2 = new MergeSorter(a2);            // sorter objects
    ms1.sort();                                       // Sort the two
    ms2.sort();                                       // new arrays

    merge(a1, a2);                                    // Merge the arrays
}

RHS – SOC 21

Merge sort

• All that is left is the method for merging two arrays

• A little tedious, but straightforward…

• The time needed to merge two arrays is proportional to the total length of the arrays, i.e. to n

• We can now analyse the run-time complexity of merge sort
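A merging step could look roughly like the sketch below. It is an illustration under stated assumptions, not the merge method from the course project: the class name MergeSketch is made up, and this version returns a new array, whereas the merge(a1, a2) call in the slide code presumably writes its result back into the sorter's own array.

```java
import java.util.Arrays;

class MergeSketch {

    // Merges two arrays that are each sorted in ascending order
    // into one new sorted array.
    static int[] merge(int[] first, int[] second) {
        int[] result = new int[first.length + second.length];
        int i = 0; // next unused element in first
        int j = 0; // next unused element in second
        int k = 0; // next free position in result

        // Repeatedly take the smaller of the two front elements.
        while (i < first.length && j < second.length) {
            if (first[i] <= second[j]) {
                result[k++] = first[i++];
            } else {
                result[k++] = second[j++];
            }
        }
        // One input is exhausted; copy what is left of the other.
        while (i < first.length) {
            result[k++] = first[i++];
        }
        while (j < second.length) {
            result[k++] = second[j++];
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: two arrays that are already sorted.
        int[] left = {4, 10, 26, 34, 56};
        int[] right = {18, 40, 60, 76, 82};
        System.out.println(Arrays.toString(merge(left, right)));
        // prints [4, 10, 18, 26, 34, 40, 56, 60, 76, 82]
    }
}
```

Each element is copied into the result exactly once, which is why the time is proportional to the total length n.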

RHS – SOC 22

Merge sort

• Merge sort of an array of length n requires

– Two merge sorts of arrays of length n/2

– Merging two arrays of length n/2

• The running time T(n) then becomes:

T(n) = 2×T(n/2) + n

RHS – SOC 23

Merge sort

• If we re-insert the expression for T(n) into itself m times, we get

T(n) = 2^m×T(n/2^m) + mn

• If we choose m such that n = 2^m, i.e. m = log(n) with the logarithm taken to base 2, and count T(1) as 1, we get

T(n) = n×T(1) + mn = n + n×log(n)
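The repeated substitution can be written out step by step as below. The sketch assumes that n is a power of two, that T(1) = 1, and that the logarithm is taken to base 2.

```latex
\begin{align*}
T(n) &= 2\,T(n/2) + n \\
     &= 2\bigl(2\,T(n/4) + n/2\bigr) + n = 4\,T(n/4) + 2n \\
     &= 4\bigl(2\,T(n/8) + n/4\bigr) + 2n = 8\,T(n/8) + 3n \\
     &\;\;\vdots \\
     &= 2^m\,T(n/2^m) + mn
\end{align*}
```

With n = 2^m the last line becomes n·T(1) + n·log(n). As a quick check, n = 4 gives T(4) = 2·T(2) + 4 = 2·4 + 4 = 12, and the closed form gives 4 + 4·2 = 12.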

RHS – SOC 24

Merge sort

• The run-time complexity of merge sort is therefore O(n log(n))

• Many other sorting algorithms have this run-time complexity

• For algorithms that sort by comparing elements, this is the fastest possible; faster sorting is only possible under very special circumstances

• Much better than O(n^2)…

RHS – SOC 25

Merge sort

n      n log2(n)   n^2
2      2           4
5      12          25
20     86          400
50     282         2500
200    1529        40000

RHS – SOC 26

Exercise 2

• Download the project mergeSortInJava from the PSL website

• Examine the code – see how merge sort is implemented in Java (the project contains the same helper classes as the selectionSortInJava project – ArrayUtil and StopWatch)

• Try to run the program with various array sizes. For each run, write down the array size and the elapsed time. Make sure to try some array sizes that take several seconds to complete

• Enter the data into an Excel spreadsheet, plot a curve from the data, and see how the run time behaves when the array size increases

• Compare the results with the results obtained for selection sort – when do the curves for run time cross each other (if at all)?

RHS – SOC 27

Sorting in practice

• It does matter which sorting algorithm you use…

• …but do I have to code sorting algorithms myself?

• No! You can – and should – use sorting algorithms found in the Java library

RHS – SOC 28

Sorting in practice

• Sorting an array:

Car[] cars = new Car[n];

…

Arrays.sort(cars);

RHS – SOC 29

Sorting in practice

• Sorting an arraylist:

ArrayList<Car> cars = new ArrayList<Car>();

…

Collections.sort(cars);
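Both library calls can be tried out with a small self-contained sketch like the one below. It uses int and String elements as stand-ins for Car, since those types already have a natural ordering; the class name LibrarySortSketch and the helper sortedCopy are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class LibrarySortSketch {

    // Returns a sorted copy of the array, leaving the argument untouched.
    static int[] sortedCopy(int[] values) {
        int[] copy = Arrays.copyOf(values, values.length);
        Arrays.sort(copy);
        return copy;
    }

    // Returns a sorted copy of the list, leaving the argument untouched.
    static List<String> sortedCopy(List<String> values) {
        List<String> copy = new ArrayList<>(values);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        int[] numbers = {56, 10, 82, 4};
        System.out.println(Arrays.toString(sortedCopy(numbers)));
        // prints [4, 10, 56, 82]

        List<String> names = Arrays.asList("Volvo", "Audi", "Fiat");
        System.out.println(sortedCopy(names));
        // prints [Audi, Fiat, Volvo]
    }
}
```

Note that Arrays.sort and Collections.sort themselves sort in place; the copies here only make the effect easy to compare with the input.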

RHS – SOC 30

Sorting in practice

• Why not code my own sorting algorithms?

• The sorting algorithms in the Java library are better than anything you can produce…

– Carefully debugged

– Highly optimised

– Used by thousands

• You cannot beat them

RHS – SOC 31

Sorting in practice

• In order to sort an array of data, we need to be able to compare the elements

• “Larger than” should make sense for the elements in the array

• Easy for numeric types (>)

• What about types we define ourselves…?

RHS – SOC 32

Sorting in practice

• If a class T implements the Comparable interface, objects of type T can be compared:

public interface Comparable<T>

{

int compareTo(T other);

}

RHS – SOC 33

Sorting in practice

• In the interface definition, T is a type parameter

• It is used in the same way as the type parameter of an arraylist

• ArrayList<Car>: an arraylist holding elements of type Car

RHS – SOC 34

Sorting in practice

• In order for the sorting algorithms to work properly, an implementation of the interface must obey these rules:

• The call a.compareTo(b) must return:

– A negative number if a < b

– Zero if a = b

– A positive number if a > b

RHS – SOC 35

Sorting in practice

• The implementation of compareTo must define a so-called total ordering:

– Antisymmetric: If a.compareTo(b) ≤ 0, then b.compareTo(a) ≥ 0

– Reflexive: a.compareTo(a) = 0

– Transitive: If a.compareTo(b) ≤ 0 and b.compareTo(c) ≤ 0, then a.compareTo(c) ≤ 0

RHS – SOC 36

Sorting in practice

public class Car implements Comparable<Car>
{
    ...

    // Here using weight as ordering criterion
    public int compareTo(Car other)
    {
        if (getWeight() < other.getWeight()) return -1;
        if (getWeight() == other.getWeight()) return 0;
        return 1;
    }

    ...
}
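To see the interface at work, a complete version of such a class can be handed directly to Arrays.sort. The sketch below fills in the elided parts with assumptions: a name field, an int weight, and the class name CarSortDemo are all hypothetical, and Integer.compare is used as a shorthand that returns a negative, zero or positive value in the same cases as the three-way test in the slide.

```java
import java.util.Arrays;

class CarSortDemo {
    public static void main(String[] args) {
        // Hypothetical sample cars; the weights are made-up values.
        Car[] cars = {
            new Car("Van", 2100),
            new Car("Mini", 1150),
            new Car("Sedan", 1500)
        };
        Arrays.sort(cars); // uses Car.compareTo to order the elements
        for (Car c : cars) {
            System.out.println(c.getName() + " " + c.getWeight());
        }
        // prints:
        // Mini 1150
        // Sedan 1500
        // Van 2100
    }
}

class Car implements Comparable<Car> {
    private final String name; // assumed field, for readable output
    private final int weight;  // assumed to be an int

    Car(String name, int weight) {
        this.name = name;
        this.weight = weight;
    }

    String getName() { return name; }

    int getWeight() { return weight; }

    // Orders cars by weight, lightest first.
    @Override
    public int compareTo(Car other) {
        return Integer.compare(weight, other.weight);
    }
}
```

Collections.sort works the same way for an ArrayList<Car>, since it also relies on compareTo.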

RHS – SOC 37

Exercises

• Programming P14.12, P14.14

• For P14.14, read about the Comparator interface in Advanced Topic 14.5, pages 657-658
