Algorithms

An algorithm is a computable (i.e. finite) set of steps to achieve a desired result.

To analyse an algorithm, we have to consider its: effectiveness (whether it can be rendered as a computer program), correctness (whether it can produce a correct result), termination (whether it can arrive at an answer), efficiency (how much resource is consumed to produce a correct result), and program complexity (whether it is easy to understand).

In our context, we are mainly concerned with the run-time efficiency of an algorithm.

Run-time efficiency of an algorithm

If the run-time of a certain searching algorithm is approximately proportional to the number of items in the list to be searched, we say that the run-time efficiency is in the order of n. (e.g. sequential search)

If the run-time of an algorithm is approximately proportional to the square of the number of items to be processed, we say that the run-time efficiency is in the order of n^2 (e.g. bubble sort).

… and so on.

Searching

Searching means scanning through lists of data until one or more items of data that match some specific criteria are found.

The efficiency of searching techniques is compared by considering the average number of comparisons that must be made to get the desired item (i.e. the average search length).

In addition, the search time (the average time taken to search for a given item) is also an important factor which determines the efficiency of an algorithm.

Sequential search

Sequential search examines the data items in the order they appear in the list until the desired value is located or the end of the list is reached.

Average search length for a list of n data items = (1 + 2 + … + n) / n = (n + 1) / 2, so the run-time efficiency is in the order of n.
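The average above can be checked with a short derivation. It assumes that the target is present and is equally likely to be at each of the n positions, which the text does not state explicitly.

```latex
\[
\text{average search length}
  = \frac{1}{n}\sum_{i=1}^{n} i
  = \frac{1}{n}\cdot\frac{n(n+1)}{2}
  = \frac{n+1}{2}
\]
```

For example, with n = 9 the average is 5 comparisons.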

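Sequential search - Algorithm

{ Returns the position of target in list[1..size], or 0 if it is absent.
  Assumes size >= 1. }
function SequentialSearch(list: ArrayType; size: integer;
                          target: ElementType): integer;
var
  i: integer;
begin
  i := 1;
  { Stop at the last item so that list[i] is always a valid element }
  while (i < size) and (target <> list[i]) do
    i := i + 1;
  if target = list[i] then
    SequentialSearch := i
  else
    SequentialSearch := 0
end;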

If the list is already sorted in ascending order, the second condition in the while statement can be changed to (target > list[i]).

Binary search

Binary search cuts the list of data items (which must be sorted) in half repeatedly until the search value is found or the list is fully searched.

Run-time efficiency: in the order of (log2 n)

No. of items    Max. number of searches
8               3
16              4
32              5
N               log2 N

Binary search - Algorithm

function BinarySearch(list: ArrayType; size: integer;
                      target: ElementType): integer;
var
  first, last, mid, location: integer;
begin
  first := 1;
  last := size;
  location := 0;
  while (first <= last) and (location = 0) do
  begin
    mid := (first + last) div 2;
    if target = list[mid] then
      location := mid
    else if target < list[mid] then
      last := mid - 1
    else
      first := mid + 1
  end;
  BinarySearch := location
end;

Hashing

Hashing is a scheme for providing rapid access to data items which are distinguished by some key. Each data item to be stored is associated with a key, e.g. the name of a person.

A hash function is applied to the item’s key and the resulting hash value is used as an index to select one of a number of ‘hash buckets’ (blocks of storage) in a hash table. The table contains pointers to the original items.

Common hash functions - Mid-square

Mid-square: This function is computed by squaring the key and then using an appropriate number of bits from the middle of the square to obtain the bucket address (the key is assumed to fit into one computer word).

e.g. 1976^2 = 3904576; digits taken from the middle of 3904576 form the hash address.
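A minimal sketch of the mid-square idea, working on decimal digits rather than bits to match the example. The function name MidSquareHash and the choice of which middle digits to keep (drop the two lowest digits, keep the next three) are assumptions made for illustration only.

```pascal
program MidSquareDemo;

{ Hypothetical helper: squares the key, drops the two lowest decimal
  digits and keeps the next three. Which digits to keep is an assumption. }
function MidSquareHash(key: longint): longint;
var
  square: longint;
begin
  square := key * key;            { assumes the square fits in a longint }
  MidSquareHash := (square div 100) mod 1000
end;

begin
  writeln(MidSquareHash(1976))    { 3904576 -> 39045 -> 45 }
end.
```

With the key 1976 this picks the digits 045 out of 3904576, so the hash address is 45.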

Common hash functions - Division

Division: The key X is divided by some number M and the remainder is used as the hash address for X.

e.g. 1976 mod 97 = 36

Common hash functions - Folding

Folding: The key X is partitioned into several parts, all but the last being of the same length. These parts are then added together to obtain the hash address for X.

e.g. if X = 12320324111220, and the length of each part is 3 digits, then the hash address = 123 + 203 + 241 + 112 + 20 = 699.
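Folding can be sketched as follows. The key is held as a string of digits because a 14-digit key such as 12320324111220 (which splits into 123, 203, 241, 112 and 20) does not fit in a 32-bit integer. The function name FoldingHash and its parameters are hypothetical.

```pascal
program FoldingDemo;

{ Hypothetical helper: splits a key, given as a string of decimal digits,
  into parts of partLen digits (the last part may be shorter) and adds
  the parts together. }
function FoldingHash(key: string; partLen: integer): longint;
var
  i, code: integer;
  part, sum: longint;
begin
  sum := 0;
  i := 1;
  while i <= length(key) do
  begin
    val(copy(key, i, partLen), part, code);  { code = 0 for a numeric part }
    sum := sum + part;
    i := i + partLen
  end;
  FoldingHash := sum
end;

begin
  writeln(FoldingHash('12320324111220', 3))  { 123 + 203 + 241 + 112 + 20 = 699 }
end.
```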

Hashing

If two items’ keys hash to the same value (a ‘hash collision’), then some alternative location is used (e.g. linear probing: the next free location cyclically following the indicated one).

For best performance, the table size and hash function must be tailored to the number of entries and range of keys to be used. The hash function usually depends on the table size so if the table needs to be enlarged it must usually be completely rebuilt.
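Linear probing with a division hash function can be sketched as below. The table size of 7, the Empty marker and the names InsertKey and SearchKey are assumptions chosen to keep the illustration small; a real table would be sized for the expected number of entries.

```pascal
program LinearProbingDemo;

const
  TableSize = 7;     { assumed table size, kept small for illustration }
  Empty = -1;        { assumed marker for an unused slot; keys must be >= 0 }

var
  table: array[0..TableSize - 1] of integer;
  i: integer;

{ Division hash: the remainder of key / TableSize is the home slot }
function Hash(key: integer): integer;
begin
  Hash := key mod TableSize
end;

{ Stores key in its home slot or in the next free slot, wrapping around.
  Returns the slot used, or -1 if the table is full. }
function InsertKey(key: integer): integer;
var
  slot, tried: integer;
begin
  slot := Hash(key);
  tried := 0;
  while (tried < TableSize) and (table[slot] <> Empty) do
  begin
    slot := (slot + 1) mod TableSize;
    tried := tried + 1
  end;
  if tried < TableSize then
  begin
    table[slot] := key;
    InsertKey := slot
  end
  else
    InsertKey := -1
end;

{ Follows the same probe sequence; returns the slot holding key, or -1 }
function SearchKey(key: integer): integer;
var
  slot, tried: integer;
begin
  slot := Hash(key);
  tried := 0;
  while (tried < TableSize) and (table[slot] <> Empty)
        and (table[slot] <> key) do
  begin
    slot := (slot + 1) mod TableSize;
    tried := tried + 1
  end;
  if (tried < TableSize) and (table[slot] = key) then
    SearchKey := slot
  else
    SearchKey := -1
end;

begin
  for i := 0 to TableSize - 1 do
    table[i] := Empty;
  writeln(InsertKey(10));   { 10 mod 7 = 3, stored in slot 3 }
  writeln(InsertKey(17));   { 17 mod 7 = 3, collision, stored in slot 4 }
  writeln(SearchKey(17));   { found in slot 4 }
  writeln(SearchKey(24))    { probes slots 3, 4, then empty slot 5: -1 }
end.
```

The search stops at the first empty slot, which is why a collision costs extra comparisons but an absent key is still detected quickly while the table is sparsely filled.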

Run-time efficiency of hashing

Run-time efficiency: in the order of 1 (best case). However, collisions lead to extra searching time.

The memory requirement for implementing hashing (search) is relatively high.

Sorting algorithms

Sorting data requires putting the raw data in some predetermined order, which might be alphabetical if the data is a set of words, or ascending or descending if the data is numerical.

Sorting algorithms depend very much on the way in which the data is stored, and on particular aspects of hardware.

The following factors, together with considerations such as speed and amount of memory used, play a part in deciding the most suitable sorting algorithm to use.

Is the data in the main memory? Is it on disk or on tape? Can all the data fit into the main memory at a time?


Sorting methods can be characterised into two broad categories:

Internal sorting methods, which are to be used when the file to be sorted is small enough so that the entire sort can be carried out in main memory.

External sorting methods, which are to be used upon large files, where data are directly stored on secondary storage (such as magnetic disk and magnetic tape).

Bubble sort

procedure BubbleSort(var a: list; n: integer);
var
  i, pass: integer;
begin
  for pass := 1 to n - 1 do
    for i := 1 to n - pass do
      if a[i] > a[i + 1] then
        swap(a[i], a[i + 1])
end;

Run-time efficiency: in the order of n^2
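The bubble sort procedure relies on a list type and a swap procedure that the text does not define. The following sketch supplies assumed definitions of both (an array of 10 integers and a three-step exchange) so that the procedure can be run on the sample data used in the later examples.

```pascal
program BubbleSortDemo;

const
  MaxItems = 10;                          { assumed capacity }

type
  list = array[1..MaxItems] of integer;   { assumed definition of the list type }

const
  init: list = (4, 2, 7, 3, 5, 9, 1, 10, 8, 6);

var
  data: list;
  k: integer;

{ Assumed helper: exchanges the values of x and y }
procedure swap(var x, y: integer);
var
  temp: integer;
begin
  temp := x;
  x := y;
  y := temp
end;

procedure BubbleSort(var a: list; n: integer);
var
  i, pass: integer;
begin
  for pass := 1 to n - 1 do
    { After each pass the largest remaining item has reached a[n - pass + 1] }
    for i := 1 to n - pass do
      if a[i] > a[i + 1] then
        swap(a[i], a[i + 1])
end;

begin
  for k := 1 to MaxItems do
    data[k] := init[k];
  BubbleSort(data, MaxItems);
  for k := 1 to MaxItems do
    write(data[k], ' ');                  { 1 2 3 4 5 6 7 8 9 10 }
  writeln
end.
```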

Insertion sort

This algorithm sorts all data into sequence by inserting elements into their proper positions.

In the kth pass, we investigate the sublist consisting of the first (k + 1) elements. The (k + 1)th element is inserted into its appropriate position in the sublist.

Insertion sort - Example

4 2 7 3 5 9 1 10 8 6   Initially
2 4 7 3 5 9 1 10 8 6   Result of first pass
2 4 7 3 5 9 1 10 8 6   Result of second pass (already in proper position)
2 3 4 7 5 9 1 10 8 6   Result of third pass
2 3 4 5 7 9 1 10 8 6   Result of fourth pass
2 3 4 5 7 9 1 10 8 6   Result of fifth pass (already in proper position)
1 2 3 4 5 7 9 10 8 6   Result of sixth pass
1 2 3 4 5 7 9 10 8 6   Result of seventh pass (already in proper position)
1 2 3 4 5 7 8 9 10 6   Result of eighth pass
1 2 3 4 5 6 7 8 9 10   Result of ninth pass (final)

Insertion sort - Algorithm

procedure InsertionSort(var a: list; n: integer);
var
  i, j, x: integer;
begin
  for i := 2 to n do
  begin
    x := a[i];
    j := i - 1;
    while (j > 0) and (x < a[j]) do
    begin
      a[j + 1] := a[j];
      j := j - 1
    end;
    a[j + 1] := x
  end
end;


Quick sort

This algorithm sorts by splitting the list of data into two sublists and then sorting the two sublists recursively. A value called the pivot is used to split the list into two sublists.

The pivot can be any value of the same type as that of the elements of the list (not necessarily chosen from the list). However, the efficiency is the highest if the pivot is the median of the elements to be sorted.

Run-time efficiency: in the order of (n log n) on average; if the pivot is badly chosen at every step, the worst case is in the order of n^2.

Quick sort - Example

In each row, L and R are the ends of the sublist being sorted, i scans to the right from L and j scans to the left from R. The positions of the L, i, j and R markers under each row are not reproduced here.

4 2 7 3 5 9 1 10 8 6   Initially. Pivot x = 4.
4 2 7 3 5 9 1 10 8 6   Move i, j until a[i] >= x and a[j] <= x.
1 2 7 3 5 9 4 10 8 6   Swap and move i and j.
1 2 3 7 5 9 4 10 8 6   Swap again. Now i > j. Stop moving.

Sublist L1 = 1 2 3 (all elements in L1 <= 4). Sublist L2 = 7 5 9 4 10 8 6 (all elements in L2 >= 4).

1 2 3 7 5 9 4 10 8 6   Sort the sublist L1. Pivot = 2.
1 2 3 7 5 9 4 10 8 6   Now i > j. Stop moving. Sublist L1 is sorted successfully.

1 2 3 7 5 9 4 10 8 6   Sort the sublist L2.
1 2 3 6 5 9 4 10 8 7   Swap 7 and 6.
1 2 3 6 5 4 9 10 8 7   Swap 9 and 4. Now i > j. Stop moving.

Sublist L21 = 6 5 4 (all elements in L21 <= 7). Sublist L22 = 9 10 8 7.

1 2 3 6 5 4 9 10 8 7   Sort the sublist L21.
1 2 3 4 5 6 9 10 8 7   Swap 6 and 4. Sublist L21 is sorted.

1 2 3 4 5 6 9 10 8 7   Sort the sublist L22.
1 2 3 4 5 6 7 10 8 9   Swap 9 and 7.
1 2 3 4 5 6 7 8 10 9   Swap 10 and 8. Now i > j. Stop moving.

Sublist L221 = 7 8 (all elements in L221 <= 9). Sublist L222 = 10 9.

1 2 3 4 5 6 7 8 10 9   Sort the sublist L221. Pivot = 7 (assumed). No change.
1 2 3 4 5 6 7 8 10 9   Sort the sublist L222. Pivot (x) = 9 (assumed).
1 2 3 4 5 6 7 8 9 10   Swap 10 and 9. Sublist L22 is sorted successfully.
1 2 3 4 5 6 7 8 9 10   Sublist L2 is sorted successfully.
1 2 3 4 5 6 7 8 9 10   The whole list is sorted successfully.

Quick sort - Algorithm

procedure QuickSort(var a: list; n: integer);

  procedure QSort(L, R: integer);
  var
    i, j, x: integer;
  begin
    i := L;
    j := R;
    x := Pivot(L, R);                  { Find a pivot for the sublist a[L..R] }
    repeat
      while a[i] < x do i := i + 1;
      while a[j] > x do j := j - 1;
      if i <= j then
      begin
        swap(a[i], a[j]);
        i := i + 1;
        j := j - 1
      end
    until i > j;
    if j - L >= 1 then QSort(L, j);    { Recursively sort the left sublist }
    if R - i >= 1 then QSort(i, R)     { Recursively sort the right sublist }
  end;

begin { QuickSort }
  QSort(1, n)
end;
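The quick sort procedure calls Pivot(L, R) and swap without defining them. The sketch below fills them in under stated assumptions: Pivot returns the value stored in the middle of a[L..R] (one common choice, not necessarily the rule used in the worked example), and list is assumed to be an array of 10 integers.

```pascal
program QuickSortDemo;

const
  MaxItems = 10;                          { assumed capacity }

type
  list = array[1..MaxItems] of integer;   { assumed definition of the list type }

const
  init: list = (4, 2, 7, 3, 5, 9, 1, 10, 8, 6);

var
  data: list;
  k: integer;

{ Assumed helper: exchanges the values of x and y }
procedure swap(var x, y: integer);
var
  temp: integer;
begin
  temp := x;
  x := y;
  y := temp
end;

procedure QuickSort(var a: list; n: integer);

  { Assumed pivot rule: the value stored in the middle of a[L..R] }
  function Pivot(L, R: integer): integer;
  begin
    Pivot := a[(L + R) div 2]
  end;

  procedure QSort(L, R: integer);
  var
    i, j, x: integer;
  begin
    i := L;
    j := R;
    x := Pivot(L, R);
    repeat
      while a[i] < x do i := i + 1;     { skip items already on the left side }
      while a[j] > x do j := j - 1;     { skip items already on the right side }
      if i <= j then
      begin
        swap(a[i], a[j]);
        i := i + 1;
        j := j - 1
      end
    until i > j;
    if j - L >= 1 then QSort(L, j);
    if R - i >= 1 then QSort(i, R)
  end;

begin
  if n > 1 then
    QSort(1, n)
end;

begin
  for k := 1 to MaxItems do
    data[k] := init[k];
  QuickSort(data, MaxItems);
  for k := 1 to MaxItems do
    write(data[k], ' ');                  { 1 2 3 4 5 6 7 8 9 10 }
  writeln
end.
```

Because the pivot is taken from inside the sublist, both inner while loops are guaranteed to stop within a[L..R].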

Merge sort

This algorithm sorts all data into sequence by recursively merging each pair of sorted sublists into one.

In merge sort, extra space has to be allocated to temporarily store the merged list (the other sorting algorithms described here do not need extra blocks of memory for temporary list storage). Because merging only reads the sublists sequentially, it is also the basis of external sorting.

Run-time efficiency: in the order of (n log n)

Merge sort - Example

4 2 7 3 5 9 1 10 8 6   Initially. Sublist L1 = 4 2 7 3 5, sublist L2 = 9 1 10 8 6.
4 2 7 3 5 9 1 10 8 6   Merge-sort sublist L1: split it into L11 = 4 2 7 and L12 = 3 5.
4 2 7 3 5 9 1 10 8 6   Split L11 into L111 = 4 2 and L112 = 7.
4 2 7 3 5 9 1 10 8 6   Split L111 into L1111 = 4 and L1112 = 2.
2 4 7 3 5 9 1 10 8 6   Merge sublists L1111 & L1112 to form sorted sublist L111' = 2 4.
2 4 7 3 5 9 1 10 8 6   Merge sublists L111' & L112 to form sorted sublist L11' = 2 4 7.
2 4 7 3 5 9 1 10 8 6   Split L12 into L121 = 3 and L122 = 5, then merge them to form sorted sublist L12' = 3 5.
2 3 4 5 7 9 1 10 8 6   Merge sublists L11' & L12' to form sorted sublist L1'.

2 3 4 5 7 9 1 10 8 6   Merge-sort sublist L2: split it into L21 = 9 1 10 and L22 = 8 6.
2 3 4 5 7 9 1 10 8 6   Split L21 into L211 = 9 1 and L212 = 10, then L211 into L2111 = 9 and L2112 = 1.
2 3 4 5 7 1 9 10 8 6   Merge sublists L2111 & L2112 to form sorted sublist L211' = 1 9.
2 3 4 5 7 1 9 10 8 6   Merge sublists L211' & L212 to form sorted sublist L21' = 1 9 10.
2 3 4 5 7 1 9 10 8 6   Split L22 into L221 = 8 and L222 = 6.
2 3 4 5 7 1 9 10 6 8   Merge sublists L221 & L222 to form sorted sublist L22' = 6 8.
2 3 4 5 7 1 6 8 9 10   Merge sublists L21' & L22' to form sorted sublist L2'.

1 2 3 4 5 6 7 8 9 10   Merge sublists L1' & L2' to form the whole sorted list.

Merge sort - Algorithm

procedure MergeSort(var a: list; n: integer);

  procedure MSort(L, R: integer);
  var
    mid: integer;
  begin
    if L < R then
    begin
      mid := (L + R) div 2;
      MSort(L, mid);              { Recursively sort the left sublist }
      MSort(mid + 1, R);          { Recursively sort the right sublist }
      Merge(L, mid, mid + 1, R)   { Merge the two sublists }
    end
  end;

begin { MergeSort }
  MSort(1, n)
end;
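The merge sort procedure depends on a Merge(L1, R1, L2, R2) helper that the text does not show. One possible version is sketched here; the parameter names, the temporary list and the definition of the list type are assumptions for illustration.

```pascal
program MergeSortDemo;

const
  MaxItems = 10;                          { assumed capacity }

type
  list = array[1..MaxItems] of integer;   { assumed definition of the list type }

const
  init: list = (4, 2, 7, 3, 5, 9, 1, 10, 8, 6);

var
  data: list;
  k: integer;

procedure MergeSort(var a: list; n: integer);

  { Assumed helper: merges the sorted runs a[L1..R1] and a[L2..R2],
    where L2 = R1 + 1, through a temporary list }
  procedure Merge(L1, R1, L2, R2: integer);
  var
    temp: list;
    i, j, t: integer;
  begin
    i := L1;
    j := L2;
    t := L1;
    while (i <= R1) and (j <= R2) do
    begin
      if a[i] <= a[j] then
      begin
        temp[t] := a[i];
        i := i + 1
      end
      else
      begin
        temp[t] := a[j];
        j := j + 1
      end;
      t := t + 1
    end;
    while i <= R1 do                      { copy what is left of the first run }
    begin
      temp[t] := a[i];
      i := i + 1;
      t := t + 1
    end;
    while j <= R2 do                      { copy what is left of the second run }
    begin
      temp[t] := a[j];
      j := j + 1;
      t := t + 1
    end;
    for t := L1 to R2 do                  { copy the merged run back }
      a[t] := temp[t]
  end;

  procedure MSort(L, R: integer);
  var
    mid: integer;
  begin
    if L < R then
    begin
      mid := (L + R) div 2;
      MSort(L, mid);
      MSort(mid + 1, R);
      Merge(L, mid, mid + 1, R)
    end
  end;

begin
  MSort(1, n)
end;

begin
  for k := 1 to MaxItems do
    data[k] := init[k];
  MergeSort(data, MaxItems);
  for k := 1 to MaxItems do
    write(data[k], ' ');                  { 1 2 3 4 5 6 7 8 9 10 }
  writeln
end.
```

The temporary list inside Merge is the extra storage that merge sort needs and the in-place methods do not.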

Selection sort

This algorithm sorts all data into sequence by exchanging the largest element of the unsorted part with its last element at each pass (assuming sorting into ascending order).


Selection sort - Example

4 2 7 3 5 9 1 10 8 6   Initially
4 2 7 3 5 9 1 6 8 10   Result of first pass
4 2 7 3 5 8 1 6 9 10   Result of second pass
4 2 7 3 5 6 1 8 9 10   Result of third pass
4 2 1 3 5 6 7 8 9 10   Result of fourth pass
4 2 1 3 5 6 7 8 9 10   Result of fifth pass
4 2 1 3 5 6 7 8 9 10   Result of sixth pass
3 2 1 4 5 6 7 8 9 10   Result of seventh pass
1 2 3 4 5 6 7 8 9 10   Result of eighth pass
1 2 3 4 5 6 7 8 9 10   Result of ninth pass (final)

Selection sort - Algorithm

procedure SelectionSort(var a: list; n: integer);
var
  i, j, pos, largest: integer;
begin
  for i := n downto 2 do
  begin
    largest := a[1];
    pos := 1;
    for j := 1 to i do
      if a[j] > largest then
      begin
        largest := a[j];
        pos := j
      end;
    swap(a[i], a[pos])
  end
end;

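Shell sort

This algorithm divides the list of data into several interlaced sublists and sorts them (usually by insertion sort).

The gap between two adjacent elements in the interlaced sublists is gradually decreased until the gap is equal to 1.

Using this method, the final insertion sort over the whole list (gap = 1) takes less time, because the earlier passes have already moved most elements close to their final positions.

Run-time efficiency: often quoted as in the order of (n log n), but the actual figure depends on the sequence of gaps used; with repeated halving of the gap, the worst case is in the order of n^2.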

Shell sort - Example

4 2 7 3 5 9 1 10 8 6   Initially

Initial gap = 10 div 2 = 5

4 9    1st sublist for gap = 5
2 1    2nd sublist for gap = 5
7 10   3rd sublist for gap = 5
3 8    4th sublist for gap = 5
5 6    5th sublist for gap = 5

4 9    Sorted 1st sublist for gap = 5
1 2    Sorted 2nd sublist for gap = 5
7 10   Sorted 3rd sublist for gap = 5
3 8    Sorted 4th sublist for gap = 5
5 6    Sorted 5th sublist for gap = 5

4 1 7 3 5 9 2 10 8 6   Intermediate result

Next gap = 5 div 2 = 2

4 7 5 2 8     1st sublist for gap = 2
1 3 9 10 6    2nd sublist for gap = 2

2 4 5 7 8     Sorted 1st sublist for gap = 2
1 3 6 9 10    Sorted 2nd sublist for gap = 2

2 1 4 3 5 6 7 9 8 10   Intermediate result

Next gap = 2 div 2 = 1

2 1 4 3 5 6 7 9 8 10   Sort the whole list
1 2 3 4 5 6 7 8 9 10   Final result

Shell sort - Algorithm

procedure ShellSort(var a: list; n: integer);
var
  i, j, x, gap: integer;
begin
  gap := InitialGap(n);                 { The initial gap }
  while gap > 0 do
  begin
    for i := gap + 1 to n do            { i begins at the 2nd key of the first sublist }
    begin
      x := a[i];                        { The item to be inserted }
      j := i - gap;
      while (j > 0) and (x < a[j]) do
      begin
        a[j + gap] := a[j];
        j := j - gap
      end;
      a[j + gap] := x
    end;
    gap := NextGap(gap, n)              { The next gap (should be smaller) }
  end
end;
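InitialGap and NextGap are left undefined in the procedure. The sketch below assumes the simplest rule, which matches the worked example (10 div 2 = 5, 5 div 2 = 2, 2 div 2 = 1): start at half the list length and halve the gap each time. Other gap sequences are possible, and the list type is again an assumed array of 10 integers.

```pascal
program ShellSortDemo;

{$B-}  { the inner while loop relies on short-circuit Boolean evaluation }

const
  MaxItems = 10;                          { assumed capacity }

type
  list = array[1..MaxItems] of integer;   { assumed definition of the list type }

const
  init: list = (4, 2, 7, 3, 5, 9, 1, 10, 8, 6);

var
  data: list;
  k: integer;

{ Assumed gap rule: start with half the list length }
function InitialGap(n: integer): integer;
begin
  InitialGap := n div 2
end;

{ Assumed gap rule: halve the gap each time. The parameter n is unused
  here and is kept only to match the call in the procedure. }
function NextGap(gap, n: integer): integer;
begin
  NextGap := gap div 2
end;

procedure ShellSort(var a: list; n: integer);
var
  i, j, x, gap: integer;
begin
  gap := InitialGap(n);
  while gap > 0 do
  begin
    for i := gap + 1 to n do
    begin
      x := a[i];                          { the item to be inserted }
      j := i - gap;
      while (j > 0) and (x < a[j]) do     { shift larger items up by one gap }
      begin
        a[j + gap] := a[j];
        j := j - gap
      end;
      a[j + gap] := x
    end;
    gap := NextGap(gap, n)                { reaches 0 after the gap = 1 pass }
  end
end;

begin
  for k := 1 to MaxItems do
    data[k] := init[k];
  ShellSort(data, MaxItems);
  for k := 1 to MaxItems do
    write(data[k], ' ');                  { 1 2 3 4 5 6 7 8 9 10 }
  writeln
end.
```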

Merging

Merging is the task of combining two or more groups of data to form a new one that preserves the order of the original groups of data.

Merging is usually operated on files and lists. The files or the lists to be merged are already sorted in a particular order (e.g. in ascending order of the key field), and the merged file or the merged list is sorted in the same order as that of the original files or lists.

Merging - Example

First list
Key:    4   11   15   20   31
Data:   a   b    c    d    e

Second list
Key:    3   7    11   21   35   39
Data:   f   g    h    i    j    k

Merged list
Key:    3   4    7    11      15   20   21   31   35   39
Data:   f   a    g    b + h   c    d    i    e    j    k

Merging - Algorithm

procedure merge(A, B: list; var C: list);
var
  iA, iB, iC: integer;
begin
  iA := 1; iB := 1; iC := 1;
  while (iA <= NumberOfEntries(A)) and (iB <= NumberOfEntries(B)) do
  begin
    if A[iA].key < B[iB].key then
    begin
      C[iC] := A[iA]; iC := iC + 1; iA := iA + 1
    end
    else if A[iA].key > B[iB].key then
    begin
      C[iC] := B[iB]; iC := iC + 1; iB := iB + 1
    end
    else
    begin
      { Here A[iA].key = B[iB].key and we should combine the two entries
        according to the need of actual application }
      C[iC] := combine(A[iA], B[iB]);
      iC := iC + 1; iA := iA + 1; iB := iB + 1
    end
  end;
  { Copy whatever remains in A }
  while iA <= NumberOfEntries(A) do
  begin
    C[iC] := A[iA];
    iC := iC + 1; iA := iA + 1
  end;
  { Copy whatever remains in B }
  while iB <= NumberOfEntries(B) do
  begin
    C[iC] := B[iB];
    iC := iC + 1; iB := iB + 1
  end
end;

Run-time efficiency: in the order of n, where n is the total number of entries in the lists being merged.

Merging and external sorting

Sometimes we want to sort a file which contains a huge number of records stored on a hard disk, but the file is so large that it cannot fit into the main memory. In this case, external sorting is required.

The basic idea of sorting a huge file stored in secondary storage is:

1. Break the huge file into smaller files, each of which is small enough to fit into the main memory.
2. For each of these small files, load it into the main memory, sort it and save it to the secondary storage.
3. Merge the smaller files to give a single sorted file.

File updating

Sequential file updating

It is not possible to write data to existing sequential files. Therefore, to update a file, we have to write the updated records to a new file. However, if the file size is small, we can also use arrays to manipulate the records in the file.

Appending data to a file

Appending data to a file means adding the data to the end of the existing file. The algorithm is as follows:

1. Open the source file for input and open a new file for output.
2. While the end of the source file is not reached, repeat steps 3 to 4:
   3. Read a record from the source file.
   4. Write the record to the new file.
5. Read a new record.
6. While there are more records, repeat steps 7 to 8:
   7. Write the new record to the new file.
   8. Read the next new record.
9. Close all files.

Inserting a new record into a file

When a new record is inserted into a sequential file, it must be inserted in the proper position so that the order of the sequence is kept.

1. Get the new record NRec to be inserted.
2. Set the Boolean variable Done to FALSE.
3. While the end of the source file is not reached and Done is FALSE, repeat steps 4 to 6:
   4. Read a record from the source file into MRec.
   5. If key of NRec < key of MRec, then write NRec to the new file and set Done to TRUE.
   6. Write MRec to the new file.
7. If Done is FALSE, then write NRec to the new file, else copy the remaining records in the source file to the new file.
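The insertion steps can be sketched in Turbo Pascal style as below. To keep the sketch self-contained, each record is assumed to be a single integer that acts as its own key, the file names source.dat and new.dat are hypothetical, and the program first builds a small sorted source file.

```pascal
program InsertRecordDemo;

type
  IntFile = file of integer;   { assumed record type: one integer key }

var
  source, target: IntFile;
  NRec, MRec, k: integer;
  Done: boolean;

begin
  { Build a small sorted source file: 10 20 30 40 }
  assign(source, 'source.dat');
  rewrite(source);
  for k := 1 to 4 do
  begin
    MRec := k * 10;
    write(source, MRec)
  end;
  close(source);

  NRec := 25;                  { step 1: the record to be inserted }
  Done := false;               { step 2 }
  reset(source);
  assign(target, 'new.dat');
  rewrite(target);
  while (not eof(source)) and (not Done) do    { step 3 }
  begin
    read(source, MRec);                        { step 4 }
    if NRec < MRec then                        { step 5 }
    begin
      write(target, NRec);
      Done := true
    end;
    write(target, MRec)                        { step 6 }
  end;
  if not Done then                             { step 7 }
    write(target, NRec)
  else
    while not eof(source) do
    begin
      read(source, MRec);
      write(target, MRec)
    end;
  close(source);
  close(target);

  { Show the new file }
  reset(target);
  while not eof(target) do
  begin
    read(target, MRec);
    write(MRec, ' ')           { 10 20 25 30 40 }
  end;
  writeln;
  close(target)
end.
```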

Deleting a record from a file

1. Get the key field of the record to be deleted.
2. Search the source file for a match and copy each non-match to a new file.
3. Skip the matching record and copy the remainder of the source file to the new file.

Changing a record in a file

1. Get the key field of the record to be changed.
2. Search the input file for a match and copy each non-match to a new file.
3. Make any changes to the record and write it to the new file.
4. Copy the remainder of the input file to the new file.

Direct-access file updating

Standard Pascal can only process a file sequentially. Turbo Pascal provides several functions and procedures that enable direct access to a binary file.

With direct access, we can intermix read and write operations. To open a binary file for direct access:

use assign to associate a file variable with a file, and use reset to open the file.

Function filesize

Syntax

filesize(<File Variable>)

This function returns the number of components in the binary file <File Variable>.

Procedure seek

Syntax

seek(<File Variable>, <Record Number>)

This procedure moves the file location marker for the binary file associated with <File Variable> to the file component number <Record Number>, where the first component has number 0.

If f is a file variable, then the range of valid <Record Number> is from 0 to filesize(f) – 1 inclusive.
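A minimal sketch of direct-access updating with assign, reset, filesize and seek. The file name numbers.dat and the use of plain integers as file components are assumptions made for illustration.

```pascal
program DirectAccessDemo;

var
  f: file of integer;          { assumed component type }
  value, k: integer;

begin
  { Create a file whose components 0..4 hold 0, 10, 20, 30, 40 }
  assign(f, 'numbers.dat');    { hypothetical file name }
  rewrite(f);
  for k := 0 to 4 do
  begin
    value := k * 10;
    write(f, value)
  end;
  close(f);

  reset(f);                    { open the existing file for direct access }
  writeln(filesize(f));        { 5 components }

  seek(f, 2);                  { component number 2 is the third one }
  read(f, value);              { value = 20; the marker moves on to component 3 }
  value := value + 1;
  seek(f, 2);                  { move back before overwriting }
  write(f, value);

  seek(f, filesize(f) - 1);    { the last valid component }
  read(f, value);
  writeln(value);              { 40 }

  seek(f, 2);
  read(f, value);
  writeln(value);              { 21 }
  close(f)
end.
```

Note that each read or write advances the file location marker by one component, which is why the sketch calls seek again before writing the changed value back.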
