Post on 14-Apr-2018
7/30/2019 Sorting Classnotes AAM
Introduction
Sorting and searching are fundamental operations in computer science. Sorting
refers to the operation of arranging data in some given order. Searching refers to
the operation of finding a particular record within the existing information.
Normally, information retrieval involves searching, sorting and merging. In this chapter we will discuss the searching and sorting techniques in detail.
Sorting
Sorting is important in almost every computer application. Sorting refers to arranging
data elements in some given order. Many sorting algorithms are available to sort
a given set of elements. We will now discuss two classes of sorting techniques and analyze
their performance. The two techniques are internal sorting and external sorting.
Internal sorting
Internal sorting takes place in the main memory of a computer. Internal
sorting methods are applied to small collections of data; that is, the entire
collection of data to be sorted is small enough that the sorting can take place
within main memory. We will study the following methods of internal sorting:
Insertion sort
Selection sort
Merge sort
Radix sort
Quick sort
Heap sort
Bubble sort
Insertion sort
In this method we read the given elements from 1 to n, inserting each element
into its proper position among the elements already read. For example, consider a card player arranging the cards dealt to
him. The player picks up each card and inserts it into its proper position in his hand. At
every step, we insert the item into its proper place.
This sorting algorithm is frequently used when n is small. The insertion sort
algorithm scans A from A[1] to A[N], inserting each element A[K] into its proper
position in the previously sorted subarray A[1], A[2], . . . , A[K-1]. That is:
Pass 1. A[1] by itself is trivially sorted.
Pass 2. A[2] is inserted either before or after A[1] so that: A[1], A[2] is sorted.
Pass 3. A[3] is inserted into its proper place in A[1], A[2], that is, before A[1],
between A[1] and A[2], or after A[2], so that: A[1], A[2], A[3] is sorted.
Pass 4. A[4] is inserted into its proper place in A[1], A[2], A[3] so that:
A[1], A[2], A[3], A[4] is sorted.
. . .
Pass N. A[N] is inserted into its proper place in A[1], A[2], . . . , A[N - 1] so that: A[1], A[2], . . . , A[N] is sorted.
Algorithm INSERTION(A, N)
This algorithm sorts the array A with N elements.
1. Set A[0] := -∞. [Initializes sentinel element]
2. Repeat Steps 3 to 5 for K = 2, 3, ..., N:
3.     Set TEMP := A[K] and PTR := K - 1.
4.     Repeat while TEMP < A[PTR]:
           (a) Set A[PTR+1] := A[PTR]. [Moves element forward]
           (b) Set PTR := PTR - 1.
       [End of loop]
5.     Set A[PTR+1] := TEMP. [Inserts element in proper place]
   [End of Step 2 loop]
6. Return.
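The INSERTION procedure above can be sketched in Python as follows. This is a minimal illustration, not the notes' own code: it assumes 0-based Python lists, and the function name insertion_sort is made up for the example. Instead of storing a sentinel in A[0], the inner loop checks the index bound directly.

```python
def insertion_sort(items):
    """Return a sorted copy of items using insertion sort."""
    a = list(items)                 # work on a copy; the input is left untouched
    for k in range(1, len(a)):      # a[0:k] is already sorted
        temp = a[k]
        ptr = k - 1
        # Shift larger elements one slot to the right.
        # The ptr >= 0 check plays the role of the sentinel in the pseudocode.
        while ptr >= 0 and temp < a[ptr]:
            a[ptr + 1] = a[ptr]
            ptr -= 1
        a[ptr + 1] = temp           # insert the element in its proper place
    return a


print(insertion_sort([77, 33, 44, 11, 88, 22, 66, 55]))
# [11, 22, 33, 44, 55, 66, 77, 88]
```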
Selection sort
In this method we find the smallest element in the list and put it in the first
position. Then we find the second smallest element in the list and put it in the second
position, and so on.
Pass 1. Find the location LOC of the smallest in the list of N elements A[1], A[2], . . . , A[N], and then interchange A[LOC] and A[1]. Then A[1] is sorted.
Pass 2. Find the location LOC of the smallest in the sublist of N - 1 elements
A[2], A[3], . . . , A[N], and then interchange A[LOC] and A[2]. Then A[1], A[2] is
sorted, since A[1] ≤ A[2].
Example
Suppose an array A contains 8 elements as follows: 77, 33, 44, 11, 88, 22, 66, 55
Algorithm
1. To find the minimum element
MIN(A, K, N, LOC)
An array A is in memory. This procedure finds the location
LOC of the smallest element among A[K], A[K+1], ..., A[N].
1. Set MIN := A[K] and LOC := K. [Initializes pointers]
2. Repeat for J = K+1, K+2, ..., N:
       If MIN > A[J], then: Set MIN := A[J] and LOC := J.
   [End of loop]
3. Return.
2. To sort the elements
SELECTION(A, N)
1. Repeat Steps 2 and 3 for K = 1, 2, ..., N - 1:
2.     Call MIN(A, K, N, LOC).
3.     [Interchange A[K] and A[LOC]]
       Set TEMP := A[K], A[K] := A[LOC] and A[LOC] := TEMP.
   [End of Step 1 loop]
4. Exit.
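As a rough Python counterpart of MIN and SELECTION, the sketch below uses 0-based lists. The names find_min and selection_sort are chosen for this example only, and find_min returns the location instead of writing it to an output parameter LOC.

```python
def find_min(a, k):
    """Return the index of the smallest element among a[k], a[k+1], ..., a[-1]."""
    loc = k
    for j in range(k + 1, len(a)):
        if a[j] < a[loc]:
            loc = j
    return loc


def selection_sort(items):
    """Return a sorted copy of items using selection sort."""
    a = list(items)
    for k in range(len(a) - 1):
        loc = find_min(a, k)
        a[k], a[loc] = a[loc], a[k]   # interchange A[K] and A[LOC]
    return a


print(selection_sort([77, 33, 44, 11, 88, 22, 66, 55]))
# [11, 22, 33, 44, 55, 66, 77, 88]
```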
Merge sort
Combining two sorted lists is called merging. For example, suppose A is a sorted list with r
elements and B is a sorted list with s elements. The operation that combines the
elements of A and B into a single sorted list C with n = r + s elements is called
merging. The two lists are combined, with the elements kept in sorted order, by using the
following merging algorithm. Suppose one is given two sorted decks of cards. The
decks are merged as in the figure. That is, at each step, the two front cards are compared
and the smaller one is placed in the combined deck. When one of the decks is
empty, all of the remaining cards in the other deck are put at the end of the
combined deck. Similarly, suppose we have two lines of students sorted by
increasing heights, and suppose we want to merge them into a single sorted line.
The new line is formed by choosing, at each step, the shorter of the two students
who are at the head of their respective lines. When one of the lines has no more
students, the remaining students line up at the end of the combined line.
The above discussion will now be translated into a formal algorithm which merges
a sorted r-element array A and a sorted s-element array B into a sorted array C
with n = r + s elements. First of all, we must always keep track of the locations of
the smallest element of A and the smallest element of B which have not yet been
placed in C. Let NA and NB denote these locations, respectively. Also, let PTR
denote the location in C to be filled. Thus, initially, we set NA := 1, NB := 1 and
PTR := 1. At each step of the algorithm, we compare A[NA] and B[NB] and
assign the smaller element to C[PTR]. Then we increment PTR by setting PTR :=
PTR + 1, and we either increment NA by setting NA := NA + 1 or increment NB
by setting NB := NB + 1, according to whether the new element in C has come
from A or from B. Furthermore, if NA > r, then the remaining elements of B are
assigned to C; or if NB > s, then the remaining elements of A are assigned to C.
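Algorithm MERGING(A, R, B, S, C)
Let A and B be sorted arrays with R and S elements, respectively. This algorithm
merges A and B into an array C with N = R + S elements.
1. [Initialize] Set NA := 1, NB := 1 and PTR := 1.
2. [Compare] Repeat while NA ≤ R and NB ≤ S:
       If A[NA] < B[NB], then:
           Set C[PTR] := A[NA], PTR := PTR + 1 and NA := NA + 1.
       Else:
           Set C[PTR] := B[NB], PTR := PTR + 1 and NB := NB + 1.
       [End of If structure]
   [End of loop]
3. [Assign remaining elements to C]
   If NA > R, then: assign the remaining elements B[NB], ..., B[S] to C.
   Else: assign the remaining elements A[NA], ..., A[R] to C.
4. Exit.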
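The merging procedure described above, with its pointers NA, NB and PTR, might look like this in Python. It is only a sketch under a few assumptions: lists are 0-based, the output list c grows by appending so no explicit PTR is needed, and the function name merge is illustrative. On equal elements it takes from a first, which is a choice made here to keep the merge stable.

```python
def merge(a, b):
    """Merge two sorted lists a and b into one new sorted list."""
    c = []
    na, nb = 0, 0                       # next unplaced element of a and of b
    while na < len(a) and nb < len(b):
        if a[na] <= b[nb]:              # ties go to a (assumption: keeps the merge stable)
            c.append(a[na])
            na += 1
        else:
            c.append(b[nb])
            nb += 1
    # One of the two lists is exhausted; copy whatever remains of the other.
    c.extend(a[na:])
    c.extend(b[nb:])
    return c


print(merge([11, 33, 77], [22, 44, 55, 88]))
# [11, 22, 33, 44, 55, 77, 88]
```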
Quick sort Algorithm
Quick sort is an example of the "divide and conquer" approach to solving problems.
The quick sort algorithm works by placing the last element of the array (the pivot) in its proper position, by comparing it with the other elements starting from the first end of the array.
The steps followed by the quick sort algorithm are as follows:
1. Choose the dividing (pivot) point of the array, i.e. the last element of the array.
2. Compare each element of the array with the pivot, starting from the beginning of the array. If an element is less than or equal to the pivot, place it on the left-hand side by exchanging elements; elements greater than the pivot end up on the right-hand side.
3. After completing the pass, exchange the pivot with the first element of the right-hand side, so that all elements to its left are less than or equal to it and all elements to its right are greater. With the pivot in place, divide the array into two parts.
4. Choose a pivot in each of the two parts and sort them separately by repeating steps 1, 2 and 3.
5. Repeat the process until every part is sorted. Because each part is sorted in place, the sorted parts together form one sorted array.
The structure of the quick sort algorithm is as follows. The Partition procedure sets the pointers that split the array:
Quick_sort(Array, S, Piv)
    If S < Piv
        Then q ← Partition(Array, S, Piv)
             Quick_sort(Array, S, q - 1)
             Quick_sort(Array, q + 1, Piv)
Partition(Array, S, Piv)
    x ← Array[Piv]
    i ← S - 1
    For j ← S to Piv - 1
        do If Array[j] ≤ x
            Then i ← i + 1
                 Exchange Array[i] ↔ Array[j]
    Exchange Array[i + 1] ↔ Array[Piv]
    Return i + 1 (Return the final position of the pivot)
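A possible Python rendering of Quick_sort and Partition is shown below. It keeps the parameter names s and piv from the pseudocode, sorts the list in place, and assumes 0-based indexing. The default arguments are a convenience added for this example so the whole list can be sorted with a single call.

```python
def partition(array, s, piv):
    """Place array[piv] in its final position within array[s..piv]; return that position."""
    x = array[piv]                      # pivot value: the last element of the range
    i = s - 1                           # end of the "less than or equal to pivot" region
    for j in range(s, piv):
        if array[j] <= x:
            i += 1
            array[i], array[j] = array[j], array[i]
    array[i + 1], array[piv] = array[piv], array[i + 1]
    return i + 1


def quick_sort(array, s=0, piv=None):
    """Sort array[s..piv] in place; by default the whole list."""
    if piv is None:
        piv = len(array) - 1
    if s < piv:
        q = partition(array, s, piv)
        quick_sort(array, s, q - 1)     # elements left of the pivot
        quick_sort(array, q + 1, piv)   # elements right of the pivot


data = [77, 33, 44, 11, 88, 22, 66, 55]
quick_sort(data)
print(data)  # [11, 22, 33, 44, 55, 66, 77, 88]
```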
Bubble sort Algorithm
Bubble sort is also called sinking sort. The bubble sort algorithm works step by step, comparing each element with
the next element and swapping them if they are out of order. This procedure repeats until all elements in
the array are sorted in the required sequence.
Bubble sort gets its name because the elements are sorted over a short range: only the next value in the array is checked and
swapped, so larger elements gradually bubble towards the end of the array. Because it sorts by comparing pairs of elements,
it is also a comparison sort.
Now we will see the algorithm structure as follows:
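A minimal Python sketch of the procedure just described is given here. The function name bubble_sort is illustrative, and the early exit when a pass makes no swap is a common refinement added for this example, not something stated in the notes.

```python
def bubble_sort(items):
    """Return a sorted copy of items using bubble sort."""
    a = list(items)
    n = len(a)
    for pass_no in range(n - 1):
        swapped = False
        # After each pass the largest remaining element has reached the end,
        # so the compared range shrinks by one.
        for j in range(n - 1 - pass_no):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                swapped = True
        if not swapped:                 # no swap in this pass: already sorted
            break
    return a


print(bubble_sort([77, 33, 44, 11, 88, 22, 66, 55]))
# [11, 22, 33, 44, 55, 66, 77, 88]
```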
Radix Sort
Radix sorting involves looking at one radix position (digit) of each number at a time and placing the number in
an array of linked lists indexed by that digit.
Algorithm for radix sorting:
1. Look at the rightmost digit.
2. Assign the full number to that digit's index.
3. Look at the next digit to the left, taking the numbers FROM the current sorted array. IF there is
no digit, pad a 0.
4. REPEAT STEP 3 UNTIL all numbers have been sorted.
Let's see a step-by-step example of a radix sort of the following set of unsorted
numbers. The first digit to look at when sorting the list is the rightmost digit of each number.
Each number is appended to the end of the linked list at that digit's index in the array.
212 21 72 5 431 898 616 24 9
Step 1:
0
1 21 -> 431
2 212 -> 72
3
4 24
5 05
6 616
7
8 898
9 09
Step 2: (working from step 1)
0 005 -> 009
1 212 -> 616
2 021 -> 024
3 431
4
5
6
7 072
8
9 898
Step 3: (working from step 2)
0 5 -> 9 -> 21 -> 24 -> 72
1
2 212
3
4 431
5
6 616
7
8 898
9
Step 3 is the final step and the list is sorted.
One benefit of radix sort is that it can be done with pencil and paper. It
also uses only a fixed data structure (an array of size 10). The downside of
radix sort is that it takes time to carry out, since you may have to go through numerous steps to sort the list, depending on how many numbers you have to sort and how many digits they have.
Here is another example of radix sort, this time using numbers up to 4 digits in
length. You will notice something interesting here.
58 99 999 47 200 101 1002 12 1111
Step 1:
0 200
1 101 -> 1111
2 1002 -> 12
3
4
5
6
7 47
8 58
9 99 -> 999
Step 2: (working from step 1)
0 200 -> 101 -> 1002
1 1111 -> 012
2
3
4 047
5 058
6
7
8
9 099 -> 999
Step 3: (working from step 2)
0 1002 -> 0012 -> 0047 -> 0058 -> 0099
1 0101 -> 1111
2 0200
3
4
5
6
7
8
9 0999
Step 4: (working from step 3)
0 12 -> 47 -> 58 -> 99 -> 101 -> 200 -> 999
1 1002 -> 1111
2
3
4
5
6
7
8
9
Step 4 is the final step here. Notice, however, that index 0 now holds the numbers from 0 to 999,
while index 1 holds the numbers from 1000 to 1999, and so on.
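The two worked examples above can be reproduced with a short Python sketch. It assumes non-negative integers in base 10 and uses plain Python lists in place of the linked lists. The names distribute and radix_sort are made up for this illustration.

```python
def distribute(numbers, place):
    """Put each number into one of 10 buckets by the digit at `place` (1, 10, 100, ...)."""
    buckets = [[] for _ in range(10)]
    for n in numbers:
        digit = (n // place) % 10       # a missing digit counts as 0 (the "pad a 0" rule)
        buckets[digit].append(n)        # append keeps the order from the previous step
    return buckets


def radix_sort(numbers):
    """Return a sorted copy of a list of non-negative integers."""
    a = list(numbers)
    if not a:
        return a
    place = 1
    while max(a) // place > 0:          # one step per digit of the largest number
        buckets = distribute(a, place)
        a = [n for bucket in buckets for n in bucket]   # read buckets 0..9 in order
        place *= 10
    return a


print(distribute([212, 21, 72, 5, 431, 898, 616, 24, 9], 1)[1])   # [21, 431]
print(radix_sort([212, 21, 72, 5, 431, 898, 616, 24, 9]))
# [5, 9, 21, 24, 72, 212, 431, 616, 898]
```

Appending to the end of each bucket is what lets a later step preserve the order produced by the earlier steps.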
Heapsort
Heaps
The (Binary) heap data structure is an array object that can be viewed as a nearly
complete binary tree.
A binary tree with n nodes and depth k is complete iff its nodes correspond to
the nodes numbered from 1 to n in the full binary tree of depth k.
Attributes of a Heap
An array A that represents a heap has two attributes:
length[A]: the number of elements in the array.
heapsize[A]: the number of elements in the heap stored within
array A.
length[A] ≥ heapsize[A]
Basic procedures
If a complete binary tree with n nodes is represented
sequentially, then for any node with index i, 1 ≤ i ≤ n, we have:
A[1] is the root of the tree
the parent PARENT(i) is at ⌊i/2⌋ if i ≠ 1
the left child LEFT(i) is at 2i
the right child RIGHT(i) is at 2i+1
The LEFT procedure can compute 2i in one instruction by simply shifting the binary representation of i left one bit position.
Similarly, the RIGHT procedure can quickly compute 2i+1 by shifting the
binary representation of i left one bit position and adding in a 1 as the low-order
bit.
The PARENT procedure can compute ⌊i/2⌋ by shifting i right one bit position.
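The bit-shift tricks can be written out in Python as a small illustration. The sketch keeps the 1-based node numbering of the text; the lowercase function names are chosen for this example.

```python
def parent(i):
    """Index of the parent of node i (1-based, i > 1): floor(i / 2)."""
    return i >> 1           # shift right one bit position


def left(i):
    """Index of the left child of node i (1-based): 2i."""
    return i << 1           # shift left one bit position


def right(i):
    """Index of the right child of node i (1-based): 2i + 1."""
    return (i << 1) | 1     # shift left, then set the low-order bit


print(parent(5), left(5), right(5))  # 2 10 11
```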
Heap properties
There are two kinds of binary heaps: max-heaps and min-heaps.
In a max-heap, the max-heap property is that for every node i other than the
root, A[PARENT(i)] ≥ A[i].
the largest element in a max-heap is stored at the root
the subtree rooted at a node contains values no larger than that contained at the
node itself
In a min-heap, the min-heap property is that for every node i other than the root,
A[PARENT(i)] ≤ A[i].
the smallest element in a min-heap is at the root
the subtree rooted at a node contains values no smaller than that contained at the
node itself
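Both properties can be checked directly. The sketch below assumes the heap is stored in a 0-based Python list, so the parent of index i is (i - 1) // 2 rather than ⌊i/2⌋. The function names and the sample array are illustrative.

```python
def is_max_heap(a):
    """True if every non-root element is no larger than its parent."""
    return all(a[(i - 1) // 2] >= a[i] for i in range(1, len(a)))


def is_min_heap(a):
    """True if every non-root element is no smaller than its parent."""
    return all(a[(i - 1) // 2] <= a[i] for i in range(1, len(a)))


print(is_max_heap([16, 14, 10, 8, 7, 9, 3, 2, 4, 1]))  # True
print(is_min_heap([16, 14, 10, 8, 7, 9, 3, 2, 4, 1]))  # False
```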
The height of a heap
The height of a node in a heap is the number of edges on the longest simple
downward path from the node to a leaf, and the height of the heap is the height
of the root, that is, Θ(lg n).
For example:
the height of node 2 is 2
the height of the heap is 3
The MAXHEAPIFY procedure
MAXHEAPIFY is an important subroutine for manipulating max-heaps.
Input: an array A and an index i
Output: the subtree rooted at index i becomes a max-heap
Assume: the binary trees rooted at LEFT(i) and RIGHT(i) are max-heaps, but
A[i] may be smaller than its children
Method: let the value at A[i] float down in the max-heap
MAXHEAPIFY(A, i)
    l ← LEFT(i)
    r ← RIGHT(i)
    if l ≤ heapsize[A] and A[l] > A[i]
        then largest ← l
        else largest ← i
    if r ≤ heapsize[A] and A[r] > A[largest]
        then largest ← r
    if largest ≠ i
        then exchange A[i] ↔ A[largest]
             MAXHEAPIFY(A, largest)
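MAXHEAPIFY could be sketched in Python as follows. The assumptions are 0-based list indices (so the children of i are 2i+1 and 2i+2) and the heap size passed as an explicit argument instead of being stored as an attribute of A. The function name and the sample array are illustrative.

```python
def max_heapify(a, i, heap_size):
    """Let a[i] float down until the subtree rooted at i is a max-heap."""
    l = 2 * i + 1                       # left child in 0-based indexing
    r = 2 * i + 2                       # right child in 0-based indexing
    largest = i
    if l < heap_size and a[l] > a[largest]:
        largest = l
    if r < heap_size and a[r] > a[largest]:
        largest = r
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, heap_size)


a = [16, 4, 10, 14, 7, 9, 3, 2, 8, 1]   # only the node at index 1 violates the property
max_heapify(a, 1, len(a))
print(a)  # [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
```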
Building a Heap
We can use the MAXHEAPIFY procedure to convert an array A[1..n] into a max-heap in a bottom-up manner.
The elements in the subarray A[(⌊n/2⌋+1)..n] are all leaves of the tree, and so
each is a 1-element heap.
The procedure BUILDMAXHEAP goes through the remaining nodes of the
tree and runs MAXHEAPIFY on each one.
BUILDMAXHEAP(A)
    heapsize[A] ← length[A]
    for i ← ⌊length[A]/2⌋ downto 1
        do MAXHEAPIFY(A, i)
The heapsort algorithm
Since the maximum element of the array is stored at the root A[1], we can
exchange it with A[n], which puts it into its final position.
If we now discard A[n] from the heap, we observe that A[1..(n-1)] can easily be made into
a max-heap.
The children of the root A[1] remain max-heaps, but the new root element A[1]
may violate the max-heap property, so we need to readjust the max-heap, that is,
call MAXHEAPIFY(A, 1).
HEAPSORT(A)
    BUILDMAXHEAP(A)
    for i ← length[A] downto 2
        do exchange A[1] ↔ A[i]
           heapsize[A] ← heapsize[A] - 1
           MAXHEAPIFY(A, 1)
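Putting the pieces together, BUILDMAXHEAP and HEAPSORT might be written in Python as below. This is a sketch with 0-based indices and the heap size passed explicitly; the function names and the sample array are illustrative, not taken from the notes.

```python
def max_heapify(a, i, heap_size):
    """Let a[i] float down until the subtree rooted at i is a max-heap."""
    l, r = 2 * i + 1, 2 * i + 2         # children in 0-based indexing
    largest = i
    if l < heap_size and a[l] > a[largest]:
        largest = l
    if r < heap_size and a[r] > a[largest]:
        largest = r
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, heap_size)


def build_max_heap(a):
    """Turn the list a into a max-heap, bottom-up."""
    # Indices len(a)//2 .. len(a)-1 are leaves, so start just before them.
    for i in range(len(a) // 2 - 1, -1, -1):
        max_heapify(a, i, len(a))


def heapsort(a):
    """Sort the list a in place."""
    build_max_heap(a)
    for end in range(len(a) - 1, 0, -1):
        a[0], a[end] = a[end], a[0]     # move the current maximum to its final place
        max_heapify(a, 0, end)          # heap now occupies a[0:end]


data = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7]
heapsort(data)
print(data)  # [1, 2, 3, 4, 7, 8, 9, 10, 14, 16]
```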
3 essential properties of algorithms:
In computer science, an in-place algorithm (or in Latin in situ) is an algorithm
which transforms input using a data structure with a small, constant amount of
extra storage space. The input is usually overwritten by the output as the algorithm
executes. An algorithm which is not in-place is sometimes called not-in-place or
out-of-place.
In computer science, an online algorithm is one that can process its input piece-by-
piece in a serial fashion, i.e., in the order that the input is fed to the algorithm,
without having the entire input available from the start. In contrast, an offline
algorithm is given the whole problem data from the beginning and is required to
output an answer which solves the problem at hand.
A sorting algorithm is said to be stable if two objects with equal keys appear in the same order in the sorted output as they appear in the unsorted input array.
Algorithm        In-place   Online   Stable
Insertion sort   Yes        Yes      Yes
Selection sort   Yes        No       No
Merge sort       No         Yes      Yes
Radix sort       No         No       Yes
Quick sort       Yes        Yes      No
Heap sort        Yes        No       No
Bubble sort      Yes        No       Yes
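To make the stability column concrete, the following sketch sorts (key, label) records by key only, once with insertion sort and once with selection sort. The records and function names are invented for this illustration.

```python
def insertion_sort_by_key(records):
    """Sort (key, label) pairs by key with insertion sort."""
    a = list(records)
    for k in range(1, len(a)):
        temp = a[k]
        ptr = k - 1
        # Strict comparison: a record never moves past another with an equal key.
        while ptr >= 0 and temp[0] < a[ptr][0]:
            a[ptr + 1] = a[ptr]
            ptr -= 1
        a[ptr + 1] = temp
    return a


def selection_sort_by_key(records):
    """Sort (key, label) pairs by key with selection sort."""
    a = list(records)
    for k in range(len(a) - 1):
        loc = k
        for j in range(k + 1, len(a)):
            if a[j][0] < a[loc][0]:
                loc = j
        a[k], a[loc] = a[loc], a[k]     # a long-distance swap can jump over equal keys
    return a


records = [(2, "a"), (2, "b"), (1, "c")]
print(insertion_sort_by_key(records))  # [(1, 'c'), (2, 'a'), (2, 'b')]
print(selection_sort_by_key(records))  # [(1, 'c'), (2, 'b'), (2, 'a')]
```

In the selection sort output the two records with key 2 have changed their relative order, which is exactly what "not stable" means.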
External sorting
External sorting is a term for a class of sorting algorithms that can handle massive
amounts of data. External sorting is required when the data being sorted do not fit
into the main memory of a computing device (usually RAM) and instead they must
reside in the slower external memory (usually a hard drive). External sorting
typically uses a hybrid sort-merge strategy. In the sorting phase, chunks of data small enough to fit in main memory are read, sorted, and written out to a temporary
file. In the merge phase, the sorted subfiles are combined into a single larger file.
Basic External Sorting Algorithm
Assume unsorted data is on disk at start.
Let M = maximum number of records that can be stored & sorted in internal
memory at one time.
Algorithm
Repeat:
1. Read M records into main memory & sort internally.
2. Write this sorted sub-list onto disk. (This is one run.)
Until all data is processed into runs.
Repeat:
1. Merge two runs into one sorted run twice as long.
2. Write this single run back onto disk.
Until all runs are processed into runs twice as long.
Merge runs again as often as needed until only one large run remains: the sorted list.
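The run-creation and merge passes can be simulated in Python. This is only a sketch: ordinary lists stand in for the runs that would really be files on disk, m plays the role of M, and the function names are hypothetical. The two-way merge uses heapq.merge from the standard library.

```python
import heapq


def make_runs(records, m):
    """Sorting phase: cut the input into sorted runs of at most m records each."""
    return [sorted(records[i:i + m]) for i in range(0, len(records), m)]


def merge_pass(runs):
    """One merge pass: merge runs pairwise; an unpaired last run is carried over."""
    merged = []
    for i in range(0, len(runs), 2):
        if i + 1 < len(runs):
            merged.append(list(heapq.merge(runs[i], runs[i + 1])))
        else:
            merged.append(runs[i])
    return merged


def external_sort(records, m):
    """Repeat merge passes until a single run, the sorted list, remains."""
    runs = make_runs(records, m)
    while len(runs) > 1:
        runs = merge_pass(runs)
    return runs[0] if runs else []


print(make_runs([77, 33, 44, 11, 88, 22, 66, 55], 3))
# [[33, 44, 77], [11, 22, 88], [55, 66]]
print(external_sort([77, 33, 44, 11, 88, 22, 66, 55], 3))
# [11, 22, 33, 44, 55, 66, 77, 88]
```

In a real external sort each run would be streamed from and to disk a block at a time, so that only a small part of each run is in memory during a merge.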