Computer Sciences Department 1
Reference
Book: Introduction to Algorithms, by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein
Electronic: http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-006-introduction-to-algorithms-fall-2011/
The Role of Algorithms in Computing
Lecture Contents (objectives)
• The Role of Algorithms in Computing
• The problem of sorting
• What kinds of problems are solved by algorithms?
• Hard problems
• Algorithms as a technology
• Insertion sort
• Analysis of insertion sort
• Example of insertion sort
• Best-, worst-, and average-case analysis
• Merge sort
• The divide-and-conquer approach
• Analyzing merge sort
• Example of merge sort
• Recursion tree
The Role of Algorithms in Computing
• What are algorithms?
• Why is the study of algorithms worthwhile?
• What is the role of algorithms relative to other technologies used in computers?
Algorithms
Informally, an algorithm is any well-defined computational procedure that takes some value, or set of values, as input and produces some value, or set of values, as output.
An algorithm is thus a sequence of computational steps that transform the input into the output.
Design and Analysis of Algorithms
• Analysis: predict the cost of an algorithm in terms of resources and performance
• Design: design algorithms which minimize the cost
Algorithms
Computational problem.
In general, an instance of a problem consists of the input (satisfying whatever constraints are imposed in the problem statement) needed to compute a solution to the problem.
The problem of sorting
Input: a sequence <a1, a2, …, an> of n numbers.
Output: a permutation (reordering) <a'1, a'2, …, a'n> of the input sequence such that a'1 ≤ a'2 ≤ … ≤ a'n.
Example:
Input: 8 1 4 9 3 7
Output: 1 3 4 7 8 9
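The input/output specification of the sorting problem can be turned into a small checker. This is only a sketch: the function name `is_sorted_permutation` is hypothetical, and it assumes the elements are hashable and comparable (as numbers are).

```python
from collections import Counter


def is_sorted_permutation(inp, out):
    """Return True if `out` is a valid answer to the sorting problem for `inp`."""
    # Condition 1: `out` is a permutation (reordering) of `inp`.
    same_elements = Counter(inp) == Counter(out)
    # Condition 2: the elements of `out` are in nondecreasing order.
    nondecreasing = all(out[i] <= out[i + 1] for i in range(len(out) - 1))
    return same_elements and nondecreasing


print(is_sorted_permutation([8, 1, 4, 9, 3, 7], [1, 3, 4, 7, 8, 9]))
```

A checker like this does not sort anything; it only states what a sorting algorithm must deliver for one instance.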
An algorithm is said to be correct if, for every input instance, it halts with the correct output.
What kinds of problems are solved by algorithms?
• The Human Genome Project has the goals of identifying all the 100,000 genes in human DNA, determining the sequences of the 3 billion chemical base pairs that make up human DNA, storing this information in databases, and developing tools for data analysis.
• The Internet enables people all around the world to quickly access and retrieve large amounts of information. Algorithms are needed to manage and manipulate this large volume of data.
What kinds of problems are solved by algorithms? (cont’d)
• Electronic commerce.
• Manufacturing and other commercial settings.
Example: an equation ax ≡ b (mod n), where a, b, and n are integers, and we wish to find all the integers x, modulo n, that satisfy the equation. There may be zero, one, or more than one such solution. We can simply try x = 0, 1, . . . , n − 1 in order, but Chapter 31 shows a more efficient method.
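The "simply try x = 0, 1, . . . , n − 1 in order" approach can be sketched as follows. The function name `solve_linear_congruence` is hypothetical, and the sketch assumes n ≥ 1; it is the brute-force method, not the more efficient one from Chapter 31.

```python
def solve_linear_congruence(a, b, n):
    """Return every x in 0..n-1 with a*x ≡ b (mod n), by exhaustive search."""
    # Trying all n candidates takes time proportional to n.
    return [x for x in range(n) if (a * x - b) % n == 0]


print(solve_linear_congruence(2, 4, 6))  # more than one solution
print(solve_linear_congruence(2, 3, 6))  # no solution
print(solve_linear_congruence(3, 1, 7))  # exactly one solution
```

The three calls illustrate the three situations the slide mentions: several solutions, none, and exactly one.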
Data structure
A data structure is a way to store and organize data in order to facilitate access and modifications.
Technique
Hard problems
• Efficient algorithms.
• NP-complete problems (decision problems).
• Although no efficient algorithm for an NP-complete problem has ever been found, nobody has ever proven that an efficient algorithm for one cannot exist.
• In other words, it is unknown whether or not efficient algorithms exist for NP-complete problems.
Algorithms as a technology
Suppose computers were infinitely fast and computer memory were free. Would you have any reason to study algorithms? YES.
If computers were infinitely fast, any correct method for solving a problem would do. However:
- computers may be fast, but they are not infinitely fast;
- memory may be cheap, but it is not free.
Computing time is therefore a bounded resource, and so is space in memory. These resources should be used wisely, and algorithms that are efficient in terms of time or space will help you do so.
Which computer/algorithm is faster?
Very important
Getting Started
Running time
• The running time depends on the input: an already sorted sequence is easier to sort.
• Major simplifying convention: parameterize the running time by the size of the input, since short sequences are easier to sort than long ones. T_A(n) = time of algorithm A on inputs of length n.
• Generally, seek upper bounds on the running time, to have a guarantee of performance.
Kinds of analyses
Worst-case: (usually)
• T(n) = maximum time of algorithm on any input of size n (upper bound on the running time).
Average-case: (sometimes)
• T(n) = expected time of algorithm over all inputs of size n.
Best-case:
• T(n) = minimum time of algorithm on any input of size n (fastest time to complete, with optimal inputs chosen; lower bound on the running time).
Worst-case and average-case analysis
For insertion sort, the best case is the one in which the input array is already sorted, and the worst case is the one in which the input array is reverse sorted.
The worst-case running time of an algorithm is an upper bound on the running time for any input.
How long does it take to determine where in sub-array A[1 . . j − 1] to insert element A[ j ]?
Average case
Suppose that we randomly choose n numbers and apply insertion sort. How long does it take to determine where in sub-array A[1 . . j − 1] to insert element A[ j ]?
On average, half the elements in A[1 . . j − 1] are less than A[ j ], and half the elements are greater.
On average, therefore, we check half of the sub-array A[1 . . j − 1], so t_j = j/2.
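To see what t_j = j/2 implies for the whole sort, the per-element costs can be summed over j = 2, …, n. The derivation below is a sketch that counts only the elements checked in the inner loop and ignores constant per-statement costs (an assumption made for brevity).

```latex
\sum_{j=2}^{n} t_j
  = \sum_{j=2}^{n} \frac{j}{2}
  = \frac{1}{2}\left(\frac{n(n+1)}{2} - 1\right)
  = \frac{n^2 + n - 2}{4}
  = \Theta(n^2)
```

So the average case is still a quadratic function of n, like the worst case, only with a smaller constant factor.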
Insertion sort
Insertion sort is an efficient algorithm for sorting a small number of elements.
The numbers that we wish to sort are also known as the keys.
Example input: 6 5 3 1 8 7 2 4
Start with the first element. At each step, take one element, compare it with the elements before it, and insert it into the correct position.
Insertion sort (cont’d): pseudo-code conventions
[Figure: array A with positions 1 to n (length[A] = n), showing indices i and j, the key, and the sorted portion of the array.]
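The procedure described on the insertion sort slides can be sketched in Python as follows. This is an illustrative version, not the slide's own pseudocode: it uses 0-based indices (the slides use A[1 . . n]), and the names `insertion_sort`, `key`, `i`, and `j` are chosen to mirror the figure labels.

```python
def insertion_sort(a):
    """Sort the list `a` in place into nondecreasing order and return it."""
    for j in range(1, len(a)):
        key = a[j]              # element to insert into the sorted prefix a[0..j-1]
        i = j - 1
        # Shift every element of the sorted prefix that is greater than key
        # one position to the right.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key          # drop key into the gap that was opened
    return a


print(insertion_sort([6, 5, 3, 1, 8, 7, 2, 4]))
```

The sort is in place: apart from `key` and the two indices, no extra storage is used.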
Analyzing algorithms
• Analyzing an algorithm means predicting the resources that the algorithm requires.
• Random-access machine (RAM) model: instructions are executed one after another, with no concurrent operations.
• The data types in the RAM model are integer and floating point, and there is a limit on the size of each word of data.
• Is exponentiation a constant-time instruction? In general, no. However, when k is small enough to fit in a machine word, 2^k can be computed in constant time with a “shift left” instruction.
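The "shift left" remark can be illustrated with a one-line sketch. The function name `power_of_two` is hypothetical. Note the assumption: Python integers have no fixed word size, so the constant-time claim applies to a machine with bounded words, not to Python itself.

```python
def power_of_two(k):
    """Compute 2**k by shifting the integer 1 left by k bit positions."""
    return 1 << k


print(power_of_two(10))
```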
Analysis of insertion sort
The time taken by the INSERTION-SORT procedure depends on the input. INSERTION-SORT can take different amounts of time:
- to sort two input sequences of the same size, depending on how nearly sorted they already are;
- to sort a thousand numbers or three numbers.
The running time of an algorithm on a particular input is the number of primitive operations or “steps” executed.
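One way to make "number of steps" concrete is to count key comparisons while sorting. The sketch below is an instrumented insertion sort under that assumption (comparisons stand in for all primitive operations); the name `insertion_sort_steps` is hypothetical.

```python
def insertion_sort_steps(a):
    """Sort a copy of `a`; return (sorted_list, number_of_key_comparisons)."""
    a = list(a)                      # work on a copy, leave the input untouched
    comparisons = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0:
            comparisons += 1         # one comparison of a[i] against key
            if a[i] <= key:
                break                # key belongs right after a[i]
            a[i + 1] = a[i]          # shift the larger element to the right
            i -= 1
        a[i + 1] = key
    return a, comparisons


# Two inputs of the same size, very different amounts of work.
print(insertion_sort_steps([1, 2, 3, 4, 5, 6])[1])  # already sorted
print(insertion_sort_steps([6, 5, 4, 3, 2, 1])[1])  # reverse sorted
```

For n elements, an already sorted input needs n − 1 comparisons, while a reverse-sorted input needs n(n − 1)/2.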
Analysis of insertion sort (cont’d)
Best case
Worst case
Explanation: grouping the constant factors as a, b, and c, the worst-case running time has the form an^2 + bn - c, a quadratic function of n.
Example of insertion sort
8 1 4 9 3 7
1 8 4 9 3 7
1 4 8 9 3 7
1 4 8 9 3 7
1 3 4 8 9 7
1 3 4 7 8 9
The divide-and-conquer approach
The merge sort algorithm closely follows the divide-and-conquer paradigm.
• Divide: divide the n-element sequence to be sorted into two subsequences of n/2 elements each.
• Conquer: sort the two subsequences recursively using merge sort.
• Combine: merge the two sorted subsequences to produce the sorted answer.
MERGE-SORT
MERGE-SORT A[1 . . n]
1. If n = 1, done.
2. Recursively sort A[ 1 . . ⌈n/2⌉ ] and A[ ⌈n/2⌉+1 . . n ].
3. “Merge” the 2 sorted lists.
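The three steps of MERGE-SORT can be sketched in Python as below. This is a minimal illustration under a few assumptions: it returns a new list rather than sorting in place, it uses 0-based slices, and for odd n the first half gets the extra element. The name `merge_sort` is chosen for this example.

```python
def merge_sort(a):
    """Return a new list with the elements of `a` in nondecreasing order."""
    n = len(a)
    if n <= 1:                       # step 1: a list of 0 or 1 elements is sorted
        return list(a)
    mid = (n + 1) // 2               # split point, rounded up for odd n
    left = merge_sort(a[:mid])       # step 2: recursively sort both halves
    right = merge_sort(a[mid:])
    merged = []                      # step 3: merge the two sorted lists
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])          # at most one of these two is non-empty
    merged.extend(right[j:])
    return merged


print(merge_sort([8, 1, 4, 9, 3, 7]))
```

Returning a new list keeps the sketch short; an in-place variant would merge back into the original array instead.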
Analyzing merge sort
MERGE-SORT A[1 . . n]. Cost: T(n).
1. If n = 1, done. Cost: Θ(1).
2. Recursively sort A[ 1 . . ⌈n/2⌉ ] and A[ ⌈n/2⌉+1 . . n ]. Cost: 2T(n/2).
3. “Merge” the 2 sorted lists. Cost: Θ(n).
Recurrence for merge sort:
T(n) = Θ(1) if n = 1;
T(n) = 2T(n/2) + Θ(n) if n > 1.
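Merge sort
The key operation is the procedure MERGE(A, p, q, r), where A is an array and p, q, and r are indices numbering elements of the array such that p ≤ q < r.
The procedure assumes that the sub-arrays A[p . . q] and A[q + 1 . . r] are in sorted order, and merges them into a single sorted sub-array that replaces A[p . . r].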
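A sketch of a MERGE(A, p, q, r) procedure in Python follows. It is an illustration, not the book's pseudocode: indices are 0-based and inclusive (the slides use 1-based indices), and it assumes a[p..q] and a[q+1..r] are each already sorted.

```python
def merge(a, p, q, r):
    """Merge sorted a[p..q] and a[q+1..r] (inclusive, 0-based) in place."""
    left = a[p:q + 1]                # copies of the two sorted runs
    right = a[q + 1:r + 1]
    i = j = 0
    k = p                            # next position to fill in a
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1
        k += 1
    # Exactly r + 1 - k elements remain, all in one of the two runs.
    a[k:r + 1] = left[i:] + right[j:]


data = [2, 7, 13, 20, 1, 9, 11, 12]
merge(data, 0, 3, 7)
print(data)
```

Each element is copied a constant number of times, so the work is linear in r − p + 1.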
Merging two sorted arrays
Sorted inputs: 2 7 13 20 and 1 9 11 12. At each step, compare the smallest remaining element of each array and move the smaller one to the output.
Step 1: remaining 2 7 13 20 | 1 9 11 12; output 1
Step 2: remaining 2 7 13 20 | 9 11 12; output 2
Step 3: remaining 7 13 20 | 9 11 12; output 7
Step 4: remaining 13 20 | 9 11 12; output 9
Step 5: remaining 13 20 | 11 12; output 11
Step 6: remaining 13 20 | 12; output 12
The remaining elements 13 and 20 are then copied to the output.
Time = Θ(n) to merge a total of n elements (linear time).
Merge sort - “Pseudo-code conventions”
Recursion tree
Solve T(n) = 2T(n/2) + cn, where c > 0 is constant.
Level 0: cn (level total: cn)
Level 1: cn/2 cn/2 (level total: cn)
Level 2: cn/4 cn/4 cn/4 cn/4 (level total: cn)
…
Leaves: Θ(1) each, #leaves = n (level total: Θ(n))
Height: h = lg n
Total = Θ(n lg n)
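The recursion-tree result can be checked numerically by evaluating the recurrence directly. The sketch below makes two assumptions that are not on the slides: n is a power of two, and the base case is T(1) = c. Under those assumptions the recurrence has the exact solution T(n) = cn lg n + cn, which is Θ(n lg n).

```python
def t(n, c=1):
    """Evaluate T(n) = 2*T(n/2) + c*n with T(1) = c, for n a power of two."""
    if n == 1:
        return c                     # assumed base case
    return 2 * t(n // 2, c) + c * n


for k in range(5):
    n = 2 ** k
    # Compare the recurrence with the closed form c*n*lg(n) + c*n (here c = 1).
    print(n, t(n), n * k + n)
```

The lg n levels of the tree each contribute cn, and the n leaves contribute another cn in total, which is where the two terms of the closed form come from.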
Conclusions
• Θ(n lg n) grows more slowly than Θ(n^2).
• Therefore, merge sort asymptotically beats insertion sort in the worst case.
• In practice, merge sort beats insertion sort for n > 30 or so.
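The gap between the two growth rates can be tabulated with a short sketch. The function name `growth_table` is hypothetical, and the table compares only the bare functions n lg n and n^2; it ignores the constant factors that determine the practical crossover point mentioned above.

```python
import math


def growth_table(sizes):
    """Return (n, n * lg n, n ** 2) rows for each n in `sizes`."""
    return [(n, n * math.log2(n), n ** 2) for n in sizes]


for n, n_lg_n, n_squared in growth_table([2, 16, 1024]):
    print(n, n_lg_n, n_squared)
```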