
Post on 26-Mar-2015


Input size    Time
I1            T1
I2            T2
…             …

Algorithm Analysis

We want to quantify the behavior of an algorithm.

Useful to compare efficiency of two algorithms on the same problem.

Observations:

1. A program (algorithm) consumes resources: time and space

2. Amount of resources directly related to size of input.

Say we have the table of input sizes and running times shown above.

How can we derive the function T = f(I) ?

Problems:

Too many parameters to learn this function. It depends on:

• Machine on which the program is run
• Compiler used
• Programming language
• Programmer who writes the code

Solution:

Imagine our algorithm runs on an “algorithm machine” that accepts pseudo-code.

Assumptions:

• READs and WRITEs take constant time
• Arithmetic operations take constant time
• Logical operations take constant time

Even though we made various assumptions, … it is still complicated.

Instead, we quantify our algorithm on the worst-case input.

This is called “worst-case analysis”

Also, the “average-case analysis” exists:

Requires probability distribution of set of inputs which is usually unknown.

Not studied in this course.

Input Size:

• not always easy to determine, and
• problem dependent.

Some examples:

1. Graph-theoretic problem: Number of vertices, V, and number of edges, E.

2. Matrix multiplication: Number of rows and columns of input matrices.

3. Sorting: The number of elements, n.

We do not need to find exactly what T = f(I) is; instead we can say: T(I) = O(f(I))

For example, the time complexity of mergesort of n elements: T(n) = O(n log n)

What does it mean?

For all n ≥ n0, the running time of mergesort is no worse than a constant times n log n.

Growth of Functions:

The analysis of the complexity of an algorithm is linked to a problem of growth of functions.

Let f(n) be a function of a positive integer n. The dominant term of f(n) determines the behavior of f(n) as n → ∞.

For example, let: f(n) = 2n^3 + 3n^2 + 4n + 1

The dominant term of f(n) is 2n^3.

This means that as n becomes large (n → ∞): 2n^3 dominates the behavior of f(n)

The other terms’ contributions become much less significant.
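The effect of the dominant term can be checked numerically. The sketch below is only an illustration, not part of the original slides; the helper names `f` and `dominant` are chosen here for convenience. It prints the ratio f(n) / 2n^3, which approaches 1 as n grows.

```python
def f(n: int) -> int:
    """The example polynomial f(n) = 2n^3 + 3n^2 + 4n + 1."""
    return 2 * n**3 + 3 * n**2 + 4 * n + 1


def dominant(n: int) -> int:
    """The dominant term 2n^3 of f(n)."""
    return 2 * n**3


# The lower-order terms matter less and less as n grows,
# so the ratio f(n) / 2n^3 gets closer and closer to 1.
for n in (1, 10, 100, 1000, 10_000):
    print(f"n = {n:>6}   f(n) / 2n^3 = {f(n) / dominant(n):.6f}")
```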

Example 2: f(n) = n + log n + 1

The term n dominates the behavior of f(n) as n → ∞.

Example 3: f(n) = 2^n + n^2 log n + n + 1

The term 2^n dominates the behavior of f(n) as n → ∞.

The rate of growth describes how a function behaves as n → ∞. It is determined by its dominant term.

The big-Oh notation is a short-hand way of expressing this.

The relationship: f(n) = O(n^2)

is interpreted as:

• f(n) grows no faster than n^2 as n becomes large (n → ∞).
• The dominant term of f(n) does not grow faster than n^2.
• It can grow as fast as n^2: the most accurate description.
• In this case, another notation is used, which is not discussed here.

Also, O(n^2) is true for:

We could make up as many such functions as we wish.

In general, the description should be the most accurate possible (the smallest bound).

f(n) = 3n^2 + 4n + 1

1log)( nnnnnf

1loglog)( 22/3 nnnnnf

Problem: To find the most accurate description for any function… in terms of the big-Oh notation.

A formal definition of: f(n) = O(g(n))

is that the inequality: f(n) ≤ c * g(n)

holds for all n ≥ n0,

where n0 and c are positive constants, and f(n) and g(n) are functions mapping nonnegative integers to real numbers.

Informally: “f(n) is order g(n)”

Graphically:

[Figure: f(n) and c * g(n) plotted against input size n, with n0 marked on the horizontal axis.]

Example: f(n) = 7n^2 + 0.5n + 6 and g(n) = n^2

f(n) is O(n^2), provided that c = 10 and n0 = 2

[Figure: 7n^2 + 0.5n + 6 and 10n^2 plotted against n, with n0 = 2 marked.]
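The choice c = 10 and n0 = 2 can be checked directly. The following is a small sketch, with the helper names `f` and `bound` and the test limit `N_MAX` chosen here only for illustration. Since 10n^2 − f(n) = 3n^2 − 0.5n − 6 is increasing for n ≥ 1 and already positive at n = 2, the inequality holds for every n ≥ 2; the loop just confirms this over a finite range.

```python
def f(n: int) -> float:
    """The example function f(n) = 7n^2 + 0.5n + 6."""
    return 7 * n**2 + 0.5 * n + 6


def bound(n: int, c: int = 10) -> int:
    """The bounding function c * g(n) with g(n) = n^2."""
    return c * n**2


N_MAX = 10_000  # arbitrary upper limit for this finite check

# f(n) <= 10 * n^2 should hold from n0 = 2 onwards ...
holds_from_n0 = all(f(n) <= bound(n) for n in range(2, N_MAX + 1))
# ... but not at n = 1, which is why n0 cannot be 1 when c = 10.
fails_at_1 = f(1) > bound(1)

print("holds for 2 <= n <= N_MAX:", holds_from_n0)
print("fails at n = 1:", fails_at_1)
```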

In general, if f(n) = a_0 + a_1 n + … + a_{d-1} n^{d-1} + a_d n^d

Then, f(n) is O(n^d)

We will see other functions too. For example:

O(log n), O(n log n), etc.

Having defined the big-Oh notation, now …

Seven functions that often appear in algorithm analysis:

Constant       1
Logarithmic    log n
Linear         n
N-Log-N        n log n
Quadratic      n^2
Cubic          n^3
Exponential    2^n
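To get a feel for how differently these seven functions grow, the sketch below tabulates them for a few values of n. It is only an illustration: the names `GROWTH_FUNCTIONS` and `growth_row` are made up here, and base-2 logarithms are an assumption, since the slides do not fix a base.

```python
import math

# (name, function) pairs for the seven functions listed above.
# Assumption: logarithms are taken in base 2.
GROWTH_FUNCTIONS = [
    ("Constant", lambda n: 1),
    ("Logarithmic", lambda n: math.log2(n)),
    ("Linear", lambda n: n),
    ("N-Log-N", lambda n: n * math.log2(n)),
    ("Quadratic", lambda n: n**2),
    ("Cubic", lambda n: n**3),
    ("Exponential", lambda n: 2**n),
]


def growth_row(n: int) -> list:
    """Evaluate all seven functions at n, in the order listed above."""
    return [func(n) for _, func in GROWTH_FUNCTIONS]


print("n", [name for name, _ in GROWTH_FUNCTIONS])
for n in (8, 16, 32, 64):
    print(n, [round(value) for value in growth_row(n)])
```

For small n the cubic can still exceed the exponential; the exponential overtakes it as n grows.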

Other examples

What is the smallest big-oh complexity associated to algorithms for which the running time is given by the following functions:

1. f(n) = 8 + n/2 + n^4/10^4 is O(n^4)

2. f(n) = log^4 n + n is O(n)

3. f(n) = log^2 n + n is O(n)

4. f(n) = n^2 - n - n is O(n^2)

5. f(n) = n^2/log^2 n is O(n^2/log^2 n)

6. f(n) = log^{2/3} n + log n^2 is O(log n)

Analysis of Examples

Given a list of n elements, find the minimum (or maximum). Then,

T(n) = O(n)

We look at all elements to determine the minimum (maximum).
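A minimal sketch of this linear scan follows. The function name `find_minimum` and the comparison counter are additions made here for illustration; the counter shows that a list of n elements needs n − 1 comparisons, which is O(n).

```python
def find_minimum(values):
    """Return (minimum, number of comparisons) for a non-empty list."""
    if not values:
        raise ValueError("values must not be empty")
    smallest = values[0]
    comparisons = 0
    # Every remaining element is inspected exactly once.
    for value in values[1:]:
        comparisons += 1
        if value < smallest:
            smallest = value
    return smallest, comparisons


minimum, comparisons = find_minimum([5, 3, 8, 1, 9])
print(minimum, comparisons)
```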

Given n points in the plane, find the closest pair of points. In this case,

T(n) = O(n^2)

Why?

A brute-force algorithm looks at all pairs of points, and there are n(n-1)/2 = O(n^2) of them.
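The brute-force idea can be sketched as follows. This is an illustrative version only: the name `closest_pair` and the representation of points as (x, y) tuples are assumptions made here, and Euclidean distance is assumed as the measure of closeness.

```python
import math
from itertools import combinations


def closest_pair(points):
    """Return the pair of points with the smallest Euclidean distance.

    Examines every one of the n(n-1)/2 pairs, hence O(n^2) time.
    """
    if len(points) < 2:
        raise ValueError("need at least two points")
    return min(
        combinations(points, 2),
        key=lambda pair: math.dist(pair[0], pair[1]),
    )


print(closest_pair([(0, 0), (5, 5), (1, 1), (10, 0)]))
```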

Given n points in a plane, determine if any three points are contained in a straight line.

In this case, T(n) = O(n^3)

Why?

A brute-force algorithm searches all triplets of points, and there are O(n^3) of them.
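One way to sketch the brute-force search over triplets is shown below. The function name `has_collinear_triple` is made up here, and integer coordinates are assumed so that the collinearity test (a zero cross product) is exact.

```python
from itertools import combinations


def has_collinear_triple(points):
    """Return True if any three of the given points lie on one straight line.

    Assumes integer (x, y) coordinates so the test below is exact.
    Examines every triplet of points, hence O(n^3) time.
    """
    for (x1, y1), (x2, y2), (x3, y3) in combinations(points, 3):
        # Three points are collinear exactly when the cross product of
        # the vectors (p2 - p1) and (p3 - p1) is zero.
        if (x2 - x1) * (y3 - y1) == (y2 - y1) * (x3 - x1):
            return True
    return False


print(has_collinear_triple([(0, 0), (1, 1), (2, 2), (0, 5)]))
print(has_collinear_triple([(0, 0), (1, 0), (0, 1), (2, 3)]))
```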

Maximum Contiguous Subsequence (MCS) Problem

Given a sequence S of n integers: S = a_1, a_2, a_3, …, a_{n-1}, a_n

a contiguous subsequence is: a_i, a_{i+1}, …, a_{j-1}, a_j,

where 1 ≤ i ≤ j ≤ n.

The problem: Determine a contiguous subsequence such that:

a_i + a_{i+1} + … + a_{j-1} + a_j ≥ 0 is maximal.

Some examples:

S = -1, -2, -3, -4, -5, -6

MCS is empty, it has value 0 by definition.

For the sequence: -1, 2, 3, -3, 2,

an MCS is 2, 3

whose value is 2 + 3 = 5.

Note: There may be more than one MCS.

For example: -1, 1, -1, 1, -1, 1

has six MCSs, each with value 1

An O(n^2) Algorithm for MCS

Search problems have an associated search space.

To figure out: How large the search space is.

For the MCS problem: How many subsequences need be examined?

For example: -1, 2, 3, -3, 2

Then, the subsequences that begin with –1 are:
-1
-1, 2
-1, 2, 3
-1, 2, 3, -3
-1, 2, 3, -3, 2

The ones beginning with 2 are:
2
2, 3
2, 3, -3
2, 3, -3, 2

Those beginning with 3 are:
3
3, -3
3, -3, 2

The ones beginning with –3:
-3
-3, 2

and beginning with 2, just one:
2

Then, including the empty sequence, a total of 16 examined.

In general, given a_1, a_2, a_3, …, a_{n-1}, a_n

We have n sequences beginning with a_1:
a_1
a_1, a_2
a_1, a_2, a_3
…
a_1, a_2, a_3, …, a_{n-1}, a_n

n-1 beginning with a_2:
a_2
a_2, a_3
…
a_2, a_3, …, a_{n-1}, a_n

and so on. Then, two subsequences beginning with a_{n-1}:
a_{n-1}
a_{n-1}, a_n

and, finally, one beginning with a_n:
a_n

Total of possible subsequences, counting the empty one:

1 + 2 + … + (n-1) + n + 1 = n(n+1)/2 + 1

Analysis:

The dominant term is n^2/2, hence the search space is O(n^2).
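The count n(n+1)/2 + 1 can be confirmed by enumerating the search space. The generator below is a sketch written for this purpose; its name `contiguous_subsequences` is not from the slides.

```python
def contiguous_subsequences(seq):
    """Yield the empty subsequence, then every contiguous subsequence of seq."""
    yield []
    n = len(seq)
    for i in range(n):          # start position
        for j in range(i, n):   # end position, j >= i
            yield seq[i:j + 1]


example = [-1, 2, 3, -3, 2]
count = sum(1 for _ in contiguous_subsequences(example))
n = len(example)
print(count, n * (n + 1) // 2 + 1)
```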

A “brute-force” algorithm follows…

Algorithm MCSBruteForce

Input: A sequence a_1, a_2, a_3, …, a_{n-1}, a_n.

Output: value, start and end of MCS.

maxSum ← 0
for i = 1 to n do
    Set sum ← 0
    for j = i to n do
        sum ← sum + a_j
        if (sum > maxSum)
            maxSum ← sum
            start ← i
            end ← j

Print start, end, maxSum and STOP.
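The pseudocode above translates almost line by line into Python. The sketch below keeps the 1-based start and end positions of the slides; returning start = end = 0 for the empty MCS is a convention chosen here, not something the slides specify, and the name `mcs_brute_force` is likewise an assumption.

```python
def mcs_brute_force(a):
    """Return (max_sum, start, end) of a maximum contiguous subsequence.

    start and end are 1-based and inclusive. For the empty MCS the
    result is (0, 0, 0); that convention is an assumption of this sketch.
    """
    max_sum, start, end = 0, 0, 0
    n = len(a)
    for i in range(n):              # every possible start position
        running = 0
        for j in range(i, n):       # every end position j >= i
            running += a[j]
            if running > max_sum:
                max_sum, start, end = running, i + 1, j + 1
    return max_sum, start, end


print(mcs_brute_force([-1, 2, 3, -3, 2]))
```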

Improved MCS Algorithm

Think of avoiding looking at all the subsequences.

Introduce the following notion. Given: a_i, a_{i+1}, …, a_k, a_{k+1}, …, a_j    (1)

the subsequence: a_i, a_{i+1}, …, a_k

is a prefix of (1), where i ≤ k ≤ j.

The prefix sum is: a_i + a_{i+1} + … + a_k

Observation: In an MCS no prefix sum can be negative.

In the previous example, -1, 2, 3, -3, 2,

we exclude:
-1
-1, 2
-1, 2, 3
-1, 2, 3, -3
-1, 2, 3, -3, 2

and
-3
-3, 2

as being possible candidates.

In general:

If ever sum < 0, skip over index positions from i+1, …, j.

Also, if sum ≥ 0 always for a starting position i, none of positions i+1, …, n is a candidate start position, since a_i and all following prefix sums are non-negative.

So the only time we need to consider a new starting position is when the sum becomes negative and all the index positions from i+1, …, j can be skipped.

The improved MCS algorithm inspects each a_i just once.

The algorithm follows….

Algorithm MCSImproved

Set i ← 1; Set start ← end ← 1
Set maxSum ← sum ← 0
for j = 1 to n do
    sum ← sum + a_j
    if (sum > maxSum)
        maxSum ← sum
        start ← i
        end ← j
    if (sum < 0)
        i ← j + 1
        sum ← 0

Print start, end, maxSum and STOP.
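A Python sketch of the improved algorithm follows. It mirrors the pseudocode with one deliberate difference that is an assumption of this sketch: start and end begin at 0 rather than 1, so that an empty MCS is reported as (0, 0, 0). The name `mcs_improved` is also chosen here.

```python
def mcs_improved(a):
    """Return (max_sum, start, end) using a single pass over the sequence.

    start and end are 1-based and inclusive; (0, 0, 0) denotes the
    empty MCS (a convention assumed in this sketch).
    """
    max_sum = running = 0
    start = end = 0
    i = 1  # current candidate start position (1-based)
    for j, value in enumerate(a, start=1):
        running += value
        if running > max_sum:
            max_sum, start, end = running, i, j
        if running < 0:
            # A negative prefix sum cannot begin an MCS:
            # restart just after position j.
            i = j + 1
            running = 0
    return max_sum, start, end


print(mcs_improved([-1, 2, 3, -3, 2]))
```

Each element is added to the running sum exactly once, which is where the O(n) bound comes from.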

Analysis of the Algorithms

Algorithm MCSBruteForce:

The outer loop is executed n times.
For each i, the inner loop is executed n – i + 1 times.
Thus, the total number of times the inner loop is executed:

T(n) = Σ_{i=1}^{n} (n – i + 1) = n(n+1)/2 = O(n^2)

Algorithm MCSImproved:

It has a single for loop, which visits all n elements. Hence,

T(n) = O(n)

What is the big-Oh complexity of the following algorithm? What is the value of Sum after the execution of this algorithm with the values: n = 5, m = 10, and p = 6?

Sum = 2;
for (i = 0; i < n; i++) {
    for (j = 0; j <= m; j++) {
        for (k = 1; k < p; k++) {
            Sum++;
        }
    }
}
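A hand calculation for this exercise can be checked by simulating the loops. The sketch below is one way to do that; the names `triple_loop_sum` and `closed_form` are made up here, and the closed form assumes n ≥ 0, m ≥ 0 and p ≥ 1.

```python
def triple_loop_sum(n: int, m: int, p: int) -> int:
    """Simulate the three nested loops and return the final value of Sum."""
    total = 2
    for i in range(n):               # i = 0 .. n-1
        for j in range(m + 1):       # j = 0 .. m (the bound is inclusive)
            for k in range(1, p):    # k = 1 .. p-1
                total += 1
    return total


def closed_form(n: int, m: int, p: int) -> int:
    """Count the increments directly: n * (m + 1) * (p - 1), plus the initial 2."""
    return 2 + n * (m + 1) * (p - 1)


print(triple_loop_sum(5, 10, 6) == closed_form(5, 10, 6))
```

The product n * (m + 1) * (p - 1) also shows why the running time grows as O(n · m · p).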

What is the big-Oh complexity of the following algorithm? What is the value of Sum after the execution of this algorithm with the values: n = 8?

Sum = -5;
for (i = 0; i < n; i++) {
    for (j = 0; 2*j < i; j++) {
        Sum++;
    }
}
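As with the previous exercise, a simulation is a convenient way to check a hand calculation. In the sketch below (the name `half_loop_sum` is an assumption), the inner loop runs about i/2 times for each i, so the total work is roughly n^2/4, which is O(n^2).

```python
def half_loop_sum(n: int) -> int:
    """Simulate the two nested loops and return the final value of Sum."""
    total = -5
    for i in range(n):        # i = 0 .. n-1
        j = 0
        while 2 * j < i:      # inner loop runs while 2*j < i
            total += 1
            j += 1
    return total


# The inner loop body runs (i + 1) // 2 times for each i.
expected = -5 + sum((i + 1) // 2 for i in range(8))
print(half_loop_sum(8) == expected)
```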