
Chapter 2: Algorithm Analysis

Mark Allen Weiss: Data Structures and Algorithm Analysis in Java

Big-Oh and Other Notations in Algorithm Analysis

• Classifying Functions by Their Asymptotic Growth

• Theta, Little oh, Little omega

• Big Oh, Big Omega

• Rules to manipulate Big-Oh expressions

• Typical Growth Rates

Classifying Functions by Their Asymptotic Growth

Asymptotic growth: the rate of growth of a function.

Given a particular differentiable function f(n), all other differentiable functions fall into three classes:

• growing at the same rate

• growing faster

• growing slower

f(n) and g(n) have the same rate of growth if

lim( f(n) / g(n) ) = c, 0 < c < ∞, as n → ∞

Notation: f(n) = Θ( g(n) ) pronounced "theta"

Theta

f(n) grows slower than g(n) (or g(n) grows faster than f(n)) if

lim( f(n) / g(n) ) = 0, n → ∞

Notation: f(n) = o( g(n) ) pronounced "little oh"

Little oh

f(n) grows faster than g(n)

(or g(n) grows slower than f(n)) if

lim( f(n) / g(n) ) = ∞, as n → ∞

Notation: f(n) = ω( g(n) ) pronounced "little omega"

Little omega

if g(n) = o( f(n) )

then f(n) = ω( g(n) )

Examples: Compare n and n²

lim( n / n² ) = 0 as n → ∞, so n = o(n²)

lim( n² / n ) = ∞ as n → ∞, so n² = ω(n)

Little omega and Little oh

Theta: Relation of Equivalence

R: "having the same rate of growth":relation of equivalence, gives a partition over the set of all differentiable functions - classes of equivalence.

Functions in one and the same class are equivalent with respect to their growth.

Algorithms with Same Complexity

Two algorithms have the same complexity if the functions representing the number of operations have the same rate of growth.

Among all functions with the same rate of growth, we choose the simplest one to represent the complexity.

Examples

Compare n and (n+1)/2

lim( n / ((n+1)/2) ) = 2,

same rate of growth

(n+1)/2 = Θ(n)

- rate of growth of a linear function

Examples

Compare n² and n² + 6n

lim( n² / (n² + 6n) ) = 1

same rate of growth.

n² + 6n = Θ(n²) - rate of growth of a quadratic function

Examples

Compare log n and log n²

lim( log n / log n² ) = 1/2

same rate of growth.

log n² = Θ(log n) - logarithmic rate of growth

Θ(n³): n³

5n³ + 4n

105n³ + 4n² + 6n

Θ(n²): n²

5n² + 4n + 6

n² + 5

Θ(log n): log n

log n²

log (n + n³)

Examples

Comparing Functions

• same rate of growth: g(n) = Θ(f(n))

• different rate of growth:

either g(n) = o(f(n)): g(n) grows slower than f(n), and hence f(n) = ω(g(n)),

or g(n) = ω(f(n)): g(n) grows faster than f(n), and hence f(n) = o(g(n))

f(n) = O(g(n))

if f(n) grows at the same rate as or slower than g(n), i.e.

f(n) = Θ(g(n)) or f(n) = o(g(n))

The Big-Oh Notation

n + 5 = Θ(n) = O(n) = O(n²) = O(n³) = O(n⁵)

The closest estimate: n + 5 = Θ(n)

The general practice is to use the Big-Oh notation: n + 5 = O(n)

Example

The inverse of Big-Oh is Ω:

if g(n) = O( f(n) ), then f(n) = Ω( g(n) )

f(n) grows faster than or at the same rate as g(n): f(n) = Ω( g(n) ). For example, n² = Ω(n), since n = O(n²).

The Big-Omega Notation

Rules to manipulate Big-Oh expressions

Rule 1:

a. If T1(N) = O( f(N) ) and T2(N) = O( g(N) ),

then T1(N) + T2(N) = max( O( f(N) ), O( g(N) ) ).

(For example, O(N²) + O(N) = O(N²).)

Rules to manipulate Big-Oh expressions

b. If T1(N) = O( f(N) ) and T2(N) = O( g(N) ),

then

T1(N) * T2(N) = O( f(N) * g(N) )

Rules to manipulate Big-Oh expressions

Rule 2: If T(N) is a polynomial of degree k, then

T(N) = Θ( Nᵏ )

Rule 3: logᵏ N = O(N) for any constant k.

Examples

n² + n = O(n²)   (we disregard any lower-order terms)

n log(n) = O(n log(n))

n² + n log(n) = O(n²)

Function   Name
c          constant, we write O(1)
log N      logarithmic
log² N     log-squared
N          linear
N log N
N²         quadratic
N³         cubic
2ᴺ         exponential
N!         factorial

Typical Growth Rates

Exercise True or False

N² = O(N²)    2N = O(N²)    N = O(N²)

N² = O(N)     2N = O(N)     N = O(N)

Exercise True or False

N² = Θ(N²)    2N = Θ(N²)    N = Θ(N²)

N² = Θ(N)     2N = Θ(N)     N = Θ(N)


The work done by an algorithm, i.e. its complexity, is determined by the number of basic operations necessary to solve the problem.

Running Time Calculations

The Task


Determine how the number of operations depends on the size of the input:

N - size of input
F(N) - number of operations

Basic operations in an algorithm


Problem: Find x in an array

Operation: Comparison of x with an entry in the array

Size of input: The number of elements in the array

Basic operations ….


Problem: Multiplying two matrices with real entries

Operation: Multiplication of two real numbers

Size of input: The dimensions of the matrices
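The slides show no code for this example; here is a minimal Java sketch of my own, assuming square n×n matrices, that counts the basic operation named above (multiplication of two reals):

// Multiply two n x n matrices while counting real-number multiplications.
// Illustration only: multCount ends at n*n*n, so F(n) = n³, i.e. O(n³).
static long multCount = 0;

static double[][] multiply( double[][] a, double[][] b )
{
    int n = a.length;
    double[][] c = new double[ n ][ n ];
    for ( int i = 0; i < n; i++ )
        for ( int j = 0; j < n; j++ )
            for ( int k = 0; k < n; k++ )
            {
                c[ i ][ j ] += a[ i ][ k ] * b[ k ][ j ];
                multCount++;                 // one multiplication of two reals
            }
    return c;
}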

Basic operations ….


Problem: Sort an array of numbers

Operation: Comparison of two array entries plus moving elements in the array

Size of input: The number of elements in the array

Counting the number of operations


A. For loops: O(n)

The running time of a for loop is at most the running time of the statements inside the loop times the number of iterations.

for loops


sum = 0;
for( i = 0; i < n; i++ )
    sum = sum + i;

The running time is O(n)

Counting the number of operations


B. Nested loops

The total running time is the running time of the inside statements times the product of the sizes of all the loops.

Nested loops


sum = 0;
for( i = 0; i < n; i++ )
    for( j = 0; j < n; j++ )
        sum++;

The running time is O(n²)

Counting the number of operations


C. Consecutive program fragments

Total running time: the maximum of the running times of the individual fragments

Consecutive program fragments


sum = 0;                          // O(n)
for( i = 0; i < n; i++ )
    sum = sum + i;

sum = 0;                          // O(n²)
for( i = 0; i < n; i++ )
    for( j = 0; j < 2*n; j++ )
        sum++;

The maximum is O(n²)

Counting the number of operations


D. If statement

if ( C )
    S1;
else
    S2;

The running time is the maximum of the running times of S1 and S2 (see the sketch below).
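As a small illustration (a hypothetical fragment of mine, not from the slides), the branch below costs max( O(1), O(n) ) = O(n):

// Hypothetical fragment illustrating rule D.
int[] a = { 3, 1, 4, 1, 5 };
int n = a.length, sum = 0;
if ( a[0] > 0 )
    sum = a[0];                      // S1: O(1)
else
    for( int i = 0; i < n; i++ )     // S2: O(n)
        sum += a[i];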

EXAMPLES: what is the number of operations?


sum = 0;
for( i = 0; i < n; i++ )
    for( j = 0; j < n*n; j++ )
        sum++;

EXAMPLES: what is the number of operations?


sum = 0;
for( i = 0; i < n; i++ )
    for( j = 0; j < i; j++ )
        sum++;

EXAMPLES: what is the number of operations?


for( j = 0; j < n*n; j++ )
    compute_val( j );

The complexity of compute_val(x) is given to be O(n*log n)

Search in an unordered array of elements


for ( i = 0; i < n; i++ )
    if ( a[ i ] == x ) return 1;
return -1;

Search in a table n x m


for ( i = 0; i < n; i++ )
    for ( j = 0; j < m; j++ )
        if ( a[ i ][ j ] == x ) return 1;
return -1;


Max Subsequence Problem

• Given a sequence of integers A1, A2, …, An, find the maximum possible value of a subsequence Ai, …, Aj.
• Numbers can be negative.
• You want a contiguous chunk with largest sum.
• Example: -2, 11, -4, 13, -5, -2. The answer is 20 (subsequence A2 through A4).
• We will discuss 4 different algorithms, with time complexities O(n³), O(n²), O(n log n), and O(n).
• With n = 10⁶, algorithm 1 may take > 10 years; algorithm 4 will take a fraction of a second!


Algorithm 1 for Max Subsequence Sum

• Given A1, …, An, find the maximum value of Ai + Ai+1 + ··· + Aj (0 if the max value is negative).

int maxSum = 0;

for( int i = 0; i < a.size( ); i++ )
    for( int j = i; j < a.size( ); j++ )
    {
        int thisSum = 0;
        for( int k = i; k <= j; k++ )
            thisSum += a[ k ];

        if( thisSum > maxSum )
            maxSum = thisSum;
    }
return maxSum;

Time complexity: O(n³)

Counting operations line by line: each simple statement inside the loops is O(1), and the innermost loop costs O( j - i + 1 ). Summing over the two outer loops:

Σ_{i=0}^{n-1} Σ_{j=i}^{n-1} O( j - i + 1 ) = O(n³)


Algorithm 2

• Idea: Given the sum from i to j-1, we can compute the sum from i to j in constant time.
• This eliminates one nested loop, and reduces the running time to O(n²).

int maxSum = 0;

for( int i = 0; i < a.size( ); i++ )
{
    int thisSum = 0;
    for( int j = i; j < a.size( ); j++ )
    {
        thisSum += a[ j ];
        if( thisSum > maxSum )
            maxSum = thisSum;
    }
}
return maxSum;


Algorithm 3

• This algorithm uses the divide-and-conquer paradigm.
• Suppose we split the input sequence at the midpoint.
• The max subsequence is entirely in the left half, entirely in the right half, or it straddles the midpoint.
• Example:
  left half | right half
  4 -3 5 -2 | -1 2 6 -2
• Max in left is 6 (A1 through A3); max in right is 8 (A6 through A7). But the straddling max is 11 (A1 through A7).


Algorithm 3 (cont.)

• Example:
  left half | right half
  4 -3 5 -2 | -1 2 6 -2
• Max subsequences in each half are found by recursion.
• How do we find the straddling max subsequence?
• Key Observation:
  – The left half of the straddling sequence is the max subsequence ending with -2.
  – The right half is the max subsequence beginning with -1.
• A linear scan lets us compute these in O(n) time. (A code sketch of the whole procedure follows below.)
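The slides describe algorithm 3 without reproducing its code; here is a minimal recursive sketch of the idea, assuming a non-empty int array (the names are mine, not the book's):

// Divide-and-conquer max subsequence sum: O(n log n).
// Sketch only; assumes left <= right. Call as maxSubSum( a, 0, a.length - 1 ).
static int maxSubSum( int[] a, int left, int right )
{
    if ( left == right )                            // base case: one element
        return Math.max( a[ left ], 0 );            // 0 if the max is negative

    int mid = ( left + right ) / 2;
    int maxLeft  = maxSubSum( a, left, mid );       // entirely in the left half
    int maxRight = maxSubSum( a, mid + 1, right );  // entirely in the right half

    // Straddling case: best sum ending at mid plus best sum starting at mid+1,
    // each found by a linear scan as described above.
    int sum = 0, leftBest = 0;
    for ( int i = mid; i >= left; i-- )
    {
        sum += a[ i ];
        leftBest = Math.max( leftBest, sum );
    }
    sum = 0;
    int rightBest = 0;
    for ( int i = mid + 1; i <= right; i++ )
    {
        sum += a[ i ];
        rightBest = Math.max( rightBest, sum );
    }

    return Math.max( Math.max( maxLeft, maxRight ), leftBest + rightBest );
}

On the example 4 -3 5 -2 | -1 2 6 -2 the two scans yield leftBest = 4 and rightBest = 7, giving the straddling max 11.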


Algorithm 3: Analysis

• The divide and conquer is best analyzed through a recurrence:

  T(1) = 1
  T(n) = 2T(n/2) + O(n)

• This recurrence solves to T(n) = O(n log n).
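A quick expansion shows why, assuming n = 2ᵏ and writing the O(n) term as cn:

T(n) = 2T(n/2) + cn
     = 4T(n/4) + 2cn
     = 8T(n/8) + 3cn
     …
     = 2ᵏ T(1) + k·cn
     = n + cn log₂ n = O(n log n)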


Algorithm 4

• Time complexity: clearly O(n).
• But why does it work? I.e., we need a proof of correctness.

Example input: 2, 3, -2, 1, -5, 4, 1, -3, 4, -1, 2

int maxSum = 0, thisSum = 0;

for( int j = 0; j < a.size( ); j++ )
{
    thisSum += a[ j ];

    if ( thisSum > maxSum )
        maxSum = thisSum;
    else if ( thisSum < 0 )
        thisSum = 0;
}
return maxSum;


Proof of Correctness

• The max subsequence cannot start or end at a negative Ai.
• More generally, the max subsequence cannot have a prefix with a negative sum. Ex: -2 11 -4 13 -5 -2
• Thus, if we ever find that Ai through Aj sums to < 0, then we can advance i to j+1.
  – Proof: Suppose j is the first index after i when the sum becomes < 0.
  – The max subsequence cannot start at any p between i and j, because Ai through Ap-1 is positive, so starting at i would have been even better.


Algorithm 4

int maxSum = 0, thisSum = 0;

for( int j = 0; j < a.size( ); j++ )
{
    thisSum += a[ j ];

    if ( thisSum > maxSum )
        maxSum = thisSum;
    else if ( thisSum < 0 )
        thisSum = 0;
}
return maxSum;

• The algorithm resets thisSum whenever the running prefix sum drops below 0. Otherwise it extends the current sum and updates maxSum, all in one pass.


Why Efficient Algorithms Matter

• Suppose N = 10⁶.

• A PC can read/process N records in 1 sec.

• But if some algorithm does N*N computations, that is 10¹² operations: 10⁶ seconds ≈ 11 days!

• 100-city Traveling Salesman Problem:
  – A supercomputer checking 100 billion tours/sec still requires 10¹⁰⁰ years!

• Fast factoring algorithms can break encryption schemes. Algorithms research determines what code length is safe. (> 100 digits)


How to Measure Algorithm Performance

• What metric should be used to judge algorithms?
  – Length of the program (lines of code)
  – Ease of programming (bugs, maintenance)
  – Memory required
  – Running time

• Running time is the dominant standard.
  – Quantifiable and easy to compare
  – Often the critical bottleneck

Logarithms in Running Time

• Binary search
• Euclid's algorithm
• Exponentials
• Rules to count operations


Divide-and-conquer algorithms


Successively reducing the problem size by a factor of two requires O(log N) operations.

Why log N?


A complete binary tree with N leaves has log N levels.

Each level in a divide-and-conquer algorithm corresponds to an operation.

Hence the number of operations is O(log N).

Example: 8 leaves, 3 levels


Binary Search


Solution 1:

Scan all elements from left to right, each time comparing with X.

O(N) operations.

Binary Search


Solution 2: O(log N)

Find the middle element Amid in the list and compare it with X:
  If they are equal, stop.
  If X < Amid, consider the left part.
  If X > Amid, consider the right part.
Repeat until the list is reduced to one element. (A code sketch follows below.)
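The slides give no code for solution 2; here is a minimal iterative sketch, assuming a is sorted in increasing order and that we return the index of X (or -1 if absent):

// Binary search: O(log N) comparisons.
// Sketch only; not the book's listing.
static int binarySearch( int[] a, int x )
{
    int low = 0, high = a.length - 1;
    while ( low <= high )
    {
        int mid = ( low + high ) / 2;
        if ( a[ mid ] == x )
            return mid;              // equal: stop
        else if ( x < a[ mid ] )
            high = mid - 1;          // consider the left part
        else
            low = mid + 1;           // consider the right part
    }
    return -1;                       // not found
}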

Euclid's algorithm


Finding the greatest common divisor (GCD)

GCD of M and N, M > N, = GCD of N and M % N

GCD and recursion


Recursion:
  If M % N = 0, return N
  Else return GCD( N, M % N )

The answer is the last nonzero remainder. (A direct transcription into code follows below.)
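Transcribed into Java (a sketch of mine; the book's non-recursive version appears below):

// Recursive Euclid's algorithm; assumes n != 0.
static long gcd( long m, long n )
{
    if ( m % n == 0 )
        return n;                // the last nonzero remainder
    else
        return gcd( n, m % n );  // GCD(M, N) = GCD(N, M % N)
}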


M    N    rem
24   15   9
15   9    6
9    6    3
6    3    0
3    0


Euclid's Algorithm (non-recursive implementation)

long gcd ( long m, long n )
{
    long rem;
    while ( n != 0 )
    {
        rem = m % n;
        m = n;
        n = rem;
    }
    return m;
}

Why O(log N)

M % N <= M / 2
(if N <= M/2, then M % N < N <= M/2; if N > M/2, then M % N = M - N < M/2)

After the 1st iteration N appears as the first argument, and the next remainder is at most N/2.

After the 2nd iteration the remainder appears as the first argument, having been reduced by a factor of two.

Hence O(log N).

Computing X^N


X^N = X * (X²)^(N/2) ,  N is odd (N/2 is integer division)

X^N = (X²)^(N/2) ,  N is even


long pow ( long x, int n )
{
    if ( n == 0 ) return 1;

    if ( is_Even( n ) )
        return pow( x * x, n/2 );
    else
        return x * pow( x * x, n/2 );
}

Why O(log N)

If N is odd: two multiplications.

So the multiplications number at most 2 log N: O(log N).

Another recursion for X^N


Another recursive definition that reduces the power just by 1:

X^N = X * X^(N-1)

Here the number of multiplications is N-1, i.e. O(N), and the algorithm is less efficient than the divide-and-conquer algorithm. (A sketch follows below.)
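For contrast, the O(N) recursion as code (a sketch of mine, mirroring pow above):

// Naive recursive power: N - 1 multiplications, O(N).
static long slowPow( long x, int n )
{
    if ( n == 0 ) return 1;
    return x * slowPow( x, n - 1 );   // X^N = X * X^(N-1)
}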

How to count operations

• single statements (not function calls): constant time, O(1)

• sequential fragments: the maximum of the operations of each fragment


How to count operations

• single loop running up to N, with single statements in its body: O(N)

• single loop running up to N, with the number of operations in the body O(f(N)): O( N * f(N) )


How to count operations

• two nested loops each running up to N, with single statements: O(N²)

• divide-and-conquer algorithms with input size N: O(log N), or O(N log N) if each step requires additional processing of N elements


Example: What is the probability that two numbers are relatively prime?


tot = 0; rel = 0;

for ( i = 0; i <= n; i++ )
    for ( j = i+1; j <= n; j++ )
    {
        tot++;
        if ( gcd( i, j ) == 1 ) rel++;
    }

return (double) rel / tot;   // cast so the division is not integer division

Running time = ?