1
CSE 326 Data Structures: Complexity
Lecture 2: Wednesday, Jan 8, 2003
2
Overview of the Quarter
• Complexity and analysis of algorithms
• List-like data structures
• Search trees
• Priority queues
• Hash tables
• Sorting
• Disjoint sets
• Graph algorithms
• Algorithm design
• Advanced topics
(Midterm around here)
3
Complexity and analysis of algorithms
• Weiss Chapters 1 and 2
• Additional material
  – Cormen, Leiserson, Rivest – on reserve at Eng. library
  – Graphical analysis
  – Amortized analysis
4
Program Analysis
• Correctness
  – Testing
  – Proofs of correctness
• Efficiency
  – Asymptotic complexity: how running time scales as a function of the size of the input
5
Proving Programs Correct
• Often takes the form of an inductive proof
• Example: summing an array

int sum(int v[], int n)
{
  if (n==0) return 0;
  else return v[n-1]+sum(v,n-1);
}

What are the parts of an inductive proof?
6
Inductive Proof of Correctness

int sum(int v[], int n)
{
  if (n==0) return 0;
  else return v[n-1]+sum(v,n-1);
}

Theorem: sum(v,n) correctly returns the sum of the first n elements of array v, for any n.
Basis Step: The program is correct for n=0: it returns 0.
Inductive Hypothesis (n=k): Assume sum(v,k) returns the sum of the first k elements of v.
Inductive Step (n=k+1): sum(v,k+1) returns v[k]+sum(v,k), which is the sum of the first k+1 elements of v.
7
Inductive Proof of Correctness
Binary search in a sorted array v[0] ≤ v[1] ≤ ... ≤ v[n-1]. Given x, find m s.t. x = v[m].

int find(int x, int v[], int n)
{
  int left = -1;
  int right = n;
  while (left+1 < right) {
    int m = (left+right) / 2;
    if (x == v[m]) return m;
    if (x < v[m]) right = m;
    else left = m;
  }
  return -1; /* not found */
}

Exercise 1: prove that it is correct.
Exercise 2: compute m, k s.t. v[m-1] < x = v[m] = v[m+1] = ... = v[k] < v[k+1].
8
Proof by Contradiction
• Assume the negation of the goal, then show this leads to a contradiction
• Example: there is no program that solves the "halting problem" – deciding whether any other program runs forever or not
Alan Turing, 1937
9
Program NonConformist (Program P)
  If ( HALT(P) = "never halts" ) Then
    Halt
  Else
    Do While (1 > 0)
      Print "Hello!"
    End While
  End If
End Program

• Does NonConformist(NonConformist) halt?
• Yes? That means HALT(NonConformist) = "never halts"
• No? That means HALT(NonConformist) = "halts"
Contradiction!
10
Defining Efficiency
• Asymptotic complexity: how running time scales as a function of the size of the input
• Two problems:
  – What is the "input size"?
  – How do we express the running time? (the Big-O notation)
11
Input Size
• Usually: length (in characters) of the input
• Sometimes: value of the input (if it is a number)
• Which inputs?
  – Worst case: tells us how good an algorithm works (a guarantee over all inputs)
  – Best case: tells us how bad an algorithm works (a limit even on favorable inputs)
  – Average case: useful in practice, but there are technical problems here (next)
12
Input Size
Average Case Analysis
• Assume inputs are randomly distributed according to some "realistic" distribution
• Compute the expected running time:

    E[T(n)] = Σ_{x ∈ Inputs(n)} Prob(x) · RunTime(x)

• Drawbacks
  – Often hard to define realistic random distributions
  – Usually hard to perform the math
13
Input Size
• Recall the function: find(x, v, n)
• Input size: n (the length of the array)
• T(n) = "running time for size n"
• But T(n) needs clarification:
  – Worst case T(n): it runs in at most T(n) time for any x, v
  – Best case T(n): it takes at least T(n) time for any x, v
  – Average case T(n): average time over all v and x
14
Input Size
Amortized Analysis
• Instead of a single input, consider a sequence of inputs:
  – This is interesting when the running time on some input depends on the result of processing previous inputs
• Worst case analysis over the sequence of inputs
• Determine average running time on this sequence
• Will illustrate in the next lecture
15
Definition of Order Notation
• Upper bound: T(n) = O(f(n))          "Big-O"
  There exist constants c and n' such that
      T(n) ≤ c·f(n) for all n ≥ n'
• Lower bound: T(n) = Ω(g(n))          "Omega"
  There exist constants c and n' such that
      T(n) ≥ c·g(n) for all n ≥ n'
• Tight bound: T(n) = Θ(f(n))          "Theta"
  When both hold:
      T(n) = O(f(n))
      T(n) = Ω(f(n))
Other notations: o(f), ω(f) – see Cormen et al.
16
Example: Upper Bound

Claim: n^2 + 100n = O(n^2)
Proof: Must find c, n' such that for all n ≥ n',
    n^2 + 100n ≤ c·n^2
Let's try setting c = 2. Then
    n^2 + 100n ≤ 2n^2
    100n ≤ n^2
    100 ≤ n
So we can set n' = 100 and reverse the steps above.
17
Using a Different Pair of Constants

Claim: n^2 + 100n = O(n^2)
Proof: Must find c, n' such that for all n ≥ n',
    n^2 + 100n ≤ c·n^2
Let's try setting c = 101. Then
    n^2 + 100n ≤ 101n^2
    100n ≤ 100n^2
    1 ≤ n            (divide both sides by 100n)
So we can set n' = 1 and reverse the steps above.
18
Example: Lower Bound

Claim: n^2 + 100n = Ω(n^2)
Proof: Must find c, n' such that for all n ≥ n',
    n^2 + 100n ≥ c·n^2
Let's try setting c = 1. Then
    n^2 + 100n ≥ n^2
    100n ≥ 0
So we can set n' = 0 and reverse the steps above.
Thus we can also conclude n^2 + 100n = Θ(n^2).
19
Conventions of Order Notation

Order notation is not symmetric: write 2n^2 + 4n = O(n^2), but never O(n^2) = 2n^2 + 4n.

The expression O(f(n)) = O(g(n)) is equivalent to f(n) = O(g(n)).
The expression Ω(f(n)) = Ω(g(n)) is equivalent to g(n) = Ω(f(n)).

The right-hand side is a "cruder" version of the left:
    n^2 = O(n^2) = O(n^3)
    18 n^2 log(2n) = Ω(n^2 log(n)) = Ω(n^2)
20
Which Function Dominates?

    f(n) =                 g(n) =
    n^3 + 2n^2             100n^2 + 1000
    n^0.1                  log n
    n + 100n^0.1           2n + 10 log n
    5n^5                   n!
    n^(-15) · 2^(n/100)    1000n^15
    8^(2 log n)            3n^7 + 7n

Question to class: is f = O(g)? Is g = O(f)?
21
Race I: f(n) = n^3 + 2n^2 vs. g(n) = 100n^2 + 1000
22
Race II: n^0.1 vs. log n
23
Race III: n + 100n^0.1 vs. 2n + 10 log n
24
Race IV: 5n^5 vs. n!
25
Race V: n^(-15) · 2^(n/100) vs. 1000n^15
26
Race VI: 8^(2 log(n)) vs. 3n^7 + 7n
27
• Eliminate low-order terms
• Eliminate constant coefficients

    16 n^3 log_8(10 n^2) + 100 n^2
    ⇒ 16 n^3 log_8(10 n^2)              (drop low-order term)
    ⇒ n^3 log_8(10 n^2)                 (drop constant coefficient)
    = n^3 (log_8(10) + log_8(n^2))
    ⇒ n^3 log_8(n^2)                    (drop low-order term)
    = 2 n^3 log_8(n)
    ⇒ n^3 log_8(n)                      (drop constant coefficient)
    = n^3 log_8(2) · log(n)
    ⇒ n^3 log(n)                        (drop constant coefficient)

So 16 n^3 log_8(10 n^2) + 100 n^2 = O(n^3 log(n)).
28
Common Names
(from slowest to fastest growth)
constant:         O(1)
logarithmic:      O(log n)
linear:           O(n)
log-linear:       O(n log n)
quadratic:        O(n^2)
exponential:      O(c^n)  (c is a constant > 1)
hyperexponential: O(2^2^...^2)  (a tower of n exponentials)

Other names:
superlinear: O(n^c)  (c is a constant > 1)
polynomial:  O(n^c)  (c is a constant > 0)
29
Sums and Recurrences
Often the function f(n) is not explicit, but is expressed as:
• A sum, or
• A recurrence
We need to obtain an analytical formula first.
30
Sums

f(n) = 1^2 + 2^2 + ... + n^2 = Σ_{i=1..n} i^2 = n(n+1)(2n+1)/6 = O(n^3)

f(n) = 1 + 2 + ... + n = Σ_{i=1..n} i = n(n+1)/2 = O(n^2)

f(n) = 1 + 3 + 5 + ... + (2n-1) = Σ_{i=1..n} (2i-1) = O(n^2)

f(n) = 1^3 + 2^3 + ... + n^3 = O(?)

f(n) = 1^4 + 5^4 + 9^4 + ... + (4n-3)^4 = Σ_{i=1..n} (4i-3)^4 = O(??)
31
More Sums

f(n) = 1 + 3 + 3^2 + ... + 3^n = Σ_{i=0..n} 3^i = (3^(n+1) - 1)/2 = O(3^n)

Sometimes sums are easiest computed with integrals:

f(n) = 1 + 1/2 + 1/3 + ... + 1/n = Σ_{i=1..n} 1/i ≈ ∫_1^n dx/x:
    ln(n+1) ≤ f(n) ≤ 1 + ln(n), hence f(n) = O(ln(n))

f(n) = 1 + 1/2^2 + 1/3^2 + ... + 1/n^2 ≤ 1 + ∫_1^n dx/x^2 = 1 + (1 - 1/n) = 2 - 1/n = O(1)
32
Recurrences
• f(n) = 2f(n-1) + 1, f(0) = T
• Telescoping:
    f(n)+1   = 2(f(n-1)+1)
    f(n-1)+1 = 2(f(n-2)+1)     (multiply by 2)
    f(n-2)+1 = 2(f(n-3)+1)     (multiply by 2^2)
    . . .
    f(1)+1   = 2(f(0)+1)       (multiply by 2^(n-1))
  Multiplying out the chain:
    f(n)+1 = 2^n (f(0)+1) = 2^n (T+1)
    f(n) = 2^n (T+1) - 1
33
Recurrences
• Fibonacci: f(n) = f(n-1) + f(n-2), f(0) = f(1) = 1
  Try f(n) = A·c^n. What is c?
      A·c^n = A·c^(n-1) + A·c^(n-2)
      c^2 - c - 1 = 0
      c_{1,2} = (1 ± √(1+4))/2 = (1 ± √5)/2

  f(n) = A·((1+√5)/2)^n + B·((1-√5)/2)^n = O(((1+√5)/2)^n)

Constants A, B can be determined from f(0), f(1) – not interesting for us for the Big-O notation.
34
Recurrences
• f(n) = f(n/2) + 1, f(1) = T
• Telescoping:
    f(n)   = f(n/2) + 1
    f(n/2) = f(n/4) + 1
    . . .
    f(2)   = f(1) + 1 = T + 1
  f(n) = T + log n = O(log n)