Post on 02-Jan-2016
description
CS322Week 12 - Monday
Last time
What did we talk about last time? Trees Graphing functions
Questions?
Logical warmup
Two vegetarians and two cannibals are on one bank of a river
They have a boat that can hold at most two people
Come up with a sequence of boat loads that will convey all four people safely to the other side of the river
The cannibals on any given bank cannot outnumber the vegetarians…. or else!
Spanning TreesWhich we somehow seemed to skip earlier…
Turning graphs into trees
Consider the following graph that shows all the routes an airline has between citiesMinneapolis
Milwaukee
Chicago
St. Louis
Detroit
Cincinnati
Louisville
Nashville
Turning graphs into trees What if we want to remove routes (to save
money)? How can we keep all cities connected?
MinneapolisMilwaukee
Chicago
St. Louis
Detroit
Cincinnati
Louisville
Nashville
The best tree?
Does this tree have the smallest number of routes?
Why?Minneapolis
Milwaukee
Chicago
St. Louis
Detroit
Cincinnati
Louisville
Nashville
Spanning trees
A spanning tree for a graph G is a subgraph of G that contains every vertex of G and is a tree
Some properties: Every connected graph has a spanning
tree▪ Why?
Any two spanning trees for a graph have the same number of edges▪ Why?
Weighted graphs
In computer science, we often talk about weighted graphs when tackling practical applications
A weighted graph is a graph for which each edge has a real number weight
The sum of the weights of all the edges is the total weight of the graph
Notation: If e is an edge in graph G, then w(e) is the weight of e and w(G) is the total weight of G
A minimum spanning tree (MST) is a spanning tree of lowest possible total weight
Weighted graphs example
Here is the graph from before, with labeled weights
MinneapolisMilwaukee
Chicago
St. Louis
Detroit
Cincinnati
Louisville
Nashville
355
695
262
74
348
269
242
151
230306
83
Finding a minimum spanning tree
Kruskal's algorithm gives an easy to follow technique for finding an MST on a weighted, connected graph
Informally, go through the edges, adding the smallest one, unless it forms a circuit
Algorithm: Input: Graph G with n vertices Create a subgraph T with all the vertices of G (but no edges) Let E be the set of all edges in G Set m = 0 While m < n – 1▪ Find an edge e in E of least weight▪ Delete e from E▪ If adding e to T doesn't make a circuit▪ Add e to T ▪ Set m = m + 1
Output: T
Minimum spanning tree example
Run Kruskal's algorithm on the city graph:
MinneapolisMilwaukee
Chicago
St. Louis
Detroit
Cincinnati
Louisville
Nashville
355
695
262
74
348
269
242
151
230306
83
Minimum spanning tree output
Output:
MinneapolisMilwaukee
Chicago
St. Louis
Detroit
Cincinnati
Louisville
Nashville
355
262
74
242
151
230
83
Prim's algorithm
Prim's algorithm gives another way to find an MST Informally, start at a vertex and add the next closest
node not already in the MST Algorithm:
Input: Graph G with n vertices Let subgraph T contain a single vertex v from G Let V be the set of all vertices in G except for v For i from 1 to n – 1▪ Find an edge e in G such that: ▪ e connects T to one of the vertices in V▪ e has the lowest weight of all such edges
▪ Let w be the endpoint of e in V▪ Add e and w to T▪ Delete w from V
Output: T
Prim fights Kruskal
Apply Kruskal's algorithm to the graph below
Now, apply Prim's algorithm to the graph below
Is there any other MST we could make? 2
3
3
4 11
1
Back to Functions
Discontinuous functions
Recall the definition of the floor of x: x = the largest integer that is less than
or equal to x Graph f(x) = x Defining functions on integers
instead of real values affects their graphs a great deal
Graph p1(x) = x, x R Graph f(n) = n, n N
Multiples of functions
There is a strong visual (and of course mathematical) correlation a function that is the multiple of another function
Examples: g(x) = x + 2 2g(x) = 2x + 4
Given f graphed below, sketch 2f
- 6 - 5 - 4 - 3 - 2 - 1 1 2 3 4 5 6
21
-1-2
Absolute value
Consider the absolute value function f(x) = |x|
Left of the origin it is constantly decreasing
Right of the origin it is constantly increasing
- 6 - 5 - 4 - 3 - 2 - 1
1 2 3 4 5 6
21
-1-2
Increasing and decreasing functions
We say that f is decreasing on the set S iff for all real numbers x1 and x2 in S, if x1 < x2, then f(x1) > f(x2)
We say that f is increasing on the set S iff for all real numbers x1 and x2 in S, if x1 < x2, then f(x1) < f(x2)
We say that f is an increasing (or decreasing) function iff f is increasing (or decreasing) on its entire domain
Clearly, a positive multiple of an increasing function is increasing
Virtually all running time functions are increasing functions
Asymptotic Notation (Big Oh)Student Lecture
Asymptotic Notations
Growth of functions
Mathematicians worry about the growth of various functions
They usually express such things in terms of limits, maybe with derivatives
We are focused primarily on functions that bound running time spent and memory consumed
We just need a rough guide We want to know the order of the
growth
Definitions
Let f and g be real-valued functions defined on the same set of nonnegative real numbers
f is of order at least g, written f(x) is (g(x)), iff there is a positive A R and a nonnegative a R such that A|g(x)| ≤ |f(x)| for all x > a
f is of order at most g, written f(x) is O(g(x)), iff there is a positive B R and a nonnegative b R such that |f(x)| ≤ B|g(x)| for all x > b
f is of order g, written f(x) is (g(x)), iff there are positive A, B R and a nonnegative k R such that A|g(x)| ≤ |f(x)| ≤ B|g(x)| for all x > k
Using the notation
Express the following statements using appropriate notation: 10|x6| ≤ |17x6 – 45x3 + 2x + 8| ≤ 30|x6|, for x > 2
Justify the following:
is (x)
0,1
)92(1515
xx
xxx
6,151
)92(15
xxx
xx
1)92(15
xxx
Properties of -, O-, and - notation
Let f and g be real-valued functions defined on the same set of nonnegative real numbers
1. f(x) is (g(x)) and f(x) is O(g(x)) iff f(x) is (g(x))2. f(x) is (g(x)) iff g(x) is O(f(x))3. f(x) is (f(x)), f(x) is O(f(x)), and f(x) is (f(x))4. If f(x) is O(g(x)) and g(x) is O(h(x)) then f(x) is
O(h(x))5. If f(x) is O(g(x)) and c is a positive real, then cf(x)
is O(g(x))6. If f(x) is O(h(x)) and g(x) is O(k(x)) then f(x) + g(x)
is O(G(x)) where G(x) = max(|h(x)|,|k(x)|) for all x7. If f(x) is O(h(x)) and g(x) is O(k(x)) then f(x)g(x) is
O(h(x)k(x))
Orders of functions
If 1 < x, then x < x2
x2 < x3
… So, for r, s R, where r < s and x >
1, xr < xs
By extension, xr is O(xs)
Proving bounds
Prove a bound for g(x) = (1/4)(x – 1)(x + 1) for x R
Prove that x2 is not O(x) Hint: Proof by contradiction
Polynomials
Let f(x) be a polynomial with degree n f(x) = anxn + an-1xn-1 + an-2xn-2 … + a1x + a0
By extension from the previous results, if an is a positive real, then f(x) is O(xs) for all integers s n f(x) is (xr) for all integers r ≤ n f(x) is (xn)
Furthermore, let g(x) be a polynomial with degree m g(x) = bmxm + bm-1xm-1 + bm-2xm-2 … + b1x + b0
If an and bm are positive reals, then f(x)/g(x) is O(xc) for real numbers c > n - m f(x)/g(x) is not O(xc) for all integers c > n - m f(x)/g(x) is (xn - m)
Algorithm Efficiency
Extending notation to algorithms
We can easily extend our -, O-, and - notations to analyzing the running time of algorithms
Imagine that an algorithm A is composed of some number of elementary operations (usually arithmetic, storing variables, etc.)
We can imagine that the running time is tied purely to the number of operations
This is, of course, a lie Not all operations take the same amount of time Even the same operation takes different amounts of
time depending on caching, disk access, etc.
Running time
First, assume that the number of operations performed by A on input size n is dependent only on n, not the values of the data If f(n) is (g(n)), we say that A is (g(n)) or that A is of
order g(n) If the number of operations depends not only on n
but also on the values of the data Let b(n) be the minimum number of operations where
b(n) is (g(n)), then we say that in the best case, A is (g(n)) or that A has a best case order of g(n)
Let w(n) be the maximum number of operations where w(n) is (g(n)), then we say that in the worst case, A is (g(n)) or that A has a worst case order of g(n)
Time
Approximate Time for f(n) Operations Assuming One Operation Per Nanosecond
f(n) n = 10 n = 1,000 n = 100,000 n = 10,000,000
log2 n3.3 x 10-
9 s 10-8 s 1.7 x 10-8 s 2.3 x 10-8 s
n 10-8 s 10-6 s 0.0001 s 0.01 s
n log2 n
3.3 x 10-
8 s 10-5 s 0.0017 s 0.23 s
n2 10-7 s 0.001 s 10 s 27.8 hours
n3 10-6 s 1 s 11.6 days 31,668 years
2n 10-6 s 3.4 x 10284 years
3.1 x 1030086 years
2.9 x 103010283 years
Computing running time
With a single for (or other) loop, we simply count the number of operations that must be performed:int p = 0;int x = 2;for( int i = 2; i <= n; i++ )p = (p + i)*x;
Counting multiplies and adds, (n – 1) iterations times 2 operations = 2n – 2
As a polynomial, 2n – 2 is (n)
Nested loops
When loops do not depend on each other, we can simply multiply their iterations (and asymptotic bounds)int p = 0;for( int i = 2; i <= n; i++ )
for( int j = 3; j <= n; j++ )p++;
Clearly (n – 1)(n -2) is (n2)
Trickier nested loops
When loops depend on each other, we have to do more analysisint s = 0;for( int i = 1; i <= n; i++ )
for( int j = 1; j <= i; j++ )s = s + j*(i – j + 1);
What's the running time here? Arithmetic sequence saves the day
(for the millionth time)
Iterations with floor
When loops depend on floor, what happens to the running time?int a = 0;for( int i = n/2; i <= n; i++ )
a = n - i; Floor is used implicitly here, because
we are using integer division What's the running time? Hint:
Consider n as odd or as even separately
Sequential search
Consider a basic sequential search algorithm:int search( int[] array, int n, int value)
{for( int i = 0; i < n; i++ )
if( array[i] == value )return i;
return -1;} What's its best case running time? What's its worst case running time? What's its average case running time?
Insertion sort algorithm
Insertion sort is a common introductory sort
It is suboptimal, but it is one of the fastest ways to sort a small list (10 elements or fewer)
The idea is to sort initial segments of an array, insert new elements in the right place as they are found
So, for each new item, keep moving it up until the element above it is too small (or we hit the top)
Insertion sort in code
public static void sort( int[] array, int n){for( int i = 1; i < n; i++ ){
int next = array[i];int j = i - 1;while( j != 0 && array[j] > next ){array[j+1] = array[j];j--;}array[j] = next;
}}
Best case analysis of insertion sort
What is the best case analysis of insertion sort?
Hint: Imagine the array is already sorted
Worst case analysis of insertion sort
What is the worst case analysis of insertion sort?
Hint: Imagine the array is sorted in reverse order
Average case analysis of insertion sort
What is the average case analysis of insertion sort?
Much harder than the previous two! Let's look at it recursively Let Ek be the average number of comparisons
needed to sort k elements Ek can be computed as the sum of the average
number of comparisons needed to sort k – 1 elements plus the average number of comparisons (x) needed to insert the kth element in the right place Ek = Ek-1 + x
Finding x
We can employ the idea of expected value from probability
There are k possible locations for the element to go
We assume that any of these k locations is equally likely
For each turn of the loop, there are 2 comparisons to do
There could be 1, 2, 3, … up to k turns of the loop
Thus, weighting each possible number of iterations evenly gives us 1
2)1(2
21
1
kkk
kj
kx
k
j
Finishing the analysis
Having found x, our recurrence relation is: Ek = Ek-1 + k + 1 Sorting one element takes no time, so E1 =
0 Solve this recurrence relation! Well, if you really banged away at it, you
might find: En = (1/2)(n2 + 3n – 4)
By the polynomial rules, this is (n2) and so the average case running time is the same as the worst case
Upcoming
Next time…
Finish algorithmic efficiency Exponential and logarithmic
functions
Reminders
Keep reading Chapter 11 Keep working on Assignment 9
Due Friday before midnight Exam 3 is next Monday
Review on Friday