1 Growth of Functions CS 202 Epp, section ??? Aaron Bloomfield.

29
1 Growth of Functions CS 202 Epp, section ??? Aaron Bloomfield

Transcript of 1 Growth of Functions CS 202 Epp, section ??? Aaron Bloomfield.

Page 1: 1 Growth of Functions CS 202 Epp, section ??? Aaron Bloomfield.

1

Growth of Functions

CS 202

Epp, section ???

Aaron Bloomfield

Page 2: 1 Growth of Functions CS 202 Epp, section ??? Aaron Bloomfield.

2

How does one measure algorithms

• We can time how long it takes a computer– What if the computer is doing other things?– And what happens if you get a faster computer?

• A 3 Ghz Windows machine chip will run an algorithm at a different speed than a 3 Ghz Macintosh

• So that idea didn’t work out well…

Page 3: 1 Growth of Functions CS 202 Epp, section ??? Aaron Bloomfield.

3

How does one measure algorithms

• We can measure how many machine instructions an algorithm takes– Different CPUs will require different amount of

machine instructions for the same algorithm

• So that idea didn’t work out well…

Page 4: 1 Growth of Functions CS 202 Epp, section ??? Aaron Bloomfield.

4

How does one measure algorithms

• We can loosely define a “step” as a single computer operation– A comparison, an assignment, etc.– Regardless of how many machine instructions it

translates into

• This allows us to put algorithms into broad categories of efficient-ness– An efficient algorithm on a slow computer will always

beat an inefficient algorithm on a fast computer

Page 5: 1 Growth of Functions CS 202 Epp, section ??? Aaron Bloomfield.

5

Bubble sort running time

• The bubble step take (n2-n)/2 “steps”

• Let’s say the bubble sort takes the following number of steps on specific CPUs:– Intel Pentium IV CPU: 58*(n2-n)/2– Motorola CPU: 84.4*(n2-2n)/2– Intel Pentium V CPU: 44*(n2-n)/2

• Notice that each has an n2 term– As n increases, the other terms will drop out

Page 6: 1 Growth of Functions CS 202 Epp, section ??? Aaron Bloomfield.

6

Bubble sort running time

• This leaves us with:– Intel Pentium IV CPU: 29n2

– Motorola CPU: 42.2n2

– Intel Pentium V CPU: 22n2

• As processors change, the constants will always change– The exponent on n will not

• Thus, we can’t care about the constants

Page 7: 1 Growth of Functions CS 202 Epp, section ??? Aaron Bloomfield.

7

An aside: inequalities

• If you have a inequality you need to show:x < y

• You can replace the lesser side with something greater:x+1 < y

• If you can still show this to be true, then the original inequality is true

• Consider showing that 15 < 20– You can replace 15 with 16, and then show that 16 < 20.

Because 15 < 16, and 16 < 20, then 15 < 20

Page 8: 1 Growth of Functions CS 202 Epp, section ??? Aaron Bloomfield.

8

An aside: inequalities

• If you have a inequality you need to show:x < y

• You can replace the greater side with something lesser:x < y-1

• If you can still show this to be true, then the original inequality is true

• Consider showing that 15 < 20– You can replace 20 with 19, and then show that 15 < 19.

Because 15 < 19, and 19 < 20, then 15 < 20

Page 9: 1 Growth of Functions CS 202 Epp, section ??? Aaron Bloomfield.

9

An aside: inequalities

• What if you do such a replacement and can’t show anything?– Then you can’t say anything about the original

inequality

• Consider showing that 15 < 20– You can replace 20 with 10– But you can’t show that 15 < 10– So you can’t say anything one way or the other about

the original inequality

Page 10: 1 Growth of Functions CS 202 Epp, section ??? Aaron Bloomfield.

10

Review of last time

• Searches– Linear: n steps– Binary: log2 n steps– Binary search is about as fast as you can get

• Sorts– Bubble: n2 steps– Insertion: n2 steps– There are other, more efficient, sorting techniques

• In principle, the fastest are heap sort, quick sort, and merge sort• These each take take n * log2 n steps• In practice, quick sort is the fastest (although this is not

guaranteed!), followed by merge sort

Page 11: 1 Growth of Functions CS 202 Epp, section ??? Aaron Bloomfield.

11

Big-Oh notation

• Let b(x) be the bubble sort algorithm• We say b(x) is O(n2)

– This is read as “b(x) is big-oh n2”– This means that the input size increases, the running time of the

bubble sort will increase proportional to the square of the input size

• In other words, by some constant times n2

• Let l(x) be the linear (or sequential) search algorithm• We say l(x) is O(n)

– Meaning the running time of the linear search increases directly proportional to the input size

Page 12: 1 Growth of Functions CS 202 Epp, section ??? Aaron Bloomfield.

12

Big-Oh notation

• Consider: b(x) is O(n2)– That means that b(x)’s running time is less than (or

equal to) some constant times n2

• Consider: l(x) is O(n)– That means that l(x)’s running time is less than (or

equal to) some constant times n

Page 13: 1 Growth of Functions CS 202 Epp, section ??? Aaron Bloomfield.

13

Big-Oh proofs• Show that f(x) = x2 + 2x + 1 is O(x2)

– In other words, show that x2 + 2x + 1 ≤ c*x2

• Where c is some constant• For input size greater than some x

• We know that 2x2 ≥ 2x whenever x ≥ 1• And we know that x2 ≥ 1 whenever x ≥ 1• So we replace 2x+1 with 3x2

– We then end up with x2 + 3x2 = 4x2

– This yields 4x2 ≤ c*x2

• This, for input sizes 1 or greater, when the constant is 4 or greater, f(x) is O(x2)

• We could have chosen values for c and x that were different

Page 14: 1 Growth of Functions CS 202 Epp, section ??? Aaron Bloomfield.

14

Big-Oh proofs

Page 15: 1 Growth of Functions CS 202 Epp, section ??? Aaron Bloomfield.

15

Sample Big-Oh problems

• Show that f(x) = x2 + 1000 is O(x2)– In other words, show that x2 + 1000 ≤ c*x2

• We know that x2 > 1000 whenever x > 31– Thus, we replace 1000 with x2

– This yields 2x2 ≤ c*x2

• Thus, f(x) is O(x2) for all x > 31 when c ≥ 2

Page 16: 1 Growth of Functions CS 202 Epp, section ??? Aaron Bloomfield.

16

Sample Big-Oh problems

• Show that f(x) = 3x+7 is O(x)– In other words, show that 3x+7 ≤ c*x

• We know that x > 7 whenever x > 7– Duh!– So we replace 7 with x– This yields 4x ≤ c*x

• Thus, f(x) is O(x) for all x > 7 when c ≥ 4

Page 17: 1 Growth of Functions CS 202 Epp, section ??? Aaron Bloomfield.

17

A variant of the last question

• Show that f(x) = 3x+7 is O(x2)– In other words, show that 3x+7 ≤ c*x2

• We know that x > 7 whenever x > 7– Duh!– So we replace 7 with x– This yields 4x < c*x2

– This will also be true for x > 7 when c ≥ 1

• Thus, f(x) is O(x2) for all x > 7 when c ≥ 1

Page 18: 1 Growth of Functions CS 202 Epp, section ??? Aaron Bloomfield.

18

What that means

• If a function is O(x)– Then it is also O(x2)– And it is also O(x3)

• Meaning a O(x) function will grow at a slower or equal to the rate x, x2, x3, etc.

Page 19: 1 Growth of Functions CS 202 Epp, section ??? Aaron Bloomfield.

19

Function growth rates• For input size n = 1000

• O(1) 1• O(log n) ≈10• O(n) 103

• O(n log n) ≈104

• O(n2) 106

• O(n3) 109

• O(n4) 1012

• O(nc) 103*c c is a consant• 2n ≈10301

• n! ≈102568

• nn 103000

Many interesting problems fall into these categories

Page 20: 1 Growth of Functions CS 202 Epp, section ??? Aaron Bloomfield.

20

Function growth rates

Logarithmic scale!

Page 21: 1 Growth of Functions CS 202 Epp, section ??? Aaron Bloomfield.

21

Integer factorization

• Factoring a composite number into it’s component primes is O(2n)– Where n is the number of bits in the number

• This, if we choose 2048 bit numbers (as in RSA keys), it takes 22048 steps– That’s about 10617 steps!

Page 22: 1 Growth of Functions CS 202 Epp, section ??? Aaron Bloomfield.

22

Formal Big-Oh definition

• Let f and g be functions. We say that f(x) is O(g(x)) if there are constants c and k such that

|f(x)| ≤ C |g(x)|

whenever x > k

Page 23: 1 Growth of Functions CS 202 Epp, section ??? Aaron Bloomfield.

23

Formal Big-Oh definition

Page 24: 1 Growth of Functions CS 202 Epp, section ??? Aaron Bloomfield.

24

A note on Big-Oh notation

• Assume that a function f(x) is O(g(x))

• It is sometimes written as f(x) = O(g(x))– However, this is not a proper equality!– It’s really saying that |f(x)| ≤ C |g(x)|

• In this class, we will write it as f(x) is O(g(x))

Page 25: 1 Growth of Functions CS 202 Epp, section ??? Aaron Bloomfield.

25

NP Completeness

• A full discussion of NP completeness takes 3 hours for somebody who already has a CS degree– We are going to do the 15 minute version of it today– More will (probably) follow on Friday

• Any term of the form nc, where c is a constant, is a polynomial– Thus, any function that is O(nc) is a polynomial-time function– 2n, n!, nn are not polynomial functions

Page 26: 1 Growth of Functions CS 202 Epp, section ??? Aaron Bloomfield.

26

Satisfiability

• Consider a Boolean expression of the form:• (x1 x2 x3) (x2 x3 x4) (x1 x4 x5)

– This is a conjunction of disjunctions

• Is such an equation satisfiable?– In other words, can you assign truth values to all the

xi’s such that the equation is true?– The above problem is easy (only 3 clauses of 3

variables) – set x1, x2, and x4 to true• There are other possibilities: set x1, x2, and x5 to true, etc.

– But consider an expression with 1000 variables and thousands of clauses

Page 27: 1 Growth of Functions CS 202 Epp, section ??? Aaron Bloomfield.

27

Satisfiability

• If given a solution, it is easy to check if such a solution works– Plug in the values – this can be done quickly, even by hand

• However, there is no known efficient way to find such a solution– The only definitive way to do so is to try all possible values for

the n Boolean variables– That means this is O(2n)!– Thus it is not a polynomial time function

• NP stands for “Not Polynomial”

• Cook’s theorem (1971) states that SAT is NP-complete– There still may be an efficient way to solve it, though!

Page 28: 1 Growth of Functions CS 202 Epp, section ??? Aaron Bloomfield.

28

NP Completeness

• There are hundreds of NP complete problems– It has been shown that if you can solve one of them

efficiently, then you can solve them all– Example: the traveling salesman problem

• Given a number of cities and the costs of traveling from any city to any other city, what is the cheapest round-trip route that visits each city once and then returns to the starting city?

• Not all algorithms that are O(2n) are NP complete– In particular, integer factorization (also O(2n)) is not

thought to be NP complete

Page 29: 1 Growth of Functions CS 202 Epp, section ??? Aaron Bloomfield.

29

NP Completeness

• It is “widely believed” that there is no efficient solution to NP complete problems– In other words, everybody has that belief

• If you could solve an NP complete problem in polynomial time, you would be showing that P = NP– And you’d get a million dollar prize (and lots of fame!)

• If this were possible, it would be like proving that Newton’s or Einstein’s laws of physics were wrong

• In summary: – NP complete problems are very difficult to solve, but easy to

check the solutions of– It is believed that there is no efficient way to solve them