CS421 – Course Information: Website, Syllabus, Schedule, The Book

What is an algorithm?

Informally, an algorithm is any well-defined computational procedure that takes some value(s) as input and produces some value(s) as output.

Goals for an algorithm:

1. Correct
2. Terminates
3. Efficient

What is CS421 about?

We will engage in the theoretical study of the design and analysis of computer algorithms.

– Analysis: predict the cost of an algorithm in terms of performance and resources used

– Design: design algorithms which minimize such costs

Machine Model Assumptions..

Random Access Machine (RAM) Model:

1. Any memory cell can be accessed in 1 step.
2. Memory is not limited (unbounded).
3. Arbitrarily large integers can be stored in each memory cell.
4. Operations are executed sequentially.
5. Operators include:
   • primitive arithmetic (+, -, *, /, modulo, etc.)
   • logic (if..then, and, or, etc.)
   • comparators (<, >, =, etc.), and
   • function calls.
6. Each operation has a unit cost of 1.

An Example.. Sorting

Input: A sequence of n numbers, a1, a2, …, an

Output: A permutation (reordering) a'1, a'2, …, a'n of the input sequence such that a'1 ≤ a'2 ≤ … ≤ a'n.

Example:

Input: <13, 7, 42, 3, 6, 41>
Output: <3, 6, 7, 13, 41, 42>

Insertion Sort

Pseudocode:

INSERTION-SORT (A)
1 for j ← 2 to length[A]
2     do key ← A[j]
3        i ← j – 1
4        while i > 0 and A[i] > key
5            do A[i+1] ← A[i]
6               i ← i – 1
7        A[i+1] ← key
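For concreteness, here is a direct Python translation of the pseudocode (a sketch; Python lists are 0-indexed, so the outer loop starts at index 1 rather than 2):

def insertion_sort(A):
    """Sort the list A in place; a direct translation of INSERTION-SORT."""
    for j in range(1, len(A)):            # pseudocode line 1 (j = 2 .. length[A])
        key = A[j]                        # line 2
        i = j - 1                         # line 3
        while i >= 0 and A[i] > key:      # line 4 (i > 0 becomes i >= 0 with 0-indexing)
            A[i + 1] = A[i]               # line 5: shift the larger element one slot right
            i = i - 1                     # line 6
        A[i + 1] = key                    # line 7: place the key in its final position
    return A

print(insertion_sort([13, 7, 42, 3, 6, 41]))   # [3, 6, 7, 13, 41, 42], as in the example above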

Example of insertion sort

The first line is the input; each subsequent line shows the array after the next key (2, 4, 9, 3, 6) has been inserted into the sorted prefix:

8 2 4 9 3 6
2 8 4 9 3 6
2 4 8 9 3 6
2 4 8 9 3 6
2 3 4 8 9 6
2 3 4 6 8 9  (done)

How “Good” is Insertion Sort?

Recall our goals for an algorithm:

1. Correct

2. Terminates

3. Efficient

Correctness

• Informally: At each step the “current card”, the key A[j], is inserted into an already sorted subarray A[1.. j-1].

• More formally: The loop invariant (a property that holds at the start of every iteration) is that at the start of each iteration of the for-loop, the subarray A[1.. j-1] consists of the elements originally in A[1.. j-1], but in sorted order.

Correctness

These properties must hold for a loop invariant:

• Initialization: It is true prior to the first iteration of the loop.

• Maintenance: If it is true before an iteration of the loop, it remains true before the next iteration.

• Termination: When the loop terminates, the invariant yields a useful property that helps show the algorithm is correct.

Correctness

In the case of insertion sort:

• Initialization: When j = 2, the subarray A[1..j-1] = A[1..1] consists of a single element, which is by definition sorted.

• Maintenance: Informally, the body of the for-loop works by moving A[j-1], A[j-2], etc. one position to the right until the correct position for A[j] is found. A[1..j] is then sorted, so when j is incremented, A[1..j-1] is again sorted.

• Termination: The loop terminates when j = n+1. From the Maintenance property we know A[1..j-1], that is A[1..n], is then sorted. Hence the algorithm is correct.
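As a sanity check, which is not part of the slides, the invariant can be asserted directly in a Python version of the algorithm; the assertions are purely a debugging aid:

def insertion_sort_checked(A):
    """Insertion sort with the loop invariant asserted at each step (debugging aid only)."""
    for j in range(1, len(A)):
        # Invariant: A[0..j-1] (the 0-indexed A[1..j-1]) is sorted at the start of each iteration.
        assert all(A[k] <= A[k + 1] for k in range(j - 1)), "loop invariant violated"
        key = A[j]
        i = j - 1
        while i >= 0 and A[i] > key:
            A[i + 1] = A[i]
            i = i - 1
        A[i + 1] = key
    # Termination: the loop exits with j = n, so the whole array A[0..n-1] is sorted.
    assert all(A[k] <= A[k + 1] for k in range(len(A) - 1))
    return A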

Termination

• As we will see later in the semester, determining whether an arbitrary program terminates (halts) is undecidable for Turing machines.

• In the specific case of Insertion-Sort, it is easy to see that the two loops each iterate at most n times, so the algorithm does not run indefinitely: it terminates.

Efficiency

• Running time is a measure of how many steps or primitive operations were performed.

• We have already stated that in the RAM machine model each operation has cost 1, but let's assume instead that each operation has cost ci, where i is the line of the algorithm on which it appears.

Kinds of Analysis

• Worst-case: T(n) = maximum time of algorithm on any input of size n.

• Average-case: T(n) = expected time of algorithm over all inputs of size n.
  – Requires an assumption about the statistical distribution of inputs.

• Best-case: T(n) = minimum time of algorithm on any input of size n.
  – Problematic, because a generally slow algorithm may work fast on some input.

The sketch after this list illustrates all three cases on insertion sort.
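A minimal sketch (the wrapper function and the input sizes are choices made here, not from the slides) that counts how many element shifts insertion sort performs on a sorted, a reverse-sorted, and a random input of the same size:

import random

def shift_count(A):
    """Number of element shifts (line 5 executions) insertion sort makes on a copy of A."""
    A = list(A)
    shifts = 0
    for j in range(1, len(A)):
        key = A[j]
        i = j - 1
        while i >= 0 and A[i] > key:
            A[i + 1] = A[i]
            shifts += 1
            i = i - 1
        A[i + 1] = key
    return shifts

n = 1000
print("best    (sorted):  ", shift_count(range(n)))                    # 0
print("worst   (reversed):", shift_count(range(n, 0, -1)))             # n(n-1)/2 = 499500
print("average (random):  ", shift_count(random.sample(range(n), n)))  # about n(n-1)/4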

Running Time

Let's analyze our algorithm once more:

INSERTION-SORT (A)                       cost    times
1 for j ← 2 to length[A]                 c1      n
2     do key ← A[j]                      c2      n-1
3        i ← j – 1                       c3      n-1
4        while i > 0 and A[i] > key      c4      Σj=2..n tj
5            do A[i+1] ← A[i]            c5      Σj=2..n (tj – 1)
6               i ← i – 1                c6      Σj=2..n (tj – 1)
7        A[i+1] ← key                    c7      n-1

Here tj is the number of times the while-loop test on line 4 is executed for that value of j.
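The tj values can be recorded directly; the instrumentation below is added here for illustration and is not part of the pseudocode:

def insertion_sort_tj(A):
    """Sort A in place and return [t_2, ..., t_n]: the t_j values from the cost table."""
    t = []
    for j in range(1, len(A)):           # 0-indexed j corresponds to j = 2..n above
        key = A[j]
        i = j - 1
        shifts = 0
        while i >= 0 and A[i] > key:
            A[i + 1] = A[i]
            i = i - 1
            shifts += 1
        t.append(shifts + 1)             # the final, failing test is also counted in t_j
        A[i + 1] = key
    return t

print(insertion_sort_tj([13, 7, 42, 3, 6, 41]))   # [2, 1, 4, 4, 2]; a sorted input gives all 1s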

Best Case

If the input is already sorted, the while-loop test fails immediately, so tj = 1 for every j, and

T(n) = (c1+c2+c3+c4+c7)n – (c2+c3+c4+c7),

which is a linear function of n.


Worst Case

If the input is in reverse sorted order, tj = j, and

T(n) = (c4+c5+c6)n²/2 + (c1+c2+c3+c4/2 – c5/2 – c6/2 + c7)n – (c2+c3+c4+c7),

which is a quadratic function of n.
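The coefficients come from the two arithmetic-series sums in the cost table; a short derivation (using the standard identity for Σj, not spelled out on the slides):

\[
\sum_{j=2}^{n} t_j = \sum_{j=2}^{n} j = \frac{n(n+1)}{2} - 1,
\qquad
\sum_{j=2}^{n} (t_j - 1) = \sum_{j=2}^{n} (j - 1) = \frac{n(n-1)}{2}
\]
\[
T(n) = c_1 n + (c_2 + c_3 + c_7)(n-1)
     + c_4\left(\frac{n(n+1)}{2} - 1\right)
     + (c_5 + c_6)\,\frac{n(n-1)}{2}
\]

Collecting powers of n gives the quadratic expression above.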


Average Case

Assume each key is, on average, larger than about half of the elements already in A[1..j-1], so tj ≈ j/2. Then, roughly,

T(n) ≈ (c4+c5+c6)n²/4 + (c1+c2+c3+c4/4 – c5/4 – c6/4 + c7)n – (c2+c3+c4+c7),

which is still a quadratic function of n.


Machine Independent Analysis

As the size of the input becomes large, the constants ci matter much less than the exponents and log factors, and because they depend on the particular machine they make machine-independent analysis impossible.

– Ignore the constants.
– Examine the growth of T(n) as n → ∞.
– This is asymptotic analysis.

Order of Growth

• The rate of growth is of primary interest, so we consider only the leading term of the running time and ignore constant factors (e.g., an² + bn + c grows like n²); a small numerical check follows this list.

• Thus, the worst-case running time of Insertion Sort is Θ(n²): quadratic time.

• We will define this more precisely later.
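To see that the leading term dominates, one can plug sample constants into the worst-case formula and watch T(n)/n² settle toward (c4+c5+c6)/2 as n grows. The constant values below are made up purely for illustration:

# Hypothetical per-line costs, chosen only to illustrate the point.
c1, c2, c3, c4, c5, c6, c7 = 3, 2, 1, 2, 3, 1, 2

def T_worst(n):
    """Worst-case T(n) from the cost table, with t_j = j."""
    sum_tj = n * (n + 1) // 2 - 1          # sum of t_j for j = 2..n
    sum_tj_minus_1 = n * (n - 1) // 2      # sum of (t_j - 1) for j = 2..n
    return (c1 * n + (c2 + c3 + c7) * (n - 1)
            + c4 * sum_tj + (c5 + c6) * sum_tj_minus_1)

for n in (10**2, 10**3, 10**4, 10**5):
    print(n, T_worst(n) / n**2)            # settles toward (c4+c5+c6)/2 = 3.0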

Design Approach: Divide and Conquer

• Divide the problem into a number of subproblems.

• Conquer the subproblems recursively.

• Combine the subproblem solutions into the solution for the original problem.

• Recursion: when an algorithm calls itself.

Merge Sort

• Divide: Divide an n-element array into two subsequences of n/2 elements each.

• Conquer: Sort the two subsequences recursively with merge sort.

• Combine: Merge the two sorted arrays to produce the sorted sequence.

• Special Case: If the sequence has only one element the recursion “bottoms out” as the sequence is sorted by definition.

Merge Sort

MERGE-SORT (A[1 . . n])
1. If n = 1, return A.
2. L = A[1 . . n/2]
3. R = A[n/2 + 1 . . n]
4. L = MERGE-SORT(L)
5. R = MERGE-SORT(R)
6. Return MERGE(L, R)
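A runnable Python sketch of this scheme (the helper name merge, the slicing style, and returning a new list rather than sorting in place are choices made here, not dictated by the pseudocode):

def merge(L, R):
    """Combine: merge two sorted lists into one sorted list."""
    out = []
    i = j = 0
    while i < len(L) and j < len(R):
        if L[i] <= R[j]:                   # take the smaller front element
            out.append(L[i]); i += 1
        else:
            out.append(R[j]); j += 1
    out.extend(L[i:])                      # at most one of these is non-empty
    out.extend(R[j:])
    return out

def merge_sort(A):
    """Return a sorted copy of A (the recursion bottoms out at length 0 or 1)."""
    if len(A) <= 1:
        return list(A)
    mid = len(A) // 2                      # Divide
    return merge(merge_sort(A[:mid]),      # Conquer left half
                 merge_sort(A[mid:]))      # Conquer right half, then Combine

print(merge_sort([20, 13, 7, 2, 12, 11, 9, 1]))   # [1, 2, 7, 9, 11, 12, 13, 20]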

Merging two sorted arrays

Example: merge the sorted arrays <2, 7, 13, 20> and <1, 9, 11, 12>. Repeatedly compare the smallest remaining element of each array and output the smaller one: 1, 2, 7, 9, 11, 12, then the leftover 13, 20. Result: <1, 2, 7, 9, 11, 12, 13, 20>.

Time = Θ(n) to merge a total of n elements (linear time).
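The linear cost can be checked by counting comparisons: merging sorted lists with n elements in total performs at most n – 1 comparisons and exactly n appends. A small instrumented sketch (the comparison counter is added here for illustration):

def merge_counting(L, R):
    """Merge two sorted lists, returning (merged_list, number_of_comparisons)."""
    out, comparisons = [], 0
    i = j = 0
    while i < len(L) and j < len(R):
        comparisons += 1
        if L[i] <= R[j]:
            out.append(L[i]); i += 1
        else:
            out.append(R[j]); j += 1
    out.extend(L[i:])                      # whichever list is left over is appended as-is
    out.extend(R[j:])
    return out, comparisons

merged, comps = merge_counting([2, 7, 13, 20], [1, 9, 11, 12])
print(merged, comps)                       # 6 comparisons here; never more than n - 1 = 7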

Analyzing merge sort

MERGE-SORT A[1 . . n]
1. If n = 1, done.                              Θ(1)
2. Recursively sort A[1 . . n/2]
   and A[n/2 + 1 . . n].                        2T(n/2)
3. “Merge” the 2 sorted lists.                  Θ(n)

Sloppiness: Step 2 should be T(⌈n/2⌉) + T(⌊n/2⌋), but it turns out not to matter asymptotically.

Recurrence for merge sort

T(n) = Θ(1)            if n = 1;
T(n) = 2T(n/2) + Θ(n)  if n > 1.

• We shall usually omit stating the base case when T(n) = Θ(1) for sufficiently small n, but only when it has no effect on the asymptotic solution to the recurrence.

Recursion tree

Solve T(n) = 2T(n/2) + cn, where c > 0 is constant.

Level 0:   cn                                      → cn
Level 1:   cn/2    cn/2                            → cn
Level 2:   cn/4  cn/4  cn/4  cn/4                  → cn
  …
Leaves:    Θ(1) each, #leaves = n                  → Θ(n)

The height of the tree is h = lg n, and each of the lg n levels above the leaves contributes cn.

Total: cn lg n + Θ(n) = Θ(n lg n)
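As a cross-check (not on the slides), the recurrence can be evaluated numerically and compared against n lg n; the constant c = 1 and the base value T(1) = 1 are arbitrary choices:

import math
from functools import lru_cache

C = 1   # the constant c in the recurrence; any positive value behaves the same way

@lru_cache(maxsize=None)
def T(n):
    """Evaluate T(n) = 2 T(n/2) + C*n with T(1) = 1, for n a power of two."""
    if n == 1:
        return 1
    return 2 * T(n // 2) + C * n

for k in (4, 8, 12, 16, 20):
    n = 2 ** k
    print(n, T(n) / (n * math.log2(n)))    # ratio approaches C, consistent with T(n) = Theta(n lg n)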

Conclusions

• Θ(n lg n) grows more slowly than Θ(n²).

• Therefore, merge sort asymptotically beats insertion sort in the worst case.