Lecture 8. Paradigm #6 Dynamic Programming


Page 1: Lecture 8. Paradigm #6 Dynamic Programming

Lecture 8. Paradigm #6 Dynamic Programming

Popularized by Richard Bellman ("Dynamic Programming", Princeton University Press, 1957; call number QA 264.B36). Chapter 15 of CLRS.

Typically, dynamic programming reduces the complexity of a problem from 2^n to O(n^3) or O(n^2) or even O(n).

It does so by keeping track of already computed results in a bottom-up fashion, hence avoiding enumerating all possibilities.

Typically applies to optimization problems.
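As a warm-up, the familiar Fibonacci example illustrates the idea of keeping already computed results in a bottom-up table (a Python sketch; the function name is mine, not from the lecture):

```python
# Bottom-up dynamic programming: fill a table of subproblem results
# instead of recomputing overlapping subproblems recursively.
def fib(n):
    """Return the n-th Fibonacci number (fib(0)=0, fib(1)=1) in O(n) time."""
    if n < 2:
        return n
    table = [0, 1]
    for i in range(2, n + 1):
        table.append(table[i - 1] + table[i - 2])  # reuse stored results
    return table[n]

print(fib(10))  # 55
```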

Page 2: Lecture 8. Paradigm #6 Dynamic Programming

Example 1. Efficient multiplication of matrices (Section 15.2 of CLRS.)

Suppose we are given the following 3 matrices:

M1: 10 x 100
M2: 100 x 5
M3: 5 x 50

There are two ways to compute M1*M2*M3: M1 (M2 M3) or (M1 M2) M3

Since the cost of multiplying a p x q matrix by a q x r matrix is pqr multiplications, the cost of M1 (M2 M3) is 100 x 5 x 50 + 10 x 100 x 50 = 75,000 multiplications, while the cost of (M1 M2) M3 is 10 x 100 x 5 + 10 x 5 x 50 = 7,500 multiplications: a difference of a factor of 10.
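The two costs can be checked with a few lines of Python (`mult_cost` is a hypothetical helper of my own, not from the slides):

```python
# Cost in scalar multiplications of multiplying a p x q matrix by a q x r matrix.
def mult_cost(p, q, r):
    return p * q * r

# Dimensions from the slide: M1 is 10x100, M2 is 100x5, M3 is 5x50.
# M1 (M2 M3): first M2*M3 (100x5 by 5x50), then M1 by the 100x50 result.
cost_right = mult_cost(100, 5, 50) + mult_cost(10, 100, 50)  # 25000 + 50000
# (M1 M2) M3: first M1*M2 (10x100 by 100x5), then the 10x5 result by M3.
cost_left = mult_cost(10, 100, 5) + mult_cost(10, 5, 50)     # 5000 + 2500
print(cost_right, cost_left)  # 75000 7500
```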

Page 3: Lecture 8. Paradigm #6 Dynamic Programming

Naïve approach

We could enumerate all possibilities, and then take the minimum. How many possibilities are there?

The LAST multiplication performed is either M1*(M2 ... Mn), or (M1 M2)*(M3 ... Mn), or ..., or (M1 ... Mn-1)*Mn. Therefore W(n), the number of ways to compute M1 M2 ... Mn, satisfies the following recurrence:

W(n) = Σ_{1 ≤ k < n} W(k) W(n-k), with W(1) = 1 --- the Catalan numbers

Now it can be proved by induction that W(n) = (2n-2 choose n-1)/n. Using Stirling's approximation, which says that

n! = √(2πn) n^n e^(-n) (1 + o(1)),

we have (2n choose n) ~ 2^(2n)/√(πn). We conclude that W(n) grows like 4^n n^(-3/2) (up to a constant factor), which means our naive approach will simply take too long (about 10^10 steps when n = 20).
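The recurrence can be tabulated directly and checked against the closed form W(n) = (2n-2 choose n-1)/n (a Python sketch; `num_orders` is my name for W):

```python
from math import comb

def num_orders(n):
    """W(n): number of ways to parenthesize a product of n matrices,
    computed directly from the recurrence W(n) = sum W(k) W(n-k)."""
    W = [0, 1]  # W(1) = 1: a single matrix needs no multiplication
    for m in range(2, n + 1):
        W.append(sum(W[k] * W[m - k] for k in range(1, m)))
    return W[n]

# Agrees with the closed form (the Catalan numbers) for small n.
assert all(num_orders(n) == comb(2 * n - 2, n - 1) // n for n in range(1, 12))
print(num_orders(20))  # about 1.8 * 10^9 parenthesizations for n = 20
```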

Page 4: Lecture 8. Paradigm #6 Dynamic Programming

Dynamic Programming approach

Let’s avoid all the re-computation of the recursive approach. Observe: suppose the optimal method to compute M1 M2 ... Mn were to first compute M1 M2 ... Mk (in some order), then compute Mk+1 ... Mn (in some order), and then multiply these together. Then the method used for M1 M2 ... Mk must itself be optimal, for otherwise we could substitute a superior method and improve the overall method. Similarly, the method used to compute Mk+1 ... Mn must also be optimal. The only thing left to do is to find the best possible k, and there are only n-1 choices for that.

Letting m[i,j] represent the optimal cost for computing the product Mi ... Mj (with m[i,i] = 0), we see that

m[i,j] = min_{i ≤ k < j} { m[i,k] + m[k+1,j] + p[i-1] p[k] p[j] }

where k represents the optimal place to break the product Mi ... Mj into two pieces. Here p is an array such that M1 is of dimension p[0] × p[1], M2 is of dimension p[1] × p[2], etc.

Page 5: Lecture 8. Paradigm #6 Dynamic Programming

Implementing it --- O(n^3) time

As with the Fibonacci number example, we cannot implement this recurrence by naive recursion: without storing intermediate results, it takes exponential time.

MATRIX-MULT-ORDER(p)
/* p[0..n] is an array holding the dimensions of the matrices;
   matrix i has dimension p[i-1] x p[i] */
    for i := 1 to n do
        m[i,i] := 0
    for d := 1 to n-1 do            // d is the size of the sub-problem
        for i := 1 to n-d do
            j := i+d
            m[i,j] := infinity
            for k := i to j-1 do
                q := m[i,k] + m[k+1,j] + p[i-1]*p[k]*p[j]
                if q < m[i,j] then
                    m[i,j] := q
                    s[i,j] := k     // optimal position for breaking m[i,j]
    return (m, s)
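A direct Python transcription of MATRIX-MULT-ORDER (a sketch; the tables are padded so indices match the 1-based pseudocode):

```python
import math

def matrix_mult_order(p):
    """Given dimensions p[0..n] (matrix i is p[i-1] x p[i]), return (m, s):
    m[i][j] = minimal multiplication count for Mi ... Mj,
    s[i][j] = optimal split point k."""
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]   # m[i][i] = 0 already
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for d in range(1, n):                        # d = size of the sub-problem
        for i in range(1, n - d + 1):
            j = i + d
            m[i][j] = math.inf
            for k in range(i, j):
                q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < m[i][j]:
                    m[i][j] = q
                    s[i][j] = k                  # best place to break Mi..Mj
    return m, s

m, s = matrix_mult_order([10, 100, 5, 50])
print(m[1][3])  # 7500, matching the (M1 M2) M3 ordering from the example
```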

Page 6: Lecture 8. Paradigm #6 Dynamic Programming

Actually multiply the matrices

We have stored the break points k’s in the array s. s[i,j] represents the optimal place to break the product Mi ... Mj. We can use s now to multiply the matrices:

MATRIX-MULT(M, s, i, j)
/* Given the table s calculated by MATRIX-MULT-ORDER, the list of matrices
   M = [M1, M2, ..., Mn], and starting and finishing indices i and j,
   this routine computes the product Mi ... Mj using the optimal method */
    if j > i then
        X := MATRIX-MULT(M, s, i, s[i,j])
        Y := MATRIX-MULT(M, s, s[i,j]+1, j)
        return X*Y
    else
        return Mi
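The same traceback over s can be used to print the optimal parenthesization instead of multiplying real matrices (a Python sketch; `optimal_split` compactly recomputes s with the same DP, and both names are mine):

```python
def optimal_split(p):
    """Recompute the split table s for dimensions p[0..n]
    (same DP as MATRIX-MULT-ORDER); s[i][j] = best k for Mi..Mj."""
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for d in range(1, n):
        for i in range(1, n - d + 1):
            j = i + d
            m[i][j], s[i][j] = min(
                (m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j], k)
                for k in range(i, j))
    return s

def parenthesize(s, i, j):
    """Mirror MATRIX-MULT's recursion on the stored split points,
    building the parenthesized expression rather than a product."""
    if j > i:
        k = s[i][j]
        return "(" + parenthesize(s, i, k) + " " + parenthesize(s, k + 1, j) + ")"
    return "M%d" % i

s = optimal_split([10, 100, 5, 50])
print(parenthesize(s, 1, 3))  # ((M1 M2) M3)
```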

Page 7: Lecture 8. Paradigm #6 Dynamic Programming

Longest Common Subsequence (LCS)

Application: comparison of two DNA strings. Example: X = ABCBDAB, Y = BDCABA.

Longest Common Subsequence (here BCBA, of length 4):

X = A B C B D A B
Y = B D C A B A

Brute force algorithm would compare each subsequence of X with the symbols in Y

Page 8: Lecture 8. Paradigm #6 Dynamic Programming

LCS Algorithm

If |X| = m and |Y| = n, then there are 2^m subsequences of X; we must compare each with Y, and each comparison takes O(n) time.

So the running time of the brute-force algorithm is O(n 2^m).

Notice that the LCS problem has optimal substructure: solutions of subproblems are parts of the final solution – often, this is when you can use dynamic programming.

Subproblems: “find LCS of pairs of prefixes of X and Y”

Page 9: Lecture 8. Paradigm #6 Dynamic Programming

LCS Algorithm

First we’ll find the length of LCS. Later we’ll modify the algorithm to find LCS itself.

Let Xi, Yj be the prefixes of X and Y of length i and j respectively

Let c[i,j] be the length of LCS of Xi and Yj

Then the length of LCS of X and Y will be c[m,n]

c[i,j] = c[i-1,j-1] + 1               if x[i] = y[j]
c[i,j] = max( c[i-1,j], c[i,j-1] )    otherwise

Page 10: Lecture 8. Paradigm #6 Dynamic Programming

LCS recursive solution

We start with i = j = 0 (empty substrings of x and y)

Since X0 and Y0 are empty strings, their LCS is always empty (i.e. c[0,0] = 0).

The LCS of an empty string and any other string is empty, so for every i and j: c[0,j] = c[i,0] = 0.

c[i,j] = c[i-1,j-1] + 1               if x[i] = y[j]
c[i,j] = max( c[i-1,j], c[i,j-1] )    otherwise

Page 11: Lecture 8. Paradigm #6 Dynamic Programming

LCS recursive solution

When we calculate c[i,j], we consider two cases:

First case: x[i] = y[j]. One more symbol in strings X and Y matches, so the length of the LCS of Xi and Yj equals the length of the LCS of the smaller strings Xi-1 and Yj-1, plus 1.

c[i,j] = c[i-1,j-1] + 1               if x[i] = y[j]
c[i,j] = max( c[i-1,j], c[i,j-1] )    otherwise

Page 12: Lecture 8. Paradigm #6 Dynamic Programming

04/22/23 12

LCS recursive solution

Second case: x[i] != y[j]. As the symbols don't match, our solution is not improved by this pair, and the length of LCS(Xi, Yj) is the best we already had: we take the maximum of LCS(Xi, Yj-1) and LCS(Xi-1, Yj).

c[i,j] = c[i-1,j-1] + 1               if x[i] = y[j]
c[i,j] = max( c[i-1,j], c[i,j-1] )    otherwise

Think: Why can’t we just take the length of LCS(Xi-1, Yj-1) ?

Page 13: Lecture 8. Paradigm #6 Dynamic Programming

LCS Length Algorithm

LCS-Length(X, Y)
1.  m = length(X)               // get the # of symbols in X
2.  n = length(Y)               // get the # of symbols in Y
3.  for i = 1 to m: c[i,0] = 0  // special case: Y0 is empty
4.  for j = 1 to n: c[0,j] = 0  // special case: X0 is empty
5.  for i = 1 to m              // for all Xi
6.      for j = 1 to n          // for all Yj
7.          if ( x[i] == y[j] )
8.              c[i,j] = c[i-1,j-1] + 1
9.          else c[i,j] = max( c[i-1,j], c[i,j-1] )
10. return c
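A Python sketch of LCS-Length (the 1-based pseudocode indices x[i], y[j] map to X[i-1], Y[j-1] on 0-based strings):

```python
def lcs_length(X, Y):
    """Fill the (m+1) x (n+1) table c bottom-up; c[m][n] is the LCS length."""
    m, n = len(X), len(Y)
    c = [[0] * (n + 1) for _ in range(m + 1)]  # row 0 and column 0 stay 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:           # x[i] == y[j] in 1-based terms
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    return c

c = lcs_length("ABCB", "BDCAB")
print(c[4][5])  # 3 -- the LCS of ABCB and BDCAB has length 3
```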

Page 14: Lecture 8. Paradigm #6 Dynamic Programming

LCS Example

We’ll see how LCS algorithm works on the following example:

X = ABCB Y = BDCAB

LCS(X, Y) = BCB

X = A B C B
Y = B D C A B

Page 15: Lecture 8. Paradigm #6 Dynamic Programming

LCS Example (0)

X = ABCB; m = |X| = 4
Y = BDCAB; n = |Y| = 5
Allocate the (m+1) x (n+1) array c[0..4, 0..5].

[Slide: the empty table, rows i = 0..4 labelled with Xi = A, B, C, B and columns j = 0..5 labelled with Yj = B, D, C, A, B.]

Page 16: Lecture 8. Paradigm #6 Dynamic Programming

LCS Example (1)

for i = 1 to m: c[i,0] = 0
for j = 1 to n: c[0,j] = 0

[Slide: the table with row 0 and column 0 filled with 0s.]

Page 17: Lecture 8. Paradigm #6 Dynamic Programming

LCS Example (2)

x[1] = A, y[1] = B: no match, so c[1,1] = max( c[0,1], c[1,0] ) = 0.

Page 18: Lecture 8. Paradigm #6 Dynamic Programming

LCS Example (3)

x[1] = A matches neither y[2] = D nor y[3] = C, so c[1,2] = c[1,3] = 0.

Page 19: Lecture 8. Paradigm #6 Dynamic Programming

LCS Example (4)

x[1] = A matches y[4] = A, so c[1,4] = c[0,3] + 1 = 1.

Page 20: Lecture 8. Paradigm #6 Dynamic Programming

LCS Example (5)

x[1] = A, y[5] = B: no match, so c[1,5] = max( c[0,5], c[1,4] ) = 1. Row 1 is complete: 0 0 0 1 1.

Page 21: Lecture 8. Paradigm #6 Dynamic Programming

LCS Example (6)

x[2] = B matches y[1] = B, so c[2,1] = c[1,0] + 1 = 1.

Page 22: Lecture 8. Paradigm #6 Dynamic Programming

LCS Example (7)

x[2] = B matches none of y[2] = D, y[3] = C, y[4] = A, so c[2,2] = c[2,3] = c[2,4] = 1 (maxima of neighbours).

Page 23: Lecture 8. Paradigm #6 Dynamic Programming

LCS Example (8)

x[2] = B matches y[5] = B, so c[2,5] = c[1,4] + 1 = 2. Row 2 is complete: 1 1 1 1 2.

Page 24: Lecture 8. Paradigm #6 Dynamic Programming

LCS Example (10)

x[3] = C matches neither y[1] = B nor y[2] = D, so c[3,1] = c[3,2] = 1.

Page 25: Lecture 8. Paradigm #6 Dynamic Programming

LCS Example (11)

x[3] = C matches y[3] = C, so c[3,3] = c[2,2] + 1 = 2.

Page 26: Lecture 8. Paradigm #6 Dynamic Programming

LCS Example (12)

x[3] = C matches neither y[4] = A nor y[5] = B, so c[3,4] = c[3,5] = 2. Row 3 is complete: 1 1 2 2 2.

Page 27: Lecture 8. Paradigm #6 Dynamic Programming

LCS Example (13)

x[4] = B matches y[1] = B, so c[4,1] = c[3,0] + 1 = 1.

Page 28: Lecture 8. Paradigm #6 Dynamic Programming

LCS Example (14)

Continuing along row 4 (no matches): c[4,2] = 1, c[4,3] = 2, c[4,4] = 2.

Page 29: Lecture 8. Paradigm #6 Dynamic Programming

LCS Example (15)

x[4] = B matches y[5] = B, so c[4,5] = c[3,4] + 1 = 3. The table is complete:

 j      0   1   2   3   4   5
 Yj         B   D   C   A   B
 i  Xi
 0      0   0   0   0   0   0
 1  A   0   0   0   0   1   1
 2  B   0   1   1   1   1   2
 3  C   0   1   1   2   2   2
 4  B   0   1   1   2   2   3

Page 30: Lecture 8. Paradigm #6 Dynamic Programming


LCS Algorithm Running Time

The LCS algorithm calculates the value of every entry of the (m+1) x (n+1) array c

So what is the running time?

O(m*n)

since each c[i,j] is calculated in constant time, and there are m*n elements in the array

Page 31: Lecture 8. Paradigm #6 Dynamic Programming


How to find actual LCS

So far, we have just found the length of LCS, but not LCS itself.

We want to modify this algorithm to make it output Longest Common Subsequence of X and Y

Each c[i,j] depends on c[i-1,j], c[i,j-1], or c[i-1,j-1]

For each c[i,j] we can say how it was acquired:

  2  2
  2  3

For example, here c[i,j] = c[i-1,j-1] + 1 = 2 + 1 = 3

Page 32: Lecture 8. Paradigm #6 Dynamic Programming


How to find actual LCS - continued

Remember that

c[i,j] = c[i-1,j-1] + 1               if x[i] = y[j]
c[i,j] = max( c[i-1,j], c[i,j-1] )    otherwise

So we can start from c[m,n] and go backwards

Whenever c[i,j] = c[i-1, j-1]+1, remember x[i] (because x[i] is a part of LCS)

When i=0 or j=0 (i.e. we reached the beginning), output remembered letters in reverse order
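The traceback just described, as a Python sketch that first fills c and then walks back from c[m][n]:

```python
def lcs(X, Y):
    """Recover the LCS itself by walking backwards from c[m][n]:
    on a match remember x[i] and step diagonally; otherwise follow
    the neighbour holding the larger value."""
    m, n = len(X), len(Y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if X[i - 1] == Y[j - 1]:
            out.append(X[i - 1])          # x[i] is part of the LCS
            i, j = i - 1, j - 1
        elif c[i - 1][j] >= c[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))         # letters were collected in reverse

print(lcs("ABCB", "BDCAB"))  # BCB
```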

Page 33: Lecture 8. Paradigm #6 Dynamic Programming

Finding LCS

[Slide: the completed table; the traceback starts at c[4,5] = 3, where x[4] = B matches y[5] = B, so B is remembered.]

Page 34: Lecture 8. Paradigm #6 Dynamic Programming

Finding LCS (2)

[Slide: the completed table with the traceback path marked through the matches B, C, B.]

LCS (reversed order): B C B

LCS (straight order): B C B (this string turned out to be a palindrome)

Page 35: Lecture 8. Paradigm #6 Dynamic Programming

If we have time, we will do some exercises in class:

Edit distance: Given two text strings A of length n and B of length m, you want to transform A into B with a minimum number of operations of the following types: delete a character from A, insert a character into A, or change some character in A into a new character. The minimal number of such operations required to transform A into B is called the edit distance between A and B.
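For reference, one standard DP sketch for edit distance (this is a common solution approach, not worked in the lecture; all names are mine):

```python
def edit_distance(A, B):
    """d[i][j] = minimum number of insert/delete/change operations
    turning the prefix A[:i] into the prefix B[:j]."""
    n, m = len(A), len(B)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i              # delete all i characters of A
    for j in range(m + 1):
        d[0][j] = j              # insert all j characters of B
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            change = 0 if A[i - 1] == B[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,           # delete a character of A
                          d[i][j - 1] + 1,           # insert a character
                          d[i - 1][j - 1] + change)  # change (or keep) a character
    return d[n][m]

print(edit_distance("kitten", "sitting"))  # 3
```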

Balanced Partition: Given a set of n integers, each in the range 0 ... K, partition these integers into two subsets such that you minimize |S1 - S2|, where S1 and S2 denote the sums of the elements in each of the two subsets.
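One possible DP sketch for balanced partition, via the set of reachable subset sums (an approach of my own choosing, not from the slides; it runs in pseudo-polynomial time in n and the total sum):

```python
def balanced_partition(nums):
    """Mark every achievable subset sum with a DP over the elements,
    then pick the sum closest to half the total.
    Returns the minimal |S1 - S2|."""
    total = sum(nums)
    reachable = {0}                        # subset sums achievable so far
    for x in nums:
        reachable |= {s + x for s in reachable}
    # If one side sums to s, the other sums to total - s, so
    # |S1 - S2| = |total - 2s|; minimize over achievable s.
    return min(abs(total - 2 * s) for s in reachable)

print(balanced_partition([1, 6, 11, 5]))  # 1  (e.g. {1, 5, 6} vs {11})
```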