Hidden Markov Models BMI/CS 776 craven/776.html Mark Craven [email protected] March 2002.
Hidden Markov Models
(Part 1)
BMI/CS 576
www.biostat.wisc.edu/bmi576.html
Mark Craven
Fall 2011
A simple HMM

[Figure: an HMM with two states, each able to emit any of the characters A, C, G, T]

• given, say, a T in our input sequence, which state emitted it?
The hidden part of the problem

• we'll distinguish between the observed parts of a problem and the hidden parts
• in the Markov models we've considered previously, it is clear which state accounts for each part of the observed sequence
• in the model above, there are multiple states that could account for each part of the observed sequence – this is the hidden part of the problem
Simple HMM for gene finding
Figure from A. Krogh, An Introduction to Hidden Markov Models for Biological Sequences
[Figure: a simple gene-finding HMM with a start state, single-nucleotide states for A, C, G, T, and states for the 5-mers GCTAC, AAAAA, TTTTT, CTACG, CTACA, CTACC, CTACT]
The parameters of an HMM

• as in Markov chain models, we have transition probabilities

    a_kl = P(π_i = l | π_{i-1} = k)        probability of a transition from state k to state l

  where π represents a path (a sequence of states) through the model

• since we've decoupled states and characters, we might also have emission probabilities

    e_k(b) = P(x_i = b | π_i = k)          probability of emitting character b in state k
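One way to hold these parameters in code is a pair of nested dictionaries, keyed by state and then by successor state or character. This sketch is not from the slides; the state numbering (0 = begin, 5 = end) follows the later figures, and the probability values are an assumed reconstruction of the example model, not authoritative.

```python
# transition probabilities a[k][l] = P(pi_i = l | pi_{i-1} = k)
# (values are an assumed reconstruction of the slides' five-state example)
a = {
    0: {1: 0.5, 2: 0.5},   # 0 is the silent begin state
    1: {1: 0.2, 3: 0.8},
    2: {2: 0.8, 4: 0.2},
    3: {3: 0.4, 5: 0.6},
    4: {4: 0.1, 5: 0.9},   # 5 is the silent end state
}

# emission probabilities e[k][b] = P(x_i = b | pi_i = k)
e = {
    1: {'A': 0.4, 'C': 0.1, 'G': 0.2, 'T': 0.3},
    2: {'A': 0.4, 'C': 0.1, 'G': 0.1, 'T': 0.4},
    3: {'A': 0.2, 'C': 0.3, 'G': 0.3, 'T': 0.2},
    4: {'A': 0.1, 'C': 0.4, 'G': 0.4, 'T': 0.1},
}

# sanity check: outgoing transitions and emissions are probability distributions
assert all(abs(sum(row.values()) - 1.0) < 1e-9 for row in a.values())
assert all(abs(sum(row.values()) - 1.0) < 1e-9 for row in e.values())
```

Missing keys in the inner dictionaries stand for zero-probability transitions, which keeps the tables sparse for models with restricted topology.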
A simple HMM with emission parameters

[Figure: an HMM with silent begin (0) and end (5) states and four emitting states (1–4), each annotated with emission probabilities for A, C, G, T, and with transition probabilities on the edges]

    e_2(A)    probability of emitting character A in state 2
    a_13      probability of a transition from state 1 to state 3
Three important questions

• How likely is a given sequence?
    the Forward algorithm
• What is the most probable "path" for generating a given sequence?
    the Viterbi algorithm
• How can we learn the HMM parameters given a set of sequences?
    the Forward-Backward (Baum-Welch) algorithm
Path notation

[Figure: the five-state HMM with its emission and transition probabilities, with the path begin → 1 → 1 → 3 → end indicated]

• let π be a vector representing a path (a sequence of states) through the HMM
• e.g. for the sequence x_1 = A, x_2 = A, x_3 = C emitted along the path above:

    π_0 = 0,  π_1 = 1,  π_2 = 1,  π_3 = 3,  π_4 = 5
How likely is a given sequence?

• the probability that the path π_0…π_N is taken and the sequence x_1…x_L is generated:

    P(x_1…x_L, π_0…π_N) = a_{0π_1} ∏_{i=1}^{L} e_{π_i}(x_i) a_{π_i π_{i+1}}

  (assuming begin/end are the only silent states on the path)
How likely is a given sequence?

[Figure: the five-state HMM with its emission and transition probabilities]

    P(AAC, π) = a_01 × e_1(A) × a_11 × e_1(A) × a_13 × e_3(C) × a_35
              = 0.5 × 0.4 × 0.2 × 0.4 × 0.8 × 0.3 × 0.6
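The product above is easy to check in a few lines of Python; the factors are read straight off the worked example, so nothing here goes beyond the slide's numbers.

```python
# Joint probability of the sequence AAC and the path begin -> 1 -> 1 -> 3 -> end,
# using the factors from the worked example:
# a_01, e_1(A), a_11, e_1(A), a_13, e_3(C), a_35
factors = [0.5, 0.4, 0.2, 0.4, 0.8, 0.3, 0.6]

p = 1.0
for f in factors:
    p *= f

print(p)  # 0.002304 (up to floating-point rounding)
```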
How likely is a given sequence?

• the probability over all paths is:

    P(x_1…x_L) = Σ_π P(x_1…x_L, π_0…π_N)
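For a model this small, the sum over all paths can be computed by brute force, which makes the marginalization concrete (and later gives a check for the Forward algorithm). The parameter tables below are an assumed reconstruction of the five-state example from the figures; only a few of these values appear explicitly in the slides, so treat the numbers as illustrative.

```python
import itertools

# assumed reconstruction of the slides' five-state example (0 = begin, 5 = end)
a = {0: {1: 0.5, 2: 0.5}, 1: {1: 0.2, 3: 0.8}, 2: {2: 0.8, 4: 0.2},
     3: {3: 0.4, 5: 0.6}, 4: {4: 0.1, 5: 0.9}}
e = {1: {'A': 0.4, 'C': 0.1, 'G': 0.2, 'T': 0.3},
     2: {'A': 0.4, 'C': 0.1, 'G': 0.1, 'T': 0.4},
     3: {'A': 0.2, 'C': 0.3, 'G': 0.3, 'T': 0.2},
     4: {'A': 0.1, 'C': 0.4, 'G': 0.4, 'T': 0.1}}

def brute_force_prob(x, a, e, begin=0, end=5):
    """P(x) by summing the joint probability over every path of emitting states."""
    total = 0.0
    for path in itertools.product(e.keys(), repeat=len(x)):
        p = a[begin].get(path[0], 0.0)            # a_{0,pi_1}
        for i, state in enumerate(path):
            p *= e[state][x[i]]                   # e_{pi_i}(x_i)
            nxt = path[i + 1] if i + 1 < len(path) else end
            p *= a[state].get(nxt, 0.0)           # a_{pi_i, pi_{i+1}}
        total += p
    return total

print(brute_force_prob("TAGA", a, e))  # ~ 0.00046224 with these assumed parameters
```

This enumerates 4^L paths, which is exactly the blow-up the Forward algorithm avoids.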
Number of paths

[Figure: an HMM with a begin state (0), two emitting states (1 and 2), and an end state (3)]

• for a sequence of length L, how many possible paths through this HMM are there?
    2^L
• the Forward algorithm enables us to compute P(x_1…x_L) efficiently
How likely is a given sequence: the Forward algorithm

• define f_k(i) to be the probability of being in state k having observed the first i characters of x
• we want to compute f_N(L), the probability of being in the end state having observed all of x
• we can define this recursively
The Forward algorithm

• because of the Markov property, we don't have to explicitly enumerate every path – we use dynamic programming instead
• e.g. compute f_4(i) using f_2(i−1) and f_4(i−1)

[Figure: the five-state HMM with its emission and transition probabilities]
The Forward algorithm

initialization:

    f_0(0) = 1        probability that we're in the start state and have observed 0 characters from the sequence
    f_k(0) = 0        for the states k that are not silent
The Forward algorithm
!
fl (i) = fk (i)aklk
"
recursion for silent states:
!
fl (i) = el (i) fk (i "1)aklk
#
recursion for emitting states (i =1…L):
The Forward algorithm

termination:

    P(x) = P(x_1…x_L) = f_N(L) = Σ_k f_k(L) a_kN

probability that we're in the end state and have observed the entire sequence
Forward algorithm example

[Figure: the five-state HMM with its emission and transition probabilities]

• given the sequence x = TAGA
Forward algorithm example

• given the sequence x = TAGA
• initialization:

    f_0(0) = 1,  f_1(0) = 0 … f_5(0) = 0

• computing other values:

    f_1(1) = e_1(T) × (f_0(0) a_01 + f_1(0) a_11) = 0.3 × (1 × 0.5 + 0 × 0.2) = 0.15
    f_2(1) = 0.4 × (1 × 0.5 + 0 × 0.8)
    f_1(2) = e_1(A) × (f_0(1) a_01 + f_1(1) a_11) = 0.4 × (0 × 0.5 + 0.15 × 0.2)
    ...
    P(TAGA) = f_5(4) = f_3(4) a_35 + f_4(4) a_45
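The recursion is short to code. The sketch below runs the Forward algorithm on this example; the full transition/emission tables are an assumed reconstruction from the figure (only a few values appear explicitly in the slides), so the numbers are illustrative rather than authoritative.

```python
# assumed reconstruction of the slides' five-state example (0 = begin, 5 = end)
a = {0: {1: 0.5, 2: 0.5}, 1: {1: 0.2, 3: 0.8}, 2: {2: 0.8, 4: 0.2},
     3: {3: 0.4, 5: 0.6}, 4: {4: 0.1, 5: 0.9}}
e = {1: {'A': 0.4, 'C': 0.1, 'G': 0.2, 'T': 0.3},
     2: {'A': 0.4, 'C': 0.1, 'G': 0.1, 'T': 0.4},
     3: {'A': 0.2, 'C': 0.3, 'G': 0.3, 'T': 0.2},
     4: {'A': 0.1, 'C': 0.4, 'G': 0.4, 'T': 0.1}}

def forward(x, a, e, begin=0, end=5):
    states = list(e)                      # the emitting states 1..4
    # initialization: f_0(0) = 1, f_k(0) = 0 for the emitting states
    prev = {begin: 1.0}
    prev.update({k: 0.0 for k in states})
    for ch in x:
        # recursion for emitting states: f_l(i) = e_l(x_i) * sum_k f_k(i-1) a_kl
        cur = {}
        for l in states:
            total = sum(prev[k] * a.get(k, {}).get(l, 0.0) for k in prev)
            cur[l] = e[l][ch] * total
        cur[begin] = 0.0                  # past position 0 we can't still be in begin
        prev = cur
    # termination: P(x) = sum_k f_k(L) a_kN
    return sum(prev[k] * a.get(k, {}).get(end, 0.0) for k in states)

print(forward("TAGA", a, e))  # ~ 0.00046224 with these assumed parameters
```

Each position is computed from the previous one alone, so the cost is O(L × K^2) for K states rather than exponential in L.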
Forward algorithm note

• in some cases, we can make the algorithm more efficient by taking into account the minimum number of steps that must be taken to reach a state

[Figure: the topology of the five-state HMM (begin = 0, states 1–4, end = 5)]

• e.g. for this HMM, we don't need to initialize or compute the values

    f_3(0),  f_4(0),  f_5(0),  f_5(1)
Three important questions

• How likely is a given sequence?
• What is the most probable "path" for generating a given sequence?
• How can we learn the HMM parameters given a set of sequences?
Finding the most probable path: the Viterbi algorithm

• define v_k(i) to be the probability of the most probable path accounting for the first i characters of x and ending in state k
• we want to compute v_N(L), the probability of the most probable path accounting for all of the sequence and ending in the end state
• we can define v_N(L) recursively, and use dynamic programming to find it efficiently
Finding the most probable path: the Viterbi algorithm

• initialization:

    v_0(0) = 1
    v_k(0) = 0        for the states k that are not silent
The Viterbi algorithm

• recursion for emitting states (i = 1…L):

    v_l(i) = e_l(x_i) max_k [v_k(i−1) a_kl]
    ptr_i(l) = argmax_k [v_k(i−1) a_kl]        keep track of the most probable path

• recursion for silent states:

    v_l(i) = max_k [v_k(i) a_kl]
    ptr_i(l) = argmax_k [v_k(i) a_kl]
The Viterbi algorithm

• termination:

    π_L = argmax_k (v_k(L) a_kN)
    P(x, π) = max_k (v_k(L) a_kN)

• traceback: follow pointers back starting at π_L
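The Viterbi recursion mirrors the Forward recursion with max in place of sum, plus the pointer bookkeeping for the traceback. The sketch below runs it on the same example; as before, the parameter tables are an assumed reconstruction from the figure, and ties between equally probable paths are broken arbitrarily.

```python
# assumed reconstruction of the slides' five-state example (0 = begin, 5 = end)
a = {0: {1: 0.5, 2: 0.5}, 1: {1: 0.2, 3: 0.8}, 2: {2: 0.8, 4: 0.2},
     3: {3: 0.4, 5: 0.6}, 4: {4: 0.1, 5: 0.9}}
e = {1: {'A': 0.4, 'C': 0.1, 'G': 0.2, 'T': 0.3},
     2: {'A': 0.4, 'C': 0.1, 'G': 0.1, 'T': 0.4},
     3: {'A': 0.2, 'C': 0.3, 'G': 0.3, 'T': 0.2},
     4: {'A': 0.1, 'C': 0.4, 'G': 0.4, 'T': 0.1}}

def viterbi(x, a, e, begin=0, end=5):
    states = list(e)
    v = {begin: 1.0}
    v.update({k: 0.0 for k in states})
    ptrs = []                             # ptrs[i-1][l] = best predecessor of l at position i
    for ch in x:
        cur, back = {}, {}
        for l in states:
            # v_l(i) = e_l(x_i) * max_k v_k(i-1) a_kl, remembering the argmax
            best_k = max(v, key=lambda k: v[k] * a.get(k, {}).get(l, 0.0))
            cur[l] = e[l][ch] * v[best_k] * a.get(best_k, {}).get(l, 0.0)
            back[l] = best_k
        cur[begin] = 0.0
        v = cur
        ptrs.append(back)
    # termination: pick the best state feeding into the end state
    last = max(states, key=lambda k: v[k] * a.get(k, {}).get(end, 0.0))
    best_p = v[last] * a.get(last, {}).get(end, 0.0)
    # traceback: follow the pointers back starting at the last position
    path = [last]
    for back in reversed(ptrs[1:]):
        path.append(back[path[-1]])
    path.reverse()
    return path, best_p

path, p = viterbi("TAGA", a, e)
print(path, p)
```

Note the Viterbi probability is the probability of the single best path, so it is at most the Forward probability for the same sequence.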
Three important questions

• How likely is a given sequence?
• What is the most probable "path" for generating a given sequence?
• How can we learn the HMM parameters given a set of sequences?