Hidden Markov Models BMI/CS 776 craven/776.html Mark Craven [email protected] March 2002.
Hidden Markov Models
(Part 1)
BMI/CS 576
www.biostat.wisc.edu/bmi576.html
Mark Craven
Fall 2011
A simple HMM

[Figure: an HMM with two states, each able to emit any of the characters A, C, G, T]

• given, say, a T in our input sequence, which state emitted it?
The hidden part of the problem

• we'll distinguish between the observed parts of a problem and the hidden parts
• in the Markov models we've considered previously, it is clear which state accounts for each part of the observed sequence
• in the model above, there are multiple states that could account for each part of the observed sequence – this is the hidden part of the problem
Simple HMM for gene finding
Figure from A. Krogh, An Introduction to Hidden Markov Models for Biological Sequences
[Figure: a simple gene-finding HMM with a start state, single-nucleotide states for A, C, G, T, and states for the 5-mers GCTAC, AAAAA, TTTTT, CTACG, CTACA, CTACC, CTACT]
The parameters of an HMM

• as in Markov chain models, we have transition probabilities

    a_kl = P(π_i = l | π_{i-1} = k)        probability of a transition from state k to state l

  where π represents a path (a sequence of states) through the model

• since we've decoupled states and characters, we might also have emission probabilities

    e_k(b) = P(x_i = b | π_i = k)          probability of emitting character b in state k
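One way to hold these parameters in code is a pair of nested dictionaries, keyed by state and then by successor state or character. This sketch is not from the slides; the state numbering (0 = begin, 5 = end) follows the later figures, and the probability values are an assumed reconstruction of the example model, not authoritative.

```python
# transition probabilities a[k][l] = P(pi_i = l | pi_{i-1} = k)
# (values are an assumed reconstruction of the slides' five-state example)
a = {
    0: {1: 0.5, 2: 0.5},   # 0 is the silent begin state
    1: {1: 0.2, 3: 0.8},
    2: {2: 0.8, 4: 0.2},
    3: {3: 0.4, 5: 0.6},
    4: {4: 0.1, 5: 0.9},   # 5 is the silent end state
}

# emission probabilities e[k][b] = P(x_i = b | pi_i = k)
e = {
    1: {'A': 0.4, 'C': 0.1, 'G': 0.2, 'T': 0.3},
    2: {'A': 0.4, 'C': 0.1, 'G': 0.1, 'T': 0.4},
    3: {'A': 0.2, 'C': 0.3, 'G': 0.3, 'T': 0.2},
    4: {'A': 0.1, 'C': 0.4, 'G': 0.4, 'T': 0.1},
}

# sanity check: outgoing transitions and emissions are probability distributions
assert all(abs(sum(row.values()) - 1.0) < 1e-9 for row in a.values())
assert all(abs(sum(row.values()) - 1.0) < 1e-9 for row in e.values())
```

Missing keys in the inner dictionaries stand for zero-probability transitions, which keeps the tables sparse for models with restricted topology.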
A simple HMM with emission parameters

[Figure: an HMM with silent begin (0) and end (5) states and four emitting states (1–4), each annotated with emission probabilities for A, C, G, T, and with transition probabilities on the edges]

    e_2(A)    probability of emitting character A in state 2
    a_13      probability of a transition from state 1 to state 3
Three important questions

• How likely is a given sequence?
    the Forward algorithm
• What is the most probable "path" for generating a given sequence?
    the Viterbi algorithm
• How can we learn the HMM parameters given a set of sequences?
    the Forward-Backward (Baum-Welch) algorithm
Path notation

[Figure: the five-state HMM with its emission and transition probabilities, with the path begin → 1 → 1 → 3 → end indicated]

• let π be a vector representing a path (a sequence of states) through the HMM
• e.g. for the sequence x_1 = A, x_2 = A, x_3 = C emitted along the path above:

    π_0 = 0,  π_1 = 1,  π_2 = 1,  π_3 = 3,  π_4 = 5
How likely is a given sequence?

• the probability that the path π_0…π_N is taken and the sequence x_1…x_L is generated:

    P(x_1…x_L, π_0…π_N) = a_{0π_1} ∏_{i=1}^{L} e_{π_i}(x_i) a_{π_i π_{i+1}}

  (assuming begin/end are the only silent states on the path)
How likely is a given sequence?

[Figure: the five-state HMM with its emission and transition probabilities]

    P(AAC, π) = a_01 × e_1(A) × a_11 × e_1(A) × a_13 × e_3(C) × a_35
              = 0.5 × 0.4 × 0.2 × 0.4 × 0.8 × 0.3 × 0.6
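The product above is easy to check in a few lines of Python; the factors are read straight off the worked example, so nothing here goes beyond the slide's numbers.

```python
# Joint probability of the sequence AAC and the path begin -> 1 -> 1 -> 3 -> end,
# using the factors from the worked example:
# a_01, e_1(A), a_11, e_1(A), a_13, e_3(C), a_35
factors = [0.5, 0.4, 0.2, 0.4, 0.8, 0.3, 0.6]

p = 1.0
for f in factors:
    p *= f

print(p)  # 0.002304 (up to floating-point rounding)
```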
How likely is a given sequence?

• the probability over all paths is:

    P(x_1…x_L) = Σ_π P(x_1…x_L, π_0…π_N)
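For a model this small, the sum over all paths can be computed by brute force, which makes the marginalization concrete (and later gives a check for the Forward algorithm). The parameter tables below are an assumed reconstruction of the five-state example from the figures; only a few of these values appear explicitly in the slides, so treat the numbers as illustrative.

```python
import itertools

# assumed reconstruction of the slides' five-state example (0 = begin, 5 = end)
a = {0: {1: 0.5, 2: 0.5}, 1: {1: 0.2, 3: 0.8}, 2: {2: 0.8, 4: 0.2},
     3: {3: 0.4, 5: 0.6}, 4: {4: 0.1, 5: 0.9}}
e = {1: {'A': 0.4, 'C': 0.1, 'G': 0.2, 'T': 0.3},
     2: {'A': 0.4, 'C': 0.1, 'G': 0.1, 'T': 0.4},
     3: {'A': 0.2, 'C': 0.3, 'G': 0.3, 'T': 0.2},
     4: {'A': 0.1, 'C': 0.4, 'G': 0.4, 'T': 0.1}}

def brute_force_prob(x, a, e, begin=0, end=5):
    """P(x) by summing the joint probability over every path of emitting states."""
    total = 0.0
    for path in itertools.product(e.keys(), repeat=len(x)):
        p = a[begin].get(path[0], 0.0)            # a_{0,pi_1}
        for i, state in enumerate(path):
            p *= e[state][x[i]]                   # e_{pi_i}(x_i)
            nxt = path[i + 1] if i + 1 < len(path) else end
            p *= a[state].get(nxt, 0.0)           # a_{pi_i, pi_{i+1}}
        total += p
    return total

print(brute_force_prob("TAGA", a, e))  # ~ 0.00046224 with these assumed parameters
```

This enumerates 4^L paths, which is exactly the blow-up the Forward algorithm avoids.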
Number of paths

[Figure: an HMM with a begin state (0), two emitting states (1 and 2), and an end state (3)]

• for a sequence of length L, how many possible paths through this HMM are there?
    2^L
• the Forward algorithm enables us to compute P(x_1…x_L) efficiently
How likely is a given sequence: the Forward algorithm

• define f_k(i) to be the probability of being in state k having observed the first i characters of x
• we want to compute f_N(L), the probability of being in the end state having observed all of x
• we can define this recursively
The Forward algorithm

• because of the Markov property, we don't have to explicitly enumerate every path – we use dynamic programming instead
• e.g. compute f_4(i) using f_2(i−1) and f_4(i−1)

[Figure: the five-state HMM with its emission and transition probabilities]
The Forward algorithm

initialization:

    f_0(0) = 1        probability that we're in the start state and have observed 0 characters from the sequence
    f_k(0) = 0        for the states k that are not silent
The Forward algorithm
!
fl (i) = fk (i)aklk
"
recursion for silent states:
!
fl (i) = el (i) fk (i "1)aklk
#
recursion for emitting states (i =1…L):
The Forward algorithm

termination:

    P(x) = P(x_1…x_L) = f_N(L) = Σ_k f_k(L) a_kN

probability that we're in the end state and have observed the entire sequence
Forward algorithm example

[Figure: the five-state HMM with its emission and transition probabilities]

• given the sequence x = TAGA
Forward algorithm example

• given the sequence x = TAGA
• initialization:

    f_0(0) = 1,  f_1(0) = 0 … f_5(0) = 0

• computing other values:

    f_1(1) = e_1(T) × (f_0(0) a_01 + f_1(0) a_11) = 0.3 × (1 × 0.5 + 0 × 0.2) = 0.15
    f_2(1) = 0.4 × (1 × 0.5 + 0 × 0.8)
    f_1(2) = e_1(A) × (f_0(1) a_01 + f_1(1) a_11) = 0.4 × (0 × 0.5 + 0.15 × 0.2)
    ...
    P(TAGA) = f_5(4) = f_3(4) a_35 + f_4(4) a_45
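The recursion is short to code. The sketch below runs the Forward algorithm on this example; the full transition/emission tables are an assumed reconstruction from the figure (only a few values appear explicitly in the slides), so the numbers are illustrative rather than authoritative.

```python
# assumed reconstruction of the slides' five-state example (0 = begin, 5 = end)
a = {0: {1: 0.5, 2: 0.5}, 1: {1: 0.2, 3: 0.8}, 2: {2: 0.8, 4: 0.2},
     3: {3: 0.4, 5: 0.6}, 4: {4: 0.1, 5: 0.9}}
e = {1: {'A': 0.4, 'C': 0.1, 'G': 0.2, 'T': 0.3},
     2: {'A': 0.4, 'C': 0.1, 'G': 0.1, 'T': 0.4},
     3: {'A': 0.2, 'C': 0.3, 'G': 0.3, 'T': 0.2},
     4: {'A': 0.1, 'C': 0.4, 'G': 0.4, 'T': 0.1}}

def forward(x, a, e, begin=0, end=5):
    states = list(e)                      # the emitting states 1..4
    # initialization: f_0(0) = 1, f_k(0) = 0 for the emitting states
    prev = {begin: 1.0}
    prev.update({k: 0.0 for k in states})
    for ch in x:
        # recursion for emitting states: f_l(i) = e_l(x_i) * sum_k f_k(i-1) a_kl
        cur = {}
        for l in states:
            total = sum(prev[k] * a.get(k, {}).get(l, 0.0) for k in prev)
            cur[l] = e[l][ch] * total
        cur[begin] = 0.0                  # past position 0 we can't still be in begin
        prev = cur
    # termination: P(x) = sum_k f_k(L) a_kN
    return sum(prev[k] * a.get(k, {}).get(end, 0.0) for k in states)

print(forward("TAGA", a, e))  # ~ 0.00046224 with these assumed parameters
```

Each position is computed from the previous one alone, so the cost is O(L × K^2) for K states rather than exponential in L.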
Forward algorithm note

• in some cases, we can make the algorithm more efficient by taking into account the minimum number of steps that must be taken to reach a state

[Figure: the topology of the five-state HMM (begin = 0, states 1–4, end = 5)]

• e.g. for this HMM, we don't need to initialize or compute the values

    f_3(0),  f_4(0),  f_5(0),  f_5(1)
Three important questions

• How likely is a given sequence?
• What is the most probable "path" for generating a given sequence?
• How can we learn the HMM parameters given a set of sequences?
Finding the most probable path: the Viterbi algorithm

• define v_k(i) to be the probability of the most probable path accounting for the first i characters of x and ending in state k
• we want to compute v_N(L), the probability of the most probable path accounting for all of the sequence and ending in the end state
• we can define v_N(L) recursively, and use dynamic programming to find it efficiently
Finding the most probable path: the Viterbi algorithm

• initialization:

    v_0(0) = 1
    v_k(0) = 0        for the states k that are not silent
The Viterbi algorithm

• recursion for emitting states (i = 1…L):

    v_l(i) = e_l(x_i) max_k [v_k(i−1) a_kl]
    ptr_i(l) = argmax_k [v_k(i−1) a_kl]        keep track of the most probable path

• recursion for silent states:

    v_l(i) = max_k [v_k(i) a_kl]
    ptr_i(l) = argmax_k [v_k(i) a_kl]
The Viterbi algorithm

• termination:

    π_L = argmax_k (v_k(L) a_kN)
    P(x, π) = max_k (v_k(L) a_kN)

• traceback: follow pointers back starting at π_L
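The Viterbi recursion mirrors the Forward recursion with max in place of sum, plus the pointer bookkeeping for the traceback. The sketch below runs it on the same example; as before, the parameter tables are an assumed reconstruction from the figure, and ties between equally probable paths are broken arbitrarily.

```python
# assumed reconstruction of the slides' five-state example (0 = begin, 5 = end)
a = {0: {1: 0.5, 2: 0.5}, 1: {1: 0.2, 3: 0.8}, 2: {2: 0.8, 4: 0.2},
     3: {3: 0.4, 5: 0.6}, 4: {4: 0.1, 5: 0.9}}
e = {1: {'A': 0.4, 'C': 0.1, 'G': 0.2, 'T': 0.3},
     2: {'A': 0.4, 'C': 0.1, 'G': 0.1, 'T': 0.4},
     3: {'A': 0.2, 'C': 0.3, 'G': 0.3, 'T': 0.2},
     4: {'A': 0.1, 'C': 0.4, 'G': 0.4, 'T': 0.1}}

def viterbi(x, a, e, begin=0, end=5):
    states = list(e)
    v = {begin: 1.0}
    v.update({k: 0.0 for k in states})
    ptrs = []                             # ptrs[i-1][l] = best predecessor of l at position i
    for ch in x:
        cur, back = {}, {}
        for l in states:
            # v_l(i) = e_l(x_i) * max_k v_k(i-1) a_kl, remembering the argmax
            best_k = max(v, key=lambda k: v[k] * a.get(k, {}).get(l, 0.0))
            cur[l] = e[l][ch] * v[best_k] * a.get(best_k, {}).get(l, 0.0)
            back[l] = best_k
        cur[begin] = 0.0
        v = cur
        ptrs.append(back)
    # termination: pick the best state feeding into the end state
    last = max(states, key=lambda k: v[k] * a.get(k, {}).get(end, 0.0))
    best_p = v[last] * a.get(last, {}).get(end, 0.0)
    # traceback: follow the pointers back starting at the last position
    path = [last]
    for back in reversed(ptrs[1:]):
        path.append(back[path[-1]])
    path.reverse()
    return path, best_p

path, p = viterbi("TAGA", a, e)
print(path, p)
```

Note the Viterbi probability is the probability of the single best path, so it is at most the Forward probability for the same sequence.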
Three important questions

• How likely is a given sequence?
• What is the most probable "path" for generating a given sequence?
• How can we learn the HMM parameters given a set of sequences?