# Hidden Markov Models
Transcript of Hidden Markov Models
## Hidden Markov Models
Ellen Walker, Bioinformatics
Hiram College, 2008
## State Machine to Recognize “AUG”
[Diagram: state machine with a start state, a final state, and labeled transitions.]
Each character causes a transition to the next state.
## “AUG” anywhere in a string
[Diagram: state machine accepting any string that contains “AUG”.]
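The “AUG anywhere” machine can be sketched in a few lines of code. This is a minimal illustration, not taken from the slides (the function name and the 0–3 state numbering are mine): states 0 through 3 record how much of “AUG” has been matched so far, and state 3 is the accept state.

```python
def accepts_aug(s):
    """DFA sketch: return True if the RNA string s contains "AUG"."""
    state = 0                   # states 0..3: how much of "AUG" matched so far
    for ch in s:
        if state == 3:          # accept state: once reached, stay accepted
            break
        if ch == "AUG"[state]:  # expected character: advance one state
            state += 1
        elif ch == "A":         # an A can always begin a new match
            state = 1
        else:                   # anything else falls back to the start state
            state = 0
    return state == 3
```

For example, `accepts_aug("CCAUGG")` is `True`, while `accepts_aug("AUCG")` is `False`.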
## “AUG” in frame
[Diagram: state machine recognizing “AUG” aligned to a reading frame.]
## Deterministic Finite Automaton (DFA)
- States: one start state; one or more accept states
- Transitions: one for every state, for every character
- Outputs (optional): states can emit outputs, e.g. “Stop” at an accept state
## Why DFAs?
- Every regular expression has an associated state machine that recognizes it (and vice versa)
- State machines are easy to implement in very low-level code (or hardware)
- Sometimes the state machine is easier to describe than the regular expression
## Hidden Markov Models
- Also a form of state machine
- Transitions based on probabilities, not inputs
- Every state has a (probabilistic) output, or emission
- “Hidden” because only emissions are visible, not states or transitions
## HMM vs. DFA
- A DFA is deterministic: each decision (which state next? what to output?) is fully determined by the input string
- An HMM is probabilistic: it makes both decisions based on probability distributions
## HMM vs. DFA (2)
- A DFA model is explicit and is used directly, like a program.
- An HMM model must be inferred from data. Only emissions (outputs) can be observed; the states and transitions, as well as the probability distributions for transitions and outputs, are hidden.
## HMM Example: Fair Bet Casino
- The casino has two coins, a Fair coin (F) and a Biased coin (B)
  - The fair coin has 50% H, 50% T
  - The biased coin has 75% H, 25% T
- Before each flip, with probability 10%, the dealer switches coins.
- Can you tell, based only on a sequence of H and T, which coin is used when?
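Written as data, the casino's parameters are just two small tables. The sketch below (variable and function names are mine, not from the slides) samples a flip sequence from the model; only `flips` would be visible to the player, while `states` stays hidden.

```python
import random

# Fair Bet Casino parameters, as given on the slide
TRANS = {"F": {"F": 0.9, "B": 0.1},    # 10% chance the dealer switches coins
         "B": {"F": 0.1, "B": 0.9}}
EMIT = {"F": {"H": 0.5, "T": 0.5},     # fair coin
        "B": {"H": 0.75, "T": 0.25}}   # biased coin

def sample_flips(n, start="F", seed=None):
    """Sample n flips; return the hidden coin sequence and the visible flips."""
    rng = random.Random(seed)
    state, states, flips = start, [], []
    for _ in range(n):
        states.append(state)
        flips.append("H" if rng.random() < EMIT[state]["H"] else "T")
        # with probability 0.1, the dealer switches coins before the next flip
        if rng.random() < TRANS[state]["B" if state == "F" else "F"]:
            state = "B" if state == "F" else "F"
    return states, flips
```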
## “Fair Bet Casino” HMM
[HMM diagram; image from Jones & Pevzner 2004.]
## The Decoding Problem
Given an HMM and a sequence of outputs, what is the most likely path through the HMM that generated the outputs?
## Viterbi Algorithm
- Uses dynamic programming
- Starting point: when the output string is “”, the most likely state is the start state (and there is no path)
- Taking a step: the likelihood of this state is the maximum over all ways to get here, measured as:
  - likelihood of previous state × likelihood of transition to this state × likelihood of output from this state
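The recurrence above translates directly into code. This is a sketch, not the slides' own implementation (the names are mine; the 0.5/0.5 initial probabilities match the worked example on the following slides): it fills in the likelihood table, records backpointers, and traces back the best path.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely state path for the observed outputs, via dynamic programming."""
    # V[t][s]: likelihood of the best path that ends in state s after obs[:t+1]
    V = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        V.append({})
        back.append({})
        for s in states:
            # best predecessor: previous likelihood * transition * output
            prev, p = max(((r, V[t-1][r] * trans_p[r][s] * emit_p[s][obs[t]])
                           for r in states), key=lambda rp: rp[1])
            V[t][s], back[t][s] = p, prev
    last = max(V[-1], key=V[-1].get)        # best final state
    path = [last]
    for t in range(len(obs) - 1, 0, -1):    # follow backpointers
        path.append(back[t][path[-1]])
    path.reverse()
    return path, V[-1][last]

# Fair Bet Casino parameters (initial state assumed 50/50)
STATES = ["F", "B"]
START = {"F": 0.5, "B": 0.5}
TRANS = {"F": {"F": 0.9, "B": 0.1}, "B": {"F": 0.1, "B": 0.9}}
EMIT = {"F": {"H": 0.5, "T": 0.5}, "B": {"H": 0.75, "T": 0.25}}
```

Running `viterbi("HHT", STATES, START, TRANS, EMIT)` reproduces the worked example below: the best path is B, B, B with likelihood about 0.0570.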
## Example: “HHT” (first flip)
- Initial -> F: Prev = 1, Trans = 0.5, Out = 0.5, total = 0.25
- Initial -> B: Prev = 1, Trans = 0.5, Out = 0.75, total = 0.375
- Result: F = 0.25, B = 0.375
## Example: “HHT” (second flip)
- F -> F: Prev = 0.25, Trans = 0.9, Out = 0.5, total = 0.1125
- B -> F: Prev = 0.375, Trans = 0.1, Out = 0.5, total = 0.01875
- F -> B: Prev = 0.25, Trans = 0.1, Out = 0.75, total = 0.01875
- B -> B: Prev = 0.375, Trans = 0.9, Out = 0.75, total = 0.253125
- Result: F = 0.1125, B = 0.253125
## Example: “HHT” (third flip)
- F -> F: Prev = 0.1125, Trans = 0.9, Out = 0.5, total = 0.0506
- B -> F: Prev = 0.253125, Trans = 0.1, Out = 0.5, total = 0.0127
- F -> B: Prev = 0.1125, Trans = 0.1, Out = 0.25, total = 0.00281
- B -> B: Prev = 0.253125, Trans = 0.9, Out = 0.25, total = 0.0570
- Result: F = 0.0506, B = 0.0570
## Tracing Back
- Pick the highest result from the last step, then follow the highest transition back from each previous step (just like Smith-Waterman)
- Result: initial -> B -> B -> B
- The biased coin was always used
- What if the next flip is T?
## Log Probabilities
- Probabilities become increasingly small as you multiply numbers less than one
- Computers have limits to precision
- Therefore, it is better to use a log-probability format
- Example: 1/10 × 1/10 = 1/100 (10⁻¹ × 10⁻¹ = 10⁻²), which in logs is −1 + −1 = −2
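A quick sketch of why this matters (the specific numbers are mine, not from the slides): a product of many probabilities underflows double precision, while the sum of their logs stays comfortably in range.

```python
import math

# Multiplying 10,000 probabilities of 0.5 underflows double precision...
direct = 0.5 ** 10000               # evaluates to exactly 0.0
# ...but adding 10,000 log-probabilities is perfectly stable
log10_sum = 10000 * math.log10(0.5) # about -3010.3, easily representable
```

The slide's example works the same way: log₁₀(1/10) + log₁₀(1/10) = −2, which corresponds to 1/100.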
## GC Rich Islands
- A GC rich island is an area of a genome where GC content is significantly greater than in the genome as a whole
- GC rich islands are like biased coins; we can recognize them using the same HMM:
  - the genome-wide GC content is p(H) for the fair coin
  - the (larger) island GC content is p(H) for the biased coin
  - an estimate of the probability of entering vs. leaving a GC rich island serves as the “changing coin” probability
## Probability of State Sequence, Given Output Sequence
Given an HMM and an output string, what is the probability that the HMM is in state S at time t?
- Forward: same formulation as the decoding problem, except take the sum over all paths instead of the max (times 0 to t−1)
- Backward: similar, but work from the end of the string (times t+1 to the end of the sequence)
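The forward computation is the Viterbi recurrence with a sum in place of the max. Here is a sketch (names mine), using the Fair Bet Casino parameters, that computes the total probability of an output string over all hidden state paths:

```python
def forward(obs, states, start_p, trans_p, emit_p):
    """Total probability of the outputs, summed over all hidden state paths."""
    f = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    for ch in obs[1:]:
        # sum over every predecessor state instead of taking the max
        f = {s: emit_p[s][ch] * sum(f[r] * trans_p[r][s] for r in states)
             for s in states}
    return sum(f.values())

# Fair Bet Casino parameters (initial state assumed 50/50)
STATES = ["F", "B"]
START = {"F": 0.5, "B": 0.5}
TRANS = {"F": {"F": 0.9, "B": 0.1}, "B": {"F": 0.1, "B": 0.9}}
EMIT = {"F": {"H": 0.5, "T": 0.5}, "B": {"H": 0.75, "T": 0.25}}
```

For “HHT” this gives about 0.137, necessarily at least the 0.0570 of the single best path. Combining a forward pass up to time t with a backward pass from t+1 and dividing by this total gives the probability of being in state S at time t.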
## Parameter Estimation
Given many strings, what are the parameters of the HMM that generated them?
- Assume we know the states and transitions, but not the probabilities of transitions or outputs
- This is an optimization problem
## Characteristics of an Optimization Problem
- Each potential solution has a “goodness” value (in this case, a probability)
- We want the best solution
- Perfect answer: try all possibilities (not usually possible)
- Good, but not perfect, answer: use a heuristic
## Hill Climbing (an Optimization Heuristic)
- Start with a solution (could be random)
- Consider one or more “steps”, or perturbations of the solution
- Choose the step that most improves the score
- Repeat until the score is good enough, or no better score can be reached
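The loop above can be sketched generically (all names here are mine), with a toy score function standing in for the HMM probability:

```python
def hill_climb(solution, neighbors, score, max_iters=1000):
    """Greedy hill climbing: repeatedly move to the best-scoring neighbor."""
    best = score(solution)
    for _ in range(max_iters):
        # score every candidate "step" away from the current solution
        scored = [(score(n), n) for n in neighbors(solution)]
        top_score, top = max(scored, key=lambda sn: sn[0])
        if top_score <= best:   # no step improves the score: local optimum
            break
        solution, best = top, top_score
    return solution, best

# Toy example: maximize -(x - 3)^2 over the integers, stepping by +/-1
sol, val = hill_climb(0, lambda x: [x - 1, x + 1], lambda x: -(x - 3) ** 2)
```

Starting from 0, the climber walks up to x = 3 and stops there. Note that plain hill climbing only guarantees a local optimum, which is why the slide calls it “good, but not perfect”.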
## Hill Climbing for HMMs
- Guess a state sequence
- Using the string(s), estimate the transition and emission probabilities
- Using the probabilities, generate a new state sequence with the decoding algorithm
- Repeat until the sequence stabilizes
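The estimation step can be sketched as simple counting over a guessed state labeling (function and variable names are mine; the full loop would alternate this with Viterbi decoding until the labels stop changing):

```python
from collections import Counter

def estimate_probs(state_seq, outputs):
    """Estimate transition and emission probabilities by counting
    co-occurrences in a guessed state labeling of the output string."""
    trans_counts = Counter(zip(state_seq, state_seq[1:]))  # (from, to) pairs
    emit_counts = Counter(zip(state_seq, outputs))         # (state, output) pairs
    from_totals = Counter(state_seq[:-1])   # times each state was left
    state_totals = Counter(state_seq)       # times each state emitted
    trans = {pair: c / from_totals[pair[0]] for pair, c in trans_counts.items()}
    emit = {pair: c / state_totals[pair[0]] for pair, c in emit_counts.items()}
    return trans, emit
```

For example, labeling the flips “HTHH” with the guessed states “FFBB” yields p(F→B) = 0.5 and p(H | B) = 1.0.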
## HMM for Sequence Profiles
- Three kinds of states: insertion, deletion, and match
- Probability estimates indicate how often each occurs
- Logos are direct representations of HMMs in this format