Hidden Markov Modelshisto.ucsf.edu/BMS270/BMS270a_2012/slides/Slides05_HMM.pdf · Hidden Markov...

Post on 25-Sep-2020

5 views 0 download

Transcript of Hidden Markov Modelshisto.ucsf.edu/BMS270/BMS270a_2012/slides/Slides05_HMM.pdf · Hidden Markov...

Hidden Markov Models

Mark Voorhies

4/2/2012

Mark Voorhies Hidden Markov Models

Searching with PSI-BLAST

Mark Voorhies Hidden Markov Models

0th order Markov Model

Mark Voorhies Hidden Markov Models

1st order Markov Model

Mark Voorhies Hidden Markov Models

1st order Markov Model

Mark Voorhies Hidden Markov Models

1st order Markov Model

Mark Voorhies Hidden Markov Models

What are Markov Models good for?

Background sequence composition

Spam

Mark Voorhies Hidden Markov Models

Hidden Markov Models

Mark Voorhies Hidden Markov Models

Hidden Markov Models

Mark Voorhies Hidden Markov Models

Hidden Markov Models

Mark Voorhies Hidden Markov Models

Hidden Markov Models

Mark Voorhies Hidden Markov Models

Hidden Markov Models

Mark Voorhies Hidden Markov Models

Hidden Markov Model

Mark Voorhies Hidden Markov Models

The Viterbi algorithm: Alignment

Mark Voorhies Hidden Markov Models

The Viterbi algorithm: Alignment

Dynamic programming, likeSmith-Waterman

Sums best log probabilitiesof emissions and transitions(i.e., multiplyingindependent probabilities)

Result is most likelyannotation of the targetwith hidden states

Mark Voorhies Hidden Markov Models

The Forward algorithm: Net probability

Probability-weighted sumover all possible paths

Simple modification ofViterbi (although summingprobabilities means we haveto be more careful aboutrounding error)

Result is the probability thatthe observed sequence isexplained by the model

In practice, this probabilityis compared to that of a nullmodel (e.g., randomgenomic sequence)

Mark Voorhies Hidden Markov Models

Training an HMM

If we have a set of sequenceswith known hidden states(e.g., from experiment),then we can calculate theemission and transitionprobabilities directly

Otherwise, they can beiteratively fit to a set ofunlabeled sequences that areknown to be true matchesto the model

The most common fittingprocedure is theBaum-Welch algorithm, aspecial case of expectationmaximization (EM)

Mark Voorhies Hidden Markov Models

Training an HMM

If we have a set of sequenceswith known hidden states(e.g., from experiment),then we can calculate theemission and transitionprobabilities directly

Otherwise, they can beiteratively fit to a set ofunlabeled sequences that areknown to be true matchesto the model

The most common fittingprocedure is theBaum-Welch algorithm, aspecial case of expectationmaximization (EM)

Mark Voorhies Hidden Markov Models

Training an HMM

If we have a set of sequenceswith known hidden states(e.g., from experiment),then we can calculate theemission and transitionprobabilities directly

Otherwise, they can beiteratively fit to a set ofunlabeled sequences that areknown to be true matchesto the model

The most common fittingprocedure is theBaum-Welch algorithm, aspecial case of expectationmaximization (EM)

Mark Voorhies Hidden Markov Models

Profile Alignments: Plan 7

(Image from Sean Eddy, PLoS Comp. Biol. 4:e1000069)

Mark Voorhies Hidden Markov Models

Profile Alignments: Plan 7 (from Outer Space)

(Image from Sean Eddy, PLoS Comp. Biol. 4:e1000069)

Mark Voorhies Hidden Markov Models

Rigging Plan 7 for Multi-Hit Alignment

(Image from Sean Eddy, PLoS Comp. Biol. 4:e1000069)

Mark Voorhies Hidden Markov Models

HMMer3 speeds

Eddy, PLoSCompBiol 7:e1002195

Mark Voorhies Hidden Markov Models

HMMer3 sensitivity and specificity

Eddy, PLoSCompBiol 7:e1002195

Mark Voorhies Hidden Markov Models

Homework

Compare the performance of BLASTP, PSI-BLAST, phmmer,and jackhmmer on a difficult sequence such as AGA1p(CAA96325.1). Use the shuffling tool on the course websiteto generate negative controls with the same composition. Forpositive controls, see Euk. Cell 5:628.

Download Cluster3 and JavaTreeView

Read PNAS 95:14863

Mark Voorhies Hidden Markov Models

Stochastic Context Free Grammars

Can emit from both sides → base pairs

Can duplicate emitter → bifurcations

Mark Voorhies Hidden Markov Models

INFERNAL/Rfam

Modified from the INFERNAL User Guide – Nawrocki, Kolbe, and Eddy

Mark Voorhies Hidden Markov Models

INFERNAL/Rfam

Modified from the INFERNAL User Guide – Nawrocki, Kolbe, and Eddy

Mark Voorhies Hidden Markov Models

INFERNAL/Rfam

Modified from the INFERNAL User Guide – Nawrocki, Kolbe, and Eddy

Mark Voorhies Hidden Markov Models

INFERNAL/Rfam

Modified from the INFERNAL User Guide – Nawrocki, Kolbe, and Eddy

Mark Voorhies Hidden Markov Models

INFERNAL/Rfam

Modified from the INFERNAL User Guide – Nawrocki, Kolbe, and Eddy

Mark Voorhies Hidden Markov Models

INFERNAL/Rfam

Modified from the INFERNAL User Guide – Nawrocki, Kolbe, and Eddy

Mark Voorhies Hidden Markov Models

INFERNAL/Rfam

Modified from the INFERNAL User Guide – Nawrocki, Kolbe, and Eddy

Mark Voorhies Hidden Markov Models