Thomas Jellema & Wouter Van Gool 1 Question. 2Answer.


Page 1: Question

Page 2: Answer

Page 3: Pairwise alignment using HMMs

Wouter van Gool and Thomas Jellema

Page 4: Contents

• Most probable path (Thomas)
• Probability of an alignment (Thomas)
• Sub-optimal alignments (Thomas)
• Pause
• Posterior probability that x_i is aligned to y_j (Wouter)
• Pair HMMs versus FSAs for searching (Wouter)
• Conclusion and summary (Wouter)
• Questions

Page 5: 4.1 Most probable path

Model that emits a single sequence

Page 6: 4.1 Most probable path

Begin and end state

Page 7: 4.1 Most probable path

Model that emits a pairwise alignment

Page 8: 4.1 Most probable path

Example of a sequence

Seq1: A C T _ C
Seq2: T _ G G C
All:  M X M Y M
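
The example above can be reproduced with a minimal sketch that reads the state of each alignment column. The function name `state_path` and the `_` gap symbol are assumptions made for illustration; X is taken to mean "letter of Seq1 against a gap" and Y "letter of Seq2 against a gap", as in the slide.

```python
def state_path(row_x: str, row_y: str, gap: str = "_") -> str:
    """Return the pair-HMM state (M, X or Y) for every alignment column."""
    if len(row_x) != len(row_y):
        raise ValueError("aligned rows must have equal length")
    states = []
    for a, b in zip(row_x, row_y):
        if a == gap and b == gap:
            raise ValueError("a column cannot contain two gaps")
        if b == gap:
            states.append("X")  # letter of x aligned to a gap in y
        elif a == gap:
            states.append("Y")  # letter of y aligned to a gap in x
        else:
            states.append("M")  # two letters aligned to each other
    return "".join(states)


print(state_path("ACT_C", "T_GGC"))  # prints MXMYM
```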

Page 9: 4.1 Most probable path

Begin and end state

Page 10: 4.1 Most probable path

Finding the most probable path
- The path you choose is the path that has the highest probability of being the correct alignment.
- The state we choose to be part of the alignment has to be the state with the highest probability of being correct.
- We calculate the probability of the state being an M, X or Y and choose the one with the highest probability.
- If the probability of ending the alignment is higher than that of the next state being an M, X or Y, then we end the alignment.

Page 11: 4.1 Most probable path

The probability of emitting an M is the highest probability of:
1. previous state X, new state M
2. previous state Y, new state M
3. previous state M, new state M

Page 12: 4.1 Most probable path

Probability of going to the M state

Page 13: 4.1 Most probable path

Viterbi algorithm for pair HMMs
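
The algorithm on this slide did not survive extraction, so the following is only a sketch of a Viterbi recursion for a three-state pair HMM, not the authors' own formulation. It assumes the parametrisation used in the "Model" transition table of this deck (gap-open δ, gap-extend ε, end probability τ), match emissions `p[(a, b)]` and single-letter emissions `q[a]`, and it works in log space. The function name and argument layout are hypothetical.

```python
import math

NEG_INF = float("-inf")


def viterbi_pair_hmm(x, y, p, q, delta, epsilon, tau):
    """Return (state_path, log_probability) of the best alignment of x and y."""
    n, m = len(x), len(y)
    # Log transition probabilities, keyed by (from_state, to_state).
    t = {
        ("M", "M"): math.log(1 - 2 * delta - tau),
        ("X", "M"): math.log(1 - epsilon - tau),
        ("Y", "M"): math.log(1 - epsilon - tau),
        ("M", "X"): math.log(delta),
        ("X", "X"): math.log(epsilon),
        ("M", "Y"): math.log(delta),
        ("Y", "Y"): math.log(epsilon),
    }
    v = {s: [[NEG_INF] * (m + 1) for _ in range(n + 1)] for s in "MXY"}
    ptr = {s: [[None] * (m + 1) for _ in range(n + 1)] for s in "MXY"}
    v["M"][0][0] = 0.0  # assumption: the begin state behaves like M

    def best_prev(sources, target, i, j):
        # Pick the predecessor state that maximises the score of entering target.
        prev = max(sources, key=lambda s: v[s][i][j] + t[(s, target)])
        return prev, v[prev][i][j] + t[(prev, target)]

    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:  # M emits the pair (x_i, y_j)
                prev, score = best_prev("MXY", "M", i - 1, j - 1)
                v["M"][i][j] = math.log(p[(x[i - 1], y[j - 1])]) + score
                ptr["M"][i][j] = prev
            if i > 0:  # X emits x_i against a gap
                prev, score = best_prev("MX", "X", i - 1, j)
                v["X"][i][j] = math.log(q[x[i - 1]]) + score
                ptr["X"][i][j] = prev
            if j > 0:  # Y emits y_j against a gap
                prev, score = best_prev("MY", "Y", i, j - 1)
                v["Y"][i][j] = math.log(q[y[j - 1]]) + score
                ptr["Y"][i][j] = prev

    # Termination: best final state, then the transition to END.
    state = max("MXY", key=lambda s: v[s][n][m])
    log_prob = v[state][n][m] + math.log(tau)

    # Traceback from (n, m) to (0, 0).
    path, i, j = [], n, m
    while i > 0 or j > 0:
        path.append(state)
        prev = ptr[state][i][j]
        if state == "M":
            i, j = i - 1, j - 1
        elif state == "X":
            i -= 1
        else:
            j -= 1
        state = prev
    return "".join(reversed(path)), log_prob
```

The returned state string uses the same M/X/Y alphabet as the alignment example earlier in the deck. Working in log space avoids underflow on long sequences.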

Page 14: 4.1 Most probable path

Finding the most probable path using FSAs

- The most probable path is also the optimal FSA alignment

Page 15: 4.1 Most probable path

Finding the most probable path using FSAs

Page 16: 4.1 Most probable path

Recurrence relations

Page 17: 4.1 Most probable path

The log-odds scoring function

We wish to know whether the alignment score is above or below the score of a random alignment.

The log-odds ratio is s(a,b) = log(p_ab / (q_a q_b)).

s(a,b) > 0 iff the probability that a and b are related under our model is larger than the probability that they are picked at random.
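
As a small worked illustration of the log-odds ratio, the sketch below evaluates s(a,b) for made-up probabilities. The numbers are assumptions chosen for the example, not values from the slides.

```python
import math


def log_odds(p_ab: float, q_a: float, q_b: float) -> float:
    """Log-odds score s(a,b) = log(p_ab / (q_a * q_b))."""
    return math.log(p_ab / (q_a * q_b))


# Hypothetical values: uniform background q = 0.25 over four letters.
match_score = log_odds(0.19, 0.25, 0.25)     # pair seen more often than chance
mismatch_score = log_odds(0.02, 0.25, 0.25)  # pair seen less often than chance
print(match_score > 0, mismatch_score < 0)
```

A pair that occurs exactly as often as chance predicts (p_ab = q_a q_b) scores zero.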

Page 18: 4.1 Most probable path

Random model

Page 19: 4.1 Most probable path

Transition probabilities (row = from state, column = to state, - = no transition)

"Model"

        M         X    Y    END
M       1-2δ-τ    δ    δ    τ
X       1-ε-τ     ε    -    τ
Y       1-ε-τ     -    ε    τ
END     -         -    -    1

"Random"

        X      Y      END
X       1-η    η      -
Y       -      1-η    η
END     -      -      1

Page 20: 4.1 Most probable path

Transitions

Page 21: 4.1 Most probable path

Transitions

Page 22: 4.1 Most probable path

Optimal log-odds alignment
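
The equations for this slide were lost, so the sketch below assumes the standard textbook conversion of the pair-HMM and random-model parameters into a substitution score s(a,b), a gap-open penalty d and a gap-extension penalty e. The function name and the example numbers are hypothetical.

```python
import math


def log_odds_parameters(p, q, delta, epsilon, tau, eta):
    """Assumed conversion of pair-HMM parameters into alignment scores.

    s(a,b) = log(p_ab / (q_a q_b)) + log((1 - 2*delta - tau) / (1 - eta)**2)
    d      = -log(delta * (1 - epsilon - tau) / ((1 - eta) * (1 - 2*delta - tau)))
    e      = -log(epsilon / (1 - eta))
    """
    correction = math.log((1 - 2 * delta - tau) / (1 - eta) ** 2)
    s = {
        (a, b): math.log(p_ab / (q[a] * q[b])) + correction
        for (a, b), p_ab in p.items()
    }
    d = -math.log(delta * (1 - epsilon - tau) / ((1 - eta) * (1 - 2 * delta - tau)))
    e = -math.log(epsilon / (1 - eta))
    return s, d, e


# Hypothetical parameters for a four-letter alphabet.
letters = "ACGT"
p = {(a, b): (0.19 if a == b else 0.02) for a in letters for b in letters}
q = {a: 0.25 for a in letters}
s, d, e = log_odds_parameters(p, q, delta=0.1, epsilon=0.3, tau=0.05, eta=0.05)
print(round(s[("A", "A")], 3), round(d, 3), round(e, 3))
```

With these scores, a standard affine-gap dynamic programming alignment would rank alignments in the same order as the log-odds of the Viterbi path, up to a constant.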

Page 23: 4.1 Most probable path

A pair HMM for local alignment

Page 25: 4.2 Probability of an alignment

Probability that a given pair of sequences are related.

Page 26: 4.2 Probability of an alignment

Summing the probabilities
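
Summing over all alignments is usually done with the forward algorithm, which replaces the maximum in the Viterbi recursion with a sum. The sketch below assumes the same three-state pair HMM (δ, ε, τ, match emissions `p`, single-letter emissions `q`); names are hypothetical, and it works with raw probabilities, so it would underflow on long sequences.

```python
def forward_pair_hmm(x, y, p, q, delta, epsilon, tau):
    """Return P(x, y): the summed probability of all alignments of x and y."""
    n, m = len(x), len(y)
    f = {s: [[0.0] * (m + 1) for _ in range(n + 1)] for s in "MXY"}
    f["M"][0][0] = 1.0  # assumption: the begin state behaves like M
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:  # M emits the pair (x_i, y_j)
                f["M"][i][j] = p[(x[i - 1], y[j - 1])] * (
                    (1 - 2 * delta - tau) * f["M"][i - 1][j - 1]
                    + (1 - epsilon - tau)
                    * (f["X"][i - 1][j - 1] + f["Y"][i - 1][j - 1])
                )
            if i > 0:  # X emits x_i against a gap
                f["X"][i][j] = q[x[i - 1]] * (
                    delta * f["M"][i - 1][j] + epsilon * f["X"][i - 1][j]
                )
            if j > 0:  # Y emits y_j against a gap
                f["Y"][i][j] = q[y[j - 1]] * (
                    delta * f["M"][i][j - 1] + epsilon * f["Y"][i][j - 1]
                )
    # Termination: every state can move to END with probability tau.
    return tau * (f["M"][n][m] + f["X"][n][m] + f["Y"][n][m])
```

Dividing the probability of a single alignment by this total gives the relative probability of that alignment.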

Page 27: 4.2 Probability of an alignment

Page 29: 4.3 Suboptimal alignment

Finding suboptimal alignments

How to make sample alignments?

Page 30: 4.3 Suboptimal alignment

Finding distinct suboptimal alignments

Page 33: Posterior probability that x_i is aligned to y_j

Local accuracy of an alignment?
Reliability measure for each part of an alignment
HMM as a local alignment measure
Idea: P(all alignments through (x_i, y_j)) / P(all alignments of (x, y))

Page 34: Posterior probability that x_i is aligned to y_j

Notation: x_i ◊ y_j means x_i is aligned to y_j
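
The formulas that followed this notation were lost in extraction. Under the usual forward-backward formulation (an assumption here, with f^M and b^M the forward and backward values of the match state and f^E(n,m) the total probability of the pair), the posterior can be written as:

```latex
\[
P(x_i \diamond y_j \mid x, y)
  = \frac{P(x, y, x_i \diamond y_j)}{P(x, y)}
  = \frac{f^{M}(i,j)\, b^{M}(i,j)}{f^{E}(n,m)}
\]
```

The numerator is the summed probability of all alignments that pass through the pair (x_i, y_j), and the denominator is the summed probability of all alignments of x and y.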

Page 35: Posterior probability that x_i is aligned to y_j

Page 36: Posterior probability that x_i is aligned to y_j

Page 37: Probability alignment

Miyazawa: it seems attractive to find the alignment by maximising P(x_i ◊ y_j)

May lead to inconsistencies:
e.g. pairs (i1,j1) & (i2,j2) with i2 > i1 and j2 < j1

Restriction to pairs (i,j) for which P(x_i ◊ y_j) > 0.5

Page 38: Posterior probability that x_i is aligned to y_j

The expected accuracy of an alignment

Expected overlap between π and paths sampled from the posterior distribution:

\[ A(\pi) = \sum_{(i,j) \in \pi} P(x_i \diamond y_j) \]

Dynamic programming:

\[ A(i,j) = \max \begin{cases} A(i-1,j-1) + P(x_i \diamond y_j), \\ A(i-1,j), \\ A(i,j-1) \end{cases} \]
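
The recursion for A(i,j) can be sketched as a small dynamic programme over a matrix of posterior probabilities. The input layout (`post[i][j]` holding P(x_{i+1} ◊ y_{j+1})) and the function name are assumptions for illustration; how the posteriors are obtained is outside this sketch.

```python
def max_expected_accuracy(post):
    """Return (A(n,m), aligned pairs) for a posterior matrix.

    post[i][j] is the posterior that x_{i+1} is aligned to y_{j+1};
    the returned pairs are 1-based (i, j) index pairs.
    """
    n = len(post)
    m = len(post[0]) if n else 0
    A = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            A[i][j] = max(
                A[i - 1][j - 1] + post[i - 1][j - 1],  # align x_i with y_j
                A[i - 1][j],                           # skip x_i
                A[i][j - 1],                           # skip y_j
            )
    # Traceback: take a pair only when skipping would lose expected accuracy.
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        if A[i][j] == A[i - 1][j]:
            i -= 1
        elif A[i][j] == A[i][j - 1]:
            j -= 1
        else:
            pairs.append((i, j))
            i, j = i - 1, j - 1
    pairs.reverse()
    return A[n][m], pairs


score, pairs = max_expected_accuracy([[0.9, 0.05], [0.05, 0.8]])
print(pairs)  # prints [(1, 1), (2, 2)]
```

Because the traceback only moves up, left or diagonally, the selected pairs can never cross, which avoids the inconsistency that arises when each pair is maximised independently.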

Page 41: Pair HMMs versus FSAs for searching

P(D | M) > P(M | D)

HMM: maximum data likelihood by giving the same parameters (i.e. transition and emission probabilities)

Bayesian model comparison with random model R

Page 42: Pair HMMs versus FSAs for searching

Problems:
1. Most algorithms do not compute the full probability P(x,y | M) but only the best match or Viterbi path
2. FSA parameters may not be readily translated into probabilities

Page 43: Pair HMMs vs FSAs for searching

Example: a model whose parameters match the data need not be the best model

[Figure: an HMM with a state S (reached with probability α, emitting letters with probabilities q_a) and a path B (reached with probability 1-α) that emits a b a c with transition probabilities 1]

P_S(abac) = α^4 q_a q_b q_a q_c

P_B(abac) = 1-α

Model comparison using the best match rather than the total probability
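
The two path probabilities above can be compared numerically with a short sketch. The function name, the value of α and the letter frequencies are assumptions chosen for illustration; only the two formulas come from the slide.

```python
def path_probabilities(seq, q, alpha, motif="abac"):
    """Return (P_S, P_B): probability of seq via state S and via path B.

    P_S multiplies alpha * q[c] for every letter, as in the slide's
    P_S(abac) = alpha^4 q_a q_b q_a q_c. Path B emits only the motif,
    with probability 1 - alpha.
    """
    p_s = 1.0
    for c in seq:
        p_s *= alpha * q[c]
    p_b = (1 - alpha) if seq == motif else 0.0
    return p_s, p_b


# Hypothetical letter frequencies and alpha.
q = {"a": 0.5, "b": 0.25, "c": 0.25}
p_s, p_b = path_probabilities("abac", q, alpha=0.9)
best = "B" if p_b > p_s else "S"
print(p_s, p_b, best)
```

With these numbers the best single path for abac goes through B, but as α approaches 1 the comparison flips, which is why a decision based on the best match alone can disagree with one based on total probability.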

Page 44: Pair HMMs vs FSAs for searching

Problem: no fixed scaling procedure can make the scores of this model into the log probabilities of an HMM

Page 45: Pair HMMs vs FSAs for searching

Bayesian model comparison: both HMMs have the same log-odds ratio as the previous FSA

Page 46: Pair HMMs vs FSAs for searching

Conversion of an FSA into a probabilistic model
- Probabilistic models may underperform standard alignment methods if Viterbi is used for database searching.
- But if the forward algorithm is used, it would be better than standard methods.

Page 48: Why try to use HMMs?

Many complicated alignment algorithms can be described as simple Finite State Machines.

HMMs have many advantages:
- Parameters can be trained to fit the data: no need for PAM/BLOSUM matrices
- HMMs can keep track of all alignments, not just the best one

Page 49: New things we can do with pair HMMs

- Compute probability over all alignments.
- Compute relative probability of Viterbi alignment (or any other alignment).
- Sample over all alignments in proportion to their probability.
- Find distinct sub-optimal alignments.
- Compute reliability of each part of the best alignment.
- Compute the maximally reliable alignment.

Page 50: Conclusion

Pair HMMs work better for sequence alignment and database search than penalty-score-based alignment algorithms.

Unfortunately both approaches are O(mn) and hence too slow for large database searches!
