Post on 19-Dec-2015

CS273a Lecture 8, Win07, Batzoglou

Evolution at the DNA level

…ACGGTGCAGTTACCA…

…AC----CAGTCCACCA…

SEQUENCE EDITS: Mutation, Deletion

REARRANGEMENTS: Inversion, Translocation, Duplication

Evolutionary Rates

[Figure: outcomes labeled OK, OK, OK, X, X, and Still OK?, carried into the next generation]

Genome Evolution – Macro Events

• Inversions
• Deletions
• Duplications

Synteny maps

Comparison of human and mouse

Synteny maps

Orthology, Paralogy, Inparalogs, Outparalogs

Synteny maps

Dog Genome

Synteny maps

Building synteny maps

Recommended local aligners

• BLASTZ
  Most accurate, especially for genes
  Chains local alignments

• WU-BLAST
  Good tradeoff of efficiency/sensitivity
  Best command-line options

• BLAT
  Fast, less sensitive
  Good for
  • comparing very similar sequences
  • finding rough homology map

Index-based local alignment

Dictionary:
  All words of length k (~10)
  Alignment initiated between words of alignment score ≥ T (typically T = k)

Alignment:
  Ungapped extensions until score below statistical threshold

Output:
  All local alignments with score > statistical threshold
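The dictionary, seeding, and ungapped-extension steps above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of any particular aligner: it seeds on exact k-word matches (the T = k case with a match score of 1), and it replaces the statistical threshold with two hypothetical parameters, an `xdrop` cutoff that ends an extension and a fixed score `threshold` for reporting. The function names and all scoring values are assumptions.

```python
from collections import defaultdict

def build_index(db, k):
    """Dictionary: every word of length k in db -> list of start positions."""
    index = defaultdict(list)
    for i in range(len(db) - k + 1):
        index[db[i:i + k]].append(i)
    return index

def extend(query, db, qi, di, k, match=1, mismatch=-1, xdrop=3):
    """Ungapped extension of an exact k-word seed in both directions.

    Each direction stops once the running score falls xdrop below the
    best score seen (an assumed stand-in for the statistical threshold).
    Returns (score, query_start, db_start, length).
    """
    def walk(step, q, d):
        best = score = 0
        best_len = n = 0
        while 0 <= q < len(query) and 0 <= d < len(db):
            score += match if query[q] == db[d] else mismatch
            n += 1
            if score > best:
                best, best_len = score, n
            elif best - score >= xdrop:
                break
            q += step
            d += step
        return best, best_len

    right, rlen = walk(1, qi + k, di + k)
    left, llen = walk(-1, qi - 1, di - 1)
    return (k * match + left + right, qi - llen, di - llen, k + llen + rlen)

def local_alignments(query, db, k=4, threshold=6):
    """Scan the query against the index and report extensions scoring > threshold."""
    index = build_index(db, k)
    hits = set()  # seeds on the same diagonal collapse to one extension
    for qi in range(len(query) - k + 1):
        for di in index.get(query[qi:qi + k], []):
            hit = extend(query, db, qi, di, k)
            if hit[0] > threshold:
                hits.add(hit)
    return sorted(hits, reverse=True)

print(local_alignments("CCACGTACGTCC", "GGGACGTACGTTTT"))
```

In this toy input the shared segment `ACGTACGT` is found from several seeds, all of which extend to the same ungapped alignment, so a single hit is reported.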

[Figure: query scanned against the DB]

Question: Using an idea from overlap detection, better way to find all local alignments between two genomes?

Local Alignments

After chaining

Chaining local alignments

1. Find local alignments

2. Chain: O(N log N) L.I.S. (longest increasing subsequence)

3. Restricted DP
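Step 2 can be illustrated with a standard O(N log N) longest-increasing-subsequence routine. The sketch below assumes each local alignment is reduced to a hypothetical anchor `(q, t)`, its start in the first and second genome, and it maximizes the number of anchors in the chain rather than a weighted alignment score, which is a simplification.

```python
from bisect import bisect_left

def chain_anchors(anchors):
    """Longest chain of (q, t) anchors strictly increasing in both coordinates."""
    # Sort by q; ties broken by descending t so two anchors with the same q
    # can never both appear in a strictly increasing chain.
    order = sorted(anchors, key=lambda a: (a[0], -a[1]))
    tails = []       # tails[i]: smallest final t of any chain of length i + 1
    tails_idx = []   # index into `order` of the anchor stored in tails[i]
    prev = [-1] * len(order)  # back-pointers for recovering the chain
    for i, (_, t) in enumerate(order):
        pos = bisect_left(tails, t)  # binary search: O(log N) per anchor
        if pos == len(tails):
            tails.append(t)
            tails_idx.append(i)
        else:
            tails[pos] = t
            tails_idx[pos] = i
        prev[i] = tails_idx[pos - 1] if pos > 0 else -1
    chain = []
    i = tails_idx[-1] if tails_idx else -1
    while i != -1:
        chain.append(order[i])
        i = prev[i]
    return chain[::-1]

anchors = [(10, 12), (20, 5), (30, 35), (40, 50), (50, 42), (60, 70)]
print(chain_anchors(anchors))
```

The chained anchors then bound the region in which step 3 runs its restricted dynamic programming.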

Progressive Alignment

• When evolutionary tree is known:

  Align closest first, in the order of the tree
  In each step, align two sequences x, y, or profiles px, py, to generate a new alignment with associated profile presult

Weighted version:
  Tree edges have weights, proportional to the divergence in that edge
  New profile is a weighted average of two old profiles

[Figure: evolutionary tree with leaves x, w, y, z]

Example

Profile: (A, C, G, T, -)
px = (0.8, 0.2, 0, 0, 0)
py = (0.6, 0, 0, 0, 0.4)

s(px, py) = 0.8*0.6*s(A, A) + 0.2*0.6*s(C, A) + 0.8*0.4*s(A, -) + 0.2*0.4*s(C, -)

Result: pxy = (0.7, 0.1, 0, 0, 0.2)

s(px, -) = 0.8*1.0*s(A, -) + 0.2*1.0*s(C, -)

Result: px- = (0.4, 0.1, 0, 0, 0.5)
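The example can be checked numerically with a short sketch. The slide leaves the scoring function s(a, b) unspecified, so the values used here (match +1, mismatch -1, gap -2, gap against gap 0) are purely hypothetical, as are the function names. The merge uses equal weights, which reproduces the averaged profiles shown above; the weighted version would pass weights derived from the tree edges instead.

```python
ALPHABET = "ACGT-"

def s(a, b, match=1.0, mismatch=-1.0, gap=-2.0):
    """Hypothetical pairwise score; the slide does not give actual values."""
    if a == "-" and b == "-":
        return 0.0
    if a == "-" or b == "-":
        return gap
    return match if a == b else mismatch

def profile_score(px, py):
    """Expected score of aligning two profile columns: sum of px[a] * py[b] * s(a, b)."""
    return sum(px[i] * py[j] * s(a, b)
               for i, a in enumerate(ALPHABET)
               for j, b in enumerate(ALPHABET))

def merge(px, py, wx=0.5, wy=0.5):
    """New profile column as a weighted average of the two old columns."""
    return tuple(wx * x + wy * y for x, y in zip(px, py))

px = (0.8, 0.2, 0.0, 0.0, 0.0)
py = (0.6, 0.0, 0.0, 0.0, 0.4)
gap_column = (0.0, 0.0, 0.0, 0.0, 1.0)  # aligning px against a gap

print(profile_score(px, py), merge(px, py))
print(profile_score(px, gap_column), merge(px, gap_column))
```

Only the four terms written out in s(px, py) contribute, because every other product of profile entries is zero.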

Threaded Blockset Aligner

Human–Cow

HMR – CD
Restricted Area
Profile Alignment

Neutral Substitution Rates

Reconstructing the Ancestral Mammalian Genome

Human: C

Baboon: C

Cat: C

Dog: G

[Figure: tree nodes labeled C, C or G, and G]

Finding Conserved Elements (1)

• Binomial method
  25-bp window in the human genome
  Binomial distribution of k matches in N bases given the neutral probability of substitution
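One way to read the binomial method is as a sliding-window tail probability. The sketch below makes several assumptions beyond the slide: a gap-free pairwise alignment of human against one other species, a single hypothetical neutral substitution probability, and the upper tail P(X ≥ k) as the reported statistic. Function and parameter names are illustrative.

```python
from math import comb

def binomial_tail(k, n, p_match):
    """P(X >= k) for X ~ Binomial(n, p_match): chance of k or more matches."""
    return sum(comb(n, i) * p_match**i * (1 - p_match)**(n - i)
               for i in range(k, n + 1))

def window_pvalues(human, other, neutral_sub_prob, window=25):
    """Slide a window along two aligned, equal-length sequences.

    Returns (start, matches, p) per window, where p is the probability of
    seeing at least that many matches if every site evolved neutrally.
    """
    p_match = 1 - neutral_sub_prob
    results = []
    for start in range(len(human) - window + 1):
        h = human[start:start + window]
        o = other[start:start + window]
        k = sum(a == b for a, b in zip(h, o))
        results.append((start, k, binomial_tail(k, window, p_match)))
    return results

# Hypothetical neutral substitution probability of 0.3 per site.
seq = "ACGTTGCAACGTAGCTAGGCTAACG"
print(window_pvalues(seq, seq, neutral_sub_prob=0.3))
```

Windows with a small tail probability have more matches than the neutral model predicts and are candidates for conserved elements.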

Finding Conserved Elements (2)

• Parsimony Method
  Count minimum # of mutations explaining each column
  Assign a probability to this parsimony score given neutral model
  Multiply probabilities across 25-bp window of human genome

[Figure: example alignment column, labels A and CAAG]
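The per-column count can be computed with Fitch's small-parsimony algorithm, sketched below under stated assumptions: the tree topology ((Human, Baboon), (Cat, Dog)) is hypothetical, and so is the table `neutral_prob` mapping a parsimony score to its probability under the neutral model, which in practice would have to be derived from that model. Probabilities are multiplied across the window by summing logs.

```python
from math import log

def fitch(tree, column):
    """Fitch small parsimony on one alignment column.

    tree: a species name (leaf) or a pair (left, right) of subtrees.
    column: dict mapping species name -> base.
    Returns (candidate ancestral states, minimum number of mutations).
    """
    if isinstance(tree, str):
        return {column[tree]}, 0
    left_states, left_cost = fitch(tree[0], column)
    right_states, right_cost = fitch(tree[1], column)
    common = left_states & right_states
    if common:
        return common, left_cost + right_cost
    # Disjoint child sets: one mutation is needed on this split.
    return left_states | right_states, left_cost + right_cost + 1

def window_log_prob(tree, columns, neutral_prob):
    """Log of the product of per-column parsimony-score probabilities."""
    return sum(log(neutral_prob[fitch(tree, col)[1]]) for col in columns)

tree = (("Human", "Baboon"), ("Cat", "Dog"))          # assumed topology
neutral_prob = {0: 0.2, 1: 0.4, 2: 0.3, 3: 0.1}       # hypothetical values

col = {"Human": "C", "Baboon": "C", "Cat": "C", "Dog": "G"}
print(fitch(tree, col))
print(fitch(tree[1], col))
```

On the Human/Baboon/Cat/Dog column from the ancestral reconstruction slide, the Cat/Dog ancestor comes out as C or G and the whole column needs one mutation.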

Finding Conserved Elements

Finding Conserved Elements (3)

GERP

Phylo HMMs

HMM

Phylogenetic Tree Model

Phylo HMM

Finding Conserved Elements (3)

How do the methods agree/disagree?

Statistical Power to Detect Constraint

[Figure labels: L, N]

C: cutoff # mutations
D: neutral mutation rate
[symbol not recovered]: constraint mutation rate relative to neutral
