Models of DNA evolution
How does DNA change, and how can we obtain distances?
The Jukes-Cantor model
Thomas H. Jukes (1906-1999)
King JL, Jukes TH. 1969. Non-Darwinian evolution. Science 164: 788-798.
Charles R. Cantor (born 1942)
[Figure: the four bases A, G, C and T, with each arrow between a pair of bases labelled u]
in the JC model, each base in the sequence has an equal chance, u, of changing into one of the three other bases
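[Figure: the four bases A, G, C and T, with each arrow between a pair of bases labelled u/3]
as a fiction, each base has a chance of (4/3)u of changing to a base randomly drawn from all 4 possibilities, its current state included; each specific change then has a chance of u/3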
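The arithmetic behind this fiction can be checked with a minimal sketch. The function name `jc_event_rates` is made up for illustration, and exact fractions are used only to keep the thirds exact.

```python
from fractions import Fraction


def jc_event_rates(u):
    """Split the fictional event rate (4/3)u into its outcomes."""
    event_rate = Fraction(4, 3) * u             # rate of the fictional "redraw" event
    to_each_base = event_rate * Fraction(1, 4)  # each of the 4 bases is drawn with probability 1/4
    real_change = 3 * to_each_base              # only draws of the 3 other bases change the site
    return event_rate, to_each_base, real_change


event_rate, to_each_base, real_change = jc_event_rates(Fraction(1))
print(event_rate, to_each_base, real_change)    # rates in units of u
```

With u set to 1, the rate to each specific base comes out as 1/3 and the rate of real change as 1, so the fiction reproduces the u/3 arrows of the diagram.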
$\Pr(Y = y) = \dfrac{e^{-m}\, m^{y}}{y!}$
537 hits on 576 squares; average hits per square = 537/576 = 0.9323
probability of not being hit = $e^{-0.9323} \times 0.9323^{0} / 0! = 0.3936$
probability of being hit once = $e^{-0.9323} \times 0.9323^{1} / 1! = 0.3670$
probability of being hit twice = $e^{-0.9323} \times 0.9323^{2} / 2! = 0.1711$
probability of being hit four times = $e^{-0.9323} \times 0.9323^{4} / 4! = 0.012$
the probability of no event is given by the zero term of a Poisson distribution

hits per square               not at all  1x      2x     3x     4x    5+
expected number of squares    226.74      211.39  98.54  30.62  7.13  1.6
observed number of squares    229         211     93     35     7     1
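The expected counts can be recomputed from the two totals given on the slide. This is a minimal sketch; the helper name `poisson_pmf` is an illustrative choice, and lumping everything beyond four hits into a single "5+" class follows the layout of the table.

```python
import math


def poisson_pmf(y, m):
    """Poisson probability of exactly y events when m events are expected."""
    return math.exp(-m) * m ** y / math.factorial(y)


hits, squares = 537, 576
m = hits / squares                    # average hits per square
observed = [229, 211, 93, 35, 7, 1]   # squares hit 0, 1, 2, 3, 4 and 5+ times

# Expected number of squares for 0..4 hits, then the remainder as the 5+ class.
expected = [squares * poisson_pmf(y, m) for y in range(5)]
expected.append(squares - sum(expected))

for label, e, o in zip(["0", "1", "2", "3", "4", "5+"], expected, observed):
    print(f"{label:>2} hits: expected {e:7.2f}, observed {o}")
```

The printed values should agree with the expected row of the table up to rounding in the last digit.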
with events at rate (4/3)u, the expected number of events on a branch of duration t is m = (4/3)ut, so the Poisson zero term gives:
probability of no event = $e^{-(4/3)ut}$
probability of ≥1 event = $1 - e^{-(4/3)ut}$
probability of C at the end of a branch that started with A = $\frac{1}{4}\left(1 - e^{-(4/3)ut}\right)$
probability that a site is different at two ends of a branch = $\frac{3}{4}\left(1 - e^{-(4/3)ut}\right)$
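These probabilities translate directly into code. The sketch below uses illustrative function names and takes the branch length ut as a single number.

```python
import math


def jc_prob_specific(ut):
    """Probability of one given other base (e.g. C) at the end of a branch that started with A."""
    return 0.25 * (1.0 - math.exp(-4.0 * ut / 3.0))


def jc_prob_different(ut):
    """Probability that a site differs at the two ends: three possible other bases."""
    return 3.0 * jc_prob_specific(ut)


def jc_prob_same(ut):
    """Probability that the site shows the same base at both ends."""
    return 1.0 - jc_prob_different(ut)


for ut in (0.0, 0.1, 1.0, 3.0):
    print(f"ut = {ut:3.1f}: same {jc_prob_same(ut):.4f}, different {jc_prob_different(ut):.4f}")
```

For any branch length, the probability of no change plus the three probabilities of a specific change sum to 1.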
[Figure: differences per site (0.0 to 0.8) plotted against branch length ut (0 to 3), following the curve $y = \frac{3}{4}\left(1 - e^{-(4/3)ut}\right)$]
the expected difference per site between two sequences increases with branch length but reaches a plateau at 0.75
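A short tabulation makes the plateau visible. This is only a sketch; the function name and the sample branch lengths are arbitrary choices.

```python
import math


def expected_difference(ut):
    """Expected proportion of differing sites under Jukes-Cantor for branch length ut."""
    return 0.75 * (1.0 - math.exp(-4.0 * ut / 3.0))


branch_lengths = [0.0, 0.1, 0.5, 1.0, 2.0, 3.0, 10.0]
for ut in branch_lengths:
    d = expected_difference(ut)
    print(f"ut = {ut:5.1f}   differences per site = {d:.4f}   gap to 0.75 = {0.75 - d:.4f}")
```

The gap to 0.75 shrinks quickly, which is why long branches all look roughly equally different when raw differences are used.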
not using the J&C correction will distort the tree
[Figure: the real tree, with tips A, B, C and D]
expected uncorrected sequence differences:
     A        B        C        D
A    0        0.57698  0.59858  0.70439
B    0.57698  0        0.24726  0.59858
C    0.59858  0.24726  0        0.57698
D    0.70439  0.59858  0.57698  0
[Figure: the least squares tree fitted to these differences, with tips A, B, C and D]
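The J&C correction is obtained by solving $p = \frac{3}{4}\left(1 - e^{-(4/3)d}\right)$ for the branch length d, which gives $d = -\frac{3}{4}\ln\left(1 - \frac{4}{3}p\right)$. The sketch below applies it to the matrix of uncorrected differences; the function name `jc_correct` and the dictionary layout are illustrative choices.

```python
import math


def jc_correct(p):
    """Jukes-Cantor distance for an observed proportion p of differing sites (p < 0.75)."""
    if not 0.0 <= p < 0.75:
        raise ValueError("the correction is only defined for 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Expected uncorrected differences from the table (upper triangle).
uncorrected = {
    ("A", "B"): 0.57698, ("A", "C"): 0.59858, ("A", "D"): 0.70439,
    ("B", "C"): 0.24726, ("B", "D"): 0.59858, ("C", "D"): 0.57698,
}

corrected = {pair: jc_correct(p) for pair, p in uncorrected.items()}
for (x, y), d in corrected.items():
    print(f"{x}-{y}: uncorrected {uncorrected[(x, y)]:.5f}, corrected {d:.3f}")
```

The corrected values come out close to 1.1, 1.2, 2.1, 0.3, 1.2 and 1.1. The uncorrected A-D difference is only about 1.2 times the A-B difference, whereas the corrected A-D distance is nearly twice the A-B distance, which illustrates how much the raw differences compress the long branches.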
the J&C model assumes no difference in substitution rates between transversions and transitions
Kimura’s two-parameter model
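[Figure: the four bases A, G, C and T; the two transition arrows (A-G and C-T) are labelled a and the four transversion arrows are labelled b]
$R = \dfrac{\text{number of transitions}}{\text{number of transversions}} = \dfrac{a}{2b}$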
the Kimura model allows a difference in substitution rate between transversions and transitions
$\text{Prob}(\text{transition} \mid t) = \frac{1}{4} - \frac{1}{2}\, e^{-\frac{2R+1}{R+1}\, t} + \frac{1}{4}\, e^{-\frac{2}{R+1}\, t}$
probability that a transition will occur in a time interval t
$\text{Prob}(\text{transversion} \mid t) = \frac{1}{2} - \frac{1}{2}\, e^{-\frac{2}{R+1}\, t}$
probability that any transversion will occur in a time interval t
$R = \dfrac{a}{2b}$
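The two Kimura probabilities can be sketched as follows. The function names are illustrative, and the code assumes, as the formulas appear to, that time is scaled so that the total substitution rate a + 2b equals 1; that scaling is an assumption here and is not stated on the slide.

```python
import math


def k2p_transition(t, R):
    """Probability of a transition difference after time t (total rate scaled to 1)."""
    return (0.25
            - 0.5 * math.exp(-(2.0 * R + 1.0) / (R + 1.0) * t)
            + 0.25 * math.exp(-2.0 / (R + 1.0) * t))


def k2p_transversion(t, R):
    """Probability of any transversion difference after time t."""
    return 0.5 - 0.5 * math.exp(-2.0 / (R + 1.0) * t)


def k2p_total(t, R):
    """Probability that the site differs at all."""
    return k2p_transition(t, R) + k2p_transversion(t, R)


# With R = 1/2 (a = b) the total should match the Jukes-Cantor curve.
t = 0.7
print(k2p_total(t, 0.5), 0.75 * (1.0 - math.exp(-4.0 * t / 3.0)))
```

Setting R = 1/2 makes both exponents equal to 4/3, so the sum collapses to the Jukes-Cantor expression, which is a useful sanity check on the reconstruction.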
[Figure, R = 10: differences (0.0 to 0.8) plotted against time (branch length, 0.0 to 3.0), with curves for transitions, transversions and their total; one level is annotated "(50% different)"]
[Figure, R = 2: the same axes and the same three curves, again annotated "(50% different)"]
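The curves in the two panels can be tabulated with a small sketch. It assumes the Kimura formulas with the total rate a + 2b scaled to 1, and the bisection helper `time_to_reach` is a made-up illustration for locating the branch length at which half of the sites are expected to differ; how that relates to the "(50% different)" annotation in the figures is an interpretation.

```python
import math


def k2p_transition(t, R):
    return (0.25 - 0.5 * math.exp(-(2.0 * R + 1.0) / (R + 1.0) * t)
            + 0.25 * math.exp(-2.0 / (R + 1.0) * t))


def k2p_transversion(t, R):
    return 0.5 - 0.5 * math.exp(-2.0 / (R + 1.0) * t)


def k2p_total(t, R):
    return k2p_transition(t, R) + k2p_transversion(t, R)


def time_to_reach(target, R, lo=0.0, hi=50.0, steps=100):
    """Bisection on the total curve, which rises monotonically from 0 towards 0.75."""
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if k2p_total(mid, R) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


for R in (10, 2):
    print(f"R = {R}")
    for t in (0.5, 1.0, 2.0, 3.0):
        print(f"  t = {t:3.1f}: transitions {k2p_transition(t, R):.4f}, "
              f"transversions {k2p_transversion(t, R):.4f}, total {k2p_total(t, R):.4f}")
    print(f"  total reaches 0.5 near t = {time_to_reach(0.5, R):.3f}")
```

For R = 10 the transition curve rises above its long-run value of 1/4 before falling back, because transitions accumulate quickly and are then overwritten by the slower transversions.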
Tamura-Nei models
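event type I: purine > purine or pyrimidine > pyrimidine; event type II: change to a random base
\[
\begin{array}{l|cc|cccc}
 & P(\text{event type I}) & P(\text{event type II}) & A & G & C & T \\
\hline
A\ (\text{purine}) & a_R & b & - & a_R\, p_G/p_R + b\, p_G & b\, p_C & b\, p_T \\
G\ (\text{purine}) & a_R & b & a_R\, p_A/p_R + b\, p_A & - & b\, p_C & b\, p_T \\
C\ (\text{pyrimidine}) & a_Y & b & b\, p_A & b\, p_G & - & a_Y\, p_T/p_Y + b\, p_T \\
T\ (\text{pyrimidine}) & a_Y & b & b\, p_A & b\, p_G & a_Y\, p_C/p_Y + b\, p_C & -
\end{array}
\]
$p_A, p_G, p_C, p_T$: relative proportions of A, G, C, T in the pool
$p_R = p_A + p_G$
$p_Y = p_C + p_T$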
the T&N models allow asymmetric base frequencies
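The table can be turned into a rate matrix with a few lines of code. This is a sketch under stated assumptions: the function name `tamura_nei_rates`, the dictionary layout and the numerical values of a_R, a_Y, b and the base frequencies are all made up for illustration.

```python
def tamura_nei_rates(a_R, a_Y, b, p):
    """Rates of change from each base (outer key) to each base (inner key)."""
    purines, pyrimidines = "AG", "CT"
    p_R = p["A"] + p["G"]
    p_Y = p["C"] + p["T"]
    Q = {}
    for x in "AGCT":
        Q[x] = {}
        for y in "AGCT":
            if x == y:
                continue
            rate = b * p[y]                      # event type II: change to a random base
            if x in purines and y in purines:    # event type I within the purines
                rate += a_R * p[y] / p_R
            elif x in pyrimidines and y in pyrimidines:  # event type I within the pyrimidines
                rate += a_Y * p[y] / p_Y
            Q[x][y] = rate
        Q[x][x] = -sum(Q[x].values())            # diagonal makes each row sum to zero
    return Q


freqs = {"A": 0.3, "G": 0.2, "C": 0.2, "T": 0.3}  # illustrative unequal base frequencies
Q = tamura_nei_rates(a_R=2.0, a_Y=1.0, b=0.5, p=freqs)
for x in "AGCT":
    print(x, [round(Q[x][y], 3) for y in "AGCT"])
```

With these rates the flow from x to y, p[x] * Q[x][y], equals the flow from y to x, so the chosen base frequencies are also the equilibrium frequencies of the model.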
The general time-reversible model (GTR)
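\[
\begin{array}{c|cccc}
 & A & G & C & T \\
\hline
A & - & a\, p_G & b\, p_C & g\, p_T \\
G & a\, p_A & - & d\, p_C & e\, p_T \\
C & b\, p_A & d\, p_G & - & h\, p_T \\
T & g\, p_A & e\, p_G & h\, p_C & -
\end{array}
\]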
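A minimal sketch of this matrix follows, with one parameter per unordered pair of bases. The function name `gtr_rates`, the use of `frozenset` keys and all numerical values are illustrative assumptions.

```python
def gtr_rates(exch, p):
    """Rate from x to y = (parameter shared by the pair x, y) * (frequency of y)."""
    Q = {x: {} for x in "AGCT"}
    for x in "AGCT":
        for y in "AGCT":
            if x != y:
                Q[x][y] = exch[frozenset((x, y))] * p[y]
        Q[x][x] = -sum(Q[x].values())   # diagonal makes each row sum to zero
    return Q


# Six parameters, named as in the table: a, b, g, d, e, h (values are made up).
exch = {
    frozenset("AG"): 4.0,  # a
    frozenset("AC"): 1.0,  # b
    frozenset("AT"): 0.8,  # g
    frozenset("GC"): 1.2,  # d
    frozenset("GT"): 0.9,  # e
    frozenset("CT"): 3.5,  # h
}
freqs = {"A": 0.3, "G": 0.2, "C": 0.2, "T": 0.3}
Q = gtr_rates(exch, freqs)

# Time reversibility: the flow x -> y equals the flow y -> x for every pair.
for x in "AGCT":
    for y in "AGCT":
        assert abs(freqs[x] * Q[x][y] - freqs[y] * Q[y][x]) < 1e-12
print("detailed balance holds for every pair of bases")
```

Sharing one parameter between the two directions of each pair is what makes the model time-reversible for any choice of base frequencies.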
The general 12-parameter model
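\[
\begin{array}{c|cccc}
 & A & G & C & T \\
\hline
A & - & a\, p_G & b\, p_C & c\, p_T \\
G & d\, p_A & - & e\, p_C & f\, p_T \\
C & g\, p_A & h\, p_G & - & i\, p_T \\
T & j\, p_A & k\, p_G & l\, p_C & -
\end{array}
\]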
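The difference from the time-reversible case can be illustrated with a small sketch. It gives every ordered pair of bases its own parameter and checks whether the flow from x to y matches the flow from y to x; the function names and all numerical values are made up for illustration.

```python
import math


def general_rates(theta, p):
    """Rate from x to y = theta[(x, y)] * p[y], with a separate parameter per ordered pair."""
    Q = {x: {} for x in "AGCT"}
    for x in "AGCT":
        for y in "AGCT":
            if x != y:
                Q[x][y] = theta[(x, y)] * p[y]
        Q[x][x] = -sum(Q[x].values())   # diagonal makes each row sum to zero
    return Q


def is_reversible(Q, p):
    """Detailed balance: p[x] * Q[x][y] == p[y] * Q[y][x] for every pair of bases."""
    return all(math.isclose(p[x] * Q[x][y], p[y] * Q[y][x])
               for x in "AGCT" for y in "AGCT" if x != y)


freqs = {"A": 0.3, "G": 0.2, "C": 0.2, "T": 0.3}
pairs = [(x, y) for x in "AGCT" for y in "AGCT" if x != y]

# Twelve distinct parameters (made-up values 1.0, 1.5, 2.0, ...).
theta_general = {pair: 1.0 + 0.5 * k for k, pair in enumerate(pairs)}
# Forcing both directions of a pair to share a value leaves six parameters.
theta_symmetric = {(x, y): theta_general[tuple(sorted((x, y)))] for x, y in pairs}

print(is_reversible(general_rates(theta_general, freqs), freqs))
print(is_reversible(general_rates(theta_symmetric, freqs), freqs))
```

With twelve free parameters the detailed-balance check fails, and it holds again once the two directions of each pair share a parameter, which is the six-parameter time-reversible case.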