Models and their benefits.

10
Models and their benefits. Models + Data 1. probability of data (statistics...) 2. probability of individual histories 3. hypothesis testing 4. parameter estimation

description

Models and their benefits. Models + Data probability of data (statistics...) probability of individual histories hypothesis testing parameter estimation. Wright-Fisher Model of Population Reproduction. Haploid Model. - PowerPoint PPT Presentation

Transcript of Models and their benefits.

Page 1: Models and their benefits.

Models and their benefits.

Models + Data

1. probability of data (statistics...)

2. probability of individual histories

3. hypothesis testing

4. parameter estimation

Page 2: Models and their benefits.

Haploid Model

Diploid Model

Wright-Fisher Model of Population Reproduction

i. Individuals are made by sampling with replacement in the previous generation.

ii. The probability that 2 alleles have same ancestor in previous generation is 1/2N

Individuals are made by sampling a chromosome from the female and one from the male previous generation with replacement

Assumptions

1. Constant population size

2. No geography

3. No Selection

4. No recombination

Page 3: Models and their benefits.

10 Alleles’ Ancestry for 15 generations

Page 4: Models and their benefits.

Mean, E(X2) = 2N.

Ex.: 2N = 20.000, Generation time 30 years, E(X2) = 600000 years.

Waiting for most recent common ancestor - MRCA

P(X2 = j) = (1-(1/2N))j-1 (1/2N)

Distribution until 2 alleles had a common ancestor, X2?:

P(X2 > j) = (1-(1/2N))j

P(X2 > 1) = (2N-1)/2N = 1-(1/2N)

1 2N 1 2N

1 1

2

j

1 2N

1

2

j

Page 5: Models and their benefits.

P(k):=P{k alleles had k distinct parents}

1 2N

1

2N *(2N-1) *..* (2N-(k-1)) =: (2N)[k]

(2N)k

k -> any k -> k k -> k-1

Ancestor choices:

P(k) 2N[k ]

(2N)k (k 2 2N) 1

k

2

/2N e

k

2

/ 2N

k

2

(2N)[k 1]

k -> j

Sk, j (2N)[ j ]

For k << 2N:

Sk,j - the number of ways to group k labelled objects into j groups.(Stirling Numbers of second kind.

Page 6: Models and their benefits.

)1(

2

2/1

kk

k

Expected Height and Total Branch Length

Expected Total height of tree: Hk= 2(1-1/k)

i.Infinitely many alleles finds 1 allele in finite time. ii. In takes less than twice as long for k alleles to find 1 ancestors as it does for 2 alleles.

Expected Total branch length in tree, Lk:

2*(1 + 1/2 + 1/3 +..+ 1/(k-1)) ca= 2*ln(k-1)

1

2

3

k

1/3

1 2

1

2/(k-1)

Time Epoch Branch Lengths

Page 7: Models and their benefits.

6 Realisations with 25 leaves

Observations:

Variation great close to root.

Trees are unbalanced.

Page 8: Models and their benefits.

Sampling more sequences

The probability that the ancestor of the sample of size n is in a sub-sample of size k is

Letting n go to infinity gives (k-1)/(k+1), i.e. even for quite small samples it is quite large.

(n 1)(k 1)

(n 1)(k 1)

Page 9: Models and their benefits.

Final Aligned Data Set:

Infinite Site Model

Page 10: Models and their benefits.

Recombination-Coalescence Illustration

Copied from Hudson 1991

Intensities

Coales. Recomb.

1 2

3 2

6 2

3 (2+b)

1 (1+b)

0

b