Hernandez smbe 2015
Background Selection With Non-Equilibrium Demographic Models
Ryan D. Hernandez
SMBE [email protected]
@rdhernand
Raul Torres
Talk 23.12: Thursday 10am
The Effect of Negative Selection
Chromosomes in a population with standing variation
Deleterious mutations will arise in the next generation
Negative selection: the action of natural selection purging deleterious mutations.
Consequences:
• Some proportion of chromosomes are eliminated each generation
➡ Decreased effective population size ($f_0 N_e$)
➡ Decreased neutral variation ($f_0 \pi$)
➡ Excess of neutral rare variants
Together, these effects are background selection:
$f_0 = \exp\left(-\frac{U}{s + R}\right)$
Charlesworth (MANY, 1993-); Hudson & Kaplan. Genetics (1995); Many others…
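As a rough numerical illustration of the formula on this slide, the sketch below evaluates f0 = exp(-U / (s + R)) and applies it to an effective population size. The slide does not define the symbols. Reading U as the deleterious mutation rate of the region, s as the selection coefficient against deleterious mutations, and R as the recombination rate across the region is an assumption based on the usual presentation of Hudson & Kaplan (1995). The function name and all parameter values are hypothetical.

```python
import math


def bgs_f0(U, s, R):
    """Return f0 = exp(-U / (s + R)), the fraction of diversity retained.

    Assumed meanings (not stated on the slide): U is the deleterious
    mutation rate, s the selection coefficient, R the recombination rate.
    """
    if s + R <= 0:
        raise ValueError("s + R must be positive")
    return math.exp(-U / (s + R))


# Hypothetical parameter values, chosen only for illustration.
U, s, Ne = 0.01, 0.02, 10_000
for R in (0.0, 0.02, 0.2):
    f0 = bgs_f0(U, s, R)
    print(f"R={R}: f0={f0:.3f}, f0*Ne={f0 * Ne:.0f}")
```

Under this reading, more recombination (larger R) moves f0 toward 1, so the neutral locus loses less diversity.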
Forward Simulations
Selection on Finite Sites under COmplex Demographic Events (SFS_CODE)
Wright-Fisher Island model
Many demographic bells and population structure whistles
Various distributions of selective effects
Mutation models (with or w/o CpG effects)
Coding versus non-coding
X versus autosome
Arbitrary recombination maps
Recently optimized…
Hernandez. Bioinformatics (2008) http://sfscode.sourceforge.net
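To make the idea of a forward Wright-Fisher simulation concrete, here is a deliberately minimal sketch. It is not SFS_CODE and uses none of its interface. It assumes a haploid population of constant size, no recombination, multiplicative fitness, and at most one new deleterious mutation per offspring. All names and parameter values are hypothetical.

```python
import random


def wright_fisher_generation(pop, s, u, rng):
    """Advance a haploid Wright-Fisher population by one generation.

    pop: list holding the number of deleterious mutations per chromosome.
    s:   fitness cost per mutation; fitness is (1 - s) ** k.
    u:   probability that an offspring gains one new deleterious mutation.
    rng: a random.Random instance, passed in for reproducibility.
    """
    weights = [(1.0 - s) ** k for k in pop]
    # Parents are drawn with replacement, in proportion to their fitness.
    parents = rng.choices(pop, weights=weights, k=len(pop))
    # Each offspring inherits its parent's load plus at most one new mutation.
    return [k + (1 if rng.random() < u else 0) for k in parents]


rng = random.Random(1)
pop = [0] * 500  # start with mutation-free chromosomes
for _ in range(200):
    pop = wright_fisher_generation(pop, s=0.05, u=0.01, rng=rng)
print(f"mean deleterious mutations per chromosome: {sum(pop) / len(pop):.3f}")
```

A real forward simulator additionally tracks mutation positions, recombination, neutral sites, and demographic events, which is where the features listed on the slide come in.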
Performance
[Figure: Run Time (d) versus L (1000 to 1e+07) for FFPopSim, fwdpp_ind, slim, SFS_CODE, and SFS_CODE (opt). Panels: θ=ρ=0.001 and θ=ρ=0.005, each for N=10,000 and N=50,000.]
~4.2 CPU years of computation
Weak vs. Strong Background Selection
[Figure: π/π0 (0.65 to 1.00) versus γ (axis range 0 to 200).]
Maximal effect: -25 ≤ γ ≤ -5
π/π0: ratio of observed to expected diversity at neutral loci, a measure of the effect of BGS
[Diagram: a neutral locus flanked by deleterious loci.]
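The ratio π/π0 can be illustrated with a small calculation. The sketch below computes nucleotide diversity as the mean number of pairwise differences among haplotypes, then divides the value from a sample affected by BGS by the value from a sample without it. The 0/1 haplotype encoding, the toy data, and the function name are assumptions made for illustration. They are not taken from the talk.

```python
def nucleotide_diversity(haplotypes):
    """Mean number of pairwise differences among equal-length 0/1 haplotypes."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    pi = 0.0
    for site in zip(*haplotypes):
        j = sum(site)  # derived-allele count at this site
        # Fraction of the n*(n-1)/2 pairs that differ at this site.
        pi += 2.0 * j * (n - j) / (n * (n - 1))
    return pi


# Hypothetical toy samples: rows are haplotypes, columns are neutral sites.
without_bgs = [[0, 1, 0, 1], [1, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0]]
with_bgs = [[0, 0, 1, 0], [0, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 0]]

pi0 = nucleotide_diversity(without_bgs)
pi = nucleotide_diversity(with_bgs)
print(f"pi = {pi:.3f}, pi0 = {pi0:.3f}, pi/pi0 = {pi / pi0:.3f}")
```

A ratio near 1 indicates little effect of BGS at the neutral locus, and smaller values indicate a stronger reduction in diversity.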
Beyond Equilibrium: BGS & π
[Figure: π (0 to 0.003) and π/π0 versus time (in 2Ne gens, 0 to 10), for γ = 0, 2, 5, 10, 20, 50, 100.]
NA = 10,000
10-fold expansion
Beyond Equilibrium
[Figure: relative population size (NC/NA, 0.05 to 10) versus time (in 2Ne gens, 0 to 10).]
NA = 10,000
Qualitatively similar patterns across demographic models: BGS substantially reduces the time to equilibrium
Beyond Equilibrium: BGS & SFS
ψ: Fraction of SNPs that are singletons
NA = 10,000
10-fold expansion
[Figure: ψ (0.2 to 0.7) and ψ/ψ0 (1.0 to 2.5) versus time (in 2Ne gens, 0 to 10), for γ = 0, 2, 5, 10, 20, 50, 100.]
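The statistic ψ defined on this slide is simple to compute from per-site allele counts. The sketch below assumes unfolded (derived) allele counts in a sample of n chromosomes and counts a site as a singleton when the derived allele appears exactly once. The function name and the toy counts are hypothetical.

```python
def singleton_fraction(derived_counts, n):
    """psi: fraction of segregating sites whose derived allele appears once.

    derived_counts: derived-allele count at each site, in a sample of n
    chromosomes. Sites with count 0 or n are not SNPs and are ignored.
    """
    segregating = [j for j in derived_counts if 0 < j < n]
    if not segregating:
        raise ValueError("no segregating sites in the sample")
    return sum(1 for j in segregating if j == 1) / len(segregating)


# Hypothetical derived-allele counts at seven SNPs in 10 chromosomes.
psi = singleton_fraction([1, 1, 3, 2, 1, 5, 1], n=10)
print(f"psi = {psi:.3f}")
```

Dividing this value by the corresponding value from a sample without BGS gives a ratio in the spirit of the ψ/ψ0 axis in the figure. Treating ψ0 as the no-BGS value is an assumption by analogy with π0.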
Demographic inference
• Complete Genomics Diversity Panel
• ~50 individuals from each of YRI, CEU, and CHS
• High coverage Whole Genome Sequencing
• Neutral sites in the human genome
• Filtering based on PhyloP
• High B-value (weak BGS)
• Low B-value (strong BGS)
• Four-fold degenerate sites
Inference performed with ∂a∂i
BGS in the human genome
[Figure: π/π0, expected diversity due to BGS.]
McVicker et al. PLoS Genet (2009)
Human Demographic Inference
The most neutral regions of the genome: high B-values and high recombination rates.
[Figure: effective population size (0 to 200,000) versus Time (kya, -500 to 0). Legend: AF-EU-AS shared demography, EU-AS shared demography, AS demography, EU demography, AF demography.]
Human Demographic Inference
Regions of strong background selection (Low B) significantly increase the inferred rate of growth
[Figure: relative effective population size (Nc/Na, 0 to 30) versus Time (kya, -500 to 0), for High B and Low B. Legend: AF-EU-AS shared demography, EU-AS shared demography, AS demography, EU demography, AF demography.]
Human Demographic Inference
Four-fold degenerate synonymous sites are a mixture of high and low BGS.
[Figure: relative effective population size (Nc/Na, 0 to 30) versus Time (kya, -500 to 0), for Low B, 4-fold, and High B, with an inset covering -20 to 0 kya. Legend: AF-EU-AS shared demography, EU-AS shared demography, AS demography, EU demography, AF demography.]
Conclusions
• The effects of background selection are highly dependent on demographics.
• Every genome is composed of a mixture of high and low background selection, so pooling sites should be done with caution!
• More modeling of non-equilibrium BGS is necessary.
Thanks!
Funding: NIH; QB3; CHARM; CTSI; CFAR
Nicolas Strauli
Dominic Tong
Raul Torres
Lawrence Uricchio
Zach Szpiech
Kevin Hartman
Dan Vasco
Talk 23.12: Thursday 10am
Poster: 555B