Post on 12-Jan-2016
Landscape genomics in sugar pines (Pinus lambertiana)
Exploring patterns of adaptive genetic variation along environmental gradients.
Carl Vangestel
Why associations with measures of aridity?
• Drought stress is a common cause of mortality and annual yield loss
• Shortage of water is one of the strongest environmental constraints and abiotic selective forces in trees
• Geography directly affects water availability → clinal variation in adaptive traits
Spatial Genomics
Why associations with measures of aridity?
• Future climate change
→ affects local abiotic conditions and the distribution of trees
→ higher temperatures and increased variability in precipitation in the SW US
→ increase in frequency and intensity of drought
Why sugar pine?
• Sugar pines are less tolerant to drought stress than other conifer species
→ expected to show strong clinal patterns in adaptive genetic variation along the aridity gradient
→ very sensitive to future climate changes: alterations in current distribution range
• One of the most diverse genomes among conifers → average heterozygosity of specific genes was 26 percent (upper range of pines studied so far)
Climate Change
[Figure: map panels labelled current, 2030, 2060 and 2090]
(Source: USDA Forest Service, RMRS, Moscow Forestry Science Laboratory)
- Different scenarios
- Hadley Climate Scenario
Goal of this study:
• identify adaptive SNPs associated with variation in temperature, precipitation, aridity index (precipitation / potential evapotranspiration) and elevation
• functionally annotate these genes
• explore both neutral and adaptive variation across the sugar pine's range
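The aridity index named in the goals is a simple ratio, so a minimal sketch may help fix the definition. The function name and the site values below are hypothetical illustrations, not data from this study.

```python
def aridity_index(precipitation_mm, pet_mm):
    """Return precipitation / potential evapotranspiration (dimensionless)."""
    if pet_mm <= 0:
        raise ValueError("potential evapotranspiration must be positive")
    return precipitation_mm / pet_mm


# Hypothetical annual totals in mm: (precipitation, potential evapotranspiration).
sites = {"wet_site": (1200.0, 800.0), "dry_site": (300.0, 1200.0)}
indices = {name: aridity_index(p, pet) for name, (p, pet) in sites.items()}
```

With this definition, lower values indicate more arid sites.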
How adaptive variation is distributed over the range of environments is largely unknown
Detailed knowledge of adaptive variation may become crucial to mitigate the impact of global climate change
N = 338 individuals
• Transcriptome assembly: Sanger, 454 (pool) and Illumina (3 individuals)
• Candidate SNP selection: literature, SNP quality
• MYB proteins (stomatal closure, etc ...)
• heat shock proteins (prevention of protein denaturation during cellular dehydration)
• Trehalose-6-phosphate synthase (osmotic protection cell membranes during dehydration)
• LEA proteins (membrane and protein stabilisers, etc ...)
• ...
• First screening: 67 genes selected
• Second screening: 109 under review
Multi-analytical approach
• Generalized linear models
• Fst outlier analysis
• Bayesian environmental analyses
Neutral SNPs
• Gene flow (isolation by distance, IBD)
• Genetic drift
• “Separate” neutral patterns from selective ones
• Explore adaptive patterns while accounting for neutral population structure
‘Neutral SNP’ ‘Adaptive SNP’
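Generalized linear models
For each SNP j:
\[ \log\left(\frac{p_{ij}}{1 - p_{ij}}\right) = \beta_{\mathrm{int}} + \beta_0\,\mathrm{ENV}_i + \beta_1 q_{1i} + \dots + \beta_{12} q_{12i} \]
p_ij = probability modelled for tree i at SNP j (the left-hand side is its log-odds)
ENV_i = environmental value for tree i
q_1i .. q_12i: first principal components of the Q-matrix for tree i (12 in the model above)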
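The per-SNP generalized linear model can be sketched as a logistic regression of a genotype indicator on the environmental value plus the population-structure components. The sketch below is only an illustration under stated assumptions: the response is taken to be binary (for example, presence of a given allele in tree i), only one structure component is used instead of twelve, the toy data are invented, and the fit uses a plain Newton-Raphson loop rather than whatever software the study used.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_logistic(rows, y, n_iter=25):
    """Newton-Raphson fit of log(p/(1-p)) = b_int + b_0*ENV + b_1*q1."""
    X = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(n_iter):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for xi, yi in zip(X, y):
            eta = sum(b * v for b, v in zip(beta, xi))
            p = 1.0 / (1.0 + math.exp(-eta))
            w = p * (1.0 - p)
            for a in range(k):
                grad[a] += (yi - p) * xi[a]
                for c in range(k):
                    hess[a][c] += w * xi[a] * xi[c]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


# Invented toy data: each row is (ENV_i, q_1i); y is a hypothetical allele indicator.
rows = [(-1.0, 0.0)] * 3 + [(1.0, 0.0)] * 3 + [(0.0, 1.0)] * 2 + [(0.0, -1.0)] * 2
y = [0, 0, 1, 0, 1, 1, 0, 1, 0, 1]
beta = fit_logistic(rows, y)  # [b_int, b_0 (ENV effect), b_1 (structure effect)]
```

In this toy data set the allele is more frequent at high ENV values, so the fitted ENV coefficient is positive while the structure coefficient stays at zero. Keeping the structure components in the model is what lets the environmental effect be tested while accounting for neutral population structure.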
Fst Outlier Analysis: Arlequin
[Figure: panels labelled FDR = 0.2, FDR = 0.05 and FDR = 0.001]
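How the choice of FDR threshold changes the set of outliers can be illustrated with the Benjamini-Hochberg procedure. This is an assumption made for illustration only: the slides do not state which FDR method was applied, and the p-values below are invented.

```python
def benjamini_hochberg(p_values, fdr):
    """Return sorted indices of tests rejected at the given false discovery rate."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value lies under the BH line.
        if p_values[idx] <= fdr * rank / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


# Hypothetical per-SNP p-values from an outlier scan.
p_values = [0.0001, 0.004, 0.04, 0.2, 0.6]
outliers = {fdr: benjamini_hochberg(p_values, fdr) for fdr in (0.2, 0.05, 0.001)}
```

A stricter FDR shrinks the outlier list, which is the general pattern expected when comparing panels at FDR = 0.2, 0.05 and 0.001.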
[Figure: posterior density distributions of Alpha10, Alpha11 and Alpha66; x-axis: alpha, y-axis: density]
Fst Outlier Analysis: BayeScan
Highest posterior density intervals (HPDI) of alpha:
• SNP10: [0.68, 2.35]
• SNP11: [0.92, 2.52]
• SNP66: [0.00, 2.20]
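An HPDI is the narrowest interval that contains a chosen share of the posterior samples. The sketch below shows one simple way to compute it from a list of samples; the sample values and the 80% mass are hypothetical, and the slides do not state which probability mass was used for the intervals reported above.

```python
import math


def hpdi(samples, mass=0.95):
    """Return the narrowest interval containing `mass` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    k = max(1, math.ceil(mass * n))  # number of samples the interval must cover
    # Slide a window of k consecutive sorted samples and keep the shortest one.
    best = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[best], xs[best + k - 1]


# Hypothetical posterior samples of alpha for one SNP.
samples = [0.1, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 3.0]
interval = hpdi(samples, mass=0.8)
```

An interval whose lower bound is well above zero, as for SNP10 and SNP11, suggests a locus-specific effect, whereas an interval that reaches zero, as for SNP66, is weaker evidence.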
[Figure: three panels of Fst (y-axis) against log10(q value) (x-axis); labelled SNPs: 10, 11, 66 (first panel); 9, 10, 11, 66 (second panel); 10, 11 (third panel)]
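Bayesian Environmental Analysis (Coop et al., 2010)
Structure
• ε_l: ancestral allele frequency at locus l (f_ancestral)
• x_l1 .. x_l5: allele frequencies at locus l in populations 1 to 5 (f_population)
• Drift: f_population deviates from f_ancestral
• Gene flow: deviations covary among populations
• Transformed variable: x_lk = g(θ_lk) for populations k = 1 .. 5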
Bayesian Environmental Analysis
Heat map of var-cov matrix (rows and columns ordered pop1 to pop5):
\[ \Omega = \begin{bmatrix}
1 & \rho & \rho^2 & \rho^3 & \rho^4 \\
\rho & 1 & \rho & \rho^2 & \rho^3 \\
\rho^2 & \rho & 1 & \rho & \rho^2 \\
\rho^3 & \rho^2 & \rho & 1 & \rho \\
\rho^4 & \rho^3 & \rho^2 & \rho & 1
\end{bmatrix} \]
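The example var-cov matrix for five populations has entry ρ^|i-j| in row i and column j, so covariance decays with the distance between populations along a chain. A minimal sketch that builds such a matrix is shown here; the function name and the value ρ = 0.5 are hypothetical.

```python
def chain_covariance(n_pops, rho):
    """Build the matrix whose (i, j) entry is rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(n_pops)] for i in range(n_pops)]


omega = chain_covariance(5, 0.5)  # rows and columns ordered pop1 .. pop5
```

Neighbouring populations share the largest covariance, which is the pattern expected when gene flow occurs mainly between adjacent populations.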
Bayesian Environmental Analysis
• Selected 1 SNP per gene for the var-cov matrix (excluding genes putatively under selection)
[Figure: correlation matrix from BayEnv alongside the pairwise Fst matrix]
• Formulate null model: drift/gene flow
• Alternative model: drift/gene flow + selection
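Null model: $\theta_l \mid \Omega, \varepsilon_l \sim N(\varepsilon_l,\ \varepsilon_l(1-\varepsilon_l)\,\Omega)$
Alternative model: $\theta_l \mid \Omega, \varepsilon_l, \beta \sim N(\varepsilon_l + \beta Y,\ \varepsilon_l(1-\varepsilon_l)\,\Omega)$
where Y is the environmental variable and β its effect on allele frequencies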
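• Bayes Factor (BF): ratio of the support for the data under the alternative model to that under the null model (equal to the ratio of posterior probabilities when both models have the same prior probability)
• A high BF is indicative of SELECTION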
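The comparison of the two models can be sketched numerically. The code below is a deliberately simplified illustration, not the BayEnv algorithm: θ_l is treated as observed, Ω and ε_l are fixed rather than sampled, β gets a uniform prior on a grid over [-0.2, 0.2], and all input values are invented. Because both models share the same covariance, the normalising constants cancel and only the quadratic forms are needed.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def quad_form(residual, cov):
    """Return residual^T cov^{-1} residual."""
    z = solve(cov, residual)
    return sum(r * v for r, v in zip(residual, z))


def bayes_factor(theta, eps, omega, y, beta_max=0.2, n_grid=401):
    """Average likelihood ratio (alternative / null) over a uniform grid of beta."""
    scale = eps * (1.0 - eps)
    cov = [[scale * v for v in row] for row in omega]
    null_q = quad_form([t - eps for t in theta], cov)
    total = 0.0
    for g in range(n_grid):
        beta = -beta_max + 2.0 * beta_max * g / (n_grid - 1)
        resid = [t - eps - beta * yi for t, yi in zip(theta, y)]
        total += math.exp(-0.5 * (quad_form(resid, cov) - null_q))
    return total / n_grid


# Invented inputs for five populations.
omega = [[0.05 if i == j else 0.01 for j in range(5)] for i in range(5)]
y = [-1.0, -0.5, 0.0, 0.5, 1.0]             # standardised environmental variable
theta_cline = [0.35, 0.425, 0.5, 0.575, 0.65]  # frequencies tracking y
theta_flat = [0.5, 0.5, 0.5, 0.5, 0.5]         # no deviation from eps
bf_cline = bayes_factor(theta_cline, 0.5, omega, y)
bf_flat = bayes_factor(theta_flat, 0.5, omega, y)
```

Frequencies that follow the environmental gradient give a BF above 1, while frequencies that sit at the ancestral value give a BF below 1, because the extra parameter β is not needed to explain them.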