Post on 31-Dec-2015
Analysis of 14 Coccidioides fungal genome sequences highlights incomplete speciation and natural selection
Daniel E. Neafsey1, Bridget Barker2, Garry T. Cole3, John Galgiani2, Matthew R. Henn1, ChiungYu Hung3, Theo Kirkland4, Scott Kroken2, Cody McMahan3, Marc Orbach2, Daniel Park1, Steve Rounsley2, Thomas J. Sharpton5, Jason E. Stajich5, John W. Taylor5, Emily Whiston5, Bruce W. Birren1
1Broad Institute of MIT and Harvard, Cambridge, MA, USA, (neafsey@broad.mit.edu). 2University of Arizona, Tucson, AZ, USA. 3University of Texas, San Antonio, TX, USA. 4University of California, San Diego, CA, USA. 5University of California at Berkeley, Berkeley, CA, USA.
ABSTRACT
We have fully sequenced the genomes of 4 C. immitis isolates and 9 C. posadasii isolates, allowing us to explore regional variation in patterns of intraspecific diversity and interspecific divergence. These genome sequences, in combination with the C. posadasii C735 genome previously sequenced by JCVI, offer an excellent resource for improving our understanding of these dimorphic fungal pathogens, which are the etiological agents of coccidioidomycosis (Valley Fever). Through analysis of all 14 genomes we find that although C. immitis and C. posadasii nominally diverged at least 5 million years ago, extensive regions of their genomes exhibit evidence of recent gene flow, even while the majority of the genome exhibits perfect genetic isolation. We explore the signal of natural selection across all genes in both species and identify signals of positive selection in membrane- or cell-wall-associated proteins. These selection signals may indicate that immune-mediated selection pressure from mammalian hosts is an important driver of Coccidioides evolution, and may help to clarify the relative importance of the saprophytic vs. parasitic phases of the Coccidioides life cycle.
CONCLUSIONS
• C. immitis and C. posadasii, the causative agents of Desert Valley Fever, are nominally distinct species that selectively exchange 7-10% of genes.
• Cell surface and exported proteins exhibit signs of positive selection, suggesting immune pressure.
• Knowledge of introgression and selection patterns will inform vaccine design.
Desert Valley Fever Facts
• Coccidioides is on the CDC list of select agents
• Infection is caused by inhalation of arthroconidia
• >100,000 infections annually in the US
• 35% symptomatic; illness can last months
• 5% require medical care
• 30% of all pneumonia cases in Arizona
• Infections that spread past the lungs (to bones, joints, skin, brain) can persist for life
[Figure: panels labeled C. immitis and C. posadasii]
PFAM ID    P value         Description
PF00904    7.26 x 10^-4    Involucrin repeat (membrane-associated)
PF02370    2.40 x 10^-3    M protein repeat (virulence; IgA binding; resistance to opsonophagocytosis)
Evidence of Immune-Mediated Selection
[Figure: Fst (divergence) plotted against position (Mb); y-axis from -0.2 to 1.2, x-axis from 0 to 8 Mb]
Heterogeneity in Ci vs Cp Divergence
Divergence (as measured by Fst) is nearly complete along most of each chromosome (Fst=1), but there are regions of much lower differentiation.
Locally reduced genetic differentiation could be caused by introgression or incomplete lineage sorting.
[Diagrams: Incomplete lineage sorting; Introgression]
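As a rough illustration of how a windowed Fst scan along a chromosome might be computed, here is a minimal sketch. The poster does not state which Fst estimator or window size was used; the sketch assumes Hudson's estimator (which, like the plotted values, can fall below zero) combined as a ratio of sums within each window. The function names, the window size, and the example allele frequencies are hypothetical; the sample sizes of 4 and 10 reflect the numbers of C. immitis and C. posadasii isolates.

```python
from collections import defaultdict

def hudson_terms(p1, n1, p2, n2):
    """Numerator and denominator of Hudson's Fst at one biallelic site.

    p1, p2: frequency of one allele in each population.
    n1, n2: number of sampled (haploid) genomes in each population.
    Assumption: Hudson's estimator; the poster does not name its estimator.
    """
    num = (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
    den = p1 * (1 - p2) + p2 * (1 - p1)
    return num, den

def windowed_fst(sites, window_bp):
    """Fst per non-overlapping window, keyed by window start position.

    sites: iterable of (position, p1, n1, p2, n2) tuples.
    Windows with a zero denominator (no informative sites) are skipped.
    """
    sums = defaultdict(lambda: [0.0, 0.0])
    for pos, p1, n1, p2, n2 in sites:
        num, den = hudson_terms(p1, n1, p2, n2)
        window = pos // window_bp
        sums[window][0] += num
        sums[window][1] += den
    return {
        window * window_bp: num / den
        for window, (num, den) in sorted(sums.items())
        if den > 0
    }

# Hypothetical example data: two fixed differences in the first window,
# one polymorphism shared at equal frequency in the second window.
example_sites = [
    (100, 1.0, 4, 0.0, 10),
    (200, 0.0, 4, 1.0, 10),
    (150_000, 0.5, 4, 0.5, 10),
]
fst_by_window = windowed_fst(example_sites, window_bp=100_000)
```

In this toy input the first window contains only fixed differences and yields Fst = 1, while the second window contains a shared polymorphism and yields a value below zero, which is the kind of locally reduced differentiation the scan is meant to expose.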
Chromosome 1
Genome Sequencing
[Figure: phylogeny of the sequenced C. posadasii (CP) and C. immitis (CI) isolates, inferred by ML with an HKY-based model from 73 kb of coding sequence (cds); scale bar 0.01; clades labeled C. posadasii and C. immitis]
Isolate                    Coverage     Size (Mb)
C. immitis RS              14.41X       28.9
C. immitis H538.4          3.41X        27.7
C. immitis RMSCC_2394      8.22X        28.8
C. immitis RMSCC_3703      3.17X        27.6
C. posadasii Silveira      5.23X        27.5
C. posadasii RMSCC_3488    8.52X        28.1
C. posadasii RMSCC_2133    6.69X        27.8
C. posadasii RMSCC_3700    3.58X        25.5
C. posadasii CPA_0001      3.09X        28.6
C. posadasii CPA_0020      3.42X        27.3
C. posadasii CPA_0066      3.34X        27.7
C. posadasii RMSCC_1037    3.41X        26.6
C. posadasii RMSCC_1038    3.00X        26.1
C. posadasii C735          8X (JCVI)    26.7
Reference Genome: C. immitis RS
Total Annotated Genes: 10,355
Total SNPs: 670,880
Coalescent Analysis: Introgression Occurring
[Figure: histogram of No. Genes (0 to 2500) by Chi Sq. Value (bins 1 to 10, 20, 30, >30), with the threshold for significance after correction for multiple testing marked]
Coalescent method to test H0 (incomplete lineage sorting), Wakeley & Hey 1997:
1. Use data from multiple loci to infer population parameters.
2. Analytically derive coalescent expectations for relative incidence of shared polymorphisms between species, fixed differences, and exclusive polymorphisms within species.
3. For individual loci, compare Obs vs. Exp mutation counts assuming no introgression.
(Chi sq. test results to left; significant results indicate evidence of introgression.)
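The per-locus comparison in step 3 can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' pipeline: it assumes four mutation categories per locus (shared polymorphisms, fixed differences, and polymorphisms exclusive to each species), an asymptotic chi-square distribution with 3 degrees of freedom, and a Bonferroni correction, none of which the poster specifies. The expected counts would come from the coalescent model in step 2; here they are simply supplied as hypothetical numbers, and the gene names are placeholders.

```python
import math

def chi_square_stat(observed, expected):
    """Pearson chi-square statistic for observed vs. expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom.

    Closed form for df = 3: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
    """
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

def flag_introgression(loci, alpha=0.05):
    """Return names of loci that reject H0 (incomplete lineage sorting only).

    loci: dict mapping locus name -> (observed_counts, expected_counts), each a
    sequence of 4 counts in the order (shared polymorphisms, fixed differences,
    exclusive to species 1, exclusive to species 2).
    Assumption: Bonferroni threshold alpha / number of loci tested.
    """
    threshold = alpha / len(loci)
    flagged = []
    for name, (observed, expected) in loci.items():
        p_value = chi2_sf_df3(chi_square_stat(observed, expected))
        if p_value < threshold:
            flagged.append(name)
    return flagged

# Hypothetical loci: geneA matches expectations; geneB has far more shared
# polymorphisms and far fewer fixed differences than expected.
example_loci = {
    "geneA": ([0, 10, 3, 5], [0.5, 10.0, 3.0, 4.5]),
    "geneB": ([12, 1, 3, 2], [0.5, 10.0, 3.0, 4.5]),
}
flagged_loci = flag_introgression(example_loci)
```

An excess of shared polymorphisms together with a deficit of fixed differences, as in the hypothetical geneB, is the pattern that inflates the statistic and points toward introgression rather than incomplete lineage sorting.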
CONCLUSION: Over 700 genes show evidence of recent introgression.
Natural Selection
Most genes exhibit purifying selection
Neutrality Index (NI) is measured for each gene using counts of divergent (D) vs. polymorphic (P) SNPs that are synonymous (S) or replacement (N).
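A minimal sketch of the per-gene calculation, using the standard definition NI = (PN/PS) / (DN/DS). The function names and example counts are hypothetical, and the base-10 logarithm is an assumption, since the poster's axis label gives only "-LOG". Genes with a zero count in PS, DN, or DS are returned as undefined here; how the authors handled such genes is not stated.

```python
import math

def neutrality_index(pn, ps, dn, ds):
    """Neutrality Index from SNP counts for one gene.

    pn, ps: polymorphic replacement and synonymous SNP counts.
    dn, ds: divergent (fixed) replacement and synonymous SNP counts.
    Returns None when the ratio is undefined (a zero in ps, dn, or ds).
    """
    if ps == 0 or dn == 0 or ds == 0:
        return None
    # Algebraically equal to (pn / ps) / (dn / ds).
    return (pn * ds) / (ps * dn)

def neg_log_ni(ni):
    """-log10(NI); positive values correspond to NI < 1.

    Assumption: base-10 logarithm. Returns None if NI is undefined or zero.
    """
    if ni is None or ni <= 0:
        return None
    return -math.log10(ni)

# Hypothetical counts: relatively few replacement polymorphisms compared
# with replacement divergence.
example_ni = neutrality_index(pn=2, ps=10, dn=8, ds=10)
example_score = neg_log_ni(example_ni)
```

NI above 1 (negative score) indicates an excess of replacement polymorphism, consistent with purifying selection; NI below 1 (positive score) indicates an excess of replacement divergence, consistent with positive selection.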
Functional Enrichment in Positively Selected Genes
[Figure: histogram of No. genes (0 to 1200) by -LOG(Neutrality Index), with regions labeled Purifying Selection and Positive Selection]
$$\mathrm{NI} = \frac{P_N / P_S}{D_N / D_S}$$