Rob Beiko - #SMBE12 presentation
Rob Beiko
A network of everything??
the basis of microbial taxonomy
Woese et al (1990) PNAS
contrasting 16S trees
Trimmed & RY-recoded
Morgan Langille, Conor Meehan
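RY-recoding, as mentioned on the slide above, collapses each nucleotide to its chemical class: purines (A, G) become R and pyrimidines (C, T/U) become Y. This is commonly done to reduce compositional bias before tree building. The slides do not say which tool was used, so the following is only a minimal sketch of the recoding step itself; the function name `ry_recode` is hypothetical, and passing gaps and ambiguity codes through unchanged is an assumption.

```python
# Minimal sketch of RY-recoding for an aligned 16S sequence.
# Assumption: gaps ("-") and ambiguity codes (e.g. "N") pass through unchanged.

# Purines (A, G) -> R; pyrimidines (C, T, U) -> Y. Lowercase input is accepted.
RY_MAP = str.maketrans("AGCTUagctu", "RRYYYRRYYY")


def ry_recode(seq: str) -> str:
    """Return the RY-recoded version of a nucleotide sequence."""
    return seq.translate(RY_MAP)


if __name__ == "__main__":
    print(ry_recode("ACGU-N"))
```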
concatenated protein trees?
Concatenated alignment of 92 single-copy Lachnospiraceae proteins (minimum support value 0.91)
Conor Meehan, Chris Whidden
Ribosomal protein L7A
dSPR = 10
Most protein trees reject this topology (AU test)
phylogenomics writ large
Build "all the trees of all the genes", and summarize the patterns that emerge
Can summarize properties of TREES, build SUPERTREES, or try to construct NETWORKS
2005: highways of gene sharing
144 genomes, 22,432 trees
Supertree analysis, common pathways of LGT
Beiko et al. (2005) PNAS
2011: "telling the whole story"
1173 genomes
159,905 trees
Genome networks
Beiko (2011) Biol Direct
Neighbor-net of 298 genomes (mean normalized BLASTP distances)
Nearest neighbors of Acidithiobacillus in 504 phylogenetic trees
not shown: 795 trees with multiple partners
also not shown: 333 trees with other, less common partners
the curious case of Acidithiobacillus
Trait-based taxonomy: A. and many other acidophiles grouped into Thiomonas
16S was used to carve up the group into different genera (and classes)
Phylogenomics actually brings the group back together!
Intergenomic affinity graph
Nodes are genera
Edges connect genera with substantial phylogenetic connections (sisters in at least five trees & count of such trees at least 20% of strongest affinity)
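The edge rule above can be sketched in code. This is not the author's implementation, only an illustration under stated assumptions: the input is a list of genus pairs, one per tree in which the two genera are sisters; "strongest affinity" is read as the largest sister-tree count involving a given genus; and an edge is kept only if it clears the 20% bar for both of its endpoints. The function name `affinity_edges` and its parameters are hypothetical.

```python
from collections import Counter


def affinity_edges(sister_pairs, min_trees=5, min_fraction=0.20):
    """Build edges of an intergenomic affinity graph between genera.

    sister_pairs: iterable of (genus_a, genus_b), one entry per tree in
    which the two genera appear as sisters (assumed input format).
    Returns {(genus_a, genus_b): tree_count} with each pair sorted.
    """
    # Count the trees supporting each unordered pair of distinct genera.
    counts = Counter(frozenset(p) for p in sister_pairs if p[0] != p[1])

    # Strongest affinity per genus: its largest sister-tree count.
    # Assumption: the slide's "strongest affinity" is defined per genus.
    strongest = {}
    for pair, n in counts.items():
        for genus in pair:
            strongest[genus] = max(strongest.get(genus, 0), n)

    edges = {}
    for pair, n in counts.items():
        a, b = sorted(pair)
        # Assumption: the 20% threshold must hold for both endpoints.
        relative_ok = all(n >= min_fraction * strongest[g] for g in (a, b))
        if n >= min_trees and relative_ok:
            edges[(a, b)] = n
    return edges


if __name__ == "__main__":
    # Placeholder genus labels, not data from the talk.
    pairs = [("A", "B")] * 30 + [("A", "C")] * 5 + [("C", "D")] * 6
    print(affinity_edges(pairs))
```

With these placeholder counts, the A-C pair meets the five-tree minimum but falls below 20% of A's strongest affinity (30 trees), so it is dropped. A relative threshold of this kind keeps weak links from cluttering well-connected genera.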
Galled network of 13 trees with Coprothermobacter partners and Archaea (for rooting)
example network
• Challenges:
– Many different sources of discordance
– Taxonomic overlap is poor
– Need to root trees
– The best threshold for inclusion of reticulations is not obvious
open problems
• Orthology, alignment, trees
• Testing for model violations
• Feeding statistical support to networks
• Duplication and loss (DLT scenarios)
• Merging different lines of genomic evidence
• Deciding what to show and what to exclude:
– Statistical: Count / frequency threshold?
– Biological: Adaptive vs. "churn"?
– Topological: Distance of transfer?
acknowledgments
SPR
- Chris Whidden
- Norbert Zeh
- Nick Hamilton
16S
- Conor Meehan
- Morgan Langille
LGT
- Mark Ragan
- Tim Harlow
- Cheong Xin Chan
Fin