High-throughput comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.
![Page 1: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/1.jpg)
High-throughput comparative genomics
24th October 2013
Joe Parker,
Queen Mary University London
![Page 2: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/2.jpg)
Topics
1. Introduction
2. Background: why phylogenomics?
3. Examples
4. Practice
5. Case study
6. On the horizon
7. Over the horizon
![Page 3: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/3.jpg)
Aims
• Context of phylogenomics: Next-generation sequencing (NGS)
• Why phylogenomics?
• Practical analyses
• Future developments
![Page 4: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/4.jpg)
1. Our Research
![Page 5: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/5.jpg)
Lab Interests
• Ecology and evolution of traits
• Echolocation, sociality
• NGS data for population genetics and phylogenomics
![Page 6: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/6.jpg)
Activities
• Phylogeny estimation/comparison
• Molecular correlates of evolution:
– site substitutions, dN/dS, composition
• Simulation
• Dataset limitations
(R-L): Joe Parker; Georgia Tsagkogeorga; Kalina Davies; Steve Rossiter; Xiuguang Mao; Seb Bailey
![Page 7: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/7.jpg)
2. Background
![Page 8: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/8.jpg)
Next-generation sequencing
![Page 9: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/9.jpg)
Why phylogenomics, not -genetics?
• Causes of discordant signal
– Incomplete lineage sorting
– Lateral transfer
– Recombination
– Introgression
![Page 10: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/10.jpg)
Quantitative biology
• Multiple configurations
• Hyperparameters empirically investigated
• Determine sensitivity of results
![Page 11: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/11.jpg)
Distributions
• Genome-scale data provides context
• Identify outliers: genes / taxa / trees
• Compare values across biological systems
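One way to picture "identify outliers" against a genome-wide distribution is a robust z-score based on the median absolute deviation (MAD). This is only an illustrative sketch, not the method used in the talk: the function name, the 3.5 cut-off (a common rule of thumb), the gene names and the dN/dS values are all hypothetical.

```python
from statistics import median

def flag_outliers(scores, threshold=3.5):
    """Return names whose modified z-score exceeds `threshold`.

    scores: dict mapping a gene (or taxon, or tree) name to a numeric value.
    The modified z-score is 0.6745 * (x - median) / MAD.
    """
    values = list(scores.values())
    centre = median(values)
    mad = median(abs(v - centre) for v in values)
    if mad == 0:
        raise ValueError("MAD is zero; the values are too uniform to score")
    return sorted(
        name for name, v in scores.items()
        if abs(0.6745 * (v - centre) / mad) > threshold
    )

# Hypothetical per-gene dN/dS values; gene07 sits far from the rest.
dnds = {
    "gene01": 0.10, "gene02": 0.12, "gene03": 0.09, "gene04": 0.11,
    "gene05": 0.13, "gene06": 0.10, "gene07": 0.95,
}
outliers = flag_outliers(dnds)
print(outliers)
```

With only a handful of genes such a flag means little; the point of the slide is that thousands of loci give the distribution enough context for an outlier to be interpretable.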
![Page 12: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/12.jpg)
Integration with ‘Omics
• Multiple databases
• Functional data
• Bibliographic information
![Page 13: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/13.jpg)
3. Example studies
![Page 14: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/14.jpg)
Tsagkogeorga et al. (in press)
![Page 15: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/15.jpg)
Salichos & Rokas (2013)
![Page 16: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/16.jpg)
Backström et al. (2013)
![Page 17: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/17.jpg)
Lindblad-Toh et al. (2011)
![Page 18: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/18.jpg)
4. Practice
![Page 19: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/19.jpg)
Source material
• Samples
• Storage
• Purification
• Library prep
![Page 20: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/20.jpg)
Sequencing
• Genome
– Sanger
– Illumina
– Pyro / 454
– SOLiD
– PacBio
• Transcriptome / RNA-seq
– MyBAITS
• HiSeq / MiSeq
• IonTorrent
![Page 21: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/21.jpg)
Infrastructure
• Desktop machines
• Computing clusters
• Grid systems
• Cloud-based computation
![Page 22: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/22.jpg)
Assembly, Annotation
• Assembly
– To reference (mapping)
– De novo
• Annotation
– By homology
– De novo
• SOAPdenovo
• MAKER
• Velvet
• Bowtie / Cufflinks / Tophat
• Trinity
![Page 23: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/23.jpg)
Alignment
• PRANK
• MUSCLE
• MAFFT
• Clustal
![Page 24: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/24.jpg)
Phylogeny inference
• MrBayes
• RAxML
• BEAST
• MP-EST
• STAR
![Page 25: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/25.jpg)
Phylogenetic analysis
• BEAST
• HYPHY
• PAML
• Pipelines
• LRT
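The LRT (likelihood-ratio test) listed above compares two nested models using their maximised log-likelihoods. The sketch below is a minimal standard-library version, assuming the usual chi-squared approximation and supporting only 1 or 2 degrees of freedom, where the tail probability has a simple closed form. The function name and the log-likelihood values are hypothetical and do not come from any of the listed programs.

```python
import math

def likelihood_ratio_test(lnl_null, lnl_alt, df=1):
    """Return (statistic, p_value) for nested models.

    statistic = 2 * (lnL_alt - lnL_null), compared with a chi-squared
    distribution with `df` degrees of freedom (only df = 1 or 2 here).
    """
    statistic = max(0.0, 2.0 * (lnl_alt - lnl_null))
    if df == 1:
        # Chi-squared survival function for one degree of freedom.
        p_value = math.erfc(math.sqrt(statistic / 2.0))
    elif df == 2:
        # Chi-squared survival function for two degrees of freedom.
        p_value = math.exp(-statistic / 2.0)
    else:
        raise ValueError("this sketch only handles df = 1 or df = 2")
    return statistic, p_value

# Hypothetical log-likelihoods for a null and an alternative model.
stat, p = likelihood_ratio_test(lnl_null=-1234.5, lnl_alt=-1230.2, df=1)
print(f"2*deltaLnL = {stat:.2f}, p = {p:.4f}")
```

For other degrees of freedom a statistics library would be the natural choice; the closed forms are used here only to keep the example free of dependencies.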
![Page 26: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/26.jpg)
5. Case study
![Page 27: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/27.jpg)
Parker et al. (2013)
• De novo genomes:
– four taxa
– 2,321 protein-coding loci
– 801,301 codons
• Published:
– 18 genomes
• ~69,000 simulated datasets
• ~3,500 cluster cores
![Page 28: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/28.jpg)
Our pipeline for detecting genome-wide convergence
![Page 29: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/29.jpg)
![Page 30: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/30.jpg)
![Page 31: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/31.jpg)
![Page 32: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/32.jpg)
![Page 33: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/33.jpg)
![Page 34: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/34.jpg)
![Page 35: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/35.jpg)
![Page 36: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/36.jpg)
mean = 0.05
![Page 37: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/37.jpg)
mean = 0.05 mean = -0.01 mean = -0.08
![Page 38: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/38.jpg)
Development cycle
Design
Wireframe & specify
tests
Implement
Alignment: loadSequences(), getSubstitutions()
Phylogeny: trimTaxa(), getMRCA()
DataSeries: calculateECDF(), randomise()
Regression: getResiduals(), predictInterval()
Review, refine & refactor
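The cycle above (specify tests, then implement, then refactor) can be illustrated with a small Python sketch modelled loosely on the DataSeries entry. It is not the lab's implementation: the class body, the snake_case method names and the test values are assumptions made for illustration, with only the names calculateECDF() and randomise() taken from the slide.

```python
import random
from bisect import bisect_right

class DataSeries:
    """A hypothetical container for a list of numeric observations."""

    def __init__(self, values):
        if not values:
            raise ValueError("a DataSeries needs at least one value")
        self.values = list(values)

    def calculate_ecdf(self, x):
        """Empirical CDF at x: the proportion of observations <= x."""
        ordered = sorted(self.values)
        return bisect_right(ordered, x) / len(ordered)

    def randomise(self, seed=None):
        """Return a new DataSeries with the same values in shuffled order."""
        rng = random.Random(seed)
        shuffled = self.values[:]
        rng.shuffle(shuffled)
        return DataSeries(shuffled)

# Tests of the kind that would be specified before implementing.
series = DataSeries([3, 1, 2, 2])
assert series.calculate_ecdf(0) == 0.0
assert series.calculate_ecdf(2) == 0.75
assert series.calculate_ecdf(3) == 1.0
assert sorted(series.randomise(seed=1).values) == [1, 2, 2, 3]
print("all DataSeries checks passed")
```

Writing the assertions first pins down edge behaviour (values below the minimum, ties) before any refactoring begins.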
![Page 39: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/39.jpg)
Parker et al. (2013)
![Page 40: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/40.jpg)
Parker et al. (2013)
![Page 41: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/41.jpg)
6. On the horizon
![Page 42: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/42.jpg)
Environmental metagenomics
![Page 43: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/43.jpg)
Models of computation
• Cloud resources: Unlimited flexibility, finite time
• Development trade-off
– Off-the-shelf
– Bespoke
• Exploratory work
– Real time genomic transects?
• Essential fundamental data missing from nearly every system:
– Diversity; structure; substitution rates; dN/dS; recombination; dispersal; lateral transfer
![Page 44: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/44.jpg)
Serialisation
• Process data remotely
• Freeze-dry objects, download to desktop
• Implement new methods directly on previously-analysed data
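The serialisation idea above can be sketched with Python's standard pickle module: results computed remotely are written to a file, copied to the desktop, restored, and then analysed with a method that did not exist when the data were first processed. The file name, the dictionary layout, the numbers and the summary function are all hypothetical.

```python
import pickle
import tempfile
from pathlib import Path

def freeze(obj, path):
    """Serialise ("freeze-dry") an object to a file."""
    with open(path, "wb") as handle:
        pickle.dump(obj, handle, protocol=pickle.HIGHEST_PROTOCOL)

def thaw(path):
    """Restore a previously serialised object. Only open trusted files."""
    with open(path, "rb") as handle:
        return pickle.load(handle)

# Hypothetical per-site log-likelihoods produced by an earlier remote run.
results = {
    "locus_A": {"lnl_h0": [-1.2, -0.8, -2.1], "lnl_h1": [-1.0, -0.9, -1.5]},
}

with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "results.pkl"
    freeze(results, archive)      # done on the cluster
    restored = thaw(archive)      # done later on the desktop

def mean_difference(locus):
    """A 'new method' applied to old data: mean per-site lnL difference."""
    pairs = zip(locus["lnl_h1"], locus["lnl_h0"])
    return sum(h1 - h0 for h1, h0 in pairs) / len(locus["lnl_h0"])

summary = mean_difference(restored["locus_A"])
print(round(summary, 4))
```

Pickle files can execute code when loaded, so this pattern suits files you produced yourself; a plain-text format such as JSON is the safer choice for sharing.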
![Page 45: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/45.jpg)
7. Over the horizon
• Real-time phylogenetics
• Field phylogenetics
• Alignment-free analyses
![Page 46: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/46.jpg)
Conclusions
• Why phylogenomics?
• Practice
• Comparative approach
• Statistical context
![Page 47: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/47.jpg)
Thanks
Steve Rossiter1, James Cotton2, Elia Stupka3 & Georgia Tsagkogeorga1
1School of Biological and Chemical Sciences, Queen Mary, University of London
2Wellcome Trust Sanger Institute
3Center for Translational Genomics and Bioinformatics, San Raffaele Institute, Milan
Chris Walker & Dan Traynor, Queen Mary GridPP High-throughput Cluster
Chaz Mein & Anna Terry, Barts and The London Genome Centre
Mahesh Pancholi, School of Biological and Chemical Sciences
BBSRC (UK); Queen Mary, University of London
![Page 48: High-throughut comparative genomics 24th October 2013 Joe Parker, Queen Mary University London.](https://reader036.fdocuments.us/reader036/viewer/2022062304/56649c9e5503460f9495dacc/html5/thumbnails/48.jpg)
Resources
• My email: Joe Parker (Queen Mary University of London): [email protected]
• Parker, J., Tsagkogeorga, G., Cotton, J.A., Liu, Y., Provero, P., Stupka, E. & Rossiter, S.J. (2013) Genome-wide signatures of convergent evolution in echolocating mammals. Nature 502(7470):228-231 doi:10.1038/nature12511.
• Tsagkogeorga, G., Parker, J., Stupka, E., Cotton, J.A., & Rossiter, S.J. (2013) Phylogenomic analyses elucidate evolutionary relationships of the bats (Chiroptera) Curr. Biol. in the press.
• Salichos, L. & Rokas, A. (2013) Inferring ancient divergences requires genes with strong phylogenetic signals. Nature 497:327-331. doi:10.1038/nature12130
• Backström, N., Zhang, Q. & Edwards, S.V. (2013) Evidence from a House Finch (Haemorhous mexicanus) Spleen Transcriptome for Adaptive Evolution and Biased Gene Conversion in Passerine Birds. MBE 30(5):1046-50. doi:10.1093/molbev/mst033
• Lindblad-Toh, K., Garber, M., Zuk, O., Lin, M.F., Parker, B.J., et al. (2011) A high-resolution map of human evolutionary constraint using 29 mammals. Nature 478:476–482 doi:10.1038/nature10530
• Degnan, J.H. & Rosenberg, N.A. (2009) Gene tree discordance, phylogenetic inference and the multispecies coalescent. TREE 24(6):332-340 doi:10.1016/j.tree.2009.01.009
• The Tree Of Life: http://phylogenomics.blogspot.co.uk/
• RNA-seq For Everyone: http://rnaseq.uoregon.edu/index.html
• Evo-Phylo: http://www.davelunt.net/evophylo/tag/phylogenomics/
• OpenHelix: http://blog.openhelix.eu/
• Our blogs: http://evolve.sbcs.qmul.ac.uk/rossiter/ (lab) and http://www.lonelyjoeparker.com/?cat=11 (Joe)