Brudno lab: A WHIRLWIND TOUR
Marc Fiume
Department of Computer Science
University of Toronto
Savant Genome Browser - http://compbio.cs.toronto.edu/savant/
Outline
1. What we do, our tools
2. Savant Genome Browser
WHAT WE DO
What we do
• main focus: genomic analysis using output from high-throughput sequencing (HTS) machines
  • high throughput: sequence billions of nucleotides per week
  • poor data quality: “reads” are shorter; error profiles are poorly understood
HTS Pipeline
What to do with all these reads?
1. Assembly
• reconstruct the donor’s genome
• “HapSembler”: specialized for highly polymorphic species
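To make the idea of reconstructing a genome from reads concrete, here is a minimal sketch of greedy overlap assembly on toy reads. It is an illustration only and is not HapSembler's algorithm; the function names, the `min_len` parameter, and the example reads are assumptions introduced here, and the sketch ignores sequencing errors, repeats, reverse complements, and polymorphism.

```python
from typing import List


def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: List[str], min_len: int = 3) -> List[str]:
    """Repeatedly merge the pair of sequences with the longest overlap.

    Hypothetical toy assembler: stops when no pair overlaps by at least
    `min_len` bases and returns the remaining contigs.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            return contigs
        # Merge: keep all of the left sequence, append the non-overlapping tail.
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)


if __name__ == "__main__":
    print(greedy_assemble(["ACGTAC", "TACGGA", "GGATTC"]))
```

Real assemblers have to cope with errors and heterozygous sites, which is why a specialized tool is needed for highly polymorphic species.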
2. Alignment
• find the region in a “reference” genome that closely matches each read; a close match suggests the read originated from the corresponding region of the “donor” genome
• “SHRiMP”: Short Read Mapping Package
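As a rough illustration of what "closely matches" can mean, the sketch below slides a read along a reference and reports the position with the fewest mismatches. It is a brute-force toy and not how SHRiMP works; the function names, the `max_mismatches` threshold, and the sequences are assumptions made for this example, and gaps are not handled.

```python
from typing import Optional, Tuple


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def map_read(read: str, reference: str, max_mismatches: int = 2) -> Optional[Tuple[int, int]]:
    """Return (position, mismatches) of the best ungapped placement of `read`.

    Returns None if every placement has more than `max_mismatches`
    mismatches. Ties are resolved in favour of the leftmost position.
    """
    best = None
    for pos in range(len(reference) - len(read) + 1):
        d = hamming(read, reference[pos:pos + len(read)])
        if d <= max_mismatches and (best is None or d < best[1]):
            best = (pos, d)
    return best


if __name__ == "__main__":
    reference = "GATTACAGATTTCA"
    print(map_read("ACAGTT", reference))  # best placement and its mismatch count
    print(map_read("CCCCCC", reference))  # no placement within the threshold
```

This scan costs time proportional to reference length times read length per read, which is impractical for billions of bases; practical mappers rely on indexing and seeding instead.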
3. Genetic Variation Discovery
• find differences between two genomes
  • between donor and reference
  • between two samples (e.g. tumour vs. normal)
• “VARiD”, “MODiL”, and “CNVer”
Genetic Variation
• Single Nucleotide Polymorphism (SNP): genomes have different nucleotides at corresponding positions
  • VARiD – VARiation IDentification
• Insertions and Deletions (Indels): genomes have additional sequence put in or sequence taken out at corresponding locations
  • MODiL – Mixtures of Distributions Indel Locator
• Copy Number Variation (CNV): genomes have different numbers of copies of the same sequence
  • CNVer
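The first two definitions above can be illustrated on a toy gapped alignment of a reference and a donor sequence. The sketch below is only a classification of columns in an alignment that is already given; it is not the method used by VARiD or MODiL, and the function name, the tuple layout, and the example sequences are assumptions. Copy number variation is not covered, since detecting it needs read-depth information rather than a single pairwise alignment.

```python
from typing import List, Tuple


def classify_differences(ref_aln: str, donor_aln: str) -> List[Tuple[str, int, str]]:
    """List SNPs and indels between two rows of a pairwise alignment.

    Both rows must have the same length and use '-' for gaps. Each result is
    (kind, ref_pos, detail), where ref_pos is the 0-based position in the
    ungapped reference at which the variant starts (for an insertion, the
    position before which the extra donor bases sit).
    """
    if len(ref_aln) != len(donor_aln):
        raise ValueError("alignment rows must have equal length")
    if any(r == "-" and d == "-" for r, d in zip(ref_aln, donor_aln)):
        raise ValueError("a column cannot be a gap in both rows")

    variants = []
    ref_pos = 0  # position in the ungapped reference
    i, n = 0, len(ref_aln)
    while i < n:
        r, d = ref_aln[i], donor_aln[i]
        if r == "-":  # donor has extra sequence: insertion
            j = i
            while j < n and ref_aln[j] == "-":
                j += 1
            variants.append(("insertion", ref_pos, donor_aln[i:j]))
            i = j
        elif d == "-":  # donor is missing sequence: deletion
            j = i
            while j < n and donor_aln[j] == "-":
                j += 1
            variants.append(("deletion", ref_pos, ref_aln[i:j]))
            ref_pos += j - i
            i = j
        else:
            if r != d:  # same position, different nucleotide: SNP
                variants.append(("snp", ref_pos, f"{r}>{d}"))
            ref_pos += 1
            i += 1
    return variants


if __name__ == "__main__":
    for variant in classify_differences("ACGT-ACGTAC", "ACCTTAC--AC"):
        print(variant)
```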
Our Bioinformatics Tools
• READ MAPPING (SHRiMP)
• SNP DETECTION (VARiD)
• INDEL DETECTION (MODiL)
• CNV DETECTION (CNVer)
• ASSEMBLY (HapSembler)
• VISUALIZATION (SAVANT)
• COMPRESSION
SAVANT GENOME BROWSER
Genome Browsing, the old way
Challenge presented by HTS datasets
• genomic data is generated in high volumes
  • HTS machines generate billions of bases per run
• interpretation and analysis challenge
  • typical pipeline employs many separate tools for computation and visualization
Tools for HTS data analysis

Tool                                                                                   Cost   Computation   Visualization
Read Alignment (e.g. Bowtie, BWA)                                                      Free   Y             N
File Format Conversion (e.g. Galaxy, SAMtools)                                         Free   Y             N
Other Command-line Tools (e.g. Genetic Variation Discovery, Comparative Genomics)      Free   Y             N
UCSC Genome Browser                                                                    Free   N             Y
Integrative Genomics Viewer                                                            Free   N             Y
GBrowse                                                                                Free   N             Y
CLC Genomics Workbench                                                                 $$$    Y             Y

• substantial disconnect between the processes of computational analysis and visualization
Tools for Genomic Data Analysis

Tool                                                                                   Cost   Computation   Visualization
Read Alignment (e.g. Bowtie, BWA)                                                      Free   Y             N
File Format Conversion (e.g. Galaxy, SAMtools)                                         Free   Y             N
Other Command-line Tools (e.g. Genetic Variation Discovery, Comparative Genomics)      Free   Y             N
UCSC Genome Browser                                                                    Free   N             Y
Integrative Genomics Viewer                                                            Free   N             Y
GBrowse                                                                                Free   N             Y
CLC Genomics Workbench                                                                 $$$    Y             Y
Savant Genome Browser                                                                  Free   Y             Y

• substantial disconnect between the processes of computational analysis and visualization
ASIDE: Cytoscape?
• platform for visual analysis of networks
• extensive plugin framework
Bader Lab
Savant Genome Browser
• platform for integrated visual analysis of genomic data
• feature-rich genome browser
• computationally extensible via plugin framework
FEATURE DEMONSTRATION
• INTERFACE
• HTS READ ALIGNMENTS
• EXAMPLE PLUGIN: SNP FINDER
Power of visual analytics
• task: find the correct parameter for a command-line tool
Plugin Framework
• unlocks the potential for performing visual analytics
• beneficial for both users and tool developers
• tool developers: simple platform for development and dissemination of work
  • plugin development is easy
  • API contains over a hundred prebuilt functions (e.g. get track data, add bookmarks, draw custom graphics, etc.)
CONCLUSIONS
Conclusions
• Savant is a platform for integrated visualization and analysis of genomic data
  • stand-alone genome browser
  • novel features: e.g. table view, visualization modes, data selection, etc.
  • computationally extensible through plugin framework
• makes interpretation and analysis of genomic data easier and more efficient
Acknowledgements
Recep, Andrew, Vlad, Mike Brudno, Yue, Marc, Vanessa, Orion, Joe, Nilgun, Paul, Vera, Misko, Yoni
Thanks!