Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly...
![Page 1: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/1.jpg)
Next Generation Sequencing and its data analysis challenges
Background
Alignment and Assembly
Applications
Genome
Epigenome
Transcriptome
![Page 2: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/2.jpg)
References
Cell 2013, 155:27.
Cell 2013, 155:39.
Annu. Rev. Plant Biol. 2009, 60:305.
Annu. Rev. Genomics Hum. Genet. 2009, 10:135.
Curr. Opin. Biotechnology, 24:22.
Nat. Biotech. 2009, 25:195.
Nat. Methods 2009, 6:S6.
Nat. Rev. Genet. 2009, 10:669.
Nat. Rev. Genet. 2010, 11:31.
Genomics 2010, 95:315.
This lecture is about the opportunities and challenges, not detailed statistical techniques. The materials are taken from some review articles.
![Page 3: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/3.jpg)
Background
“Method of the Year” 2007 by Nature Methods.
The names:
“Next-generation sequencing”
“Deep sequencing”
“High-throughput sequencing”
“Second-generation sequencing”
The key characteristics:
Massively parallel sequencing: the amount of data from a single run ~ the amount of data from the Human Genome Project.
The reads are short: ~ a few hundred bases per read.
![Page 4: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/4.jpg)
Background
Potential impact:
The “$1000 genome” will become reality very soon
Genome sequencing will become a regular medical procedure.
Personalized medicine
Predictive medicine
Ethical issues
For statisticians:
Data mining using hundreds of thousands of genomes
Finding rare SNPs/mutations associated with diseases
New methods to analyze epigenomics/transcriptomics data
Finding interventions to improve quality of life
![Page 5: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/5.jpg)
Background
The companies use different techniques. We use Illumina’s as an example. (http://seqanswers.com/forums/showthread.php?t=21)
![Page 6: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/6.jpg)
Background
![Page 7: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/7.jpg)
Background
![Page 8: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/8.jpg)
Background
![Page 9: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/9.jpg)
Background
An incomplete list of some common platforms.
BMC Genomics 2012, 13:341
![Page 10: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/10.jpg)
Background
![Page 11: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/11.jpg)
Background
Advantages:
Fast and cost-effective.
No need to clone DNA fragments.
Drawbacks:
Short read length (platform dependent).
Some platforms have trouble with identical repeats.
Non-uniform confidence in base calling within reads. Data are less reliable near the 3’ end of each read.
![Page 12: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/12.jpg)
Background
What deep sequencing can do:
![Page 13: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/13.jpg)
Background
Nat Methods. 2009 Nov;6(11 Suppl):S2-5.
![Page 14: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/14.jpg)
Sequence the genome of a person? --- Alignment
Can rely on the existing human genome as a blueprint.
Align the short reads onto the existing human genome.
A few-fold coverage is needed to cover most regions.
Sequence a whole new genome? --- Assembly
Overlaps are required to construct the genome.
The reads are short, so ~30-fold coverage is needed.
At 3 Gb of data per run, about 30 runs are needed for a new genome similar in size to the human genome.
Alignment and Assembly
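The coverage arithmetic on this slide can be checked with a short calculation. This is a minimal sketch, assuming a genome of about 3 x 10^9 bases and reading "3G data per run" as 3 x 10^9 bases per run; the function names are illustrative. The second function uses the standard Poisson approximation (reads placed uniformly at random), which is an added assumption and not taken from the slides. It gives a rough sense of why a few-fold coverage already covers most regions.

```python
import math

def runs_needed(genome_size_bp, fold_coverage, bases_per_run):
    """Number of sequencing runs needed to reach a target fold coverage."""
    return math.ceil(genome_size_bp * fold_coverage / bases_per_run)

def expected_fraction_covered(fold_coverage):
    """Expected fraction of bases hit by at least one read.

    Assumes reads land uniformly at random, so the number of reads
    covering a base is approximately Poisson with mean fold_coverage.
    """
    return 1.0 - math.exp(-fold_coverage)

genome_size = 3_000_000_000    # assumed: roughly human-sized genome
bases_per_run = 3_000_000_000  # assumed reading of "3G data per run"

print(runs_needed(genome_size, 30, bases_per_run))  # 30
print(round(expected_fraction_covered(3), 3))       # 0.95
```

Under these assumptions, 3-fold coverage leaves about 5% of bases uncovered, while de novo assembly needs much deeper coverage because neighbouring reads must also overlap.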
![Page 15: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/15.jpg)
Alignment and Assembly
Hash table-based alignment. Similar to BLAST in principle.
(1) Find potential locations:
(2) Local alignment.
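The two steps above can be sketched in a few lines. This is a simplified illustration, not the method of any particular aligner: the reference k-mers are stored in a hash table, read k-mers are looked up to find candidate locations, and each candidate is then checked. For brevity the check here is an ungapped mismatch count rather than a full local alignment; the function names, `k`, and `max_mismatches` are all assumptions made for this example.

```python
from collections import defaultdict

def build_index(reference, k):
    """Hash table mapping every k-mer in the reference to its start positions."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index

def align_read(read, reference, index, k, max_mismatches=2):
    """Return (position, mismatches) of the best ungapped hit, or None."""
    candidates = set()
    # Step 1: every k-mer of the read is a seed; look up where it occurs.
    for offset in range(len(read) - k + 1):
        for pos in index.get(read[offset:offset + k], []):
            start = pos - offset
            if 0 <= start <= len(reference) - len(read):
                candidates.add(start)
    # Step 2: verify each candidate location by counting mismatches.
    best = None
    for start in sorted(candidates):
        window = reference[start:start + len(read)]
        mismatches = sum(a != b for a, b in zip(read, window))
        if mismatches <= max_mismatches and (best is None or mismatches < best[1]):
            best = (start, mismatches)
    return best

reference = "ACGTTGCAAGGCTTACCGATAGGCTA"  # toy reference sequence
index = build_index(reference, k=4)
print(align_read("GGCTTACCGA", reference, index, k=4))  # (9, 0)
print(align_read("GGCTTTCCGA", reference, index, k=4))  # (9, 1)
```

The second read carries one substitution, yet it is still placed correctly because other seeds in the read match the reference exactly.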
![Page 16: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/16.jpg)
Alignment and Assembly
From read to graph:
![Page 17: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/17.jpg)
Alignment and Assembly
![Page 18: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/18.jpg)
Alignment and Assembly
de Bruijn graph assembly
Red: read error.
![Page 19: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/19.jpg)
Alignment and Assembly
de Bruijn graph assembly
![Page 20: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/20.jpg)
Alignment and Assembly
de Bruijn graph assembly
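As a rough illustration of de Bruijn graph assembly, the sketch below builds a graph whose nodes are (k-1)-mers and whose edges are the k-mers seen in the reads, then spells out a contig by following edges while the path is unambiguous. The reads, the value of `k`, and the function names are made up for this example. Real assemblers add error correction and graph simplification that are omitted here.

```python
from collections import defaultdict

def de_bruijn_graph(reads, k):
    """Nodes are (k-1)-mers; each distinct k-mer adds an edge prefix -> suffix."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph

def walk_unambiguous_path(graph, start):
    """Extend a contig from start while each node has exactly one successor."""
    contig, node, seen = start, start, {start}
    while len(graph.get(node, ())) == 1:
        (nxt,) = graph[node]
        if nxt in seen:  # stop instead of looping around a cycle
            break
        contig += nxt[-1]
        seen.add(nxt)
        node = nxt
    return contig

# Toy overlapping reads drawn from the sequence ATGGCGTGCAAT.
reads = ["ATGGCGT", "GGCGTGC", "CGTGCAAT"]
graph = de_bruijn_graph(reads, k=4)
print(walk_unambiguous_path(graph, "ATG"))  # ATGGCGTGCAAT

# A read with a single-base error (T -> A) adds a branch at node GCG.
noisy_graph = de_bruijn_graph(reads + ["GGCGAGC"], k=4)
print(walk_unambiguous_path(noisy_graph, "ATG"))  # ATGGCG
```

The second call shows why read errors matter: one erroneous read creates a branch in the graph, and the unambiguous contig stops there.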
![Page 21: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/21.jpg)
Whole genome/exome/transcriptome sequencing
![Page 22: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/22.jpg)
Genomics
Whole genome sequencing detects all variants (SNP alleles, rare variants, mutations)
Could be associated with disease:
Rare variants (burden testing by collapsing by gene)
De novo mutations (need family tree)
Rare Mendelian disorders
Structural variants in cancer
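To make "burden testing by collapsing by gene" concrete, here is one simple version of the idea. Each individual is reduced to a single indicator (carries at least one rare variant in the gene or not), and carrier counts in cases and controls are compared with a one-sided Fisher exact test. This is only a sketch: the slides do not say which test is used, and the genotype encoding, variant names, and function names below are hypothetical.

```python
from math import comb

def collapse_by_gene(genotypes, rare_variants_in_gene):
    """Per-individual indicator: carries at least one rare variant in the gene."""
    return [any(g.get(v, 0) > 0 for v in rare_variants_in_gene) for g in genotypes]

def fisher_one_sided(case_carriers, n_cases, control_carriers, n_controls):
    """P(at least case_carriers carriers among cases) under the hypergeometric null."""
    total = n_cases + n_controls
    carriers = case_carriers + control_carriers
    upper = min(carriers, n_cases)
    tail = sum(comb(carriers, x) * comb(total - carriers, n_cases - x)
               for x in range(case_carriers, upper + 1))
    return tail / comb(total, n_cases)

# Hypothetical data: each dict maps variant id -> alternate allele count.
gene_variants = ["v1", "v2", "v3", "v4"]
cases = [{"v1": 1}, {"v2": 1}, {"v3": 1}, {"v1": 1, "v4": 1}, {}]
controls = [{"v2": 1}, {}, {}, {}, {"v9": 2}]  # v9 lies outside the gene

case_carriers = sum(collapse_by_gene(cases, gene_variants))
control_carriers = sum(collapse_by_gene(controls, gene_variants))
p_value = fisher_one_sided(case_carriers, len(cases), control_carriers, len(controls))
print(case_carriers, control_carriers, round(p_value, 4))  # 4 1 0.1032
```

Collapsing trades resolution for power: no single rare variant appears often enough to test on its own, but the gene-level carrier count can be tested.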
![Page 23: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/23.jpg)
Medical Genomics
Nature Reviews Genetics 11, 415
Example: Extreme-case sequencing to find rare variants associated with a disease.
![Page 24: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/24.jpg)
Medical Genomics
Example: Cancer genome
![Page 25: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/25.jpg)
Epigenomics
http://www.roadmapepigenomics.org/
![Page 26: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/26.jpg)
ChIP-Seq
Purpose: identify which parts of the DNA sequence bind to a given protein.
Transcription factor (Regulome)
Modified histone (Epigenome)
![Page 27: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/27.jpg)
Overall ChIP-Seq workflow
ChIP-Seq
![Page 28: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/28.jpg)
Before deep sequencing, the same information was obtained by using arrays in place of sequencing.
ChIP-Seq
![Page 29: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/29.jpg)
ChIP-Seq
![Page 30: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/30.jpg)
Different kinds of profiles in different applications.
Elongation
Silencing
ChIP-Seq
![Page 31: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/31.jpg)
Example of active gene chromatin pattern found by ChIP-Seq.
Initiation site
Elongation
ChIP-Seq
![Page 32: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/32.jpg)
RNA-Seq
![Page 33: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/33.jpg)
RNA-Seq
![Page 34: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/34.jpg)
Deep sequencing provides more information about each mRNA
RNA-Seq
![Page 35: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/35.jpg)
Finding novel exons.
Splicing? (Short read length could be an issue.)
RNA-Seq
![Page 36: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/36.jpg)
Gene expression profiling – to replace arrays?
Exon-specific abundance.
RNA-Seq
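One common way to turn read counts into a gene-level or exon-level abundance is RPKM (reads per kilobase of feature per million mapped reads). The slides do not name a specific measure, so this is offered only as an illustrative sketch with made-up numbers; the function name is an assumption.

```python
def rpkm(reads_on_feature, feature_length_bp, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads."""
    return 1e9 * reads_on_feature / (feature_length_bp * total_mapped_reads)

# Hypothetical exon: 500 reads on a 2,000 bp exon, 10 million mapped reads.
print(rpkm(500, 2_000, 10_000_000))  # 25.0
```

Dividing by feature length and by sequencing depth makes values comparable across exons of different sizes and across runs of different depth.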
![Page 37: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/37.jpg)
Sequencing small RNA.
RNA-Seq
![Page 38: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/38.jpg)
Quantification of miRNA and de novo detection of miRNAs
MicroRNA: 21-23 nucleotides in length.
Regulate gene expression by complementary binding.
Derived from non-coding RNAs that form stem-loop structures.
RNA-Seq
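"Complementary binding" can be illustrated by searching a target RNA for the reverse complement of part of a miRNA. The sketch below uses nucleotides 2-8 of the miRNA (often called the seed region) and reports exact matches only. The sequences are toy examples, the function names are assumptions, and real target prediction uses more than exact seed matching.

```python
def reverse_complement_rna(seq):
    """Reverse complement of an RNA sequence (A-U and G-C pairing only)."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))

def seed_match_sites(mirna, target, seed_start=1, seed_end=8):
    """0-based positions in target that pair exactly with miRNA nucleotides 2-8."""
    seed = mirna[seed_start:seed_end]    # 0-based slice = nucleotides 2..8
    site = reverse_complement_rna(seed)  # sequence expected on the target
    return [i for i in range(len(target) - len(site) + 1)
            if target[i:i + len(site)] == site]

mirna = "UGAGGUAGUAGGUUGUAUAGUU"  # toy 22-nt miRNA sequence
target = "AAGCUACCUCAGG"          # toy fragment of a target mRNA
print(seed_match_sites(mirna, target))  # [3]
```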
![Page 39: Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.](https://reader035.fdocuments.us/reader035/viewer/2022062719/56649ec65503460f94bd15a4/html5/thumbnails/39.jpg)
Directly probe mRNA targets of miRNA.
RNA-Seq