Registration Page · 2012. 12. 6. · Next Generation Sequencing Data Analysis · Lynn Young, Ph.D. (lynny@mail.nih.gov) · NIH Library Bioinformatics Support Program
Next Generation Sequencing Data Analysis An ORS Service
Registration Page
Next Generation Sequencing Data Analysis
Lynn Young, Ph.D.
NIH Library Bioinformatics Support Program
An ORS Service
20 September 2012
Acknowledgement
This training uses cloud services provided by an “AWS in Education” grant to the Galaxy Project.
Introduction
http://en.wikipedia.org/wiki/File:DNA_Sequencing_gDNA_libraries.jpg
Objectives
■ Sequence quality
■ Mapping
■ Mapping quality
■ Variant analysis
■ Biological context
http://en.wikipedia.org/wiki/File:DNA_Sequencing_gDNA_libraries.jpg
Data Analysis Workflow
Diagram labels: Reads, Ref, QC, Trim, Map, Variant detection
Reference – FASTA format
>gi|206583719|gb|CM000511.1| Homo sapiens chromosome 21, whole genome shotgun sequence
ATTCATTCCATTCCACTGCACTCCAATCTTCACATAAAATGTAGACAGAAGCTTTCTGAGAAACTTTTCT
CTGATGTGTGCATTCATCTCACAGATGTGAACCATTCTTTTGTTTGAGCAGTTTGGTAACATTCTTTTTG
TAGAATCTGCAAAAGGATATTTGTGAGCACTTTGAAGCCTATGGTGAAAAAGGAAATATCTTCAGAGAAA
AACTAGAAAGAAGGTTTCTGAGAAACTGCTTTGTCATGTGTGAATTAGTCTCACAGATTTGAACCTTTCT
GTTGATTGAACATATTGGAAACCTTCTTTTTGTAGAATCTGCAAAGGGATATTTGTGAACACTTGGAGGC
CAATGGTGAAAAAGGAAATATATTCACATGAAAACTAGACAGAATCTTTCTGAGACACTTCTGTGTTTGG
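As a minimal sketch of how a FASTA file like the one above can be read, the snippet below joins the wrapped sequence lines under each `>` header. The function name `parse_fasta` is illustrative and not part of the class material, and only the first two sequence lines from the slide are included to keep the example short.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Return {header: sequence}, joining the wrapped sequence lines."""
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]  # drop the leading '>'
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


# Header and first two sequence lines copied from the slide.
fasta_text = """>gi|206583719|gb|CM000511.1| Homo sapiens chromosome 21, whole genome shotgun sequence
ATTCATTCCATTCCACTGCACTCCAATCTTCACATAAAATGTAGACAGAAGCTTTCTGAGAAACTTTTCT
CTGATGTGTGCATTCATCTCACAGATGTGAACCATTCTTTTGTTTGAGCAGTTTGGTAACATTCTTTTTG
"""

records = parse_fasta(fasta_text.splitlines())
header, seq = next(iter(records.items()))
# In this header style the accession is the fourth '|'-separated field.
accession = header.split("|")[3]
print(accession, len(seq))
```

A dictionary keyed by header is enough for a single chromosome; a whole genome would normally be streamed rather than held in memory.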
Reads – FASTQ Format
@SRR016862.16884
ATTTTGAGTGGTACATCTAGGTAGCCGTTTTTGGAAACGGG
+
IIIIII,IIIII?III?I&II9$H+/I>IA%1.$,$%$#$F
@SRR016862.58801
ATTTTGAGTGGTACATCTAGGTAGCCGTTTTTGAAACCAGG
+
IIIIIIIIIIIIIIIIII9III0II4.II@&?6&$&#%'@.
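Each FASTQ record is four lines: an `@` identifier, the bases, a `+` separator, and one quality character per base. The sketch below reads the two 41-base reads shown above and converts the quality characters to Phred scores. It assumes the Sanger (Phred+33) encoding, which the slides do not state; the function names are illustrative.

```python
from typing import Iterator, List, Tuple

# The two reads shown on the slide, four lines per record.
FASTQ_LINES = [
    "@SRR016862.16884",
    "ATTTTGAGTGGTACATCTAGGTAGCCGTTTTTGGAAACGGG",
    "+",
    "IIIIII,IIIII?III?I&II9$H+/I>IA%1.$,$%$#$F",
    "@SRR016862.58801",
    "ATTTTGAGTGGTACATCTAGGTAGCCGTTTTTGAAACCAGG",
    "+",
    "IIIIIIIIIIIIIIIIII9III0II4.II@&?6&$&#%'@.",
]


def read_fastq(lines: List[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from 4-line FASTQ records."""
    # A fixed stride of four avoids misreading a quality line that
    # happens to start with '@'.
    for i in range(0, len(lines) - 3, 4):
        header, seq, plus, qual = (s.strip() for s in lines[i:i + 4])
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record at line {i + 1}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header}")
        yield header[1:], seq, qual


def phred33_scores(quality: str) -> List[int]:
    """Decode quality characters, assuming the Phred+33 offset."""
    return [ord(ch) - 33 for ch in quality]


records = list(read_fastq(FASTQ_LINES))
for read_id, seq, qual in records:
    scores = phred33_scores(qual)
    print(read_id, len(seq), min(scores), round(sum(scores) / len(scores), 1))
```

Under this assumption `I` decodes to 40 and `#` to 2, so both reads start with high-quality bases and degrade toward the 3' end.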
Alignments – SAM Format
http://samtools.sourceforge.net/SAM1.pdf
http://bio-bwa.sourceforge.net/bwa.shtml#4
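The SAM specification linked above defines eleven mandatory tab-separated fields per alignment line (QNAME, FLAG, RNAME, POS, MAPQ, CIGAR, RNEXT, PNEXT, TLEN, SEQ, QUAL). The following sketch splits one line into those fields and tests two FLAG bits. The alignment line itself is hypothetical: the reference name `chr21`, the position, and the mapping quality are invented for illustration and are not results from the class.

```python
from typing import NamedTuple


class SamRecord(NamedTuple):
    qname: str
    flag: int
    rname: str
    pos: int
    mapq: int
    cigar: str
    rnext: str
    pnext: int
    tlen: int
    seq: str
    qual: str

    @property
    def is_unmapped(self) -> bool:
        return bool(self.flag & 0x4)  # bit 0x4: segment unmapped

    @property
    def is_reverse(self) -> bool:
        return bool(self.flag & 0x10)  # bit 0x10: reverse strand


def parse_sam_line(line: str) -> SamRecord:
    """Parse the 11 mandatory fields; optional tags are ignored."""
    f = line.rstrip("\n").split("\t")
    if len(f) < 11:
        raise ValueError("a SAM alignment line needs 11 mandatory fields")
    return SamRecord(f[0], int(f[1]), f[2], int(f[3]), int(f[4]),
                     f[5], f[6], int(f[7]), int(f[8]), f[9], f[10])


# Hypothetical alignment of the first read from the FASTQ slide.
example = "\t".join([
    "SRR016862.16884", "0", "chr21", "1000000", "37", "41M", "*", "0", "0",
    "ATTTTGAGTGGTACATCTAGGTAGCCGTTTTTGGAAACGGG",
    "IIIIII,IIIII?III?I&II9$H+/I>IA%1.$,$%$#$F",
])

rec = parse_sam_line(example)
strand = "reverse" if rec.is_reverse else "forward"
print(rec.qname, rec.rname, rec.pos, rec.mapq, rec.cigar, strand)
```

Header lines starting with `@` would need to be skipped before calling `parse_sam_line` on a real file.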
Variant Calls – VCF Format
http://www.1000genomes.org/wiki/Analysis/Variant%20Call%20Format/vcf-variant-call-format-version-41
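A VCF 4.1 data line has eight fixed tab-separated columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO), preceded by `##` meta lines and a `#CHROM` column header. The sketch below parses those fixed columns. The record is hypothetical, with invented position, quality, and INFO values, and the names `VcfRecord` and `parse_vcf_line` are illustrative.

```python
from typing import Dict, List, NamedTuple, Optional, Union


class VcfRecord(NamedTuple):
    chrom: str
    pos: int
    id: str
    ref: str
    alt: List[str]
    qual: Optional[float]
    filter: str
    info: Dict[str, Union[str, bool]]


def parse_info(text: str) -> Dict[str, Union[str, bool]]:
    """Split 'K=V;FLAG' pairs; keys without a value become True."""
    info: Dict[str, Union[str, bool]] = {}
    if text == ".":
        return info
    for item in text.split(";"):
        key, sep, value = item.partition("=")
        info[key] = value if sep else True
    return info


def parse_vcf_line(line: str) -> VcfRecord:
    """Parse the eight fixed columns of one data line."""
    f = line.rstrip("\n").split("\t")
    if len(f) < 8:
        raise ValueError("a VCF data line needs 8 fixed columns")
    qual = None if f[5] == "." else float(f[5])
    return VcfRecord(f[0], int(f[1]), f[2], f[3], f[4].split(","),
                     qual, f[6], parse_info(f[7]))


# Hypothetical file content; the values are made up for illustration.
vcf_lines = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "21\t9411239\t.\tG\tA\t48.5\t.\tDP=14;AF=0.5;DB",
]

variants = [parse_vcf_line(ln) for ln in vcf_lines if not ln.startswith("#")]
for v in variants:
    print(v.chrom, v.pos, v.ref, ">", ",".join(v.alt), v.qual, v.info)
```

Per-sample genotype columns (FORMAT onward) are left out here to keep the sketch focused on the fixed fields.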
VCF Format - Data
Data – Sequence Read Archive
http://trace.ncbi.nlm.nih.gov/Traces/sra/?study=SRP000535
Galaxy
■ Public
➤ usegalaxy.org
■ 20 September 2012 class
➤ cloud1.galaxyproject.org
➤ cloud2.galaxyproject.org
Galaxy Account Registration
Galaxy Account Registration
Galaxy Login if You Already Have an Account
Galaxy Login
Galaxy – Shared Data
Galaxy – Obtain Shared Data
Galaxy – Obtain Shared Data for Input Datasets
Steps 1–2 (marked on the screenshot)
Galaxy Data Analysis Workflow - Details
Galaxy – Analyze Data
Galaxy Input Dataset – View Sequence Reads
Galaxy Input Dataset – View Reference Sequence
Screenshot callouts: "Reference Sequence", "For next slide"
Galaxy – FASTQ Groomer
Steps 1–3 (marked on the screenshot)
Repeat steps for the other two FASTQ files
Galaxy – FastQC
Steps 1–4 (marked on the screenshot)
Galaxy – FastQC Results
Steps 1–2 (marked on the screenshot), plus a "For next slide" callout
Galaxy Mapping – Burrows-Wheeler Aligner (BWA)
Steps 1–4 (marked on the screenshot)
Repeat steps for the other two Groomed files
Galaxy – View BWA Results
Steps 1–2 (marked on the screenshot), plus a "For next slide" callout
Galaxy – SAM to BAM
Steps 1–4 (marked on the screenshot)
Repeat steps for the other two SAM files
Galaxy – Navigation to Picard Alignment Summary Metrics
Steps 1–2 (marked on the screenshot)
Galaxy – Picard Alignment Summary Metrics
Steps 1–6 (marked on the screenshot)
Step 5: uncheck the box
Galaxy – Results of Picard Alignment Summary Metrics
Key: http://picard.sourceforge.net/picard-metric-definitions.shtml
Steps 1–2 (marked on the screenshot), plus a "For next slide" callout
Galaxy Variant Detection – Preparation: Merging BAM Files
Steps 1–6 (marked on the screenshot)
Galaxy Variant Detection – Preparation: Merging BAM Files
Steps 1–2 (marked on the screenshot)
Galaxy Variant Detection – FreeBayes
Steps 1–6 (marked on the screenshot)
Galaxy Variant Detection – FreeBayes Results
Steps 1–2 (marked on the screenshot), plus a "For next slide" callout
Galaxy – Filter and Sort
Steps 1–5 (marked on the screenshot)
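The filter and sort settings used in class appear only on the screenshot, so the sketch below is an assumption rather than a record of what was run. It shows one plausible version of the idea: keep VCF data lines whose QUAL column meets a threshold, then sort them by QUAL in descending order. The threshold of 30 and the three variant lines are hypothetical.

```python
from typing import List


def filter_and_sort(lines: List[str], min_qual: float) -> List[str]:
    """Keep header lines, drop low-QUAL records, sort the rest by QUAL."""
    header = [ln for ln in lines if ln.startswith("#")]
    data = []
    for ln in lines:
        if ln.startswith("#") or not ln.strip():
            continue
        qual = ln.split("\t")[5]  # QUAL is the sixth VCF column
        if qual != "." and float(qual) >= min_qual:
            data.append(ln)
    data.sort(key=lambda ln: float(ln.split("\t")[5]), reverse=True)
    return header + data


# Hypothetical records; positions and qualities are invented.
vcf_lines = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "21\t9411239\t.\tG\tA\t48.5\t.\tDP=14",
    "21\t9411410\t.\tC\tT\t12.0\t.\tDP=3",
    "21\t9412100\t.\tT\tC\t95.1\t.\tDP=22",
]

result = filter_and_sort(vcf_lines, min_qual=30.0)
for ln in result:
    print(ln)
```

Sorting by position instead of quality only requires changing the sort key to the integer in column two.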
Galaxy – Filter and Sort Results
For the next slide, open a new tab
Biological Context – UCSC Genome Browser
http://genome.ucsc.edu
Steps 1–5 (marked on the screenshot)
UCSC Genome Browser – OMIM Genes, OMIM AV SNPs
Steps 1–4 (marked on the screenshot)
UCSC Genome Browser – Results
Galaxy – Exporting Data: Download VCF File
Steps 1–4 (marked on the screenshot)
Galaxy – Exporting Data: Download BAM Files
Steps 1–5 (marked on the screenshot)
Repeat steps for the other two BAM files
Galaxy – Exporting Data: Download BAI Files
Steps 1–5 (marked on the screenshot)
Repeat steps for the other two BAM files
Thank you for attending.