AGBT2017 Reference Workshop: Fulton
![Page 1: AGBT2017 Reference Workshop: Fulton](https://reader034.fdocuments.us/reader034/viewer/2022042907/58cf44911a28ab254a8b6165/html5/thumbnails/1.jpg)
Laboratory Aspects of Generating High Quality Assemblies
MGI Reference Genomes Workshop
Bob Fulton
February 13th 2017
![Page 2: AGBT2017 Reference Workshop: Fulton](https://reader034.fdocuments.us/reader034/viewer/2022042907/58cf44911a28ab254a8b6165/html5/thumbnails/2.jpg)
Primary Objectives
• Develop Tools and Techniques to Provide High Quality, Haplo-resolved Genome Assemblies Sampling and Capturing as Much Human Diversity as Possible
![Page 3: AGBT2017 Reference Workshop: Fulton](https://reader034.fdocuments.us/reader034/viewer/2022042907/58cf44911a28ab254a8b6165/html5/thumbnails/3.jpg)
Sequencing Strategy for Reference Genomes
• PacBio Large Insert Library Construction
• Linked Reads with 10X Genomics
• Validation Using BioNano Physical Map
![Page 4: AGBT2017 Reference Workshop: Fulton](https://reader034.fdocuments.us/reader034/viewer/2022042907/58cf44911a28ab254a8b6165/html5/thumbnails/4.jpg)
PacBio
![Page 5: AGBT2017 Reference Workshop: Fulton](https://reader034.fdocuments.us/reader034/viewer/2022042907/58cf44911a28ab254a8b6165/html5/thumbnails/5.jpg)
PacBio WGS Library Construction
• High Molecular Weight Genomic DNA
  • DNA must be of sufficient quality to allow for 50 kb shearing to produce PacBio Continuous Long Reads (CLR)
• Consistent Shearing 50 kb
  • Preferred method: Diagenode Megaruptor
  • Fragment size setting – 50 kb
• Working on 3 Methods for Library Construction
  • PacBio SMRTbell – Current Standard PacBio SMRTbell Template Prep Kit 1.0 and SMRTbell Damage Repair Kit
  • Hybrid Library – Swift Accel-NGS XL Library Prep Kit but exchanging in the PacBio Damage Repair Kit
  • Swift Library – Swift Accel-NGS XL Library Prep Kit Including Swift DNA Repair Enzymes
    • New Data Recently Available with New Repair Process
![Page 6: AGBT2017 Reference Workshop: Fulton](https://reader034.fdocuments.us/reader034/viewer/2022042907/58cf44911a28ab254a8b6165/html5/thumbnails/6.jpg)
HG02818 Library Preparation and Sequencing
• Three library reactions (15 ug each) of HG02818 were processed, one each with the PacBio SMRTbell, Hybrid, and Swift library preps.
• Library recoveries going into BluePippin (BP) size selection for the Hybrid and Swift methods were roughly double that of the PacBio library prep.
• All libraries were size selected on the BP at 20 kb-50 kb.
• The PacBio SMRTbell library generated over a Gb of data for the first two SMRT cells. Additional SMRT cells produced less data as the library appeared to degrade.
| Library Method | Library Recovery Pre-BP | ROI Read Length (bp) |
|---|---|---|
| PacBio SMRTbell | 35.8% (5.3 ug) | 12178 |
| Hybrid | 68.8% (10.3 ug) | 13511 |
| Swift | 70.9% (10.6 ug) | 10232 |
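As a quick arithmetic check on the "double" observation, the pre-BluePippin recoveries reported above can be compared directly. This is a minimal sketch in Python that uses only the percentages from the slide; the variable names are illustrative and not part of the original talk.

```python
# Pre-BluePippin library recoveries (percent of the 15 ug input), as reported on the slide.
recovery_pct = {"PacBio SMRTbell": 35.8, "Hybrid": 68.8, "Swift": 70.9}

# Express each recovery as a fold change relative to the standard SMRTbell prep.
baseline = recovery_pct["PacBio SMRTbell"]
fold_vs_smrtbell = {method: round(pct / baseline, 2) for method, pct in recovery_pct.items()}

for method, fold in fold_vs_smrtbell.items():
    print(f"{method}: {fold}x the SMRTbell recovery")
```

Both Swift-based preps come out a little under 2x the SMRTbell recovery, which matches the slide's wording.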
![Page 7: AGBT2017 Reference Workshop: Fulton](https://reader034.fdocuments.us/reader034/viewer/2022042907/58cf44911a28ab254a8b6165/html5/thumbnails/7.jpg)
HG02818 Library Preparation and Sequencing
[Figure: Read of Insert Mbases per SMRT cell (y-axis, 0 to 1800) by date of PacBio RSII sequencing run (x-axis, 11/6/16 to 12/6/16) for the PacBio SMRTbell, Hybrid, and Swift libraries]
![Page 8: AGBT2017 Reference Workshop: Fulton](https://reader034.fdocuments.us/reader034/viewer/2022042907/58cf44911a28ab254a8b6165/html5/thumbnails/8.jpg)
Subread Length Comparisons - HG02818
SMRTbell Library
• Mean Subread Length: 11,391 bp
• N50 Subread Length: 17,007 bp
Hybrid Libraries
• Mean Subread Length: 13,406 bp
• N50 Subread Length: 18,649 bp
![Page 9: AGBT2017 Reference Workshop: Fulton](https://reader034.fdocuments.us/reader034/viewer/2022042907/58cf44911a28ab254a8b6165/html5/thumbnails/9.jpg)
Subread Length Comparisons - HG02818
Swift Library
• Mean Subread Length: 10,163 bp
• N50 Subread Length: 15,220 bp
E. coli New Swift Only Kit
• Mean Subread Length: 16,387 bp
• N50 Subread Length: 22,625 bp
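For readers unfamiliar with the N50 statistic quoted on these slides, the following is a minimal sketch of how it is commonly computed: sort the lengths from longest to shortest and report the length at which the running total first reaches half of all bases. The read lengths below are made-up toy values, not data from the talk, and `n50` is an illustrative helper rather than any vendor tool's function.

```python
def n50(lengths):
    """Return the length L such that reads of length >= L hold at least half of all bases."""
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy subread lengths in bp (hypothetical, for illustration only).
toy_subreads = [2_000, 4_000, 6_000, 8_000, 10_000, 20_000]
mean_length = sum(toy_subreads) / len(toy_subreads)

print(f"mean = {mean_length:.0f} bp, N50 = {n50(toy_subreads)} bp")
```

Because N50 is weighted by bases rather than by read count, it sits above the mean whenever a few long reads carry much of the yield, which is the pattern seen in every library reported here.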
![Page 10: AGBT2017 Reference Workshop: Fulton](https://reader034.fdocuments.us/reader034/viewer/2022042907/58cf44911a28ab254a8b6165/html5/thumbnails/10.jpg)
Agilent Tape Station Assessment of Library Size
PacBio SMRTbell No BluePippin Size Selection
![Page 11: AGBT2017 Reference Workshop: Fulton](https://reader034.fdocuments.us/reader034/viewer/2022042907/58cf44911a28ab254a8b6165/html5/thumbnails/11.jpg)
Agilent Tape Station Assessment of Library Size
PacBio SMRTbell 6Kb-50Kb BluePippin Size Selection
![Page 12: AGBT2017 Reference Workshop: Fulton](https://reader034.fdocuments.us/reader034/viewer/2022042907/58cf44911a28ab254a8b6165/html5/thumbnails/12.jpg)
Agilent Tape Station Assessment of Library Size
Hybrid Prep Pre-BluePippin Size Selection
![Page 13: AGBT2017 Reference Workshop: Fulton](https://reader034.fdocuments.us/reader034/viewer/2022042907/58cf44911a28ab254a8b6165/html5/thumbnails/13.jpg)
Agilent Tape Station Assessment of Library Size
PacBio SMRTbell 8Kb-50Kb BluePippin Size Selection
![Page 14: AGBT2017 Reference Workshop: Fulton](https://reader034.fdocuments.us/reader034/viewer/2022042907/58cf44911a28ab254a8b6165/html5/thumbnails/14.jpg)
Agilent Tape Station Assessment of Library Size
Hybrid Prep 18Kb-50Kb BluePippin Size Selection
![Page 15: AGBT2017 Reference Workshop: Fulton](https://reader034.fdocuments.us/reader034/viewer/2022042907/58cf44911a28ab254a8b6165/html5/thumbnails/15.jpg)
10X Genomics
![Page 16: AGBT2017 Reference Workshop: Fulton](https://reader034.fdocuments.us/reader034/viewer/2022042907/58cf44911a28ab254a8b6165/html5/thumbnails/16.jpg)
10X Genomics
• Chromium Instrument
• Long Range Linking Information on a Genome Wide Scale
• Phasing Information Across a Genome
• Enhanced Variant Calling and Structural Variation Detection
• De Novo Assembly of Diploid Genomes
• Both WGS and Targeted Approaches
![Page 17: AGBT2017 Reference Workshop: Fulton](https://reader034.fdocuments.us/reader034/viewer/2022042907/58cf44911a28ab254a8b6165/html5/thumbnails/17.jpg)
10X Genomics Overview
(Church 10X Genomics)
![Page 18: AGBT2017 Reference Workshop: Fulton](https://reader034.fdocuments.us/reader034/viewer/2022042907/58cf44911a28ab254a8b6165/html5/thumbnails/18.jpg)
10X Genomics Phasing – Important for Het vs. Repeat Copy Resolution
(Church 10X Genomics)
![Page 19: AGBT2017 Reference Workshop: Fulton](https://reader034.fdocuments.us/reader034/viewer/2022042907/58cf44911a28ab254a8b6165/html5/thumbnails/19.jpg)
(Church 10X Genomics)
![Page 20: AGBT2017 Reference Workshop: Fulton](https://reader034.fdocuments.us/reader034/viewer/2022042907/58cf44911a28ab254a8b6165/html5/thumbnails/20.jpg)
BioNano
![Page 21: AGBT2017 Reference Workshop: Fulton](https://reader034.fdocuments.us/reader034/viewer/2022042907/58cf44911a28ab254a8b6165/html5/thumbnails/21.jpg)
![Page 22: AGBT2017 Reference Workshop: Fulton](https://reader034.fdocuments.us/reader034/viewer/2022042907/58cf44911a28ab254a8b6165/html5/thumbnails/22.jpg)
Bionano Stats from Human Cell Lines
| Genome | Coverage | Mol N50 (Kb) | # of Map Contigs | Contig N50 (Mb) | Total Map Size (Gb) |
|---|---|---|---|---|---|
| NA19240 | 96X | 174.9 | 3148 | 1.26 | 2.85 |
| NA19238 | 93X | 216.9 | 2798 | 1.47 | 2.93 |
| NA19239 | 118X | 201 | 2565 | 1.68 | 2.96 |
| HG00733 | 157X | 202.9 | 2484 | 1.69 | 2.92 |
| HG00514 | 161X | 211.7 | 3025 | 1.35 | 2.83 |
| NA12878 | 134X | 202.7 | 2739 | 1.46 | 2.84 |
| HG01352 | 117X | 184.5 | 3666 | 1.01 | 2.80 |
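One way to read these statistics is to compare each genome's contig N50 with its mean map contig size (total map size divided by number of contigs). The sketch below does that in Python using only the values reported above; the mean contig size is a derived quantity that does not appear on the slide, and the names are illustrative.

```python
# genome -> (number of map contigs, contig N50 in Mb, total map size in Gb),
# copied from the Bionano statistics above.
bionano = {
    "NA19240": (3148, 1.26, 2.85),
    "NA19238": (2798, 1.47, 2.93),
    "NA19239": (2565, 1.68, 2.96),
    "HG00733": (2484, 1.69, 2.92),
    "HG00514": (3025, 1.35, 2.83),
    "NA12878": (2739, 1.46, 2.84),
    "HG01352": (3666, 1.01, 2.80),
}


def mean_contig_mb(n_contigs, total_gb):
    """Mean map contig size in Mb (derived here, not reported in the talk)."""
    return total_gb * 1000 / n_contigs


mean_mb = {g: round(mean_contig_mb(n, total), 3) for g, (n, _n50, total) in bionano.items()}
largest_n50 = max(bionano, key=lambda g: bionano[g][1])

for genome, (_n, contig_n50, _total) in bionano.items():
    print(f"{genome}: mean {mean_mb[genome]} Mb vs N50 {contig_n50} Mb")
print("Largest contig N50:", largest_n50)
```

In every row the N50 exceeds the mean contig size, as expected for a length-weighted statistic.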
![Page 23: AGBT2017 Reference Workshop: Fulton](https://reader034.fdocuments.us/reader034/viewer/2022042907/58cf44911a28ab254a8b6165/html5/thumbnails/23.jpg)
Large Inversion in HG00514
![Page 24: AGBT2017 Reference Workshop: Fulton](https://reader034.fdocuments.us/reader034/viewer/2022042907/58cf44911a28ab254a8b6165/html5/thumbnails/24.jpg)
Printrepeats showing ~25kb Inverted Repeat
![Page 25: AGBT2017 Reference Workshop: Fulton](https://reader034.fdocuments.us/reader034/viewer/2022042907/58cf44911a28ab254a8b6165/html5/thumbnails/25.jpg)
Read Mapping of Short Reads
[Figure: reference with markers A, C, G, T, G, T and short reads aligned below it; several read placements are marked "?"]
![Page 26: AGBT2017 Reference Workshop: Fulton](https://reader034.fdocuments.us/reader034/viewer/2022042907/58cf44911a28ab254a8b6165/html5/thumbnails/26.jpg)
Short Read Assembly
[Figure: reference with markers A, C, G, T, G, T, short reads with several placements marked "?", and the markers A, C, G, T, G, T drawn separately below]
![Page 27: AGBT2017 Reference Workshop: Fulton](https://reader034.fdocuments.us/reader034/viewer/2022042907/58cf44911a28ab254a8b6165/html5/thumbnails/27.jpg)
Long (PacBio) Reads
[Figure: reference with markers A, C, G, T, G, T and long reads, each spanning several adjacent markers]
![Page 28: AGBT2017 Reference Workshop: Fulton](https://reader034.fdocuments.us/reader034/viewer/2022042907/58cf44911a28ab254a8b6165/html5/thumbnails/28.jpg)
10X Linked Reads
[Figure: reference with markers A, C, G, T, G, T and 10X linked reads grouped by molecule below it]
![Page 29: AGBT2017 Reference Workshop: Fulton](https://reader034.fdocuments.us/reader034/viewer/2022042907/58cf44911a28ab254a8b6165/html5/thumbnails/29.jpg)
10X Linked Reads
[Figure: reference with markers A, C, G, T, G, T and 10X linked reads; markers missed by a molecule's reads are marked "X"]
We only achieve ~0.2X coverage per molecule
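To put the ~0.2X per-molecule figure in perspective, here is a rough back-of-the-envelope sketch. It assumes reads land uniformly and independently along a molecule (a Poisson idealization that is an assumption of this example, not something stated in the talk), in which case the expected fraction of bases covered at least once at coverage c is 1 - e^(-c).

```python
import math


def expected_fraction_covered(coverage):
    """Expected fraction of bases hit by at least one read under a Poisson model."""
    return 1.0 - math.exp(-coverage)


per_molecule_coverage = 0.2  # the ~0.2X figure from the slide
fraction = expected_fraction_covered(per_molecule_coverage)

print(f"Expected fraction of each molecule covered: {fraction:.1%}")
```

Under this assumption fewer than one base in five of any single molecule is read, which is why a given marker is often missing from an individual molecule's reads.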
![Page 30: AGBT2017 Reference Workshop: Fulton](https://reader034.fdocuments.us/reader034/viewer/2022042907/58cf44911a28ab254a8b6165/html5/thumbnails/30.jpg)
10X Linked Reads – Resolving Alleles vs Repeats
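[Figure: reference with markers A, C, G, T/G, G, T, where T/G is a heterozygous site, and 10X linked reads carrying either allele; markers missed by a molecule's reads are marked "X"]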
![Page 31: AGBT2017 Reference Workshop: Fulton](https://reader034.fdocuments.us/reader034/viewer/2022042907/58cf44911a28ab254a8b6165/html5/thumbnails/31.jpg)
BioNano Map
[Figure: BioNano map aligned to the reference markers A, C, G, T, G, T, with nick sites labeled]
![Page 32: AGBT2017 Reference Workshop: Fulton](https://reader034.fdocuments.us/reader034/viewer/2022042907/58cf44911a28ab254a8b6165/html5/thumbnails/32.jpg)
BioNano Map
[Figure: BioNano map aligned to the reference markers A, C, G, T, G, T, with nick sites labeled]
Indicates Flipped Loop of Inverted Repeat
![Page 33: AGBT2017 Reference Workshop: Fulton](https://reader034.fdocuments.us/reader034/viewer/2022042907/58cf44911a28ab254a8b6165/html5/thumbnails/33.jpg)
Future Plans
• Refine Existing Platforms
  • Longer Linking
  • Longer Sequences
  • Cost Reductions
• Investigate New Platforms
  • PacBio Sequel
  • Oxford Nanopore
• Investigate New Techniques
  • Hybridization of Long Linked Reads in Lieu of Large Insert Clones to Capture Allelic Diversity Across as Many Humans as Possible
![Page 34: AGBT2017 Reference Workshop: Fulton](https://reader034.fdocuments.us/reader034/viewer/2022042907/58cf44911a28ab254a8b6165/html5/thumbnails/34.jpg)
Summary
• Goal: Generate Robust Data Sets for Additional High-quality Reference Genomes, Enhancing Representation of the Full Range of Genetic Diversity in Humans
• These Long Read (Long Range) Sequencing/Mapping Applications Provide Orthogonal Synergistic Data Sets to Help Accomplish Our Goal.
• Each System Possesses Unique Challenges and Requires Optimization of Protocols and Running Conditions Specific to Our Needs.
• Experience and Communication are Key.
(Magrini)
![Page 35: AGBT2017 Reference Workshop: Fulton](https://reader034.fdocuments.us/reader034/viewer/2022042907/58cf44911a28ab254a8b6165/html5/thumbnails/35.jpg)
Acknowledgements
The McDonnell Genome Institute at Washington University in St. Louis: Tina Graves, Amy Ly, Lisa Cook, Catrina Fronick, Karyn Meltz Steinberg, Wes Warren, Chad Tomlinson, Eddie Belter, Susan Dutcher
10X Genomics: Deanna Church, Michael Chase
BioNano Genomics: Alex Hastie
Pacific Biosciences: Nick Sisneros, Laura Nolden
Nationwide Children’s Hospital: Rick Wilson, Vince Magrini, Sean McGrath
NCBI: Valerie Schneider