Sequencing tutorial Peter HANTZ EMBL Heidelberg. Uni Osnabruck M. Waterman Dideoxy (Sanger)...
Sequencing tutorial
Peter HANTZ
EMBL Heidelberg
Uni Osnabruck
M. Waterman
Dideoxy (Sanger) sequencing
Principle:
- Gel electrophoresis: discrimination of 1 bp below ~1000 bp
- Synthesis: starts with a DNA oligo, stops after incorporating a (marked) ddNTP
- First ~60 bp uncertain (high relative mass of the fluorescent dye)
Radiolabeling: 4 reactions
Dye-termination: 4 fluorescent dyes, one reaction
Pyrosequencing (Roche / 454)
Library construction
A, B: short DNA oligos fused with genomic DNA segments; B is biotinylated
Selection of dsDNA: streptavidin-coated magnetic beads
Denaturation: AB strands collected
www.454.com, wiki
Bead I: streptavidin-coated
Bead II: simple agarose beads coated with B oligos
Single sstDNA (single-stranded template DNA) with cA and cB oligos, one molecule immobilized per bead
Bead-bound library emulsified (water-in-oil)
PCR reaction: one strand will be covalently bound to the bead
Denaturation: one strand is released
Following the selection of DNA-positive beads (enrichment), beads + reactants are placed in wells with a diameter of ca. 40 µm
The reaction:
- addition of dNTPs: incorporation releases pyrophosphate (PPi); only one phosphate is needed for the backbone
- ATP sulfurylase converts PPi to ATP
- luciferase: emits light in the presence of ATP
- unincorporated nucleotides and ATP are degraded by apyrase
- 400,000 reads in parallel
- multiple consecutive incorporations (runs of the same base): higher signal intensity, problematic...
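A minimal sketch of why runs of the same base are harder to read: nucleotides are offered one type at a time, and a run of identical bases is incorporated within a single flow, so the run length has to be inferred from the signal intensity alone. The function name `flowgram`, the flow order "TACG" and the idealized integer signals are assumptions made for illustration; real signals are noisy light intensities.

```python
def flowgram(synthesized, flow_order="TACG"):
    """Return (base, incorporations) per nucleotide flow for an idealized read.

    `synthesized` is the strand being built by the polymerase.
    The flow order is an illustrative assumption.
    """
    unknown = set(synthesized) - set(flow_order)
    if unknown:
        raise ValueError(f"bases not covered by the flow order: {sorted(unknown)}")

    signals = []
    pos = 0
    while pos < len(synthesized):
        for base in flow_order:
            run = 0
            # Every consecutive copy of `base` is incorporated in this one flow.
            while pos < len(synthesized) and synthesized[pos] == base:
                run += 1
                pos += 1
            signals.append((base, run))
            if pos == len(synthesized):
                break
    return signals


if __name__ == "__main__":
    for base, intensity in flowgram("TTAGGG"):
        print(base, intensity)
```

In this idealized picture a run of three G gives a signal three times as strong as a single G. Telling a run of 7 from a run of 8 requires resolving a much smaller relative difference, which is where miscalls tend to occur.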
Illumina (Solexa) sequencing
- making a DNA library (~300 bp fragments)
- ligation of adapters A and B to the fragments
- binding the ssDNA randomly to the flow cell surface
- complementary primers are ligated to the surface
Illumina-Fasteris
Bridge amplification: initiation
GeneCore
On the surface: complementary oligos
Data acquisition:
sequencing by synthesis: “reversible terminator” nucleotides, blocked + fluorescently labeled
illumina.com
- de-blocking to enable the synthesis
- dye cleavage + elimination
- wash step + repeat
Mate-pair sequencing
Single Molecule Real Time Sequencing
Detection: "Zero-Mode Waveguide" holes: near-field standing waves (~Total Internal Reflection)
Wiki, Pacific Biosciences
Present performance:
1,500 bp in read lengths
Principle:
- fluorescent label on the terminal phosphate of the NTPs
- DNA polymerase cleaves this label during incorporation
- incorporation lasts ~ms
Assembling
Shotgun sequencing
- The genome is fragmented randomly (sonication)
- No positional or orientation information is available
- The fragments are sequenced
- The results have to be assembled
Merging reads into contigs
Bridges of Königsberg
Leonhard Euler, 1735
www.bioalgorithms.info
Find a path that visits each bridge (=edge) once!
Eulerian path problem: visit each edge once and only once: linear-time algorithm
Graphs
- a set of nodes and a set of edges that connect pairs of nodes
- used to model pairwise relations between certain objects
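A small sketch of Euler's argument for the Königsberg bridges: in a connected undirected graph, a path that uses every edge exactly once exists only if zero or two nodes have an odd number of edges. The labels A (island), B and C (river banks) and D (eastern land mass) are illustrative names, and the check assumes the graph is connected.

```python
from collections import Counter

# The seven bridges as edges between four land masses (labels are illustrative).
BRIDGES = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"), ("B", "D"), ("C", "D"),
]


def odd_degree_nodes(edges):
    """Nodes touched by an odd number of edges."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(node for node, d in degree.items() if d % 2 == 1)


def has_eulerian_path(edges):
    """Degree test only; the graph is assumed to be connected."""
    return len(odd_degree_nodes(edges)) in (0, 2)


if __name__ == "__main__":
    print(odd_degree_nodes(BRIDGES))
    print(has_eulerian_path(BRIDGES))
```

All four land masses have an odd degree, so no such walk exists. The test only needs one pass over the edges, which is why the problem counts as easy.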
Hamiltonian Path Problem
www.ams.org
Find a route that visits each node (=each airport)exactly once
the amount of computation necessary, using the most efficient algorithms known at present, grows exponentially with the size of the route map
This is an NP-complete problem (NP = nondeterministic polynomial time): no polynomial-time algorithm is known
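To make the growth in computation concrete, here is a brute-force sketch that tries every ordering of the nodes. It is not one of the "most efficient algorithms known", only the naive baseline: with n nodes there are n! orderings to test. The function name and the tiny example graphs are illustrative.

```python
from itertools import permutations


def hamiltonian_path(nodes, edges):
    """Return one ordering that visits every node exactly once, or None.

    Edges are undirected. Runtime grows like n! with the number of nodes.
    """
    adjacent = {frozenset(edge) for edge in edges}
    for order in permutations(nodes):
        if all(frozenset((a, b)) in adjacent for a, b in zip(order, order[1:])):
            return list(order)
    return None


if __name__ == "__main__":
    chain = [("A", "B"), ("B", "C"), ("C", "D")]
    print(hamiltonian_path(["A", "B", "C", "D"], chain))

    # A star: the centre X would have to be visited more than once.
    star = [("X", "1"), ("X", "2"), ("X", "3")]
    print(hamiltonian_path(["X", "1", "2", "3"], star))
```

Ten nodes already mean 3,628,800 orderings in the worst case, and each extra node multiplies that by the new node count.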
Traveling Salesman Problem
Find the shortest path that visits every vertex exactly once.
That is: the shortest Hamiltonian path
www.wolfram.com
This is also an NP-hard problem...
The Shortest Superstring Problem
Problem: Given a set of strings, find a shortest string that contains all of them
Input: Strings s1, s2, ..., sn
Output: A string s that contains all strings s1, s2, ..., sn as substrings, such that the length of s is minimized
Equivalent to:
- finding the shortest Hamiltonian path
- TSP
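A brute-force sketch of the Shortest Superstring Problem for very small inputs: try every ordering of the strings and merge neighbours with their largest suffix-prefix overlap. The function names are illustrative, and the approach is only feasible for a handful of strings, since the number of orderings grows factorially.

```python
from itertools import permutations


def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def shortest_superstring(strings):
    """Exhaustive search over all orderings (factorial runtime)."""
    unique = set(strings)
    # A string contained in another one never changes the result.
    strings = sorted(s for s in unique if not any(s != t and s in t for t in unique))
    if not strings:
        return ""
    best = None
    for order in permutations(strings):
        merged = order[0]
        for nxt in order[1:]:
            merged += nxt[overlap(merged, nxt):]
        if best is None or len(merged) < len(best):
            best = merged
    return best


if __name__ == "__main__":
    print(shortest_superstring(["ATG", "TGG", "GGC"]))
```

Each ordering of the strings corresponds to a Hamiltonian path through the graph of overlaps, which is where the equivalence stated above comes from.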
Graph Theory helps DNA assembly
"Translation" of the problem: a model
Nodes: reads
Edges: connect nodes if the corresponding reads overlap
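The model above can be sketched as a small overlap graph: each read is a node, and a directed edge a -> b is added when a suffix of a matches a prefix of b. The reads, the `min_overlap` threshold and the function names are made up for illustration; real assemblers also tolerate sequencing errors and consider reverse complements, which this sketch ignores.

```python
def suffix_prefix_overlap(a, b):
    """Length of the longest suffix of a that equals a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def overlap_graph(reads, min_overlap=3):
    """Map each read to {following read: overlap length}."""
    graph = {read: {} for read in reads}
    for a in reads:
        for b in reads:
            if a != b:
                k = suffix_prefix_overlap(a, b)
                if k >= min_overlap:
                    graph[a][b] = k
    return graph


if __name__ == "__main__":
    reads = ["ATGGCG", "GCGTGC", "TGCAAT"]  # hypothetical reads
    for read, successors in overlap_graph(reads).items():
        print(read, "->", successors)
```

Comparing every read with every other read is quadratic in the number of reads, which is already costly before the path search even starts.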
Example: assembling a bacterial genome
Red lines - wrong assembly
Bold black lines - good assembly
Assembling the reads = finding the shortest Hamiltonian path = TSP = SSP
NP...impossible...?
University of Maryland
The Way Out: Constructing and analyzing de Bruijn Graphs
J. Kaptcianos
Finding Eulerian paths in the de Bruijn graph can lead to sequence reconstruction
Linear problem!
Thank You for Your attention!
Second-generation DNA sequencing "Sequencing by synthesis" methods
(Solexa) 300 bp [normal] - 10 kb [mate-pair]
(454) 1-10 kb, and 20 kb in experimental stage
Nature Biotech, vol. 26
DNA Colonies amplified by PCR: “Polonies” (Solexa)
isothermal extension "bridge PCR" note: even PCR-free!
(454) emulsion PCR
fluorescent imaging of the entire array
Reads: (Solexa): ~50-80 bp; (454): ~200-300 bp
Illumina (Solexa) sequencing
Paired-end sequencing
flow cell:
EMBL GeneCore
ABI: capillary electrophoresis sequencing and SOLiD
Directed graphs
We assign a direction to each edge
The Eulerian Path Problem can be re-formulated accordingly:
Visit each edge exactly once while passing along the edges in their direction
Note: an Eulerian path might not exist!
Examples:
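One way to see when a directed Eulerian path can exist is to compare, for every node, the number of outgoing and incoming edges. The sketch below applies the standard degree test; the function name and the example graphs are illustrative, and the graph is assumed to be connected.

```python
from collections import Counter


def has_directed_eulerian_path(edges):
    """Degree test for a directed Eulerian path (connectivity is assumed).

    At most one node may have one extra outgoing edge (the start),
    at most one node one extra incoming edge (the end),
    and every other node must be balanced.
    """
    balance = Counter()  # out-degree minus in-degree
    for u, v in edges:
        balance[u] += 1
        balance[v] -= 1
    starts = [n for n, b in balance.items() if b == 1]
    ends = [n for n, b in balance.items() if b == -1]
    in_range = all(b in (-1, 0, 1) for b in balance.values())
    return in_range and len(starts) == len(ends) <= 1


if __name__ == "__main__":
    print(has_directed_eulerian_path([("A", "B"), ("B", "C")]))  # a simple chain
    print(has_directed_eulerian_path([("A", "B"), ("A", "C")]))  # a fork
```

The fork fails because after leaving A along one edge there is no way back to use the other one.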
(also known as Overlap-Layout-Consensus method)
M. Waterman
Red: repeats
Start: is it really the shortest?
The Way Out: Constructing and analyzing de Bruijn Graphs
J. Kaptcianos
directed graph representing overlaps between sequences of symbols
Given sequences of symbols (~reads): ATG, TGG, TGC, GTG, GGC, GCA, GCG, CGT
"k-length fragments" (k=3)
Nodes: fragments of length k-1 (k-1=2)
Edges: k-length fragments connecting overlapping vertices
Finding Eulerian paths in the de Bruijn graph can lead to sequence reconstruction (Superpath problem, Merging transformation, etc.)
Linear problem!
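A minimal sketch of the whole idea, using the eight 3-letter fragments listed above: every fragment becomes an edge from its first two letters to its last two letters, and an Eulerian path through these edges spells a sequence that uses each fragment exactly once. The path is found with Hierholzer's algorithm, a standard linear-time method that the slides do not name. The function names are illustrative, and the sketch assumes error-free fragments and a graph that actually has an Eulerian path.

```python
from collections import defaultdict

KMERS = ["ATG", "TGG", "TGC", "GTG", "GGC", "GCA", "GCG", "CGT"]


def de_bruijn_graph(kmers):
    """Each k-mer is an edge from its (k-1)-prefix to its (k-1)-suffix."""
    graph = defaultdict(list)
    for kmer in kmers:
        graph[kmer[:-1]].append(kmer[1:])
    return graph


def eulerian_path(graph):
    """Hierholzer's algorithm; assumes an Eulerian path exists."""
    in_degree = defaultdict(int)
    for targets in graph.values():
        for node in targets:
            in_degree[node] += 1
    nodes = sorted(set(graph) | set(in_degree))
    # Start at the node with one more outgoing than incoming edge, if any.
    start = next(
        (n for n in nodes if len(graph.get(n, [])) - in_degree[n] == 1),
        nodes[0],
    )
    remaining = {node: list(targets) for node, targets in graph.items()}
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if remaining.get(node):
            stack.append(remaining[node].pop())  # follow an unused edge
        else:
            path.append(stack.pop())  # dead end: node is final in the path
    return path[::-1]


def assemble(kmers):
    """Spell the sequence along the Eulerian path."""
    path = eulerian_path(de_bruijn_graph(kmers))
    return path[0] + "".join(node[-1] for node in path[1:])


if __name__ == "__main__":
    print(assemble(KMERS))
```

For these fragments the graph has two Eulerian paths, spelling ATGGCGTGCA and ATGCGTGGCA. Both use every fragment exactly once, so the fragments alone cannot tell them apart. This is the kind of ambiguity that repeats introduce.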