Sequencing tutorial Peter HANTZ EMBL Heidelberg. Uni Osnabruck M. Waterman Dideoxy (Sanger)...


Page 1: Sequencing tutorial Peter HANTZ EMBL Heidelberg. Uni Osnabruck M. Waterman Dideoxy (Sanger) sequencing Principle: Gel electrophoresis: discrimination.

Sequencing tutorial

Peter HANTZ
EMBL Heidelberg

Page 2:

Uni Osnabruck
M. Waterman
Dideoxy (Sanger) sequencing

Principle:
Gel electrophoresis: resolves 1 bp differences for fragments below ~1000 bp
Synthesis: starts with a DNA oligo, stops after incorporating a (labeled) ddNTP
First ~60 bp uncertain (high relative mass of the fluorescent dye)

Radiolabeling: 4 reactions

Dye-termination: 4 fluorescent dyes, one reaction
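
As a rough illustration of the principle: every extension product ends in a labeled ddNTP, so sorting the products by length and reading the terminal labels gives the sequence. The Python sketch below is a toy model only. The function names and the example template are invented for illustration, and the uncertain first ~60 bp are ignored.

```python
import random

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def termination_products(template):
    """Return all chain-terminated products as (length, terminal base) pairs.

    Toy assumption: the primer ends right before the template, and the
    template is given in the order in which the polymerase reads it.
    """
    synthesized = "".join(COMPLEMENT[b] for b in template)
    return [(i + 1, synthesized[i]) for i in range(len(synthesized))]

def read_gel(products):
    """Sort products by length (as electrophoresis does) and read the labels."""
    return "".join(base for _, base in sorted(products))

# The products come out of the reaction in no particular order.
products = termination_products("TACGGT")
random.Random(0).shuffle(products)
print(read_gel(products))  # ATGCCA
```

With radiolabeling, the four bases would come from four separate lanes; with dye termination, the terminal base is read from the dye colour in a single lane.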

Page 3:

Pyrosequencing (Roche / 454)

Library construction
A, B: short DNA oligos fused with genomic DNA segments; B is biotinylated

Selection of dsDNA: streptavidin-coated magnetic beads
Denaturation: AB strands collected

www.454.com; wiki

ds

Bead I: streptavidin-coated

Page 4:

Pyrosequencing (Roche / 454)

www.454.com; wiki

Bead II: simple agarose beads coated with B oligos
A single sstDNA (single-stranded template DNA) with cA and cB oligos is immobilized on each bead

Bead-bound library emulsified (water-in-oil)

PCR reaction: one strand will be covalently bound to the bead

Page 5:

Pyrosequencing (Roche / 454)

denaturation, one strand is released

Following the selection of DNA-positive beads (enrichment), beads and reactants are placed in wells with a diameter of ca. 40 µm

www.454.com; wiki

Page 6:

Pyrosequencing (Roche / 454)

www.454.com; wiki

The reaction:
-addition of dNTPs: incorporation releases pyrophosphate (only one phosphate is needed for the backbone)
-ATP sulfurylase converts PPi to ATP
-luciferase: emits light in the presence of ATP
-unincorporated nucleotides and ATP are degraded by apyrase

-400,000 reads in parallel
-multiple consecutive incorporations of the same base:
>higher signal intensity
>problematic...
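
To make the homopolymer issue concrete, here is a minimal noise-free sketch of a pyrosequencing flowgram in Python. It assumes that nucleotides are flowed one type at a time in a fixed cyclic order and that the light signal is proportional to the number of bases incorporated in that flow. The flow order "TACG" and all names are illustrative assumptions, not taken from the slides.

```python
def flowgram(synthesized, flow_order="TACG", n_flows=12):
    """Ideal signal for each nucleotide flow.

    synthesized: the strand that the polymerase builds, in synthesis order.
    Returns a list of (flowed base, number of bases incorporated).
    """
    signals = []
    pos = 0
    for i in range(n_flows):
        base = flow_order[i % len(flow_order)]
        run = 0
        # A run of identical bases is incorporated within a single flow.
        while pos < len(synthesized) and synthesized[pos] == base:
            run += 1
            pos += 1
        signals.append((base, run))
    return signals

def call_bases(signals):
    """Translate signal intensities back into a sequence."""
    return "".join(base * run for base, run in signals)

signals = flowgram("TTACCCG", n_flows=4)
print(signals)              # [('T', 2), ('A', 1), ('C', 3), ('G', 1)]
print(call_bases(signals))  # TTACCCG
```

In this ideal model a run of three C's gives exactly three times the unit signal. In a real instrument the intensities are noisy, so telling a run of 7 from a run of 8 is much harder than telling 1 from 2.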

Page 7:

Illumina (Solexa) sequencing

-making a DNA library (~300 bp fragments)
-ligation of adapters A and B to the fragments

-binding the ssDNA randomly to the flow cell surface
-complementary primers are ligated to the surface

Illumina-Fasteris

Page 8:

Illumina (Solexa) sequencing

Bridge amplification: initiation

GeneCore

On the surface: complementary oligos

Page 9:

Illumina (Solexa) sequencing

EMBL Gene Core

Page 10:

Illumina (Solexa) sequencing

TGCA

Data acquisition:

Sequencing by synthesis: “reversible terminator” nucleotides, blocked and fluorescently labeled

illumina.com

-de-blocking to enable the synthesis
-dye cleavage + elimination
-wash step + repeat

Page 11:

Illumina (Solexa) sequencing

Mate-pair sequencing

Page 12:

Single Molecule Real Time Sequencing

Detection: "Zero-Mode Waveguide" holes: near-field standing waves

(~Total Internal Reflection )

Wiki; Pacific Biosciences

Present performance:

Read lengths of 1,500 bp

Principle:

Fluorescent label on the terminal phosphate of the NTPs
DNA polymerase: cleaves this label
Incorporation lasts ~ms

Page 13:

Assembling

Shotgun sequencing
The genome is fragmented randomly (sonication)
No positional or orientation information is available
The fragments are sequenced

The results have to be assembled

Merging reads into contigs
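
A small Python sketch of what the assembler receives in shotgun sequencing: reads taken from random positions, in random orientation, with both pieces of information discarded. Fixed-length, error-free reads and all names used here are simplifying assumptions made for illustration.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Sequence of the opposite strand, read in its own 5'->3' direction."""
    return seq.translate(COMPLEMENT)[::-1]

def shotgun_reads(genome, n_reads, read_len, seed=0):
    """Sample reads at random positions and in random orientation.

    Position and strand are deliberately not returned: the assembler
    has to recover them from overlaps between the reads.
    """
    rng = random.Random(seed)
    reads = []
    for _ in range(n_reads):
        start = rng.randrange(len(genome) - read_len + 1)
        read = genome[start:start + read_len]
        if rng.random() < 0.5:
            read = reverse_complement(read)
        reads.append(read)
    rng.shuffle(reads)
    return reads

genome = "ATGGCGTGCAATTCGGA"
for read in shotgun_reads(genome, n_reads=5, read_len=6):
    print(read)
```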

Page 14:

Bridges of Königsberg
Leonhard Euler, 1735

www.bioalgorithms.info

Find a path that visits each bridge (=edge) once!

Eulerian path problem: visit each edge once and only once: linear-time algorithm

Graphs
A set of nodes and a set of edges that connect pairs of nodes
Used to model pairwise relations between objects
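
Whether an Eulerian path exists can be decided just by counting node degrees: a connected undirected graph has one if and only if it has zero or two nodes of odd degree. The Python sketch below applies this test to the Königsberg bridges. The node labels (A for the island, B and C for the two river banks, D for the eastern land mass) are chosen here for illustration.

```python
from collections import defaultdict

def has_eulerian_path(edges):
    """True if an undirected multigraph, given as node pairs, has an Eulerian path."""
    if not edges:
        return True
    degree = defaultdict(int)
    neighbours = defaultdict(set)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        neighbours[u].add(v)
        neighbours[v].add(u)
    # All edges must belong to one connected component.
    start = edges[0][0]
    seen, stack = {start}, [start]
    while stack:
        for nxt in neighbours[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    if len(seen) != len(degree):
        return False
    odd = sum(1 for d in degree.values() if d % 2)
    return odd in (0, 2)

# Seven bridges: parallel bridges appear as repeated pairs.
KOENIGSBERG = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
               ("A", "D"), ("B", "D"), ("C", "D")]
print(has_eulerian_path(KOENIGSBERG))  # False
```

All four land masses have odd degree (5, 3, 3, 3), so no walk crosses each bridge exactly once. The check touches every edge a constant number of times, which is where the linear running time comes from.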

Page 15:

Hamiltonian Path Problem
www.ams.org

Find a route that visits each node (=each airport) exactly once

the amount of computation necessary, using the most efficient algorithms known at present, grows exponentially with the size of the route map

This is an NP-complete problem (NP = nondeterministic polynomial time): no polynomial-time algorithm is known
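
For contrast with the Eulerian case, a brute-force Python sketch of the Hamiltonian path search. It simply tries every ordering of the nodes, so the work grows like n! with the number of nodes. This is only meant to show why the naive approach does not scale; it is not one of the "most efficient algorithms known", and the function name and example graphs are illustrative.

```python
from itertools import permutations

def hamiltonian_path(nodes, edges):
    """Return one path visiting every node exactly once, or None.

    The graph is treated as undirected. All n! node orderings are tried.
    """
    adjacent = set()
    for u, v in edges:
        adjacent.add((u, v))
        adjacent.add((v, u))
    for order in permutations(nodes):
        if all((a, b) in adjacent for a, b in zip(order, order[1:])):
            return list(order)
    return None

# A chain of four airports has a Hamiltonian path ...
print(hamiltonian_path("ABCD", [("A", "B"), ("B", "C"), ("C", "D")]))
# ... a hub with three spokes does not.
print(hamiltonian_path("ABCD", [("A", "B"), ("A", "C"), ("A", "D")]))
```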

Page 16:

Traveling Salesman Problem

Find the shortest path which visits every vertex exactly once.

That is: the shortest Hamiltonian path

www.wolfram.com

This is also an NP-hard problem...

Page 17:

The Shortest Superstring Problem

Problem: Given a set of strings, find a shortest string that contains all of them

Input: Strings s1, s2, …, sn

Output: A string s that contains all strings s1, s2, …, sn as substrings, such that the length of s is minimized

Equivalent to:
-finding the shortest Hamiltonian path
-TSP
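
Because the exact problem is intractable for large inputs, a common workaround is a greedy heuristic: repeatedly merge the two strings with the largest overlap. The Python sketch below shows that heuristic. It is an approximation, not an exact solver, and the result is not guaranteed to be the shortest superstring. The function names are illustrative.

```python
def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_superstring(strings):
    """Greedy approximation: keep merging the pair with the largest overlap."""
    reads = list(dict.fromkeys(strings))  # drop duplicates, keep order
    # Strings contained in another string add nothing to the superstring.
    reads = [s for s in reads if not any(s != t and s in t for t in reads)]
    while len(reads) > 1:
        best_k, best_i, best_j = -1, 0, 1
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads)
                 if idx not in (best_i, best_j)] + [merged]
    return reads[0] if reads else ""

print(greedy_superstring(["ATG", "TGG", "GGC"]))  # ATGGC
```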

Page 18:

Graph Theory helps DNA assembly

"Translation" of the problem: a model

Nodes: reads
Edges: connect two nodes if the corresponding reads overlap

Example: assembling a bacterial genome

Red lines - wrong assembly
Bold black lines - good assembly

Assembling the reads = finding the shortest Hamiltonian path = TSP = SSP
NP-hard... impossible...?

University of Maryland
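
The overlap-graph model can be sketched in a few lines of Python: every read is a node, and a directed edge a -> b is added when a suffix of a matches a prefix of b. The minimum overlap length `min_len` and the example reads are assumptions made for illustration. Real assemblers also have to tolerate sequencing errors and reverse-complement reads, which this sketch ignores.

```python
def overlap(a, b, min_len):
    """Longest suffix of a equal to a prefix of b, if at least min_len long (min_len >= 1)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0

def overlap_graph(reads, min_len=2):
    """Map each directed edge (a, b) to the length of the overlap."""
    graph = {}
    for a in reads:
        for b in reads:
            if a != b:
                k = overlap(a, b, min_len)
                if k:
                    graph[(a, b)] = k
    return graph

for (a, b), k in overlap_graph(["ATGGC", "GGCGT", "CGTGC"]).items():
    print(a, "->", b, "overlap", k)
```

Comparing every read with every other read already costs a number of comparisons quadratic in the number of reads, before the Hamiltonian path search even starts.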

Page 19:

The Way Out: Constructing and analyzing de Bruijn Graphs

J. Kaptcianos

Finding Eulerian paths in the de Bruijn graph can lead to sequence reconstruction

Linear-time problem!

Page 20:

Thank You for Your attention!

Page 21:

Second-generation DNA sequencing "Sequencing by synthesis" methods

(Solexa) 300bp [normal] - 10kb [mate-pair]

(454) 1-10 kb, and 20 kb in experimental stage

Nature Biotech, vol. 26

DNA Colonies amplified by PCR: “Polonies” (Solexa)

isothermal extension "bridge PCR" note: even PCR-free!

(454) emulsion PCR

Fluorescent imaging of the entire array
Reads: (Solexa): ~50-80 bp; (454): ~200-300 bp

Page 22:

Illumina (Solexa) sequencing

Paired-end sequencing
Flow cell:

EMBL GeneCore

Page 23:

ABI: capillary electrophoresis sequencing and SOLiD

Page 24:

Directed graphs

We assign a direction to each edge

The Eulerian Path Problem can be re-formulated accordingly:

Visit each edge exactly once while passing along the edges in their direction

Note: an Eulerian path might not exist!
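
For directed graphs, the existence test compares in-degree and out-degree: a connected directed graph has an Eulerian path if and only if every node is balanced (in = out), except possibly one start node with one extra outgoing edge and one end node with one extra incoming edge. A minimal Python sketch of that check follows; the names are illustrative.

```python
from collections import defaultdict

def directed_eulerian_path_exists(edges):
    """True if the directed multigraph, given as (from, to) pairs, has an Eulerian path."""
    if not edges:
        return True
    out_deg, in_deg = defaultdict(int), defaultdict(int)
    linked = defaultdict(set)  # adjacency with direction ignored
    for u, v in edges:
        out_deg[u] += 1
        in_deg[v] += 1
        linked[u].add(v)
        linked[v].add(u)
    # Connectivity check, ignoring edge direction.
    start = edges[0][0]
    seen, stack = {start}, [start]
    while stack:
        for nxt in linked[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    if seen != set(linked):
        return False
    diffs = [out_deg[n] - in_deg[n] for n in linked]
    if any(d not in (-1, 0, 1) for d in diffs):
        return False
    return (diffs.count(1), diffs.count(-1)) in ((0, 0), (1, 1))

print(directed_eulerian_path_exists([("A", "B"), ("B", "C")]))  # True
print(directed_eulerian_path_exists([("A", "B"), ("C", "B")]))  # False
```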

Page 25:

Examples:

(also known as Overlap-Layout-Consensus method)

M. Waterman

Red: repeats

Start: is it really the shortest?

Page 26:

The Way Out: Constructing and analyzing de Bruijn Graphs

J. Kaptcianos

directed graph representing overlaps between sequences of symbols

Given sequences of symbols (~reads): ATG, TGG, TGC, GTG, GGC, GCA, GCG, CGT
"k-length fragments" (k=3)

Nodes: fragments of length k-1 (k-1=2)
Edges: k-length fragments connecting overlapping nodes

Finding Eulerian paths in the de Bruijn graph can lead to sequence reconstruction (Superpath problem, Merging transformation, etc.)

Linear-time problem!
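
The k=3 example can be worked through in Python. The sketch below builds the de Bruijn graph from the eight fragments and then walks an Eulerian path with Hierholzer's algorithm, a standard linear-time method. It assumes error-free fragments and a single strand, and the function names are illustrative. Note that the graph for this example admits two Eulerian paths, so the reconstruction is not unique.

```python
from collections import Counter, defaultdict

def de_bruijn(kmers):
    """Nodes are (k-1)-mers; each k-mer is an edge from its prefix to its suffix."""
    graph = defaultdict(list)
    for kmer in kmers:
        graph[kmer[:-1]].append(kmer[1:])
    return graph

def eulerian_path(graph):
    """Hierholzer's algorithm for a non-empty directed graph; returns a list of nodes."""
    in_deg = Counter(v for targets in graph.values() for v in targets)
    # Start at the node with one surplus outgoing edge, if there is one.
    start = next((u for u in graph if len(graph[u]) - in_deg[u] == 1),
                 next(iter(graph)))
    remaining = {u: list(vs) for u, vs in graph.items()}
    stack, path = [start], []
    while stack:
        u = stack[-1]
        if remaining.get(u):
            stack.append(remaining[u].pop())
        else:
            path.append(stack.pop())
    path.reverse()
    # Validate: the walk must use every edge exactly once.
    expected = Counter((u, v) for u, vs in graph.items() for v in vs)
    if Counter(zip(path, path[1:])) != expected:
        raise ValueError("graph has no Eulerian path")
    return path

def reconstruct(kmers):
    """Spell out the sequence along an Eulerian path of the de Bruijn graph."""
    path = eulerian_path(de_bruijn(kmers))
    return path[0] + "".join(node[-1] for node in path[1:])

kmers = ["ATG", "TGG", "TGC", "GTG", "GGC", "GCA", "GCG", "CGT"]
print(reconstruct(kmers))
```

Both possible results, ATGGCGTGCA and ATGCGTGGCA, contain exactly the eight input fragments. Which one is printed depends on the order in which edges are followed at the branching nodes TG and GC.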