Next Generation Sequencing at the - AMCbioinformatics.amc.nl/...analysis/2018-sequencing... · 2005...

Next Generation Sequencing at the Marja Jakobs Core Facility Genomics


Page 1

Next Generation Sequencing at the

Marja Jakobs Core Facility Genomics

Page 2

1953 Discovery of DNA structure

1961 Sequential oligonucleotide synthesis possible

1977 Sanger sequencing method published (Frederick Sanger)

1983 Development of PCR

1985 Human Genome Project proposed $10 / base

1987 First automated sequencing machine: ABI 370

1990 Human Genome Project officially started

1996 Capillary sequencer: ABI 310

2000 Parallelized adapter / ligation-mediated, bead-based sequencing technology, launching "Next Generation" sequencing

2003 Human Genome Project completed for $0.01 / base

2005 First Next Generation sequencing system: GS 20, 454 Life Sciences (100 bp)

2006 Solexa (later Illumina): Genome Analyser

2007 Applied Biosystems: SOLiD

2010 Semiconductor sequencing by Ion Torrent: PGM & Illumina: HiSeq

2011 Roche / 454: 700 bp read length, 500 Mb output; Illumina: MiSeq

2012 Ion Torrent: Ion Proton

2014 Illumina: NextSeq

2015 Oxford Nanopore: MinION, PromethION; Ion Torrent: S5 / S5XL; Illumina: HiSeq 4000

2016 Illumina: MiniSeq; Human Genome $0.002 / 1000 bases

Dideoxynucleotides and radioactively labeled fragments (Sanger method)

Timeline DNA sequencing

Page 3

Timeline DNA sequencing (timeline entries repeated from page 2)

Fluorescent dyes (A, T, G, C)

[Figure: ladder of chain-terminated fragments ACCGTA, ACCGT, ACCG, ACC, AC, A]
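The fragment ladder on this slide (ACCGTA, ACCGT, ACCG, ACC, AC, A) can be read back into a sequence, because in chain-termination sequencing each fragment ends at the labeled terminator base. The following is a minimal sketch of that idea only; the function name `read_ladder` is hypothetical, and real base calling works on electropherogram peaks rather than on ready-made strings.

```python
def read_ladder(fragments):
    """Order chain-terminated fragments by length and read the last base of each."""
    ordered = sorted(fragments, key=len)
    return "".join(fragment[-1] for fragment in ordered)


# Fragments as shown on the slide, in arbitrary order.
ladder = ["ACCGTA", "ACCGT", "ACCG", "ACC", "AC", "A"]
sequence = read_ladder(ladder)
print(sequence)
```

Sorting by length plays the role of the electrophoresis step: the shortest fragment reports the first base, the longest the last.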

Page 6

Timeline DNA sequencing (timeline entries repeated from page 2, with NovaSeq added at the end)

Page 7

Next Generation Sequencing

Massively parallel sequencing (MPS) of unique DNA molecules

Main NGS platforms:
• Roche 454
• SOLiD 5500 / Wildfire
• Illumina
• Ion Torrent
• Nanopore
• PacBio

Page 8

Differences in Sequencing Strategies

CONVENTIONAL: one sample, one tube, one reaction, one result

MPS: pool of molecules, one reaction vessel, many reactions, many results

Page 9

Several strategies

• Single molecule vs amplified DNA

• Sequencing by synthesis

• Sequencing with terminators

Page 10

Two approaches

Single molecule

• SMRT sequencing: detection of incorporation on a single molecule (PacBio)

• Nanopore sequencing: pull a linear DNA strand through a nanopore (Oxford Nanopore)

Amplified DNA

• Amplify a single molecule to obtain a pool of identical molecules

Emulsion PCR (Ion Torrent)

Polony formation (Illumina)

Page 11

Single Molecule Real Time sequencing (PacBio)

Page 12

SMRTbell library

Page 13

Single Molecule sequencing (Oxford Nanopore)

Page 14

Amplified DNA

Emulsion PCR (IonTorrent)

Polony formation (Illumina)

Page 15

Single vs Amplified

Single strand
• No amplification
• No bias
• No copying errors
• Long reads
- Very low signal
- High error rate (14%)

Amplified DNA
• Strong signals
• PCR errors are averaged
- PCR bias/errors
- Short reads
- Loss of base modifications

Page 16

[Figure: sequencing instruments labeled with their years of availability, ranging from 2002 to 2017; the only instrument name that survives in the transcript is "2017 NovaSeq"]

Page 17

Next Generation Sequencing platforms CFG

• Ion Torrent

PGM (Personal Genome Machine)

Proton

• Illumina

MiSeq

HiSeq 4000 (VU)

Page 18

High-throughput sequencing

Platform | Sanger (ABI) | Ion Torrent (PGM / PI) | Illumina MiSeq | Illumina HiSeq 4000
Run time sequencer | 96 samples / hour | 5 / 2.5 hours | 20 – 35 hours | 1-3 days
Throughput | 5 × 10^5 × 500 bp = 250 × 10^6 bp / year! | 30 Mb – 2 Gb per run | 6 Gb per run | 81-94 Gb per lane (650-750 Gb per flowcell)
Sequence length | 500 bp | 200 - 400 bp | 2x 150 bp | 2x 150 bp
Analysis time | 15 min / sample (Sanger) | 1 day - months (NGS)
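The throughput row can be checked with a few lines of arithmetic. The sketch below uses only the figures from the table (5 × 10^5 Sanger reads of 500 bp per year, 6 Gb per MiSeq run, 81 Gb as the lower bound for a HiSeq 4000 lane) and assumes that 1 Gb means 10^9 bases; the function names are illustrative.

```python
# Figures taken from the comparison table; 1 Gb is assumed to be 10**9 bases.
SANGER_READS_PER_YEAR = 5 * 10**5
SANGER_READ_LENGTH_BP = 500
MISEQ_RUN_BP = 6 * 10**9
HISEQ_LANE_MIN_BP = 81 * 10**9


def sanger_bp_per_year():
    """Yearly Sanger output in bases: reads per year times read length."""
    return SANGER_READS_PER_YEAR * SANGER_READ_LENGTH_BP


def sanger_years_equivalent(output_bp):
    """How many years of Sanger sequencing match the given output."""
    return output_bp / sanger_bp_per_year()


print(sanger_bp_per_year())                        # 250000000, i.e. 250 * 10^6 bp
print(sanger_years_equivalent(MISEQ_RUN_BP))       # 24.0
print(sanger_years_equivalent(HISEQ_LANE_MIN_BP))  # 324.0
```

Under these assumptions, one MiSeq run corresponds to roughly 24 years of Sanger output, and a single HiSeq 4000 lane to more than 300 years.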

Page 19

Workflow Amplified Single Molecule Sequencing

Library preparation

Emulsion PCR (Ion Torrent) / ‘Polony’ PCR on a slide (Illumina)

Semiconductor sequencing (Ion Torrent)

Sequencing by synthesis (Illumina)

Data Analysis

Highlighted step: Library preparation

Page 20

Library preparation

Ligation of platform-specific adapters / primers to DNA fragments

[Figure: DNA fragment flanked by adapters A and B, with a barcode]

Page 21

Barcoding > multiplex advantages

[Figure: two panels across the steps Libraries, ePCR, Enrichment, Deposition; "No barcode" with libraries 1 to 8, and "barcode" with libraries 1 to 10]
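Barcoding lets several libraries share one run because each read can be traced back to its library afterwards. The sketch below shows that demultiplexing step in its simplest form. The barcode sequences, their length, and the sample names are hypothetical; real kits define their own barcodes, and production demultiplexers usually tolerate sequencing errors in the barcode, which this exact-match version does not.

```python
from collections import defaultdict

# Hypothetical barcodes; real kits define their own sequences and lengths.
BARCODES = {"ACGT": "sample_1", "TGCA": "sample_2"}
BARCODE_LENGTH = 4


def demultiplex(reads):
    """Group reads by their leading barcode and trim the barcode off."""
    by_sample = defaultdict(list)
    for read in reads:
        barcode, insert = read[:BARCODE_LENGTH], read[BARCODE_LENGTH:]
        sample = BARCODES.get(barcode, "unassigned")
        by_sample[sample].append(insert)
    return dict(by_sample)


pooled_reads = ["ACGTGGGTTT", "TGCACCCAAA", "ACGTAAAA", "NNNNGGGG"]
result = demultiplex(pooled_reads)
print(result)
```

Reads whose barcode is not recognized end up in an "unassigned" bin instead of being silently dropped, which makes barcode problems visible.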

Page 22

Workflow Amplified Single Molecule Sequencing

Library preparation

Emulsion PCR (Ion Torrent) / ‘Polony’ PCR on a slide (Illumina)

Semiconductor sequencing (Ion Torrent)

Sequencing by synthesis (Illumina)

Data Analysis

Page 23

Illumina Library

[Figure: library molecule with the labels P5, Rd 1 Seq Primer, Sequence of interest, Rd 2 Seq Primer, Index Seq Primer, INDEX, P7]

Page 24

Illumina: Sequencing by terminators

[Figure: example reads GCTGATCAG… and GGGGGGGCG…]

Page 25

MiSeq: 1 flowcell, 1 lane, ~25M reads

HiSeq: 2 flowcells, 8 lanes each, 300-400M reads / lane

[Figure: both instruments with labels: flow cell compartment, optics, reagent compartment, touch screen monitor, syringe pumps, keyboard & mouse tray]

Page 26

Workflow Amplified Single Molecule Sequencing

Library preparation

Emulsion PCR (Ion Torrent) / ‘Polony’ PCR on a slide (Illumina)

Semiconductor sequencing (Ion Torrent)

Sequencing by synthesis (Illumina)

Data Analysis

Page 27

Emulsion PCR

Ion OneTouch 2 System, Ion Chef

Page 28

Emulsion PCR

Clonal amplification of DNA fragments

Mix DNA library & capture beads (1 cpb, copy per bead) + PCR reagents + emulsion oil

Create “water-in-oil” emulsion (micro-reactors)

Perform emulsion PCR

“Break micro-reactors”, enrich for DNA-positive beads

Annealing of sequencing primer

[Figure label: library DNA with adapters A and B]

Page 29

Emulsion PCR

Preferably 1 cpb (clonal amplification)

Also possible:
• ≥ 2 cpb
• 1 copy, ≥ 2 beads

Page 30

Possible outcome emulsion PCR

multicopy beads

duplicate reads

clonal amplification
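One way to see why the library is diluted toward 1 cpb is to model bead loading. The sketch below assumes, as a simplification that is not stated on the slides, that library molecules meet beads independently, so the number of copies per bead follows a Poisson distribution. The function names are illustrative.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k events for a Poisson distribution with the given mean."""
    return math.exp(-mean) * mean**k / math.factorial(k)


def bead_outcomes(mean_copies_per_bead):
    """Expected fractions of empty, clonal (1 copy) and multicopy (>= 2 copies) beads."""
    empty = poisson_pmf(0, mean_copies_per_bead)
    clonal = poisson_pmf(1, mean_copies_per_bead)
    multicopy = 1.0 - empty - clonal
    return {"empty": empty, "clonal": clonal, "multicopy": multicopy}


for mean in (0.1, 0.5, 1.0):
    fractions = bead_outcomes(mean)
    print(mean, {name: round(value, 3) for name, value in fractions.items()})
```

Under this model a low average load gives mostly empty beads but very few multicopy beads, which is why an enrichment step for DNA-positive beads follows the emulsion PCR.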

Page 31

Workflow Amplified Single Molecule Sequencing

Library preparation

Emulsion PCR (Ion Torrent) / ‘Polony’ PCR on a slide (Illumina)

Semiconductor sequencing (Ion Torrent)

Sequencing by synthesis (Illumina)

Data Analysis

Page 32

Semiconductor Sequencing chip

Page 33

Ion Torrent: Semiconductor sequencing

Next Gen Sequencing

PPi

Page 34

Fast (real time) Direct Detection

DNA → Ions → Sequence

– Nucleotides flow sequentially over the Ion semiconductor chip
– One sensor per well per sequencing reaction
– Direct detection of natural DNA extension, no cameras
– Millions of sequencing reactions per chip
– Fast detection, fast cycle time, real time detection

[Figure: cross-section of a sensor well with labels: dNTP, H+, ∆ pH, ∆ Q, ∆ V, sensing layer, sensor plate, silicon substrate (drain, source, bulk), to column receiver]

Page 35

Sequencing Workflow

Flow order: TACG

The signal strength is proportional to the number of nucleotides incorporated (1-mer, 2-mer, 3-mer, 4-mer)

Key sequence: TCAG, for signal calibration and normalization

[Figure: flowgram; sequence shown: TTCTGCGAA]
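The relation between flow order, homopolymer length and signal can be made concrete with a small simulation. This is an idealized sketch: it assumes noise-free signals, a template written in the direction of synthesis, and a flow order that simply cycles through TACG as on the slide (instruments may use other flow orders). The names `flowgram` and `decode` are illustrative.

```python
from itertools import cycle


def flowgram(template, flow_order="TACG", max_flows=200):
    """Return (nucleotide, incorporated count) for each flow until the template is read."""
    signals = []
    position = 0
    for flow_number, base in enumerate(cycle(flow_order)):
        if position >= len(template) or flow_number >= max_flows:
            break
        run_length = 0
        # A homopolymer stretch is incorporated completely within a single flow.
        while position < len(template) and template[position] == base:
            run_length += 1
            position += 1
        signals.append((base, run_length))
    return signals


def decode(signals):
    """Base calling in the ideal case: repeat each flowed base by its signal."""
    return "".join(base * count for base, count in signals)


key_signals = flowgram("TCAG")
read_signals = flowgram("TTCTGCGAA")
print([count for _, count in key_signals])   # [1, 0, 1, 0, 0, 1, 0, 1]
print([count for _, count in read_signals])  # [2, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 2]
print(decode(read_signals))                  # TTCTGCGAA
```

The key TCAG yields only signals of 0 and 1, one for each of the four nucleotides, which is what makes it usable as a reference for the 1-mer signal level.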

Page 36

Ion Proton™ Sequencer, Personal Genome Machine

Page 37

PGM Chip chamber

Chip

Page 38

Data analysis

Simplified Next Generation Sequencing

Impossible to assemble manually

Same dataset, different parameters

Page 39

Applications Next Generation Sequencing

Genome sequencing:
• De novo sequencing
• Metagenome
• Whole Genome
• Targeted resequencing (amplicon panels, capture panels): selected gene panels, mutation analysis (SNPs, low-frequency mutations etc.), structural variation (deletions, insertions, inversions, CNVs), whole exome sequencing, mitochondrial sequencing

RNA sequencing

Epigenome sequencing

Single Cell Sequencing

Page 40

De Novo Sequencing

Page 41

VIDISCA-454 De novo virus discovery

Michel de Vries

Cloning in TA vector Colony-PCR

Sequencing of colony-PCR products

Next Generation Sequencing Ion Torrent PGM sequencing

Page 42

Metagenome Sequencing

Sample: mixed bacterial genomes in the sample

DNA cut into fragments

Sequencing

Alignment of reads to reference bacterial genome

Relative abundance of bacterial species
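The last step, going from aligned reads to relative abundance, can be sketched as a simple count. The species names and read assignments below are made up for illustration, and the sketch assumes that each read has already been assigned to exactly one reference genome. It does not correct for genome size, which a real analysis would normally need to consider.

```python
from collections import Counter


def relative_abundance(read_assignments):
    """Fraction of reads assigned to each species."""
    counts = Counter(read_assignments)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {species: count / total for species, count in counts.items()}


# Hypothetical result of aligning ten reads to reference genomes.
assignments = ["E. coli"] * 6 + ["B. subtilis"] * 3 + ["S. aureus"] * 1
print(relative_abundance(assignments))
```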

Page 43

Targeted resequencing

Multiplex Amplification

Page 44

Coverage distribution (after optimization)

2017 Artec, Olaf Mook

Page 45

Compare reads with reference

2017 Artec, Olaf Mook
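Comparing reads with a reference, and the coverage per position that results from it, can be illustrated with a toy pileup. This is a sketch under strong assumptions: reads are already aligned, alignments are ungapped, and positions are 0-based. The thresholds `min_depth` and `min_fraction`, the function names, and the example sequences are hypothetical and are not taken from the slides.

```python
from collections import Counter


def pileup(reference, aligned_reads):
    """Count the bases observed at each reference position.

    aligned_reads is an iterable of (start, sequence) pairs, 0-based and ungapped.
    """
    columns = [Counter() for _ in reference]
    for start, sequence in aligned_reads:
        for offset, base in enumerate(sequence):
            position = start + offset
            if 0 <= position < len(reference):
                columns[position][base] += 1
    return columns


def candidate_variants(reference, columns, min_depth=2, min_fraction=0.5):
    """Report positions where the most frequent read base differs from the reference."""
    variants = []
    for position, counts in enumerate(columns):
        depth = sum(counts.values())
        if depth < min_depth:
            continue
        base, count = counts.most_common(1)[0]
        if base != reference[position] and count / depth >= min_fraction:
            variants.append((position, reference[position], base, depth))
    return variants


reference = "ACGTACGTAC"
reads = [(0, "ACGTTC"), (2, "GTTCGT"), (4, "TCGTAC")]
columns = pileup(reference, reads)
coverage = [sum(counts.values()) for counts in columns]
print(coverage)
print(candidate_variants(reference, columns))
```

In this example all three reads carry a T where the reference has an A at position 4, so that position is reported with a depth of 3.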

Page 46

Applications Next Generation Sequencing

Genome sequencing

RNA sequencing:
• Transcriptome
• Expression
• Alternative splicing
• miRNA
• ncRNA
• ..

Epigenome sequencing

Single Cell Sequencing

Page 47

More to RNA than messenger RNA

Genome → mRNA → Protein

Transcriptome
Non-coding: >80%
Protein-coding: <3%

Small non-coding RNA (<300 nt)
miRNA (18-24 nt): post-transcriptional gene regulation
piRNA (26-31 nt): germ line transposon silencing
snRNA (~150 nt): pre-mRNA splicing
snoRNA (60-300 nt): RNA modification, rRNA processing
scaRNA: RNA modification
Y RNA: DNA replication, small RNA maturation
vRNA (88-98 nt)
siRNA (21-23 nt)

Long non-coding RNA (>200 nt)

Circular RNA (circRNAs) (100 to >4000 nt)

Transcriptional RNA
tRNA (70-100 nt): translation
rRNA (120-4700 nt): translation

Repetitive DNA / RNA
Interspersed repeats (SINEs, LINEs)
Processed pseudogenes
Simple sequence repeats
Segmental duplication
Blocks of tandem repeats

2017 Artec, Brendon Scicluna

Page 48

Relative abundance of RNA species

[Figure: bar chart, y-axis "Relative abundance (%)"; values shown: 80, 4, 13, 1, 2]

2017 Artec, Brendon Scicluna

Page 49

Relative abundance of RNA species

[Figure: bar chart, y-axis "Relative abundance (%)"; values shown: 80, 4, 13, 1, 2]

RIN (RNA integrity number)

2017 Artec, Brendon Scicluna

Page 52

Approaches for RNA sequencing

Gene expression: Target poly(A) mRNAs (enrich or selectively amplify).

Alternative splicing: Target exon/intron boundaries by either long-read sequencing (>300 bp) or paired-end read sequencing (≥ 2 × 100).

miRNA (small RNAs): Target short reads using size-selection purification, because miRNAs are in the 18–23 nt range.

Non-coding RNA: Directional (strand-specific) RNA sequencing is critical.

Anti-sense RNA: Consider combining mRNA expression with directional RNA sequencing.

Single cell RNA: Requires whole transcriptome amplification. The critical challenge is the technical noise created by amplification.

2017 Artec, Brendon Scicluna

Page 53

Preparing RNA for next generation sequencing

The core steps in preparing RNA for NGS analysis are:
• converting target to double-stranded DNA (cDNA)
• fragmenting and/or sizing the target sequences to a desired length
• attaching oligonucleotide adapters to the ends of target fragments
• quantitating the final library product for sequencing

PCR and/or amplification bias

2017 Artec, Brendon Scicluna

Page 54

Comparison of RNA library prep kits:
• Kapa mRNA Hyper prep kit
• Kapa total RNA Hyper prep kit with RiboErase
• Ovation RNA Seq System v2 in combination with Ovation Ultralow Library System v2

[Figure: HTSeq counts, human and mouse]

2017 Jakobs & Jongejans

Page 55

Concluding remarks

RNA samples contain many types of RNA molecules, not only mRNA.

Ribosomal RNA is the predominant biotype.

RNA samples represent molecules at various stages of the RNA life cycle, in different ratios.

RINs > 6.0 are OK.

Know your library preparation kits!

Page 56

Applications Next Generation Sequencing

Genome sequencing

RNA sequencing

Epigenome sequencing:
• Histone modification and composition
• Chromatin accessibility
• Chromatin interaction

Single Cell Sequencing

Page 57

Epigenetics

• Stable heritable traits that cannot be explained by changes in DNA sequence

• Describes gene regulation processes that drive gene expression in relation to (cell) differentiation and development

Epigenetics: bridge between “nature” and “nurture”

[Figure labels: Genetics, Environment, Epigenetics, (complex) disease]

2017 Artec, Peter Henneman

Page 58

Epigenetic modifications

DNA and histone modification go hand in hand. Regulation of gene expression often involves complex chromatin interactions. Altogether this is called the epigenome.

[Figure: three numbered panels (1, 2, 3)]

2017 Artec, Peter Henneman

Page 59

Epigenetics and NGS

• DNA methylation

• Detecting histone modifications, e.g. ChIP-seq

• Detecting chromatin accessibility, e.g. ATAC, MNase, DNase, FAIRE

• Detecting chromatin interaction, e.g. Hi-C

Page 60

Applications Next Generation Sequencing

Genome sequencing

RNA sequencing

Epigenome sequencing

Single Cell Sequencing

Yong Wang, Nicholas E. Navin

Advances and Applications of Single-Cell Sequencing Technologies

Mol Cell, Volume 58, Issue 4, 2015, 598–609

Page 61

Single Cell biology impacts most areas of research

Figure 3. Broad Applications of SCS in Biological and Biomedical Research. Panels illustrating the diverse fields of biology that have been impacted by SCS technologies over the past 5 years. Image credits: neurobiology, Zeynep Saygin (Cell Picture Show); germli...

Yong Wang, Nicholas E. Navin

Advances and Applications of Single-Cell Sequencing Technologies

Mol Cell, Volume 58, Issue 4, 2015, 598–609

Page 62

By Kierano - Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=63202666

Page 63

Individual cells behave differently from the average of many cells

Page 64

Pooled mRNA analysis is misleading

Page 65

Pooled cell data can be misleading
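A small numeric example shows why a pooled measurement can mislead. The expression values below are invented for illustration: half of the cells do not express the gene at all and the other half express it strongly. The function name is hypothetical.

```python
from statistics import mean


def pooled_vs_single(expression_per_cell):
    """Return the pooled average and the distance from it to the closest single cell."""
    pooled = mean(expression_per_cell)
    nearest_gap = min(abs(value - pooled) for value in expression_per_cell)
    return pooled, nearest_gap


# Hypothetical expression of one gene in ten cells (arbitrary units).
cells = [0, 0, 0, 0, 0, 100, 100, 100, 100, 100]
pooled, nearest_gap = pooled_vs_single(cells)
print(pooled, nearest_gap)
```

The pooled value lies halfway between the two groups, so it describes none of the individual cells in this example.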

Page 66

Level of detail

Page 68

Conditions Single Cell partitioning systems

• Single cell suspension at high enough concentration
No aggregates / clumping
No doublets
o FACS
o DNase / trypsin treatment
o Cell strainer

• As little manipulation time as possible
Avoiding perturbation of transcriptional profiles (RNA expression)

• Nanoliter protocols
Technically efficient and cost efficient

Page 69

Flow sorting

384-well plate filled with lysis buffer

Reverse transcription → cDNA → Library preparation → Sequencing

Page 70

C1 Single cell capture, Fluidigm

Integrated fluidic circuits

[Video: C1_fluidigm.mp4]

Page 71

Components of RNA seq method

Reverse transcription
• Conversion of RNA to cDNA

Amplification
• Amplification of cDNA to usable amounts

Barcoding & Indexing
• Labeling library sources

Library prep & Analysis
• Prepare sample for sequencing
• Data output type

[Figure labels: nL rxn, μL rxn]

Page 72

Result cDNA synthesis C1

[Figure: capture sites labeled 1 cell, 1 cell, 1 cell, 2 cells, 1 cell, 1 cell]

Page 73

Chromium Single Cell Solution 10x Genomics

microfluidics

Page 74

Droplet encapsulation

GEM

[Video: 10x_Genomics_MMshort.mp4]

Page 75

Yields / efficiency

FACS: 50% - 60% informative wells

C1: 50% - 70% capture efficiency

Chromium: 50% - 60% informative cells

Dependent on cell type and viability

Page 76: Next Generation Sequencing at the - AMCbioinformatics.amc.nl/...analysis/2018-sequencing... · 2005 First Next Generation sequencing system: GS 20, 454 Life Sciences (100 bp) ...

Brief overview of scRNA-seq approaches

Slide labels: FACS, microfluidics

| Protocol example | Transcript data | Platform | Throughput (# cells) | Typical read depth (per cell) | Reaction volume |
|---|---|---|---|---|---|
| SMART-seq2 | Full length | Plate-based | 10^2-10^3 | 10^6 | microliter |
| MATQ-seq | Full length | Plate-based | 10^2-10^3 | 10^6 | microliter |
| MARS-seq | 3' end counting | Plate-based | 10^2-10^3 | 10^4-10^5 | microliter |
| CEL-seq | 3' end counting | Plate-based | 10^2-10^3 | 10^4-10^5 | nanoliter |
| C1 (SMARTer) | Full length | Microfluidics | 10^2-10^3 | 10^6 | nanoliter |
| DROP-seq | 3' end counting | Droplet | 10^3-10^4 | 10^4-10^5 | nanoliter |
| InDrop | 3' end counting | Droplet | 10^3-10^4 | 10^4-10^5 | nanoliter |
| Chromium | 3' end counting | Droplet | 10^3-10^4 | 10^4-10^5 | nanoliter |
| Seq-well | 3' end counting | Nanowell array | 10^3-10^4 | 10^4-10^5 | nanoliter |
| SPLIT-seq | 3' end counting | Plate-based | 10^3-10^5 | 10^4 | microliter |

Haque et al. Genome Medicine (2017) 9:75
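Throughput and read depth together set the sequencing capacity an experiment needs: total reads are roughly the number of cells times the reads per cell. The sketch below works that out and converts it to sequencing lanes. The function names are hypothetical, and the default of 350*10^6 reads per lane is the HiSeq figure from the price overview in this deck, used here only as an assumption.

```python
def total_reads(n_cells, reads_per_cell):
    """Total reads required for n_cells at a given depth per cell."""
    return n_cells * reads_per_cell


def lanes_needed(n_cells, reads_per_cell, reads_per_lane=350 * 10**6):
    """Whole lanes required, rounding up (assumed lane yield: 350*10^6 reads)."""
    total = total_reads(n_cells, reads_per_cell)
    return -(-total // reads_per_lane)  # integer ceiling division


if __name__ == "__main__":
    # Droplet-style experiment: 10^4 cells at 10^5 reads per cell.
    print(total_reads(10**4, 10**5), lanes_needed(10**4, 10**5))
    # Full-length, plate-style experiment: 96 cells at 10^6 reads per cell.
    print(total_reads(96, 10**6), lanes_needed(96, 10**6))
```

The comparison illustrates the trade-off in the overview: droplet methods sequence many cells shallowly, while full-length methods sequence few cells deeply, so both can end up needing a similar total number of reads.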

Conclusions next gen sequencing

Works for both research and diagnostics

• Detection of alternative splicing

• Expression

• Receptor rearrangement

• Virus discovery

• De novo sequencing

• Target capture

• Exomes

• Whole genomes

• HT sequencing of large genes

• Chimerism detection

• Single cell genomics / transcriptomics

Next challenge in NGS

• Bioinformatics
  – Data analysis
  – Tracking and tracing
  – Interpretation

• Data storage
  – New developments in analysis software

• Logistics
  – Pre- and post-PCR laboratories
  – Data and sample tracking

The future of medical genetics

• 1000 dollar genome: once in a lifetime test?

• Interpretation of data

• Whole genome analysis might replace exome sequencing (routine now)

• Identification of genes for many recessive diseases

Multidisciplinary

Future applications

Whole Genome

Whole Exome

Targeted resequencing

Epigenome, Conformation

LDC, FFPE / Frozen tissue : : Split-Seq

Acknowledgements

Core Facility Genomics: Linda Koster, Suzan Kenter, Ferry de Klein

Clinical Genetics: Martin Haagmans, Olaf Mook

Clinical Genetics: Frank Baas

Bioinformatics Laboratory: Aldo Jongejan, Perry Moerland

| Platform | Item | Output | Price |
|---|---|---|---|
| (general) | Library prep | | €80 - €125 |
| MiSeq | PE 150 | 25 x 10^6 reads per run | €1,300 |
| HiSeq | PE 150 | 350 x 10^6 reads per lane | €2,623 |
| HiSeq | SR 50 | 350 x 10^6 reads per lane | €1,108 |
| PGM | 200 bp | 5 x 10^6 reads per 318 chip | €860 |
| PGM | 400 bp | 5 x 10^6 reads per 318 chip | €950 |
| Proton | 200 bp | 70 x 10^6 reads per chip | €1,070 |
| C1 | Capture, RNA isolation & cDNA synthesis | | €1,109.41 per IFC (96 captures) |
| C1 | Library prep | | €1,211.07 per IFC (96 captures) |
| Chromium | 3' library prep (SC) | | ~€1,900 per library |
| Chromium | DNA | | ~€400 per library |
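To compare the listed prices, it can help to normalise them to a cost per million reads, and for the C1 to a cost per capture site. The sketch below does this with the figures from the overview above. The dictionary and function names are hypothetical, the general library prep fee is not included, and the comparison ignores differences in read length and read type, so it is an indication rather than a like-for-like ranking.

```python
# (price in euro, output in millions of reads) per run, lane, or chip,
# taken from the price overview.
RUNS = {
    "MiSeq PE 150": (1300, 25),
    "HiSeq PE 150": (2623, 350),
    "HiSeq SR 50": (1108, 350),
    "PGM 200 bp": (860, 5),
    "PGM 400 bp": (950, 5),
    "Proton 200 bp": (1070, 70),
}

# C1 costs per IFC (96 capture sites): capture + cDNA synthesis, then library prep.
C1_CAPTURE_AND_CDNA = 1109.41
C1_LIBRARY_PREP = 1211.07
C1_CAPTURES_PER_IFC = 96


def cost_per_million_reads(name):
    """Euro per million reads for one entry of RUNS."""
    price, million_reads = RUNS[name]
    return price / million_reads


def c1_cost_per_capture():
    """Euro per capture site for a full C1 workflow, before sequencing."""
    return (C1_CAPTURE_AND_CDNA + C1_LIBRARY_PREP) / C1_CAPTURES_PER_IFC


if __name__ == "__main__":
    for name in sorted(RUNS, key=cost_per_million_reads):
        print(f"{name:14s} {cost_per_million_reads(name):8.2f} euro per million reads")
    print(f"C1 per capture site: {c1_cost_per_capture():.2f} euro")
```

Because not every C1 capture site yields a usable cell, the cost per informative cell will be higher than the cost per capture site.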

Structural Variation

miRNA sequencing

Called vs reference homopolymers

Unique Molecular Identifiers

• Uses large numbers (>10^5) of tags in the initial RT primer

• Each transcript from a given gene is primed with a unique barcode sequence

• The unique combination of tag and transcript can be used to identify individual RNA molecules

• The chance of two transcripts from the same gene receiving the same tag is effectively zero
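The points above can be sketched in a few lines: molecules are counted as distinct (gene, tag) pairs, so PCR duplicates collapse to one count, and the chance of two molecules sharing a tag follows the classic birthday-problem calculation. This is an illustrative sketch only, assuming tags are drawn uniformly at random and ignoring sequencing errors in the tag; the function names are hypothetical and this is not the method of any particular pipeline.

```python
import math
from collections import defaultdict


def count_molecules(reads):
    """Count distinct UMIs per gene from an iterable of (gene, umi) pairs.

    Reads sharing both gene and UMI are treated as PCR duplicates of one
    original RNA molecule.
    """
    umis_per_gene = defaultdict(set)
    for gene, umi in reads:
        umis_per_gene[gene].add(umi)
    return {gene: len(umis) for gene, umis in umis_per_gene.items()}


def collision_probability(n_molecules, n_tags):
    """Probability that at least two of n_molecules receive the same tag.

    Assumes each molecule draws one of n_tags uniformly at random.
    """
    if n_molecules > n_tags:
        return 1.0
    # Sum of logs is numerically safer than multiplying many factors near 1.
    log_all_unique = sum(math.log1p(-i / n_tags) for i in range(n_molecules))
    return 1.0 - math.exp(log_all_unique)


if __name__ == "__main__":
    reads = [("GeneA", "AAC"), ("GeneA", "AAC"), ("GeneA", "GTT"), ("GeneB", "AAC")]
    print(count_molecules(reads))
    for n in (10, 100, 1000):
        print(n, collision_probability(n, 10**5))
```

Under these assumptions, a collision is very unlikely for genes with a handful of transcripts per cell, but the probability rises with the number of molecules of the same gene, so "effectively zero" is best read as applying to low and moderate expression levels.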

C1 SMARTer

Reverse transcription, template switch

• Poly A or random priming

• Generates full length cDNA

cDNA

Library Preparation