6/23/11 Technologies&for& metagenomic&data …statweb.stanford.edu/~susan/Summer11/Lect2-NGS.pdf ·...
Transcript of 6/23/11 Technologies&for& metagenomic&data …statweb.stanford.edu/~susan/Summer11/Lect2-NGS.pdf ·...
6/23/11
1
Technologies for metagenomic data collec7on Overview
• “Old school techniques” • Next genera7on sequencing • Microarrays: PhyloChip
Old school techniques
• Pla7ng – Can only work with culturable strains – Very laborious
• Flow cytometry (unconven7onal) – Davey, HM. and Kell, DB. Flow cytometry and cell sor7ng of heterogeneous microbial popula7ons:
the importance of single-‐cell analysis. Microbiological reviews, Dec. 1996, 641-‐696.
• qPCR – Need primers for every species – Cannot iden7fy previously unknown species – Laborious
“Old genera7on sequencing”: Sanger sequencing
• DNA is fragmented
• Cloned to a plasmid vector
• Cyclic sequencing reac7on
• Separa7on by electrophoresis
• Readout with fluorescent tags
Current genera7on next genera7on sequencing pla[orm comparison
Read length* Gb per run*
Technology
Illumina www.illumina.com
GA II /HiSeq
2 x 100 bp 20+ Gb Bridge amplifica7on
454 www.454.com
GS FLX Titanium
1x400-‐600 bp 2x140-‐200 bp
~0.5 Gb Emul7on PCR amplifica7on Pyrosequencing by extension
Applied Biosystems www.appliedbiosystems.com
SOLiD 3 2 x 50 bp 20+ Gb Emulsion PCR amplifica7on Liga7on-‐based sequencing Alignment in color space
Helicos www.helicosbio.com
Single molecule
2 x 25-‐55 bp 21–28Gb No amplifica7on Single molecule sequencing
* Instrument vendors are constantly upda7ng their technology, chemistry, and QA in order to increase these parameters
Emulsion PCR
• Fragments, with adaptors, are PCR amplified within a water drop in oil.
• One primer is aiached to the surface of a bead.
• Used by 454, Polonator and SOLiD.
Bridge PCR
• DNA fragments are flanked with adaptors. • A flat surface coated with two types of primers, corresponding
to the adaptors. • Amplifica7on proceeds in cycles, with one end of each bridge
tethered to the surface. • Used by Solexa.
6/23/11
2
Roche 454 GS FLX Sequencing
• 1 million reads per run
• 400-‐500 base pares per read
• Ability to mul7plex (sequence many specimens in a single run) with bar codes
Marcel Margulies, et. al. Nature 437, 376-‐380 (15 September 2005)
How does Roche 454 sequencing work?
• Nucleo7des are cycled in order T, A, C, G (“flows”)
• When incorporated onto a template a fluorescent signal is emiied
• The amount of fluorescence is recorded
• The reagents are washed and the flow cycle restarts
hip://www.youtube.com/watch?v=kYAGFrbGl6E
454 base calling
>ACGCTCTTGGCCAGTCTAGTCGTGCG >CCAGTCTAGTCGTGCGACGCTCTTGG >GGCCAGTCTAGTCGTGCGACGCTCTT >ACGCTCTTAGTCTAGTCGTGCGGGCC >TCTTGGCCAGTCTAGTCGTGCGACGC
T:0!A:1!C:1!G:0!T:1!A:1!C:0!G:2!T:1!A:1!
ACTAGGTA
General strategy for
• Select a marker genomic region – Must be present in all bacteria – Must have regions constant for all bacteria for primer design
– Must have regions of variability so that individual species/taxa can be dis7nguished
• Extract DNA from biological specimens • Amplify the marker from extracted DNA • Sequence mul7ple samples in mul7plex
Targeted sequencing of 16S rRNA gene V2 V1 V4 V3 V6 V5 V8 V7 V9
A B 454 adaptors
Barcodes (10 nt)
Forward/reverse primers
5’ 5’
Loca7on of the hyper-‐variable regions of the 16S rRNA and a typical primer construct for amplicon library prepara7on
500 1000 1500
16S rRNA small subunit is commonly used • Present in “all” bacteria • Contains variable regions that (may) carry enough phylogene7c informa7on to dis7nguish different bacteria on a “deep-‐enough” taxonomic/phylogene7c level
Limita7ons
• Unique sequences represent unique phylotypes, not species, due to 16S gene mul7plicity
• Abundances computed with different primers are not comparable
!"#$%&!
!!
!
!
!
!
!
!
!
!
!
!
!!
!
!
!
!
!
!
!
!
!!
!
!
!
!
!
!
!
! !
!
!
!
!
!!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!!
!
!
!
!'()'*!
!'*)'+!
((,!-./0.1$2
3(!,!-./0.1$2
Most variance in the data is attributable to the 16S rRNA locus at which the microbiomes are assessed. !
Similarity of phyla (left) and genera (right) identified using different primers.!
!"#$%&'()'*+,-.'/0'1(##(0'
!"#$""%&'()'*&+%,&'*)'-
!"
#!
#"
!"#!$$%&&%'!%#!"
!"#!$
!"##"$
!%#!"
!"#$%&'()'*%+%&,'-+'.(##(+'
!"#$""%&'()'*&+%,&'*)'-
%$&
&%%
%$&
'()*+,-.-$$
#'()*$+,+-.
#'()*$+,+/0