Should Ion Torrent Sequencing Be Used For Amplicon Sequencing? - Lauren Bragg
Lauren Bragg | Bioinformatician
13 February 2014
CSIRO COMPUTATIONAL INFORMATICS
Should Ion Torrent sequencing be used for amplicon sequencing?
What are ideal characteristics of an amplicon-appropriate platform?
• Long read lengths
• High accuracy
• High throughput (or low cost per ‘tag’)
• Amplicon composition an accurate reflection of the community composition
The Ion Torrent PGM
Superficial comparison of Roche 454 Pyrosequencing and Ion Torrent
• Similar library preparation
• Light (454) versus pH (Ion Torrent) detection
• ‘TACG’ flow pattern (454) versus 32-base flow pattern (Ion Torrent)
• Both append 0+ bases during each flow
Definitely a technology under development
• Kits, chips and software seem to be updated constantly
• Read length and read density make it a compromise between 454 and Illumina
• Paired-end is apparently available (Scott Chandry)
I would not have recommended Ion Torrent a year ago…
• High over-call/under-call error rate (~1.38% global mean rate)
InDel error-rate
Substitution error-rate
I would not have recommended Ion Torrent a year ago…
• High over-call/under-call error rate (1.3%)
• Mean flow error-rate varies wildly between flow positions
I would not have recommended Ion Torrent a year ago…
• High over-call/under-call error rate (1.3%)
• Mean flow error-rate varies wildly between flow positions
• High-frequency indels (relative to reference): 1 per 2Kb of reference genome
Across both strands Strand-specific
I would not have recommended Ion Torrent a year ago…
• High over-call/under-call error rate (1.3%)
• Mean flow error-rate varies wildly between flow positions
• High-frequency indels (relative to reference): 1 per 2Kb of reference genome
• Small bias against low G+C% bugs, and very strong bias against high G+C% bugs
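The over-call/under-call error rate and its variation across flow positions could be tallied along the following lines. This is only a sketch, not the workflow used for the talk: it assumes each read has already been reduced to a pair of lists (expected and observed homopolymer length per flow position), and both the function name `per_flow_error_rates` and the toy reads are made up for illustration.

```python
from collections import defaultdict


def per_flow_error_rates(reads):
    """Return {flow position: (over-call rate, under-call rate)}.

    `reads` is an iterable of (expected, observed) pairs, where each element
    is a list of homopolymer lengths indexed by flow position.
    """
    totals = defaultdict(int)
    over = defaultdict(int)
    under = defaultdict(int)
    for expected, observed in reads:
        for pos, (exp, obs) in enumerate(zip(expected, observed)):
            totals[pos] += 1
            if obs > exp:      # more bases called than were really there
                over[pos] += 1
            elif obs < exp:    # fewer bases called than were really there
                under[pos] += 1
    return {
        pos: (over[pos] / totals[pos], under[pos] / totals[pos])
        for pos in sorted(totals)
    }


# Toy data (hypothetical, not from the talk): four reads, four flows each.
toy_reads = [
    ([0, 1, 0, 3], [0, 1, 0, 3]),
    ([1, 0, 2, 0], [1, 0, 3, 0]),  # over-call at flow position 2
    ([0, 2, 1, 1], [0, 1, 1, 1]),  # under-call at flow position 1
    ([1, 1, 0, 2], [1, 1, 0, 2]),
]
rates = per_flow_error_rates(toy_reads)
for pos, (over_rate, under_rate) in rates.items():
    print(pos, over_rate, under_rate)
```

Keeping the rates per flow position, rather than as one global mean, is what would expose the position-to-position variation mentioned in the bullets.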
But what about now?
Turns out it’s very difficult to analyse the new data using my existing workflow…
• SFF format is no longer supported
• In theory the flow-values can be accessed from the flow-value field…
• But it turns out that flow-values don’t correspond to the called sequence
• The phase-correction module can yield reads which are substantially different from their flowgram (out-of-phase (OOP) reads)
• Life Tech won’t/can’t support a phase-corrected flowgram
Flow cycle    T    A     C    G     T
Flow calls    0    1.15  0    3.32  0
Inferred           A          GGG
BaseCaller         A     C    GG
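The mismatch in the example above can be reproduced with a short sketch. It assumes the simplest possible rule for reading a flowgram (round each flow value to the nearest whole number of bases); the function name `flowgram_to_sequence` is hypothetical and this is not the Ion Torrent base caller, only an illustration of why a read can be flagged as out-of-phase.

```python
def flowgram_to_sequence(flow_order, flow_values):
    """Infer a sequence by rounding each flow value to a homopolymer length.

    Assumption: plain nearest-integer rounding, with negative signals
    treated as zero bases.
    """
    bases = []
    for nucleotide, signal in zip(flow_order, flow_values):
        count = max(0, int(signal + 0.5))
        bases.append(nucleotide * count)
    return "".join(bases)


# Values taken from the slide example.
flow_order = "TACGT"
flow_values = [0, 1.15, 0, 3.32, 0]
called = "ACGG"  # sequence reported by the BaseCaller on the slide

inferred = flowgram_to_sequence(flow_order, flow_values)
is_oop = inferred != called  # flowgram and called sequence disagree

print(inferred, called, is_oop)
```

Under this rounding rule the flowgram gives AGGG while the BaseCaller reports ACGG, so any error analysis that relies on the flow values alone would misjudge such a read.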
After much ado, the results…
Flow error-rate
Flow error-rate profile differs between OOP reads and non-OOP reads
High frequency indels
• Still present at around the same frequency (1 per 2Kb reference)
Summary and recommendations
Error-rate
• The over-call/under-call error-rate has decreased dramatically, although flow-specific error-rates persist.
• Error profile differs between OOP and in-phase reads
• High-frequency indels will still cause issues but are unlikely to cause ‘genus’ changes in classification.
Read-length
• Read lengths consistently achieving 400bp
Cheap cost-per-base
• 3 million 400bp reads from a 316 chip for ~$900 (not including labour)
Biases (TBA)
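For a rough sense of scale, the cost figure quoted above can be turned into per-gigabase and per-read numbers. This is back-of-envelope arithmetic only: it takes the ~$900, 3 million reads and 400bp at face value, assumes every read reaches the full length, and (as on the slide) excludes labour.

```python
reads = 3_000_000        # reads from one 316 chip (from the slide)
read_length_bp = 400     # read length in bp (from the slide)
chip_cost_dollars = 900.0  # approximate cost per run, labour excluded

total_bases = reads * read_length_bp
cost_per_gigabase = chip_cost_dollars / (total_bases / 1e9)
cost_per_million_reads = chip_cost_dollars / (reads / 1e6)

print(f"Total yield: {total_bases / 1e9:.1f} Gb")
print(f"Cost per Gb: ${cost_per_gigabase:.0f}")
print(f"Cost per million reads: ${cost_per_million_reads:.0f}")
```

For amplicon work the per-read ("per tag") figure is the more relevant of the two, since each read corresponds to one amplicon.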
Acknowledgements
Computational Informatics
Lauren Bragg
Bioinformatician
t +61 7 3214 2945
e [email protected]
UQ
Gene Tyson
Margaret Butler
Phil Hugenholtz
UWS
Glenn Stone