Should Ion Torrent Sequencing Be Used For Amplicon Sequencing? - Lauren Bragg

Lauren Bragg | Bioinformatician | 13 February 2014

CSIRO COMPUTATIONAL INFORMATICS

Should Ion Torrent sequencing be used for amplicon sequencing?

What are the ideal characteristics of an amplicon-appropriate platform?
• Long read lengths
• High accuracy
• High throughput (or low cost per ‘tag’)
• Amplicon composition that accurately reflects the community composition

The Ion Torrent PGM

Superficial comparison of Roche 454 Pyrosequencing and Ion Torrent
• Similar library preparation
• Light versus pH detection
• ‘TACG’ flow pattern versus 32-base flow pattern
• Both append 0+ bases during each flow
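To make "both append 0+ bases during each flow" concrete, here is a minimal sketch of an ideal, noise-free flow-based read. The function name `simulate_flows`, the `max_flows` guard and the toy sequences are illustrative assumptions, not part of either vendor's software; real instruments report noisy real-valued signals rather than clean integers.

```python
from itertools import cycle

def simulate_flows(synthesised, flow_order="TACG", max_flows=1000):
    """Return the ideal number of bases incorporated at each flow.

    `synthesised` is the strand being built, 5' to 3'. Each flow offers one
    nucleotide; the whole homopolymer run matching it (possibly 0 bases)
    is incorporated before the next flow starts.
    """
    counts = []
    pos = 0
    flows = cycle(flow_order)
    # max_flows is a hypothetical safety guard for bases absent from flow_order
    while pos < len(synthesised) and len(counts) < max_flows:
        base = next(flows)
        n = 0
        while pos < len(synthesised) and synthesised[pos] == base:
            n += 1
            pos += 1
        counts.append(n)
    return counts

print(simulate_flows("AGGG"))  # T, A, C, G flows
print(simulate_flows("ACGG"))
```

A homopolymer such as GGG is read in a single flow, which is why over-calls and under-calls (reading 2 or 4 instead of 3) are the characteristic error of both platforms.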

Definitely a technology under development
• Seems like the kits/chips/software are updated constantly
• Length and density of reads make it a compromise between 454 and Illumina
• Paired-end is apparently available (Scott Chandry)

I would not have recommended Ion Torrent a year ago…
• High over-call/under-call error rate (~1.38% global mean rate)

[Figure panels: InDel error-rate; Substitution error-rate]

I would not have recommended Ion Torrent a year ago…
• High over-call/under-call error rate (1.3%)
• Mean flow error-rate varies wildly between flow positions
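A rough sketch of how a per-flow-position error rate could be tabulated, assuming each read has already been reduced to a pair of aligned lists: the expected homopolymer length at each flow (from the reference) and the called length. The function name `flow_error_rates` and the toy reads are hypothetical and are not the data behind these slides.

```python
def flow_error_rates(pairs):
    """Fraction of reads with an over-call or under-call at each flow position.

    `pairs` is a list of (expected_counts, called_counts), one pair per read,
    both indexed by flow position.
    """
    errors, totals = {}, {}
    for expected, called in pairs:
        for i, (e, c) in enumerate(zip(expected, called)):
            totals[i] = totals.get(i, 0) + 1
            if e != c:  # any over-call or under-call counts as one flow error
                errors[i] = errors.get(i, 0) + 1
    return [errors.get(i, 0) / totals[i] for i in sorted(totals)]

# Toy data (made up for illustration only)
toy_reads = [
    ([0, 1, 0, 3], [0, 1, 0, 3]),  # error-free
    ([0, 1, 0, 3], [0, 1, 0, 2]),  # under-call at flow 3
    ([1, 0, 2, 0], [1, 0, 2, 0]),  # error-free
    ([1, 0, 2, 0], [1, 1, 2, 0]),  # over-call at flow 1
]
print(flow_error_rates(toy_reads))
```

A single global mean hides this per-position structure, which is why the two are reported separately.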

I would not have recommended Ion Torrent a year ago…
• High over-call/under-call error rate (1.3%)
• Mean flow error-rate varies wildly between flow positions
• High frequency indels (relative to reference) – 1 per 2Kb ref. genome

[Figure panels: Across both strands; Strand-specific]

I would not have recommended Ion Torrent a year ago…
• High over-call/under-call error rate (1.3%)
• Mean flow error-rate varies wildly between flow positions
• High frequency indels (relative to reference) – 1 per 2Kb ref. genome
• Small bias against low G+C%, and very strong bias against high G+C% bugs

But what about now?
Turns out it’s very difficult to analyse the new data using my existing workflow…
• SFF format is no longer supported
• In theory the flow-values can be accessed from the flow-value field…
• But it turns out that flow-values don’t correspond to the called sequence

• The phase-correction module can yield reads which are substantially different from their flowgram… (Out-of-Phase (OOP) reads).

• Life Technologies won’t/can’t support a phase-corrected flowgram

[Figure: flow-values versus called sequence for one read]
Flow cycle:   T     A     C     G     T
Flow calls:   0     1.15  0     3.32  0
Inferred:     A GGG
BaseCaller:   A C GG
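The mismatch in this example can be reproduced with a short sketch: round each flow-value to the nearest integer, expand it into bases, and compare the result with the BaseCaller sequence. The function names and the rule "any mismatch means out-of-phase" are assumptions made for illustration; the criterion actually used to label OOP reads in this analysis is not stated on the slides.

```python
def bases_from_flows(flow_values, flow_order="TACG"):
    """Naively infer a sequence by rounding each flow-value to a base count."""
    seq = []
    for i, value in enumerate(flow_values):
        n = max(0, int(value + 0.5))  # nearest integer, never negative
        seq.append(flow_order[i % len(flow_order)] * n)
    return "".join(seq)

def is_out_of_phase(flow_values, called_sequence, flow_order="TACG"):
    """Assumed definition: the rounded flowgram disagrees with the called read."""
    return bases_from_flows(flow_values, flow_order) != called_sequence

flow_values = [0, 1.15, 0, 3.32, 0]      # values from the slide
inferred = bases_from_flows(flow_values)
print(inferred, is_out_of_phase(flow_values, "ACGG"))
```

Here the flowgram implies AGGG while the BaseCaller reports ACGG, so any per-flow error model built on the raw flow-values would be scoring the wrong sequence for this read.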

After much ado, the results…

Flow error-rate

Flow error-rate profile differs between OOP reads and non-OOP reads

High frequency indels

• Still present at around the same frequency (1 per 2Kb reference)

Summary and recommendations

Error-rate
• The over-call/under-call error-rate has decreased dramatically, although flow-specific error-rates persist.
• Error profile differs between OOP and in-phase reads
• High-frequency indels will still cause issues but are unlikely to cause ‘genus’ changes in classification.

Read-length
• Read lengths consistently achieving 400bp

Cheap cost-per-base
• 3 million 400bp reads from a 316 chip for ~$900 (not including labour)

Biases (TBA)
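As a back-of-envelope check on the cost-per-base claim, the quoted figures can be worked through directly. The variable names are illustrative, and the calculation assumes every read reaches the full 400bp, so it is an upper bound on yield and a lower bound on cost.

```python
reads = 3_000_000        # reads per 316 chip (from the slide)
read_length_bp = 400     # assumed uniform read length
run_cost_usd = 900       # approximate run cost, excluding labour

total_bases = reads * read_length_bp
cost_per_megabase = run_cost_usd / (total_bases / 1e6)
cost_per_read = run_cost_usd / reads

print(f"{total_bases / 1e9:.1f} Gb per run")
print(f"${cost_per_megabase:.2f} per Mb, ${cost_per_read:.4f} per read")
```

For amplicon work the per-read (per-‘tag’) figure is the more relevant one, since each read represents one sampled amplicon.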

Acknowledgements

Computational Informatics
Lauren Bragg
Bioinformatician
t +61 7 3214 2945
e [email protected]

UQ

Gene Tyson
Margaret Butler
Phil Hugenholtz

UWS

Glenn Stone
