[Figure, panel (A): schematic of the TGIRT ssDNA-seq workflow. Steps shown: (1) plasma DNA, with target DNA (+) and (−) strands, 3′-OH ends, 5′ phosphates, and DNA nicks; (2) dephosphorylation and denaturation; (3) template-switching by TGIRT from an R2 RNA (3′-blocker)/R2R DNA (3′-N) duplex, followed by alkaline treatment and cDNA clean-up; (4) adaptor ligation (5′-App, 3′-blocker R1R) by thermostable 5′ AppDNA/RNA ligase, followed by cDNA clean-up; (5) PCR amplification with P5 and barcode + P7 primers. Other labels: UMI; ~2 ng; 20 min, 20 min, 60 min.]
Use of Thermostable Group II Intron Reverse Transcriptases (TGIRTs) for Single-Stranded DNA-seq of Cell-Free DNA in Human Plasma and Molecular Diagnostics
Douglas C. Wu and Alan M. Lambowitz
Institute of Cellular and Molecular Biology and Department of Molecular Biosciences, The University of Texas at Austin
▪ 2-h workflow from purified DNA to library
▪ Simple protocol for adding unique molecular identifiers (UMIs) to exclude PCR duplicates
▪ Window protection score (WPS) analysis of long (120-180 nt) fragments exhibits periodicity expected for nucleosome packaging
▪ WPS analysis of shorter (35-80 nt) fragments, which result from DNA nicking by endogenous nucleases, footprints the binding sites of transcription factors such as CTCF
▪ Nucleosome positions predicted by TGIRT-seq in plasma DNA from a healthy individual match those of previous studies (2)
▪ TGIRTs have higher fidelity than conventional viral RTs in RNA-seq (6)
▪ TGIRT ssDNA-seq has a 1.5-fold higher mismatch rate than Nextera XT
▪ TGIRT ssDNA-seq is more prone to indels in mononucleotide runs ≥4 nt
▪ Coverage of E. coli K12 (MG1655) genomic DNA is comparable to that of Nextera XT (5)
▪ High coverage of GC-enriched regions reflects ligation bias
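Coverage evenness of the kind compared in the bullets above is commonly summarized by a Gini coefficient over per-position (or per-bin) depth and judged against the Poisson expectation for uniformly random read placement. The following is a minimal sketch under those assumptions; the function names are hypothetical, and this is not the analysis code used for the poster.

```python
import math


def gini(depths):
    """Gini coefficient of non-negative coverage depths.

    0 means perfectly even coverage; values approaching 1 mean that
    most of the coverage is concentrated in a few positions.
    """
    xs = sorted(depths)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted formula on the sorted values (ranks start at 1).
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return (2.0 * weighted) / (n * total) - (n + 1.0) / n


def poisson_fraction(depth, mean_depth):
    """Expected fraction of positions covered exactly `depth` times
    if reads were placed uniformly at random (Poisson model)."""
    return math.exp(-mean_depth) * mean_depth ** depth / math.factorial(depth)
```

Comparing the observed fraction of the genome at each depth with `poisson_fraction(depth, mean_depth)` gives a simple view of how far a library departs from ideal random sampling.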
Conclusion
Reference
Grant Support and Conflict-of-interest Statement
(B) TGIRT ssDNA-seq of Human Plasma DNA
▪ TGIRT-seq of cell-free plasma DNA from a healthy individual gives data similar to that obtained by conventional ssDNA-seq (2)
▪ Major peak at ~167 nt corresponds to DNA fragments protected in nucleosome cores
▪ 10.4-bp periodicity (gray dashed lines) reflects minor groove nicking of nucleosome-bound DNA by endogenous DNases
▪ Dinucleotide patterns at the ends of 167-nt DNA fragments are as expected for inter-nucleosome cleavage
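The fragment-length profile described above can be tabulated directly from aligned read pairs. Below is a minimal sketch, assuming fragment lengths (in nt) have already been extracted, for example from the template-length field of paired-end alignments; the function names and the 400-nt cutoff are illustrative assumptions.

```python
from collections import Counter


def length_distribution(lengths, max_len=400):
    """Percent of fragments at each length from 1 to max_len.

    Fragments longer than max_len are ignored.
    """
    kept = [n for n in lengths if 1 <= n <= max_len]
    total = len(kept)
    if total == 0:
        return {}
    counts = Counter(kept)
    return {n: 100.0 * counts[n] / total for n in sorted(counts)}


def major_peak(dist):
    """Fragment length with the highest percentage of reads."""
    return max(dist, key=dist.get)


def local_maxima(dist, lo, hi):
    """Lengths in [lo, hi] whose percentage exceeds both neighbours."""
    return [
        n
        for n in range(lo, hi + 1)
        if dist.get(n, 0.0) > dist.get(n - 1, 0.0)
        and dist.get(n, 0.0) > dist.get(n + 1, 0.0)
    ]
```

The spacing between successive values returned by `local_maxima` for lengths below the major peak gives a rough estimate of the sub-nucleosomal periodicity.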
Supported by NIH grants GM37949 and GM37951 and Welch Foundation Grant F-1607. Thermostable group II intron reverse transcriptase (TGIRT) enzymes and methods for their use are the subject of patents and patent applications that have been licensed by the University of Texas at Austin and East Tennessee State University to InGex, LLC. A.M.L. and the University of Texas are minority equity holders in InGex, LLC, and A.M.L. and other present and former Lambowitz laboratory members receive royalty payments from sales of TGIRT enzymes and licensing of intellectual property.
4. Uhlen et al. Science. 2015
5. basespace.illumina.com/projects/21071065
6. Mohr et al. RNA. 2013
▪ TGIRT-seq of fragmented E. coli genomic DNA versus simulation indicates that each DNA fragment has a unique UMI with negligible (<0.005%) recopying of DNA templates (not shown)
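A UMI-based duplicate filter of the kind implied above can be sketched as follows. This is a minimal illustration that assumes each read record carries its mapped chromosome, start, end, strand, and UMI sequence, and that collapses exact matches only; the record layout and function name are hypothetical, and production pipelines often also tolerate sequencing errors within the UMI.

```python
from collections import namedtuple

# Hypothetical record layout for one aligned fragment.
Read = namedtuple("Read", "chrom start end strand umi")


def collapse_duplicates(reads):
    """Keep the first read seen for each (position, strand, UMI) combination.

    Reads that share coordinates but carry different UMIs are kept,
    because they are treated as independent template molecules.
    """
    seen = set()
    unique = []
    for read in reads:
        key = (read.chrom, read.start, read.end, read.strand, read.umi)
        if key not in seen:
            seen.add(key)
            unique.append(read)
    return unique
```

Without the UMI in the key, two distinct molecules with identical ends would be merged, which matters for plasma DNA because nucleosome-protected fragments frequently share endpoints.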
Introduction
1. Fan et al. PNAS. 2008
2. Snyder et al. Cell. 2016
3. Sun et al. PNAS. 2015
▪ Tissue-of-origin of plasma DNA from a healthy individual deduced from analysis of nucleosome spacing signals downstream of transcription start sites and published RNA-seq data (4) for ssDNA-seq (2) and TGIRT-seq
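The nucleosome-spacing signal referred to above is built from window protection scores (WPS). As defined in ref. 2, the WPS at a position is the number of fragments that completely span a window centered there minus the number of fragments with an endpoint inside that window. The sketch below is a minimal, unoptimized illustration; the function names and the half-open coordinate convention are assumptions, and the 120-bp default window for long fragments is taken from ref. 2 rather than from anything stated on this poster.

```python
def window_protection_score(fragments, position, window=120):
    """WPS at `position` for fragments given as (start, end), end exclusive.

    +1 for each fragment that fully spans the window centered on `position`;
    -1 for each other fragment with an endpoint inside the window.
    """
    half = window // 2
    w_start, w_end = position - half, position + half  # window is [w_start, w_end)
    score = 0
    for start, end in fragments:
        last = end - 1  # last base covered by the fragment
        if start <= w_start and last >= w_end - 1:
            score += 1
        elif w_start <= start < w_end or w_start <= last < w_end:
            score -= 1
    return score


def wps_track(fragments, region_start, region_end, window=120):
    """WPS at every position in [region_start, region_end)."""
    return [
        window_protection_score(fragments, pos, window)
        for pos in range(region_start, region_end)
    ]
```

In practice the fragment list would first be filtered by length, for example to the long (120-180 nt) or short (35-80 nt) class, with a correspondingly smaller window for the short class.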
(A) Streamlined Protocol
(D) TGIRT ssDNA-seq Metrics
• Cell-free (cf) DNA in human plasma consists largely of nucleosome-bound DNA fragments released by apoptosis of lymphoid and myeloid cells in blood (1,2)
• In a variety of disease states, plasma is enriched in DNA fragments released from dying cells in the affected tissues. These can be identified by tissue-specific differences in nucleosome positioning, transcription factor occupancy, and DNA methylation sites, thereby providing diagnostic information (2,3)
• Single-stranded DNA sequencing (ssDNA-seq) is more suitable for the analysis of highly fragmented, nicked DNA samples than are conventional dsDNA-seq methods
• The novel end-to-end template-switching activity of Thermostable Group II Intron Reverse Transcriptases (TGIRTs) facilitates ssDNA-seq by enabling direct attachment of DNA-seq adaptors to cDNA product strands without end repair, tailing, or ligation
• TGIRT ssDNA-seq libraries can be constructed from small amounts of starting material in ~2 h with fewer reagents and lower cost than other ssDNA-seq methods
• TGIRTs enable efficient ssDNA-seq that can be used for analysis of cfDNA in human plasma and other bodily fluids
• Identification of protein binding features of cfDNA provides information about the tissue-of-origin and has potential diagnostic applications
• TGIRT DNA-seq should also be applicable to ancient DNA, FFPE DNA, and bisulfite-treated DNA
(C) TGIRT ssDNA-seq Analysis of Nucleosome Positioning and Transcription Factor Occupancies in Human Plasma DNA
[Figure, read-end base composition: fraction of reads with A, C, G, or T versus position relative to read ends (Read 1: 0 to 15; Read 2: −20 to −5); Nextera XT vs. TGIRT-seq.]
[Figure, GC bias: normalized coverage versus GC % (0-100); Nextera XT (Gini: 0.276 ± 0.00946) vs. TGIRT-seq (Gini: 0.263 ± 0.0198).]
[Figure, coverage distribution: % of genome versus level of coverage (0-50); Nextera XT (R-squared: 0.912 ± 0.00711), TGIRT-seq (R-squared: 0.899 ± 0.0145), and WGS theoretical (Poisson).]
[Figure, fragment lengths: percent reads versus fragment length (0-400 nt), with the peak marked at 167 nt; ssDNA-seq (ref. 2) vs. TGIRT-seq. The panel appears twice in the source.]
[Figure, dinucleotide frequencies: normalized count versus position relative to the center of 167-nt fragments (−120 to +100); AA/AT/TA/TT vs. GG/GC/CG/CC; separate ssDNA-seq and TGIRT-seq panels.]
[Figure, WPS at CTCF sites: scaled WPS versus position relative to CTCF binding sites (−1000 to +1000); long (120-180 nt) and short (35-80 nt) fragments; ssDNA-seq (ref. 2) vs. TGIRT-seq.]
[Figure, nucleosome-center agreement: peak count versus difference in distance between nucleosome centers (bp, −720 to +720), ssDNA-seq (ref. 2) vs. TGIRT-seq. The panel appears twice in the source, with peak-count scales of ×10^5 and ×10^4. Panel labels (a), (b), (c) and "Figure 3" also appear.]
[Figure, error rates: genome sequence error rate shown as indel rate (mononucleotide runs < 4) and mismatch rate, with scale labels ×10^−5 and ×10^−3; Nextera XT vs. TGIRT-seq.]
[Figure, indels in mononucleotide runs: average indel per read per mononucleotide run versus mononucleotide run length (0-9 nt); Nextera XT vs. TGIRT-seq.]