2016 08 30 MaterialsAndMethods submission with ref · ! 4! Primer,extensionandpurification,...
Supplementary Information
SI Materials and Methods
Cell lines
Human lymphocyte GM12878 was purchased from the NIGMS Human Genetic Cell Repository
(Coriell Institute) and cultured in RPMI 1640 medium (no phenol red) supplemented with 15%
fetal bovine serum. Cells were maintained at 37°C in a 5% CO2 humidified chamber.
Oligonucleotides and adaptors
Oligonucleotides for Ad1: AD1T: 5’-phos-GATCGGAAGAGCACACGTCTGAACTCCAGTCA-SpC3;
AD1B: 5’-NNNNNGACTGGTTCCAATTGAAAGTGCTCTTCCGATC*T. Oligonucleotides for Ad2: AD2T:
5’-phos-AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT-SpC3; AD2B:
5’-ACACTCTTTCCCTACACGACGCTCTTCCGATCTNNNNN-SpC3. Oligonucleotide for primer extension:
Bio3: 5’-bio-TGACTGGAGTTCAGACGTGTGCTCTTCCGATCT. These oligonucleotides were
synthesized by IDT. PCR primers for Ad1 were ordered from Sigma:
Pu: 5’-GACTGGTTCCAATTGAAAGTGCTCTTCCGATC*T;
Pi: 5’-TGACTGGAGTTCAGACGTGTGCTCTTCCGATC*T. “*” indicates a phosphorothioate bond. PCR
primers for Ad2 and the final library (Universal and Index primers for Illumina) were purchased from
New England Biolabs.
To prepare Ad1 or Ad2, AD1T and AD1B, or AD2T and AD2B, respectively, were annealed. AD1T
or AD2T (5 nmol) and AD1B or AD2B (6 nmol) were mixed in 50 µL Hybridization Buffer (10 mM
Tris-HCl pH 7.5, 100 mM NaCl, 0.1 mM EDTA), boiled for 2 min, and then slowly cooled to
25°C.
Damage-Seq library preparation
Treatment of cells and isolation of fragmented genomic DNA
GM12878 cells were grown to ~8 × 10^5 cells/ml before treatment. Fresh stocks of cisplatin
(Sigma) or oxaliplatin (LC Labs) were made immediately before each treatment. The drug
was dissolved in DMSO to 20 mM and immediately added to the medium to a final concentration of
200 µM. Cells were incubated at 37°C for a further 1.5 hr, then transferred to a pre-chilled 15 ml
tube on ice, collected by centrifugation, and washed with ice-cold PBS. Cell pellets were frozen at
-80°C.
Cell pellets were resuspended in 900 µL cold lysis buffer (10 mM Tris-HCl pH 8.0, 1 mM EDTA,
250 mM NaCl, 0.5% Triton X-100 and 0.1% SDS) with 10 µL RNase A (Sigma) and incubated on ice
for at least 10 min. Lysates were sonicated with a Misonix Sonicator 3000 with a microtip in ice
water to generate fragments averaging 400 bp in length and then centrifuged at 14,000 rpm for
10 min to pellet debris. The supernatant was incubated with 10 µL Proteinase K (New England
Biolabs) at 55°C for 30 min, followed by phenol/chloroform extraction and ethanol precipitation.
The DNA pellet was dissolved in 100 µL 1x TE buffer, and fragments 200-700 bp in length were
selected with 0.5x/0.7x (50/70 µL) HighPrep PCR beads (MagBio) according to the manufacturer’s
guidelines. DNA concentration was determined with a NanoDrop 1000 (Thermo).
For naked DNA samples, DNA fragments were prepared from untreated GM12878 cells as
described above. For untreated samples, 1 µg size-selected DNA was used to prepare a library with the
NEBNext DNA preparation kit following the manufacturer’s protocol (unlike the Damage-seq
protocol). For cisplatin treatment, 5 µg DNA was incubated with 20 µM cisplatin in a final
volume of 50 µL at room temperature for 15 min. Treated DNA was immediately purified through a G50 spin
column (GE) and subjected to Damage-seq.
End-repair, dA-tailing and first adaptor ligation
Size-selected DNA (5 µg) was used for end-repair and dA-tailing with the NEBNext DNA preparation
kit following the manufacturer’s instructions. Ad1 (500 pmol) was ligated to both ends with
NEBNext Quick Ligase for more than 12 hr at 16°C, then purified with HighPrep PCR beads and
eluted in 126 µL 0.1x TE.
Damage-specific immunoprecipitation by antibodies
Damaged DNA immunoprecipitation was performed as described previously with modifications.
To denature the DNA, 42 µL of 8 M urea was added (to a final concentration of 2 M), followed by
boiling for 2 min and immediate transfer to ice water for 2 min. Then 50 µg denatured
sonicated salmon sperm DNA (Stratagene) and 20 µL of 10x PEXB Buffer (see below) were added,
followed by incubation with antibody-coated beads, prepared as described below, in a
total of 200 µL of 1x PEXB Buffer (1x PBS, 2 mM EDTA, 0.01% Triton X-100 and 0.025% BSA). A
slurry of 40 µL of anti-rat Dynabeads (11035, Thermo) was washed three times with 1x PEXB
Buffer, and then incubated with 10 µg carrier DNA and 1.5 µL anti-cisplatin modified DNA antibody
(ab103261, Abcam) in 100 µL of 1x PEXB Buffer for 3 hr at 4°C. After incubation, beads were
washed with 1x PEXB Buffer and incubated with the denatured DNA overnight at 4°C.
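The stated urea and buffer concentrations can be checked with simple dilution arithmetic. The following is a minimal sketch, assuming the DNA enters this step in the 126 µL eluate from the first adaptor ligation and ignoring the small volumes of other additions; the function name is illustrative and not part of the protocol.

```python
def final_concentration(stock_conc, stock_vol_ul, sample_vol_ul):
    """Concentration after adding stock_vol_ul of stock to sample_vol_ul of sample."""
    return stock_conc * stock_vol_ul / (stock_vol_ul + sample_vol_ul)

# 42 µL of 8 M urea added to the 126 µL eluate (assumed starting volume).
urea_molar = final_concentration(8.0, 42.0, 126.0)

# 20 µL of 10x PEXB Buffer brought to a total volume of 200 µL.
pexb_fold = 10.0 * 20.0 / 200.0

print(f"urea: {urea_molar:.1f} M, PEXB: {pexb_fold:.0f}x")
```

Under these assumptions the numbers agree with the text: 2 M urea after denaturant addition and 1x PEXB Buffer in the 200 µL binding reaction.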
The beads were washed sequentially with PEXU Buffer (1x PBS, 2 mM EDTA, 0.01% Triton X-100
and 1.6 M urea), PEX Buffer (1x PBS, 2 mM EDTA, 0.01% Triton X-100), IP Buffer (20 mM Tris-Cl
pH 8.0, 2 mM EDTA, 150 mM NaCl, 1% Triton X-100, and 0.5% sodium deoxycholate), and TE
Buffer (10 mM Tris-Cl pH 8.0 and 1 mM EDTA). The fragments containing damage were eluted
by incubation with 100 µL of Elution Buffer (50 mM NaHCO3, 1% SDS) at 65°C for 10 min. The
eluted DNA was then isolated by phenol/chloroform extraction followed by ethanol
precipitation.
Primer extension and purification
NEBNext Q5 Hot Start HiFi PCR Master Mix was used for primer extension in the presence of
purified DNA and 30 pmol Bio3 in a thermocycler for 45 s at 98°C followed by 5 min at 65°C. We
chose this enzyme because it has high fidelity and can carry out primer extension
at the same temperature used for annealing. Then 2 µL of Exo1 (New England Biolabs) was
added to degrade excess primer at 37°C for 10 min. After purification with HighPrep PCR beads,
DNA was denatured by boiling for 2 min and immediate transfer to ice water for 2 min, then
incubated with 5 µL of Dynabeads MyOne Streptavidin C1 (Thermo) in 30 µL of 1x B&W Buffer (5
mM Tris-HCl pH 7.5, 0.5 mM EDTA, 1 M NaCl) at 4°C for 1 hr. Biotinylated DNA was eluted by a
short incubation (~10 s) in 100 µL of nonionic water at 75°C and concentrated by ethanol
precipitation.
Second adaptor ligation, PCR amplification and high-throughput sequencing
To add the second adaptor to the 3’ end, purified DNA was incubated with 40 pmol of Ad2 in
10 µL of 1x Hybridization Buffer for 10 min at 65°C and then for 5 min at 16°C in a thermal cycler.
To perform ligation, 4 µL of 5x ligase buffer, 1 µL of T4 DNA ligase HC (Thermo), 1 µL of 50%
PEG8000 (New England Biolabs), and 4 µL of H2O were added to each reaction. The reactions
were incubated overnight at 16°C.
For quality control, one percent of the ligation products was PCR-amplified with primers Pu/Pi
(Primers 1 for Ad1) or Universal/Index1 (Primers 2 for Ad2, E7350S, New England Biolabs). After
purification with HighPrep PCR beads, ligated DNA was PCR-amplified with NEBNext Q5 Hot Start HiFi
PCR Master Mix for 11-13 cycles with NEBNext Multiplex Oligos for Illumina (New England
Biolabs). The PCR products were purified with HighPrep PCR beads and their concentration was
determined with Pico Green (Thermo).
XR-Seq library preparation for cisplatin and oxaliplatin induced damages
XR-seq libraries were prepared as described (1) with modifications in the damage-specific
immunoprecipitation and in vitro reversal steps. GM12878 cells were treated with 200 µM
cisplatin or oxaliplatin for 3 hr and collected as described above. Cells were lysed and primary
excision products were pulled down by TFIIH co-immunoprecipitation, followed by adapter
ligation on both ends. The damage-specific immunoprecipitation with anti-cisplatin modified
DNA antibody was performed as described for Damage-seq with minor differences. A slurry of 25
µL of anti-rat Dynabeads and 1 µL anti-cisplatin modified DNA antibody were used per reaction.
Pre-incubated beads were then incubated with ligation products and 20 µg of denatured
sonicated salmon sperm DNA in 100 µL of 1x PEXB Buffer at 4°C overnight. After sequential
washes with PEX Buffer twice, IP Buffer once and TE Buffer once, the ligation products containing
damage were eluted by incubation with 100 µL of Elution Buffer at 65°C for 10 min. The eluted
DNA was purified by phenol/chloroform extraction and ethanol precipitation. To reverse
cisplatin- or oxaliplatin-induced damages in vitro, damaged DNA was incubated with 200 mM
NaCN at 65°C overnight, then purified through a G50 spin column followed by ethanol
precipitation. Purified DNA was amplified by PCR and purified by native PAGE as described to
make the library.
Sequencing and genome alignments
All libraries were sequenced on the HiSeq 2500 platform by the University of North
Carolina High-Throughput Sequencing Facility. Based on our previous experience with XR-seq, in
which 5 million uniquely mapped reads are sufficient to detect repair enrichment over genes,
we sequenced Damage-seq to at least 10 million mapped reads per sample. Generally, this
required multiplexing ≤ 4 samples per lane for Damage-seq and ≤ 8 samples per lane for XR-seq.
Alignment summaries are available in SI appendix Tables S3 and S4.
Damage-seq sequence analysis. Libraries were sequenced to produce paired-end 50 nt reads,
allowing us to establish uniquely aligned reads and to distinguish between damage hot spots and
amplification artifacts. Products of primer extension of undamaged DNA were removed using
cutadapt (2) by filtering on the adapter sequence 5’-GACTGGTTCCAATTGAAAGTGCTCTTCCGATCT-3’.
Paired-end reads were aligned to the hg19 reference genome with bowtie using the command line
options -q --nomaqround --phred33-quals -S -X 1000 -m 4 --seed 123. The damage position
and nucleotide composition were then determined as the 2 nt upstream of the first read start
using samtools and bedtools.
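The last step, taking the 2 nt immediately upstream of the read start, can be illustrated in a few lines. This is a minimal sketch rather than the pipeline used here, which relied on samtools and bedtools. It assumes 0-based, half-open (BED-style) read coordinates, uses a hypothetical toy sequence in place of hg19, and reports the dinucleotide in the orientation of the read without deciding which genomic strand carried the lesion.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def upstream_dinucleotide(genome, chrom, start, end, strand, width=2):
    """Return (lo, hi, seq) for the `width` nt immediately 5' of a read.

    Coordinates are 0-based, half-open. For a minus-strand read the 5' end
    is `end`, so the upstream bases lie at [end, end + width) and are
    reverse-complemented to read orientation.
    """
    if strand == "+":
        lo, hi = start - width, start
    else:
        lo, hi = end, end + width
    if lo < 0 or hi > len(genome[chrom]):
        return None  # read too close to a chromosome end
    seq = genome[chrom][lo:hi]
    if strand == "-":
        seq = revcomp(seq)
    return lo, hi, seq

# Hypothetical toy "chromosome" used only for illustration.
toy_genome = {"chrT": "AAGGCTTTTTCCAA"}
print(upstream_dinucleotide(toy_genome, "chrT", 4, 10, "+"))
print(upstream_dinucleotide(toy_genome, "chrT", 4, 10, "-"))
```

In the toy example both reads report G-G upstream of the read start, the pattern expected for Pt-d(GpG) adducts.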
XR-seq. XR-seq data were analyzed as previously reported. Flanking adapter sequences were
removed using trimmomatic (3). Reads were aligned to the hg19 human reference genome
using bowtie (4) with the command options -q --nomaqround --phred33-quals -m 4 -n 2 -e 70 -l
20 --best -S. Uniquely aligned reads were obtained using samtools.
Data visualization
For comparison of the DNA damage and repair signals, we normalized all count data by
sequencing depth. The data are available for viewing as a track hub on the UCSC genome browser
(https://genome.ucsc.edu/cgi-bin/hgGateway) by pasting the link:
http://trackhubs.its.unc.edu/sancarlb/Platinum_damage/hub.txt. The raw data and bigwig
tracks are available under GEO accession GSE82213 (www.ncbi.nlm.nih.gov/geo/).
ENCODE data
GM12878 stranded RNA-seq (ENCODE DCC accession ENCSR00CUH) and DNase-seq (accession
ENCSR000EJD) data, as well as chromHMM chromatin state segmentation (UCSC accession
wgEncodeEH000784) and nucleosome data (MNase-seq, accession ENCSR000CXP), were
downloaded from the ENCODE portal (http://genome.ucsc.edu/ENCODE/) or viewed on the
UCSC browser.
Di-nucleotide frequencies
The di-nucleotide positions for the hg19 reference genome were established using oligoMatch
from the UCSC tools.
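For illustration, the kind of scan performed at this step can be sketched in plain Python. This is not oligoMatch itself: the function below is a hypothetical stand-in that reports every (overlapping) occurrence of a dinucleotide on either strand of a sequence, using 0-based, half-open coordinates.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def dinucleotide_positions(seq, dinuc="GG"):
    """List (start, end, strand) for each occurrence of `dinuc` in `seq`.

    A match to the reverse complement of `dinuc` on the given sequence is
    reported as a minus-strand site. Overlapping matches are all reported.
    """
    seq = seq.upper()
    dinuc = dinuc.upper()
    rc = dinuc.translate(COMPLEMENT)[::-1]
    hits = []
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair == dinuc:
            hits.append((i, i + 2, "+"))
        if pair == rc:
            hits.append((i, i + 2, "-"))
    return hits

print(dinucleotide_positions("AGGGCC"))
```

For a palindromic dinucleotide such as GC the same position is reported on both strands, which is a design choice of this sketch.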
Chromatin state analysis
Bedtools (5) coverage was used to calculate the damage and repair levels over each of the 15
predicted chromatin states defined by the ChromHMM algorithm (6). Merged data from two
biological replicates was used. Values were normalized per million mapped reads and per Kb of
interval length and plotted with R.
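The normalization described above can be written out explicitly. The sketch below is a plain-Python illustration, not the bedtools and R code used for the figures; it assumes that read counts and interval lengths are pooled within each chromatin state before scaling, and the state names and counts in the example are hypothetical.

```python
from collections import defaultdict

def state_levels(intervals, total_mapped_reads):
    """Reads per kb of interval per million mapped reads, for each state.

    `intervals` is an iterable of (state, length_bp, read_count) tuples,
    e.g. parsed from per-interval coverage output.
    """
    reads = defaultdict(int)
    length_bp = defaultdict(int)
    for state, length, count in intervals:
        reads[state] += count
        length_bp[state] += length
    per_million = total_mapped_reads / 1e6
    return {
        state: reads[state] / (length_bp[state] / 1000.0) / per_million
        for state in reads
    }

# Hypothetical example values, for illustration only.
example = [
    ("1_Active_Promoter", 1000, 300),
    ("1_Active_Promoter", 1000, 200),
    ("13_Heterochrom", 4000, 400),
]
levels = state_levels(example, total_mapped_reads=10_000_000)
print(levels)
```

Dividing by both interval length and sequencing depth makes states of very different genomic extent, and samples of different depth, directly comparable.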
Plotting average damage and repair profiles
Average damage and repair profiles from the merged biological replicates were calculated over
GM12878 DNase peaks using bedtools (5) coverage. Counts were normalized per million
mapped reads and plotted with R. For plots, data were binned into 50 nt windows.
For average XR-seq profiles relative to the annotated TSS or TES, we limited the gene list to
genes that have no overlapping or neighboring genes for at least 6000 bp upstream or
downstream on either strand and that are at least 10,000 bp in length. The highest quartile of
expressed genes in GM12878 cells (n=442) was identified as previously described. Briefly, we
calculated FPKM for the two mapped RNA-seq replicates using cufflinks (7) and the UCSC hg19
genes.gtf. Merged biological replicates were used for plotting. Read counts were calculated
from the aligned .bam files using bedtools coverage, normalized per million mapped reads and
plotted with R. For plots, data were binned into 50 nt windows.
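The gene selection criteria can be made concrete with a short sketch. This is one possible reading of the filter, not the original implementation: a gene is kept if it is at least 10,000 bp long and no other annotated gene, on either strand, lies within 6000 bp of it. The `Gene` type and the example annotations are hypothetical.

```python
from typing import NamedTuple

class Gene(NamedTuple):
    name: str
    chrom: str
    start: int  # 0-based, half-open coordinates
    end: int
    strand: str

def isolated_genes(genes, min_length=10_000, min_gap=6_000):
    """Keep genes >= min_length with no other gene within min_gap bp."""
    kept = []
    for gene in genes:
        if gene.end - gene.start < min_length:
            continue
        lo, hi = gene.start - min_gap, gene.end + min_gap
        crowded = any(
            other is not gene
            and other.chrom == gene.chrom
            and other.start < hi
            and other.end > lo
            for other in genes
        )
        if not crowded:
            kept.append(gene)
    return kept

# Hypothetical annotations, for illustration only.
annotations = [
    Gene("A", "chr1", 10_000, 30_000, "+"),
    Gene("B", "chr1", 34_000, 60_000, "-"),    # 4 kb from A: both dropped
    Gene("C", "chr1", 100_000, 120_000, "+"),
    Gene("D", "chr1", 200_000, 205_000, "+"),  # too short
    Gene("E", "chr2", 10_000, 30_000, "-"),
]
print([g.name for g in isolated_genes(annotations)])
```

The pairwise comparison is quadratic in the number of genes, which is acceptable for a sketch; an interval tree or sorted sweep would be the usual choice for a full annotation.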
For nucleosome analysis, nucleosome positions were determined with DANPOS2 (8). The average
damage and repair signal from merged biological replicates surrounding 2,000,000 randomly
picked nucleosome center positions was calculated using the R GenomicRanges (9) and
genomation (10) packages. For plotting, data were binned into 5 nt windows.
References
1. Hu J, Adar S, Selby CP, Lieb JD, & Sancar A (2015) Genome-wide analysis of human global and transcription-coupled excision repair of UV damage at single-nucleotide resolution. Genes Dev 29:948-960.
2. Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal 17(1):10-12.
3. Bolger AM, Lohse M, & Usadel B (2014) Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30(15):2114-2120.
4. Langmead B, Trapnell C, Pop M, & Salzberg SL (2009) Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 10(3):R25.
5. Quinlan AR & Hall IM (2010) BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26(6):841-842.
6. Ernst J, et al. (2011) Mapping and analysis of chromatin state dynamics in nine human cell types. Nature 473:43-49.
7. Trapnell C, et al. (2010) Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol 28(5):511-515.
8. Chen K, et al. (2013) DANPOS: dynamic analysis of nucleosome position and occupancy by sequencing. Genome Res 23(2):341-351.
9. Lawrence M, et al. (2013) Software for computing and annotating genomic ranges. PLoS Comput Biol 9(8):e1003118.
10. Akalin A, Franke V, Vlahovicek K, Mason CE, & Schubeler D (2015) Genomation: a toolkit to summarize, annotate and visualize genomic intervals. Bioinformatics 31(7):1127-1129.
SI appendix Figure S1. Detailed schematic of Damage-seq.
SI appendix Figure S2. Damage-seq and XR-seq of oxaliplatin. A) Agarose gel analysis of Damage-seq libraries. DNA fragments from oxaliplatin-treated cells were amplified with sets of primers complementary to the 1st and 2nd adapters. B) Native polyacrylamide gel electrophoresis of XR-seq libraries showing that oxaliplatin adduct reversal by NaCN is necessary for PCR amplification of sequencing libraries.
SI appendix Figure S3. Single nucleotide frequencies in Damage-seq reads. Supplemental to main Figure 2a. Nucleotide frequencies are plotted for positions 3 nt upstream of the read start and 10 nt into the read for each damage type and an undamaged control. A) Data for the second replicate of cisplatin Damage-seq, not shown in the main Figure. Left panel is from main Figure 2a. B) Data for two biological replicates of oxaliplatin Damage-seq. C) Data for the second replicate of sequenced undamaged DNA, not shown in the main Figure. Left panel is from main Figure 2c.
[Panels: Cisplatin (Rep1, Rep2); Oxaliplatin (Rep1, Rep2); Control (Rep1, Rep2). Legend: G, A, T, C.]
SI appendix Figure S4. Sequence context for the 5 nt flanking G-G di-nucleotides at the -1 and -2 positions relative to the read start for the second biological replicate of cisplatin Damage-seq (A), randomly selected 26-mers in the hg19 reference genome (B) and oxaliplatin Damage-seq (C).
[Panels: A) Cisplatin Damage-seq replicate 2; B) Random genomic loci; C) Oxaliplatin Damage-seq, Rep1 and Rep2. x-axis: position relative to dimers (-5 to 5, flanking G-G); y-axis: nucleotide frequency (%). Legend: G, A, T, C.]
SI appendix Figure S5. Distribution of XR-seq fragment sizes from the cisplatin- and oxaliplatin-XR-seq biological replicates in GM12878. Reads of 50 nt likely reflect a small fraction of contaminant DNA.
[Panels: Cisplatin (Rep1, Rep2); Oxaliplatin (Rep1, Rep2). x-axis: read length (nt); y-axis: frequency.]
SI appendix Figure S6. Single nucleotide resolution mapping of repair. Supplemental to main Fig 2e,f. A) Frequency of the relevant di-nucleotide, G-G, at each position of 26 nt XR-seq excision fragments in the second replicate of cisplatin XR-seq and for oxaliplatin XR-seq. B) The corresponding nucleotide frequencies at the 5 nt flanking the G-G dimer at positions 19-20.
[Panels: Cisplatin XR-seq rep2; Oxaliplatin XR-seq rep1; Oxaliplatin XR-seq rep2. A) x-axis: position along excised fragment (1-2 to 25-26); y-axis: GG dimer frequency (%). B) x-axis: position relative to dimers (-5 to 5, flanking G-G); y-axis: nucleotide frequency (%). Legend: G, A, T, C.]
SI appendix Figure S7. Whole genome map of damage and repair of cisplatin damage. Screenshots of damage and repair signals, separated by strand, for all the chromosomes of the human genome.
[Browser screenshots (hg19) for chr1-chr22 and chrX. Tracks per chromosome: Damage+, Repair+, Damage-, Repair-.]
SI appendix Figure S8. Whole genome map of damage and repair of oxaliplatin damage. Screenshots of damage and repair signals, separated by strand, for all the chromosomes of the human genome.
[Browser screenshots (hg19) for chr1-chr22 and chrX. Tracks per chromosome: Damage+, Repair+, Damage-, Repair-.]
SI appendix Figure S9. Genome-wide patterns of damage and repair of oxaliplatin damage. A) Representative screen shot of damage and repair signals, separated by strand, for the entire chromosome 17. B) Zoom-in on a ~80 kbp segment of chromosome 17 which includes TP53. C) Representative XR-seq and Damage-seq reads that capture a specific Pt-d(GpG) damage.
[Browser screenshots (hg19). Tracks: Damage+, Damage-, Repair+, Repair-, RNA+, RNA-, DNase HS. Panel C reads: Damage-seq: GG…50 nt; XR-seq: TCTTTTTGAAAGCTGGTCTGGTCCTTT.]
SI appendix Figure S10. Repair and damage at transcribed genes. Supplemental to main Fig 3d-f. A) Oxaliplatin damage and repair profiles at the transcribed and non-transcribed strands are plotted surrounding the TSS of highly expressed genes. B) Similar to A, except with a zoomed-in scale for the damage levels. C) Cisplatin damage and repair are plotted surrounding the TES of highly expressed genes. D) Similar to C, except with a zoomed-in scale for the damage levels. E) Same as C, except for oxaliplatin damage and repair. F) Same as D, except for oxaliplatin damage. G) G-G frequency plotted surrounding the TES of highly expressed genes.
[Panel titles: Cisplatin; Oxaliplatin. Series: Damage TS, Damage NTS, Repair TS, Repair NTS; GG frequency.]
SI appendix Figure S11. Open regions in the genome have higher repair but little difference in damage. A) Cisplatin damage and repair plotted around DNase-HS sites in GM12878. B) Same as A, except that oxaliplatin damage and repair are plotted.
[Panel titles: Cisplatin; Oxaliplatin. Series: Repair, Damage.]
SI appendix Figure S12. Damage and repair at different chromatin states. A) Analysis of oxaliplatin repair (top) and damage (bottom) levels across the 15 annotated chromatin states in GM12878. B) Small variations in cisplatin and oxaliplatin damage levels at the different chromatin states are observed when plotting damage on a smaller scale. C) Variations in the frequency of G-G in the different states mirror the variation in damage levels.
[Chromatin states: 1. Active promoter; 2. Weak promoter; 3. Poised promoter; 4. Strong enhancer; 5. Strong enhancer; 6. Weak enhancer; 7. Weak enhancer; 8. Insulator; 9. Txn transition; 10. Txn elongation; 11. Weak txn; 12. Repressed; 13. Heterochromatic; 14. Repetitive; 15. Repetitive. Panel titles: Oxaliplatin (Repair, Damage); Cisplatin damage; Oxaliplatin damage; GG frequency. x-axis: chromatin state (1-15); y-axis: counts per Kb per million mapped reads.]
Table S1. Dinucleotide frequencies flanking the read start of Damage-seq reads

Cisplatin Rep1
      (-3)-(-4)  (-2)-(-3)  (-1)-(-2)  (-1)-(1)  (1)-(2)
TT    5.9        1.8        1.2        4.0       5.9
TC    2.0        2.3        0.7        2.5       4.4
CT    7.2        1.6        5.5        0.9       6.5
CC    2.5        1.0        0.8        0.7       8.7
AA    10.0       2.2        1.2        2.4       8.2
AC    1.8        1.0        0.6        2.0       5.1
AG    9.0        33.6       5.3        3.0       8.1
AT    6.3        1.5        1.4        2.5       5.4
CA    11.3       2.1        1.1        1.2       7.5
CG    1.6        3.5        0.6        0.4       1.4
GA    9.7        2.6        6.9        20.3      5.9
GC    1.8        3.7        1.1        19.1      4.6
GG    9.5        21.1       64.1       18.7      6.5
GT    5.2        1.4        5.0        15.7      8.8
TA    7.4        1.6        0.7        2.9       5.6
TG    8.8        18.9       3.7        3.7       7.2

Cisplatin Rep2
      (-3)-(-4)  (-2)-(-3)  (-1)-(-2)  (-1)-(1)  (1)-(2)
TT    6.0        1.9        1.3        3.8       5.8
TC    2.1        2.1        0.8        2.2       4.1
CT    7.1        1.7        4.9        1.0       6.5
CC    2.5        1.0        0.8        0.7       8.4
AA    10.4       2.6        1.3        2.7       8.7
AC    1.9        1.1        0.7        2.1       5.0
AG    8.9        33.6       6.0        3.2       8.1
AT    6.5        1.7        1.5        2.7       5.5
CA    11.2       2.3        1.2        1.3       7.7
CG    1.5        3.3        0.7        0.4       1.4
GA    9.8        2.9        7.3        20.5      6.0
GC    1.9        3.4        1.2        18.9      4.5
GG    8.9        20.2       62.7       19.0      6.5
GT    5.0        1.6        4.6        15.2      9.2
TA    7.6        1.8        0.8        2.9       5.7
TG    8.6        18.7       4.1        3.5       7.0

Oxaliplatin Rep1
      (-3)-(-4)  (-2)-(-3)  (-1)-(-2)  (-1)-(1)  (1)-(2)
TT    9.0        3.9        3.8        5.7       6.1
TC    3.3        3.8        2.3        3.4       4.0
CT    9.0        3.0        7.6        2.5       7.8
CC    3.5        2.3        2.1        1.6       10.0
AA    8.8        3.9        3.2        3.8       9.1
AC    2.9        2.1        2.0        1.7       5.2
AG    6.9        22.3       4.1        3.6       8.3
AT    9.1        3.1        3.2        2.5       6.3
CA    8.7        3.1        2.5        3.4       9.3
CG    1.0        3.8        0.6        1.0       1.6
GA    6.3        2.8        3.7        16.8      4.7
GC    2.5        4.6        2.1        22.0      4.4
GG    7.4        13.6       51.9       10.2      5.4
GT    6.1        2.2        4.8        11.6      5.7
TA    7.6        2.7        2.1        4.7       6.0
TG    7.8        22.8       3.9        5.4       6.1

Oxaliplatin Rep2
      (-3)-(-4)  (-2)-(-3)  (-1)-(-2)  (-1)-(1)  (1)-(2)
TT    8.8        2.0        1.7        4.6       5.8
TC    2.4        2.6        1.0        2.9       4.2
CT    9.8        1.6        7.2        1.1       9.2
CC    2.8        1.1        0.9        0.8       12.5
AA    8.4        2.1        1.5        2.1       8.9
AC    2.3        1.0        0.8        1.2       4.9
AG    6.7        27.1       3.3        2.0       7.7
AT    9.6        1.6        1.5        1.5       6.0
CA    9.3        1.8        1.2        1.5       11.0
CG    1.1        4.9        0.4        0.4       1.9
GA    6.4        1.8        3.2        20.3      3.9
GC    1.9        4.9        1.1        29.7      3.8
GG    8.2        16.0       67.9       11.0      4.3
GT    6.6        1.2        4.5        13.4      5.3
TA    7.7        1.5        0.9        3.6       5.7
TG    8.0        28.8       2.8        3.8       5.1
Table S2. Dinucleotide frequencies at specific positions in XR-seq reads. Only reads of 26 nt length were used for this analysis.

Cisplatin XR-seq Rep1
      (17-18)       (18-19)       (19-20)       (20-21)       (21-22)
TT    5.350120035   2.317744776   0.718172576   1.332937183   2.958202774
TC    3.021537512   1.102780437   0.456586363   0.946426107   4.74323262
CT    9.326330915   4.67418824    1.051472956   1.374649293   3.707487615
CC    5.759760449   2.035778312   0.639850027   0.913014126   4.651138201
AA    6.109291016   3.49313812    2.509942972   5.042909549   4.295448622
AC    3.619173191   1.50948673    2.22282197    3.125262683   7.512144885
AG    10.9638881    16.73015119   12.42875503   7.419283895   9.434655521
AT    6.904649499   4.174673823   3.04189133    4.624519637   6.423850022
CA    8.522170066   3.969999001   1.06146483    2.428395532   2.105985767
CG    2.709965494   4.207795037   3.245138742   1.211739448   1.255883443
GA    5.322919932   9.759365003   15.68422928   18.63592941   6.936977706
GC    2.487289437   1.349881076   2.608540039   6.73579211    12.21517211
GG    6.76671934    19.05072436   37.55327017   33.65717932   19.09367884
GT    3.991198692   3.472519967   3.31349585    6.70629229    8.329178864
TA    5.953068854   2.980909176   0.956338681   1.558864564   1.20838239
TG    13.19191747   19.17086475   12.50802918   4.286804858   5.12858062

Cisplatin XR-seq Rep2
      (17-18)       (18-19)       (19-20)       (20-21)       (21-22)
TT    5.286587879   2.296428978   0.607421775   1.185720056   2.938130453
TC    2.984899054   1.068941063   0.384098469   0.883939507   4.362072401
CT    9.366312668   4.826545495   1.006859308   1.299078853   3.727651799
CC    5.713205613   2.066032082   0.619936586   0.868082231   4.495175061
AA    6.113653659   3.309740021   2.241576275   4.337871917   4.290170536
AC    3.741462139   1.479183309   2.009987493   3.111549979   6.878378578
AG    10.80339128   16.64304053   12.15094883   7.16961352    8.342948318
AT    7.040863838   4.114652253   2.692316761   4.182408425   6.013483349
CA    8.327661075   3.843083442   0.9882296     2.104068816   2.519182252
CG    2.677988209   4.288330884   3.346869885   1.213884859   1.293456257
GA    5.250520349   9.069817877   14.69166278   17.77608438   7.722726716
GC    2.584425097   1.347738926   2.471092211   7.171893652   12.98866386
GG    6.936547835   19.61291145   39.95334022   36.30616669   19.33354355
GT    4.234203678   3.470955663   3.118597657   7.034101176   8.693311209
TA    5.854781034   2.872188022   0.879975188   1.30695567    1.35149596
TG    13.08349659   19.69041      12.83708696   4.048580269   5.049609696

Oxaliplatin XR-seq Rep1
      (17-18)       (18-19)       (19-20)       (20-21)       (21-22)
TT    6.94017857    2.807392784   0.608301687   0.774051157   2.356301081
TC    2.947505461   1.088990394   0.305937101   0.628042286   3.588398813
CT    10.9939323    4.530829171   1.088108864   1.614634478   4.794398678
CC    5.516208983   1.825137043   0.59115826    1.330805668   6.719562985
AA    6.147870321   2.928139115   1.688499985   3.525763931   5.357903838
AC    3.716893977   1.285269876   1.340087086   2.302179903   6.640712627
AG    7.919834755   13.95087561   8.510168906   4.815346567   10.48503293
AT    9.119200241   4.952432209   2.064913042   2.523675406   4.796022546
CA    7.708244794   3.097044789   0.803537253   2.103279354   2.689147486
CG    2.918159852   5.5469542     3.373613041   1.648318126   1.893730875
GA    4.104789575   5.312838548   10.10432205   20.81930138   5.844159309
GC    2.819312705   1.657020106   4.459842856   11.83581217   14.83678858
GG    6.195101673   19.23135275   46.99375274   34.59902192   17.84467802
GT    4.678734014   3.950017283   2.741324475   6.379177196   6.806323896
TA    5.155826345   2.265646567   0.570581508   0.831327277   1.063169135
TG    13.11820643   25.57005955   14.75585115   4.269263184   4.283669207

Oxaliplatin XR-seq Rep2
      (17-18)       (18-19)       (19-20)       (20-21)       (21-22)
TT    6.517572508   2.652764575   0.646059225   0.758311531   2.092331946
TC    3.013541828   1.16317109    0.387253983   0.70420062    3.551743993
CT    11.0024725    4.682189634   1.218973125   1.763590814   4.923139092
CC    5.8261404     2.082799034   0.751484748   1.521219022   7.316419627
AA    5.814715533   2.822620419   1.892261316   3.919724208   4.896712329
AC    3.772981164   1.351974479   1.632437155   2.794104512   6.944905726
AG    7.781180187   13.31539748   8.329743634   5.210733385   10.94644667
AT    8.92336342    4.655827833   2.14076898    2.669666758   4.6862521
CA    7.440775504   3.151969493   0.930836044   2.249916969   2.692191292
CG    3.115845337   5.94486785    3.58089073    1.929001975   2.142649024
GA    4.20493654    5.935632548   11.17454178   20.48639986   5.858069853
GC    3.249214947   1.884240044   4.692573928   12.05487488   15.13516455
GG    6.438812464   18.96349996   45.19496568   32.98842686   17.23601636
GT    4.90421647    3.861418443   2.608322394   5.798235672   6.232228598
TA    4.685344627   2.084988624   0.596630849   0.818275787   0.956284627
TG    13.30888657   25.44663849   14.22225643   4.333317146   4.38944421
Table S3. Summary statistics for all Damage-seq sequencing samples in this study.

Sample Name        Total Reads  Reads after filtering  Total Aligned Pairs  Unique Aligned Pairs  % Aligned  Average Fragment Length
HLCisP_rep1        34514632     29424268               24378977             23654148              68.5       176.7
HLCisP_rep2        40418699     34552648               27785100             26970800              66.7       164.6
HLOxP_Rep1         29651260     14604091               10514412             10149131              34.2       209.3
HLOxP_Rep2         33016641     25643907               18324749             16370527              49.6       202.8
HLunD_Rep1         6889258      6788256                3171901              3151831               45.7       161.6
HLunD_Rep2         9175301      9060212                3733693              3711577               40.5       157.2
HLCisPvitro_Rep1   41928298     37969998               24872196             24335674              58.0       185.4
HLCisPvitro_Rep2   34580644     32120669               22012872             21586194              62.4       191.6
Table S4. Summary statistics for all XR-seq sequencing samples in this study.

Sample Name        Mean Length  Total FASTQ Reads  Unique Aligned Reads  Percent Unique Aligned
GMCisP_XR3h_Rep1   26.4         80338902           35203560              43.8
GMCisP_XR3h_Rep2   27.1         83700727           35203560              42.1
OXP_XR_Rep1        26.74        99390886           37221727              37.4
OXP_XR_Rep2        27.97        124668994          39786126              31.9