
SUPPLEMENTARY INFORMATION
DOI: 10.1038/NCLIMATE3087

NATURE CLIMATE CHANGE | www.nature.com/natureclimatechange

Signatures of transgenerational molecular response to ocean acidification in a reef fish

Celia Schunter, Megan J. Welch, Taewoo Ryu, Huoming Zhang, Michael L. Berumen, Göran E. Nilsson, Philip L. Munday* and Timothy Ravasi*

Supplementary information

Study organism & selection of parental pairs by behavioral phenotype

Acanthochromis polyacanthus was chosen for this study because its life history traits are advantageous for laboratory rearing and because previous information is available on the effects of CO2 on the behavior of this species [1,2]. Adult A. polyacanthus were caught on multiple reefs around the Orpheus Island region of the northern Great Barrier Reef (GBR), Australia (18°38'24.3"S, 146°29'31.8"E) using a baited barrier net and small hand nets. The fish were then brought to and maintained at James Cook University’s Experimental Marine Aquarium Facility. The individuals used in the study were taken from a single population to ensure that the analysis of CO2-related molecular mechanisms was not confounded by population differences (genetic or environmental) among the wild adults. Nevertheless, similar behavioural impairments have been recorded in populations of A. polyacanthus from other locations on the GBR (e.g. Lizard Island; Welch and Munday unpublished data) and in other reef fish species [3], indicating that the behavioural effects of high CO2 are not restricted to this population of A. polyacanthus adults.

Molecular signatures of transgenerational response to ocean acidification in a species of

reef fish

© 2016 Macmillan Publishers Limited. All rights reserved.

http://dx.doi.org/10.1038/nclimate3087


Individual adult A. polyacanthus were held under high CO2 (754 μatm) for 7 days, after which they were tested for changes in their olfactory responses to chemical alarm cues (CAC). A two-channel flume (30 cm × 13 cm), scaled from previous studies [2,4] to accommodate adult fish, was used to test for olfactory preference between blank seawater and CAC. CAC water and untreated water were fed into the flume at a constant rate of 450 ml min⁻¹, monitored by a flow meter. To extract CAC, conspecific adult fish were euthanized with a quick blow to the head, superficial cuts were made along each side of the donor fish, and each side was rinsed with 30 ml of control water, a concentration based on previous CAC response ratios [3]. The extracted CAC was mixed with 10 L of high CO2 water in the tank used to supply CAC to the flume to ensure a consistent concentration of fresh CAC for the duration of each trial. A ratio of one donor fish to one test fish was used. Behavioral sensitivity to the high CO2 treatment was measured as the percentage of time an individual spent in the CAC: ≤ 30% of time spent in the cue was considered “tolerant”, and ≥ 70% of time spent in the cue was considered “sensitive”. Adults were further categorized by size, and breeding pairs of approximately equal size were formed by pairing a tolerant male with a tolerant female or a sensitive male with a sensitive female.
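The behavioural thresholds described above can be illustrated with a minimal Python sketch. The function name, its arguments and the handling of intermediate fish (returned as None) are assumptions made for illustration only; the source states just the ≤ 30% and ≥ 70% cut-offs.

```python
from typing import Optional

TOLERANT_MAX_PCT = 30.0   # <= 30% of trial time in the cue: "tolerant"
SENSITIVE_MIN_PCT = 70.0  # >= 70% of trial time in the cue: "sensitive"


def classify_behaviour(seconds_in_cue: float, trial_seconds: float) -> Optional[str]:
    """Classify a fish from the share of trial time spent in the CAC channel.

    Returns "tolerant", "sensitive", or None for intermediate individuals
    (how intermediate fish were treated is an assumption, not stated in the text).
    """
    if trial_seconds <= 0:
        raise ValueError("trial_seconds must be positive")
    pct_in_cue = 100.0 * seconds_in_cue / trial_seconds
    if pct_in_cue <= TOLERANT_MAX_PCT:
        return "tolerant"
    if pct_in_cue >= SENSITIVE_MIN_PCT:
        return "sensitive"
    return None
```

For example, a fish that spends 90 s of a 120 s trial in the cue (75%) would be labelled sensitive under these assumptions.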

Climate models project that CO2 levels in the atmosphere and surface ocean will exceed 700 µatm by the end of this century [5–7]. Elevated CO2 levels cause a range of sensory and behavioural impairments in coral reef fish, including altered antipredator responses [8]. Importantly, individuals vary in their sensitivity to high CO2, with the greatest variation among individuals occurring around 700 µatm [3,9,10]. Therefore, a CO2 level of approximately 700 µatm was chosen because it will be experienced by fish during the second half of this century and because it is the CO2 level at which the maximum expression of phenotypic variation in behaviour should favour adaptation.

Experimental design & sample description

Sensitive and tolerant breeding pairs were divided equally between control (414 μatm) and high CO2 (754 μatm, Table S1) treatments, where they were held for three months prior to the start of the breeding season. A. polyacanthus lay demersal clutches of eggs that are cared for by both parents until hatching at approximately 10 days post-fertilization (personal observation). Immediately after hatching, clutches of offspring were removed from their parents and placed into tanks with the same environmental conditions as their parents: either control water or high CO2 water (Table S1). This provided four offspring groups: (1) offspring from tolerant parents reared in control water, (2) offspring from tolerant parents reared in high CO2 water, (3) offspring from sensitive parents reared in control water, and (4) offspring from sensitive parents reared in high CO2 water. Multiple family lines (parental pairs) were used to ensure that the effects seen were not due to a single breeding pair. To further remove bias due to specific breeding pairs, one tolerant parental pair and one sensitive parental pair were first kept and bred at control levels, and their offspring remained at control levels. Afterwards, these two breeding pairs were transferred into high CO2 to breed, and these offspring were subsequently kept at high CO2. Juveniles were reared in their respective environmental conditions until they were 5 months old. At this age, the brain is large enough to yield sufficient RNA and protein for high-throughput sequencing and mass spectrometry analysis. Body weight was measured directly after euthanasia, and photos were taken for length measurements. Whole brains were dissected out, snap frozen in liquid nitrogen and stored at -80°C. The order of dissection was randomized among treatments to eliminate any possibility of a sampling time effect.

Supplementary Table 1. Mean (± s.d.) seawater parameters in the experimental system for adults and juveniles during the experimental seasons. Temperature, pH, salinity, and total alkalinity (TA) were measured directly. pCO2 was estimated from these parameters using CO2SYS. Seawater parameters were consistent for the breeding and experimental components of the study.

Treatment   pH(NBS)        Temperature (°C)   Salinity      TA (μmol kg⁻¹ SW)   pCO2 (μatm)
Control     8.15 (±0.04)   28.5 (±0.2)        35.0 (±1.2)   2146 (±125)         414 (±46)
CO2         7.94 (±0.04)   28.5 (±0.3)        35.1 (±1.2)   2223 (±146)         754 (±92)

Preparation and sequencing of samples

Upon removal of the whole brains from the freezer, 350 µl of RLT Plus Buffer from a Qiagen AllPrep DNA/RNA Mini Kit was added to the brain tissue. Approximately 30 RNase- and DNase-free single-use silica beads (Daintree Scientific, Australia) were placed into Eppendorf tubes, and samples were homogenized for 30 seconds in a pre-frozen metal tray with a Thermo Fisher Scientific bead beater. Samples were processed following the Qiagen AllPrep DNA/RNA Kit protocol, with the exception that at the RNA purification stage the flow-through was kept on ice for protein extraction. DNA and total RNA were purified and kept at -80°C. Protease inhibitor was added to the flow-through (3.5 µl of Halt protease & phosphatase inhibitor cocktail 100X, Thermo Fisher Scientific) and the sample was split into two Eppendorf tubes. 1000 µl of cold acetone was added to each tube, and the tubes were vortexed for 10 seconds. The samples were left to precipitate for 30 minutes on ice and then spun at full speed in a centrifuge at 4°C for 10 minutes. The acetone was pipetted out without touching the pellet, and the pellet was left to dry for 15 minutes in a fume hood and finally stored dry at -80°C.

De novo genome assembly and annotation

In brief, a wild A. polyacanthus fish was previously collected from the same locality on the GBR in Australia and reared in the aquaria as described in Veilleux et al. 2015 [11]. Liver genomic DNA of an F1 fish that was ‘developmentally’ reared at +3°C was extracted using standard phenol-chloroform extraction. Seven mate-pair libraries ranging from 3 to 8 kb and five paired-end libraries were produced and sequenced on the Illumina HiSeq2000 platform. De novo assembly was performed by combining contig assembly with ABySS v1.5.2 (k=65) [12] and scaffolding with SSPACE v3.0 [13]. The assembled genome size was 992 Mb, with 30,414 scaffolds (> 500 bp) and an N50 of 334,400 bp. Gene annotation was accomplished with Maker2 [14] by using the transcriptome (de novo assembly in Veilleux et al. 2015 [11]) and reference-based assembly by Cufflinks v2.2.1 [15] and the combination of UniProtKB/Swiss-Prot [16] and CEGMA core proteins [17] as mRNA and protein evidence, respectively, as well as the ab initio predictors SNAP [18] and AUGUSTUS [19]. This resulted in 25,301 gene models with an average length of 2,466 bp. Sequence matching and annotation of the gene models were performed with BLASTP v2.2.30 against the nr database (version 01/2015; e-value cutoff: 10⁻⁴), BLASTN against the eukaryotic nt database (version 01/2015) and BLASTX against the UniProt database (version 05/2015; e-value cutoff: 10-e10). Functional annotation of the transcripts was obtained with Blast2GO [20] version 3.1.2. Gene Ontology (GO) terms, InterPro IDs and KEGG pathways were added to each transcript where possible.
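For readers unfamiliar with the N50 statistic quoted for the assembly, the following short Python sketch shows the usual definition: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembly length. The function name and the toy lengths are illustrative assumptions; this is not the assembler's own code.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy example (made-up lengths, not the A. polyacanthus scaffolds).
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = n50(toy_lengths)
```

In the toy example the three longest sequences (10 + 9 + 8 = 27) reach half of the total of 54, so the N50 is 8.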

Transcriptome mapping

Total RNA integrity was measured on an Agilent Bioanalyzer, and all samples had a RIN value of at least seven. Illumina sequencing libraries were produced for each sample with a TruSeq RNA Library Preparation Kit and run on the HiSeq2000 platform by Macrogen (Macrogen Korea). Nine samples were individually barcoded and multiplexed per Illumina lane to obtain approximately 50 million paired-end reads per sample; in total, 36 samples were sequenced on four lanes. Information on RNA quality, the randomized multiplexing used to avoid batch effects, and raw read counts can be found in Supplementary Table 5. Raw fastq reads were quality checked with FastQC [21] and quality trimmed with Trimmomatic [22]. Only high-quality reads were accepted for further analysis after removing Illumina adapters and low-quality bases at the start and end of each read (if below a Phred score of 35). The sliding window option was set to 4:20 with a minimum read length of 40. Reads were only included if both reads of a pair passed quality trimming. High-quality paired-end reads were then mapped against the assembled A. polyacanthus genome sequence with TopHat2 [23], using the Bowtie2 very-sensitive alignment mode and the custom-made transcriptome gff file with transcript annotations. Read counts were obtained for all genome exons and transcripts with htseq-count, using the union mode in HTSeq [24].
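To make the 4:20 sliding-window setting and the minimum length of 40 concrete, here is a simplified Python sketch of the idea: scan a read's Phred scores from the 5' end and cut at the first window of four bases whose mean quality falls below 20, then discard reads shorter than 40 bases and keep a pair only if both mates survive. This is an illustrative approximation written for this note, not Trimmomatic's implementation, and all names are hypothetical.

```python
def sliding_window_trim(quals, window=4, min_mean_q=20, min_len=40):
    """Return the number of 5' bases kept, or 0 if the read is discarded."""
    keep = len(quals)
    for start in range(0, max(len(quals) - window + 1, 0)):
        win = quals[start:start + window]
        if sum(win) / window < min_mean_q:
            keep = start  # cut the read where the failing window begins
            break
    return keep if keep >= min_len else 0


def pair_passes(quals_r1, quals_r2):
    """A read pair is retained only if both mates pass trimming."""
    return sliding_window_trim(quals_r1) > 0 and sliding_window_trim(quals_r2) > 0
```

Leading and trailing base removal and adapter clipping are separate steps in the text and are left out of this sketch.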

Differential expression analysis

Differential expression analysis was performed with DESeq2 [25] in Bioconductor version 3.2 in R version 3.2.1. Analysis of within-treatment variation revealed one family line of tolerant parents in the high CO2 treatment to be an outlier. This difference is most likely because the parent pair, once placed into high CO2 (after having already bred in control conditions), took a long time to reproduce, so that these offspring reached 5 months of age in May 2015, whereas all other samples were collected at 5 months of age between January and February 2015. This seasonal difference in collection most likely caused a large difference in gene expression. To avoid this environmental bias, we removed these three individuals from the final transcriptome analysis, leaving two family lines and six samples for the tolerant CO2 treatment. All major results described in this study are also found when these three outliers are included; their removal therefore does not skew the main findings.

For the final analysis, global expression differences between the control (18 samples) and high CO2 (15 samples) conditions were first analyzed with a multifactor analysis that factored in the parental phenotype (tolerant or sensitive). The same type of analysis was used to compare all offspring from tolerant parents (n=15) with those from sensitive parents (n=18), factoring in the environmental treatment (control or high CO2). To obtain a more detailed picture of the expression patterns of each treatment group, we analyzed the gene expression differences of: a) control versus high CO2 for offspring of tolerant parents (n=9 vs. 6), b) control versus high CO2 for offspring of sensitive parents (n=9 vs. 9), c) offspring of tolerant versus sensitive parents in control conditions (n=9 vs. 9) and d) offspring of tolerant versus sensitive parents in high CO2 conditions (n=6 vs. 9). The significance limit was set at an FDR-corrected (adjusted) p-value of 0.05. In addition, a minimum log2 fold change of 0.3 was applied, and genes were accepted only if they were statistically significant and the within-treatment standard deviation was small (SD < mean). A Principal Component Analysis (PCA) was performed to visualize the expression patterns of each of the four groups of samples using MeV version 4.9 [26], with median as the centering mode and the number of neighbors for K-Nearest Neighbor (KNN) imputation set to 10 (Supplementary Fig. 1).
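The acceptance criteria for a differentially expressed gene can be summarized in a small Python sketch that is applied to results already exported from the statistical analysis. The record layout, field names and the choice of strict versus inclusive comparisons are assumptions for illustration; only the thresholds (adjusted p of 0.05, minimum absolute log2 fold change of 0.3, SD < mean within each treatment) come from the text.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class GeneResult:
    gene: str
    padj: float                      # FDR-adjusted p-value
    log2_fold_change: float
    group_means: Tuple[float, ...]   # mean normalized count per treatment group
    group_sds: Tuple[float, ...]     # standard deviation per treatment group


def is_differentially_expressed(r: GeneResult, alpha=0.05, min_abs_lfc=0.3) -> bool:
    """Apply the three filters described in the text to one gene."""
    significant = r.padj < alpha
    large_enough = abs(r.log2_fold_change) >= min_abs_lfc
    low_variation = all(sd < mean for sd, mean in zip(r.group_sds, r.group_means))
    return significant and large_enough and low_variation
```

A gene with an undefined adjusted p-value (NaN) fails the first comparison and is therefore rejected by this sketch.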

We performed hierarchical clustering of the differentially expressed genes to investigate a possible family effect on expression patterns. In the heatmap, if the three samples from each family (or six for one family line) cluster together, then there is a possible family effect, as these individuals show more similar patterns than other individuals from a different parent pair in the same treatment group (Supplementary Fig. 2). We do not see such an effect when comparing sibling offspring reared in control and high CO2 conditions, suggesting that individuals express transcripts more similarly at the treatment level than at the family level.

Fisher’s exact tests were performed with Blast2GO [20] to evaluate the presence of functional enrichment within significantly differentially expressed (DE) genes for the different comparisons. This was done by comparing the GO terms of the DE genes with those of the entire transcriptome set at a significance level of FDR 0.05. Final enriched GO terms were reduced to higher-level ontology terms with REVIGO [27] using the small setting. The different comparisons resulted in different enriched functions. The control versus high CO2 comparisons resulted in enrichment of L-serine biosynthesis processes and organic acid metabolic processes, which were shared between offspring phenotypes (Supplementary Table 3). Control versus CO2 for the sensitive individuals revealed one unique (not shared) process, and the tolerant versus sensitive comparison at CO2 showed circadian rhythm and rhythmic processes as enriched functions. Gene expression networks for the differentially expressed gene sets were created in GeneMANIA [28] using the Danio rerio genome.
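As a rough illustration of what an enrichment test of this kind computes, the sketch below evaluates a one-sided Fisher's exact test (the upper tail of the hypergeometric distribution) for a single GO term using only the Python standard library. The function name and the toy counts are assumptions; the exact test variant and the FDR correction applied inside Blast2GO are not reproduced here.

```python
from math import comb


def enrichment_p_value(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k) for a hypergeometric draw.

    N: genes in the reference set (e.g. the whole transcriptome)
    K: reference genes annotated with the GO term
    n: differentially expressed (DE) genes
    k: DE genes annotated with the GO term
    """
    if not (0 <= k <= min(n, K) and n <= N and K <= N):
        raise ValueError("inconsistent counts")
    total_ways = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / total_ways
```

With many GO terms tested at once, the resulting p-values would still need a multiple-testing correction before applying the FDR 0.05 cut-off mentioned above.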

Genetic variant analysis

To evaluate whether the transgenerational signature in phenotype is due to genetic variation passed on to the next generation, we searched for single nucleotide polymorphisms (SNPs) within the coding regions of the genome. To confidently call variants (SNPs) in the transcripts of the different treatment groups, several modifications were made to the TopHat bam alignment files with samtools [29]. All 36 samples were first sorted, reordered and then deduplicated with Picard tools (http://picard.sourceforge.net/) to eliminate possible PCR biases. Read groups were added to each sample file to distinguish the samples when merging all bam files into one. To avoid misalignment and false positives, we identified regions with insertions and deletions and realigned them with the Genome Analysis Toolkit (GATK) version 3.5 [30]. The UnifiedGenotyper in GATK was then used to call variants with a minimum Phred score of 30. All quality variant sites were recalibrated against the high-quality variants using a Gaussian mixture model with VariantRecalibrator to better distinguish true variants from sequencing errors. Finally, the set of high-quality SNPs (Phred score ≥ 30) was obtained from the recalibrated set. To look for variants with a signal of selection that show clear differences between the offspring of tolerant and sensitive parents, we used BayeScan [31]. The software was run with the high-quality SNP vcf file, including all 18 individuals per phenotype (T or S), and a false discovery rate threshold of 0.05 was applied (Supplementary Fig. 3).

Protein digestion and iTRAQ labeling

Dried fish brain protein pellets stored at -80°C were resuspended in lysis buffer containing 8 M urea and centrifuged at 15,000 rpm for 5 minutes. The supernatant was transferred to a new Eppendorf tube. Protein concentrations were measured using a 2-D Quant kit (GE Healthcare, UK). For each offspring group (tolerant at control, tolerant at high CO2, sensitive at control and sensitive at high CO2) we pooled six of the samples that were also used for transcriptome sequencing. Because one tolerant high CO2 family line had been removed, we reduced the number of individuals for proteomics to six (instead of nine) in every group so as not to introduce a bias due to sample size. For the groups in which nine individuals were available, the six samples were chosen randomly but included all three family lines per group. Suspended proteins were pooled at equal concentrations to a final amount of 100 µg. Proteins were reduced and alkylated following the instructions in the iTRAQ 4plex Kit manual (Applied Biosystems, USA). The protein samples were then diluted 1:7 with 50 mM triethylammonium bicarbonate and digested using trypsin (Promega, USA) at an enzyme:protein ratio of 1:40 for 16 h at 37°C. The trypsin was inactivated by adding trifluoroacetic acid to a final concentration of 2%. The peptides were desalted using 100 mg capacity Sep-Pak C18 cartridges (Waters Corporation, USA). Samples were then incubated with iTRAQ Reagents-4plex (Applied Biosystems) for 60 minutes, after which all individually labeled peptide samples were pooled and dried [32].

Peptide fractionation and mass spectrometry analysis

Samples were fractionated by strong cation exchange (SCX) chromatography as described earlier [26]. Briefly, the iTRAQ-labeled peptides were resuspended in 85 µL of SCX buffer A and fractionated using an Accela 1250 LC system (Thermo Scientific, USA). A total of 15 peptide fractions were obtained, desalted using Sep-Pak C18 cartridges and dried. The fractions were resuspended in 20 µL of LC-MS sample buffer (97% H2O, 3% ACN, 0.1% formic acid) and analyzed three times using a Q Exactive HF mass spectrometer (Thermo Scientific, Germany) coupled with an UltiMate™ 3000 UHPLC (Thermo Scientific). Peptides were separated using an Acclaim PepMap100 C18 column (75 µm I.D. × 15 cm, 3 µm particle size, 100 Å pore size) at a flow rate of 300 nL/minute. A 60-minute gradient was established using mobile phase A (0.1% formic acid in H2O) and mobile phase B (0.1% formic acid in 80% acetonitrile): 5%-40% B for 40 minutes, a 5-minute ramp to 90% B, 90% B for 5 minutes, and 2% B for 10 minutes of column conditioning. The sample was introduced into the mass spectrometer through a Nanospray Flex (Thermo Scientific) with an electrospray potential of 1.5 kV. The ion transfer tube temperature was set at 160°C. The Q Exactive was set to perform data acquisition in the positive ion mode. A full MS scan (350-1400 m/z range) was acquired in the Orbitrap at a resolution of 60,000 (at 200 m/z) in profile mode, with a maximum ion accumulation time of 100 milliseconds and a target value of 3 × 10⁶. Charge state screening for precursor ions was activated. The ten most intense ions above a threshold of 2 × 10⁴ and carrying multiple charges were selected for fragmentation using higher energy collision dissociation (HCD). The resolution was set to 15,000. Dynamic exclusion for HCD fragmentation was 20 seconds. Other settings for fragment ions included a maximum ion accumulation time of 100 milliseconds, a target value of 1 × 10⁵, a normalized collision energy of 28%, and an isolation width of 1.8.

Protein identification and quantitation

Raw MS data were converted into Mascot generic format (mgf) files using Proteome Discoverer 1.4 software (Thermo Scientific). These files were submitted to MASCOT v2.3 (Matrix Science Ltd, United Kingdom) for a database search against an Acanthochromis polyacanthus brain protein dataset developed in-house from the transcriptome data. The mass tolerance was set to 10 ppm for precursors and 0.5 Da for MS/MS fragment ions. A maximum of one missed cleavage was allowed. Variable modifications included 4-plex iTRAQ at tyrosine and oxidation at methionine. The fixed modifications were set to methylethanethiosulfonate at cysteine and lysine, and 4-plex iTRAQ at the N-terminus. The MASCOT result files were processed using Scaffold v4.1.1 (Proteome Software Inc., USA) for validation of peptide and protein identifications with a threshold of 95% using the Prophet algorithm with Scaffold delta-mass correction. iTRAQ label-based quantitation of the identified proteins was performed using the Scaffold Q+ algorithm. The intensities of all labeled peptides were normalized across all runs [33]. Individual quantitative data acquired in each run were normalized using the i-Tracker algorithm [34]. Peptide intensity was normalized within the assigned protein. The reference channel (e.g. 114) was normalized to produce a 1:1 fold change, and the iTRAQ ratios were then transformed to a log scale. P-values were calculated using a paired t-test. We allowed for missing data in one of the technical replicates and accepted differential expression at a fold-change level of 1.5 (consistent over technical replicates).
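The final acceptance rule can be sketched in Python as follows. The sketch assumes that ratios are expressed on a log2 scale relative to the reference channel, and it interprets "consistent over technical replicates" as every observed replicate exceeding the 1.5-fold cut-off in the same direction; both points are assumptions for illustration, as are the function names. The normalization steps and the paired t-test are not reproduced.

```python
from math import log2
from typing import List, Optional


def log2_ratio(sample_intensity: float, reference_intensity: float) -> float:
    """Log2 of the reporter-ion intensity ratio (sample / reference channel)."""
    if sample_intensity <= 0 or reference_intensity <= 0:
        raise ValueError("intensities must be positive")
    return log2(sample_intensity / reference_intensity)


def passes_fold_change(replicate_log2_ratios: List[Optional[float]],
                       min_fold: float = 1.5,
                       max_missing: int = 1) -> bool:
    """True if all observed replicates agree on a change of at least min_fold.

    Missing replicates are given as None; at most max_missing may be absent.
    """
    observed = [r for r in replicate_log2_ratios if r is not None]
    missing = len(replicate_log2_ratios) - len(observed)
    if missing > max_missing or not observed:
        return False
    cutoff = log2(min_fold)
    consistently_up = all(r >= cutoff for r in observed)
    consistently_down = all(r <= -cutoff for r in observed)
    return consistently_up or consistently_down
```

A reference channel normalized to a 1:1 ratio corresponds to a log2 ratio of 0 in this representation.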

Differential Protein Expression

A total of 2,702 proteins were confidently detected; however, not all of these proteins had data for all three technical replicates per sample group, and the final number varied between 2,100 and 2,300 per comparison. As in the transcriptome analysis, four comparisons were performed by changing the reference sample in Scaffold v4.1.1: control versus high CO2 for sensitive and for tolerant offspring, and sensitive versus tolerant offspring at control and at CO2 conditions. The number of differentially expressed proteins can be found in Figure 2c and the list of proteins in Supplementary Table 2a-d.

Comparative analysis of transcriptome and proteome

It is generally difficult to compare the two levels of molecular response directly, as transcriptomes provide quantitative expression values whereas mass spectrometry-based proteomes provide relative values. Furthermore, the proteins detected in a proteome generally represent only a small fraction of the total number of proteins, whereas RNA-Seq recovers a large part of genome-wide expression. Hence, the absence of a protein from the proteome data set could mean that the protein is not expressed, that it was not detected, or that it did not pass the filtering criteria. Besides the technical issues of comparing the two molecular levels, direct overlap has been shown to be quite low (on average 27%) even in model species [35]. Differential expression overlap varies depending on the comparison, but is on average also low (Table S2). Nonetheless, some differentially expressed proteins match the differential expression of transcripts. For the seven transcripts that were commonly differentially expressed (regardless of parental phenotype), we could detect three of the directly matching proteins (43% overlap, Table 2). Only one of these three proteins (33%) was also differentially expressed between control and high CO2 conditions, at least for the offspring of tolerant parents (gene name: phgdh; protein name: d-3-phosphoglycerate dehydrogenase). For the 18 proteins that were commonly differentially expressed (between control and CO2 conditions regardless of parental phenotype), none of the directly related transcripts were differentially expressed. However, for one protein (inactive serine protease PAMR1-like) serine plays an important role, as it does for several transcripts (discussed in main text). The rest of the commonly differentially expressed proteins are mostly involved in structural maintenance (such as collagen or myoglobin). One interesting protein that is not differentially expressed at the transcript level is vasotocin, which is upregulated at the high CO2 level. For the sensitive versus tolerant offspring comparison at high CO2, 21% of matching transcripts and proteins (Table S2) were commonly differentially expressed. For example, the circadian rhythm gene nr1d1 and parvalbumin (pvalb) are differentially expressed at both the transcript and protein level.

Supplementary Table 2. Overlap of differentially expressed transcripts and proteins for the different comparisons. Recovered matching proteins are the proteins that were detected through mass spectrometry-based proteomics that matched the sequence of the transcript. Overlapping differential expression are the exactly matching transcripts and proteins that are both differentially expressed.

Comparison               Parental phenotype / Condition   Differentially expressed transcripts   Recovered matching proteins   Overlapping differential expression
Control vs. CO2          T & S                            7                                      3                             1
Control vs. CO2          T                                173                                    12                            1
Control vs. CO2          S                                62                                     14                            0
Tolerant vs. Sensitive   high CO2                         152                                    14                            3
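The overlap percentages quoted in the comparative analysis follow directly from the counts in Supplementary Table 2. The short Python sketch below reproduces that arithmetic; the helper name is hypothetical and rounding to whole percentages is an assumption that matches the figures reported in the text.

```python
def percent(part: int, whole: int) -> int:
    """Percentage of `whole` represented by `part`, rounded to a whole number."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100 * part / whole)


# Control vs. CO2 (T & S): 3 matching proteins recovered for 7 DE transcripts.
recovered_common = percent(3, 7)
# Of those 3 recovered proteins, 1 was also differentially expressed.
overlap_common = percent(1, 3)
# Tolerant vs. Sensitive at high CO2: 3 overlapping out of 14 recovered proteins.
overlap_high_co2 = percent(3, 14)
```

These three values correspond to the 43%, 33% and 21% reported above.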

qRT-PCR validation

For RNAseq validation we used quantitative real-time PCR (qRT-PCR) to test the expression of a selection of genes. For this we used samples from the same treatments and families, but different biological replicates from those used in the RNAseq analysis, to reinforce the findings. In the RNAseq analysis we used three individuals per family (2-3 families per treatment), and here we used two other biological replicates per family (2-3 families per treatment group). qRT-PCR primers were designed from the associated transcript of the gene of interest with Primer3Plus using the qPCR settings [36]. Selected primers were additionally checked for specificity with the NCBI Primer-BLAST tool. The 20 bp forward and reverse primers were then obtained, HPSF purified, from SIGMA (Sigma-Aldrich, Germany). A total RNA quantity of 550 ng for each sample was reverse transcribed using a high capacity reverse transcription kit from ABI (Applied Biosystems). 15 ng of the resulting cDNA was used for each reaction, and three reactions were run per sample. This was done with a Fast SYBR Green PCR mix (ABI), with the setup per reaction being: 5 µl of 2X master mix, 0.25 µl of 10 µM forward primer, 0.25 µl of 10 µM reverse primer, 3.5 µl of H2O and 1 µl of cDNA template, for a total reaction volume of 10 µl. The PCR was run in a 384-well clear plate (ABI): 95°C for 20 seconds, then 40 cycles of 95°C for 10 seconds and 60°C for 20 seconds. The qRT-PCR was run with a negative control on the 7900 HT Fast Real Time PCR system (ABI) in the Genomics section of the Biosciences Core Lab of the King Abdullah University of Science and Technology. Three transcripts with the lowest standard deviation in gene expression across all samples were selected as ‘house keeping genes’, each expressed at a different level: low (dhrs7b), intermediate (smurf1) and high (akt1s1). All samples for all analyzed genes were run in triplicate, and the medians of the CT values of the technical replicates were used for final quantification and comparison. We used the Livak method and calculated Delta Delta CTs by normalizing the CT values against the housekeeping gene average (Delta CT) and then comparing the Delta CTs of different treatments with each other (Delta Delta CT) [37]. Values of Delta Delta CT were then compared with the log2 fold differences in the RNAseq data (Supplementary Fig. 4). To compare the qRT-PCR results with the RNAseq data we performed four comparisons: tolerant versus sensitive at (1) the control level and (2) the high CO2 level, and control versus high CO2 for (3) only tolerant or (4) only sensitive specimens. Of the three selected ‘small deviation genes’, only the highly expressed one showed a significant correlation between RNAseq and qRT-PCR (Pearson's product-moment correlation, p=0.022), which is not unusual given the more variable nature of qRT-PCR. Nine out of ten genes used for validation showed the same expression pattern in qRT-PCR as found with RNAseq (Pearson's product-moment correlation, p<0.05). Only gabra3 did not match significantly, probably due to the very low twofold expression differences of the gene. This high percentage of validation shows that the RNAseq results can be replicated not only with a different method, but also with different biological samples, and therefore the observed pattern is clearly linked to the treatment.
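The Delta Delta CT calculation described above can be sketched in a few lines of Python. The sketch takes the median CT of the technical replicates, normalizes it against the mean of the housekeeping-gene medians, and compares two treatments; the function names and the example CT values are invented for illustration and are not data from this study.

```python
from statistics import mean, median


def delta_ct(target_cts, housekeeping_cts_by_gene):
    """Median target CT minus the average of the housekeeping-gene median CTs."""
    reference = mean(median(cts) for cts in housekeeping_cts_by_gene.values())
    return median(target_cts) - reference


def delta_delta_ct(delta_ct_treatment, delta_ct_control):
    """Difference in Delta CT between two treatments."""
    return delta_ct_treatment - delta_ct_control


def fold_change(ddct):
    """Livak method: relative expression equals 2 to the power of -ddCT."""
    return 2.0 ** (-ddct)


# Invented CT triplicates for the three housekeeping genes named in the text.
housekeeping = {
    "dhrs7b": [30.0, 30.2, 29.8],
    "smurf1": [25.0, 25.1, 24.9],
    "akt1s1": [20.0, 20.0, 20.0],
}
dct_control = delta_ct([24.0, 24.1, 23.9], housekeeping)
dct_high_co2 = delta_ct([23.0, 23.0, 23.1], housekeeping)
ddct = delta_delta_ct(dct_high_co2, dct_control)
```

Because a lower CT means more template, a negative Delta Delta CT corresponds to higher expression, so it is the negated value that lines up with a positive log2 fold change.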


18

References

1. Nilsson, G. E. et al. Near-future carbon dioxide levels alter fish behaviour by interfering with neurotransmitter function. Nat. Clim. Chang. 2, 201–204 (2012).

2. Welch, M. J., Watson, S.-A., Welsh, J. Q., McCormick, M. I. & Munday, P. L. Effects of elevated CO2 on fish behaviour undiminished by transgenerational acclimation. Nat. Clim. Chang. 4, 1086–1089 (2014).

3. Ferrari, M. C. O. et al. Intrageneric variation in antipredator responses of coral reef fishes affected by ocean acidification: implications for climate change projections on marine communities. Glob. Chang. Biol. 17, 2980–2986 (2011).

4. Munday, P. L. et al. Ocean acidification impairs olfactory discrimination and homing ability of a marine fish. Proc. Natl. Acad. Sci. 106, 1848–1852 (2009).

5. Meinshausen, M. et al. The RCP greenhouse gas concentrations and their extensions from 1765 to 2300. Clim. Change 109, 213–241 (2011).

6. McNeil, B. I. & Sasse, T. P. Future ocean hypercapnia driven by anthropogenic amplification of the natural CO2 cycle. Nature 529, 383–6 (2016).

7. Collins, M. et al. in Clim. Chang. 2013 Phys. Sci. Basis. Contrib. Work. Gr. I to Fifth Assess. Rep. Intergov. Panel Clim. Chang. [Stocker, T.F., D. Qin, G.-K. Plattner, M. Tignor, S.K. Allen, J. Boschung, A. Nauels, Y. Xia, (Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA, 2013).

8. Nagelkerken, I. & Munday, P. L. Animal behaviour shapes the ecological effects of ocean acidification and warming: moving from individual to community-level responses. Glob. Chang. Biol. 22, 974–89 (2016).

9. Munday, P. L. et al. Replenishment of fish populations is threatened by ocean acidification. Proc. Natl. Acad. Sci. U. S. A. 107, 12930–4 (2010).

10. Munday, P. L. et al. Selective mortality associated with variation in CO2 tolerance in a marine fish. Ocean Acidif. 1, 1–5 (2012).

11. Veilleux, H. D. et al. Molecular processes of transgenerational acclimation to a warming ocean. Nat. Clim. Chang. 5, 1074–1078 (2015).

12. Simpson, J. T. et al. ABySS: a parallel assembler for short read sequence data. Genome Res. 19, 1117–23 (2009).

13. Boetzer, M., Henkel, C. V., Jansen, H. J., Butler, D. & Pirovano, W. Scaffolding pre-assembled contigs using SSPACE. Bioinformatics 27, 578–9 (2011).

14. Holt, C. & Yandell, M. MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects. BMC Bioinformatics 12, 491 (2011).

15. Trapnell, C. et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat. Biotechnol. 28, 511–5 (2010).

16. UniProt: a hub for protein information. Nucleic Acids Res. 43, D204–12 (2014).

17. Parra, G., Bradnam, K. & Korf, I. CEGMA: a pipeline to accurately annotate core genes in eukaryotic genomes. Bioinformatics 23, 1061–7 (2007).

18. Korf, I. Gene finding in novel genomes. BMC Bioinformatics 5, 59 (2004).

19. Stanke, M. et al. AUGUSTUS: ab initio prediction of alternative transcripts. Nucleic Acids Res. 34, W435–9 (2006).

20. Conesa, A. et al. Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics 21, 3674–6 (2005).



21. Andrews, S. FASTQC. A quality control tool for high throughput sequence data. (2010). at <http://www.bioinformatics.babraham.ac.uk/projects/fastqc/>

22. Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).

23. Kim, D. et al. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 14, R36 (2013).

24. Anders, S., Pyl, P. T. & Huber, W. HTSeq - A Python framework to work with high-throughput sequencing data. Bioinformatics 31, 166–169 (2014).

25. Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).

26. Saeed, A. I. et al. TM4: a free, open-source system for microarray data management and analysis. Biotechniques 34, 374–8 (2003).

27. Supek, F., Bošnjak, M., Škunca, N. & Šmuc, T. REVIGO summarizes and visualizes long lists of gene ontology terms. PLoS One 6, e21800 (2011).

28. Zuberi, K. et al. GeneMANIA prediction server 2013 update: biological network integration for gene prioritization and predicting gene function. Nucleic Acids Res. 41, W115–22 (2013).

29. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–9 (2009).

30. DePristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–8 (2011).

31. Foll, M. & Gaggiotti, O. A genome-scan method to identify selected loci appropriate for both dominant and codominant markers: a Bayesian perspective. Genetics 180, 977–93 (2008).

32. Chandramouli, K. H., Reish, D., Zhang, H., Qian, P.-Y. & Ravasi, T. Proteomic Changes Associated with Successive Reproductive Periods in Male Polychaetous Neanthes arenaceodentata. Sci. Rep. 5, 13561 (2015).

33. Zhang, H. et al. Study of monocyte membrane proteome perturbation during lipopolysaccharide-induced tolerance using iTRAQ-based quantitative proteomic approach. Proteomics 10, 2780–9 (2010).

34. Shadforth, I. P., Dunkley, T. P. J., Lilley, K. S. & Bessant, C. i-Tracker: for quantitative proteomics using iTRAQ. BMC Genomics 6, 145 (2005).

35. Ghazalpour, A. et al. Comparative analysis of proteome and transcriptome variation in mouse. PLoS Genet. 7, e1001393 (2011).

36. Untergasser, A. et al. Primer3Plus, an enhanced web interface to Primer3. Nucleic Acids Res. 35, W71–4 (2007).

37. Livak, K. J. & Schmittgen, T. D. Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) Method. Methods 25, 402–8 (2001).



Supplementary Figure 1. Two-dimensional projection of the Principal Component Analysis (PCA) performed on the four sample treatments. The figure shows the PC1 and PC3 projections.




Supplementary Figure 2. Heatmap of sensitive offspring at control and CO2 levels. Heatmap color intensity is proportional to the expression level. Two-way hierarchical clustering was performed according to condition and family line (F).

[Figure: heatmap. Sample columns: 41-1_C5, 33-1_C3, 41-1_C4, 33-1_C4, 33-1_C5, 41-1_C2, 57-1_C3, 57-1_C2, 57-1_C4, 41-2_H1, 73-1_H4, 41-2_H2, 41-2_H3, 73-1_H1, 73-1_H3, 71-1_H1, 71-1_H3, 71-1_H4. Annotation bars: Condition (C, CO2) and Family (F1 to F5). Color scale: −3 to 3.]




Supplementary Figure 3. Bayesian FST outlier detection for SNPs between offspring of tolerant and sensitive parents. The four statistical outliers detected are marked with their gene names.




Supplementary Figure 4. Validation of transcript expression by quantitative real-time PCR. Nine genes were validated with qRT-PCR, and fold changes are shown for the four comparisons: CO2 versus control for tolerant samples only (CO2 vs. C_T) or for sensitive samples only (CO2 vs. C_S), and tolerant versus sensitive samples at control conditions (T vs. S_C) or at high CO2 (T vs. S_CO2). The red line corresponds to the qRT-PCR expression levels and the blue line to the RNA-seq results. All genes except gabra3 show a significant correlation between qRT-PCR and RNA-seq (P value of Pearson's product-moment correlation <0.05).

[Figure: nine line-plot panels, one per gene: pck1, fgf1, ciart, per1, nfil3, shmt2, glrk, gabra1, gabra3. X-axis: comparison (CO2vsC_T, TvsS_C, CO2vsC_S, TvsS_CO2). Y-axis: fold change (label extracted as "2FoldChange"). Legend: method (qPCR, RNAseq).]



Supplementary Table Legends

Supplementary Table 1a-d. List of significantly differentially expressed transcripts.

Supplementary Table 2a-d. List of differentially expressed proteins.

Supplementary Table 3. Significant (FDR<0.05) enrichment of biological functions for each comparison.

Supplementary Table 4. Differentially expressed Solute Carrier Transporter Genes (SLC) and transporter proteins.

Supplementary Table 5. Sequencing information for each biological sample.

