SUPPLEMENTARY INFORMATION Comparative integrated omics: identification … · 2015-06-17 ·...
Transcript of SUPPLEMENTARY INFORMATION Comparative integrated omics: identification … · 2015-06-17 ·...
SUPPLEMENTARY INFORMATION
Comparative integrated omics: identification of key
functionalities in microbial community-wide metabolic networks
Hugo Roumea,1*, Anna Heintz-Buscharta*, Emilie EL Mullera, Patrick Maya, Venkata P 5
Satagopama, Cédric C Lacznya, Shaman Narayanasamya, Laura A Lebruna, Michael R
Hoopmannb, James M Schuppc, John D Gillecec, Nathan D Hicksc, David M Engelthalerc,
Thomas Sauterd, Paul S Keimc, Robert L Moritzb, and Paul Wilmesa,2
aLuxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg.
bInstitute for Systems Biology, Seattle, WA, USA. 10 cThe Translational Genomic Research Institute-North, Flagstaff, AZ, USA.
dLife Science Research Unit, University of Luxembourg, Luxembourg, Luxembourg.
1Current address: Laboratory of Microbial Ecology and Technology, Ghent University, Belgium
2Corresponding author: P Wilmes, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 15
L-4362 Esch-sur-Alzette, Luxembourg. E-mail: [email protected].
* These authors contributed equally to this work.
2
Supplementary Materials and Methods 20
Biomolecular extraction and quality assessment
For each sampling date, three 200 mg sub-samples derived from one OMMC islet (defined,
herein1, as technical replicates) were used for biomolecular extraction. Each of the individual
biomolecular fractions isolated from the technical replicates were combined into a pool in
order to yield sufficient biomolecular quantities for subsequent high-throughput analysis and 25
to reduce the influence of fine-scale within-sample heterogeneity (as demonstrated by Roume
et al.2).
For the quality assessment of the isolated genomic DNA, fractions were separated by
electrophoresis on a 1 % agarose gel containing 4 ‰ ethidium bromide (PlusOne Ethidium
Bromide, GE Healthcare). For size estimation, the MassRuler DNA ladder mix (Fermentas) 30
was loaded onto the gels. Agarose gels were visualized on an InGenius gel imaging and
analysis system (Syngene, Cambridge, UK; as described in Roume et al.1). The DNA pool
sample was snap-frozen in liquid nitrogen in the elution buffer and stored at -80 °C until
library preparation and sequencing.
RNA quality assessment and quantification was carried out using an Agilent 2100 35
Bioanalyzer (Agilent Technologies, Diegem, Belgium; as described in Roume et al.1).
The quality of protein extracts was assessed following 1D-SDS-PAGE separation.
Preparation of RNA for shipment and sequencing
The volume of the RNA pool sample was adjusted to 180 µl using RNase free-water. A
mixture of 18 µl of a 3 M (w/v) sodium acetate solution, 2 µl of a glycogen solution 40
3
(10.mg/µl) and 600 µl of ice-cold 96 % (v/v) ethanol was added to the RNA solution. The
RNA solution was gently mixed by inversion and precipitated at -20 °C for at least 1.h.
Following centrifugation at 10 000 x g for 30 min at 4 °C, the RNA pellet was washed three
times with ice-cold 70 % (v/v) ethanol. The pellet was then air dried for 5 min at room
temperature overlaid with 100 µl of Ambion® RNAlater (Life Technologies, Gent, Belgium) 45
and placed at -20 °C until shipment. For reclamation of the RNA, the solution was
centrifuged at 14 000 x g for 10 min at 8 °C. After removal of the supernatant, the pellet was
washed three times with 100 µl of 80 % (v/v) ethanol, followed by 25 000 x g centrifugation
for 3 min at 8 °C. The pellet was then air dried until all visible ethanol had evaporated. The
dried RNA pellet was then re-suspended in 1 mM sodium citrate buffer solution (pH 6.4). 50
RNA-Sequencing
In order to enrich the RNA fraction in mRNA, the rRNA was subtracted from the total RNA
fraction using the Ribo-Zero rRNA Removal Kit (Meta-Bacteria; Epicentre, Madison, WI,
USA). This was followed by cDNA synthesis and amplification using the ScriptSeqTM v2
RNA-Seq library preparation kit (Epicentre). 55
rRNA removal
The procedure consisted of an initial washing and re-suspension of the Ribo-Zero
microspheres with dedicated solutions. Following treatment of the total RNA sample with
Ribo-Zero removal solution, the RNA sample solution was added to the re-suspended Ribo-
Zero microspheres and selected by hybridisation. The RNA-microsphere solution was then 60
removed and the rRNA-depleted sample was purified by ethanol precipitation.
4
Library preparation
The cDNA synthesis and amplification were performed using the ScriptSeqTM v2 RNA-Seq
library preparation kit (Epicentre). The procedure consisted of initial fragmentation of the
RNA, followed by the annealing of the cDNA synthesis primer. Briefly, addition of 4 µl
cDNA Synthesis Master Mix to the fragmented RNA solution was followed by incubation 65
first at 25 °C for 5 min and then 42 °C for 20 min. cDNA Synthesis Master Mix was prepared
by combining 3 µl cDNA Synthesis PreMix (ScriptSeq v2 RNA-Seq library preparation kit),
0.5 µl of DTT (100 mM) and 0.5 µl of StartScript Reverse Transcriptase solution (Epicentre).
After cooling to 37 °C, 1 µl of finishing solution (Epicentre) was added to the cDNA
synthesis solution and incubated at 37 °C for 10 min, before an incubation at 95 °C for 3 min. 70
8 µl of terminal tagging master mix was then added to each solution and incubated at 25 °C
for a further 15 min, following incubation at 95 °C for 3 min. Terminal tagging master mix
was prepared using 7.5 µl of terminal tagging premix (Epicentre) and 0.5 µl of DNA
polymerase. The 3’-terminal tagged cDNA was then purified using the AMPure XP system
(Beckman Coulter, Brea, CA, USA). The purified cDNA strand was then amplified by PCR, 75
resulting in the generation of the second strand of cDNA, addition of the Illumina adaptor
sequences and incorporation of specific barcodes, as well as amplification. Finally, the RNA-
Seq library was purified using AMPure XP system (Beckman Coulter). The size distribution
of the RNA-Seq library was assessed using an Agilent 2100 Bioanalyzer.
80
DNA/cDNA sequencing
DNA/cDNA was prepared according to the modified instructions from The Wellcome Trust
Sanger Institute3. The metagenomic sequencing protocol used a 96-well library preparation
and the molecular barcoding method for Illumina library construction. The barcodes were
5
designed using Hamming codes, which allow single nucleotide sequencing errors to be
corrected and single indels (insertions/deletions) to be detected without ambiguity. Several 85
optimisations were employed in the indexing protocol. The tags used were 8 bp long, which
allowed the design of larger number of barcodes with error-correcting capability. The bar
codes were introduced in a regular PCR, which simplified the PCR step and allowed for use
of as few as six cycles. Before pooling, the relative concentration of each sample library was
measured by quantitative PCR (qPCR), which then allowed the accurate pooling of libraries 90
together and improved the uniformity of their representation.
DNA fragmentation
1-5 µg of DNA was used for the fragmentation step, as determined following gel analysis,
and re-suspended in 75 µl of 10 mM Tris-HCl, pH 8.5. The sample was then sheared for
150 s in 100 µl Covaris microtubes (Woburn, MA, USA). The following programme was 95
used: duty cycle 20 %, intensity 5, cycle burst 200, power 37 W, temperature 7 °C and mode
‘Freq sweeping’. The sheared DNA samples were transferred into a MicroAmp optical 96-
well plate (Life technologies, Grand Island, NY, USA). Six samples, chosen at random, were
run on an Agilent Bioanalyzer DNA 1000 chip to check the quality of the fragmentation. A
150-250 bp smear was detected and 70-90 % of the initial DNA amount was recovered. 100
Size selection
A large fragment plate was prepared by dispensing 150 µl of beads into wells of a round-
bottom Costar plate (Corning, Glendale, AZ, USA) for each sample. Additionally, a small-
fragment plate was prepared by dispensing 60 µl of beads into wells of a separated Costar
plate for every sample. The large-fragment Costar plate was placed on a magnetic stand, 105
6
allowing the collection of beads. 45 µl of 10 mM Tris-HCl buffer were removed from the
large-fragment wells, leaving all beads in the well. The plate was removed from the magnetic
stand and 190 µl of sample from sonication tubes was added to the large fragment wells on
the Costar plate. Following 8 min of incubation at room temperature, using two magnetic
stands, the large-fragment Costar plate was placed on a magnetic stand to collect beads. 110
Following 5 min incubation at room temperature, 58 µl of bead buffer were removed from
the small fragment wells, leaving behind beads and approximately 2 µl of bead buffer. Small-
fragment wells were then removed from the magnetic stand. 300 µl of supernatant were
transferred from large-fragment wells into small-fragment wells. After transfer, the large-
fragment plate was discarded. Samples were then incubated in small-fragment wells for 115
5 min and the plate was placed on a magnetic plate to collect beads. 300 µl of supernatant
were removed and replaced by 300 µl of 80 % ethanol without disturbing beads. Following
30 s of incubation, ethanol was removed. This last step was repeated two times. Following
ethanol removal by air-drying for 5 min, the small fragment plate was removed from the
magnetic stand and 45 µl of pre-warmed 10 mM Tris-HCl (pH 8.5) were dispensed into the 120
wells. After re-suspension of beads by vigorous mixing, the solution was incubated for 2 min
and placed on a magnetic plate to collect beads. Following a further 3 min incubation, 42.5 µl
of supernatant containing the size-selected products were transferred to a new plate.
All enzymes and reaction buffers used for the end-repair, dA-tailing, ligation and PCR
amplification were provided from the KAPA Library Preparation Kits with Standard PCR 125
Library Amplification/Illumina series (KapaBiosystems, Woburn, MA, USA).
7
End-repair
To the supernatant containing the sheared DNA, 10 µl of end-repair buffer and 5 µl of end-
repair enzyme mix were added. 100 µl of this solution were added to each well. The plate
was then covered, vortexed briefly and spun down before being incubated for 30 min at 130
20 °C in a thermocycler. The samples were then cleaned using Agencourt AMPure SPRI
beads (Beckman Coulter). 90 µl of SPRI beads were added into each well, the plate was
covered and vortexed for 30 s. The reaction plate was placed on the magnetic SPIRPlate for
10 min and beads separated from the solution. Following incubation for 10 min, the
completely clear solution was discarded. 200 µl of 70 % (v/v) ethanol were added to each 135
well and incubated for 30 s at room temperature. The ethanol washes were repeated two
times. The reaction plate was dried for 15 min in a thermocycler and each sample was eluted
with 43.5 µl of 10 mM Tris buffer (pH 8.0).
A-tailing
To 30 µl of end-repaired DNA, 5 µl of 10x A-tailing buffer, 3 µl of A-tailing enzyme and 140
12.µl of water were added. 50 µl of A-tailing master mix were added to each well. The plate
was then covered, vortexed briefly and spun down before being incubated for 30 min at
30 °C in a thermocycler. The sample was then cleaned up using the Agencourt AMPure SPRI
bead method with 90 µl of SPRI beads placed in each well. Each sample was then eluted with
36 µl of 10 mM Tris buffer (pH 8.0). The same six samples, randomly chosen for the DNA 145
fragmentation step, were run on an Agilent Bioanalyzer DNA 1000 chip and the average
sample concentration was calculated using the “integrated peak” function.
8
Adapter preparation and ligation
To 30 µl of A-tailed DNA, 10 µl of 5x ligation buffer, 5 µl of DNA ligase and 5 µl of DNA
adaptor (30 µM) were added. 50 µl of this reaction mixture, were added to each well. The 150
plate was then incubated at 20 °C (room temperature) for 15 min. The sample was cleaned up
using Agencourt AMPure SPRI beads by addition of 40 µl of SPRI beads to each well. Each
sample was then eluted with 30 µl of 10 mM Tris buffer (pH 8.0) / 0.05 % (v/v) Tween 20.
The same six samples were randomly chosen for the DNA fragmentation step and run on an
Agilent DNA 1000 chip to check the success of the ligation; the smear obtained had an 155
average molecular size of 50 to 200 bp larger than before ligation.
PCR amplification
Library amplification was carried out according to a modified reaction setup as defined in the
KAPA Library Preparation kit instructions (Illumina). 25 µl of 2x Kapa HIFI Hotstart Mix
were spiked with 1 M betaine, aiding in amplification of high-GC-content regions and 160
reducing biases. 1 µl of the supplied PCR primers were added to each tube and a sufficient
quantity of water was added to reach 50 µl of solution volume per PCR reaction. The tubes
were then transferred into a PCR thermocycler and run with the recommended KAPA library
preparation cycling programme. The samples were then cleaned using the SPRI beads
method with 40 µl of SPRI beads. Each sample was eluted using 50 µl of 10 mM Tris-HCl / 165
0.05 % (v/v) Tween 20. The same six samples as those randomly chosen at the DNA
fragmentation step, were run on an Agilent Bioanalyzer DNA 1000 chip to check the success
of the indexing enrichment PCR.
9
Final library quantification by qPCR
The KAPA Library Quant kit (KapaBiosystems) for final library quantification was used 170
(Illumina). Briefly, after an initial 1:1 000 dilution in 10 mM Tris-HCl, pH 8.0 + 0.05 % (v/v)
Tween 20, the 2x KAPA SYBR FAST qPCR master mix was used to amplify the DNA
library with six other standards on an ABI 7900 thermocycler. The qPCR step was conducted
using the following cycling conditions: 95 °C for 5 min followed by 35 cycles at 95 °C for
30.s and at 60 °C for 45 s. The concentration of the library was established using the 175
standards according to the manufacturer’s instructions. Both the standards and the libraries
with unknown concentration were run in triplicate. Prior to sequencing, the libraries were
diluted to the required concentration (e.g. 4.5 pM) by following the Illumina cluster
generation protocol.
180
Sequence assembly
For each of the two sampling dates, the raw paired-end 100 nt read metagenomic and
metatranscriptomic sequences were processed separately first using the PAired-eND
Assembler4 (PANDAseq, Supplementary Figure 1, step 1) to assemble overlapping read
pairs. PANDAseq was run with a score threshold of 0.9 and 25 nt minimum overlap
requirement to determine the location of the amplification primers, identify the optimal 185
overlap between reads, correct sequencing errors and check length and base quality. The
reads selected by the PANDAseq assembler were extracted from the raw sequence files using
in-house Perl scripts. The remaining non-redundant paired-end reads were trimmed using the
trim-fastq.pl script from the PoPoolation package5 using a quality-threshold of 20 (1 %
probability of miscall) and a minimum length of 40 nt resulting in two quality trimmed read 190
sets, one including still paired-end and one containing only single-end reads, where the other
10
read pair was discarded during quality trimming. Metagenome and metatranscriptome
FASTQ files were then combined into a combined FASTQ file. All PANDAseq and single-
end reads were then combined into a single FASTQ file. Paired-end and single-end reads
were made non-redundant using CD-HIT-dup6 (Supplementary Figure 1, step 2). None-195
redundant reads were then used as input for the MOCAT assembly pipeline7 (Supplementary
Figure 1, step 3), using default parameters. The assembled contigs were filtered with
minimum length threshold of 150 nt. To enhance the final assembly by reads that were
assembled by PANDAseq, but not used by the MOCAT assembly pipeline, all PANDAseq
reads were mapped onto the MOCAT contigs using SOAPaligner8, with the following 200
parameter settings: -r 2 -M 4 -l 30 -v 10 -p 8 (Supplementary Figure 1, step 4). The
unmapped PANDAseq assembled reads with a minimum length of 150 bp, were extracted and
added to the contigs. The final contig files were made non-redundant using CD-HIT6 (-c 1.0)
by clustering identical sequences.
205
Extraction and classification of reads mapping to rRNA genes in the metagenomic data
EMIRGE9 was run using metagenomic reads quality filtered to minimum average QV = 30
and a minimum of 40 bp with the trim-fastq.pl script from the PoPoolation package5. The
reference database used was the truncated (non-redundant) small ribosomal subunit SILVA10
database release 111. The consensus sequences at 100 iterations were extracted and named
based on their identification in the reference database, without imposing any normalized 210
posterior probability.
Generation of additional metagenomic data for contig extension and analysis
To obtain an additional high-depth metagenomic dataset, a total of four additional floating
11
sludge samples islets were collected on 23rd February 2011, each representing a biological
replicate. Biomolecular extraction was performed on 200 mg of starting material as for the 215
other two dates, using the protocol described by Roume et al.1. The resulting DNA fractions
were sequenced under sample names I, II, III and IV using the sequencing protocol described
above. Four technical replicates from sample I were generated, resulting in four libraries I-1,
I-2, I-3 and I-4. While the remaining biological replicates II, III and IV were sequenced once
each, generating a total of 12.6 gigabases metagenomic sequence. The reads were then 220
assembled using AMOS11 and MetaVelvet12.
Gene annotation
Non-redundant contig files were split into two distinct files, one file with contigs of a length
below 500 bp and another file with contigs lengths above or equal to 500 bp. The contigs 225
with lengths below 500 bp were annotated using FragGeneScan13 (Supplementary Figure 1,
step 5), using settings for short sequence reads with sequencing error (-complete 0 –train
illumine_5). The contigs with a length equal or above 500 bp were annotated with Prodigal
gene finder14 (v2.60, Supplementary Figure 1, step 5) using the MOCAT gene prediction
processing steps. The resulting amino acids sequence files were then combined into a single 230
file and made non-redundant using CD-HIT with a sequence identity threshold of 1.0 and a
description string length within the cluster file of 5 000 (-c 1.0 –d 5000; Supplementary
Figure 1, step 6).
All sequences were mapped to the KEGG database version 64.0 using BLAT15 and sequences
were annotated with KOs (KEGG orthologous groups; Supplementary Figure 1, step 7). 235
12
Pre-processing of protein fraction for high-throughput analysis
Following 1D-SDS-PAGE electrophoresis and staining with Imperial protein stain (Thermo
Scientific, Erembodegem, Belgium; as described in Roume et al.1), the protein gel was
conserved at 4 °C in the dark and under vacuum in sealing foil D0316L-20 (DOMO
ELEKTRO, Herentals, Belgium). Prior to further analysis, entire lanes were cut into 1 mm-240
slices using a grid cutter (MEE-1x5, Gel Company, San Francisco, CA, USA), yielding
approximately 70 slices per lane. Two 1 mm-slices were combined in a single well of a 96-
well V-bottom plate with a hole introduced into the bottom using a 30 gauge lancet needle
(Becton Dickinson, Franklin Lakes, NJ, USA). The wells contained size 11 black hexagonal
glass beads (SB3656, Fusion Beads) to prevent the gel pieces from clogging the hole. For in-245
gel digestion, an automated liquid handling system (Tecan EVO, Männedorf, Switzerland)
was used for reduction, alkylation, tryptic digestion and peptide extraction from the gel
pieces. After extraction, the peptide solution was dried and reconstituted in 20 µl of a
solution of 0.1 % (v/v) [trifluoroacetic acid (TFA, 5 %; Sigma) / acetonitrile (ACN, 95 %;
BioSolve, Valkenwaard, Netherlands)] in MilliQ H2O in a round bottom polypropylene 96-250
well plate (Greiner Bio-One, Monroe, NC, USA) and placed into an Eksigent Nano 2D plus
system autosampler (ABSciex, Framingham, MA, USA) for analysis.
Liquid chromatography
Peptides obtained from the 1 mm-gel bands were separated using an Eksigent Nano 2D LC
plus system employing splitless nanoflow. Reverse phase high performance liquid 255
chromatography (RP-HPLC) and separation columns were prepared in-house by packing a
Kasil fritted capillary [360 µm outer diameter (OD), 75 µm inner diameter (ID)] with a 1 cm
bed of ReproSil Pur C18-AQ 3 µm 120 Å stationary phase (Dr. Maisch GmbH, Ammerbuch,
13
Germany) for the sample trap and desalting column. A Kasil fritted capillary (360 µm OD,
75 µm ID) was packed with a 15 cm bed of the same stationary phase as the separation 260
column and this was connected to a PicoTip emmiter (360 µm OD x 20 µm ID, Tip 10 µm,
FS360-20-10-N-20) for nano-electrospray ionisation. For each LC run, the sample was
injected for 10 minutes at 2.5 µl/min with loading buffer (2 % v/v acetonitrile and 0.1 % v/v
formic acid). The sample was separated by a linear gradient changing from 98 % solvent A
(0.1 % v/v formic acid in water) and 2 % solvent B (0.1 % v/v formic acid in acetonitrile) to 265
40 % A and 60 % B in 60 min at 0.3 µl/min.
Mass spectrometry
Following LC separation, the peptides were analysed on a LTQ-Velos Orbitrap (Thermo-
Fisher, San Jose, CA, USA). MS1 data were collected over the range of 300 – 2 000 m/z in
the Orbitrap at a resolution of 30 000. Fourier-transform mass spectrometry (FTMS) preview 270
scan and predictive automatic gain control (pAGC) were enabled. The full scan FTMS target
ion volume was 1 x 106 with a maximum fill time of 500 ms. MS2 data were collected in the
LTQ-Velos with a target ion volume of 1 x 104 and a maximum fill time of 100 ms. The 10
most intense peaks were selected (within a window of 2.0 Da) for higher-energy collisional
dissociation at 15 000 resolution in the Orbitrap. Dynamic exclusion was enabled in order to 275
exclude an observed precursor for 180 s after two observations. The dynamic exclusion list
size was set at the maximum 500 and the exclusion width was set at ±5 ppm based on
precursor mass. Monoisotopic precursor selection and charge state rejection were enabled to
reject precursors with z = +1 or unassigned charge state.
280
14
Protein identification
For MS analysis, Thermo .RAW files were converted to mzXML format using MSConvert
(ProteoWizard16) and searched with X!Tandem17 version 2011.12.01.1. Spectra were
searched against the metagenomic and metatranscriptomic data, common lab protein
contaminants, and decoys. Redundancy was removed from these three data sets using
BlastClust. The contaminant database was a modified version of the common Repository of 285
Adventitious Proteins (cRAP, www.thegpm.org/crap) with the Sigma Universal Standard
Proteins removed and human angiotensin II and [Glu-1] fibrinopeptide B (MS test peptides)
added, for a total of 66 entries. Decoys were generated with Mimic (www.kaell.org), which
randomly shuffles peptide sequences between tryptic residues, but retains peptide sequence
homology in decoy entries. 290
Search criteria used for X!Tandem included a precursor mass tolerance of 15 ppm and a
fragment mass tolerance of 15 ppm for higher-energy collisional dissociation spectra.
Peptides were assumed to be semi-tryptic (cleavage after K or R except when followed by P),
but semi-tryptic peptides with up to 2 missed cleavages were allowed. The search parameters
included a static modification of +57.021464 Da at C for carbamidomethylation by 295
iodoacetamide and potential modifications of +.15.994915 Da at M for oxidation, -
17.026549 Da at N-terminal Q for deamidation, and -18.010565 Da at N-terminal E for loss
of water from formation of pyro-Glu. Additionally, -17.026549 Da at the N-terminal
carbamidomethylated C for deamidation from formation of S-carbamoylmethylcysteine and
N-terminal acetylation were searched. Peptide spectrum matches (PSMs) obtained from 300
X!Tandem were validated using the Trans Proteomic Pipeline18 version 4.6 Rev.1. The PSMs
were analysed with PeptideProphet19 to assign each PSM a probability of being correct.
Accurate mass binning was employed to promote PSMs whose theoretical mass closely
matched the observed mass of the precursor ion, and to correct for any systematic mass
15
errors. Decoys and the non-parametric model option were used to improve PSM scoring. 305
Protein identifications were inferred with ProteinProphet. The ProteinProphet scores were
then analysed in iProphet20, which combines results from multiple fractions and multiple
database searches (although here, only X!Tandem was used) and assigns a probability for
each unique protein and its corresponding peptide sequences. The false discovery rate for a
given iProphet probability was calculated using the number of decoy protein inferences at 310
that probability. Only proteins identified at iProphet probabilities corresponding to a false
discovery rate (FDR) less than 1.0 % were further considered.
KO annotations and protein quantification
The corresponding sequences of the identified proteins were collected in FASTA format from
the non-redundant single peptides/protein sequences database previously generated from the 315
combined metagenome and metatranscriptome assemblies. Relative protein quantitation was
performed using the normalized spectral index (NSI) measure using an in-house software tool
called NSICalc as previously described21. The amino acid sequences of identified proteins
were then mapped to the KO library (KEGG database version 64.0) using BLAT15 (e-
value<10-5, %identity>50, score>50). From the resulting list of KOs, the frequency of each 320
KO was determined at the protein level, using an in-house developed Perl script.
Gene copy and transcript abundances
To account for differences in read depth and sampling, the number of raw sequence reads
from autumn and winter metagenome and metatranscriptome libraries were equilibrated by
randomly selecting reads in the larger libraries from the autumn sample to mirror the number 325
of reads in the smaller library (winter) using an in-house Perl script based on the ‘shuffle’
16
method from the ‘List::Util’ CPAN package. This resulted in 14 546 374 reads and
16 443 761 reads being used from the metagenomic and metatranscriptomic libraries,
respectively. The four balanced raw read sequence libraries were then mapped separately to
the combined assembly for both sampling dates using SOAPaligner8 with the following 330
parameter settings: -r 2 -M 4 -l 30 -v 10 -p 8.
For each library, reads were mapped to genes and counted, except for reads mapping to
multiple genes, for which weighted proportions were used. Next, to obtain the abundances of
genes and transcripts, read counts were normalized by the length of the respective gene
sequences22, to obtain normalized gene copy abundances and normalized transcript 335
abundances, respectively. Normalized gene copy abundances per KO were obtained by
calculating the sum of normalized gene copy abundances from all genes belonging to the
same KO group. Similarly, the KO-wise transcript abundances were calculated as the sum of
normalized transcript abundances over all genes within the same KO.
340
Relative gene expression
Relative expression of KOs was determined by dividing the normalized transcript abundance
of each KO by the inferred gene copy abundance of the same KO23.
The relative expression of a KO is greatly dependent on the normalized gene copy
abundance, if this value is close to zero. Similarly, KOs with normalized gene copy
abundances close to zero are prone to be falsely identified as highly expressed due to their 345
very low gene copy abundances. Therefore, highly expressed KOs were selected based on
normalized gene copy and transcript abundances, in addition to relative expression. KOs
were considered highly expressed, if their relative expression was above the 90th percentile.
In addition, to avoid false positive identification of highly expressed KOs, the normalized
17
transcript abundances of highly expressed KOs had to be above the 3rd quartile of the 350
normalized transcript abundances of all KOs or normalized gene copy abundances had to be
above an empirically determined threshold. This threshold was determined by sorting KOs by
their normalized gene copy abundances and applying a sliding window approach to
determine the lowest normalized gene copy abundance with an average relative expression
robustly within the interquartile range (see Supplementary Figure 2). 355
Sensitivity of gene expression analysis to imposed cut-offs
Results of the analysis of KOs exhibiting high relative expression are dependent on the
quantile of KOs considered highly expressed (we considered KOs with a relative expression
above the 90th percentile), as well as the lower cut-off which was set for gene copy
abundances to avoid false positive identification of KOs exhibiting very low gene copy 360
abundances and only slightly higher transcript abundances as highly expressed. Therefore, we
first analysed, if the numbers and identities of KOs identified based on their relative
expression were robust to different levels of noise added to the data. In addition, we analysed
whether the conclusions would also be robust to small to moderate changes in the selected
cut-off values. 365
To address the first point (robustness to noise), we changed the gene copy and transcript
abundances by a random number following a uniform distribution within different limits. The
lowest of these limits corresponded to +/- 1 read mapped per kilobase of metagenomic
sequence and the highest to +/- 50 reads mapped per kilobase. We then analysed the numbers
and identities of the highly expressed genes in 100 repetitions of each test, given the chosen 370
cut-offs and the identities of enriched pathways within the selected sets of KOs. We
compared the results to the same analysis carried out using a single numerical cut-off for
18
relative expression (minimal relative transcript abundance = 10 times relative gene copy
abundance).
To address the second point (robustness to changes in cut-offs), we changed each cut-off 375
within small to moderate limits. For the inclusion of KOs irrespective of their gene copy
abundances, cut-offs at steps between the 55th to 95th percentile of transcript abundances were
used. For the exclusion of KOs with low gene copy abundances, different cut-offs of robust
relative expression were set between the 20th and 95th percentile. We then analysed the
numbers and identities of the KOs found to be highly expressed, as well as the pathways 380
enriched in these KOs.
Comparison of the metagenomic dataset with the metatranscriptomic and metaproteomic
datasets
The congruency of metagenomic and metatranscriptomic datasets was determined by
calculating the proportion of KOs with at least one gene having at least 10 metagenomic
reads per kilobase mapping to it and also at least one gene resulting in the mapping of 10 385
metatranscriptomic reads per kilobase. The congruency of the metagenomic and
metaproteomic datasets was calculated analogously as the proportion of KOs with at least one
gene mapped by at least 10 metagenomic reads per kilobase that also had at least one gene
identified at the protein level.
390
Analysis of pathway membership
Assignment of KOs to KEGG pathways was done by using the KO to pathway link
(http://rest.kegg.jp/link/Ko/pathway) in the KEGG database version 67.1. Enrichment of KOs
19
in specific pathways was tested using a hypergeometric test and p-values were adjusted using
FDR-control24. Test results with adjusted p-values below 0.05 were considered significant.
395
Community-wide metabolic network reconstructions
The KO to R (reaction) link (http://rest.kegg.jp/link/Ko/reaction) was used to associate each
individual KO to the corresponding reactions following the R to RP (reaction pair) link
(http://rest.kegg.jp/link/reaction/RP), thereby, associating individual reactions to their
corresponding main reaction pairs. The RPAIR annotation was specifically chosen to ignore
unspecific compounds of reactions (water, energy carriers and cofactors), thereby, only 400
taking into account the main compounds of a reaction. The reaction pairs (RP) were then
further selected by using only RPAIRs with assigned reaction classes
(http://rest.kegg.jp/link/rn/rc). Finally, the reaction pair - compounds link
(http://rest.kegg.jp/list/RP) was used to associate individual RPs to corresponding pair(s) of
compounds. As some KOs have identical compounds (e.g. subunits of the same enzyme or 405
enzymes that catalyse each other’s reverse reaction), KOs with identical compounds (as
annotated in the KEGG database version 67.1) were grouped. A KO network graph was built
by using all KOs as nodes. Edges between two KOs were introduced if a product metabolite
of one KO was found as a substrate metabolite of the other KO. Multiple edges between the
same KOs were reduced into a single edge. For topological analysis of the reconstructed 410
metabolic network, nodes sharing the exact same edges were regrouped to be represented as a
single node. Multiple edges connecting the same nodes were likewise combined into a single
edge. The undirected network graph was visualized and analysed using Cytoscape25,
employing a spring embedded layout. Singleton nodes not connected to any other node were
removed. 415
20
Calculation of gene copy and transcript number and relative expression of regrouped nodes
Gene copy numbers and transcript numbers of nodes which represented several KOs due to
their sharing of the same edges were calculated by summing all normalized gene copy and
transcript numbers, respectively. Relative expression of a node was calculated by dividing the 420
node-wise sum of normalized transcript abundances by the node-wise sum of normalized
gene copy abundances, thereby levelling relative expression of the regrouped KOs.
Identification of choke points
Choke points as defined by Rahman and Schomburg26 are enzymes which consume or
produce unique metabolites and possess a high load score in the metabolic network 425
reconstruction. To assess whether a node within the metabolic network reconstructions could
qualify as choke point, the number of edges representing every metabolite was counted,
yielding the number of occurrence. In the cases where an edge represented more than one
metabolite, the lowest number of occurrence was assigned to this edge. Every node was also
assigned the lowest number of occurrence of all its edges. Nodes with an assigned number of 430
occurrence of 1 were considered potential choke points. The number of occurrence for every
node is listed and potential choke points are highlighted in Supplementary Dataset 7.
Weighted load score
Weighting of compounds in a bi-partite RPAIR-based metabolic network reconstruction has
been shown to increase pathfinding accuracy27. The number of occurrence described above 435
was assigned as edge weight within the metabolic network reconstructions. A weighted
betweenness centrality was calculated using the R-package igraph28. Alternative weighted
load scores were calculated for each node from the weighted betweenness centralities and the
21
degree as defined in the manuscript, and potential key functionalities were determined using
this measure and expression as described in the main manuscript. The resulting weighted 440
load scores and the identities of the key nodes were compared to the unweighted results
discussed in the manuscript.
Matching of genes to Candidatus Microthrix parvicella Bio17 genome
Amino acid sequences were aligned using BLAT15 with cut-offs chosen as follows: e-value <
10-5, %-identity > 50, score > 50. 445
Alignment of contigs encoding key functionalities
Contigs were selected based on the following criteria: (1) they encoded a gene annotated with
a KO representing a key functionality, and (2) the expression of this gene was corroborated
by at least one mapped metatranscriptomic read. Selected contigs were aligned to the NCBI
non-redundant nucleotide database using BLASTn with default parameters29. The best hit with 450
a query coverage above 50 % and a percentage identity above 80 % for each contig was
documented. In addition, contigs containing genes annotated as K03921 were aligned to 85
isolate genomes from the same biological wastewater treatment (BWWT) plant using BLAST
and selecting only results with percentage identity above 80 %.
455
Quantification of isolate sequences in combined metagenomic and metatranscriptomic
assemblies
Reads from the balanced libraries (see section Expression analysis and contextualization of
omic datasets - gene copy and transcript abundances) were mapped against the genome of
22
Nitrosomonas sp. Is79 (Ref. 30) and Isolate LCSB065 using SOAPaligner8 with the
following parameter settings: -r 2 -M 4 -l 30 -v 10 -p 8 and mapped reads were counted.
460
Ammonia monooxygenase (AMO) contig extension
Genes encoding subunits A and B of ammonia monooxygenase (amoA and amoB) are
established phylogenetic markers31. However, none of the contigs from the combined autumn
and winter assembly that contained an open reading frame annotated as encoding for a
subunit of AMO (K10944, K10945, or K10946) harboured a complete gene. In order to
recover a full gene sequence and determine the position of the amoA genes recovered from 465
the combined metagenomic and metatranscriptomic data relative to other known amoA
sequences within a phylogenetic tree, we employed a contig extension protocol in order to
increase the length of the contigs via extension and merging of these contigs.
The contig extension protocol was carried out in a step-wise fashion including contig
extension, gene calling and gene annotation. Contig alignment, merging and extension was 470
performed using minimus2 (Ref. 32) from the AMOS suite33 with a minimum overlap of 60
bases at 98 % identity. The contig extension protocol was performed in three steps: i) contigs
were extended by aligning contigs annotated with the same KO IDs; ii) contigs were
extended by aligning contigs annotated with KOs K10944, K10945 or K10946; iii) contigs
were extended using a high-depth metagenomic assembly (see section Generation of 475
additional metagenomic data for contig extension and analysis) from a different sample,
using the AMO contigs from the previous step as a reference. This procedure was performed,
because the genes encoding the subunits of AMO are known to exist in a cluster/operon34.
Following the run on minimus2, gene calling was performed on the resultant contigs using
FragGeneScan13 and Prodigal14 using default parameters as previously used. Predicted 480
23
amino acid sequences were then merged and made non-redundant based on 100 % sequence
identity using CD-HIT35 and were re-annotated with KO IDs using our annotation pipeline.
Gene calling and re-annotation steps were performed to ensure that the extended contigs
retained their original annotation reference. The extended contigs were then used for
downstream analysis in order to associate these AMO genes to a bacterial species. 485
AmoA phylogenetic analysis
Nearly complete amino acid sequences (201 – 274 amino acids) of AmoA and/or MmoA
from representative organisms belonging to the beta-Proteobacteria, gamma-Proteobacteria
and archaea were retrieved from the Refseq protein database (see Supplementary Table 3)
and aligned with ClustalOmega (using default parameters). The alignment file was submitted 490
to a phylogenetic analysis using the Phylogeny.fr customized workflow service36 including
alignment curation with Gblocks37 (using default parameters), tree construction with PhyML38
(bootstrap of 100), and visualization by TreeDyn39.
Isolate LCSB065 isolation
Isolate LCSB065 was obtained from an OMMC biomass sample diluted by a factor of 104. 495
The biomass was first cultivated on Petri dishes of wastewater-agar medium (1.5 % agar;
w/v) in 800 ml filtered (0.2 µm, Sartorius, Göttingen, Germany) wastewater from the
Schifflange BWWT plant. A single colony was then transferred to a Petri dish with R2A
medium40 and cultivated at 20 °C under aerobic conditions. Isolates were grown on different
growth media recommended for the culture of bacteria from water and wastewater, 500
particularly Microthrix parvicella, such as R2A40, wastewater agar medium, MSV + peptone
and MSV A + B41 or Slijkhuis A and F42 under different growth conditions.
24
Nile red staining
Lipid inclusions were visualized using a protocol modified from Fowler & Greenspan43. A
stock solution of Nile red (Sigma-Aldrich, Diegem, Belgium) in acetone (Sigma-Aldrich) 505
was prepared at a concentration of 500 µg/ml and preserved at 4 °C protected from light.
50 µL of a working solution, containing 2.5 µl of the stock solution in 1 ml of 75 % (v/v)
glycerol, were deposited onto a microscopy glass slide with heat fixed bacterial cells. After
5 min incubation, epifluoresence and bright field microscopic observations of the same fields
of view were carried out on an inverted microscope (Nikon Ti) equipped with a 60 × oil 510
immersion Nikon Apo-Plan lambda objective (1.4 N.A). Intermediate magnification 1.5 ×
was used in order to better resolve images. Excitation light was from a Xenon arc lamp, and
the beam was passed through an Optoscan monochromator (Cairn Research, Kent, UK) with
550/20 nm selected band pass. Emitted light was reflected through a 620/60 nm bandpass
filter with a 565 dichroic connected to a cooled CCD camera (QImaging, Exi Blue). All 515
imaging data were collected and analysed using the OptoMorph (Cairn Research, Kent, UK)
and ImageJ44.
Isolate genome sequencing and genome assembly
Following DNA extraction from isolate cultures using the Power Soil DNA isolation kit (MO
BIO, Carlsbad, CA), a paired-end sequencing library with a theoretical insert size of 300 bp 520
was prepared with the AMPure XP/Size Select Buffer Protocol as previously described by
Kozarewa & Turner3, modified to allow for size-selection of fragments using the double solid
phase reversible immobilisation procedure described earlier by Rodrigue et al.45
and sequenced on an Illumina HiSeq with a read length of 100 bp. The resulting 854 683
25
paired raw reads were de-duplicated with FastUniq46, and quality filtered to a minimum 525
average QV = 30 and a minimum length of 60 bp with the trim-fastq.pl script from the
PoPoolation suite5, leading to 551 103 read pairs and 145 500 single-end reads (high quality
data yield of 73 %). Two separate preliminary assemblies were obtained with IDBA-UD47;
v1.1.0, with parameters –mink 30 –maxk 90 –step 5 –similar 98 --pre_correction); and
SPAdes 2.5.0, using the hammer read error correction module which filtered for contigs 530
shorter than 200 bp. The resulting assemblies were merged with phrap (minimum overlap of
50 bp). The merged assembly was manually inspected with Consed48 and contigs broken
where merge conflicts were detected.
The resulting contigs were uploaded to and analysed using RAST49. According to this
analysis, the isolate was a Rhodococcus sp. Similarity of contigs to bacterial genomes (NCBI 535
database, accessed 6th March 2014) were assessed by BLAT15. The bitscore of each hit was
recorded and only contigs with a hit to Rhodococcus spp. with a bitscore within 60 % of the
best hit’s bitscore were selected for the final set. This led to the removal of 120 contigs most
of which had low coverage of reads.
Filtered reads were mapped onto the assembled contigs using BWA50 with default parameters. 540
Reads mapping with mapping quality scores at or above five were used to assess contig
coverage (Supplementary Dataset 8). The final set of contigs was submitted to AmphoraNet
server to determine the isolate species searching for 31 phylogenetic marker genes51, as well
as to RAST49 (accession number 6666666.64457) and annotated.
The COG protein profile of Isolate LCSB065 was determined as previously described by 545
Muller et al.21. Annotation of KOs was carried out on protein predictions from RAST as
described for metagenomic proteins.
26
27
Supplementary Results and Discussion
Sensitivity of gene expression analysis to imposed cut-offs
We found that the cut-offs imposed to avoid false-positives within the highly expressed genes 550
and genes encoding key functionalities led to a greater robustness against noise than would
be observed if simple numerical cut-offs had been chosen. This was obvious from the
simulation of noise, in which the number and variance of highly expressed genes grew with
the noise, when a numerical cut-off was chosen, whereas the choice of cut-offs, as defined
herein, resulted in mostly stable numbers (Supplementary Figure 3a&b). 555
The identities and numbers of KOs identified as exhibiting high relative expression in our
datasets were not overly sensitive to the cut-offs imposed to avoid false-positive results. The
numbers of highly expressed KOs decreased by less than 20 % by the exclusion of KOs with
low gene copy abundances (Supplementary Figure 3c). As most genes with very high
transcript abundances did not have low gene copy numbers (Figure 3), the cut-off selected 560
for inclusion of KOs with high transcript abundances irrespective of their gene copy numbers
changed the total number of highly expressed KOs by less than 5 %. If the transcript
abundance cut-off for genes with low gene copy abundances was set between the 55th and
95th percentile (our default value was the 75th percentile), an enrichment with the KOs above
the 90th percentile of the highly expressed KOs of 4 or 5 out of the 5 pathways in autumn, and 565
5 out of the 6 pathways in winter was consistently detected (Supplementary Figure 3d).
Only few additional pathways were enriched after variation of this cut-off (Supplementary
Figure 3e). In particular, the findings of pathways ko00910 “Nitrogen metabolism” and
ko00190 “Oxidative phosphorylation” as exhibiting overall high levels of gene expression
were resilient to changes in cut-offs. 570
28
In conclusion, the described gene expression analysis is robust to noise, as well as too small
to moderate changes in the chosen cut-offs.
Effect of regrouping redundant KOs into single nodes
To carry out a topological analysis of the reconstructed metabolic network, nodes and edges
were rendered non-redundant, by representing multiple KOs with identical substrate and 575
product metabolites as a single node. Due to this step, 229 and 220 nodes representing more
than one KO were part of the autumn and winter metabolic network reconstructions,
respectively. The calculated load scores were overall only mildly affected by the regrouping
(Spearman correlation of load scores in the redundant and non-redundant autumn or winter
network reconstructions: 0.98). 70 % of key functionalities identified in the non-redundant 580
network were shared between both autumn network reconstructions (40 % for the winter
network reconstructions), as reported in Supplementary Dataset 7. The rationale behind
regrouping redundant KOs into single nodes was based on the fact that most KOs represent
subunits of enzyme complexes, which do not work in parallel, but rather cooperatively in
metabolizing substrates. Consequently, some of the additional nodes found in the networks 585
with regrouped redundant KOs represent multi-subunit complexes, such as AMO, and we
therefore believe that the practice of regrouping redundant KOs into single nodes is
warranted.
Genomic analysis of Isolate LCSB065
Mapping of filtered reads onto Isolate LCSB065’s contigs revealed a mean/standard 590
deviation empirical sequencing insert size of 244±43 bp, and a mean read depth per mapped
position (coverage) of 27±11x (median 25x).
29
As a first approach to analyse Isolate LCSB065’s genetic potential, protein coding genes
were annotated with KOs as before for the metagenomic and metatranscriptomic sequences.
Of utmost interest, out of 3 373 protein coding genes that could be annotated with KOs, 420 595
genes (12.5 %) were annotated as KOs belonging to “Lipid metabolism”, according to KEGG
Orthology. This relates to rank 4 behind “Amino acid metabolism” (19.1 %), “Carbohydrate
metabolism” (16.2 %) and “Xenobiotics biodegradation and metabolism” (14.0 %). Similar
results were obtained from COG categories52 and SEED subsystems categories of the
predicted proteins53. The assembled genome also encodes genes for the synthesis and 600
polymerisation of poly-hydroxybutyrate (PHB, Supplementary Dataset 8 indicated in
Figure 5b) and for the synthesis of TAGs (Supplementary Dataset 8 indicated in Figure
5b), inclusions of which are visible following Nile Red staining (see Supplementary Figure
8). Nine other genes beside the gene matching to the three metagenomic contigs encoding
acyl-[acyl-carrier protein] desaturases are annotated as desaturases in the isolate genome. 605
The genomic region of the gene matching to the three metagenomic contigs encoding acyl-
[acyl-carrier protein] desaturases was assessed. The gene is the first gene of a syntenous
block of four genes present in Rhodococcus jostii RHA1, Nocardia farcinica IF, Godronia
bronchialis and Tsukamurella paurometabola DSM 20162. This block encodes the
homologous fatty acid desaturase (peg.6927), followed by a tRNA dihydrouridine 610
synthase B, a cell envelope-associated transcriptional attenuator LytR-CpsA-Psr of the
subfamily A1 and a phosphate transport system regulatory protein PhoU. The genome of
Isolate LCSB065 furthermore contains six genes coding for lipases with an export signal
peptide, thereby reinforcing its potential keystone role in the community (Supplementary
Dataset 8). The genomic enrichment of genes involved in lipid metabolism suggests that the 615
isolated Rhodococcus sp. may occupy a function as keystone species within the sampled
OMMCs. Additional work is required to elucidate the exact role of this organismal group
30
within the OMMC community.
31
620
List of Supplementary Figures
Supplementary Figure 1 Overview of the assembly and annotation pipeline.
Supplementary Figure 2 Determination of lower thresholds for gene
abundances for the selection of highly expressed
KOs within the reconstructed community-wide
metabolic networks.
Supplementary Figure 3 Results of the sensitivity analyses.
Supplementary Figure 4 Expression of KOs in metabolic pathways at the
protein level.
Supplementary Figure 5 Generalized OMMC-wide metabolic network
reconstructed from the combined metagenomic and
metatranscriptomic data of OMMCs sampled in
autumn and winter.
Supplementary Figure 6 Representation of the fatty acid metabolic pathway
(ko01212).
Supplementary Figure 7 OMMC-wide metabolic networks reconstructed
from metagenomic and metatranscriptomic data of
OMMCs sampled in autumn and winter.
Supplementary Figure 8 Micrographs of Isolate LCSB065.
32
List of Supplementary Tables
Supplementary Table 1 Physicochemical characteristics of the wastewater at
the time of sampling in the anoxic tank of the
Schifflange BWWT plant.
Supplementary Table 2 Quantitative and qualitative characteristics of
biomacromolecular fractions sequentially isolated
from the OMMC samples.
Supplementary Table 3 Statistics of the combined assembly of the
metagenomic and metatranscriptomic sequence
datasets.
Supplementary Table 4 Ammonia-oxidizing organisms with corresponding
accession numbers of amoA genes used for the
reconstruction of the phylogenetic tree.
Supplementary Table 5 Numbers of metagenomic and metatranscriptomic
reads from autumn and winter dates mapping to the
isolate genomes of Nitrosomonas sp. Is79 and
Isolate LCSB065.
33
List of Supplementary Datasets
Supplementary Dataset 1 Dominant genera in the autumn and winter samples
determined based on reconstruction of 16S rRNA
gene sequences from the metagenomic data.
Supplementary Dataset 2 KO gene copy (KOGA), transcript (KOTA) and
protein (NSI) abundances in datasets from autumn
(04 October 2010) and winter (25 January 2011)
samples.
Supplementary Dataset 3 Highly expressed metabolic KOs in the (a) autumn
and (b) winter sample and pathways overrepresented
by metabolic KOs highly expressed in the (c)
autumn or (d) winter sample.
Supplementary Dataset 4 Generalized microbial community-level
reconstructed metabolic network reconstructed
using the combined metagenomic and
metatranscriptomic datasets in simple interaction
format.
Supplementary Dataset 5 Autumn-specific reconstructed microbial
community-level metabolic network in simple
interaction format.
Supplementary Dataset 6 Winter-specific reconstructed microbial community-
level metabolic network in simple interaction
34
format.
Supplementary Dataset 7 Results of topological analyses. List of all nodes in
the metabolic networks reconstructed from OMMCs
sampled in (a) autumn and (b) winter with
topological measures and expression values. (c)
KOs with betweenness centrality values different
between metabolic networks reconstructed from
OMMCs sampled in autumn and winter, and (d)
pathways enriched in these KOs. KOs with key
functionality in OMMCs sampled in (e) autumn or
(f) winter including topological analysis results,
gene abundances and expression, as well as protein
abundances and pathway membership. (g) Results of
BLAST searches of all genes encoding key
functionalities against publicly available bacterial
genomes.
Supplementary Dataset 8 Summary of characteristics of Isolate LCSB065’s
genome, results of the phylogenetic analysis, and
coverage of Isolate LCSB065 contigs by reads from
the isolate genome sequencing.
35
Supplementary Figures
Supplementary Figure 1 Overview of the assembly and annotation pipeline. Annotated KOs
were used for the subsequent metabolic network reconstructions. QC: quality control, 1-7:
steps in the pipeline.
1
2
3
4
5
6
metagenomic reads metatranscriptomic reads
PANDAseq + trim-fastq.pl: joverlaps joined and trimmed, QC
MOCAT assembly pipeline: combined assembly of metagenomic and metatranscriptomic reads
SOAP: reads joined by PANDAseq mapped to MOCAT contigs to recruit unassembled reads
autumn sample winter sample autumn sample winter sample
FragGeneScan: genes called on contigs < 500 nt
Prodigal: genes called on contigs ≥ 500 nt
CD-HIT: processed reads made non-redundant
7
CD-HIT: predicted amino acid sequences made non-redundant
BLAST-X and KEGG: enzyme annotation (e-value < 10-5, %identity > 50, score > 50)
K00001 K00002 K00003 K00004 K00005 …!
nucleotide sequence amino acid sequence legend: KXXXXX KO identifier!
contigs > 150 nt
36
Supplementary Figure 2 Determination of lower thresholds for gene abundances for the
selection of highly expressed KOs within the reconstructed community-wide metabolic
networks. (a) and (b) Moving median of KO relative expression (relExp) versus gene copy
abundance (KOGA) for (a) autumn and (b) winter. (c) and (d) KO relative expression
(relExp) versus gene copy abundance (KOGA) for (a) autumn and (b) winter. Vertical lines
indicate lowest gene abundances with robust gene expression values within the interquartile
range.
a b
c d
●
●
●
●
●●
●
●●
●
●●
●●●● ●●●●●●●●●●●●
●●
●
●● ●●●●●●●●●●●●●●●
●
●
●
●
●
●
●●●●
●●●●●●
●
●●●●●
●●●●●●●●●●●●
●
●●●●●
●
●●
●●
●
●●●
●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●
●●●
●●
●●●
●
●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●
●●
●●●●
●●●●
●
●
●
●●
●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●
●●●●
●
●●●●●●●●●●●
●
●●●●●●
●
●●●●●●●●●●
●●
●●
●●●●
●●●●●●
●
●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●●
●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●
●
●●●●●●
●
●●●●
●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●
●●
●●
●●●●●●●●●●●●●●●●●
●
●●●●●
●
●●●●●●●●●●●●
●
●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●
●●
●
●
●●
●●
●
●
●
●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●●●●●●
●●●●●●
●
1e−03 1e−01 1e+010.1
0.51.0
5.010.0
50.0100.0
mov
ing
med
ian
wint
er re
lExp
rwinter KOGA
●
●
●
●
●
●●
●
●●●
●
●●●●●
●
●●● ●●
●●●●●●●●●●●●●
●●●
●●●●●●●●
●
●●●
●
●
●●
●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●
●●●●
●●●●
●
●●●●●●●●●●●●●●●●●●●
●
●
●●●●●●●●●●
●●
●●
●●●
●●●●●●●●●●●●●●
●
●
●●●●●●●●●●
●
●
●
●●●
●●
●●●
●●●●●●●●●●
●●
●●●●●●
●●●●
●
●
●
●
●
●●
●
●●
●●●●●
●●
●●●●●●●●●
●●●●●●●
●
●
●●●●
●●●●
●
●
●
●●●●●●●●●●●●●●●●●●●●●●●●
●●
●●
●●●●●
●●●●●●●●●●●●
●
●●●●
●●●
●
●
●
●
●
●●●●●
●●●●●●●●●●●●●
●●●●●
●
●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●
●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●●●
●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●
●
●●●●
●●
●●●●●●●●●●●●●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●
●
●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●
1e−03 1e−01 1e+01
0.2
0.51.02.0
5.010.020.0
50.0100.0
mov
ing
med
ian
autu
mn
relE
xpr
autumn KOGA
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●●●●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●●●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●●
●●
●
●●
●
●
●
●●
●
●
●●●
●
●
●●
●●
●●
●
●
●
●
●
●
●
●
●●●●
●
●
●●●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●●●
●
●●
●●●●
●
●
●●●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●●
●●
●
●
●
●●
●
●
●●
●●
●●
●
●●
●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●●
●●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●●
●●
●
●
●
●
●
●
●
●●●
●
●●
●
●
●
●
●
●●
●
●
●●
●
●●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●
●●●●
●●●
●●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●●●
●
●●
●
●
●
●
●
●
●
●
●
●
●●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●●
●
●●
●
●●●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●●●
●
●
●
●●●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●●
●
●
●
●
●
●
●●
●●●
●
●
●
●
●
●
●●●
●
●
●
●
●
●
●
●
●●
●●
●
●
●●●
●●
●●
●
●●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●●●●
●
●
●●
●
●●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●●
●
●
●
●
●●●●●●●
●
●
●
●
●
●
●●
●●
●
●
●●
●
●●
●
●
●
●
●
●●●
●
●
●
●
●
●
●
●●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●●
●
●
●●
●
●
●●●
●●
●
●●●●
●
●
●
●
●●
●
●
●●●●
●
●●
●
●
●●●
●
●●
●
●●●●●●
●
●
●
●●
●
●
●●
●
●
●●
●
●●
●
●●●●
●
●
●
●●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●●
●
●
●
●●
●
●
●
●●
●
●
●
●●●
●
●
●
●
●
●
●
●
●●●
●
●●●●●
●
●●●
●
●●
●
●
●
●
●●
●●●
●●●
●
●
●●
●
●
●
●
●
●
●
●●●●●●
●●
●
●●●●
●
●
●
●
●●
●
●●
●
●●
●
●●●
●
●
●
●
●
●
●●
●●
●
●
●●
●
●
●
●
●●
●
●
●
●
●●
●
●
●●●
●
●
●●
●
●
●●
●
●
●
●
●
●
●●●
●●●●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●●●●
●●
●
●
●●
●●●●
●
●
●●
●●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●●●●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●●●
●
●●●
●
●
●
●
●
●
●
●●
●
●
●
●
●●●
●
●●●
●
1e−03 1e−01 1e+01
1e−02
1e−01
1e+00
1e+01
1e+02
autu
mn
relE
xpr
autumn KOGA
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●●
●
●
●●●
●
●
●●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●●●●●●●
●
●
●
●
●
●●●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●●●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●●
●
●
●
●
●
●●●
●
●●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●●
●●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●●
●
●●
●
●●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●●
●
●
●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●●
●
●
●
●
●
●
●●●
●●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●●●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●●
●●
●●
●●
●●
●
●●●
●●
●●
●
●
●
●
●
●
●
●
●
●●●●●
●●
●
●●●
●
●
●
●
●
●
●
●
●●
●●
●●●●●●
●
●
●
●
●●●●
●●
●●
●●●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●●●
●
●
●
●
●●
●
●
●
●
●●●
●
●
●
●
●●
●
●
●
●●
●
●●
●
●
●
●
●●●
●
●●●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●●
●●●
●
●
●
●
●●●●
●
●
●●
●●●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●●●
●
●
●
●
●
●
●●
●
●
●
●
●●
●●●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●●
●
●
●
●
●
●●●
●
●
●
●
●
●●●
●
●●
●●
●
●
●
●●
●
●
●
●
●
●
●●●●
●●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●●●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●●●●●
●
●
●
●●●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●
●●
●●
●●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●●●
●●
●●
●
●
●
●●●●
●
●
●
●
●
●
●
●
●●●
●
●
●●
●●●
●
●
●●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●●
●
●●
●
●
●
●
●
●
●●●●
●
●
●
●
●●
●●
●●
●
●
●
●
●
●
●
●
●
●●●
●●
●
●
●
●●●
●
●
●
●●
●
●
●
●
●
●
●●●
●●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●●
●●●
●
●
●
●
●
●
●●●
●●
1e−03 1e−01 1e+01
1e−02
1e−01
1e+00
1e+01
1e+02
wint
er re
lExp
r
winter KOGA
37
Supplementary Figure 3 Results of the sensitivity analyses. (a) and (b) Effect of noise on
the number and identity of genes with high relative expression determined by comparing a
simple numerical cut-off and our method for the reduction of false-positives by excluding
genes with very low gene copy abundances. Filled boxes indicate total number of genes with
high relative expression, boxes with blue or brown lines indicate the sizes of the intersect of
genes with high relative expression without noise and after addition of noise; (a) autumn
dataset, (b) winter dataset. (c) Effect of changing the cut-offs for exclusion of genes with low
gene copy abundances to reduce false-positives on numbers of genes with a high relative
expression. (d) and (e) Effect of changing the cut-offs for exclusion of genes with low gene
copy abundances to reduce false-positives on the identities of pathways enriched with highly
expressed genes. (d) Number of pathways enriched with highly expressed genes that are
found at the chosen cut-off (0.75) and after variation of the cut-off. (e) Number of additional
a b
c d e
0.5
0.55 0.
60.
65 0.7
0.75 0.
80.
85 0.9
0.95 1
addi
tiona
l enr
iche
d pa
thwa
ys0
1
2
3
4
cut−off for exclusion of geneswith low gene copy abundance
autumn winter
0.5
0.55 0.
60.
65 0.7
0.75 0.
80.
85 0.9
0.95 1
cons
erve
d en
riche
d pa
thwa
ys
0
2
4
6
8
cut−off for exclusion of geneswith low gene copy abundance
autumn winter
0.5
0.55 0.
60.
65 0.7
0.75 0.
80.
85 0.9
0.95 1
high
ly e
xpre
ssed
gen
es
0
50
100
150
200
250
cut−off for exclusion of geneswith low gene copy abundance
autumn winter
●
●
●●
●
●●
●
●
1 2 3 4 5 7 10 15 20 50
0
100
200
300
400
num
ber o
f gen
es
●
●
●
●
●●
noise [reads per kb]
simple cut−off
●● ● ●
●●●●●●● ● ●● ●
1 2 3 4 5 7 10 15 20 50
0
100
200
300
400
num
ber o
f gen
es
●
●● ●●
●●
●
noise [reads per kb]
exclusion of genes withlow copy abundance
●●● ●● ●
●
● ●
●●
●
1 2 3 4 5 7 10 15 20 50
0
100
200
300
400
num
ber o
f gen
es ●
●●
● ●●
●
noise [reads per kb]
simple cut−off
●
●●
1 2 3 4 5 7 10 15 20 50
0
100
200
300
400
num
ber o
f gen
es
● ● ● ●
noise [reads per kb]
exclusion of genes withlow copy abundance
38
pathways enriched in highly expressed genes, which are not found at the chosen cut-off but
after relaxation of the cut-off.
625
39
Supplementary Figure 4 Expression of KOs in metabolic pathways at the protein level. (a)
and (b) Comparison of gene copy abundances (KOGA) and protein abundances (NSI) in the
(a) autumn and (b) winter samples. (c) and (d) Comparison of gene transcript abundances
(KOTA) and protein abundances (NSI) in the (c) autumn and (d) winter samples. (e) and (f)
Comparison of expression values relative to gene copy numbers (KOGA), transcript
expression levels (KOTA) and protein abundances (rel. protein exp.) in the (e) autumn and (f)
winter samples. (c to f) Highly expressed KOs are highlighted in red.
a
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
● ●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
● ●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●●
●●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
● ●
●
●
●●●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
● ●●
● ●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●● ●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●●●●
●
●
1e−03 1e−01 1e+01 1e+031e−07
1e−05
1e−03
1e−01
autu
mn
prot
eom
ics N
SI
autumn KOGA
●
●
●
●
●
●
●
●●
●
●● ●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●●
● ●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●●
●
●
●
●
●
●●●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●● ●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●●●●
●
●
●
●
1e−03 1e−01 1e+01 1e+031e−07
1e−05
1e−03
1e−01
wint
er p
rote
omics
NSI
winter KOGA
b
c
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
● ●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●●
●●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
● ●
●
●
●●●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
● ●●
● ●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●● ●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●●
●
●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●●
●
●
●
●
●
●
●
●●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●●●●
●
●
1e−03 1e−01 1e+01 1e+031e−07
1e−05
1e−03
1e−01
autu
mn
prot
eom
ics N
SI
autumn KOTA
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●
● ●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●●●●●
e
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●●
●
●●●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●● ●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
● ●
●●●●
●
●
●●
●
●
●
●●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●●
●
●
●
●
●
●
● ●
●●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
● ●
●
●
●
●
●
●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●●●●
●
●
1e−03 1e−01 1e+01 1e+031e−07
1e−05
1e−03
1e−01
autu
mn
rel.
prot
ein
expr
.
autumn relExp
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●●
●●
●
●
●
●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●●●
●
●
●
●●
●
●
●
●●●●●●
d
f
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●● ●
●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●●●●
●
● ●
●●
●
●●
●●
●● ●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●●
●●
●
●
●●
1e−03 1e−01 1e+01 1e+031e−07
1e−05
1e−03
1e−01
win
ter r
el. p
rote
in e
xpr.
winter relExp
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●●
●
●
●
●●
●
●
●●
●● ●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●●
●●
●
●
●
●
●
●
●
●●
●
●● ●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
● ●
● ●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●● ●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
● ●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●●●●
●
●
●
●
1e−03 1e−01 1e+01 1e+031e−07
1e−05
1e−03
1e−01
wint
er p
rote
omics
NSI
winter KOTA
●●
●
●
●
●
●
●
●
● ●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●●
●
●
●
●●●●●
40
Supplementary Figure 5 Generalized OMMC-wide metabolic network reconstructed from
the combined metagenomic and metatranscriptomic data of OMMCs sampled in autumn and
winter. Coloured nodes indicate pathways enriched in highly expressed KOs; blue – oxidative
phosphorylation; yellow – nitrogen metabolism; red – TCA cycle; purple – glycerolipid
metabolism. Large grey nodes are highly expressed during at least one season. Opacity of
nodes indicates shortest average path length (the more transparent, the longer the path
length).
Node 4
"K12253"
"K14269"
"K04128"
"K10674"
"K08357-K08358"
"K01698"
"K02083"
"K09880"
"K06720"
"K03688"
"K03852""K00797"
"K01922"
"K01874"
"K11180-K11181"
"K00549"
"K07173"
"K13038"
"K05827"
"K01778"
"K00275"
"K01712"
"K13747"
"K01423"
"K05275"
"K08353"
"K00198-K03518-K03519-K03520" "K00192-K00195"
"K01480""K12251""K09065"
"K01932"
"K01883-K01884"
"K01799"
"K01594""K02536"
"K03343"
"K01581"
"K12254""K10764"
"K12743"
"K03119"
"K00677"
"K01761"
"K01427-K01428-K01429-K01430-K14048"
"K01740""K01046-K14675" "K01738"
"K12339"
"K00290"
"K13746"
"K13034"
"K00471"
"K11263"
"K03474"
"K01672-K01674"
"K01457"
"K14541"
"K05888"
"K06281-K06282"
"K02492"
"K08688"
"K00640"
"K00194-K00197"
"K08295""K00169-K00170-K00171-K00172"
"K00631-K03621-K08591"
"K09011"
"K14138"
"K00122-K00123-K00124-K00127-K05299-K08348-K15022"
"K00620""K00174-K00175-K00176-K00177"
"K10702""K00382"
"K01026""K04042"
"K01655-K02594" "K14681"
"K00619-K14682"
"K01062"
"K01049""K02535"
"K00467"
"K01739""K00031"
"K06193""K00166-K00167-K11381"
"K01676-K01677-K01678-K01679" "K00137"
"K13929""K05823"
"K01060""K01438"
"K00316"
"K01039-K01040"
"K00655"
"K01034-K01035" "K01640"
"K00657""K00635""K01067""K01905"
"K00997-K06133"
"K00651"
"K00641""K00164""K00658"
"K04020"
"K01647-K01648-K15230-K15231" "K15741"
"K00625-K13788-K15024"
"K01659""K10977"
"K01714"
"K00161-K00162-K00163" "K02437"
"K03779""K00158""K00681"
"K00830"
"K11717"
"K01760-K14155" "K14085"
"K00823""K00156""K01758"
"K01512""K14260" "K12256"
"K11258""K01636"
"K16164-K16165"
"K01652-K01653" "K00101"
"K00822""K00016"
"K02510""K00129""K00027-K00028-K00029"
"K00835""K01958-K01959-K01960"
"K00149""K03417""K08691"
"K01255-K01256-K01270-K07751-K11140-K11142"
"K03737""K00128""K03416""K00656"
"K00827-K14272" "K00814"
"K08967" "K00824""K03851"
"K14264""K00260-K00261-K00262-K15371" "K01952""K00264-K00265-K00266"
"K00831"
"K01437""K07806"
"K01953""K00859""K01479""K09482"
"K05597""K00292-K00293" "K01469"
"K01923"
"K01937"
"K01656""K15849"
"K00287""K00289"
"K13501"
"K00603""K01915""K01425"
"K01930""K01443""K11754"
"K00764""K02433-K02434-K02435"
"K09758" "K13990"
"K01657-K01658-K13503" "K13497"
"K03342""K00812-K00813-K11358-K14454-K14455"
"K00817""K00820"
"K01458""K00832""K01664-K01665-K13950" "K00815"
"K00604"
"K09698"
"K02224"
"K01846""K02232" "K01885"
"K12234""K01663-K02500-K02501"
"K09251""K14268"
"K13821"
"K09470"
"K14163""K00826"
"K10206""K00819"
"K01776""K00284""K01580""K01744""K00294""K00836-K15785" "K00816""K14267"
"K00840""K00606""K15372""K13566"
"K00024-K00025-K00026-K00051-K00116" "K07250-K13524" "K00841"
"K01940""K01044"
"K03918""K01697""K00548"
"K14157""K00030""K01895"
"K01452"
"K01643-K01644"
"K13559"
"K13877"
"K00109""K01908"
"K04487""K13017""K01919-K11204"
"K00821""K08969"
"K00818""K01924""K05830"
"K00252"
"K07511"
"K01715"
"K01692"
"K07516""K00255-K07512" "K15016"
"K10527"
"K00023"
"K09479"
"K07515"
"K00665""K01964-K15036" "K11533"
"K00648"
"K01961-K01962-K01963-K02160-K11262" "K00668"
"K14471-K14472"
"K10852"
"K02372"
"K00253""K11410"
"K00208"
"K01968-K01969"
"K00209"
"K02371"
"K15019"
"K01847-K01848-K01849-K11942"
"K01687"
"K05939"
"K01027-K01028-K01029" "K01965-K01966-K15052"
"K00491-K13242" "K00232"
"K07514"
"K01613"
"K01825"
"K00248-K09478"
"K00074"
"K01782""K06445"
"K00249"
"K00234-K00235-K00237-K00239-K00240-K00241-K00242-K00244-K00245-K00246-K00247"
"K00271"
"K05606"
"K01755""K01468"
"K00052"
"K02517"
"K00263"
"K01439"
"K01555-K16171"
"K01682"
"K00647-K09458" "K00667"
"K01681"
"K03801"
"K03644"
"K00215""K00611" "K01478"
"K01921"
"K00059"
"K06447"
Node 5
"K12355""K01254"
"K00507"
"K00227"
"K00083"
"K07534"
"K07535"
"K13239" "K07753"
"K09829"
"K04118"
"K02609-K02610-K02611-K02612-K02613" "K10978"
"K10258"
"K00130"
"K13779"
"K05824"
"K12658"
"K01500"
"K02549"
Node 6
"K12405"
"K10214"
"K01467""K01800"
"K01796"
"K01705"
"K02618"
"K13019"
"K01661"
"K11731"
"K07536"
"K16173"
"K01484"
"K00213"
"K01434"
"K00454"
"K06015"
"K07425"
"K00508"
"K00300"
"K00257"
"K00490"
"K08262"
"K11987"
"K05607-K13766"
"K00108-K00499"
"K03921"
"K08683"
"K01615"
"K06446"
"K13767"
"K14446"
"K14534"
"K11538"
"K04126""K14732"
"K01708"
"K04127"
"K01720"
"K01702"
"K03383"
"K00286"
"K01703-K01704"
"K01881"
"K10793-K10794"
"K00097"
"K01706"
"K01777"
"K00472"
"K01466"
"K01870"
"K10536"
"K01869"
"K14259"
"K01477"
"K01707"
"K01941"
"K03182-K03186" "K01845"
"K01716""K13777-K13778" "K10780"
"K01583-K01584-K01585-K02626"
"K01887""K00019"
"K01873"
"K01476"
"K01673"
"K00318""K01750"
"K05551""K13018"
"K00634"
"K15232"
"K04110""K00140"
"K00627""K00613"
"K03821""K14469""K15018""K01911-K14760" "K14467""K07823"
"K08764""K01578""K08692-K14067-K14451" "K15742"
"K05822""K10817""K01902-K01903"
"K06718""K01899-K01900" "K00674"
"K16007""K15913"
"K02615""K01641""K00626"
"K01897-K15013"
"K15866""K09699""K01907""K00673"
"K14371"
"K00645""K01649""K00186-K00187" "K15880"
"K00652""K14245"
"K14674""K15038""K08748""K01913"
"K01041""K01912-K02614"
"K13356""K01896""K00021-K00054" "K04116""K03823"
"K12298""K00643""K11992"
"K13745""K06130"
"K14676""K00043"
"K05287" "K15359"
"K00100""K01075"
"K00132-K04073" "K00639""K00193"
"K13922""K01638"
"K00654"
"K13923"
"K04105""K00680"
"K01442"
"K00273"
"K04072""K04021"
"K01505"
"K01637""K00157"
"K14621""K01058""K00020"
"K01031-K01032"
"K00022""K00632"
"K07508-K07509" "K07513"
"K00135-K00139"
"K01076"
"K05605"
"K13776"
"K01904"
"K01068"
"K02428"
"K03426"
"K01515-K08312"
"K01791"
"K01241""K00698"
"K02527"
"K16330""K00876"
"K00925""K01950"
"K09759"
"K01914"
"K01886""K13035"
"K00602-K01492"
"K00278"
"K01424""K03465"
"K13998""K00926"
"K11541""K00931"
"K01006-K01007"
"K11540""K12657"
"K01954-K01955-K01956" "K00901"
"K04112-K04113-K04114-K04115"
"K00949""K12659""K13941""K12524-K12525"
"K00929""K00932""K00928"
"K01522""K00939"
"K00867-K03525-K09680" "K01008"
"K00951"
"K00868"
"K00930""K00950""K00940""K14154"
"K02231""K00874"
"K10807-K10808"
"K00857""K00942"
"K00955-K13811"
"K00765-K02502" "K06164"
"K15519""K00856""K00899"
"K13930""K00943""K00860"
"K00944""K00985"
"K06982""K03338""K16328""K00948""K10353"
"K01431""K00133"
"K00394-K00395" "K00003""K00364"
"K01942-K03524"
"K15728"
"K00088"
"K00609-K00610" "K15786" "K13547"
"K12526"
"K01951""K01579"
"K01779""K03800"
"K01918"
"K00143" "K01756"
"K05957""K01939""K01876"
"K15894"
"K00145"
"K05829"
"K00599"
"K00679"
"K00772"
"K13294"
"K00759"
"K01490"
"K05907""K11212"
"K01081-K03787-K11751"
"K01120-K13298"
"K00760"
"K01928"
"K01844"
"K16054"
"K01749"
"K01929"
"K08964"
"K03340"
"K00440-K00441-K00443"
"K00329-K00330-K00331-K00332-K00333-K00334-K00335-K00336-K00337-K00338-K00339-K00340-K00341-K00342-K00343-K03878-K03879-K03880-K03881-K03882-K03883-K03884"
"K01719"
"K06042""K05895"
"K01935"
"K08963"
"K04835"
"K08351""K01012"
"K03184"
"K06126""K06134"
"K02827-K02828"
"K02230-K09882-K09883"
"K00355"
"K02304""K01599"
"K03795"
"K03185"
"K01470"
"K03365"
"K03403-K03404-K03405"
"K13016-K13020"
"K00231"
"K01772""K01473-K01474"
"K00979"
"K11185-K11186"
"K12725"
"K11207"
"K00228-K02495"
Node 3
"K12728"
"K01627"
"K03270"
"K05934""K00558"
"K05928"
"K00563-K00564-K06970-K08316-K12297"
"K03183"
"K05936" "K01843"
"K13623""K02188"
"K03428""K01611"
"K13542-K13543"
"K00833"
"K06127"
"K09187-K09188-K11419-K11420-K11423-K11429"
"K00589-K02303-K02496" "K02302"
"K03394"
"K15792"
"K00595"
"K00568"
"K01485"
"K11752"
"K00790"
"K13015"
"K01498"
"K00956-K00957-K00958"
"K12711" "K00077"
"K15998"
"K15989"
"K00773"
"K01591"
"K00788"
"K01465"
"K11121""K03707"
"K00969-K06210-K13522"
"K13421"
"K01488""K15938""K00106"
"K08693""K01486""K02473"
"K00972-K11528"
"K01489"
"K00758"
"K00757"
"K00087-K11177-K11178-K13480-K13481-K13482-K13483"
"K03783-K03784" "K01487"
"K00756""K03815"
"K01815"
"K02259"
"K02474"
"K03274"
"K01119"
"K03337"
"K00065"
"K01514-K01524" "K09018-K09024"
"K00763"
"K00767"
"K00010"
"K03462"
"K08281""K03517"
"K14447"
"K00469"
"K00207"
Node 2
"K00768"
"K00226-K00254" "K02472"
"K00762"
"K13763"
"K01493"
"K01000"
"K01239"
"K01494"
"K03269"
"K00769-K03816" "K15780"
"K01520""K02563"
"K02233"
"K07264"
"K01129""K00761-K02825"
"K02319-K02320-K02321-K02322-K02323-K02324-K02325-K02327-K02335-K02337-K02338-K02339-K02340-K02341-K02342-K02343-K02684-K03504-K03763-K14159"
"K16329"
"K01814""K11755"
"K01523"
"K03335"
"K01195"
"K01686"
"K16044"
"K01464"
"K04034-K04035"
"K01685"
"K00543"
"K00789"
"K01243""K13621"
"K00390""K00547"
"K13622"
"K01251"
"K00013"
"K01745"
"K01089""K01598"
"K01892"
"K01014"
"K01082"
"K01590"
"K00954-K02201"
"K01693"
"K01586""K00380-K00381-K00392"
"K02169"
"K00551-K00570"
"K05362"
"K00802"
"K03897"
"K00559"
"K04075"
"K04566-K04567-K04568"
"K02301"
"K00040"
"K00082""K00041"
"K01812""K03336"
"K01496"
"K14152"
"K00798"
"K03927"
"K00075"
"K02225-K02227"
"K04486-K05602" "K13388"
"K12726"
"K00387-K05301-K07147"
"K12502"
"K00588"
"K00569"
"K13384"
"K05831"
"K01925"
"K01582"
"K08968""K02228"
"K03785-K03786"
"K00973"
"K01858"
"K01819""K10708""K00066"
"K10046"
"K01711"
"K00966-K00971" "K09461"
"K01092"
"K00322-K00323-K00324-K00325"
"K01690"
"K06118"
"K00800"
"K01820"
"K00483"
"K01916"
"K00218"
"K04040"
"K11333-K11334-K11335"
"K04037-K04038-K04039"
"K00039"
"K14448"
"K02377"
"K00975"
"K01057-K07404" "K12450"
"K00354"
"K13832""K00014"
"K00012""K01784"
"K00965""K00963"
"K05358"
"K00036"
"K01792"
"K01735"
"K15917"
"K01085"
"K01596"
"K00560""K01610""K02586-K02588-K02591"
"K01948""K02203"
"K00927""K00865-K15788" "K10760"
"K00864""K00863-K05878-K05879"
"K11529""K00855"
"K00852""K07031""K13829"
"K00917""K00962""K00919""K00524-K00525-K00526" "K00848""K00888"
"K00921""K02999-K03002-K03004-K03005-K03006-K03007-K03010-K03011-K03013-K03016-K03018-K03021-K03023-K03040-K03041-K03042-K03043-K03044-K03046-K03048-K03060-K13797-K13798" "K13940"
"K00869""K00872-K02204"
"K01768-K08043-K08048-K08049-K11029" "K01525"
"K01509-K01553" "K00923"
"K00946"
"K00882"
"K12446""K00885"
"K00884""K00853"
"K00922"
"K00877-K00941" "K00878""K00527""K00912"
"K14153""K00887"
"K13799"
"K16146""K00854"
"K00847"
"K00850""K00849"
"K00900"
"K13057"
"K11753"
"K13830""K00851"
"K03272""K10710""K00858""K00945-K13800" "K00703-K16148" "K00891""K04765"
"K09903"
"K01623-K01624-K11645"
"K01621"
"K00134-K00150-K05298-K10705"
"K01783"
"K00131""K01622"
"K01635-K08302"
"K01803"
"K01689" "K13831"
"K01601-K01602"
"K06131-K06132"
"K00616"
"K00615"
"K13810"
"K00058"
"K02446-K03841-K04041" "K01837"
"K01834-K15633-K15634-K15635"
"K01628"
"K01841"
"K01629""K01662"
"K00096"
"K03431""K09483"
"K08097"
"K03635-K03636"
"K04713"
"K16019"
"K03473" "K01186"
"K00983"
"K00968"
"K00523-K12452"
"K00794"
"K16020"
"K10960"
"K00995"
"K03637-K03639"
"K01808"
"K01807"
"K00384"
"K00796" "K16306"
"K00048""K03472"
"K02548"
"K01077-K01113"
"K01654"
"K01183-K13381" "K06120-K06121-K06122"
"K01053"
"K00493-K14338"
"K01095-K01096"
"K00005"
"K00011"
"K01632"
"K01633"
"K02564-K12410"
"K15669"
"K01101"
"K06920"
"K03476"
"K00423-K00434"
Node 1
"K01709"
"K01818"
"K00103"
"K06879-K09457"
"K12451"
"K01790"
"K01218"
"K14470"
"K01179-K01225"
"K00067"
"K03082"
"K13085"
"K03587-K05364-K05365" "K14394"
"K03526"
"K00793"
"K09709""K04712"
"K01198""K01078-K09474"
"K00090-K06152"
"K02821-K03475" "K01207"
"K00299""K02771" "K03273""K01117"
"K02786-K02787-K02788" "K02815""K07026"
"K14449" "K02798-K02799-K02800" "K02775"
"K00099"
"K01710"
"K02768-K02769-K02770"
"K02802-K02803-K02804"
"K01786-K03077-K03080"
"K02763-K02764-K02765" "K01804""K02782-K02783"
"K01189-K07406-K07407"
"K04719"
"K01222"
"K02777" "K01787"
"K01626-K03856"
"K00068""K00009"
"K00033"
"K11532"
"K00895"
"K01770"
"K16055""K05947"
"K00720""K00697"
"K01810-K06859"
"K01835"
"K01112""K12506"
"K03271"
"K16011"
"K15779"
"K15778"
"K01809"
"K06153"
"K15916"
"K00721""K00754"
"K01839"
"K00845"
"K00696"
"K10012"
"K01139" "K03429"
"K00748"
"K00693-K00694-K16150-K16153"
"K01769-K12319-K12320-K12323-K12324" "K15914"
"K13656"
"K01805"
"K01187-K12317-K15922"
"K01201"
"K00008"
"K01182"
"K01193"
"K02778-K02779"
"K00886""K01232"
"K05351""K00691-K00705" "K01087"
"K01226"
"K00978"
"K01838""K01194-K05342"
"K01223""K12309"
"K01199"
"K01190""K05343"
"K01188-K05349-K05350" "K01229-K12111"
"K00034"
"K02793-K02794-K02795-K02796" "K02790-K02791"
"K02817-K02818-K02819"
"K12308"
"K02808-K02809-K02810"
"K08679"
"K00045""K00115-K00117"
"K00689-K00690-K05341"
"K01220""K08678""K01785"
"K00991"
"K01178""K01840"
"K00702"
"K15982-K15983"
"K00004"
"K15817""K15812"
"K06167"
"K15813""K01822"
"K13370" "K15054"
"K00511"
"K00251"
"K12343-K12344"
"K16051"
"K09472"
"K05898"
"K01728""K01253"
"K10219"
"K03391"
"K00151"
"K10221"
"K10027"
"K02294"
Node 7
"K15745" "K14605""K09835""K02293"
"K06013"
"K00514"
"K01042"
"K01634"
"K00119""K00376"
"K09459"
"K00967"
"K04708""K05884"
"K15888"
"K15887"
"K00805"
Node 8
"K05356"
"K03831-K15376"
"K15857"
"K12503"
"K00801"
"K05954"
"K10187"
"K11778"
"K02291""K14066"
"K01597""K05355"
"K13787-K13789"
"K00787-K00795"
"K06443"
"K06045"
"K00044"
"K02292"
"K01499"
"K09879"
"K00806"
"K01823"
"K01130"
"K00791"
"K03379"
"K03527"
"K10026"
"K05979"
"K02523"
"K03392"
"K15767"
"K14520"
"K01061"
"K03381"
"K01801"
"K00450"
"K01857"
"K01588"
"K01055"
"K01821"
"K13774-K13775"
"K03333-K16045"
"K14727"
"K10217"
"K13043" "K15067"
"K00217"
"K16242-K16243-K16244-K16245-K16246"
"K00452"
"K09516"
"K00368"
"K01502"
"K00258"
"K10700"
"K01852""K00485"
"K11693"
"K01853"
"K03366"
"K10619-K16303"
"K10618"
"K07538"
"K06163"
"K14083"
"K14519"
"K01576"
"K05797"
"K09471"
"K15758""K00399-K00401-K00402"
"K05921"
"K03388-K03389-K03390"
"K01575"
"K01725"
"K00446"
"K01617"
"K12468"
"K00451"
"K01131"
"K05783"
"K01589"
"K07539"
"K01607"
"K03464"
"K00224"
"K15981-K16046"
"K10676"
"K01856"
"K00222"
"K05917"
"K00480""K00517"
"K10797""K11147-K11153"
"K05915"
"K01868"
"K01933"
"K00053"
"K00060-K15789"
"K11808""K00155"
"K03367-K14188"
"K05549-K05550-K05784"
"K07249"
"K02551"
"K01721" "K10220"
"K00360-K00367-K00370-K00371-K00372-K00373-K00374-K02567-K02568"
"K00448-K00449"
"K02636"
"K01051"
"K04480-K14081"
"K02164-K02305-K02448-K04561-K04748"
"K01826"
"K01002"
"K00466""K00453""K07106"
"K01609""K00486"
"K00055"
"K01564"
"K04097"
"K00319-K13942"
"K04099""K00383"
"K00432"
"K04100-K04101"
"K02170""K00481""K00799"
"K10535""K00320""K00577-K00581-K00582-K00583-K00584"
"K01817""K01560-K01561"
"K15228" "K13498""K01069"
"K01759""K15762"
"K04091"
"K01618"
"K01816"
"K00315"
"K10713"
"K00430-K11188"
"K03781""K03396"
"K01699-K13919-K13920"
"K06116-K06117" "K00980"
"K00502"
"K00086"
"K11780-K11781"
"K01091"
"K08093"
"K01867"
"K00463"
"K01734"
"K01866""K15226"
"K00501""K00500"
"K04518-K05359"
"K01889-K01890" "K01746"
"K01934" "K00766""K03782"
"K13951-K13952-K13980"
"K00492"
"K00459"
"K00001"
"K00529-K05710"
"K00114""K13954"
"K13953""K00121"
"K12732"
"K10775""K10944-K10945-K10946"
"K14028"
"K00285"
"K00104-K11472-K11473-K11517" "K00274"
"K03418""K14170"
"K01079"
"K00015-K00018" "K00200-K00203-K00205-K11260-K11261"
"K05603"
"K00148""K01592"
"K00457""K00270"
"K00276""K01569"
"K01851-K02361-K02552"
"K01875"
"K01070"
"K01563""K02554"
"K00141"
"K01066"
"K15739"
"K04103"
"K03179"
"K00146"
"K01920"
"K01608""K01737"
"K01917"
"K00301-K00302-K00303-K00314"
"K10774"
"K01126"
"K01726""K01620"
"K01945"
"K14422"
"K00102-K03777-K03778"
"K01668""K00600""K03181""K01666"
"K03430""K01667"
"K01872"
"K01011""K02619"
"K07246"
"K00605"
"K01752"
"K00259""K01571-K01572-K01573"
"K10218""K14759"
"K01754""K01775"
"K01501""K01568"
"K04782"
"K05396""K11787"
"K15864""K00362-K00363-K00366-K03385-K04016-K15876"
"K01556""K03735-K03736-K04019"
"K10216""K04022""K15066"
"K01733"
"K01451"
"K11788"
"K09019-K16066"
"K03153""K01426"
"K14419-K14420"
"K01878-K01879-K01880-K14164"
"K00281-K00282-K00283"
"K00050-K15893"
"K00210-K00211-K04517"
"K00042""K00220"
"K00002"
"K13812"
"K00998"
"K00873-K12406"
"K00981"
"K00468"
"K06215-K08681"
"K01495-K09007"
"K10011"
"K13036"
"K14652"
"K01893"
"K01497"
"K00505"
"K14187"
"K01619""K01736"
"K04093-K04516-K06208-K06209" "K00049-K12972"
"K02858"
"K03738"
"K01694""K01695-K01696-K06001"
"K03339"
"K00006-K00057-K00105-K00111-K00113"
"K16305"
"K00147""K13853"
"K01433-K01938-K13402"
"K01455"
"K05601""K01447-K01448"
"K07248"
"K00601-K08289-K11175" "K00288"
"K01639"
"K00297"
"K01114"
"K01625""K01595"
"K01593""K01491-K13403"
"K01631"
41
Supplementary Figure 6 Representation of the fatty acid metabolic pathway (ko01212).
KOs with a betweenness centrality that is much higher in the metabolic network
reconstructed from the OMMC sampled in winter compared to the OMMC sampled in
autumn are highlighted in red. The key functionality of KO K03921 (acyl-[acyl-carrier
protein] desaturase) is highlighted in pink. Winter KOs with a high relative gene expression
are represented in blue. The dotted lines represent the continuity of the pathway and the
circles represent metabolites.
Palmi&c(acid(Hexadecanoyl0CoA(
K01897(
Hexadecanoyl(
K00665(
Stearoyl0CoA( Oleoyl0CoA( Linoleoyl0CoA(K00059(K10258(
K00647(K00208((K02372(K00647(K09458(K00208(K02371(K00059(K10258(
K03921' K10256(
Icosanoyl0CoA( Icosenoyl0CoA( Icosadienoyl0CoA(
Octadecatrienoyl0CoA(K10257(
Icosatrienoyl0CoA(
…..(
…..(
…..( …..(
K00059(
Acetyl0CoA(
…..(
K01716(K00665(K02371(K02372(K09458(K00208(K00647(
Malonyl0ACP(
42
Supplementary Figure 7 OMMC-wide metabolic networks reconstructed from
metagenomic and metatranscriptomic data of OMMCs sampled in (a) autumn and (b) winter.
Large nodes indicate KOs encoding key functionalities whereas colours represent pathway
membership: yellow, nitrogen metabolism; orange, fatty acid biosynthesis; light green,
benzoate degradation; dark green, porphyrin and chlorophyll metabolism; pink, cystein and
methionine metabolism and red, other pathways.
630
a b
43
Supplementary Figure 8 Micrographs of Isolate LCSB065. (a) Non-polar granules observed
in Isolate LCSB065 following Nile Red staining (λex 550/20 nm, λem 620/60 nm); (b)
Bright field micrograph; (c) Overlay of (a) and (b). The scale bar is equivalent to 10 µm.
a b c
44
Supplementary Tables
Supplementary Table 1 Physicochemical characteristics of the wastewater at the time of
sampling in the anoxic tank of the Schifflange BWWT plant.
Sampling dates Suspended
solids (g/l)
pH NO3-
(mg/l)
O2
(mg/l)
NH4
(mg/l)
PO4
(mg/l)
Air
temperature
(°C)
Water
temperature
(°C)
4 October 2010 2.78 6.94 2.77 0.64 0.6 2.06 15 20.7
25 January 2011 3.24 7.01 3.37 1.13 1.72 1.02 0 14.5
635
45
Supplementary Table 2 Quantitative and qualitative analyses of biomacromolecular
fractions sequentially isolated from the OMMC samples.
Sampling dates DNA RNA Protein
A260/A280 Quantity
[µg]
RIN Quantity
[µg]
Quantity per
gel lane [µg]
4 October 2010 2.20 15.44 8.9 81.97 20.1
25 January 2011 2.03 6.11 9.2 75 21.8
46
Supplementary Table 3 Statistics of the combined assembly of the metagenomic and
metatranscriptomic sequence datasets. 640
Statistic Value Unit
Total contig length 1 617 735 059 nt
Average contig length 239 nt
N50 258 nt
Maximal contig length 12 591 nt
Number of contigs 6 761 781
47
Supplementary Table 4 Ammonia-oxidizing organisms with corresponding accession
numbers of amoA genes used for the reconstruction of the phylogenetic tree.
645
Organism name Accession number
Candidatus Nitrososphaera gargensis gi|166007511
Methylobacter tundripaludum gi|493947876
Methylococcus capsulatus Bath gi|53803062
Methylomicrobium alcaliphilum 20Z gi|357403888
Nitrosococcus halophilus Nc4 gi|292490804
Nitrosococcus oceani ATCC19707 gi|77165960
Nitrosococcus watsonii C-113 gi|300113334
Nitrosomonas cryotolerans ATCC49181 gi|12620331
Nitrosomonas europaea ATCC19718 gi|30248947
Nitrosomonas eutropha C91 gi|114332043
Nitrosomonas sp. Is79A3 gi|339481967
Nitrosomonas sp. JL21 gi|19310210
Nitrosomonas sp. AL212 gi|325981491
Nitrosospira briensis C-128 gi|1732262
Nitrosospira multiformis ATCC25196 gi|82701932
Nitrosospira sp. APG3A gi|490283770
Nitrosospira sp. En13 gi|121483569
Nitrosospira sp. NpAV gi|2062746
Nitrosospira sp. Np39-19 gi|2425028
Nitrosovibrio tenuis gi|1732264
48
Supplementary Table 5 Numbers of metagenomic and metatranscriptomic reads from
autumn and winter dates mapping to the isolate genomes of Nitrosomonas sp. Is79 (ref. 30)
and Isolate LCSB065.
Dataset Sampling dates Target organism
Nitrosomonas sp. IS79 Isolate LCSB065
metagenomic reads 4 October 2010 7 282 5 665
metagenomic reads 25 January 2011 13 058 7 941
metatranscriptomic reads 4 October 2010 4 004 1 923
metatranscriptomic reads 25 January 2011 28 631 20 784
49
Supplementary References
1. Roume H, Muller EE, Cordes T, Renaut J, Hiller K, Wilmes P. A biomolecular isolation
framework for eco-systems biology. ISME J 2013; 7: 110-121. 650
2. Roume H, Heintz-Buschart A, Muller EE, Wilmes P. Sequential isolation of metabolites,
RNA, DNA, and proteins from the same unique sample. Microbial Metagenomics,
Metatranscriptomics, and Metaproteomics. Method Enzymol 2013; 531: 219-236.
655
3. Kozarewa I, Turner DJ. 96-plex molecular barcoding for the Illumina Genome Analyzer.
High-Throughput Next Generation Sequencing 2011; 279-298.
4. Masella AP, Bartram AK, Truszkowski JM, Brown DG, Neufeld JD. PANDAseq: paired-
end assembler for illumina sequences. BMC Bioinformatics 2012; 13: 31. 660
5. Kofler R, Orozco-terWengel P, De Maio N, Pandey RV, Nolte V, Futschik A et al.
PoPoolation: a toolbox for population genetic analysis of next generation sequencing data
from pooled individuals. PloS one 2011; 6: e15925.
665
6. Huang Y, Niu B, Gao Y, Fu L, Li W. CD-HIT Suite: a web server for clustering and
comparing biological sequences. Bioinformatics 2010; 26: 680-682.
7. Kultima JR, Sunagawa S, Li J, Chen W, Chen H, Mende DR et al. MOCAT: a
metagenomics assembly and gene prediction toolkit. PloS one 2012; 7: e47656. 670
8. Li R, Li Y, Kristiansen K, Wang J. SOAP: short oligonucleotide alignment program.
50
Bioinformatics 2008; 24: 713-714.
9. Miller CS, Baker BJ, Thomas BC, Singer SW, Banfield JF. EMIRGE: reconstruction of 675
full-length ribosomal genes from microbial community short read sequencing data. Genome
Biol 2011; 12: R44.
10. Quast C, Pruesse E, Yilmaz P, Gerken J, Schweer T, Yarza P et al. The SILVA ribosomal
RNA gene database project: improved data processing and web-based tools. Nucleic Acids 680
Res 2013; 41: D590-D596.
11. Treangen TJ, Koren S, Sommer DD, Liu B, Astrovskaya I, Ondov B et al. MetAMOS: a
modular and open source metagenomic assembly and analysis pipeline. Genome Biol 2013;
14: R2. 685
12. Namiki T, Hachiya T, Tanaka H, Sakakibara Y. MetaVelvet: an extension of Velvet
assembler to de novo metagenome assembly from short sequence reads. Nucleic Acids Res
2012; 40: e155-e155.
690
13. Rho M, Tang H, Ye Y. FragGeneScan: predicting genes in short and error-prone reads.
Nucleic Acids Res 2010; 38: e191-e191.
14. Hyatt D, Chen G-L, LoCascio P, Land M, Larimer F, Hauser L. Prodigal: prokaryotic
gene recognition and translation initiation site identification. BMC Bioinformatics 2010; 11: 695
119.
51
15. Kent WJ. BLAT—the BLAST-like alignment tool. Genome Res 2002; 12: 656-664.
16. Kessner D, Chambers M, Burke R, Agus D, Mallick P. ProteoWizard: open source 700
software for rapid proteomics tools development. Bioinformatics 2008; 24: 2534-2536.
17. Craig R, Cortens JP, Beavis RC. Open source system for analyzing, validating, and
storing protein identification data. J Proteome Res 2004; 3: 1234-1242.
705
18. Deutsch EW, Mendoza L, Shteynberg D, Farrah T, Lam H, Tasman N et al. A guided
tour of the Trans-‐Proteomic Pipeline. Proteomics 2010; 10: 1150-1159.
19. Keller A, Nesvizhskii AI, Kolker E, Aebersold R. Empirical statistical model to estimate
the accuracy of peptide identifications made by MS/MS and database search. Anal Chem 710
2002; 74: 5383-5392.
20. Shteynberg D, Deutsch EW, Lam H, Eng JK, Sun Z, Tasman N et al. iProphet: multi-
level integrative analysis of shotgun proteomic data improves peptide and protein
identification rates and error estimates. Mol Cell Proteomics 2011; 10: M111.007690. 715
21. Muller EE, Pinel N, Laczny CC, Hoopmann MR, Narayanasamy S, Lebrun LA et al.
Community-integrated omics links dominance of a microbial generalist to fine-tuned
resource usage. Nat Commun 2014; 5: 5603; 1-10.
720
22. Lee S, Seo CH, Lim B, Yang JO, Oh J, Kim M et al. Accurate quantification of
transcriptome from RNA-Seq data by effective length normalization. Nucleic Acids Res 2011;
52
39: e9-e9.
23. Tsementzi D, Poretsky R, Rodriguez-‐R LM, Luo C, Konstantinidis KT. Evaluation of 725
metatranscriptomic protocols and application to the study of freshwater microbial
communities. Environ Microbiol Rep 2014; 6: 640-655.
24. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful
approach to multiple testing. Journal of the Royal Statistical Society. Series B 730
(Methodological) 1995; 289-300.
25. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D et al. Cytoscape: a
software environment for integrated models of biomolecular interaction networks. Genome
Res 2003; 13: 2498-2504. 735
26. Rahman SA, Schomburg D. Observing local and global properties of metabolic
pathways:‘load points’ and ‘choke points’ in the metabolic networks. Bioinformatics 2006;
22: 1767-1774.
740
27. Faust K, Croes D, van Helden J. Metabolic pathfinding using RPAIR annotation. J Mol
Biol 2009; 388: 390-414.
28. Csardi G, Nepusz T. The igraph software package for complex network research.
InterJournal, Complex Systems 2006; 1695: 1-9. 745
29. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search
53
tool. J Mol Biol 1990; 215: 403-410.
30. Bollmann A, Sedlacek CJ, Norton J, Laanbroek HJ, Suwa Y, Stein LY et al. Complete 750
genome sequence of Nitrosomonas sp. Is79, an ammonia oxidizing bacterium adapted to low
ammonium concentrations. Stand Genomic Sci 2013; 7: 469.
31. Liu W, Li L, Khan MA, Zhu F. Popular molecular markers in bacteria. Mol Genet
Microbiol Virol 2012; 27: 103-107. 755
32. Sommer DD, Delcher AL, Salzberg SL, Pop M. Minimus: a fast, lightweight genome
assembler. BMC Bioinformatics 2007; 8: 64.
33. Treangen TJ, Sommer DD, Angly FE, Koren S, Pop M. Next generation sequence 760
assembly with AMOS. Curr Protoc Bioinformatics 2011; 11.8. 1-11.8. 18.
34. Sayavedra-‐Soto L, Hommes N, Alzerreca J, Arp D, Norton JM, Klotz M. Transcription of
the amoC, amoA and amoB genes in Nitrosomonas europaea and Nitrosospira sp. NpAV.
FEMS Microbiol Lett 1998; 167: 81-88. 765
35. Fu L, Niu B, Zhu Z, Wu S, Li W. CD-HIT: accelerated for clustering the next-generation
sequencing data. Bioinformatics 2012; 28: 3150-3152.
36. Dereeper A, Guignon V, Blanc G, Audic S, Buffet S, Chevenet F et al. Phylogeny. fr: 770
robust phylogenetic analysis for the non-specialist. Nucleic Acids Res 2008; 36: W465-W469.
54
37. Castresana J. Selection of conserved blocks from multiple alignments for their use in
phylogenetic analysis. Mol Biol Evol 2000; 17: 540-552.
775
38. Guindon S, Dufayard J-F, Lefort V, Anisimova M, Hordijk W, Gascuel O. New
algorithms and methods to estimate maximum-likelihood phylogenies: assessing the
performance of PhyML 3.0. Syst Biol 2010; 59: 307-321.
39. Chevenet F, Brun C, Bañuls A-L, Jacq B, Christen R. TreeDyn: towards dynamic 780
graphics and annotations for analyses of trees. BMC Bioinformatics 2006; 7: 439.
40. Reasoner D, Geldreich E. A new medium for the enumeration and subculture of bacteria
from potable water. App Environ Microbiol 1985; 49: 1-7.
785
41. Levantesi C, Rossetti S, Thelen K, Kragelund C, Krooneman J, Eikelboom D et al.
Phylogeny, physiology and distribution of ‘Candidatus Microthrix calida’, a new Microthrix
species isolated from industrial activated sludge wastewater treatment plants. Environ
Microbiol 2006; 8: 1552-1563.
790
42. Slijkhuis H. Microthrix parvicella, a filamentous bacterium isolated from activated sludge:
cultivation in a chemically defined medium. App Environ Microbiol 1983; 46: 832-839.
43. Fowler SD, Greenspan P. Application of Nile red, a fluorescent hydrophobic probe, for
the detection of neutral lipid deposits in tissue sections: comparison with oil red O. J 795
Histochem Cytochem 1985; 33: 833-836.
55
44. Schneider CA, Rasband WS, Eliceiri KW. NIH Image to ImageJ: 25 years of image
analysis. Nat Methods 2012; 9: 671-675.
800
45. Rodrigue S, Materna AC, Timberlake SC, Blackburn MC, Malmstrom RR, Alm EJ,
Chisholm SW. Unlocking short read sequencing for metagenomics. PLoS one 2010; 5:
e11840.
46. Xu H, Luo X, Qian J, Pang X, Song J, Qian G et al. FastUniq: a fast de novo duplicates 805
removal tool for paired short reads. PloS one 2012; 7: e52249.
47. Peng Y, Leung HC, Yiu S-M, Chin FY. IDBA-UD: a de novo assembler for single-cell
and metagenomic sequencing data with highly uneven depth. Bioinformatics 2012; 28: 1420-
1428. 810
48. Gordon D, Abajian C, Green P. Consed: a graphical tool for sequence finishing. Genome
Res 1998; 8: 195-202.
49. Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA et al. The RAST Server: 815
rapid annotations using subsystems technology. BMC Genomics 2008; 9: 75.
50. Li H, Durbin R. Fast and accurate short read alignment with Burrows–Wheeler transform.
Bioinformatics 2009; 25: 1754-1760.
820
51. Kerepesi C, Bánky D, Grolmusz V. AmphoraNet: The webserver implementation of the
AMPHORA2 metagenomic workflow suite. Gene 2014; 533: 538-540.
56
52. Tatusov RL, Galperin MY, Natale DA, Koonin EV. The COG database: a tool for
genome-scale analysis of protein functions and evolution. Nucleic Acids Res 2000; 28: 33-36. 825
53. Overbeek R, Begley T, Butler RM, Choudhuri JV, Chuang H-Y, Cohoon M et al. The
subsystems approach to genome annotation and its use in the project to annotate 1000
genomes. Nucleic Acids Res 2005; 33: 5691-5702.
830