Multi-omics analysis of niche specificity provides new insights into
ecological adaptation in bacteria
Bo Zhu1*, Muhammad Ibrahim1*, Zhouqi Cui1, Guanlin Xie1, Gulei Jin2, Michael
Kube3, Bin Li1$, Xueping Zhou1$
1State Key Laboratory of Rice Biology, Institute of Biotechnology, Zhejiang
University, Hangzhou 310029, China
2Hangzhou Guhe Info Co., Ltd, Hangzhou 310029, China
3Albrecht Daniel Thaer-Institute of Agricultural and Horticultural Sciences,
Humboldt-Universität zu Berlin, 14195 Berlin, Germany
Running title: Ecological adaptation in B. seminalis
*Authors contributed equally to this work
$Corresponding authors: Bin Li, Xueping Zhou
Mailing address: State Key Laboratory of Rice Biology, Institute of Biotechnology,
Zhejiang University, 310058, Hangzhou, China.
Phone: 86-571-88982412. Fax: 86-571-88982412.
[email protected]; [email protected]
Conflict of Interest Statement
The authors declare no conflict of interest.
Materials and Methods
Strains used in this study
B. seminalis strains DSM 23518 (= LMG 24067 T), 0901, S9 and R456 originated
from the sputum of a cystic fibrosis (CF) patient (Vanlaere et al 2008), diseased
apricot (Fang et al 2009), West Lake water (Fang et al 2011), and rice rhizosphere
soil (Li et al 2011), respectively. Unless otherwise specified, bacterial cultures
were maintained on nutrient agar (NA) or in nutrient broth (NB) at 30°C prior to use.
For long-term storage, cultures were kept in 20% aqueous glycerol at -80°C.
Characterization of ecological roles
B. seminalis strains were tested for virulence in the alfalfa model (Bernier et al 2003),
as described by Ibrahim et al. (2012). Pathogenicity of B. seminalis to apricot was
examined according to the method of Fang et al. (2009), except that immature fruits
were inoculated with 10 μL of bacterial suspension at a concentration of
1 × 10^5 CFU/mL using sterilized tips. Inhibition of the mycelial growth of
Rhizoctonia solani by B. seminalis was determined according to the method of Li et al.
(2011). The morphology of bacterial cells was observed using a JEOL JSM-6400
scanning electron microscope (Hitachi, Tokyo, Japan).
Growth in various niches
Adaptation of the B. seminalis strains to various niches was investigated by incubating
the four strains in CF, water and soil extract media. The plant condition was
excluded because only strain 0901 was pathogenic to apricot. Water medium,
consisting of M9 minimal salts with 3% glycerol, was used to simulate the water
environment (Schell et al 2011). CF medium was prepared to mimic the sputum of CF
patients according to the method of Dinesh (2010). Soil extract medium was prepared
to mimic soil conditions as described previously (Yoder-Himes et al 2009), except
that soil was collected from the rice rhizosphere, which was the original
niche for strain R456. In addition, three different niche conditions were tested for
each of strains S9, DSM 23518, and R456. Bacterial numbers were estimated from
OD600 measurements (Ibrahim et al 2012).
Whole genome sequencing, assembly and annotation
Bacterial genomic DNA, isolated using the Wizard Genomic DNA Purification Kit
(Promega, Madison, WI, USA), was used for whole-genome sequencing, which was
performed using PacBio sequencing (Pacific Biosciences, Menlo Park, CA, USA),
454 sequencing (Roche, Branford, CT, USA) and Illumina sequencing (Illumina, San
Diego, CA, USA). Sequence runs for four single-molecule real-time (SMRT) cells
were performed on the PacBio RS II sequencer with a 120-minute movie time per SMRT
cell. SMRT Analysis Portal version 2.1 was used for read filtering and adapter
trimming with default parameters, and post-filtered data of 350-580 Mb (around
40-60X coverage) for each cell/strain, with an average read length of 7 kb, were
used for assembly. All four genomes were first assembled de novo
using the HGAP assembly protocol, which is available with the SMRT Analysis packages
and accessed through the SMRT Analysis Portal version 2.1. After this first round,
PBJelly V14.1.14 was used to fill and reduce as many captured gaps as possible to
produce upgraded draft genomes (English et al 2012). As B. seminalis genomes are
much larger than typical bacterial genomes, around 50 scaffolds were generated after
this step. Quality-filtered Illumina and 454 sequencing reads were then used to
correct false SNPs and indels caused by low coverage in some regions. These
reads were also used for gap closure on the pre-assembled genomes with
WGS-assembler and SSPACE (Boetzer et al 2011, Myers et al 2000). Finally, a consensus
sequence was obtained from the above procedure. If the sequence was incomplete,
scaffolding and gap closure were repeated until nearly complete
bacterial genome sequences were obtained.
Coding DNA sequences (CDSs) were predicted using Prodigal version 2.6 with
default parameters (Hyatt et al 2010). To improve accuracy, RNA-Seq results were
also used to refine gene prediction. Gene functions were automatically
assigned by the RAST annotation engine (Aziz et al 2008). Predicted genes were
compared against the genomic sequences using BLASTN to verify their accuracy. rRNA
operons and tRNAs were predicted with RNAmmer and tRNAscan-SE, respectively (Lagesen et al
2007, Lowe and Eddy 1997), while additional analysis was carried out using
NCBI’s uniprot database (http://www.ncbi.nlm.nih.gov/), COG (Tatusov et al 2001),
KEGG (Ogata et al 1999) and GO terms (Ashburner et al 2000).
Variant calling
Paired-end reads generated by Illumina sequencing were mapped onto the genome
sequence using the Burrows-Wheeler Aligner (BWA) (Li and Durbin 2009). Default
settings were used except that the maximum edit distance was set to 0.02 (-n 0.02).
The MarkDuplicates command in Picard (http://picard.sourceforge.net/) was used to
remove reads that mapped to the same positions in the strain DSM 23518 genome
(PCR duplicates). After local realignment (IndelRealigner) and base quality
recalibration (BaseRecalibrator), SNPs and indels were called using GATK (Gac et al
2013, Tenaillon et al 2012). Default settings were used except that the maximum read
depth in GATK was set to 500 (-dcov 500). The resulting SNPs and indels were then
filtered using custom Perl scripts to minimize false-positive mutation calls. First,
mutations with a total read depth below 20X were discarded. Second, SNPs and indels
with a Phred quality score below 30 were removed. Third, mutation calls were kept
only when at least 80% of the reads supported the variant. The lists of SNPs and
indels were then annotated with in-house Perl scripts. For mutations in coding
regions, PROVEAN was used to predict whether a protein sequence variation is
deleterious or neutral (Chieng et al 2012).
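The three filtering rules above can be illustrated with a short sketch. The original filtering was done with custom Perl scripts that are not shown here; the Python below is only an illustration of the stated thresholds, and the record type `VariantCall`, its field names, and the example calls are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VariantCall:
    # Hypothetical record; field names are assumptions, not the authors' format.
    position: int     # 1-based reference coordinate
    total_depth: int  # number of reads covering the site
    alt_reads: int    # number of reads supporting the variant allele
    qual: float       # Phred-scaled quality score of the call


def passes_filters(call, min_depth=20, min_qual=30.0, min_alt_fraction=0.8):
    """Apply the three filters in the order given in the text."""
    if call.total_depth < min_depth:   # rule 1: total read depth below 20X
        return False
    if call.qual < min_qual:           # rule 2: Phred quality below 30
        return False
    # rule 3: at least 80% of the reads must support the variant
    return call.alt_reads / call.total_depth >= min_alt_fraction


# Made-up example calls, one per failure mode plus one that passes.
calls = [
    VariantCall(position=1200, total_depth=45, alt_reads=44, qual=99.0),
    VariantCall(position=3400, total_depth=12, alt_reads=12, qual=80.0),  # low depth
    VariantCall(position=5600, total_depth=60, alt_reads=58, qual=22.0),  # low quality
    VariantCall(position=7800, total_depth=50, alt_reads=30, qual=90.0),  # 60% support
]
kept = [c.position for c in calls if passes_filters(c)]
print(kept)  # [1200]
```

Whether the thresholds are inclusive at exactly 20X, Q30 and 80% is an assumption based on the wording ("below 20X", "below 30", "at least 80%").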
Phylogenetic and comparative genome analysis
The sequences of the four whole-genome-sequenced strains were aligned and
visualized using the Murasaki software (Popendorf et al 2010). For genome-based
phylogeny, in addition to the four B. seminalis genomes sequenced in this study,
28 complete Burkholderia genome sequences were obtained from the Burkholderia
Genome Database (Winsor et al 2008). Furthermore, a well-resolved phylogenetic
tree was also generated based on multi-locus sequence analysis (MLSA) of the
atpD, gltB, gyrB, lepA, phaC, recA and trpB genes, which has been widely applied in
the identification and discrimination of Burkholderia species (Spilker et al 2009). The
identity of strains was confirmed by calculating whole-genome average nucleotide
identity (ANI) based on the BLAST and MUMmer algorithms using JSpecies (Richter and
Rosselló-Móra 2009). Multiple sequence alignment was performed with MUSCLE 3.8
(Edgar 2004) and the ML tree was generated with MEGA 6 (Tamura et al 2013). In
addition, genomic islands (GIs) were detected using IslandViewer, which integrates
the widely used GI detection algorithms IslandPick, SIGI-HMM and IslandPath-DIMOB
(Langille and Brinkman 2009).
DNA methylation analysis pipeline
SMRT-generated data were analyzed with the RS_Modification and Motif Analysis
pipeline in SMRT Analysis 2.2, provided through the Pacific Biosciences SMRT
Portal, with default parameters. With these default parameters, coverage and the
inter-pulse duration (IPD) ratio were calculated for each position; the IPD ratio
compares the IPD observed for incorporation opposite a methylated base in the DNA
template with that for incorporation opposite a canonical base (Lluch-Senar et al 2013).
All data sets contain kinetic values for each reference position and DNA strand,
with the corresponding sequences generated by the assembly procedure. For statistical
analysis, methylation site positions were divided into three regions (200 bp upstream
of the coding region, the coding region, and 200 bp downstream of the coding region).
For every gene, the most heavily methylated strain was then selected for further analysis.
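The three-way partition of methylation sites can be sketched as follows. This is a minimal illustration, not the authors' script: the function name `classify_site`, the 1-based inclusive coordinates, and the strand-aware swap of upstream and downstream for minus-strand genes are all assumptions.

```python
def classify_site(pos, gene_start, gene_end, strand, flank=200):
    """Assign a methylation site to a region relative to one gene.

    Coordinates are assumed to be 1-based and inclusive, with
    gene_start <= gene_end on the reference. Returns "upstream",
    "coding", "downstream", or None if the site lies outside the
    gene and its 200 bp flanks.
    """
    if gene_start <= pos <= gene_end:
        return "coding"
    if gene_start - flank <= pos < gene_start:
        # Left flank: upstream for a plus-strand gene, downstream otherwise.
        return "upstream" if strand == "+" else "downstream"
    if gene_end < pos <= gene_end + flank:
        # Right flank: downstream for a plus-strand gene, upstream otherwise.
        return "downstream" if strand == "+" else "upstream"
    return None


# Hypothetical gene spanning positions 1000-2000.
print(classify_site(900, 1000, 2000, "+"))   # upstream
print(classify_site(900, 1000, 2000, "-"))   # downstream
print(classify_site(1500, 1000, 2000, "+"))  # coding
```

Counting sites per region and gene for each strain would then allow the strain with the most methylation per gene to be picked, as described above.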
Growth conditions for RNA-Seq analysis
To simulate the original niche environments of the four B. seminalis strains, 2 mL
of overnight bacterial culture was inoculated into 50 mL of each of the following four
types of media. Water medium, consisting of M9 minimal salts (0.6% Na2HPO4 + 0.3%
KH2PO4 + 0.05% NaCl + 0.1% NH4Cl + 0.02% MgSO4 + 0.015% CaCl2) with 3%
glycerol, was used to simulate the water environment (Schell et al 2011). CF
medium was prepared according to the method of Dinesh (2010). Briefly, 5.0 g/L
mucin from pig stomach mucosa (Sangon Biotech), 4.0 g/L low-molecular-weight
salmon sperm DNA (Fluka), 5.9 mg/L diethylenetriaminepentaacetic acid (DTPA)
(Sigma), 5.0 g/L NaCl (Sigma), 2.2 g/L KCl (Sigma) and 1.8 g/L Tris base (Sigma) were
mixed together and autoclaved; 5.0 mL/L egg yolk emulsion (Oxoid) and 5.0 g/L
casamino acids (Sangon Biotech) were added once the medium had cooled to 37°C after
autoclaving. Soil extract medium was prepared to mimic soil conditions as described
previously (Yoder-Himes et al 2009), except that soil was collected from
the rice rhizosphere, which was the original niche for strain R456. The plant condition
used to obtain in vivo bacteria was prepared according to our recent method (Li
et al 2014).
Total RNA harvesting
Each bacterial strain was grown under its respective condition to stationary phase. After
centrifugation at 4,500 × g and 4°C, pellets were resuspended in 3 mL of PBS. One
milliliter of the resuspended bacteria was subjected to RNA purification with the RNeasy
Mini Kit (Qiagen) and eluted in 50 μL of RNase-free water. Samples were treated with
DNase I to remove any residual DNA and purified by phenol-chloroform-isoamyl alcohol
extraction and ethanol precipitation.
mRNA purification and cDNA synthesis
Ten micrograms of each total RNA sample was treated with the MICROBExpress
Bacterial mRNA Enrichment Kit (Ambion) and the RiboMinus™ Transcriptome Isolation
Kit (Bacteria) (Invitrogen) following the manufacturers’ instructions. Samples were
resuspended in 15 μL of RNase-free water. Bacterial mRNAs were chemically
fragmented to a size range of 200-250 bp using 1 × fragmentation solution
(Ambion) for 2.5 min at 94°C. cDNA was generated according to the instructions of the
SuperScript Double-Stranded cDNA Synthesis Kit (Invitrogen). Briefly, each mRNA
sample was mixed with 100 pmol of random hexamers, incubated at 65°C for 5 min,
chilled on ice, mixed with 4 μL of First-Strand Reaction Buffer (Invitrogen), 2 μL of
0.1 M DTT, 1 μL of 10 mM RNase-free dNTP mix and 1 μL of SuperScript III reverse
transcriptase (Invitrogen), and incubated at 50°C for 1 h. To generate the second
strand, the following Invitrogen reagents were added: 51.5 μL of RNase-free water,
20 μL of second-strand reaction buffer, 2.5 μL of 10 mM RNase-free dNTP mix, 50 U of
E. coli DNA Polymerase and 5 U of E. coli RNase H; the mixture was incubated at 16°C
for 2.5 h.
RNA Sequencing
The Illumina Paired End Sample Prep Kit was used for RNA-Seq library construction
according to the manufacturer’s instructions, as follows: fragmented cDNA was
end-repaired, ligated to Illumina adaptors, and amplified by 18 cycles of PCR.
Paired-end 100-bp reads were generated by high-throughput sequencing on the Illumina
HiSeq 2000 instrument.
RNA-Seq data analysis
After removing the low quality reads and adaptors, RNA-Seq reads were aligned to
the corresponding B. seminalis genome using TopHat 2.0.7 (Trapnell et al 2009),
allowing for a maximum of two mismatches. If a read mapped to more than one
location, only the alignment with the highest score was kept. Reads mapping to rRNA
and tRNA regions were removed from further analysis. After read counts were obtained
for every sample, edgeR with the TMM normalization method was used to identify
differentially expressed genes (DEGs). Significantly differentially expressed genes
(FDR < 0.05 and at least a two-fold change) were selected for further analysis.
Cluster 3.0 and Treeview 1.1.6 were used to generate the clustered heatmap based on
the RPKM values (de Hoon et al 2004, Saldanha 2004).
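For readers unfamiliar with the quantities used above, the following sketch shows the standard RPKM definition (reads per kilobase of transcript per million mapped reads) and the DEG selection thresholds stated in the text. It does not reproduce the edgeR statistical test itself; the function names and example numbers are hypothetical, and the inputs to `is_deg` are assumed to be the log2 fold change and FDR already reported by edgeR.

```python
import math


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def is_deg(log2_fold_change, fdr, fdr_cutoff=0.05, min_fold=2.0):
    """Apply the selection rule from the text: FDR < 0.05 and a change
    of at least two-fold in either direction."""
    return fdr < fdr_cutoff and abs(log2_fold_change) >= math.log2(min_fold)


# Made-up example: 500 reads on a 1,000 bp gene in a library of 10 million reads.
print(rpkm(500, 1000, 10_000_000))  # 50.0

print(is_deg(1.0, 0.01))    # True: exactly two-fold up, FDR below cutoff
print(is_deg(-1.5, 0.04))   # True: down-regulated more than two-fold
print(is_deg(0.8, 0.001))   # False: change smaller than two-fold
```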
COG enrichment analysis
All DEGs between different strains or conditions were classified by COG
category (Tatusov et al 2001). Based on the whole-genome COG classification, the
significance of the enrichment of DEGs in each COG category was
tested using the hypergeometric distribution:

$$p = \sum_{i=x}^{n} \frac{\binom{M}{i}\binom{N-M}{n-i}}{\binom{N}{n}}$$

where N is the number of genes in the genome, M is the number of genes
assigned to a given COG category in the whole genome, n is the number of DEGs,
and i is the number of DEGs that fall into that COG category, with the sum starting
at x, the observed value of i. The results are shown in Table S4.
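As a worked illustration of the hypergeometric test, the sketch below computes the upper-tail probability of observing at least x DEGs in a COG category, using N, M and n as defined above. Treating the test as a one-sided upper tail, the function name, and the small example numbers are assumptions made for illustration.

```python
from math import comb


def hypergeom_enrichment_p(N, M, n, x):
    """P(at least x of the n DEGs fall in a category holding M of N genes).

    N: genes in the genome
    M: genes assigned to the COG category in the whole genome
    n: number of DEGs
    x: observed number of DEGs in the category (assumed argument name)
    """
    upper = min(n, M)  # cannot observe more hits than DEGs or category members
    total = comb(N, n)
    tail = sum(comb(M, i) * comb(N - M, n - i) for i in range(x, upper + 1))
    return tail / total


# Toy numbers: 10 genes, 4 in the category, 3 DEGs, 2 of them in the category.
# Tail = C(4,2)*C(6,1) + C(4,3)*C(6,0) = 36 + 4 = 40; total = C(10,3) = 120.
p = hypergeom_enrichment_p(N=10, M=4, n=3, x=2)
print(round(p, 4))  # 0.3333
```

Exact integer binomials from `math.comb` avoid floating-point underflow for genome-sized N; the division happens only once at the end.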
Validation of the mix-sample method
Each sample was derived from a pool of five biological replicates, a pooling approach
that has been developed to increase efficiency and cost-effectiveness with equivalent
statistical power (Greenwald et al 2012, Peng et al 2003). To validate the accuracy of
the mix-sample method, a single biological RNA sample of strain DSM 23518 grown in
soil extract (SE) was prepared. The correlation coefficient between samples was
determined by statistical
analysis.
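The comparison between a single sample and a pooled sample can be sketched as a Pearson correlation on log2-transformed RPKM values, which is how the corresponding supplementary figures are described. The pseudocount of 1 added before the log transform, the function names, and the example values are assumptions; the text does not state how zero RPKM values were handled.

```python
import math


def log2_rpkm(values, pseudocount=1.0):
    """Log2-transform RPKM values; the pseudocount (an assumption) avoids log2(0)."""
    return [math.log2(v + pseudocount) for v in values]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Made-up RPKM values for five genes in a single and a pooled sample.
single = log2_rpkm([0.0, 3.0, 15.0, 63.0, 255.0])
pooled = log2_rpkm([1.0, 3.0, 31.0, 63.0, 127.0])
r = pearson_r(single, pooled)
print(0.9 < r <= 1.0)  # True for these example values
```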
Quantitative real-time PCR
Total RNA was extracted from exponentially growing cells using RNeasy Mini
spin columns (Qiagen) and treated with one unit of RNase-free DNase I
(Qiagen), and cDNA synthesis was performed with a Moloney murine leukemia virus
reverse transcriptase first-strand cDNA synthesis kit (Qiagen). The cDNA was then
used directly as the template for qRT-PCR using a SYBR Green master mix (Protech
Technology Enterprise Co., Ltd.) on an ABI Prism 7000 sequence detection system
(Applied Biosystems). Primers for quantitative real-time PCR (qRT-PCR) of the
selected genes were designed using Primer3 based on the genome sequences
(Untergasser et al 2012). All primers are listed in Table S3, and an annealing
temperature of 58°C was used for all primers. Short-chain dehydrogenase
(BCAL2694), which has been shown to be stably expressed in Bcc, was used as the
internal control (Van Acker et al 2013). Fold changes were calculated according to the
delta-delta CT method, and the values are also shown in Table S3. The correlation
between RNA-Seq results and qRT-PCR results was tested by Pearson's correlation
method.
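The delta-delta CT calculation mentioned above can be written out as a short sketch. It uses the standard 2^(-ΔΔCT) form, which assumes close to 100% amplification efficiency for both the target and the reference gene; the function name, argument names, and CT values are hypothetical.

```python
def fold_change_ddct(ct_target_test, ct_ref_test, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression by the 2^(-ddCT) method.

    Each delta CT normalizes the target gene to the internal control
    (here, the reference gene) within one condition; the difference of
    the two delta CT values compares the test condition to the control.
    """
    delta_ct_test = ct_target_test - ct_ref_test
    delta_ct_ctrl = ct_target_ctrl - ct_ref_ctrl
    delta_delta_ct = delta_ct_test - delta_ct_ctrl
    return 2.0 ** (-delta_delta_ct)


# Made-up CT values: the target crosses threshold 3 cycles earlier in the
# test condition after normalization, i.e. an 8-fold increase.
print(fold_change_ddct(ct_target_test=22.0, ct_ref_test=18.0,
                       ct_target_ctrl=25.0, ct_ref_ctrl=18.0))  # 8.0
```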
Supplementary references
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM et al (2000). Gene Ontology: tool for the unification of biology. Nat Genet 25: 25-29.
Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA et al (2008). The RAST Server: rapid annotations using subsystems technology. BMC Genomics 9: 75.
Bernier SP, Silo-Suh L, Woods DE, Ohman DE, Sokol PA (2003). Comparative analysis of plant and animal models for characterization of Burkholderia cepacia virulence. Infect Immun 71: 5306-5313.
Boetzer M, Henkel CV, Jansen HJ, Butler D, Pirovano W (2011). Scaffolding pre-assembled contigs using SSPACE. Bioinformatics 27: 578-579.
Chieng S, Carreto L, Nathan S (2012). Burkholderia pseudomallei transcriptional adaptation in macrophages. BMC Genomics 13: 328.
de Hoon MJL, Imoto S, Nolan J, Miyano S (2004). Open source clustering software. Bioinformatics 20: 1453-1454.
Dinesh SD (2010). Artificial Sputum Medium. Protocol Exchange doi:10.1038/protex.2010.212.
Edgar RC (2004). MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32: 1792-1797.
English AC, Richards S, Han Y, Wang M, Vee V, Qu J et al (2012). Mind the gap: upgrading genomes with Pacific Biosciences RS long-read sequencing technology. PLoS One 7: e47768.
Fang Y, Li B, Wang F, Liu B, Wu Z, Su T et al (2009). Bacterial fruit rot of apricot caused by Burkholderia cepacia in China. Plant Pathol J 25: 429-432.
Fang Y, Xie G, Lou M, Li B, Muhammad I (2011). Diversity analysis of Burkholderia cepacia complex in the water bodies of West Lake, Hangzhou, China. J Microbiol 49: 309-314.
Gac M, Cooper TF, Cruveiller S, Médigue C, Schneider D (2013). Evolutionary history and genetic parallelism affect correlated responses to evolution. Mol Ecol 22: 3292-3303.
Greenwald JW, Greenwald CJ, Philmus BJ, Begley TP, Gross DC (2012). RNA-seq analysis reveals that an ECF σ Factor, AcsS, regulates achromobactin biosynthesis in Pseudomonas syringae pv. syringae B728a. PLoS One 7: e34804.
Hyatt D, Chen GL, LoCascio PF, Land ML, Larimer FW, Hauser LJ (2010). Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11.
Ibrahim M, Tang Q, Shi Y, Almoneafy A, Fang Y, Xu L et al (2012). Diversity of potential pathogenicity and biofilm formation among Burkholderia cepacia complex water, clinical, and agricultural isolates in China. World J Microb Biot 28: 2113-2123.
Lagesen K, Hallin P, Rødland EA, Stærfeldt HH, Rognes T, Ussery DW (2007). RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res 35: 3100-3108.
Langille MGI, Brinkman FSL (2009). IslandViewer: an integrated interface for computational identification and visualization of genomic islands. Bioinformatics 25: 664-665.
Li B, Liu BP, Yu RR, Lou MM, Wang YL, Xie GL et al (2011). Phenotypic and molecular characterization of rhizobacterium Burkholderia sp. strain R456 antagonistic to Rhizoctonia solani, sheath blight of rice. World J Microb Biot 27: 2305-2313.
Li B, Ibrahim M, Ge M, Cui Z, Sun G, Xu F et al (2014). Transcriptome analysis of Acidovorax avenae subsp. avenae cultivated in vivo and co-culture with Burkholderia seminalis. Sci Rep 4.
Li H, Durbin R (2009). Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25: 1754-1760.
Lluch-Senar M, Luong K, Lloréns-Rico V, Delgado J, Fang G, Spittle K et al (2013). Comprehensive methylome characterization of Mycoplasma genitalium and Mycoplasma pneumoniae at single-base resolution. PLoS Genet 9: e1003191.
Lowe TM, Eddy SR (1997). tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25: 955-964.
Myers EW, Sutton GG, Delcher AL, Dew IM, Fasulo DP, Flanigan MJ et al (2000). A whole-genome assembly of Drosophila. Science 287: 2196-2204.
Ogata H, Goto S, Sato K, Fujibuchi W, Bono H, Kanehisa M (1999). KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res 27: 29-34.
Peng X, Wood CL, Blalock EM, Chen KC, Landfield PW, Stromberg AJ (2003). Statistical implications of pooling RNA samples for microarray experiments. BMC Bioinformatics 4: 26.
Popendorf K, Tsuyoshi H, Osana Y, Sakakibara Y (2010). Murasaki: A Fast, Parallelizable Algorithm to Find Anchors from Multiple Genomes. PLoS One 5: e12651.
Richter M, Rosselló-Móra R (2009). Shifting the genomic gold standard for the prokaryotic species definition. Proc Natl Acad Sci U S A 106: 19126-19131.
Saldanha AJ (2004). Java Treeview—extensible visualization of microarray data. Bioinformatics 20: 3246-3248.
Schell MA, Zhao P, Wells L (2011). Outer Membrane Proteome of Burkholderia pseudomallei and Burkholderia mallei From Diverse Growth Conditions. J Proteome Res 10: 2417-2424.
Spilker T, Baldwin A, Bumford A, Dowson CG, Mahenthiralingam E, LiPuma JJ (2009). Expanded multilocus sequence typing for Burkholderia species. J Clin Microbiol 47: 2607-2610.
Tamura K, Stecher G, Peterson D, Filipski A, Kumar S (2013). MEGA6: molecular evolutionary genetics analysis version 6.0. Mol Biol Evol 30: 2725-2729.
Tatusov RL, Natale DA, Garkavtsev IV, Tatusova TA, Shankavaram UT, Rao BS et al (2001). The COG database: new developments in phylogenetic classification of proteins from complete genomes. Nucleic Acids Res 29: 22-28.
Tenaillon O, Rodríguez-Verdugo A, Gaut RL, McDonald P, Bennett AF, Long AD et al (2012). The molecular diversity of adaptive convergence. Science 335: 457-461.
Trapnell C, Pachter L, Salzberg SL (2009). TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25: 1105-1111.
Untergasser A, Cutcutache I, Koressaar T, Ye J, Faircloth BC, Remm M et al (2012). Primer3—new capabilities and interfaces. Nucleic acids research 40: e115-e115.
Van Acker H, Sass A, Bazzini S, De Roy K, Udine C, Messiaen T et al (2013). Biofilm-grown Burkholderia cepacia complex cells survive antibiotic treatment by avoiding production of reactive oxygen species. PLoS One 8: e58943.
Vanlaere E, LiPuma JJ, Baldwin A, Henry D, De Brandt E, Mahenthiralingam E et al (2008). Burkholderia latens sp. nov., Burkholderia diffusa sp. nov., Burkholderia arboris sp. nov., Burkholderia seminalis sp. nov. and Burkholderia metallica sp. nov., novel species within the Burkholderia cepacia complex. Int J Syst Evol Microbiol 58: 1580-1590.
Winsor GL, Khaira B, Van Rossum T, Lo R, Whiteside MD, Brinkman FSL (2008). The Burkholderia Genome Database: facilitating flexible queries and comparative analyses. Bioinformatics 24: 2803-2804.
Yoder-Himes D, Chain P, Zhu Y, Wurtzel O, Rubin E, Tiedje JM et al (2009). Mapping the Burkholderia cenocepacia niche response via high-throughput sequencing. Proc Natl Acad Sci U S A 106: 3976-3981.
Supplementary Figure and Table Legends
Figure S1: Distribution of differentially expressed genes along the chromosome.
Thick grey circles, sorted according to strain 0901, represent from inner to outer the
chromosomes of strains 0901, DSM 23518, R456 and S9, respectively. The red, green,
blue and black peaks outside the chromosome represent the log2 RPKM values of genes
under CF, apricot, soil and water conditions. Outside the black peaks (water RPKM
values) is a heatmap of gene density per 10 kb along the chromosome, from blue to red.
Figure S2: Full genome alignment among the four strains 0901, DSM 23518, R456
and S9 of Burkholderia seminalis.
Figure S3: Expression pattern clustering based on the normalized RPKM values.
RNA-Seq samples were clustered based on the log2 RPKM values.
Figure S4: Histogram of the number of DNA methylation sites in Burkholderia
seminalis strains 0901, S9, R456 and DSM 23518.
Figure S5: Phylogenetic relationship of the four Burkholderia seminalis strains to other
species of Burkholderia. (a) Maximum-likelihood tree constructed by
MLSA from the four B. seminalis strains sequenced in this study and 28 other
Burkholderia strains. Among these strains, B. seminalis DSM 23518 (= LMG 24067),
B. lata 383, B. thailandensis E264, B. mallei ATCC 23344, B. phymatum STM815, B.
phytofirmans PsJN and B. xenovorans LB400 are type strains. (b) Maximum-likelihood
tree constructed based on whole-genome sequences. The type strains are the same
as in (a).
Figure S6: Correlation coefficient between SE-single sample and SE-mix sample of
strain DSM 23518 based on the log2 RPKM values.
Figure S7: Correlation coefficient between SE-mix sample and W-mix sample of
strain DSM 23518 based on the log2 RPKM values.
Table S1: Physiological characteristics of Burkholderia seminalis strains 0901, S9,
R456 and DSM 23518.
Table S2: Comparison of general genomic features between Burkholderia seminalis
strains 0901, DSM 23518, R456 and S9.
Table S3: Summary of RNA-Seq results (Illumina HiSeq 2000).
Table S4: Integrated information of Burkholderia seminalis strains 0901, S9, R456
and DSM 23518.
Table S5: Average Nucleotide Identity (ANI) among the Burkholderia seminalis
genomes and the selected Burkholderia cenocepacia genomes.
Table S6: COG enrichment results from DEGs. a), strain 0901; b), strain DSM
23518; c), strain S9; d), strain R456.
Table S7: Gene clusters involved in niche adaptation.
Table S8: (a): Primers of qRT-PCR used in this study. (b): Internal primer used in
qRT-PCR and its RPKM values in different strains and conditions.