Title: Experimental warming reveals positive feedbacks to climate change in the Eurasian Steppe

Authors: Ximei Zhang1,2, Eric R Johnston3, Linghao Li4, Konstantinos T Konstantinidis2,3 and Xingguo Han4,5

Affiliations: 1Key Laboratory of Dryland Agriculture, MOA, Institute of Environment and Sustainable Development in Agriculture, Chinese Academy of Agricultural Sciences, Beijing 100081, China. 2School of Biology, 3School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA. 4State Key Laboratory of Vegetation and Environmental Change, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China. 5Institute of Applied Ecology, Chinese Academy of Sciences, Shenyang 110016, China.

Supplementary materials and methods

Shotgun metagenomic sequence pre-processing and annotation

Paired-end shotgun metagenomic sequences were merged using PEAR (Zhang et al., 2014). Successfully merged reads were quality-trimmed using the SolexaQA package (Cox et al., 2010); reads were trimmed where Phred quality scores dropped below 17 (>98% accuracy). Non-mergeable reads were trimmed separately. Reads longer than 80 bp after trimming were retained. FragGeneScan was used for protein prediction (Rho et al., 2010), adopting the Illumina 1% error model. The resulting amino acid (a.a.) sequences were searched against Swiss-Prot (UP consortium, 2014) using blastp (Camacho et al., 2009) (blast+ version 2.2.29, options: -word_size 3, -outfmt 6). Matches with a bit score >75, alignment length >25 a.a., and a.a. identity >40% were deemed acceptable for further analysis. A count matrix (with sample metagenomes as columns and gene annotations or metabolic process categories as rows) was generated to summarize the occurrence of each Swiss-Prot entry in each sample.
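
The hit filtering and count-matrix step described here can be sketched in Python as follows. This is a minimal illustration, not the authors' script: it assumes the default 12-column layout of BLAST+ tabular output (-outfmt 6), and it assumes that each read contributes only its highest-scoring passing hit, a detail the text does not state. The function names and the toy accessions are hypothetical.

```python
import csv
import io
from collections import Counter

# Default column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def passing_hits(handle, min_bits=75.0, min_len=25, min_ident=40.0):
    """Yield (query, subject, bitscore) for hits above all three cut-offs."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue
        rec = dict(zip(COLUMNS, row))
        bits = float(rec["bitscore"])
        if (bits > min_bits and int(rec["length"]) > min_len
                and float(rec["pident"]) > min_ident):
            yield rec["qseqid"], rec["sseqid"], bits


def count_entries(handle):
    """Count reads per Swiss-Prot entry for one sample.

    Assumption (not stated in the text): one best passing hit per read.
    """
    best = {}
    for query, subject, bits in passing_hits(handle):
        if query not in best or bits > best[query][1]:
            best[query] = (subject, bits)
    return Counter(subject for subject, _ in best.values())


def count_matrix(per_sample):
    """Arrange counts with entries as rows and samples as columns."""
    samples = sorted(per_sample)
    entries = sorted({e for counts in per_sample.values() for e in counts})
    rows = {e: [per_sample[s].get(e, 0) for s in samples] for e in entries}
    return samples, rows


def toy_line(query, subject, pident, length, bits):
    """Build one hypothetical tabular hit line for demonstration only."""
    fields = [query, subject, pident, length, 0, 0, 1, length, 1, length,
              "1e-10", bits]
    return "\t".join(str(f) for f in fields)


toy_hits = "\n".join([
    toy_line("read1", "P00001", 55.0, 60, 90.0),   # passes
    toy_line("read1", "P00002", 48.0, 50, 80.0),   # passes, lower bit score
    toy_line("read2", "P00001", 35.0, 70, 95.0),   # fails identity
    toy_line("read3", "P00002", 62.0, 20, 100.0),  # fails alignment length
    toy_line("read4", "P00002", 70.0, 40, 76.0),   # passes
])
toy_counts = count_entries(io.StringIO(toy_hits))
```

In practice `count_entries` would be called once per sample on that sample's blastp output, and the per-sample counters passed to `count_matrix` as a dictionary keyed by sample ID.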

The corresponding Gene Ontology annotations of functions and processes for each Swiss-Prot entry were obtained from the uniprot_sprot.dat file provided at http://www.uniprot.org/downloads (downloaded in July 2014). A count matrix of summarized Gene Ontology pathways was processed with the DESeq2 package (Love et al., 2014) to identify differentially abundant pathways between the treatments and to generate log2 transformations of gene/process abundance ratios. The raw count data underwent a variance-stabilizing transformation, which is used for logarithmically distributed count data with low mean values that tend to have high variance. This transformation yields values with a relatively constant variance across the range of mean values and confers a reduced false-positive rate for less abundant genes (Anders & Huber, 2010). P-values of pathways for SOM catabolism and S and N cycling were adjusted to account for the false discovery rate from multiple testing using the Benjamini–Hochberg correction (adjusted P-values; Benjamini and Hochberg, 1995).
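
For readers unfamiliar with the Benjamini–Hochberg adjustment, the sketch below shows the standard step-up calculation in plain Python. It is illustrative only: the analysis described above relied on DESeq2, and the function name and example P-values here are hypothetical. Each P-value is multiplied by m/rank (m tests, rank counted from the smallest P-value), and the results are then made monotone from the largest rank downwards.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value to the smallest so that each adjusted
    # value never exceeds the adjusted value of the next-larger P-value.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P-values for four pathways.
example_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Pathways whose adjusted P-value falls below the chosen false discovery rate threshold would then be reported as differentially abundant.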

Taxonomic composition analysis

Using both merged and non-merged reads >80 bp after trimming, the relative abundance of various prokaryotic phyla was determined. To achieve this, candidate 16S rRNA sequences were first identified by searching reads from entire metagenomes against the May 2013 release of the Greengenes 16S ribosomal database (DeSantis et al., 2006) pre-clustered at 88% identity, using blastn (blast+ version 2.2.29, options: -word_size 16, -outfmt 6, -task blastn, -dust no). Matches with a bit score >45 were used in a subsequent filtering process. Various scripts in QIIME were used for the majority of the remaining workflow (Caporaso et al., 2010). Candidate 16S rRNA sequences underwent closed-reference OTU (Operational Taxonomic Unit) picking against the Greengenes database pre-clustered at 99% (options: -m uclust_ref -s 0.97) (Edgar, 2010). Taxonomic annotations for each representative sequence provided by Greengenes were used for each resulting OTU. The phylum abundances for each sample were then summarized using this information.
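
The final summarization step can be illustrated with a short Python sketch. It assumes Greengenes-style lineage strings, in which ranks are separated by semicolons and the phylum carries a `p__` prefix, and it uses hypothetical OTU identifiers and counts. The workflow described above used QIIME scripts for this step; the function names below are illustrative and are not part of QIIME.

```python
from collections import defaultdict


def phylum_of(taxonomy):
    """Extract the phylum label from a Greengenes-style lineage string."""
    for rank in taxonomy.split(";"):
        rank = rank.strip()
        # An empty "p__" means the lineage is unresolved at phylum level.
        if rank.startswith("p__") and len(rank) > 3:
            return rank[3:]
    return "Unassigned"


def phylum_relative_abundance(otu_counts, otu_taxonomy):
    """Sum OTU counts per phylum and convert them to proportions."""
    totals = defaultdict(int)
    for otu, count in otu_counts.items():
        totals[phylum_of(otu_taxonomy.get(otu, ""))] += count
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {phylum: n / grand_total for phylum, n in totals.items()}


# Hypothetical OTU table for a single sample.
toy_otu_counts = {"otu_1": 6, "otu_2": 3, "otu_3": 1}
toy_otu_taxonomy = {
    "otu_1": "k__Bacteria; p__Proteobacteria; c__Alphaproteobacteria",
    "otu_2": "k__Bacteria; p__Actinobacteria; c__Actinobacteria",
    "otu_3": "k__Bacteria; p__",
}
toy_phyla = phylum_relative_abundance(toy_otu_counts, toy_otu_taxonomy)
```

Running the same function on each sample's OTU counts gives one column of phylum proportions per sample.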

To estimate the abundance of fungi relative to prokaryotes, these reads were also aligned to the Silva database (Quast et al., 2013) using similar procedures. In particular, 18S rRNA genes were used as references for fungi, and 16S rRNA genes for archaea and bacteria. The percentage of reads belonging to each of the three groups was taken as its relative abundance. Bacteria were found to make up ~94% of reads, and thus we further calculated bacterial taxonomic richness by sampling an equal number of sequences from each sample.
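
Sampling an equal number of sequences before comparing richness is commonly called rarefaction. The sketch below shows one simple way to do it in Python, under two assumptions that the text does not specify: the common depth is set to the size of the smallest sample, and sequences are drawn without replacement. Function names, sample names, and counts are hypothetical.

```python
import random


def rarefied_richness(otu_counts, depth, seed=0):
    """Count distinct OTUs among `depth` sequences drawn without replacement."""
    # Expand the count table into one list entry per sequence.
    pool = [otu for otu, n in otu_counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("sample has fewer sequences than the requested depth")
    rng = random.Random(seed)
    return len(set(rng.sample(pool, depth)))


def richness_at_common_depth(samples, seed=0):
    """Rarefy every sample to the size of the smallest one (assumption)."""
    depth = min(sum(counts.values()) for counts in samples.values())
    richness = {name: rarefied_richness(counts, depth, seed)
                for name, counts in samples.items()}
    return depth, richness


# Hypothetical bacterial OTU counts for two plots.
toy_samples = {
    "plot_a": {"otu_1": 5, "otu_2": 5},
    "plot_b": {"otu_1": 2, "otu_2": 1, "otu_3": 1},
}
toy_depth, toy_richness = richness_at_common_depth(toy_samples)
```

Because the draw is random, repeating it with several seeds and averaging the richness values gives a more stable estimate than a single draw.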

Supplementary references

Anders S, Huber W. (2010). Differential expression analysis for sequence count data. Genome Biol 11: 106.

Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K et al. (2009). BLAST plus: architecture and applications. BMC Bioinformatics 10: 421.

Caporaso JG, Kuczynski J, Stombaugh J, Bittinger K, Bushman FD, Costello EK et al. (2010). QIIME allows analysis of high-throughput community sequencing data. Nat Methods 7: 335–336.

Cox MP, Peterson DA, Biggs PJ. (2010). SolexaQA: At-a-glance quality assessment of Illumina second-generation sequencing data. BMC Bioinformatics 11: 485.

Edgar RC. (2010). Search and clustering orders of magnitude faster than BLAST. Bioinformatics 26: 2460–2461.

Rho M, Tang H, Ye Y. (2010). FragGeneScan: predicting genes in short and error-prone reads. Nucleic Acids Res 38: e191.

UniProt (UP) consortium. (2014). Activities at the universal protein resource (UniProt). Nucleic Acids Res 42: D191–D198.

Zhang J, Kobert K, Flouri T, Stamatakis A. (2014). PEAR: a fast and accurate Illumina Paired-End reAd mergeR. Bioinformatics 30: 614–620.

Supplementary Table S1. Information pertaining to the merge-ability, sequence quality, and number of annotations made for the shotgun metagenome datasets representing each soil sample.

Watering  Warming  Sample ID  Number of raw  Merged reads     Unique nonmerged      Total reads      Annotations
                              sequences      used in blastp   reads used in blastp  used in blastp   made
0         0        ZXM-2      6884555        6595856          124994                6720850          974720
0         0        ZXM-3      7819079        6888224          518126                7406350          1148176
0         0        ZXM-4      9197718        6848079          1313986               8162065          1169287
0         0        ZXM-5      10170980       5698469          1967701               7666170          1096956
0         0        ZXM-7      14151791       10502887         723359                11226246         1632378
0         1        ZXM-8      12581286       7014906          2393488               9408394          1357400
0         1        ZXM-9      11380278       5587307          2934746               8522053          1182923
0         1        ZXM-10     13011134       9499393          806617                10306010         1522940
0         1        ZXM-11     12353161       9569367          529014                10098381         1475441
0         1        ZXM-12     13543460       9311805          1231322               10543127         1557768
0         1        ZXM-13     30159789       17040454         6098582               23139036         3091273
1         0        ZXM-14     13486585       9295828          986281                10282109         1575095
1         0        ZXM-16     11865752       9391361          495893                9887254          1414182
1         0        ZXM-17     13550713       10890587         365306                11255893         1611091
1         0        ZXM-18     11252495       8517501          839109                9356610          1364387
1         0        ZXM-19     11152412       8935248          391014                9326262          1255115
1         0        ZXM-20     9999200        7826047          535233                8361280          1321991
1         1        ZXM-21     8657718        7674047          224799                7898846          1233259
1         1        ZXM-22     8483799        7500108          244559                7744667          1239844
1         1        ZXM-23     8539432        7404922          333403                7738325          1210705
1         1        ZXM-24     9112796        7943643          361350                8304993          1290767
1         1        ZXM-25     12553267       8710502          1977398               10687900         1595095
1         1        ZXM-26     14116212       9027687          3090373               12118060         1609336
Mean                          11914070       8594532          1238550               9833082          1431745
(standard deviation)          (4532103)      (2319260)        (1372003)             (3236203)        (407992)

The 0 and 1 for watering/warming represent without and with the treatment, respectively.

Supplementary Figure S1. The spatial distribution of plots in the three blocks. C, T, W and WT denote control, warming, watering and warming plus watering, respectively.

Supplementary Figure S2. Curves representing soil microbial community complexity estimations as determined by Nonpareil. Nonpareil is a statistical tool that uses read redundancy to estimate dataset complexity and the amount of sequencing effort needed to achieve a desired level of coverage. Circles on the curves mark the coverage achieved at the actual sequencing depth of each supplied dataset, relative to the entire curve. Curves positioned further to the right represent more sequence-diverse metagenomes than curves positioned to the left.

Supplementary Figure S3. Correlation between changes in microbial community and plant community compositions as an effect of the treatments. The graph plots the primary axis from a PCoA of Mash-based distances among microbial communities (y-axis) against the primary axis from a PCoA of Bray-Curtis distances among the corresponding plant communities (x-axis) for the same experimental plots. The proportion of variation explained by the plant Bray-Curtis PCoA axis 1 is 37.2% and by the metagenomic Mash PCoA axis 1 is 7.5%.

Supplementary Figure S4. Effect of experimental treatments on the phylogenetic composition of soil microbial communities. The graph represents the results of the PCoA analysis of the weighted UniFrac distances among the communities sampled based on 16S rRNA gene amplicon-based OTUs. The proportion of variation explained by PCoA axes 1 and 2 is 14.11% and 9.82%, respectively.

Supplementary Figure S5. The similar effects of different experimental treatments on the taxonomic (a, b) and functional (c, d) structure of microbial communities.
