Inferring microbial ecosystem function from community structure

Jeff S. Bowman and Hugh W. Ducklow

Lamont-Doherty Earth Observatory at Columbia [email protected] | www.polarmicrobes.org

Introduction and Motivation

Marine microbes play a central role in the sustainability of the global ocean by mediating the flow of carbon and nutrients through the marine system. Ecologists commonly study the structure and composition of marine microbial communities by analyzing the 16S rRNA gene. Although these data are well suited to evaluating differences between communities, and to correlating community structure with environmental parameters (e.g. chlorophyll concentration, temperature, salinity), they are less well suited to describing the ecosystem functions (i.e. metabolic functions) of these communities. Metagenomics and other techniques can bridge the gap between microbial community structure and ecosystem function, but these techniques are costly, data intensive, and low throughput.

Our goal was to develop a high-throughput method for inferring community metabolism from community taxonomy. By evaluating metabolic structure in place of community structure we capture key inter-sample relationships and their impact on microbial ecosystem function. Our method produces pathway genome databases (PGDBs) that describe the metabolic pathways likely to be present in the sample. These PGDBs are amenable to flux-based metabolic modeling. Future work will focus on predicting the flow of elements and energy through these pathways, providing a way to model the impact of changing community structure on biogeochemical cycles.

Here we apply our method to a seasonally variable, depth-stratified microbial community from the West Antarctic Peninsula, a region undergoing unprecedented environmental change.

[Fig. 1 flowchart boxes: 16S sequence library (the bigger the better!); obtain all completed genomes; build 16S rRNA reference tree; find consensus genome for each tree node; place reads on reference tree; extract pathways for each placement; generate confidence score for sample; predict metabolic pathways; calculate confidence for each node; evaluate genomic plasticity for terminal nodes; evaluate relative core genome size.]

Fig. 1. Methods. Our metabolic inference pipeline, PAPRICA [1], uses a phylogenetic placement program (pplacer) [2] to place query reads on a reference tree of 16S rRNA genes from all completed genomes. We determine a consensus genome for each point of placement on the tree, and determine the metabolic pathways represented in these genomes. Separately, we determine a confidence score for each point of placement on the reference tree from a novel indicator of genomic stability.
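The placement-to-pathway step described in Fig. 1 can be illustrated with a small sketch. This is not the PAPRICA implementation: the genome names, pathway sets, tree nodes, and read counts below are hypothetical toy data, and treating the "consensus genome" as the strict intersection of pathway sets across a node's descendant genomes is an assumption made for illustration.

```python
from collections import Counter

# Hypothetical toy data: pathways annotated in each completed genome.
GENOME_PATHWAYS = {
    "genome_A": {"glycolysis", "urea degradation I", "nitrate reduction I"},
    "genome_B": {"glycolysis", "urea degradation I"},
    "genome_C": {"glycolysis", "thymine degradation"},
}

# Hypothetical reference tree: each placement point maps to the genomes below it.
NODE_GENOMES = {
    "node_AB": ["genome_A", "genome_B"],               # internal node
    "node_ABC": ["genome_A", "genome_B", "genome_C"],  # internal node
    "genome_C": ["genome_C"],                          # terminal node
}


def consensus_pathways(node):
    """Pathways shared by every genome descending from a node (assumed rule)."""
    pathway_sets = [GENOME_PATHWAYS[g] for g in NODE_GENOMES[node]]
    return set.intersection(*pathway_sets)


def pathway_abundance(placements):
    """Add each node's read count to every pathway in its consensus set."""
    abundance = Counter()
    for node, n_reads in placements.items():
        for pathway in consensus_pathways(node):
            abundance[pathway] += n_reads
    return abundance


# Hypothetical placement counts for one sample (reads per placement point).
placements = {"node_AB": 10, "genome_C": 5, "node_ABC": 2}
sample_pathways = pathway_abundance(placements)
```

In this toy example, reads placed deep in the tree (node_ABC) contribute only to pathways shared by all three genomes, which mirrors the idea that placements to internal nodes support fewer, more conserved pathways.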

[Fig. 2 diagram labels: Terminal Node, Internal Node, Core genome, Accessory genome.]

$$c = \frac{S_{core}}{S_{clade}} \, (1 - \phi)$$

Fig. 2. Confidence score. Placements can be made to terminal and internal nodes. To determine the confidence ($c$) of a metabolic inference for a given placement we consider the core genome size ($S_{core}$), the mean genome size of the clade ($S_{clade}$), and the mean index of plasticity for the clade ($\phi$; Fig. 3).
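As a worked illustration of the confidence score, the sketch below assumes the score takes the form c = (S_core / S_clade) × (1 − φ), with S_clade and φ taken as clade means as the caption describes. The function name, argument names, and example numbers are hypothetical, and the genome sizes are assumed to be in the same units as the core genome size.

```python
from statistics import mean


def placement_confidence(core_size, clade_genome_sizes, clade_plasticity):
    """Confidence for one placement, assuming c = (S_core / S_clade) * (1 - phi).

    core_size          -- size of the core genome shared by the clade
    clade_genome_sizes -- genome sizes of the clade members (same units)
    clade_plasticity   -- index of plasticity for each clade member (0 to 1)
    """
    s_clade = mean(clade_genome_sizes)  # mean genome size of the clade
    phi = mean(clade_plasticity)        # mean index of plasticity
    return (core_size / s_clade) * (1.0 - phi)


# Hypothetical clade: two genomes sharing half of their mean genome size.
c = placement_confidence(
    core_size=1500,
    clade_genome_sizes=[2800, 3200],
    clade_plasticity=[0.1, 0.3],
)
```

Under this form, confidence falls both when the core genome is a small fraction of the typical genome in the clade and when the clade's genomes are unusually plastic.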

Fig. 3. Genomic plasticity of genomes in our database. A major impediment to accurate metabolic inference is the genetic diversity that can exist within even a narrow taxonomic clade. We developed a confidence metric for our inferred metabolisms that is based on the degree of genomic plasticity inherent to each genome. The x-axis gives the position of each genome on our reference tree; the y-axis gives the degree of plasticity. Unusually plastic genomes are indicated by Roman numerals: I) Nanoarchaeum equitans, II) the Mycobacteria, III) a butyrate-producing bacterium within the Clostridium, IV) Candidatus Hodgkinia cicadicola, V) the Mycoplasma, VI) Sulcia muelleri, VII) Portiera aleyrodidarum, VIII) Buchnera aphidicola, IX) the Oxalobacteraceae.

[Fig. 3 plot. X-axis: Terminal node (0 to 2500). Y-axis: Relative plasticity (0.0 to 1.0). Points labeled I to IX mark the unusually plastic genomes.]

Fig. 4. Sample locations within the Palmer LTER off the WAP (left) and inter-sample similarity (right). The location of Palmer Station is given by the star. Summer surface and deep samples along with winter surface samples were analyzed [3]. A) Hierarchical clustering of samples by metabolic structure. B) Hierarchical clustering of samples by taxonomic structure. Note duplicate samples in both A and B. C) Distances between samples are in good agreement between the two methods (R² = 0.70). D) Distances are also correlated, albeit less well (R² = 0.40), when using the alternative metabolic inference approach PICRUSt [4].

[Fig. 4 panels. Map: sampling sites NW, NE, SW, and SE off the WAP. A: Clustering by pathway abundance, this method (y-axis: Height). B: Clustering by edge abundance, this method (y-axis: Height). Dendrogram leaves are labeled by season, site, depth, and replicate (e.g. summer_nw_deep_b.1, winter_ne_shallow_b.2). Scatter plots (C, D): distance by edge abundance vs. distance by pathway abundance, and distance by OTU abundance vs. distance by pathway abundance. Legend: Surface, Deep, Winter surface; This method, R² = 0.70; PICRUSt, R² = 0.40.]
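The comparison in Fig. 4 between distances computed from metabolic structure and from taxonomic structure can be sketched as follows. The poster does not state which distance metric was used, so Bray-Curtis dissimilarity is an assumption here, and the sample names and abundance profiles are hypothetical toy data. The hierarchical clustering step is not shown.

```python
from itertools import combinations
from statistics import mean


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors (assumed metric)."""
    return sum(abs(a - b) for a, b in zip(u, v)) / sum(a + b for a, b in zip(u, v))


def pairwise_distances(profiles):
    """Distances for every pair of samples, in a fixed (sorted) sample order."""
    names = sorted(profiles)
    return [bray_curtis(profiles[a], profiles[b]) for a, b in combinations(names, 2)]


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


# Hypothetical profiles: pathway abundances and edge (placement) abundances.
pathway_profiles = {"s1": [10, 5, 1], "s2": [9, 6, 1], "s3": [2, 3, 9], "s4": [1, 4, 10]}
edge_profiles = {"s1": [20, 1, 0, 3], "s2": [18, 2, 1, 3], "s3": [1, 2, 15, 6], "s4": [0, 3, 16, 5]}

d_pathway = pairwise_distances(pathway_profiles)
d_edge = pairwise_distances(edge_profiles)
r2 = r_squared(d_pathway, d_edge)
```

Because both distance lists are built over the same sorted sample pairs, each point in the resulting correlation corresponds to one pair of samples, as in the scatter plots.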

Key Points

• Microbial communities can be described by their metabolic structure.
• Metabolic structure provides information on potential microbial ecosystem functions.
• Representing a microbial community by metabolic structure may provide a way to model the flow of elements and energy through the community.

1. Bowman, J. S., and H. W. Ducklow. 2015. Microbial Communities Can Be Described by Metabolic Structure: A General Framework and Application to a Seasonally Variable, Depth-Stratified Microbial Community from the Coastal West Antarctic Peninsula. PLoS ONE, 10(8): e0135868.
2. Matsen, F., R. Kodner, and E. Armbrust. 2010. pplacer: Linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree. BMC Bioinformatics, 11: 538.
3. Luria, C., H. Ducklow, and L. Amaral-Zettler. 2014. Marine bacterial, archaeal and eukaryotic diversity and community structure on the continental shelf of the western Antarctic Peninsula. Aquatic Microbial Ecology, 73(2): 107-121.
4. Langille, M. G. I., et al. 2013. Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences. Nature Biotechnology, 31(9): 814-821.

[Fig. 5 plot. X-axis: anomaly (−0.6 to 0.6); negative values are labeled "Enriched in surface" and positive values "Enriched in deep and winter". Color scale: p-value, from 0.05 to 4.57 x 10^-5. Pathway categories in the legend: Key intracellular metabolism, Anaerobic metabolism, Nitrogen degradation, Carbon degradation, C1 metabolism, Autotrophy, Mercury degradation. Pathways shown:]

pyruvate fermentation to lactate
phosphonoacetate degradation
adenosine nucleotides degradation III
creatinine degradation II
D-galacturonate degradation I
triacylglycerol degradation
allantoin degradation to ureidoglycolate I (urea producing)
nitrate reduction I (denitrification)
oxalate degradation II
sucrose degradation IV (sucrose phosphorylase)
galactose degradation I (Leloir pathway)
threonine degradation I
S-methyl-5-thio-alpha-D-ribose 1-phosphate degradation
nitrate reduction IV (dissimilatory)
taurine degradation IV
cholesterol degradation to androstenedione II (cholesterol dehydrogenase)
sitosterol degradation to androstenedione
reactive oxygen species degradation (mammalian)
alkylnitronates degradation
reductive monocarboxylic acid cycle
trehalose degradation VI (periplasmic)
arginine degradation III (arginine decarboxylase/agmatinase pathway)
propionyl CoA degradation
phenylmercury acetate degradation
thymine degradation
glutamate degradation I
uracil degradation I (reductive)
ethanol degradation IV
threonine degradation III (to methylglyoxal)
formaldehyde oxidation II (glutathione-dependent)
ethanol degradation II
valine degradation II
S-methyl-5'-thioadenosine degradation II
guanosine nucleotides degradation III
formate oxidation to CO2
pyrimidine deoxyribonucleosides degradation
2'-deoxy-alpha-D-ribose 1-phosphate degradation
methylglyoxal degradation II
glutamate degradation X
glucose and glucose-1-phosphate degradation
glycogen degradation I
urate biosynthesis/inosine 5'-phosphate degradation
pseudouridine degradation
phenylacetate degradation I (aerobic)
D-mannose degradation
urea degradation I
methionine degradation I (to homocysteine)
aspartate degradation I
citrulline degradation
glutamine degradation I

Columbia / Kiel University Sustainable Oceans Symposium

Fig. 5. What metabolic pathways are differentially present between summer surface samples and winter and deep samples? Having determined that the relationship between samples can be accurately represented by metabolic structure, we can begin to ask ecologically relevant questions. A frequent question posed to community structure data is how metabolisms are partitioned between niches. In the figure, color gives the p-value for a Mann-Whitney test between sample groups (summer surface vs. summer deep and winter surface). The x-axis gives the anomaly, calculated as the difference in sample group means divided by the sum of the sample group means.
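The anomaly and the test statistic described in the Fig. 5 caption can be sketched in a few lines. The pathway abundances below are hypothetical, and the sign convention (negative values meaning enrichment in surface samples) is an assumption. Only the Mann-Whitney U statistic is computed here; obtaining a p-value requires the distribution of U, for which a statistics library such as SciPy (scipy.stats.mannwhitneyu) would normally be used.

```python
from statistics import mean


def anomaly(surface, deep_winter):
    """Difference in group means divided by the sum of group means.

    Assumed sign convention: negative = enriched in surface samples,
    positive = enriched in deep and winter samples.
    """
    mean_s, mean_d = mean(surface), mean(deep_winter)
    return (mean_d - mean_s) / (mean_s + mean_d)


def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: pairs where a > b count 1, ties count 0.5."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical abundances of one pathway in each sample group.
surface = [8, 9, 10, 11]
deep_winter = [2, 3, 4, 3]

pathway_anomaly = anomaly(surface, deep_winter)
u_surface = mann_whitney_u(surface, deep_winter)
```

Dividing by the sum of the group means bounds the anomaly between −1 and 1 for non-negative abundances, so pathways with very different overall abundances can share one axis.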
