MACROECOLOGICAL BHPMF – a hierarchical Bayesian approach ...
Transcript of MACROECOLOGICAL BHPMF – a hierarchical Bayesian approach ...
MACROECOLOGICALMETHODS
BHPMF – a hierarchical Bayesianapproach to gap-filling and traitprediction for macroecology andfunctional biogeographyFranziska Schrodt1,2,3,*, Jens Kattge1,2, Hanhuai Shan4,5, Farideh Fazayeli4,
Julia Joswig1, Arindam Banerjee4, Markus Reichstein1, Gerhard Bönisch1,
Sandra Díaz6, John Dickie7, Andy Gillison8, Anuj Karpatne4, Sandra Lavorel9,
Paul Leadley10, Christian B. Wirth2,11, Ian J. Wright12, S. Joseph Wright13 and
Peter B. Reich3,14
1Max Planck Institute for Biogeochemistry,
Hans-Knöll-Strasse 10, 07745 Jena, Germany,2German Centre for Integrative Biodiversity
Research (iDiv) Halle-Jena-Leipzig, Deutscher
Platz 5, 04103 Leipzig, Germany, 3Department
of Forest Resources, University of Minnesota,
St Paul, MN 55108, USA, 4Department of
Computer Science and Engineering, University
of Minnesota, Twin Cities, USA, 5Microsoft
Corporation, One Microsoft Way, Redmond,
WA 98052, USA, 6Instituto Multidisciplinario
de Biología Vegetal (IMBIV – CONICET) and
Departamento de Diversidad Biológica y
Ecología, FCEFyN, Universidad Nacional de
Córdoba, CC 495, 5000, Córdoba, Argentina,7Royal Botanic Gardens Kew, Wakehurst Place,
RH17 6TN, UK, 8Center for Biodiversity
Management, Yungaburra 4884, Queensland,
Australia, 9Centre National de la Recherche
Scientifique, Grenoble, France, 10Laboratoire
ESE, Université Paris-Sud, UMR 8079 CNRS,
UOS, AgroParisTech, 91405 Orsay, France,11University of Leipzig, Leipzig, Germany,12Department of Biological Sciences,
Macquarie University, NSW 2109, Australia,13Smithsonian Tropical Research Institute,
Apartado 0843-03092, Balboa, Republic of
Panama, 14Hawkesbury Institute for the
Environment, University of Western Sydney,
Locked Bag 1797, Penrith, NSW 2751
Australia
ABSTRACT
Aim Functional traits of organisms are key to understanding and predicting bio-diversity and ecological change, which motivates continuous collection of traits andtheir integration into global databases. Such trait matrices are inherently sparse,severely limiting their usefulness for further analyses. On the other hand, traits arecharacterized by the phylogenetic trait signal, trait–trait correlations and environ-mental constraints, all of which provide information that could be used to statis-tically fill gaps. We propose the application of probabilistic models which, for thefirst time, utilize all three characteristics to fill gaps in trait databases and predicttrait values at larger spatial scales.
Innovation For this purpose we introduce BHPMF, a hierarchical Bayesianextension of probabilistic matrix factorization (PMF). PMF is a machine learningtechnique which exploits the correlation structure of sparse matrices to imputemissing entries. BHPMF additionally utilizes the taxonomic hierarchy for traitprediction and provides uncertainty estimates for each imputation. In combinationwith multiple regression against environmental information, BHPMF allows forextrapolation from point measurements to larger spatial scales. We demonstrate theapplicability of BHPMF in ecological contexts, using different plant functional traitdatasets, also comparing results to taking the species mean and PMF.
Main conclusions Sensitivity analyses validate the robustness and accuracy ofBHPMF: our method captures the correlation structure of the trait matrix as wellas the phylogenetic trait signal – also for extremely sparse trait matrices – andprovides a robust measure of confidence in prediction accuracy for each missingentry. The combination of BHPMF with environmental constraints provides apromising concept to extrapolate traits beyond sampled regions, accounting forintraspecific trait variability. We conclude that BHPMF and its derivatives have ahigh potential to support future trait-based research in macroecology and func-tional biogeography.
KeywordsBayesian hierarchical model, gap-filling, imputation, machine learning, matrixfactorization, PFT, plant functional trait, sparse matrix, spatial extrapolation,TRY.
*Correspondence: Franziska Schrodt, MaxPlanck Institute for Biogeochemistry,Hans-Knöll-Strasse 10, 07745 Jena, Germany.E-mail: [email protected]
bs_bs_banner
Global Ecology and Biogeography, (Global Ecol. Biogeogr.) (2015)
© 2015 John Wiley & Sons Ltd DOI: 10.1111/geb.12335http://wileyonlinelibrary.com/journal/geb 1
INTRODUCTION
Functional trait measurements and analyses have been the focus
of numerous studies in recent decades (e.g. Reich et al., 1997;
Wright et al., 2004; Chave et al., 2009; Schrodt et al., 2015).
However, due to the time and resources required and the sheer
number of species on earth, only a small number of species and
their traits could be captured to date, especially in tropical and
remote ecosystems. In addition, trait data are highly dispersed
among numerous datasets and are often not accessible to the
wider scientific community. The integration of databases is thus
becoming increasingly important for the consolidation of glob-
ally dispersed data, as a source of standardized data for further
applications, such as model building and validation, and to
coordinate future measurement efforts.
Combining trait observations from studies with different
research foci produces matrices with substantial gaps. For
example, the largest database for plant traits to date, TRY
(Kattge et al., 2011), currently contains 215 datasets with 5.6
million trait entries for 1100 traits of 2 million individuals,
representing 100,000 plant species. On average only 2 of the
1100 traits represented in TRY are measured for any individual,
restricting the usefulness of combined datasets especially for
multivariate analyses.
General characteristics of traits
Some characteristics inherent to functional traits may support
statistical gap-filling of sparse trait matrices: a strong
phylogenetic trait signal, functional and structural trade-offs
between traits and trait–environment relationships.
The phylogenetic trait signal is an effective aid in predicting
trait values (e.g. Lovette & Hochachka, 2006; Swenson, 2014):
the closer two individuals are related, the more similar their
traits will be – with exceptions due to convergent evolution and
environmental diversification. This phylogenetic trait signal is
reflected in the taxonomic hierarchy: on average individuals
within a species are more similar than individuals within genera,
families or phylogenetic groups (Kerkhoff et al., 2006; Swenson
& Enquist, 2007). In general, most of the trait variance is
observed between species (Kattge et al., 2011), differences
between species are consistent across spatial scales (Kazakou
et al., 2014) and mean trait values at native ranges are appropri-
ate estimates at invaded areas (McMahon, 2002; Ordonez,
2014). Due to these frequently observed patterns, mean trait
values of species and even genera are often used to fill gaps in
trait matrices (Fried et al., 2012; Cordlandwehr et al., 2013).
Traits possessed by any individual are not independent –
functional and structural trade-offs cause correlations between
traits (Reich et al., 1997). This has been characterized at the
global scale, for example for the leaf and wood economic
spectra, respectively (Wright et al., 2004; Chave et al., 2009).
Trait variability is also influenced by the environment which, on
the one hand, causes large-scale patterns, for example latitudinal
gradients of leaf N/P stoichiometry (Reich & Oleksyn, 2004) or
changes in the competitive success of Drosophila species along a
temperature gradient (Davis et al., 1998), and on the other hand
small-scale intraspecific variation, for example in plant traits
(Albert et al., 2010; Clark, 2010) and mammalian body size
(Diniz-Filho et al., 2007).
Gap-filling in trait ecology
Many disciplines, for example psychology (e.g. Schafer &
Graham, 2002), sociology (e.g. Johnson & Young, 2011) or
biogeochemistry (e.g. Moffat et al., 2007; Lasslop et al., 2010)
have developed their own set of accepted techniques for predict-
ing missing values tailored to their data structure. In trait
ecology, due to the relatively recent advent of large databases,
gap-filling methods have been adopted from other disciplines,
often without adjusting them to differences in data structure. In
general three approaches are used within the community of trait
ecologists: (1) deleting rows with missing cases or pair-wise
analysis; (2) predicting trait values based on the taxonomic trait
signal at species or genus level (mean traits); and (3) predicting
values for individual missing entries based on structure, e.g.
multiple imputation (MI) (Rubin, 1987; Su et al., 2011) and
multivariate imputation using chained equations (MICE)
(Rubin, 1987; van Buuren & Groothuis-Oudshoorn, 2011),
or/and spatial correlations within the trait matrix (e.g. kriging;
Lamsal et al., 2012; Räty & Kangas, 2012). These approaches
provide useful imputations in some cases, but whilst deleting
missing cases can introduce bias in model parameters
(Nakagawa & Freckleton, 2008), taking the ‘mean’ adds new data
points without adding new information, which results in incor-
rect confidence limits. The techniques mentioned in (3) on the
other hand were developed to fill gaps in databases with 10–30%
missing entries, but not for a sparsity as high as that observed in
combined trait databases such as TRY.
A recent exercise comparing different approaches (amongst
them ‘mean’ and MICE) to filling gaps in a plant trait matrix –
although at the species rather than the individual level –showed
that all methods were indeed only effective up to 30% gaps, that
genus mean produced the least reliable results and that includ-
ing ecological theory, for example by taking into account trait–
trait correlations, will substantially improve the accuracy of gap-
filling approaches (Taugourdeau et al., 2014). Ogle (2013)
developed a hierarchical Bayesian model, which could be used
for gap-filling of individual trait values based on taxonomic
hierarchy and covariates. However, this method only allows for
one trait to be predicted at a time, ignoring the correlation
structure within a multi-trait matrix. In contrast, some
phylogenetic imputation methods, as reviewed recently by
Swenson (2014), do allow for multiple trait predictions but only
give species or higher-order means and lack any accounting for
intraspecific variability. While this may be resolved, for example
by adding ‘twigs’ on the ends of phylogenies with terminal nodes
as species (Swenson, pers. comm.) to the authors’ knowledge,
this has not been implemented yet.
Here we present a new approach – Bayesian hierarchical
probabilistic matrix factorization (BHPMF) – which imputes
trait values based on the taxonomic hierarchy, structure within
F. Schrodt et al.
Global Ecology and Biogeography, © 2015 John Wiley & Sons Ltd2
the trait matrix and trait–environment relationships at the same
time as providing uncertainty estimates for each single trait
prediction. An extension of the method provides a concept for
out-of-sample predictions – the extrapolation of point measure-
ments to spatial scales beyond measured areas. We evaluate trait
predictions by BHPMF under various aspects and provide per-
spectives for future developments.
METHODS
Probabilistic matrix factorization
The method we present is a development of probabilistic matrix
factorization (PMF) (Salakhutdinov & Mnih, 2008). PMF is a
recommendation system developed on the example of predict-
ing users’ preferences in movies from other users’ movie ratings
(Netflix, 2009). Due to its good scalability and predictive accu-
racy, even for highly sparse datasets, PMF has become a standard
technique for imputing missing data (Koren et al., 2009).
PMF models a sparse matrix, such as the TRY database, as the
scalar (or inner, dot) product of two latent matrices with the aim
of finding a factorization that minimizes the error between pre-
dicted and observed data. This technique is closely related to
principal components analysis (PCA), which converts a set of
observations of correlated variables into a set of values of lin-
early uncorrelated variables called principal components. Like
PCA, PMF is efficient if the original matrix is of low rank, i.e. if
the axes of the original matrix provide strong correlations.
In the original case, each column represents a video and each
row a user, providing a video ranking (Salakhutdinov & Mnih,
2008). In the case of a trait matrix, videos are replaced by traits
and users by entities. In our case, entities are defined as individ-
ual plants. They could equally represent the level of an organ or
the average over different organisms, for example nitrogen
content of a single leaf, motility of a phytoplankton species or
the average beak length of several individuals of an avian species.
For simplicity, we refer to the individual plant as plant from now
on.
In a first step, independent latent vectors are generated over
each row (individual, u) and column (trait, v) of the plant × trait
matrix X N M∈ ×R with N rows and M columns (Fig. 1). Any
missing entry (n, m) in the original matrix X can be predicted as
the inner product of these latent vectors xnm = ⟨un, vm⟩ (Fig. 1).
PMF has been shown to be applicable to biological data, for
example in population genetics (Duforet-Frebourg and Blum,
2014). However, our first experiments indicated that for plant
traits the prediction accuracy of PMF was insufficient: the accu-
racy was worse than using species mean trait values to fill the
gaps (see Results). We therefore developed an extension of PMF,
which accounts for a plants’ taxonomic hierarchy to improve
prediction accuracy (Bayesian hierarchical PMF; BHPMF). The
concept was developed as HPMF in the context of machine
learning (Shan et al., 2012). We here introduce the additional
application of a Gibbs sampler in order to provide a measure of
uncertainty for each imputed trait value (BHPMF) (see ‘Gibbs
sampler – uncertainty quantified trait prediction’), as well as an
extension to facilitate out-of-sample predictions (aHPMF).
Bayesian hierarchical probabilisticmatrix factorization
BHPMF exploits the taxonomic hierarchy of the plant kingdom
as a proxy for the phylogenetic trait signal, with the individual
plant being nested in species, species in genus, genus in family
and family in phylogenetic group (Fig. 2).
BHPMF sequentially performs PMF at the different hierar-
chical levels, using latent vectors of the neighbouring level (ℓ) as
prior information at the current level. For example, trait data
averaged at species level are used to optimize latent vectors at
species level, which in turn act as priors for latent vectors at
the individual level, which finally are optimized against the
observed trait entries in the trait matrix (Fig. 2, equation 1).The
sequential approach across the taxonomic hierarchy turned out
to be most effective if applied iteratively top down and bottom
up.
After transformation of traits to approximate normal distri-
butions and z-score transformation, the cost function is devel-
oped as the sum of absolute deviations of predictions versus
observations for traits (m) of entities (n) (first summand in
equation 1) and the sum of absolute deviations of posterior and
prior of the latent factors u and v (second and third summand of
equation 1) across all hierarchical levels (L):
E xnm nm n m
nm
L
u n p n
= − ⟨ ⟩( )⎧⎨⎩
+ −
( ) ( ) ( ) ( )
=
( )( )−
∑∑ δ
λ
� � � �
�
� �
u v
u u
,2
1
11
2
2 12
2( ) ( ) −( )∑ ∑+ − ⎫⎬⎭n
v m m
m
λ v v� � ,
(1)
Figure 1 Schematic of the probabilistic matrix factorization(PMF) model. u denotes the latent vector on the individual plantside, v the latent vector on the functional trait side, both of whichhave a Gaussian normal distribution with a mean of 0 and avariance of σ2. Each missing entry Xij can be approximated by theproduct of the transposed latent vector U and the latent vector V.
Gap-filling in trait databases
Global Ecology and Biogeography, © 2015 John Wiley & Sons Ltd 3
where λ σ σu u= 2 2 , λ σ σv v= 2 2 with σ being the standard devia-
tion of the imputations. {·} denotes the set of data at all L levels,
and δnm�( ) = 1 when the entry (n, m) of X �( ) is non-missing and 0
otherwise. Replace ℓ − 1 with ℓ + 1 and the parent node (p(n))
with the child node (c(n)) for the bottom-up approach. For
details see BHPMF in Appendix S1 in Supporting Information.
Gibbs sampler – uncertainty quantifiedtrait prediction
The parameters of our BHPMF model are optimized against the
observations in the matrix using a Gibbs sampler (Fazayeli et al.,
2014). The Gibbs sampler is a Markov chain Monte Carlo
(MCMC) method, which samples the probability density distri-
butions of model parameters (here the latent vectors) and
model predictions (here entries in the plant × trait matrix) (see
the grey inset in Fig. 2 and ‘Model evaluation’, as well as Gibbs
sampler results in Appendix S10). The Gibbs sampler-inferred
density distributions of trait values are then used to infer the
most likely imputation value, as well as the associated uncer-
tainty for each prediction.
aHPMF – extrapolation from point measurements toregional scales
If BHPMF is stopped at the species level, i.e. without accounting
for trait variability specific to individual plants, the residual
error represents the intraspecific variability and modelling/
measurement errors. aHPMF focuses on explaining this residual
trait variability based on environmental variables, such as soil
and climate characteristics of their growth environment in order
to enable out-of-sample prediction, i.e. trait predictions for
individual plants where the only known factors are species iden-
tity and location but no traits have actually been measured on
the given individual (Fig. 3).
To capture trait variability that can be attributed to environ-
mental factors, we utilize a hierarchical regression framework,
taking into account the taxonomic structure of plants to regu-
larize the regression model. The regression framework takes as
independent predictor variables the climatic and soil variables
mentioned below at locations with georeferenced trait measure-
ments. The residuals of BHPMF for the 13 plant traits of each
observation are considered as the target dependent variables to
be predicted. We treat each plant trait independently of every
other while regressing them using climate and soil features.
In essence, combining BHPMF with least squares regression
over the residuals against environmental factors, we can model
the unknown value for species n and trait m in a probabilistic
model as
k u v w x enm n m nm= + +α βT T (2)
where (un, vm) are the latent factors, with un having a hierarchical
prior from the taxonomy and x being the environmental condi-
tion with w as the regression coefficient. enm is the zero mean
Gaussian noise. Note that α, β are scalar parameters: for BHPMF
set (α = 1, β = 0), for aHPMF set (α = 1, β = 1). For details see
‘aHPMF’ in Appendix S1.
Data: traits, climate and soil
We demonstrate the applicability of the methods introduced
above on the example of a trait matrix derived from TRY. For
details on data standardization see Kattge et al. (2011). The
spatial distribution of measurement sites and detailed informa-
tion on the original datasets are shown in Fig. S4.1 (Appendix
S4) and Tables S3.1 & S3.2 in Appendix S3.
We extracted a matrix of 13 georeferenced traits consisting of
204,404 trait measurements on 78,300 individuals, spanning
14,320 species, 3793 genera, 358 families and 6 phylogenetic
Figure 2 Schematic of the Bayesian hierarchical probabilisticmatrix factorization (BHPMF) model. N denotes the entity(individual plant) side and U the corresponding matrix of latentvectors on the row side, M the trait side and V the correspondingmatrix of latent vectors on the column side. x denotes an entry inthe original plant × trait matrix S. The numbers in parenthesesshow the taxonomic level L. For example (4) is the species levelwhereas (2) is the family level. The grey inset provides a schemafor the Gibbs sampler where p(n) is the parent node of n in theupper level and c(n) is the set of child nodes n in the lower level.
Figure 3 Schematic of the advanced hierarchical probabilisticmatrix factorization (aHPMF) model. w denotes the regressioncoefficient at different levels of the hierarchy and Q thecorresponding matrix of latent vectors. The numbers inparentheses shows the taxonomic level L.
F. Schrodt et al.
Global Ecology and Biogeography, © 2015 John Wiley & Sons Ltd4
groups. The sparsity ranged from 49.63% for leaf area to 92.33%
for the leaf N to P ratio, with an average sparsity of 79.9% across
the trait matrix (Table 1). All traits were log- and z-transformed
to improve normality and equalize traits in the cost function
during optimization.
For out-of-sample predictions by aHPMF, climate data for
mean annual precipitation, mean annual temperature, isother-
mally and precipitation seasonality were extracted from the
WorldClim dataset (Hijmans et al., 2005) and soil texture (sand,
silt, clay) and soil organic carbon content in the top soil from the
Harmonized World Soil Database v1.2 (FAO et al., 2012).
Model evaluation
We ran PMF, BHPMF and aHPMF on the test dataset extracted
from TRY. Given the plant × trait matrix, we randomly selected
80% of entries for training (parameter setting), 10% for valida-
tion (parameter adjustment by optimizing performance) and
10% for test (independent performance testing after parameter
adjustment and learning). This cross-validation improves model
fidelity by ensuring that none of the observations are known by
the model when performing new predictions. Test entries
without training data in the same row would have highly
inflated variance. Such cases were prevented by adjusting the
splitting accordingly (see ‘BHPMF’ in Appendix S1).
We evaluated the predicted trait values, using the root mean
squared error (RMSE; see equation S13 in Appendix S1) and the
correlation coefficient (R2) of z-transformed predicted versus
observed traits as indicators of overall prediction accuracy. We
compared the performance of PMF, BHPMF and aHPMF with a
baseline of species mean trait values (MEAN), which uses the
overall trait mean of all individual plants within a species for
prediction. The effectiveness of capturing the phylogenetic trait
signal was explored by performing BHPMF including increas-
ingly detailed taxonomic information (Fig. 2).
In order to evaluate how well not only predicted versus meas-
ured but also trait-trait correlations are preserved in BHPMF, we
performed standardized major axis (SMA) regression, the first
principal component vector of a correlation matrix fitted
through the data centroid (Taskinen & Warton, 2011), on the
measured and imputed trait values for some key trait correla-
tions. We also performed a Procrustes analysis with PROTEST
(using the R package ‘vegan’) on a PCA of a subset of the
original data versus a PCA based on the estimated values for
artificially introduced gaps. Due to its good data cover, we per-
formed this test on the RAINFOR extract from the TRY database
(see below). Procrustes is a statistical shape analysis tool (least-
squares orthogonal mapping) which compares two ‘superim-
posed’ matrices for overlap, with placement in space and object
size being adjustable. We show how uncertainty in trait predic-
tions is accounted for using the Gibbs sampler, comparing pre-
diction confidence (SD) with prediction accuracy (RMSE).
The sensitivity of BHPMF to the fraction of gaps and the
effect of using a global database to fill gaps in local or regional
datasets were explored using two approaches. First by ‘cutting
out’ a local dataset with high coverage, adding additional gaps
(0, 10, 30, 60 and 80%; see Table S8.1 in Appendix S8) and
second by using a regional gappy dataset, filling gaps in each of
these ‘cut-outs’ using (1) the global data with information from
the local/regional data and (2) just the local/regional data. For
our local example, we extracted TRY trait data contributed by
the RAINFOR group (Fyllas et al., 2009), which shows a good
coverage (sparsity 11%) and covers most of the Amazon
(Fig. S8.1 in Appendix S8). For our regional example, we
extracted all of the European data (sparsity 72%) (Fig. S8.2 in
Appendix S8). For details on methodology please refer to
Table 1 Number of entries, sparsity androot mean square error (RMSE) ofspecies mean (MEAN), probabilisticmatrix factorization (PMF), Bayesianhierarchical PMF (BHPMF) andadvanced hierarchical PMF (aHPMF) bytrait, as well as R2 values of theregression of imputed versus measuredtraits. The lowest RMSE and highest R2
are shown in bold.
Trait Entries Sparsity
MEAN PMF BHPMF aHPMF
RMSE R2 RMSE R2 RMSE R2 RMSE R2
SLA 33001 57.9 0.53 0.85 0.88 0.49 0.46 0.88 0.53 0.89Plant height 16465 79.0 0.47 0.90 0.91 0.40 0.40 0.92 0.44 0.93Seed mass 7311 90.7 0.37 0.91 0.77 0.40 0.36 0.92 0.36 0.92LDMC 17331 77.9 0.53 0.83 0.87 0.41 0.43 0.88 0.49 0.89SSD 9191 88.3 0.51 0.86 1.01 0.19 0.44 0.87 0.51 0.87Leaf area 39438 49.6 0.50 0.87 0.91 0.41 0.37 0.93 0.41 0.93Leaf N 26882 65.7 0.67 0.77 0.99 0.28 0.53 0.86 0.59 0.86Leaf P 11975 84.7 0.72 0.69 0.78 0.62 0.52 0.83 0.62 0.83Leaf N/area 8180 89.6 0.79 0.65 0.80 0.64 0.51 0.82 0.72 0.82Leaf fresh mass 11484 85.3 0.47 0.89 0.71 0.71 0.27 0.96 0.39 0.96Leaf N/P ratio 5999 92.3 0.76 0.69 0.90 0.44 0.49 0.85 0.67 0.84Leaf C/dry mass 8123 89.6 0.70 0.74 0.884 0.35 0.61 0.61 0.62 0.78Leaf δ 15N 9022 88.5 0.63 0.79 1.02 0.02 0.50 0.87 0.53 0.88Average 15723 79.9 0.58 0.80 0.88 0.41 0.45 0.88 0.53 0.87
SLA, specific leaf area; LDMC, leaf dry matter content; SSD, stem-specific density; Leaf N and Leaf P,leaf nitrogen and phosphorus concentrations per dry mass, respectively; Leaf N/area, leaf nitrogenconcentration per leaf area; Leaf C/dry mass, leaf carbon concentration per dry mass. For definitionsof all traits and data sources as well as corresponding references see the Supporting Information(Appendices S2, S3 and S11 respectively).
Gap-filling in trait databases
Global Ecology and Biogeography, © 2015 John Wiley & Sons Ltd 5
Appendices S8 & S1. Finally, we provide an example for out-of-
sample prediction, extrapolating leaf nitrogen concentration
(leaf N) from point measurements to the whole species range of
Acer saccharum using aHPMF.
Probabilistic matrix factorization and subsequent regression
were developed and applied in MATLAB version 2012a
(MATLAB, 2012). All other analyses were performed using the
statistical platform R version 2.15 (R Core Team, 2014). The
maps reported here were produced in ArcMap 10.1 (ArcGIS
Desktop, 2011) and R, using the tree species distribution map of
A. saccharum from the US Geological Survey (Little, 1971). R
scripts to implement BHPMF are available from the authors by
request.
RESULTS
Predicted versus observed trait values
To analyse prediction accuracy we compare RMSE and the coef-
ficient of determination (R2) for MEAN, PMF, BHPMF and
aHPMF averaged across traits and for each trait separately
(Table 1; for scatterplots of observed versus predicted for all
traits see Fig. S9.1 in Appendix S9). On average, across all traits,
BHPMF outperforms PMF, MEAN and aHPMF, with MEAN
being significantly more accurate than PMF. This holds after
statistical evaluation using a paired t-test with P-values smaller
than 10−5 at all levels, and is supported by the evaluation of the
correlation coefficient R2 (Table 1). As the RMSE is calculated
from z-transformed approximate normal distributions of traits,
a RMSE of 0.45 for BHPMF indicates that the average error of
predictions is about half a standard deviation, or about 10% of
the 95% CI. BHPMF outperforms MEAN and PMF in all traits,
while aHPMF shows the same or higher RMSE and higher R2
than BHPMF for SLA, plant height, leaf dry matter content
(LDMC), leaf carbon (C) per dry mass and leaf δ 15N (D15N)
(Table 1). The advantage of BHPMF over MEAN is largest for
‘physiological traits’, such as leaf N and leaf phosphorus concen-
tration (leaf P), and smaller for more ‘structural traits’ such as
seed mass or plant height. The prediction accuracy of BHPMF
varies across traits: from RMSE = 0.36 (R2 = 0.92) for seed mass
to RMSE = 0.61 (R2 = 0.61) for leaf C content per dry mass.
Interestingly, prediction accuracy is not related to the number of
entries per trait (Table 1).
Accounting for taxonomic hierarchy
The RMSE of MEAN and BHPMF decreases with increasing
taxonomic information, indicating that both methods can
utilize the hierarchical structure to their advantage (Table S7.1
in Appendix S7). This is also supported by the scatter plot of
measured versus predicted specific leaf area (SLA) and leaf N
shown in Fig. 4. With increasing taxonomic information, the
scatter plot approaches the 1:1 line, i.e. prediction accuracy
improves.
Trait–trait correlations
Although the presence of strong trait–trait correlations is a pre-
requisite for the accuracy of BHPMF, such correlations are not
provided a priori and are thus not part of the objective function
used (equation 1). This turns them into a suitable evaluation
measure. An important quality criterion is to what extent the
imputed values reflect the observed bivariate correlations, as this
is a first indication of the extent to which the overall correlation
structure of the n-dimensional trait matrix is maintained by
imputation. Our dataset shows on average strong trait–trait cor-
relations, with some exceptions (Fig. S9.5 in Appendix S9).
BHPMF and MEAN capture these general trait–trait correla-
tions, but BHPMF reproduces extreme values more accurately
than MEAN and is therefore generally better at capturing the
shape of the scatter of observed trait data, which is confirmed by
more similar SMA R2 values (Fig. S9.2 in Appendix S9). Looking
at the multivariate preservation of trait–trait correlations using
Procrustes analysis, our results indicate again that BHPMF does
Figure 4 Scatter plots of predicted versus true values for twotraits with increasing taxonomic information. Left column, leafnitrogen concentration per dry weight; right column, specific leafarea. Row 1, no phylogenetic information is used; row 2, only thephylogenetic group is used; row 3, phylogenetic group and familyare used; row 4, phylogenetic group, family and genus are used;row 5, phylogenetic group, family, genus and species are used.Predictions are based on Bayesian hierarchical probabilistic matrixfactorization. The data are presented in z-transformed space.Dotted lines indicate the 1:1 correlation.
F. Schrodt et al.
Global Ecology and Biogeography, © 2015 John Wiley & Sons Ltd6
not significantly alter the correlation structure of the gap-filled
matrix (Fig. 5). The first four principal component axes explain
83.4% and 83.4% of the variability in the dataset for the original
and gap-filled data, respectively. None of the principal compo-
nent axes are significantly different between the gappy and gap-
filled data for any of the traits. The traits stem specific density
and leaf carbon differ – but not significantly – along the third
and fourth axes (see Fig. S9.3 in Appendix S9).
Uncertainty quantified predictions
The Gibbs sampler provides a probability distribution for every
single prediction, as shown in the example of Gibbs sampler-
generated density plots of BHPMF-estimated LDMC, leaf N and
SLA for A. saccharum and Pinus sylvestris trees (Fig. S10.1 in
Appendix S10). This distribution can be exploited to calculate
indices for the best estimate (e.g. mean) and variability (e.g. SD).
This provides an additional means to evaluate our imputation
model by comparing prediction confidence (SD) with predic-
tion accuracy (RMSE): when we are confident about our pre-
dictions (small SD), these predictions should also be accurate
(small RMSE) and vice versa. Figure S10.2 in Appendix S10
shows that this is indeed the case for the whole 13-trait dataset,
implying that our model is appropriate. This remains true when
we evaluate the Gibbs sampler on each trait separately
(Fig. S10.3 in Appendix S10).
Gap-filling of regional/local data using BHPMF
As expected, increasing the number of gaps in the RAINFOR
dataset generally resulted in a decrease of prediction accuracy
(Fig. 6), although less so for structural traits, such as stem-
specific density (SSD) and plant height. Reproducibility was
high in all cases (Fig. 6). Prediction accuracy of BHPMF was
generally approximately equal, no matter whether the regional
(RforR) or global (WforR) datasets were used to fill the gaps
(Figs 6 & S8.3 in Appendix S8). This was particularly the case if
gap sizes were large (above 10%), whilst RforR outperformed
WforR for the imputations of plant height, leaf N, SLA and leaf
carbon only where additional gap sizes were small (0 and
10%).
Out-of-sample prediction (aHPMF)
We illustrate the extension of BHPMF towards out-of-sample
prediction with the example of leaf N across the species range of
A. saccharum (Fig. 7).
Figure 5 Procrustes analysis errors for the first and secondprincipal component axes comparing a princpal componentsanalysis (PCA) performed on the original, gappy RAINFOR datawith a PCA performed on the RAINFOR data with artificiallyintroduced gaps being filled using Bayesian hierarchicalprobabilistic matrix factorization.
Figure 6 Root mean square error (RMSE) of performingBayesian hierarchical probabilistic matrix factorization (BHPMF)on the RAINFOR cutout (red points in Fig. S8.1 in Appendix S8)for the whole dataset (Total), specific leaf area (SLA), plant height(PlantHt), stem-specific density (SSD), leaf nitrogen (LeafN), leafphosphorus (LeafP), leaf nitrogen per area (LeafNArea), leafcarbon (LeafC) with increasing number of gaps added to theoriginal RAINFOR data (inherent gappiness of 11%). For the totalnumber of gaps for each trait and added gaps per dataset, seeTable S8.1 in Appendix S8. Left- and right-hand sections for eachtrait (separated by a dotted line) show results when using only theRAINFOR data (RforR) or using all available data (WforR),respectively, to fill the gaps.
Gap-filling in trait databases
Global Ecology and Biogeography, © 2015 John Wiley & Sons Ltd 7
BHPMF was stopped at species level, followed by a multivari-
ate regression of residuals of predicted versus observed traits at
the level of the individual plants against environmental condi-
tions. Trait predictions follow the trends in both measured leaf
N and environmental conditions. However, the variability
within aHPMF-predicted leaf N is low (18.69–18.80 mg g−1)
compared with the range of actually measured values (8.65–
28.23 mg g−1). While the amplitude of predicted traits for the
grid elements is much smaller than the observed ranges, it is
notable that A. saccharum occurs over a wide range of environ-
mental conditions and grid averages may not reflect variation at
local scales. The flat response surface Reich & Oleksyn (2004)
found for the correlation between leaf N in the genus Acer and
mean annual temperature may further support the validity of
our results.
DISCUSSION
We demonstrate that BHPMF provides accurate and robust
uncertainty-quantified trait predictions, even for sparse matri-
ces with up to 80% missing entries. BHPMF outperforms the
species MEAN baseline in all aspects: RMSE and R2 of predicted
versus observed entries are smaller and larger, respectively, for
each individual trait (Table 1) and trait–trait correlations are
better retrieved (Fig. S9.2 in Appendix S9). Prediction accuracy
is high (small RMSE) when prediction uncertainty is small
(small SD; Fig. S10.2 in Appendix S10), providing a measure of
confidence for prediction accuracy. In addition, aHPMF pro-
vides a concept to extrapolate from point measurements to
species ranges accounting for intraspecific variability (Fig. 7).
These results give rise to three major questions: (1) Why does
BHPMF provide accurate and robust trait predictions even for
sparse trait matrices? (2) Is the prediction accuracy of BHPMF
sufficient for applications in ecological contexts? (3) What are
the prospects for BHPMF?
Why are BHPMF predictions robust and accurate?
BHPMF is a Bayesian hierarchical approach that simultaneously
takes the taxonomic trait signal, the correlation structure within
the trait matrix and environmental constraints into account to
fill gaps in trait matrices. A comparison of BHPMF-predicted
trait values with results based on PMF, MEAN and aHPMF
indicates the relevance of all three aspects.
PMF has been shown to be accurate even for the imputation
of sparse datasets (Koren et al., 2009). However, in the case of
our test dataset, PMF performs worse than the baseline MEAN
approach. This indicates that our test dataset is not of suffi-
ciently low rank to allow for robust trait predictions based on
the correlation structure in the matrix alone.
BHPMF converts PMF into a hierarchical Bayesian model.
This has two effects: (1) the higher levels of taxonomy (e.g.
phylogenetic group, family) provide almost complete informa-
tion (very low sparsity), which enables efficient PMF; (2)
approximations of the matrices at higher taxonomic level
provide excellent prior information for the approximations at
lower levels (e.g. genus, species), thus constraining imputations
due to the taxonomic signal in trait variation at all levels.
This is achieved without a priori assuming a phylogenetic
signal in the trait variability, but rather by opening the door for
our model to extract a signal, if it should be there. Thus, in some
cases, BHPMF might not put any constraint on the imputed
Figure 7 Advanced hierarchical probabilistic matrix factorization (aHPMF)-predicted leaf nitrogen concentration (mg g−1) of Acersaccharum (a), measured values for leaf nitrogen (mg g−1) (b), MAT (mean annual temperature) (c), and MAP (mean annual precipitation)(d) across the species range of A. saccharum. For a map of the geographic location see Figure S5.1 in Appendix S5.
F. Schrodt et al.
Global Ecology and Biogeography, © 2015 John Wiley & Sons Ltd8
trait values from the taxonomy side, whereas in others, this
signal might be stronger, hence the constraint from the tax-
onomy. The hierarchical taxonomic structure in combination
with a consistent phylogenetic trait signal is therefore the key to
facilitate robust gap-filling by BHPMF – despite high sparsity –
at the level of the individual observations. On the other hand
BHPMF outperforms the MEAN approach in all aspects. This
indicates that the capture by PMF of the correlation structure of
the matrix on top of the phylogenetic trait signal is the key to
providing accurate predictions.
A comparison of BHPMF with aHPMF indicates that explic-
itly taking environmental constraints into account surprisingly
adds little or no improvement to BHPMF: the average R2 was
only higher in 5 out of 13 traits with the RMSE being consist-
ently smaller for BHPMF compared with aHPMF (Table 1). At
the individual level, PMF was replaced by multiple regressions
against environmental constraints, based on the assumption
that the phylogenetic trait signal is mainly observed at the
species level whilst environmental constraints add to the trait
variability at the individual level.
The fact that BHPMF largely outperforms aHPMF is an indi-
cation that, by taking into account trait–trait correlations, PMF
seems to be more appropriate than the multiple regression at the
individual level. Also, BHPMF implicitly takes environmental
information into account via the correlation of measured to
predicted traits. This environmental constraint is related to the
immediate environment experienced by the observed plant and
propagated via trait–trait correlations to the predicted traits,
while environmental information incorporated in the multiple
regression model by necessity represents only the average over
larger scales. Taking environmental information explicitly into
account may therefore not substantially improve BHPMF-based
gap-filling. Rather, it may be a tool to extrapolate traits from
point measurements to the regional scale, i.e. perform out-of-
sample predictions.
Is the prediction accuracy sufficient for applicationsin ecological contexts?
Our results indicate that BHPMF outperforms the species
MEAN baseline in all aspects, but is the prediction accuracy
indeed appropriate for use in ecological contexts, and to what
extent are the results a special case of our test dataset?
The test dataset is an extract of 78,300 individual observations
and the 13 best covered traits from the TRY database, still with
80% of trait entries missing. Our dataset is not typical for
common datasets of traits, which generally originate from spe-
cific measurement campaigns with only 5–30% missing entries.
In these cases, gap-filling may not be such a challenge and
common approaches may be sufficient. However, given the
experience that BHPMF substantially outperforms both PMF
and MEAN, we also expected improved trait predictions in such
cases. Indeed, our sensitivity test using ‘cut outs’ from Europe
and data across the Amazon showed that BHPMF was able to
impute gaps even in smaller and extremely sparse databases.
When taking advantage of using the global database to fill
gaps in a smaller ‘cutout’, possible confounding factors intro-
duced by global trait variability influencing the sometimes more
constrained trait space of a local dataset should be considered.
For example in the case of the RAINFOR cut-out, plant height
was better predicted using just the local dataset. One likely
explanation is that a large amount of new information was
included from the global database, amounting to more than
60% more data with a high standard deviation within species.
For SSD, on the other hand, only 30% more data were contrib-
uted by the global dataset, which were also less plastic within
species compared with plant height. Thus, a global dataset may
easily introduce ‘false’ information in the case of plastic traits
that are highly influenced by local environmental conditions but
provide a lot of valuable additional information in the case of
traits that are mainly determined by phylogeny. We recommend,
depending on the number and sparsity of the data, using both
approaches wherever possible – local and global gap-filling –
and select the best-fitting approach depending on the target trait
and aim.
BHPMF trait prediction uncertainties are well correlated with
their error, turning uncertainty into a good surrogate for pre-
diction error (Fig. S10.2 in Appendix S10). This offers the
opportunity to select trait predictions with high probability of
low errors or weight results in further analyses according to their
uncertainty.
Perspectives
Statistical approaches to gap-filling and improvement of eco-
logical understanding of species occurrence and dynamics have
been criticized for being too complex and including parameters
and assumptions without appropriate prior validation (Lavine,
2010). In contrast, BHPMF is a simple and generic model, which
has the advantage of incorporating factors known to influence
trait expression, such as phylogenetic trait signals, trait–trait
correlations and trade-offs without implicitly requiring prior
assumptions. This simplicity opens the opportunity to add com-
plexity to improve trait predictions, for example by replacing the
taxonomic hierarchy by a hierarchy of phylogenetic distances.
BHPMF has been explicitly developed for gap-filling (matrix
completion), not for the prediction of trait values in individuals
where no trait has ever been measured before, i.e. out-of-sample
predictions. Advancing BHPMF to account for phylogenetic dis-
tances instead of taxonomy will improve opportunities for effi-
cient out-of-sample predictions, at least in well-resolved clades.
Uncertainty quantified predictions support the validity of the
BHPMF approach, where predictions with low uncertainty
(small SD) also show high accuracy (small RMSE), and vice
versa. Using the Gibbs sampler to get an indication as to where
uncertainties are largest merits special attention. Data coverage
for traits such as SLA, which are relatively fast and easy to collect,
is much better than for other traits. In addition, some traits are
highly variable across ecosystems and plant populations. Using
the Gibbs sampler, one can statistically define where more sam-
pling effort is probably going to significantly improve our
Gap-filling in trait databases
Global Ecology and Biogeography, © 2015 John Wiley & Sons Ltd 9
understanding of variation in any specific trait and where the
currently available data are sufficient.
The combination of BHPMF with environmental covariates
may not necessarily improve gap-filling of trait matrices, but
seems to be a promising concept for the extrapolation of traits
from point measurements to regional scales. The concept of
out-of-sample prediction presented here illustrates a fundamen-
tal problem in trait ecology, separating the phylogenetic signal
and environmental impact on intraspecific trait variation. The
regression of trait values against the environment after gap-
filling by BHPMF provides an advantage over using non-gap-
filled data of an improved number of data points, which might
be essential as trait predictions are often limited by data avail-
ability (Verheijen et al., 2013). BHPMF-filled trait matrices
could thus become an invaluable tool for the parameterization
and validation of global vegetation models.
The TRY database is a typical example of a combination of
datasets which have been collected for different purposes. Other
disciplines besides plant ecology which have also started to
combine their (trait) data are faced with the same problem of
sparsity and restrictions in the context of multivariate analyses.
BHPMF provides an opportunity to extract the entire informa-
tion content from such databases combining data from various
aspects measured at locations all over the world.
Conclusion
BHPMF is a hierarchical Bayesian implementation of probabil-
istic matrix factorization, which for the first time simultaneously
utilizes the taxonomic trait signal, the correlation structure
within trait matrices and – implicitly through trait–
environment relationships – environmental constraints for gap-
filling of trait matrices. We demonstrate using the example of
different plant trait datasets that BHPMF provides robust and
accurate predictions even for sparse matrices. In addition, Gibbs
sampler-calculated uncertainties indicate how accurate each
imputed trait value is, and thus how to treat it in further analy-
ses. The combination of BHPMF with environmental informa-
tion provides the opportunity to extrapolate from point
measurements to continuous trait surfaces across large spatial
scales whilst accounting for intraspecific trait variability. We
therefore conclude that BHPMF-based gap-filling and trait pre-
diction has a high potential to support future trait-based
research in macroecology and functional biogeography.
ACKNOWLEDGEMENTS
F.S. was supported by the University of Minnesota, Institute on
the Environment via the grant to P.R.: ‘Transformational steps in
synthesis science’, the Max Planck Institute for Biogeochemistry
and iDiv, the German Centre for Integrative Biodiversity
Research. H.S. was supported by NSF grants IIS-0812183, IIS-
0916750, IIS-1029711, IIS-1017647, and NSF CAREER award
IIS-0953274. This study has been performed with support by the
TRY initiative on plant traits (https://www.try-db.org). TRY is
hosted and developed at the Max Planck Institute for
Biogeochemistry, with support from DIVERSITAS and iDiv. We
thank Ulrich Weber for help with the preparation of soil data
and Maarten Braakhekke, Nate Swenson and two anonymous
referees for helpful comments and suggestions.
REFERENCES
Albert, C.H., Thuiller, W., Yoccoz, N.G., Soudant, A., Boucher,
F., Saccone, P. & Lavorel, S. (2010) Intraspecific functional
variability: extent, structure and sources of variation. Journal
of Ecology, 98, 604–613.
ArcGIS Desktop, E. (2011) Release 10.1. Environmental Systems
Research Institute, Redlands, CA.
van Buuren, S. & Groothuis-Oudshoorn, K. (2011) mice: multi-
ple imputation by chained equations in r. Journal of Statistical
Software, 45, 1–67.
Chave, J., Coomes, D., Jansen, S., Lewis, S.L., Swenson, N.G. &
Zanne, A.E. (2009) Towards a worldwide wood economics
spectrum. Ecology Letters, 12, 351–366.
Clark, J.S. (2010) Individuals and the variation needed for
high species diversity in forest trees. Science, 327, 1129–
1132.
Cordlandwehr, V., Meredith, R., Ozinga, W., Bekker, R., van
Groenendael, J. & Bakker, J. (2013) Do plant traits retrieved
from a database accurately predict on-site measurements?
Journal of Ecology, 101, 662–670.
Davis, A., Lawton, J., Shorrocks, B. & Jenkinson, L. (1998) Indi-
vidualistic species responses invalidate simple physiological
models of community dynamics under global environmental
change. Journal of Animal Ecology, 67, 600–612.
Diniz-Filho, J.A.F., Bini, L.M., Rodríguez, M.A., Rangel,
T.F.L.V.B. & Hawkins, B.A. (2007) Seeing the forest for the
trees: partitioning ecological and phylogenetic components of
Bergmann’s rule in European Carnivora. Ecography, 30, 598–
608.
Duforet-Frebourg, N. & Blum, M.G.B. (2014) Bayesian matrix
factorization for outlier detection: an application in popula-
tion genetics. The Contribution of Young Researchers to
Bayesian Statistics. Springer Proceedings in Mathematics &
Statistics, 63, 143–147.
FAO, IIASA, ISRIC, ISSCAS, JRC (2012) Harmonized World Soil
Database (version 1.2). FAO, Rome, Italy and IIASA,
Laxenburg, Austria.
Fazayeli, F., Banerjee, A., Kattge, J., Schrodt, F. & Reich, P. (2014)
Uncertainty quantified matrix completion using Bayesian
hierarchical matrix factorization. Proceedings of the 13th
International Conference on Machine Learning and
Applications.
Fried, G., Kazakou, E. & Gaba, S. (2012) Trajectories of weed
communities explained by traits associated with species
response to management practices. Agriculture, Ecosystems
and Environment, 158, 147–155.
Fyllas, N.M., Patiño, S., Baker, T.R. et al. (2009) Basin-wide vari-
ations in foliar properties of Amazonian forest: phylogeny,
soils and climate. Biogeosciences, 6, 2677–2708.
F. Schrodt et al.
Global Ecology and Biogeography, © 2015 John Wiley & Sons Ltd10
Hijmans, R.J., Cameron, S.E., Parra, J.L., Jones, P.G. & Jarvis, A.
(2005) Very high resolution interpolated climate surfaces for
global land areas. International Journal of Climatology, 25,
1965–1978.
Johnson, D.R. & Young, R. (2011) Toward best practices in
analyzing datasets with missing data: comparisons and rec-
ommendations. Journal of Marriage and Family, 73, 926–
945.
Kattge, J., Díaz, S., Lavorel, S., et al. (2011) TRY – a global
database of plant traits. Global Change Biology, 17, 2905–
2935.
Kazakou, E., Violle, C., Roumet, C., Navas, M.L., Vile, D., Kattge,
J. & Garnier, E. (2014) Are trait-based species rankings con-
sistent across datasets and spatial scales? Journal of Vegetation
Science, 25, 235–237.
Kerkhoff, A., Fagan, W., Elser, J. & Enquist, B. (2006)
Phylogenetic and growth form variation in the scaling of
nitrogen and phosphorus in the seed plants. The American
Naturalist, 168, E103–E122.
Koren, Y., Bell, R. & Volinsky, C. (2009) Matrix factorization
techniques for recommender systems. IEEE Computer, 42(8),
30–37.
Lamsal, S., Rizzo, D.M. & Meentemeyer, R.K. (2012) Spatial
variation and prediction of forest biomass in a heterogeneous
landscape. Journal of Forestry Research, 23, 13–22.
Lasslop, G., Reichstein, M., Papale, D., Richardson, A., Arneth,
A., Barr, A., Stoy, P. & Wohlfahrt, G. (2010) Separation of net
ecosystem exchange into assimilation and respiration using a
light response curve approach: critical issues and global evalu-
ation. Global Change Biology, 16, 187–208.
Lavine, M. (2010) Living dangerously with big fancy models.
Ecology, 91, 3487.
Little, E.L.J. (1971) Atlas of United States trees, volume 1, conifers
and important hardwoods. Miscellaneous Publication 1146.
US Department of Agriculture, Forest Service, Washington,
DC.
Lovette, I.J. & Hochachka, W.M. (2006) Simultaneous effects of
phylogenetic niche conservatism and competition on avian
community structure. Ecology, 87, 14–28.
McMahon, R.F. (2002) Evolutionary and physiological adapta-
tions of aquatic invasive animals: r selection versus resistance.
Canadian Journal of Fisheries and Aquatic Sciences, 59, 1235–
1244.
MATLAB (2012) Computer software. The MathWorks Inc.,
Natick, MA.
Moffat, A., Papale, D., Reichstein, M., Hollinger, D., Richardson,
A., Barr, A., Beckstein, C., Braswell, B., Churkina, G., Desai, A.,
Falge, E., Gove, J., Heimann, M., Hui, D., Jarvis, A., Kattge, J.,
Noormets, A. & Stauch, V. (2007) Comprehensive comparison
of gap-filling techniques for eddy covariance net carbon
fluxes. Agricultural and Forest Meteorology, 147, 209–232.
Nakagawa, S. & Freckleton, R.P. (2008) Missing inaction: the
dangers of ignoring missing data. Trends in Ecology and Evo-
lution, 23, 592–596.
Netflix (2009) Netflix prize. Available at: http://www
.netflixprize.com.
Ogle, K. (2013) Feedback and modularization in a Bayesian
metaanalysis of tree traits affecting forest dynamics. Bayesian
Analysis, 8, 133–168.
Ordonez, A. (2014) Functional and phylogenetic similarity
of alien plants to co-occurring natives. Ecology, 95, 1191–
1202.
R Core Team (2014) R: a language and environment for statis-
tical computing. R Foundation for Statistical Computing.
Vienna, Austria. Available at: http://www.R-project.org/.
Räty, M. & Kangas, A. (2012) Comparison of k-msn and kriging
in local prediction. Forest Ecology and Management, 263,
47–56.
Reich, P.B. & Oleksyn, J. (2004) Global patterns of plant leaf N
and P in relation to temperature and latitude. Proceedings of
the National Academy of Sciences USA, 101, 11001–11006.
Reich, P.B., Walters, M.B. & Ellsworth, D.S. (1997) From tropics
to tundra: global convergence in plant functioning. Proceed-
ings of the National Academy of Sciences USA, 94, 13730–
13734.
Rubin, D.B. (1987) Multiple imputation for nonresponse in
surveys. Wiley and Sons, New York.
Salakhutdinov, S. & Mnih, A. (2008) Probabilistic matrix fac-
torization. Advances in Neural Information Processing Systems
20 (NIPS 07). Available at: http://www.cs.toronto.edu/
∼rsalakhu/papers/nips07_pmf.pdf
Schafer, J.L. & Graham, J.W. (2002) Missing data: our view of
state of the art. Psychological Methods, 7, 147–177.
Schrodt, F., Domingues, T.F., Feldpausch, T.R. et al. (2015) Foliar
trait contrasts between African forest and savanna trees:
genetic versus environmental effects. Functional Plant Biology,
42, 63–83.
Shan, H., Kattge, J., Reich, P.B., Banerjee, A., Schrodt, F. &
Reichstein, M. (2012) Gap filling in the plant kingdom – trait
prediction using hierarchical probabilistic matrix factoriza-
tion. Proceedings of the 29th International Conference on
Machine Learning (ed. by J. Langford and J. Pineau), pp.
1303–1310. Omnipress, Madison, WI.
Su, Y.S., Gelman, A., Hill, J. & Yajima, M. (2011) Multiple impu-
tation with diagnostics (mi) in r: opening windows into the
black box. Journal of Statistical Software, 45, 55–64.
Swenson, N. & Enquist, B. (2007) Ecological and evolutionary
determinants of a key plant functional trait: wood density and
its community-wide variation across latitude and elevation.
American Journal of Botany, 94, 451–459.
Swenson, N.G. (2014) Phylogenetic imputation of plant func-
tional trait databases. Ecography, 37, 105–110.
Taskinen, S. & Warton, D.I. (2011) Robust estimation and infer-
ence for bivariate line-fitting in allometry. Biometrical Journal,
53, 652–672.
Taugourdeau, S., Villerd, J., Plantureux, S., Huguenin-Elie, O. &
Amiaud, B. (2014) Filling the gap in functional trait databases:
use of ecological hypotheses to replace missing data. Ecology
and Evolution, 4, 944–958.
Verheijen, L.M., Brovkin, V., Aerts, R., Bönisch, G., Cornelissen,
J.H.C., Kattge, J., Reich, P.B., Wright, I.J. & van Bodegom, P.M.
(2013) Impacts of trait variation through observed
Gap-filling in trait databases
Global Ecology and Biogeography, © 2015 John Wiley & Sons Ltd 11
trait–climate relationships on performance of an earth system
model: a conceptual analysis. Biogeosciences, 10, 5497–5515.
Wright, I.J., Reich, P.B., Westoby, M. et al. (2004) The worldwide
leaf economics spectrum. Nature, 428, 821–827.
Additional references are in the supplementary file at: (weblink)
SUPPORTING INFORMATION
Additional supporting information may be found in the online
version of this article at the publisher’s web-site.
Appendix S1 Supplementary methods.
Appendix S2 Definition of traits used in this study.
Appendix S3 References for contributing databases and number
of traits contributed.
Appendix S4 Map of TRY measurement sites.
Appendix S5 Location of Acer saccharum range map and soil
and climate across the range of Acer saccharum
Appendix S6 Correlation between traits and environmental
variables used in aHPMF.
Appendix S7 Root mean squared error comparison between
MEAN, BHPMF and aHPMF across the taxonomic hierarchy.
Appendix S8 Sensitivity analysis.
Appendix S9 Bi- and multivariate relationships between traits,
measured and imputed trait values.
Appendix S10 Gibbs sampler results.
Appendix S11 Additional references of data contributors.
Appendix S12 Author contributions.
BIOSKETCH
Franziska Schrodt is a post-doctoral researcher at the
Max Planck Institute for Biogeochemistry in Jena and
the German Centre for Integrative Biodiversity Research
iDIV Leipzig, Jena, Halle. Her work focuses on the
application of machine learning and nonlinear
statistical tools to the study of biogeochemical patterns.
She is especially interested in plant functional
trait/biodiversity–environment correlations and the
associated implications for ecosystem structure and
functioning.
Editor: José Alexandre Diniz-Filho
F. Schrodt et al.
Global Ecology and Biogeography, © 2015 John Wiley & Sons Ltd12
S1. Methods
S1.1. MEAN
A “hierarchical mean” strategy is used. For example, to predict trait m of plant n, if there are plants in the samespecies as plant n, we use species mean for prediction; otherwise, if there are plants in the same genus as plant n, weuse the genus mean, and so on. In general, among species mean, genus mean, family mean, and phylogenetic groupmean, we use the first available one at the lowest level.
S1.2. BHPMF
BHPMF is a hierarchical Bayesian implementation of PMF. The latent vectors are implemented as Gaussian normaldistributions with prior mean of 0 and a variance of σ 2, which results in two adjustable parameters per latent vector(mean and variance). The length of the latent vectors can be defined, but needs to be constrained to avoid over-fitting(Salakhutdinov and Mnih, 2008). We run a Gibbs sampler for optimization (see S1.3). In principle, this allows us toupdate {U(`)} and {V (`)} - which are the matrices formed by stacking the latent vectors u and v at the taxonomic level of` - in an arbitrary order. Empirically, we do it level by level iteratively following a top-down and bottom-up order. Ineach iteration, we first do a top-down pass to update
({U(1)}, {V (1)}
)to
({U(L)}, {V (L)}
), followed by a bottom-up pass to
update({U(L)}, {V (L)}
)to
({U(1)}, {V (1)}
), and repeat the process for several iterations. The intuition is that after updating(
{U(`)}, {V (`)}), we want to immediately use it for regularization (to prevent unnecessary complexity and over-fitting) in
the next level update. Empirically, we observed that such a strategy converges faster than only doing top-down updatesrepeatedly. The posterior over {U(`)} and {V (`)} is
p({U(`)}, {V (`)}|{X(`)}, σ2,U(0),V (0)
)∝
L∏`=1
{∏n
N(u(`)n |u
(`−1)p(n) , σ
2uI)
∏m
N(v(`)m |v
(`−1)m , σ2
v I)
∏n,m
δ(`)nmN
(x(`)
nm|〈u(`)n , v(`)
m 〉, σ2) }
,
(S1)
where {·} denotes the set of data at all L levels (L = 5 for the TRY data), and δ(`)nm = 1 when the entry (n,m) of X(`)
is non-missing and 0 otherwise. MAP (Maximum a posteriori) inference which is similar to Monte Carlo inference on{U(`)} and {V (`)} can be done by maximizing the logarithm of the posterior in eqn S1, which boils down to minimizingthe regularized squared loss as
E =
L∑`=1
{∑nm
δ(`)nm ‖ x(`)
nm − 〈u(`)n , v(`)
m 〉 ‖22
+ λu
∑n
‖u(`)n − u(`−1)
p(n) ‖22 +λv
∑m
‖v(`)m − v(`−1)
m ‖22
},
(S2)
where λu = σ2/σ2u and λv = σ2/σ2
v and ` − 1 is replaced by ` + 1 for the bottom-up iteration. The objective functioncontaining U(`) and V (`) is given by
E(`) =∑n,m
δ(`)nm ‖ x(`)
nm − 〈u(`)n , v(`)
m 〉 ‖22
+ λu
∑n
‖ u(`)n −u(`−1)
p(n) ‖22 +1(`<L)
∑n′∈c(n)
‖u(`)n −u(`+1)
n′ ‖22
+ λv
∑m
(‖ v(`)
m − v(`−1)m ‖22 +1(`<L) ‖ v(`)
m − v(`+1)m ‖22
),
(S3)
1
where c(n) is the set of child nodes of n, e.g, if n is a species, c(n) denotes plants of that species, and 1(`<L) is anindicator function taking value 1 when ` < L and 0 otherwise. The regularization terms ‖u(`)
n −u(`−1)p(n) ‖
22 and ‖v(`)
m −v(`−1)m ‖22
keep u(`)n and v(`)
m close to the corresponding latent factor at level ` − 1, and the regularization terms∑
c(n)‖u(`)n − u(`+1)
c(n) ‖22
and ‖v(`)m − v(`+1)
m ‖22 keep u(`)n and v(`)
m close to the corresponding latent factor at level ` + 1 (if applicable).At least one trait is needed for each plant in order to run matrix factorization methods. Therefore, we split the
training, test and validation sets as follows: For each plant, if it has at least three traits available, we randomly holdout one trait for test, one trait for validation, and use the rest for training; if it has two traits available, we randomlyhold out one trait for training and one for test; if it only has one trait available, we use it for training. Following such astrategy, each plant has at least one trait in the training set. The test set is used for test and the validation set is usedduring the training process for early stopping, i.e., when there have been more than 5 iterations and the performance onthe validation set decreases, we stop training. We repeat the holding-out process 5 times to get five randomly splitdatasets, then constructing the upper-level matrices for training and validation, but the test set only operates at theplant×trait level.
S1.3. Gibbs samplerMany imputation methods commonly used in ecology do not provide means to assess the uncertainty of every
single predicted value. Ideally, one would like to quantify both, the expected value and the range of variation (in caseof normal distribution mean and standard deviation (SD)) of the imputations. Using Gibbs sampling we infer theprobability distribution for each prediction (Casella and George, 1992). The Gibbs sampler is a Markov chain MonteCarlo algorithm which, based on the Metropolis algorithm (Metropolis et al., 1953), samples from the conditionaldistribution of one variable given all the others. In BHPMF, each variable (element in each latent factor) is conditionallyindependent of most other variables, thereby leading to an efficient sampler. In essence, the procedure is as follows: fora given matrix X, the sampler updates the latent factor matrices (U(`),V (`)) at each level `, keeping the factors at allother levels fixed. Each sample at the lowest level is obtained by sampling the upper level matrices iteratively followinga top-down and bottom-up order (Algorithm S1).
Each row of U(`) (u(`)n ) is independent of U(`)
−n, U(−`)(−p(n),−c(n)), V (−`), X(−`), and X(`)
−n given its Markov blanket (x(`)n , V (`),
u(`−1)p(n) , u(`+1)
c(n) ), where p(n) is the parent node of n in the upper level and c(n) is the set of child nodes n in the lower level.Therefore, the conditional probability of U(`) can be factorized into the product of conditional probability of its rows
p(U(`)|X(`),V (`),U(`−1)
p ,U(`+1)c
)(S4)
=∏
n
p(u(`)
n |x(`)n ,V (`),u(`−1)
p(n) ,u(`+1)c(n)
).
By applying Bayes rule and given that the product of multiple Gaussian distributions is another Gaussian distribution,it can be shown that the conditional probability of un is a Gaussian distribution
p(u(`)
n |x(`)n ,V (`),u(`−1)
p(n) ,u(`+1)c(n)
)= N(u(`)
n |µ∗(`)n ,Σ∗(`)n )
∼∏
m
[δ(`)
nmN(x(`)nm|〈u
(`)n , v(`)
m 〉, σ2)]N(u(`)
n |u(`−1)p(n) , σ
2uI)∏
n′∈c(n)
[N(u(`+1)
n′ |u(`)n , σ2
uI)] (S5)
Σ∗(`)n =
|c(n)` + 1|σ2
uI +
1σ2
∑m
δ(`)nmv(`)
m v(`)Tm
−1
µ∗(`)n = Σ∗(`)n
1σ2
uI
u(`−1)p(n) +
∑n′∈c(n)
u(`+1)n′
+
1σ2
∑m
δ(`)nmx(`)
nmv(`)m
.(S6)
2
where |.| denotes set cardinality.With the similar argument, the conditional probability of V (`) can be factorized into the product of conditional
probability of its rows, where x:m is column m of x.
p(V (`)|X(`),U(`),V (`−1),V (`+1)
)(S7)
=∏
m
p(v(`)
m |x(`):m ,U
(`), v(`−1)m , v(`+1)
m
).
p(v(`)
m |x(`):m ,U
(`), v(`−1)m , v(`+1)
m
)= N(v(`)
m |µ∗(`)m ,Σ∗(`)m )
∼∏
n
[δ(`)
nmN(x(`)nm|〈u
(`)n , v(`)
m 〉, σ2)]
(S8)
N(v(`)m |v
(`−1)m , σ2
v I) N(v(`+1)m |v(`)
m , σ2v I)
where
Σ∗(`)m =
2σ2
vI +
1σ2
∑n
δ(`)nmu(`)
n u(`)Tn
−1
(S9)
and
µ∗(`)m = Σ∗(`)m
[1σ2
vI(v(`−1)
m + v(`+1)m
)+
1σ2
∑n
δ(`)nmx(`)
nmu(`)n
. (S10)
For a given matrix X, the sampler updates the latent factor matrices (U(`),V (`)) at every level `. Each sample at thelowest level is obtained by sampling the upper level matrices iteratively following a top-down and bottom-up order. Ateach iteration, we first do a bottom-up pass to sample (U(L),V (L)) to (U(1),V (1)), followed by a top-down pass to sample(U(1),V (1)) to (U(L),V (L)), and repeat the procedure to generate enough samples (Algorithm S1).
Algorithm S1 Gibbs Sampling for BHPMF1: for ` = 1, · · · , L do2: Initialize model parameters {U1(`),V1(`)}
3: for t = 1, · · · ,T do4: for ` = L, · · · , 1 do . bottom-up5: for each n = 1 · · ·N sample un in parallel (eqn S5):6: ut+1(`)
n ∼ p(ut(`)n |x
(`)n ,V t(`),ut(`−1)
p(n) ,ut(`+1)c(n) )
7: for each m = 1 · · ·M sample vm in parallel (eqn S8):8: vt+1(`)
m ∼ p(vt(`)m |x
(`)m ,U t+1(`), vt(`−1)
m , vt(`+1)m )
9: for ` = 1, · · · , L do . top-down10: for each n = 1 · · ·N sample un in parallel (eqn S5):11: ut+2(`)
n ∼ p(ut+1(`)n |x(`)
n ,V t+1(`),ut+1(`−1)p(n) ,ut+1(`+1)
c(n) )12: for each m = 1 · · ·M sample vm in parallel (eqn S8):13: vt+2(`)
m ∼ p(vt+1(`)m |x(`)
m ,U t+2(`), vt+1(`−1)m , vt+1(`+1)
m )
Where t denotes the trait, c(n) the child node, p(n) the parent node, l the hierarchical level, u and n the row (entityor, in our case, plant) side and v and m the column (trait) side. The used burn-in period was 100 samples with a lag of 2and a final number of 450 samples.
3
S1.4. aHPMF
HPMF was developed to fill gaps in plant trait matrices. To facilitate trait predictions at regional scale, we stopHPMF at species level, followed by a least squares regression of the residuals against environmental features. StoppingHPMF at species level separates phylogenetic conservatism from environmental drivers. Explicitly taking into accountenvironmental conditions as co-determinants of trait variation enables out-of-sample prediction.
Let Q` denote the number of distinct categories available at level ` of the taxonomy, e.g., Q1 is the number ofspecies available at ` = 1, and so on. We learn distinct regression model parameters w`
q for each category q at everylevel ` of the taxonomy by partitioning the observations into their respective categories, and also accounting for ahierarchical regularization among the parameters based on the taxonomic hierarchy.
Let X ∈ RQ×7 be the design matrix of climate and soil features (7 covariates, including a column of ones to handlea constant intercept), where the total number of observations, including all levels of the taxonomic hierarchy, is Q. LetY ∈ RQ×1 be the target residuals of BHPMF to be predicted for a given trait k. Let (X(`)
q ,Y (`)q ) be the subset of data that
belong to the qth category at level `. With w(`)q denoting the regression vector for category q at level `, we consider the
following objective function
E(w) =
L∑`=1
Q∑q=1
γ`∥∥∥Y (`)
q − X(`)q w(`)
q
∥∥∥2
+ λ
L∑`=1
Q∑q=1
‖w`q − w(`−1)
p(q) ‖2
(S11)
where w is the weight vectors over all categories, γ` is a weight term for minimizing the squared errors at level `, λis the trade-off parameter of regularization, and p(q) is the parent of category q. Using vector notations, the objectivefunction can be written as:
E(w) = (Y − Xw)Tγ(Y − Xw) + λwTLw , (S12)
where γ is a vector of γ` for all `, Y is a concatenated vector of Y (`)q , X is a block diagonal concatenation of X(`)
q ,and L is the graph Laplacian of the taxonomic hierarchy. We utilize gradient descent based methods for solving theoptimization problem (eqn S12), in a spirit similar to BHPMF. Finally, we use the learned parameters at the specieslevel for predicting the plant trait residuals of BHPMF over unobserved data instances during the testing phase.
S1.5. RMSE
RMSE is used for evaluation. Assuming there are in total T entries for test, at is the true value and at is the predictedvalue, RMSE is defined as:
RMS E =
√∑t
(at − at)2/T . (S13)
S1.6. Sensitivity analysis
RMSEs of the sensitivity analysis testing the effect of using a global versus local (RAINFOR)/regional (Europe)dataset to fill gaps, as well as the effect of different degrees of sparsity were calculated based on the results of threeBHPMF runs obtained for each subset (global data to fill local gaps, local data to fill local gaps (10%, 30%, 60% and80% extra gaps), as well as global data to fill regional gaps and regional data to fill regional gaps). In order to enablecomparison of the results, all data were z-transformed on the basis of the whole dataset. To ensure consistency in matrixlength, we constrained the random introduction of gaps by retaining one data point in every row (plant), also usingthe same gaps for the respective a) or b) runs. Gaps were added as percentage of the available trait data for each traitwhich resulted in different absolute degrees of missingness. It should also be noted that the total RMSE was calculatedindependently for the whole dataset and is not the average across all RMSEs per trait. Differences in RMSE withineach treatment are due to pre-processing (i.e. differences in the random setting of gaps) in the three different runs ratherthan BHPMF per se.
4
References
Casella, G., George, E.I., 1992. Explaining the gibbs sampler. The American Statistician 46, 167–174.Metropolis, A.W., Rosenbluth, M.N., Rosenbluth, A.H., Teller, H., Teller, E., 1953. Equation of state calculations by fast computing machines.
Journal of Chemical Physics 21, 10871092.Salakhutdinov, S., Mnih, A., 2008. Probabilistic matrix factorization. IEEE CS Press: Advances in Neural Information Processing Systems 20 (NIPS
07) .
5
S2. Definition of traits used in this study
Table S2.1: numbered code, units of measurement, plant functional traits, number of non-missing geo-referenced entries (Nr. geo-ref.) and definitionof the respective trait.
Code Unit Trait Nr. geo-ref. Definition1 mm2 mg−1 Specific leaf area 33001 One sided area of a fresh leaf divided by its oven-dry mass2 m Plant height 16465 Shortest distance between the upper boundary of main photo-
synthetic tissue or reproduction unit on a plant and the groundlevel
3 mg Seed dry mass 7311 Dry mass of a whole single seed4 g g−1 Leaf dry matter content 17331 Leaf dry mass per unit of leaf fresh mass (hydrated)5 mg mm−3 Stem specific density 9191 Oven-dry mass of a section of a plant’s main stem divided by its
fresh volume6 mm2 Leaf area 39438 One-sided projected surface area of a single leaf or leaf lamina7 mg g−1 Leaf nitrogen per weight 26882 Total amount of nitrogen per unit of leaf dry mass8 mg g−1 Leaf phosphorus per weight 11975 Total amount of phosphorus per unit of leaf dry mass9 g m−2 Leaf nitrogen per area 8180 Total amount of nitrogen per unit of leaf area (one-sided)10 mg Leaf fresh mass 11484 Fresh mass of a whole leaf11 g g−1 Leaf nitrogen/phosphorus ratio 5999 Ratio of leaf total nitrogen versus total phosphorus12 mg g−1 Leaf carbon per dry mass 8125 Total carbon per unit of leaf dry mass13 ‰ Leaf δ 15N 9022 foliar 15N:14N ratios relative to 15N:14N ratios in atmospheric N2
S3. References of contributing databases and number of traits contributed
Table S3.1: References of contributing databases in order of decreasing contribution as measured in the number of traits, where Ref.1 contributed23364 traits and Ref.56 37 traits. Where the database has not been published, yet, only the name of the database is given. For a list of contributedtraits see table S3.2, for the list of references, see S11
Ref. Database Reference1 The LEDA Tb (Kleyer et al., 2008)2 Global 15N Db (Craine et al., 2009, 2005)3 Panama Plant Traits Db (Wright et al., 2010)4 Global Leaf N, P Db (Reich et al., 2009)5 Catalonian Mediterranean Forest Trait Db (Ogaya and Penuelas, 2003)6 The VISTA Plant Trait Db (Garnier et al., 2007)7 Herbaceous Leaf Traits Db Old Field New York8 Panama Leaf Traits Db (Messier et al., 2010)9 Midwestern and Southern US Herbaceous Db10 The RAINFOR Plant Trait Db (Fyllas et al., 2009)11 Global Seed Mass, Plant Height Db (Moles et al., 2004, 2005)12 GLOPNET - Global Plant Trait Network Db (Wright et al., 2004, 2006)13 Sheffield Db (Cornelissen et al., 2001, 2003)14 VegClass CBM Global Db (Gillison and Carpenter, 1997)15 Floridian Leaf Traits Db (Cavender-Bares et al., 2006)16 Neotropic Plant Traits Db (Wright et al., 2007)17 Leaf and Whole Plant Traits Db (Shipley, 1989, 1995, 2002)18 Leaf Physiology Db (Kattge et al., 2009)19 Overton/Wright New Zealand Db20 New South Wales Db (Fonseca et al., 2000)21 Traits of Bornean Trees Db (Kurokawa and Nakashizuka, 2008)22 Chinese Leaf Traits Db (Han et al., 2005)23 The Netherlands Plant Traits Db (Ordonez et al., 2010)24 Leaf Biomechanics Db (Onoda et al., 2011)25 Sheffield-Iran-Spain Db (Dıaz et al., 2004)26 Quercus Leaf C and N Db27 Global Leaf Robustness and Physiology Db (Niinemets, 2001)28 Ponderosa Pine Forest Db (Laughlin et al., 2010)29 Ukraine Wetlands Plant Traits Db30 ECOCRAFT (Medlyn and Jarvis, 1999)31 Abisko & Sheffield Db (Cornelissen et al., 1996, 2004)32 ArtDeco Db (Cornwell et al., 2008)33 Photosynthesis and Leaf Characteristics Db34 Causasus Plant Traits Db35 New South Wales Plant Traits Db36 European Mountain Meadows Plant Traits Db (Bahn et al., 1999)37 CORDOBASE (Dıaz et al., 2004)38 Global Respiration Db (Reich et al., 2008)39 Tropical Rainforest Traits Db (Poorter and Bongers, 2006)40 Global A, N, P, SLA Db (Reich et al., 2009)41 ECOQUA South American Plant Traits Db (Muller et al., 2007)42 FAPESP Brazil Rainforest Db43 South African Woody Plants Db (ZLTP)44 French Massif Central Grassland Trait Db (Louault et al., 2005)45 Tropical Traits from West Java Db (Poorter, 2009)46 Jasper Ridge Californian Woody Plants Db (Preston et al., 2006)47 Hawaiian Leaf Traits Db (Penuelas et al., 2010)48 Wetland Dunes Db (Bodegom et al., 2005, 2008)49 Tropical Traits from West Java Db (Shioder et al., 2008)50 Tropical Plant Traits From Borneo Db (Swaine, 2007)51 Herbaceous Traits from the Oland Island Db (Hickler, 1999)52 Traits from Subarctic Plant Species Db (Freschet et al., 2010)53 Leaf and Whole-Plant Traits Db (Sack et al., 2003, 2005, 2006)54 Plant Traits in Pollution Gradients Db55 Cedar Creek Plant Physiology Db56 VirtualForests Trait Db (Gutierrez and Huth, 2012)
Tabl
eS3
.2:D
atab
ases
and
num
bero
ftra
itsco
ntri
bute
d.Fo
ralis
tofr
efer
ence
sse
eta
ble
S3.1
.SL
Ais
the
spec
ific
leaf
area
,SD
Mse
eddr
ym
ass,
LDM
Cle
afdr
ym
atte
rcon
tent
,SSD
stem
spec
ific
area
,LA
leaf
area
,Nle
afni
troge
n,P
leaf
phos
phor
us,N
area
leaf
nitro
gen
peru
nitl
eafa
rea,
LFM
leaf
fres
hm
ass,
Nto
Pth
era
tioof
leaf
nitr
ogen
tole
afph
osph
orus
,Cle
afca
rbon
and
15N
the
leaf
cont
ento
fdel
ta15
nitr
ogen
Ref
.SL
AH
eigh
tSD
ML
DM
CSS
DL
AN
PN
area
LFM
Nto
PC
15N
167
3613
0650
652
2310
9583
210
531
1030
1030
1054
53
5256
293
331
5000
1003
5355
214
5098
455
0255
0255
025
819
660
5483
2219
0213
0972
018
876
1560
3401
1573
1564
989
707
969
991
723
2123
2123
2123
2123
218
1866
1872
1880
1836
1886
1836
919
9716
4126
022
8689
2098
2254
1015
5588
315
0615
6315
5415
2515
6311
1700
8416
1223
7056
015
8520
6178
719
7574
513
2953
245
2852
2594
477
474
286
1447
8147
8115
2270
2255
3514
1656
714
3010
0727
6917
750
840
375
726
720
312
1230
040
031
218
927
312
942
1464
1910
9911
1810
9920
1043
1028
1043
2146
746
647
146
747
122
813
1179
2328
928
726
928
828
228
128
224
1735
2543
042
642
943
026
767
767
2775
511
222
612
722
625
2813
913
913
913
913
713
913
813
913
913
929
214
214
209
214
214
107
214
3033
821
639
694
196
112
3124
924
860
254
624
517
632
576
603
3357
857
834
144
102
144
144
156
129
144
155
3526
129
224
828
736
210
202
210
210
201
3717
713
714
467
2114
676
5570
5050
3826
730
267
7826
739
135
8216
413
582
135
8240
124
4912
712
712
411
941
656
4267
7714
467
7667
7676
Con
tinue
don
Nex
tPag
e...
Tabl
eS3
.2–
Con
tinue
d
Ref
.SL
AH
eigh
tSD
ML
DM
CSS
DL
AN
PN
area
LFM
Nto
PC
15N
4325
960
6060
6044
4312
943
4343
4343
4345
101
9610
110
146
5451
5454
5453
5347
8888
8886
4815
316
149
101
9610
150
3636
3536
3636
3651
8080
8052
4040
4040
4040
5317
430
1122
5440
8040
2640
5545
4545
5637
sum
4204
627
953
1262
022
571
9905
4246
932
246
1415
210
158
1210
375
2892
0410
820
S4. Map of TRY measurement sites
Figure S4.1: Measurement sites of trait data contributed to the TRY project (acquisition date 2012-08-25) on a map of mean annual temperature(extracted from the WorldClim dataset). Note the sparse spatial distribution of measurement sites, especially in extremely cold or hot areas.
S5. Location of Acer saccharum range map and soil and climate across the range of Acer saccharum
Figure S5.1: Geographic location of the species range map of Acer saccharum
Figure S5.2: Soil %clay (a), soil organic carbon (b), %silt (c), %clay (d), isothermality (e) and precipitation seasonality (f) across the species rangeof Acer saccharum. For a map of the geographic location of these maps, see Figure S5.1.
S6. Correlation between traits and environmental variables used in aHPMF
Table S6.1: Correlations between the raw trait values and the environmental variables used within aHPMF (MAT is the mean annual temperature,MAP is the mean annual precipitation, SOC soil organic carbon, AWC the annual water capacity and ISO isothermality
Trait % silt % sand/% silt % SOC MAT MAP AWC ISOSpecific leaf area 0.0654 0.0131 0.0248 -0.1905 -0.0707 0.0059 -0.1701Plant height -0.1725 -0.2779 -0.207 0.6358 0.5786 0.312 0.6951Seed mass -0.2304 -0.1278 -0.1291 0.5304 0.6136 0.3343 0.574LDMC -0.1949 -0.1329 -0.2054 0.3535 0.361 0.1684 0.3579Stem specific density 0.0587 -0.0784 -0.0133 0.2686 0.0491 -0.1442 0.1392Leaf area -0.0401 -0.3173 -0.0765 0.4836 0.4577 0.2924 0.4442Leaf nitrogen concentration -0.0237 -0.0341 -0.0592 0.0269 0.0861 0.0762 0.0888Leaf phosphorus concentration 0.2681 0.0057 0.0883 -0.3262 -0.1687 -0.1491 -0.2661Leaf nitrogen per area 0.0444 -0.0659 -0.0785 0.1289 0.0673 0.0349 0.0894Leaf fresh mass -0.6522 -0.2866 0.6947 0.705 0.6988 0.3934 0.7172Leaf nitrogen/phosphorus ratio -0.4176 -0.0927 -0.2129 0.4456 0.3186 0.1976 0.5253Leaf carbon per dry mass 0.0915 -0.1133 0.0045 0.0814 0.1301 -0.0303 0.0497Leaf δ 15N -0.4532 -0.1218 -0.3584 0.5313 0.2171 0.3115 0.5598
S7. Root mean squared error comparison between MEAN, BHPMF and aHPMF
Table S7.1: RMSE ± range of Species Mean (MEAN), HPMF and aHPMF with increasing taxonomic information. Latent dimension k=15 formatrix factorization methods. Lowest RMSE is shown in bold. Phylo is the phylogenetic group.
Taxonomic info MEAN HPMF aHPMFPhylo 0.9301 ± 0.0026 0.8362 ± 0.0103 ×
Phylo+family 0.7415 ± 0.0019 0.6956 ± 0.0113 ×
Phylo+family+genus 0.6009 ± 0.0021 0.5694 ± 0.0032 ×
Phylo+family+genus+species 0.5853 ± 0.0024 0.5009 ± 0.0034 0.5272 ± 0.0025
S8. Sensitivity analysis
Figure S8.1: Location of the extracted RAINFOR data (red) and the whole TRY data in this region (black).
Table S8.1: Number of observations corresponding to the percentage of added gaps of the RAINFOR database by trait, using a) only the RAINFORdata itself (Rainfor for Rainfor) or b) the whole TRY database (World for Rainfor) to fill these gaps. SLA is the specific leaf area, PlantHt is the Plantheight, SSD is the stem specific density, P is phosphorus, N is nitrogen and C is carbon.
Rainfor for Rainfor World for Rainfor0 10 30 60 80 0 10 30 60 80
Total 8.967 8.070 6.277 3.587 1.798 113.819 112.922 111.129 108.439 106.645SLA 1.360 1.209 954 538 273 33.001 32.850 32.595 32.179 31.914PlantHt 864 771 602 342 174 16.465 16.372 16.203 15.943 15.775SSD 1.327 1.186 936 543 260 9.191 9.050 8.800 8.407 8.124Leaf P 1.353 1.234 963 546 273 11.975 11.856 11.585 11.168 10.895Leaf N 1.364 1.237 967 542 290 26.882 26.755 26.485 26.060 25.808Leaf N/area 1.335 1.211 923 531 262 8.180 8.056 7.768 7.376 7.107Leaf C/dry mass 1.364 1.222 932 545 261 8.125 7.983 7.693 7.306 7.022
Figure S8.2: Location of the extracted European data (red) and the global TRY data (black).
Table S8.2: Number of observations in the gap-filling the European database exercise, using a) only the European data itself (Europe for Europe) orb) the whole TRY database (World for Europe) to fill these gaps. SLA is the specific leaf area, PlantHt is the Plant height, LDMC is the leaf drymatter content, SSD is the stem specific density, N is nitrogen, P is phosphorus and C is carbon
Europe for Europe World for EuropeTotal 155.147 204.389SLA 24.749 33.001PlantHt 13.959 16.465Seed mass 5.930 7.311LDMC 12.560 17.331SSD 3.675 9.191Leaf area 34.813 39.438Leaf N 18.857 26.882Leaf P 7.280 11.975Leaf N/area 5.014 8.180Leaf fresh mass 11.484 11.484Leaf N/P ratio 4.123 5.999Leaf C/dry mass 4.470 8.125Leaf δ 15N 8.233 9.007
Figure S8.3: RMSE of performing HPMF on the European cutout (red points in Figure S8.2) for the whole dataset (Total), specific leaf area (SLA),Plant height (PlantHt), seed mass, leaf dry matter content (LDMC), stem specific density (SSD), leaf area, leaf nitrogen, leaf phosphorus, leafnitrogen per area (LeafNArea), leaf fresh mass (LeafFmass), leaf nitrogen to phosphorus ratio (LeafNP), leaf carbon and leaf delta 15 nitrogen(LeafD15N). Grey and black boxplots (on the left and right for each trait respectively) are RMSE for imputations using just the European dataset(grey) or the whole global dataset (black) to fill the gaps
S9. Bi- and multi-variate relationships between traits, measured and imputed trait values
Figure S9.1: Figure is continued on next page.
observed plant height
Mea
n −
pre
dict
ed p
lant
hei
ght
−4
−2
0
2
4
−4 −2 0 2 4
heightPlant
observed SSD
Mea
n −
pre
dict
ed S
SD
−8
−6
−4
−2
0
−8 −6 −4 −2 0
densityStem specific
observed Leaf P
Mea
n −
pre
dict
ed L
eaf P
−3
−2
−1
0
1
2
−3 −2 −1 0 1 2
P
observed LDMC
Mea
n −
pre
dict
ed L
DM
C
−3.0
−2.5
−2.0
−1.5
−1.0
−0.5
−3.0 −2.0 −1.0
LDMC
observed foliar N/P ratio
Mea
n −
pre
dict
ed fo
liar
N/P
rat
io
1
2
3
4
1 2 3 4
N to P ratio
observed plant height
HP
MF
− p
redi
cted
pla
nt h
eigh
t
−4
−2
0
2
4
−4 −2 0 2 4
heightPlant
observed SSD
HP
MF
− p
redi
cted
SS
D
−8
−6
−4
−2
0
−8 −6 −4 −2 0
densityStem specific
observed Leaf P
HP
MF
− p
redi
cted
Lea
f P
−3
−2
−1
0
1
2
−3 −2 −1 0 1 2
P
observed LDMC
HP
MF
− p
redi
cted
LD
MC
−3.0
−2.5
−2.0
−1.5
−1.0
−0.5
−3.0 −2.0 −1.0
LDMC
observed foliar N/P ratio
HP
MF
− p
redi
cted
folia
r N
/P r
atio
1
2
3
4
1 2 3 4
N to P ratio
observed plant height
aHP
MF
− p
redi
cted
pla
nt h
eigh
t
−4
−2
0
2
4
−4 −2 0 2 4
heightPlant
observed SSD
aHP
MF
− p
redi
cted
SS
D
−8
−6
−4
−2
0
−8 −6 −4 −2 0
densityStem specific
observed Leaf P
aHP
MF
− p
redi
cted
Lea
f P
−3
−2
−1
0
1
2
−3 −2 −1 0 1 2
P
observed LDMC
aHP
MF
− p
redi
cted
LD
MC
−3.0
−2.5
−2.0
−1.5
−1.0
−0.5
−3.0 −2.0 −1.0
LDMC
observed foliar N/P ratio
aHP
MF
− p
redi
cted
folia
r N
/P r
atio
1
2
3
4
1 2 3 4
N to P ratio
observed Leaf N per area
Mea
n −
pre
dict
ed L
eaf N
per
are
a
−2
−1
0
1
2
−2 −1 0 1 2
N per area
observed Leaf C per dry massMea
n −
pre
dict
ed L
eaf C
per
dry
mas
s
5.8
6.0
6.2
6.4
5.8 6.0 6.2 6.4
dry massC per
observed Seed mass
Mea
n −
pre
dict
ed S
eed
mas
s
−5
0
5
10
−5 0 5 10
Seed mass
observed foliar N
Mea
n −
pre
dict
ed fo
liar
N
1
2
3
4
1 2 3 4
N
observed leaf area
Mea
n −
pre
dict
ed le
af a
rea
0
5
10
0 5 10
Leaf area
observed Leaf N per area
HP
MF
− p
redi
cted
Lea
f N p
er a
rea
−2
−1
0
1
2
−2 −1 0 1 2
N per area
observed Leaf C per dry massHP
MF
− p
redi
cted
Lea
f C p
er d
ry m
ass
5.8
6.0
6.2
6.4
5.8 6.0 6.2 6.4
dry massC per
observed Seed mass
HP
MF
− p
redi
cted
See
d m
ass
−5
0
5
10
−5 0 5 10
Seed mass
observed foliar N
HP
MF
− p
redi
cted
folia
r N
1
2
3
4
1 2 3 4
N
observed leaf area
HP
MF
− p
redi
cted
leaf
are
a
0
5
10
0 5 10
Leaf area
observed Leaf N per area
aHP
MF
− p
redi
cted
Lea
f N p
er a
rea
−2
−1
0
1
2
−2 −1 0 1 2
N per area
observed Leaf C per dry massaHP
MF
− p
redi
cted
Lea
f C p
er d
ry m
ass
5.8
6.0
6.2
6.4
5.8 6.0 6.2 6.4
dry massC per
observed Seed mass
aHP
MF
− p
redi
cted
See
d m
ass
−5
0
5
10
−5 0 5 10
Seed mass
observed foliar N
aHP
MF
− p
redi
cted
folia
r N
1
2
3
4
1 2 3 4
N
observed leaf area
aHP
MF
− p
redi
cted
leaf
are
a
0
5
10
0 5 10
Leaf area
Figure S9.1: Scatter plot for pairs of traits on true test data (x) versus predicted test data (y) using the MEAN (blue triangles), HPMF (red circles)and aHPMF (green crosses) approach. All values are given in natural log-normal (base e) transformed space. Note how HPMF and aHPMF imputedvalues are generally less scattered and closer to the 1:1 line than MEAN imputed values. This holds true for structural traits (such as plant height) aswell as physiological traits (like the foliar N to P ratio). All correlations were highly significant at the p <0.001 level.
Figure S9.2: Scatter plots of trait-trait correlations for observed data (left column), Mean predictions (second column) and BHPMF predictions(right column) for specific leaf area (SLA) versus foliar nitrogen (N) concentration (top row) and foliar phosphorus (P) concentration versus foliar Nconcentration (bottom row). Values are given in natural log-normal (base e) transformed space). Note the correlation coefficients (R2) given.
−0.8 −0.4 0.0 0.4
−0.
40.
00.
4
Procrustes errors
PCA 3
PC
A 4
Figure S9.3: Procrustes analysis errors for the 3rd and 4th Principal Component axes comparing a PCA performed on the original, gappy RAINFORdata with a PCA performed on the RAINFOR data with artificially introduced gaps being filled using BHPMF
−1.0 −0.5 0.0 0.5 1.0
−0.
40.
00.
40.
8
Procrustes errors
PCA 5
PC
A 6
Figure S9.4: Procrustes analysis errors for the 5th and 6th Principal Component axes comparing a PCA performed on the original, gappy RAINFORdata with a PCA performed on the RAINFOR data with artificially introduced gaps being filled using BHPMF
Figure S9.5: Bivariate Pearson correlations between traits. Colors indicate strength of relationship with deep red being positive and deep bluenegative correlations. The plots in the diagonal show the distribution of each trait with the number indicating the trait according to Table S2.1
S10. Gibbs sampler results
Figure S10.1: Gibbs sampler generated density plots of BHPMF-estimated natural log-normal transformed leaf dry matter content, foliar nitrogenconcentration and specific leaf area (SLA) of three Acer saccharum (top row) and Pinus sylvestris (bottom row) individuals respectively.
10% 20% 30% 40% 50% 60% 70% 80% 90% 100%0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1All traits
Ave
rag
e o
f R
MS
E
Percentage of Data with Ascending StdFigure S10.2: Gibbs sampler for the 13-trait data set with the inverse of prediction confidence (SD) on the x-axis and the prediction error (RMSE)on the y-axis. The x-axis is produced by sorting the data points in ascending order of their SD, dividing the test sets evenly into 10 classes withascending SD. I.e., the first class contains the 10% of predictions with the lowest SD, the second class the 10% of predictions with the second lowestSD, and so on. We regress the average RMSE per class against the ordered SD.
10% 20% 30% 40% 50% 60% 70% 80% 90% 100%0
0.2
0.4
0.6
0.8
1
1.2
1.41− SLA
Ave
rage
of R
MS
E
Percentage of Data with Ascending Std10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
1.1
1.22− PlantHeight
Ave
rage
of R
MS
E
Percentage of Data with Ascending Std10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
1.1
1.23− SeedMass
Ave
rage
of R
MS
E
Percentage of Data with Ascending Std
10% 20% 30% 40% 50% 60% 70% 80% 90% 100%0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
1.1
1.24− LDMC
Ave
rage
of R
MS
E
Percentage of Data with Ascending Std10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
−0.2
0
0.2
0.4
0.6
0.8
1
1.25− StemSpecificDensity
Ave
rage
of R
MS
E
Percentage of Data with Ascending Std10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
0
0.2
0.4
0.6
0.8
1
1.2
1.46− LeafArea
Ave
rage
of R
MS
E
Percentage of Data with Ascending Std
10% 20% 30% 40% 50% 60% 70% 80% 90% 100%0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
1.1
1.27− LeafN
Ave
rage
of R
MS
E
Percentage of Data with Ascending Std10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
1.1
1.28− LeafP
Ave
rage
of R
MS
E
Percentage of Data with Ascending Std10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
0
0.2
0.4
0.6
0.8
1
1.2
1.49− Leaf nitrogen (N) content per area
Ave
rage
of R
MS
E
Percentage of Data with Ascending Std
10% 20% 30% 40% 50% 60% 70% 80% 90% 100%0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.810− Leaf fresh mass
Ave
rage
of R
MS
E
Percentage of Data with Ascending Std10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
0.2
0.4
0.6
0.8
1
1.2
1.4
1.611− Leaf nitrogen/phosphorus (N/P) ratio
Ave
rage
of R
MS
E
Percentage of Data with Ascending Std10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
0.2
0.4
0.6
0.8
1
1.2
1.4
1.612− Leaf carbon (C) content per dry mass
Ave
rage
of R
MS
E
Percentage of Data with Ascending Std10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
1.1
1.213− Leaf delta 15N
Ave
rage
of R
MS
E
Percentage of Data with Ascending Std
Figure S10.3: Gibbs sampler for each of the 13 traits with the inverse of prediction confidence (SD) on the x-axis and the prediction error (RMSE)on the y-axis. The x-axis is produced as follows. If we have 100 total entries, we rank them by SD in ascending order, so 10% represent the first 10entries with the lowest SD, 20% represent the second 10% entries with the lowest SD etc. From top to bottom and left to right: specific leaf area(SLA), Plant height (PlantHt), seed mass, leaf dry matter content (LDMC), stem specific density (SSD), leaf area, leaf nitrogen, leaf phosphorus, leafnitrogen per area (LeafNArea), leaf fresh mass (LeafFmass), leaf nitrogen to phosphorus ratio (LeafNP), leaf carbon and leaf delta 15 nitrogen(LeafD15N)
S11. Additional references of data contributors
References
Bahn, M., Wohlfahrt, G., Haubner, E., Horak, I., Michaeler, W., Rottmar, K., Tappeiner, U., Cernusca, A., 1999. Leaf photosynthesis, nitrogencontents and specific leaf area of 30 grassland species in differently managed mountain ecosystems in the eastern alps, in: Land-Use Changes inEuropean Mountain Ecosystems. ECOMONT- Concept and Results. in: Land-Use Changes in European Mountain Ecosystems. ECOMONT-Concept and Results, ed.: Cernusca, A. and Tappeiner, U. and Bayfield, N., Blackwell Wissenschaft, Berlin.
Bodegom, P.M.v., Kanter, M., Aerts, C.B.R., 2005. Radial oxygen loss, a plastic property of dune slack plant species. Plant and Soil 271, 351–364.Bodegom, P.M.v., Sorrell, B.K., Oosthoek, A., Bakker, C., Aerts, R., 2008. Separating the effects of partial submergence and soil oxygen demand on
plant physiology. Ecology 89, 193–204.Cavender-Bares, J., Keen, A., Miles, B., 2006. Phylogenetic structure of floridian plant communities depends on taxonomic and spatial scale.
Ecology 87, 109–122.Cornelissen, J., Aerts, R., Cerabolini, B., Werger, M., van der Heijden, M., 2001. Carbon cycling traits of plant species are linked with mycorrhizal
strategy. Oecologia 129, 611–619.Cornelissen, J., Cerabolini, B., Castro-Dez, P., Villar-Salvador, P., Montserrat-Mart, G., Puyravaud, J., Maestro, M., Werger, M., Aerts, R., 2003.
Functional traits of woody plants: correspondence of species rankings between field adults and laboratory-grown seedlings? Journal of VegetationScience 14, 311–22.
Cornelissen, J.H.C., Diez, P.C., Hunt, R., 1996. Seedling growth, allocation and leaf attributes in a wide range of woody plant species and types.Journal of Ecology 84, 755–765.
Cornelissen, J.H.C., Quested, H.M., Gwynn-Jones, D., Van Logtestijn, R.S.P., De Beus, M.A.H., Kondratchuk, A., Callaghan, T.V., Aerts, R., 2004.Leaf digestibility and litter decomposability are related in a wide range of subarctic plant species and types. Functional Ecology 18, 779–786.
Cornwell, W.K., Cornelissen, J.H.C., Amatangelo, K., Dorrepaal, E., Eviner, V.T., Godoy, O., Hobbie, S.E., Hoorens, B., Kurokawa, H., Prez-Harguindeguy, N., Quested, H.M., Santiago, L.S., Wardle, D.A., Wright, I.J., Aerts, R., Allison, S.D., Van Bodegom, P., Brovkin, V., Chatain,A., Callaghan, T.V., Daz, S., Garnier, E., Gurvich, D.E., Kazakou, E., Klein, J.A., Read, J., Reich, P.B., Soudzilovskaia, N.A., Vaieretti, M.V.,Westoby, M., 2008. Plant species traits are the predominant control on litter decomposition rates within biomes worldwide. Ecology Letters 11,1065–1071.
Craine, J.M., Elmore, A.J., Aidar, M.P.M., Bustamante, M., Dawson, T.E., Hobbie, E.A., Kahmen, A., Mack, M.C., McLauchlan, K.K., Michelsen,A., Nardoto, G.B., Pardo, L.H., Peuelas, J., Reich, P.B., Schuur, E.A.G., Stock, W.D., Templer, P.H., Virginia, R.A., Welker, J.M., Wright, I.J.,2009. Global patterns of foliar nitrogen isotopes and their relationships with climate, mycorrhizal fungi, foliar nutrient concentrations, andnitrogen availability. New Phytologist 183, 980–992.
Craine, J.M., Lee, W.G., Bond, W.J., Williams, R.J., Johnson, L.C., 2005. Environmental constraints on a global relationship among leaf and roottraits of grasses. Australian Journal of Botany 86, 12–19.
Dıaz, S., Hodgson, J., Thompson, K., Cabido, M., Cornelissen, J., Jalili, A., Montserrat-Martı, G., Grime, J., Zarrinkamar, F., Asri, Y., Band, S.,Basconcelo, S., Castro-Dıez, P., Funes, G., Hamzehee, B., Khoshnevi, M., Perez-Harguindeguy, N., Perez-Rontome, M., Shirvany, F., Vendramini,F., Yazdani, S., Abbas-Azimi, R., Bogaard, A., Boustani, S., Charles, M., Dehghan, M., de Torres-Espuny, L., Falczuk, V., Guerrero-Campo, J.,Hynd, A., Jones, G., Kowsary, E., Kazemi-Saeed, F., Maestro-Martınez, M., Romo-Dıez, A., Shaw, S., Siavash, B., Villar-Salvador, P., Zak, M.,2004. The plant traits that drive ecosystems: Evidence from three continents. Journal of Vegetation Science 15, 295–304.
Fonseca, C.R., Overton, J.M., Collins, B., Westoby, M., 2000. Shifts in trait-combinations along rainfall and phosphorus gradients. Journal ofEcology 88, 964–977.
Freschet, G.T., Cornelissen, J.H.C., Van Logtestijn, R.S.P., Aerts, R., 2010. Evidence of the plant economics spectrum in a subarctic flora. Journal ofEcology 98, 362–373.
Fyllas, N.M., Patino, S., Baker, T.R., Bielefeld Nardoto, G., Martinelli, L.A., Quesada, C.A., Paiva, R., Schwarz, M., Horna, V., Mercado, L.M.,Santos, A., Arroyo, L., Jimenez, E.M., Luizao, F.J., Neill, D.A., Silva, N., Prieto, A., Rudas, A., Silviera, M., Vieira, I.C.G., Lopez-Gonzalez,G., Malhi, Y., Phillips, O.L., Lloyd, J., 2009. Basin-wide variations in foliar properties of amazonian forest: phylogeny, soils and climate.Biogeosciences 6, 2677–2708.
Garnier, E., Lavorel, S., Ansquer, P., Castro, H., Cruz, P., Dolezal, J., Eriksson, O., Fortunel, C., Freitas, H., Golodets, C., Grigulis, K., Jouany, C.,Kazakou, E., Kigel, J., Kleyer, M., Lehsten, V., Lep, J., Meier, T., Pakeman, R., Papadimitriou, M., Papanastasis, V.P., Quested, H., Qutier, F.,Robson, M., Roumet, C., Rusch, G., Skarpe, C., Sternberg, M., Theau, J.P., Thbault, A., Vile, D., Zarovali, M.P., 2007. Assessing the effectsof land-use change on plant traits, communities and ecosystem functioning in grasslands: A standardized methodology and lessons from anapplication to 11 european sites. Annals of Botany 99, 967–985.
Gillison, A.N., Carpenter, G., 1997. A generic plant functional attribute set and grammar for dynamic vegetation description and analysis. FunctionalEcology 11, 775–783.
Gutierrez, A.G., Huth, A., 2012. Successional stages of primary temperate rainforests of chiloe island, chile. Perspectives in Plant Ecology, Evolutionand Systematics 14, 243 – 256.
Han, W.X., Fang, J.Y., Guo, D.L., Zhang, Y., 2005. Leaf nitrogen and phosphorus stoichiometry across 753 terrestrial plant species in china. NewPhytologist 168, 377–385.
Hickler, T., 1999. Plant functional types and community characteristics along environmental gradients on Oland’s Great Alvar (Sweden). Master’sthesis. University of Lund, Sweden.
Kattge, J., Knorr, W., Raddatz, T., Wirth, C., 2009. Quantifying photosynthetic capacity and its relationship to leaf nitrogen content for global-scaleterrestrial biosphere models. Global Change Biology 15, 976 – 991.
Kleyer, M., Bekker, R., Knevel, I., Bakker, J., Thompson, K., Sonnenschein, M., Poschlod, P., Van Groenendael, J., Klimes, L., Klimesova, J., Klotz,S., Rusch, G., Hermy, M., Adriaens, D., Boedeltje, G., Bossuyt, B., Dannemann, A., Endels, P., Gotzenberger, L., Hodgson, J., Jackel, A.K.,Kuhn, I., Kunzmann, D., Ozinga, W., Romermann, C., Stadler, M., Schlegelmilch, J., Steendam, H., Tackenberg, O., Wilmann, B., Cornelissen, J.,
Eriksson, O., Garnier, E., Peco, B., 2008. The leda traitbase: a database of life-history traits of the northwest european flora. Journal of Ecology96, 1266–1274.
Kurokawa, H., Nakashizuka, T., 2008. Leaf herbivory and decomposability in a malaysian tropical rain forest. Ecology 89, 2645–2656.Laughlin, D.C., Leppert, J.J., Moore, M.M., Sieg, C.H., 2010. A multi-trait test of the leaf-height-seed plant strategy scheme with 133 species from a
pine forest flora. Functional Ecology 24, 493–501.Louault, F., Pillar, V., Aufrre, J., Garnier, E., Soussana, J.F., 2005. Plant traits and functional types in response to reduced disturbance in a
semi-natural grassland. Journal of Vegetation Science 16, 151–160.Medlyn, B.E., Jarvis, P.G., 1999. Design and use of a database of model parameters from elevated co2 experiments. Ecological Modelling 124,
69–83.Messier, J., McGill, B.J., Lechowicz, M.J., 2010. How do traits vary across ecological scales? a case for trait-based ecology. Ecology Letters 13,
838–848.Moles, A.T., Ackerly, D.D., Webb, C.O., Tweddle, J.C., Dickie, J.B., Westoby, M., 2005. A brief history of seed size. Science 307, 576–580.Moles, A.T., Falster, D.S., Leishman, M.R., Westoby, M., 2004. Small-seeded species produce more seeds per square metre of canopy per year, but
not per individual per lifetime. Journal of Ecology 92, 384–396.Muller, S.C., Overbeck, G.E., Pfadenhauer, J., Pillar, V.D., 2007. Plant functional types of woody species related to fire disturbance in forestgrassland
ecotones. Plant Ecology 189, 1–14.Niinemets, U., 2001. Global-scale climatic controls of leaf dry mass per area, density, and thickness in trees and shrubs. Ecology 82, 453–469.Ogaya, R., Penuelas, J., 2003. Comparative field study of quercus ilex and phillyrea latifolia: photosynthetic response to experimental drought
conditions. Environmental and Experimental Botany 50, 137 – 148.Onoda, Y., Westoby, M., Adler, P.B., Choong, A.M.F., Clissold, F.J., Cornelissen, J.H.C., Dıaz, S., Dominy, N.J., Elgart, A., Enrico, L., Fine, P.V.A.,
Howard, J.J., Jalili, A., Kitajima, K., Kurokawa, H., McArthur, C., Lucas, P.W., Markesteijn, L., Perez-Harguindeguy, N., Poorter, L., Richards,L., Santiago, L.S., Sosinski, E.E., Van Bael, S.A., Warton, D.I., Wright, I.J., Joseph Wright, S., Yamashita, N., 2011. Global patterns of leafmechanical properties. Ecology Letters 14, 301–312.
Ordonez, J.C., Van Bodegom, P.M., Witte, J.P.M., Bartholomeus, R.P., Hal, J.R., Aerts, R., 2010. Plant strategies in relation to resource supply inmesic to wet environments: Does theory mirror nature? The American Naturalist 175, 225–239.
Penuelas, J., Sardans, J., Llusia, J., Owen, S.M., Carnicer, J., Giambelluca, T.W., Rezende, E.L., Waite, M., Niinemets, U., 2010. Faster returns onleaf economics and different biogeochemical niche in invasive compared with native plant species. Global Change Biology 16, 2171–2185.
Poorter, L., 2009. Leaf traits show different relationships with shade tolerance in moist versus dry tropical forests. New Phytologist 181, 890–900.Poorter, L., Bongers, F., 2006. Leaf traits are good predictors of plant performance across 53 rain forest species. Ecology 87, 1733–1743.Preston, K.A., Cornwell, W.K., DeNoyer, J.L., 2006. Wood density and vessel traits as distinct correlates of ecological strategy in 51 california coast
range angiosperms. New Phytologist 170, 807–818.Reich, P.B., Oleksyn, J., Wright, I.J., 2009. Leaf phosphorus influences the photosynthesis-nitrogen relation: a cross-biome analysis of 314 species.
Oecologia 160, 207–212.Reich, P.B., Tjoelker, M.G., Pregitzer, K.S., Wright, I.J., Oleksyn, J., Machado, J.L., 2008. Scaling of respiration to nitrogen in leaves, stems and
roots of higher land plants. Ecology Letters 11, 793–801.Sack, L., Cowan, P.D., Jaikumar, N., Holbrook, N.M., 2003. The hydrology of leaves: co-ordination of structure and function in temperate woody
species. Plant, Cell and Environment 26, 1343–1356.Sack, L., Frole, K., 2006. Leaf structural diversity is related to hydraulic capacity in tropical rain forest trees. Ecology 87, 483491.Sack, L., Tyree, M.T., 2005. Leaf Hydraulics and Its Implications in Plant Structure and Function. Oxford, UK.Shioder, S., Rahajoe, J.S., Kohyama, T., 2008. Variation in longevity and traits of leaves among co-occurring understory plants in a tropical montane
forest. Journal of Tropical Ecology 24, 121–133.Shipley, B., 1989. The use of above-ground maximum relative growth rate as an accurate predictor of whole-plant maximum relative growth rate.
Functional Ecology 3, 771–775.Shipley, B., 1995. Structured interspecific determinants of specific leaf area in 34 species of herbaceous angiosperms. Functional Ecology 9,
312–319.Shipley, B., 2002. Trade-offs between net assimilation rate and specific leaf area in determining relative growth rate: relationship with daily
irradiance. Functional Ecology 16, 682–689.Swaine, E.K., 2007. Ecological and evolutionary drivers of plant community assembly in a Bornean rain forest. Ph.D. thesis. University of Aberdeen,
UK.Wright, I.J., Ackerly, D.D., Bongers, F., Harms, K.E., Ibarra-Manriquez, G., Martinez-Ramos, M., Mazer, S.J., Muller-Landau, H.C., Paz, H., Pitman,
N.C.A., Poorter, L., Silman, M.R., Vriesendorp, C.F., Webb, C.O., Westoby, M., Wright, S.J., 2007. Relationships among ecologically importantdimensions of plant trait variation in seven neotropical forests. Annals of Botany 99, 1003–1015.
Wright, I.J., Reich, P.B., Atkin, O.K., Lusk, C.H., Tjoelker, M.G., Westoby, M., 2006. Irradiance, temperature and rainfall influence leaf darkrespiration in woody plants: evidence from comparisons across 20 sites. New Phytologist 169, 309–319.
Wright, I.J., Reich, P.B., Westoby, M., Ackerly, D.D., Baruch, Z., Bongers, F., Cavender-Bares, J., Chapin, T., Cornelissen, J.H.C., Diemer, M.,Flexas, J., Garnier, E., Groom, P.K., Gulias, J., Hikosaka, K., Lamont, B.B., Lee, T., Lee, W., Lusk, C., Midgley, J.J., Navas, M.L., Niinemets, U.,Oleksyn, J., Osada, N., Poorter, H., Poot, P., Prior, L., Pyankov, V.I., Roumet, C., Thomas, S.C., Tjoelker, M.G., Veneklaas, E.J., Villar, R., 2004.The worldwide leaf economics spectrum. Nature 428, 821 – 827.
Wright, S.J., Kitajima, K., Kraft, N.J., Reich, P.B., Wright, I.J., Bunker, D.E., Condit, R., Dalling, J.W., Davies, S.J., Dıaz, S., Engelbrecht, B.M.,Harms, K.E., Hubbell, S.P., Marks, C.O., Ruiz-Jaen, M.C., Salvador, C.M., Zanne, A.E., 2010. Functional traits and the growth-mortality tradeoff
in tropical trees. Ecology 9, 3664–3674.
Individual contributions of the authors: The concept of the paper was developed by Franziska Schrodt, Jens Kattge, Arindam Banerjee and Peter B. Reich. Development of methods by Hanhuai Shan, Farideh Fazayeli, Anuj Karpatne, Franziska Schrodt, Arindam Banerjee and Jens Kattge Data analyses and sensitivity analyses were performed by Franziska Schrodt, Julia Joswig and Farideh Fazayeli. Trait data were provided by the TRY initiative, including major contributions by Jens Kattge, Gerhard Boenisch, Sandra Diaz, John Dickie, Andy Gillison, Sandra Lavorel, Paul Leadley, Christian Wirth, Ian J. Wright, S. Joseph Wright and Peter B. Reich. Franziska Schrodt, Hanhuai Shan, Farideh Fazayeli and Jens Kattge wrote the paper with contributions from all co-authors.