An integrated pan-tropical biomass map using multiple ...
Transcript of An integrated pan-tropical biomass map using multiple ...
1
An integrated pan-tropical biomass map using multiple reference datasets
Avitabile V. 1, Herold M. 1, Heuvelink G.B.M. 1, Lewis S.L. 2,3, Phillips O.L. 2, Asner G. P. 4,
Asthon P., Banin L.F.5, Bayol N. 6, Berry N.12, Boeckx P., de Jong B. 7, DeVries B. 1, Girardin C. 8,
Kearsley E. 9, Lindsell J 10, Lopez-Gonzalez G. 2, Lucas R. 11, Malhi Y. 8, Morel A. 8, Mitchard E. 12,
Nagy L., Qie L.2, Quinones M. 13, Ryan C. 12, Slik F., Sunderland, T., Vaglio Laurin G. 14, Valentini
R. 15, Verbeeck H. 9, Wijaya A. 16, Willcock S. 17
(1). Wageningen University, the Netherlands; (2). University of Leeds, UK; (3).Universtity College
London, UK; (4). Carnegie Institution for Science, USA; (5). Centre for Ecology and Hydrology,
UK. (6). Foret Ressources Management, France; (7). Ecosur, Mexico; (8). University of Oxford,
UK; (9) Ghent University, Belgium; (10). The RSPB Centre for Conservation Science, UK.; (11).
Aberystwyth University, Australia; (12). University of Edinburgh, UK; (13). SarVision, the
Netherlands; (14). Centro Euro-Mediterraneo sui Cambiamenti Climatici, Italy; (15). Tuscia
University, Italy; (16). Center for International Forestry Research, Indonesia; (17). University of
Southampton, UK;
Abstract
We combined two existing biomass datasets (input maps) into a pan-tropical biomass map at 1 km
resolution using an unprecedented reference dataset and a data fusion approach, achieving lower
errors than the input maps and almost unbiased estimates at the continental scale. A variety of field
observations and locally-calibrated high-resolution biomass maps - not used by the input maps -
were harmonized and upscaled to the map resolution, providing 15,969 biomass estimates at 1 km
resolution (reference dataset). The input maps were integrated using a data fusion approach based
on bias removal and weighted linear averaging that incorporates and spatializes the biomass
patterns indicated by the reference data. The method was applied independently in areas (strata)
with homogeneous error patterns of the input maps, which were estimated from the reference data
and additional covariates. The fused map showed biomass stocks for the tropics 15% and 19%
lower than the two input maps, and a different spatial pattern. Compared to the input maps, the
fused map shows higher biomass density in the dense forest areas in the Congo basin, West Africa
Eastern Amazon and South-East Asia, and lower values in Central America and in most dry
vegetation areas of Africa. The validation exercise, based on 3,619 estimates from the reference
2
dataset not used in the fusion process, showed that the fused map had a RMSE 8 – 74% lower than
that of the input maps and, most importantly, nearly unbiased estimates (mean bias 4 Mg dry mass
ha-1 vs. 21 and 28 Mg ha-1 for the input maps). The fusion method can be applied at any scale
including the policy-relevant national level, where it can provide improved biomass estimates by
integrating existing regional biomass maps as input maps and additional, country-specific reference
datasets.
Keywords: aboveground biomass, carbon cycle, forest plots, tropical forest, forest inventory,
REDD+, satellite mapping, remote sensing,
Introduction
Recently, considerable efforts have been made to better quantify the amounts and spatial
distribution of aboveground biomass, a key parameter for estimating carbon emissions and
removals due to land-use change, and related impacts on climate (Baccini et al., 2012; Harris et al.,
2012; Houghton et al., 2012; Saatchi et al. 2011; Mitchard et al. 2014). Particular attention has been
given to the tropical regions, where uncertainties are higher (Ziegler et al., 2012; Grace et al., 2014).
In addition to ground observations acquired by research networks or for forest inventory purposes,
several biomass maps have been recently produced at different scales, using a variety of empirical
modelling approaches based on remote sensing data calibrated by field observations (e.g., Goetz et
al., 2011; Birdsey et al., 2013). Biomass maps at moderate resolution have been produced for the
entire tropical belt by integrating various satellite observations (Saatchi et al., 2011; Baccini et al.,
2012), while higher resolution datasets have been produced at local or national level using medium-
high resolution satellite data (e.g., Avitabile et al., 2012; Cartus et al., 2014), sometimes in
combination with airborne Light Detection and Ranging (LiDAR) data (Asner et al., 2012a, 2012b,
2013, 2014a). The various datasets often have different purposes: research plots provide a detailed
and accurate estimation of biomass (and other ecological parameters or processes) at the local level,
forest inventory networks using a sampling approach to obtain statistics of biomass stocks (or
growing stock volume) per forest type at the sub-national or national level, while high-resolution
biomass maps can provide detailed and spatially explicit estimates of biomass density to assist
natural resource management, and large scale datasets depict biomass distribution for global-scale
carbon accounting and modelling.
3
In the context of the United Nations mechanism for Reducing Emissions from Deforestation and
forest Degradation (REDD+), emission estimates obtained from spatially explicit biomass datasets
may be favoured compared to those based on mean values derived from plot networks. This
preference stems from the fact that plot networks are not designed to represent land cover change
events, which usually do not occur randomly and may affect forests with biomass density
systematically different from the mean value (Baccini and Asner, 2013). With very few tropical
countries having national biomass maps or reliable statistics on forest carbon stocks, regional maps
may provide advantages compared to the use of default mean values (e.g., IPCC (2006) Tier 1
values) to assess emissions from deforestation, if their accuracy is reasonable and their estimates are
not affected by systematic errors (Avitabile et al., 2011). However, these conditions are difficult to
assess since proper validation of regional biomass maps remains problematic, given their large area
coverage and large mapping unit (Mitchard et al., 2013), while ground observations are only
available for a limited number of small sample areas.
The comparison of two recent pan-tropical biomass maps (Saatchi et al., 2011; Baccini et al., 2012)
reveals substantial differences between the two products (Mitchard et al., 2013). Further
comparison with ground observations and high-resolution maps in the Amazon basin indicated
substantially different biomass patterns at regional scales (Mitchard et al., 2014; Baccini and Asner,
2013, Hills et al., 2013). Such comparisons have stimulated a debate over the use and capabilities of
different types of biomass products (Saatchi et al., 2014; Langner et al., 2014) and have highlighted
both the importance and sometimes the lack of integration of different datasets. On one hand, the
two pan-tropical maps are consistent in terms of methodology because both use the same primary
data source (GLAS LiDAR) alongside a similar modelling approach to upscale the LiDAR data to
larger scales. Moreover, they have the advantage of being calibrated using hundreds of thousands of
biomass estimates derived from height metrics computed by a spaceborne LiDAR sensor distributed
over the tropics. However, such maps are based on remotely sensed variables that do not directly
measure biomass, but are sensitive to canopy cover and canopy height parameters that do not fully
capture the biomass variability of complex tropical forests. Furthermore, both products assume
global or continental allometric relationships in which biomass varies only with stand height, and
further errors are introduced by upscaling the calibration data to the coarser satellite data. On the
other hand, ground plots use allometric equations to estimate biomass at individual tree level using
directly measurable parameters such as diameter, height and species identity (hence wood density).
However, they have limited coverage, are not error-free, and compiling various datasets over large
areas is made more complex due to differing sampling strategies (e.g., stratification (or not) of
4
landscapes, plot size, minimum diameter of trees measured). Considering the rapid increase of
biomass observations at different scales and the different capabilities and limitations of the various
datasets, it is becoming more and more important to identify strategies that are capable of making
best use of existing information and optimally integrate various data sources for improved large
area biomass assessment (e.g., see Willcock et al. 2012).
In the present study, we compiled existing ground observations and high-resolution biomass maps
to obtain a reference dataset of high quality aboveground biomass data for the tropical region,
including both plot data and high-resoltion biomass maps (objective 1). This reference dataset was
used to assess the two pan-tropical biomass maps (objective 2) and to combine them in a fused map
that optimally integrates the two maps, based on the method presented by Ge et al. (2014) (objective
3). Lastly, the fused map was compared to known biomass patterns and stocks across the tropics
(objective 4).
Overall, the approach consisted of pre-processing, screening and harmonizing the Saatchi and
Baccini maps (called ‘input maps’), the high-resolution biomass maps (called ‘reference maps’) and
the field plots (called ‘reference plots’; ‘reference dataset’ refers to the maps and plots combined) to
a common spatial resolution and geospatial reference system (Figure 1). The input maps were
combined using bias removal and weighted linear averaging (‘fusion’). The fusion model was
applied independently in areas representing different error patterns of the input maps (called ‘error
strata’), which were estimated from the reference data and additional covariates (called ‘covariate
maps’). The reference dataset included only a subset of the reference maps (i.e., the cells with
highest confidence) and if a stratum was lacking reference data (‘reference data gaps’), additional
data were extracted from the reference maps (‘consolidation’). The fused map was validated using
independent data and its uncertainty quantified using model parameters. In this study, the term
biomass refers to aboveground live woody biomass and is reported in units of Mg dry mass ha-1.
5
Figure 1: methodology flowchart
Data & Methods
Input maps
The input maps used for this study were the two pan-tropical datasets published by Saatchi et al.
(2011) and Baccini et al. (2012), hereafter referred to as the Saatchi and Baccini maps individually,
or collectively as input maps. The Baccini map was provided in MODIS sinusoidal projection with
a spatial resolution of 463 m while the Saatchi map is in a geographic projection (WGS-84) at
0.00833 degrees (c. 1 km) pixel size. The two datasets were harmonized by first projecting the
Baccini map to the coordinate system of the Saatchi map using the Geospatial Data Abstraction
Library (www.gdal.org) and then aggregating to match its spatial resolution and grid. Spatial
aggregation was performed by computing the mean value of the pixels whose centre was located
6
within each 1 km cell of the Saatchi map. Resampling was then undertaken using the nearest
neighbor method.
Reference dataset
The reference dataset comprised individual tree-based field data and high-resolution biomass maps.
The field data included biomass estimates derived from field measurement of tree parameters and
allometric equations. The biomass maps included high-resolution (≤ 100 m) datasets derived from
satellite data using empirical models calibrated and validated using local ground observations and,
in some cases, airborne LiDAR measurements (Table S7 – S10). Given the variability of procedures
used to acquire and produce the various datasets, they were first screened according to a set of
quality criteria to select only the most reliable biomass estimates, and then pre-processed to be
harmonized with the pan-tropical biomass maps in terms of spatial resolution and variable observed.
Field and map datasets providing aboveground carbon density were converted to biomass units
using the same coefficients used for their original conversion from biomass to carbon. The sources
of the reference data are listed in table S7 and S9.
Data screening and pre-processing
Reference field data
The reference field data included ground observations in forest inventory plots, for which accurate
geolocation and biomass estimates were available. The pre-processing of the data consisted of a 2-
step screening and a harmonization procedure. A preliminary screening selected only the ground
data that estimated aboveground biomass of all living trees with diameter at breast height ≥ 5-10 cm,
and acquired on or after the year 2000. The taxonomic identities of trees strongly indicate wood
density and hence stand-level biomass (e.g., Baker et al., 2004; Mitchard et al. 2014). Plots were
therefore only selected if tree biomass was estimated using at least tree diameter and wood density
as input parameters, and plot coordinates were measured using a GPS. All datasets not conforming
to these requirements or not providing clear information on the biomass pool measured, the tree
parameters measured in the field, the allometric model applied, the year of measurement and the
plot geolocation and extent were excluded. Next, the plot data were projected to the geographic
reference system WGS-84 and harmonized with the input maps by averaging the biomass values
located within the same 1 km pixel. The field plots not fully located within one pixel were attributed
to the map cell where the majority of the plot area (i.e., the plot centroid) was located.
7
Lastly, the representativeness of the plot over the 1km pixels was considered, and the ground data
were further screened to discard plots not representative of the map cells in terms of biomass
density. More specifically, since the two input maps are not aligned and therefore their pixels do not
correspond to the same geographic area, the plot representativeness was assessed on the area of both
pixels (identified before the map resampling). The representativeness was evaluated on the basis of
the homogeneity of the tree cover and crown size within the pixel, and it was assessed using visual
interpretation of high-resolution images provided on the Google Earth platform. If the tree cover
and tree crowns were not homogeneous over at least 90% of the pixel area, the plots located within
the pixel were discarded. More details on the selection procedure are provided in the Supplementary
Information.
Reference biomass maps
The reference biomass maps consisted of high quality local or national maps published in the
scientific literature. Maps providing biomass estimates grouped in classes (e.g., Willcock et al.,
2012) were not used since the class values represent the mean biomass over large areas, usually
spanning multiple strata used in the present study (see ‘Stratification approach’). The reference
biomass maps were first pre-processed to match the input maps through re-projection, aggregation
and resampling using the same procedures described for the pre-processing of the Baccini map.
Then, only the cells with largest confidence (i.e., lowest uncertainty) were selected from the maps.
Since uncertainty maps were usually not available, and considering that the reference maps were
based on empirical models, the map cells with greatest confidence were assumed to be those in
correspondence of the training data (field plots and/or LiDAR data). When the locations of the
training data were not available, random pixels were extracted from the maps. In order to compile a
reference database that was representative of the area of interest and well balanced among the
various input datasets (as defined in ‘Consolidation of the reference dataset’), the amount of
reference data extracted from the biomass maps was proportional to their area and not greater than
the amount of samples provided by the field datasets representing a similar area. In the case where
maps with extensive training areas provided a disproportionate number of reference pixels, a further
screening selected only the areas underpinned by the largest amount of training data.
Selected reference data
8
The biomass reference dataset compiled for this study consists of 15,969 1-km reference pixels,
distributed as follows: 953 in Africa, 449 in South America, 9,167 in Central America, 400 in Asia
and 5,000 in Australia (Fig. 2, Table 1). The reference data were relatively uniformly distributed
among the strata (Table S5) but their amount varied considerably by continent. The average amount
of reference data per stratum ranged from 50 (Asia) to 1,144 (Central America) reference pixels and
their variability (computed as standard deviation relative to the mean) ranged from 25% (South
America) to 57% (Central America). The uneven distribution of reference data across the continents
is mostly caused by the availability of ground observations: in order to have a balanced reference
dataset for each stratum, the reference data extracted from biomass maps were limited to the
(smaller) amount of direct field observations. When biomass maps were the only source of data this
constrain was not occurring and larger datasets could be derived from the maps (i.e., Central
America, Australia). Ad example, all pixels (4,263) in correspondence of the training data could be
selected from the biomass map of Mexico while only 13% of the 1,167 pixels with training data
could be selected from the biomass map of Uganda to maintain comparability with smaller
reference datasets available in Africa.
The reference data were selected from 18 ground datasets providing 1,591 research field
observations and 5,036 forest inventory plots, and from 9 high resolution biomass maps calibrated
by field observations and, in four cases, airborne LiDAR data. The field plots used for the
calibration of the maps are not included in this section because they were only used to select the
reference pixels from the maps. The visual screening of the field plots removed 35% of the input
data (from 6,627 to 4,283) and their aggregation to 1 km resolution further removed 70% of the
reference units derived from field plots (from 4,283 to 1,274), while 10,741 reference pixels were
extracted from the high-resolution biomass maps. The criteria used to select the reference pixels for
each map are reported in Table S3. The consolidation procedure added 3,954 reference data to the
final reference dataset that consisted of 15,969 units (Table S2). In particular, ground observations
were mostly discarded in areas characterized by fragmented or heterogeneous vegetation cover and
high biomass spatial variability. In such contexts, reference data were often acquired from the
biomass maps. The reference dataset was compared to the input maps, revealing substantial linear
correlation (ranging from 0.60 to 0.77) between the errors of the Saatchi and Baccini maps and
RMSE values in the same order of magnitude for all continents, ranging from 96 to 124 Mg ha-1,
with the exception of Australia (45 Mg ha-1) (Table S4).
9
Figure 2: Biomass reference dataset for the tropics and spatial coverage of the two input maps
Table 1: Number of reference data (plots and 1-km pixels) selected after the screening, upscaling and
consolidating procedures, per continent. The reference data selected for each individual dataset are reported in
Table S2. The field plots underpinning the reference biomass maps are not included.
Continent Available Selected Consolidated
Plots Plots Pixels Pixels
Africa 2,281 1,976 953 953
S. America 648 474 449 449
C. America - - 5,260 9,167
Asia 3,698 1,833 353 400
Australia - - 5,000 5,000
Total TROPICS 6,627 4,283 12,015 16,969
Modelling approach
The fusion model
The integration of the two input maps was performed with a fusion model based on the concept
presented by Ge et al. (2014) and further developed for this study. The fusion model consists of bias
removal and weighted linear averaging of the input maps to produce an output with greater
accuracy than each of the input maps. The reference biomass dataset described above was used to
calibrate the model and to assess the accuracy of the input and fused maps. A specific model was
developed for each stratum (see ‘Stratification approach’).
10
Following Ge et al. (2014), the p input maps for locations sD, where D is the geographical domain
of interest common to the input maps, were combined using a weighted linear average:
(1) 1
( ) ( ) ( ( ) ( ))
p
i i iif s w s z s v s
where f is the fused map, the wi(s) are weights, zi the estimate of the i-th input map and vi(s) is the
bias estimate. The bias term was computed as the average difference between the input map and the
reference data. The weights were obtained from a statistical model that assumes the map estimates
zi to be the sum of the true biomass bi with a bias term vi and a random noise term i with zero mean
for each location sD. We further assumed that the i of the input maps are jointly normally
distributed with variance-covariance matrix C(s). Differently from Ge et al. (2014), C(s) was
estimated using a robust covariance estimator as implemented by the ‘robust’ package in R (Wang
et al., 2014), which uses the Stahel-Donoho estimator for strata with fewer than 5,000 observations
and the Fast Minimum Covariance Determinant estimator for larger strata. Under these assumptions,
the variance of the estimation error of the fused map f(s) is minimized by calculating the weights
w(s) as Searle (1971. p. 89):
(2) 1
1 1( ) ( ) ( )
1 C 1 1 CT T Tw s s s
where 1=[1, ..., 1]T is the p-dimensional unit vector and where T means transpose. Larger weights
were assigned to the map with lower error variance. The fusion model assured that the variance of
the error in the fused map was smaller than that of the input maps (Bates and Granger, 1969),
especially if the errors associated with these maps were not strongly positively correlated and their
error variances were close to the smallest error variance. The fusion model can be applied to any
number of input maps. Where there is only one input map, the model estimates and removes its bias
and the weights are set equal to 1.
The model parameters
The fusion model computed a set of bias and weight parameters for each stratum and continent on
the basis of the respective reference data, and used these for the linear weighted combination of the
input maps (Table S5). Since the stratification approach grouped together data with similar error
patterns (see ‘Stratification approach’), the biases varied considerably among the strata and could
reach values up to ±200 Mg ha-1. However, considering the area of the strata, the biases of both
11
input maps were smaller than ± 45 Mg ha-1 for at least 50% of the area of all continents and smaller
than ±100 Mg ha-1 for 81% - 98% of the area of all continents.
Stratification approach
Error modelling
Preliminary comparison of the reference data with the input maps showed that the error variances
and biases of the input maps were not spatially homogeneous but varied considerably in different
regions. Since the fusion model is based on bias removal and weighted combination of the input
maps, the more homogeneous the error characteristics in the input maps are, the better they can be
reduced by the model. For this reason, the stratification approach aimed at identifying areas with
homogeneous error structure (hereafter named ‘error strata’) in both input maps. A first
stratification was done by geographic location (namely Central America, South America, Africa,
Asia and Australia) to reflect the regional allometric relationships between biomass and tree
diameter and height (Feldpausch et al., 2011, 2012). Then, the error strata were identified for each
continent, using a two-step process. Firstly, the error maps of the Saatchi and Baccini maps were
predicted separately on the basis of their biomass estimates and land cover, tree cover and tree
height parameters by using a Random Forest model (Breiman, 2001), calibrated on the basis of the
reference dataset. Secondly, the error maps of the Saatchi and Baccini datasets were clustered using
the K-Means approach. Eight clusters (hence, eight error strata) was considered as a sensible trade-
off between homogeneity of the errors of the input maps and number of reference observations
available per stratum (Fig. S1). The performance of this approach was assessed on the basis of the
root mean square error (RMSE) of the models that predicted the errors of the input maps. Since the
ensemble modelling approach used in this study (Random Forest) includes a certain level of
randomness, the model performance was computed as an average of 100 model runs (Fig. S2, Fig.
S3). The RMSE computed on the Out-Of-Bag data (i.e., data not used for training) of the Random
Forest models for the Baccini extent ranged between 22.8 ± 0.3 Mg ha-1 for Central America to 83.7
± 2.5 Mg ha-1 in Africa, with the two models (one for each input map) achieving similar accuracies
in each continent (Fig. S2, Fig. S3). In most cases the main predictors of the errors of the input
maps were the biomass values of the maps themselves, followed by tree cover and tree height, while
land cover was always the least important predictor (Table S1). Further details on error modelling
and processing of the input data are provided in the Supplementary Information.
12
The stratification map identifies eight strata for each continent with homogeneous error patterns in
the input maps (Fig. S4). The use of a stratification based on the errors of the input maps was
compared with a stratification based on an alternative variable, such as land cover (used by Ge et al.,
2014), tree cover or tree height. Each of these variables was aggregated into eight classes to
maintain comparability with the number of clusters used in the error strata, and each stratification
map was used to develop a specific fused map. The performance of alternative stratification
approaches was assessed by validating the respective fused maps. The results (Fig. S5)
demonstrated that the stratification based on error modelling and clustering (i.e., the error strata)
produced a fused map with higher accuracy than that of the maps based on other stratification
approaches, and therefore was used in this study.
Consolidation of the reference dataset
Considering that the parameters of the fusion model (i.e. weight and bias) are sensitive to the
characteristics of the reference data, each stratum requires that calibration data are relatively well-
balanced between the various reference datasets. Specifically, if a stratum contains few calibration
data, the model becomes more sensitive to outliers, while if a reference dataset is much larger than
the others, the model is more strongly determined by the dominant dataset. For these reasons, where
the reference dataset was under-represented or un-balanced, it was consolidated by additional
reference data taken from the reference biomass maps, if available. The reference data were
considered insufficient if a stratum had less than half of the average reference data per stratum, and
were considered un-balanced if a single dataset provided more than 75% of the reference data of the
whole stratum and it was not representative of more than 75% of its area. In such cases, additional
reference data were randomly extracted from the reference biomass maps that did not provide more
than 75% of the reference data. The amount of data to be extracted from each map was computed in
a way to obtain a reference dataset with an average number of reference data per stratum and not
dominated by a single dataset. If necessary, additional training data representing areas with no
biomass (e.g., bare soil) were included, using visual analysis of Google Earth images to identify
locations without vegetation.
Post-processing
Predictions outside the coverage of the Baccini map
The Baccini map covers the tropical belt between 23.4 degree north latitude and 23.4 degree south
latitude while the Saatchi map presents a larger latitudinal coverage (Fig. 2). The fusion model was
13
firstly applied to the area common to both input maps (Baccini extent) and then extended to the area
where only the Saatchi map is available. In the latter area, the model focused only on removing the
bias of the Saatchi map using the values estimated for the Baccini extent. The model predictions for
the Saatchi extent were mosaicked to those for the Baccini extent using a smoothing function
(inverse distance weight) on an overlapping area of 1 degree within the Baccini extent between the
two maps. The resulting fused map was projected to an equal area reference system (MODIS
Sinusoidal) before computing the total biomass stocks for each continent, which were obtained by
summing the products of the biomass density of each pixel with their area. Water bodies were
masked over the whole study area using the ESA CCI Water Bodies map (ESA, 2014).
Assessing biomass in intact and non-intact forest
The biomass estimates of the fused and input maps in forest areas were further investigated
regarding their distribution in ecozones and between intact and non-intact landscapes. Forest areas
were defined as areas dominated by tree cover according to the GLC2000 map (classes 1-10 in the
global legend of GLC2000). Ecozones were defined according to the Global Ecological Zone (GEZ)
map for the year 2000 (FAO, 2000). The intact landscapes were defined according to the Intact
Forest Landscape (IFL) map for the year 2000 (Potapov et al., 2008). On the basis of these datasets
the mean forest biomass density of the fused and input maps were computed for intact and non-
intact landscapes for each continent and major ecozone. To reduce the impact of spatial
inaccuracies in the maps only ecozones with intact forest areas larger than 1,000 km2 were
considered. The mean biomass density of intact and non-intact forests per continent was computed
as the area-weighted mean of the contributing ecozones.
Validation and uncertainty
Validation was performed by randomly splitting the reference data into a calibration set (70% of the
data) and a validation set (remaining 30%). The ‘final’ fused map presented in Fig. 3 used 100% of
the reference data and for validation purposes a ‘test’ fused map was produced using only the
calibration data and its estimates, as well as those of the input maps, were compared with the
validation data. To maintain full independence, validation data were not used for any step related to
the development of the fused map, including production of the stratification map. To account for
any potential impacts of the random selection of validation data, the procedure was repeated 100
times, computing each time a new random selection of the calibration and validation datasets. This
procedure allowed computing the mean RMSE and assessing its standard deviation for each map.
14
The uncertainty of the fused map was computed with respect to model uncertainty, not including the
error sources in the input data (see ‘Discussion’). The model uncertainty consisted of the expected
variance of the error of the fused map (which is assumed bias-free) and was derived for each
stratum from C(s). The error variance was converted to an uncertainty map by reclassifying the
stratification map.
Results
Biomass map
The fusion model produced a biomass map at 1 km resolution for the tropical region, with an extent
equal to that of the Saatchi map (Fig. 3). In terms of aboveground stocks, the fused map gave
biomass estimates lower than both input maps at continental level. The total stock was 451 Pg dry
mass, 17% lower than the estimate of the Saatchi map (545 Pg) for the same extent. Considering the
same common area (Baccini extent), the fused map estimate was 360 Pg, 13% and 21% lower than
the Saatchi (413 Pg) and Baccini (457 Pg) estimates, respectively (Table S6).
Moreover, the fused map presented spatial patterns substantially different from both input maps
(Fig. 4): the biomass estimates were higher than both input maps in the dense forest areas in the
Congo basin, in West Africa, in the north-eastern part of the Amazon basin (Guyana shield) and in
South-East Asia, and lower in Central America and in most dry vegetation areas of Africa. In the
central part of the Amazon basin the fused map showed lower estimates than the Baccini map and
higher estimates than the Saatchi map, while in the southern part of the Amazon basin these
differences were inversed. Similar trends emerged when comparing the maps separately for intact
and non-intact forest ecozones (Supporting Information). In addition, the average difference
between intact and non-intact forests was larger than that derived from the input maps in Africa,
Asia and Australia, similar or slightly larger in South America, and smaller in Central America (Fig.
S7).
The fused map records the highest biomass density (> 400 Mg ha-1) in the Guyana shield, in the
Central and Western part of the Congo basin and in the intact forest areas of Borneo and Papua New
Guinea. The analysis of the distribution of forest biomass in intact and non-intact ecozones showed
that, according to the fused map, the mean biomass density was greatest in intact African (360 Mg
ha-1) and Asian (328 Mg ha-1) forests, followed by intact forests in South America (262 Mg ha-1),
15
Asia (202 Mg ha-1), Australia (162 Mg ha-1) and Central America (135 Mg ha-1) (Fig. S7). Biomass
in non-intact forests was much lower in all regions (Africa, 76 Mg ha-1; South America, 136 Mg ha-
1; Asia, 196 Mg ha-1; Australia, 90 Mg ha-1; and Central America, 46 Mg ha-1).
Figure 3: Fused map, representing the distribution of live woody aboveground biomass (AGB) for all land cover
types at 1 km resolution for the tropical region.
16
Figure 4: Difference maps obtained by subtracting the fused map from the Saatchi map (top) and the Baccini
map (bottom).
Validation
The validation exercise showed that the fused map achieved a lower RMSE (a decrease of 8 – 74%)
and bias (a decrease of 90 – 153%) than the input maps for all continents (Fig. 5). While the RMSE
of the fused map was consistently lower than that of the input maps but still substantial (87 – 98 Mg
ha-1) in the largest continents (Africa, South America and Asia), the mean error (bias) of the fused
map was almost null in most cases. Moreover, in the three main continents the bias of the input
maps tended to vary with biomass, with overestimation at low values and underestimation at high
values, while the errors of the fused map were more consistently distributed (Fig. 6). When
computing the error statistics as average of the regional validation results weighted by the
respective area coverage, the mean bias (in absolute terms) for the fused, Saatchi and Baccini maps
was 5, 21 and 28 Mg ha-1 and the mean RMSE was 89, 104 and 112 Mg ha-1, respectively (Fig. 5).
The accuracy of the input maps was computed using the validation dataset (30% of the reference
dataset) to be consistent with the accuracy of the fused map. The accuracy of the input maps was
17
also computed using all reference data and the results (Table S4) were similar to those based on the
validation dataset.
Figure 5: RMSE (left) and bias (right) of the fused and input maps (in Mg ha-1) per continent obtained using
independent reference data not used for model development. The error bars indicate one standard deviation of
the 100 simulations. Numbers reported in brackets indicate the number of reference observations used for each
continent.
Figure 6: scatterplots of the validation reference data (x-axis) and predictions (y-axis) of the input maps (left
plots) and fused map (right plots) by continent.
18
Uncertainty map
The uncertainty of the model predictions at 1 km resolution indicated that the standard deviation of
the error of the fused map for each stratum was in the range 11 - 78 Mg ha-1, with largest
uncertainties in areas with largest biomass estimates (Congo basin and Eastern Amazon basin).
When computed in relative terms (as percentage of the biomass estimate) the model uncertainties
presented opposite patterns, with uncertainties larger than the estimates (> 100%) in low biomass
areas (< 20 Mg ha-1 on average) of Africa, South America and Central America, while high biomass
forests (> 210 Mg ha-1 on average) had uncertainties lower than 25% (Fig. 7). The uncertainty
measure derived from C(s) is computed only when two or more input maps are available. Hence it
could not be calculated for Australia because the model for this continent was based on only one
input map (Saatchi map).
Figure 7: Uncertainty of the fused map, in absolute values (top) and relative to the biomass estimates (bottom),
representing one standard deviation of the error of the fused map.
19
Discussion
Biomass patterns and stocks emerging from the reference data
The biomass map produced with the fusion approach is largely driven by the reference dataset and
essentially the method is aimed at spatializing the biomass patterns indicated by the reference data
using the support of the input maps. For this reason, great care was taken in the pre-processing of
the reference data, which included a two-step quality screening based on metadata analysis and
visual interpretation, and their consolidation after stratification. As a result, the reference dataset
provides an unprecedented compilation of biomass estimates at 1 km resolution for the tropical
region, covering a wide range of vegetation types, biomass ranges and ecological regions across the
tropics. It includes the most comprehensive and accurate tropical field plot networks and high
quality maps calibrated with airborne LiDAR, which provide more accurate estimates compared to
those obtained from other sensors (Zolkos et al., 2013). The main trends present in the fused map
emerged from the combination of different and independent reference datasets and are in agreement
with the estimates derived from long-term research plot networks (Malhi et al., 2006; Phillips et al.
2009; Lewis et al. 2009; Slik et al., 2010, 2013; Lewis et al., 2013) and high-resolution maps (Asner
et al., 2012a, 2012b, 2013, 2014a). Specifically, the biomass patterns in South America represent
spatial trends described by research plot networks in the dense intact and non-intact forests in the
Amazon basin, forest inventory plots collected in the dense forests of Guyana and samples extracted
from biomass maps for Colombia and Peru representing a wide range of vegetation types, from arid
grasslands to humid forests. Similarly, biomass patterns depicted in Africa were derived from a
combination of various research plots in dense undisturbed forest (Gabon, Cameroon, Democratic
Republic of Congo, Ghana, Liberia), inventory plots in forest concessions (Democratic Republic of
Congo), biomass maps in woodland and savannah ecosystems (Uganda, Mozambique) and research
plots and maps in montane forests (Ethiopia, Madagascar). Most vegetation types in Central
America, Asia and Australia were also well-represented by the extensive forest inventory plots
(Indonesia, Vietnam and Laos) and high-resolution maps (Mexico, Panama, Australia).
In spite of the extensive coverage, the current database is far from being representative of the
biomass variability across the tropics. As a consequence, the model estimates are expected to be
less accurate in contexts not adequately represented. In the case of the fusion approach, this
corresponds to the areas where the input maps present error patterns different than those identified
in areas with reference data: in such areas the model parameters used to correct the input maps (bias
and weight) may not adequately reflect the errors of the input maps and hence cannot optimally
20
correct them. In particular, deciduous vegetation and heavily disturbed forest of Africa and South
America, and large parts of Asia were lacking quality reference data. Moreover, even though plot
data were spatially distributed over the central Amazon and the Congo basin, large extents of these
two main blocks of tropical forest have never been measured (cf. maps in Mitchard et al. 2014;
Lewis et al. 2013). Considering the evidence of significant local differences in forest structure and
biomass density within the same forest ecosystems (Kearsley et al., 2013), additional data are
needed to strengthen the confidence of the fused map as well as that of any other biomass map
covering the tropical region. Moreover, a dedicated gap analysis to assess the main regions lacking
biomass reference data and identify priority areas for new field sampling and LiDAR campaigns
would be very valuable for future improved biomass mapping.
Regarding the biomass stocks, a previous study showed that despite their often very strong local
differences the two input maps tended to provide similar estimates of total stocks at national and
biome scales and presented an overall net difference of 10% for the pan-tropics (Mitchard et al.,
2013). However, such convergence is mostly due to compensation of contrasting estimates when
averaging over large areas. The larger differences with the estimates of the present study (13% and
21%) suggest an overestimation of the total stocks by the input maps. This is in agreement with the
results of two previous studies that, on the basis of reference maps obtained by field-calibrated
airborne LiDAR data, identified an overestimation of 23% - 42% of total stocks in the Saatchi and
Baccini maps in the Colombian Amazon (Mitchard et al., 2013) and a mean overestimation of about
100 Mg ha-1 for the Baccini map in the Colombian and Peruvian Amazon (Baccini and Asner,
2013).
In general, the biomass density values of the fused map were calibrated and therefore in agreement
with the existing estimates obtained from plot networks and high resolution maps. The comparison
of mean biomass values in intact and non-intact forests stratified by ecozone provided further
information on the differences among the maps. The mean biomass values of the fused map in non-
intact forests were mostly lower than those of the input maps, suggesting that in disturbed forests
the biomass estimates derived from stand height parameters retrieved by spaceborne LiDAR (as in
the input maps) tend to be higher compared to those based on tree parameters or very high
resolution airborne LiDAR measurements (as in the fused map and reference data). This difference
occurred especially in Africa, Asia and Central America while it was less evident in South America
and Australia. By contrast, the differences among the maps for intact forests varied by continent,
with the fused map having, on average, higher mean biomass values in Africa, Asia and Australia,
21
lower values in Central America, and variable trends within South America, reflecting the different
allometric relationships used by the various datasets in different continents.
As mentioned above, a larger amount of reference data, ideally acquired based on a clear statistical
sampling design, instead of an opportunistic one, will be required to confirm such conclusions.
While dense sampling of tropical forests using field observations is often impractical, new
approaches combining sufficient ground observations of individual trees at calibration plots with
airborne LiDAR measurements for larger sampling transects would allow a major increase in the
quantity of calibration data. In combination with wall-to-wall medium resolution satellite data (e.g.,
Landsat) these may be capable of achieving high accuracy over large areas (10% - 20% uncertainty
at 1-ha scale) while being cost-effective (e.g., Asner et al. 2014b; Asner et al., 2013). In addition,
new technologies, such as terrestrial LiDAR scanning, allows for better estimates at ground level
(Calders et al., 2015), reducing considerably the uncertainties of field estimates based on
generalized allometric equations without employing destructive sampling. Nevertheless, such
techniques benefit from extensive and precise measurements of tree identity in order to determine
wood density patterns, since floristic composition influences biomass at multiple scales (e.g., the
strong pan-Amazon gradient in wood density shown by ter Steege et al., 2006), and cannot account
for variations in hollow stems and rottenness (Nogueira et al., 2006).
Additional error sources
Apart from the uncertainty of the fusion model described above (see ‘Uncertainty’), three other
sources of error were identified and assessed in the present approach: i) errors in the reference
dataset; ii) errors due to temporal mismatch between the reference data and the input maps; iii)
errors in the stratification map.
Errors in the reference dataset
The reference dataset is not error-free but it inherits the errors present in the field data and local
maps and introduces additional uncertainties during the pre-processing of the data by resampling
the maps and by upscaling the plot data to 1 km resolution. In particular, while the geolocation error
of the original datasets was considered relatively small (< 50 m) since plot coordinates were
collected using GPS measurements and the biomass maps were based on satellite data with accurate
geolocation (i.e., Landsat, ALOS, MODIS), larger errors (up to 500 m, half a pixel) could have
been introduced with the resampling of the 1 km input maps. All these error sources were
22
minimized by selecting only the datasets that fulfilled certain quality criteria and by further
screening them by visual analysis of high-resolution images available on the Google Earth platform,
discarding the data not representative of the respective map pixels. In case of reference data that
clearly did not match with the high-resolution images and/or with the input maps (e.g., reporting no
biomass in dense forest areas or high biomass on bare land), the data were considered as an error in
the reference dataset, a geolocation error in the plots or maps, or it was assumed that a land change
process occurred between the plot measurement and the image acquisition time (see next paragraph).
Errors due to temporal mismatch
The temporal difference of input and reference data introduced some uncertainty in the fusion
model. The input maps refer to the years 2000 - 2001 (Saatchi) and 2007 - 2008 (Baccini) while the
reference data mostly spanned the period 2000 – 2013. Therefore, the differences between the input
maps and the reference data may also be due to a temporal mismatch of the datasets. However,
changes due to deforestation were most likely excluded during the visual selection of the reference
data, when high-resolution images showed clear land changes (e.g., bare land or agriculture) in
areas where the input maps provided biomass estimates relative to forest areas (or vice-versa,
depending on the timing of acquisition of the datasets). However, changes due to forest regrowth
and forest degradation events that did not affect the forest canopy could not be considered with the
visual analysis and may have affected the mismatch observed between the reference data and the
input maps (< |56 - 78| Mg ha-1 for 50% of the cases of the Saatchi and Baccini maps, respectively).
The mismatch was in the range of biomass changes due to regrowth (1 - 13 Mg ha-1 year-1) (IPCC,
2003) or low-intensity degradation (14 - 100 Mg ha-1, or 3-15% of total stock) (Pearson et al., 2014;
Asner et al., 2010). On the other hand, considering the area affected by degradation (about 20% in
the humid tropics) (Asner et al., 2009), the temporal mismatch was considered responsible only for
a limited part of the large differences observed between the reference data and the input maps.
Small additional offsets may also be caused by the documented secular changes in biomass density
within intact tropical forests, which has been increasing by 0.2 – 0.5% per year (Phillips et al. 1998,
Chave et al. 2008, Phillips and Lewis 2014). It should also be noted that the reference data were
used to optimally integrate the input maps, and in the case of a temporal difference the fused map
was also ‘actualized’ to the state of the vegetation when the reference data were acquired. Therefore
the fused map cannot be attributed to a specific year and it represents the first decade of the 2000’s.
Errors in the stratification map
23
The errors in the stratification map (i.e., related to the prediction of the errors of the input maps)
were still substantial in some areas and affected the fused map in two ways. First, the reference data
that were erroneously attributed to a certain stratum introduced ‘noise’ in the estimation of the
model parameters (bias and weight), but the impact of these ‘outliers’ was largely reduced by the
use of a robust covariance estimator. Second, erroneous predictions of the strata caused the use of
incorrect model parameters in the combination of the input maps. The latter is considered to be the
main source of error of the fused map and indicates that the method can achieve improved results if
the errors of the input maps can be predicted more accurately. However, additional analysis showed
that, on average, fused maps based on alternative stratification approaches achieved lower accuracy
than the map based on an error stratification approach (Fig. S5). Therefore, this approach was
preferred over a stratification based on an individual biophysical variable (e.g., tree cover, tree
height, land cover or ecozone).
Application of the method at national scale
The fusion method presented in this study allows for the optimal integration of any number of input
maps to match the patterns indicated by the reference data. However, the accuracy of the fused map
depends on the availability of reference data representative of the error patterns of the input maps.
While the current reference database does not represent adequately all error strata for the tropical
region, and the model estimates are expected to have lower confidence in under-represented areas,
the proposed method may be applied locally and provide improved biomass estimates where
additional reference data are available. For example, the fusion method may be applied at national
level using existing forest inventory data, research plots and local maps that cover only part of the
country to calibrate global or regional maps, which provide national coverage but may not be
tailored to the country context. Such country-calibrated biomass maps may be used to support
natural resource management and national reporting under the REDD+ mechanism, especially for
countries that have limited capacities to map biomass from remote sensing data (Romijn et al. 2012).
Considering the increasing number of global or regional biomass datasets based on different data
and methodologies expected in the coming years, and that likely there will not be a single ‘best
map’ but rather the accuracy of each will vary spatially, the fusion approach may allow to optimally
combine and adjust available datasets to local biomass patterns identified by reference data.
24
Acknowledgments
This study was supported by the EU GEOCARON project. Data were also acquired by the
Sustainable Landscapes Brazil project supported by the Brazilian Agricultural Research
Corporation (EMBRAPA), the US Forest Service, and USAID, and the US Department of State. OP,
SLL and LQ acknowledge the support of the European Research Council (T-FORCES), SLL and
TS were supported by CIFOR/USAID; SLL was also supported by a Philip Leverhulme Prize.
References
Asner GP, Clark JK, Mascaro J et al. (2012) High-resolution mapping of forest carbon stocks in the
Colombian Amazon. Biogeosciences, 9, 2683–2696.
Asner GP, Clark JK, Mascaro J et al. (2012) Human and environmental controls over aboveground
carbon storage in Madagascar. Carbon Balance and Management, 7, 2.
Asner GP, Mascaro J, Anderson C et al. (2013) High-fidelity national carbon mapping for resource
management and REDD+. Carbon balance and management, 8, 7.
Asner GP, Mascaro J (2014) Mapping tropical forest carbon: Calibrating plot estimates to a simple
LiDAR metric. Remote Sensing of Environment, 140, 614–624.
Asner GP, Knapp DE, Martin RE et al. (2014) Targeted carbon conservation at national scales with
high-resolution monitoring. Proceedings of the National Academy of Sciences, 111, E5016–
E5022.
Avitabile V, Herold M, Henry M, Schmullius C (2011) Mapping biomass with remote sensing: a
comparison of methods for the case study of Uganda. Carbon Balance and Management, 6, 7.
Avitabile V, Baccini A, Friedl M a., Schmullius C (2012) Capabilities and limitations of Landsat
and land cover data for aboveground woody biomass estimation of Uganda. Remote Sensing of
Environment, 117, 366–380.
Baccini a., Goetz SJ, Walker WS et al. (2012) Estimated carbon dioxide emissions from tropical
deforestation improved by carbon-density maps. Nature Climate Change, 2, 182–185.
Baccini A, Asner GP (2013) Improving pantropical forest carbon maps with airborne LiDAR
sampling. Carbon Management, 4, 591–600.
Bates JM, Granger CWJ (1969) The Combination of Forecasts. Journal of the Operational
Research Society, 20, 451–468.
Birdsey R, Angeles-Perez G, Kurz W a et al. (2013) Approaches to monitoring changes in carbon
stocks for REDD+. Carbon Management, 4, 519–537.
25
Breiman L (2001) Random forests. Machine Learning, 45, 5−23.
Calders K, Newnham G, Burt A et al. (2015) Nondestructive estimates of above-ground biomass
using terrestrial laser scanning. Methods in Ecology and Evolution, 6, 198–208.
Cartus O, Kellndorfer J, Walker W, Franco C, Bishop J, Santos L, Michel-Fuentes JM (2014) A
National, Detailed Map of Forest Aboveground Carbon Stocks in Mexico. Remote Sensing, 6,
5559–5588.
Chave J, Olivier J, Bongers F et al. (2008) Above-ground biomass and productivity in a rain forest
of eastern South America. Journal of Tropical Ecology, 24, 355–366.
ESA, 2014. Global Water Bodies. http://www.esa-landcover-cci.org/
FAO, 2000. Global ecological zoning for the global forest resources assessment 2000. FAO FRA
Working Paper Rome, Italy; 2001
Feldpausch TR, Banin L, Phillips OL et al. (2011) Height-diameter allometry of tropical forest
trees. Biogeosciences, 8, 1081–1106.
Feldpausch TR, Lloyd J, Lewis SL et al. (2012) Tree height integrated into pantropical forest
biomass estimates. Biogeosciences, 9, 3381–3403.
Ge Y, Avitabile V, Heuvelink GBM, Wang J, Herold M (2014) Fusion of pan-tropical biomass
maps using weighted averaging and regional calibration data. International Journal of Applied
Earth Observation and Geoinformation, 31, 13–24.
Goetz S, Dubayah R (2011) Advances in remote sensing technology and implications for measuring
and monitoring forest carbon stocks and change. Carbon Management, 2, 231–244.
Grace J, Mitchard E, Gloor E (2014) Perturbations in the carbon budget of the tropics. Global
Change Biology.
Harris NL, Brown S, Hagen SC et al. (2012) Baseline Map of Carbon Emissions from Deforestation
in Tropical Regions. Science, 336, 1573–1576.
Houghton RA, House JI, Pongratz J et al. (2012) Carbon emissions from land use and land-cover
change. Biogeosciences, 9, 5125–5142.
Jiahui W, Zamar R, Marazzi A, et al. (2014). robust: Robust Library. R package version 0.4-16.
http://CRAN.R-project.org/package=robust.
Kearsley E, de Haulleville T, Hufkens K et al. (2013) Conventional tree height-diameter
relationships significantly overestimate aboveground carbon stocks in the Central Congo
Basin. Nature communications, 4, 2269.
IPCC (2003). Good practice guidance for land use, land-use change and forestry. IPCC National
Greenhouse Gas Inventories Programme, Technical Support Unit. Hayama, Japan: Institute for
Global Environmental Strategies.
26
Langner A, Achard F, Grassi G (2014) Can recent pan-tropical biomass maps be used to derive
alternative Tier 1 values for reporting REDD+ activities under UNFCCC? Environmental
Research Letters, 9, 124008.
Lewis SL, Sonké B, Sunderland T et al. (2013) Above-ground biomass and structure of 260 African
tropical forests. Philosophical transactions of the Royal Society of London. Series B,
Biological sciences, 368, 20120295.
Lewis SL, Lopez-Gonzalez G, Sonké B et al. (2009) Increasing carbon storage in intact African
tropical forests. Nature, 457, 1003–1006.
Malhi Y, Wood D, Baker TR et al. (2006) The regional variation of aboveground live biomass in
old-growth Amazonian forests. Global Change Biology, 12, 1107–1138.
Mitchard ET, Saatchi SS, Baccini A, Asner GP, Goetz SJ, Harris NL, Brown S (2013) Uncertainty
in the spatial distribution of tropical forest biomass: a comparison of pan-tropical maps.
Carbon balance and management, 8, 10.
Mitchard ET a, Feldpausch TR, Brienen RJW et al. (2014) Markedly divergent estimates of
Amazon forest carbon density from ground plots and satellites. Global Ecology and
Biogeography, 23, 935–946.
Nogueira MA, Diaz G, Andrioli W, Falconi F a., Stangarlin JR (2006) Secondary metabolites from
Diplodia maydis and Sclerotium rolfsii with antibiotic activity. Brazilian Journal of
Microbiology, 37, 14–16.
Pearson TRH, Brown S, Casarim FM (2014) Carbon emissions from tropical forest degradation
caused by logging. Environmental Research Letters, 034017, 11.
Phillips, O. L., Malhi, Y., Higuchi, N., Laurance, W., Nuñez, P., Vásquez, R., Laurance, S.,
Ferreira, L., Stern, M., Brown, S., Grace J (1998) Changes in the carbon balance of Tropical
Forests: Evidence from long-term plots. Science, 282, 439–442.
Phillips OL, Aragão LEOC, Lewis SL et al. (2009) Drought sensitivity of the Amazon Rainforest.
Science, 323, 1344–1347.
Phillips OL, Lewis SL (2014) Evaluating the tropical forest carbon sink. Global Change Biology,
20, 2039–2041.
Potapov P, Yaroshenko A, Turubanova S et al. (2008) Mapping the world’s intact forest landscapes
by remote sensing. Ecology and Society, 13.
Romijn E, Herold M, Kooistra L, Murdiyarso D, Verchot L (2012) Assessing capacities of non-
Annex I countries for national forest monitoring in the context of REDD+. Environmental
Science and Policy, 19-20, 33–48.
Saatchi SS, Harris NL, Brown S et al. (2011) Benchmark map of forest carbon stocks in tropical
regions across three continents. Proceedings of the National Academy of Sciences, 108, 9899–
9904.
27
Saatchi SS, Mascaro J, Xu L et al. (2014) Seeing the forest beyond the trees. Global Ecology &
Biogeography, 23, 935 – 946.
Searle SR (1971) Linear Models, Vol. XXI. WILEY-VCH Verlag, York-London-Sydney-Toronto,
532 pp.
Slik JWF, Paoli G, Mcguire K et al. (2013) Large trees drive forest aboveground biomass variation
in moist lowland forests across the tropics. Global Ecology and Biogeography, 22, 1261–1271.
Ter Steege H, Pitman NC a, Phillips OL et al. (2006) Continental-scale patterns of canopy tree
composition and function across Amazonia. Nature, 443, 444–447.
Willcock S, Phillips OL, Platts PJ et al. (2012) Towards Regional, Error-Bounded Landscape
Carbon Storage Estimates for Data-Deficient Areas of the World. PLoS ONE, 7, 1–10.
Wright JS (2013) The carbon sink in intact tropical forests. Global Change Biology, 19, 337–339.
Ziegler AD, Phelps J, Yuen JQ et al. (2012) Carbon outcomes of major land-cover transitions in SE
Asia: Great uncertainties and REDD+ policy implications. Global Change Biology, 18, 3087–
3099.
Zolkos SG, Goetz SJ, Dubayah R (2013) A meta-analysis of terrestrial aboveground biomass
estimation using lidar remote sensing. Remote Sensing of Environment, 128, 289–298.
Supporting Information
Appendix S1. Supplementary methods and results
28
Appendix S1
Supplementary methods and results
Stratification approach
Error modelling
The stratification approach implemented in this study aimed to identify areas with homogeneous
error structure in both input maps. The error maps of the Saatchi and Baccini maps were first
predicted separately and then combined in eight error strata per continent using a clustering
approach. Since the biomass estimates of the input maps were mostly based on optical and LiDAR
data that are sensitive to tree cover and tree height, it was assumed that their uncertainties were
related to the spatial variability of these parameters. In addition, the errors of the input maps
resulted to be linearly correlated with the respective biomass estimates. For these reasons, the
biomass maps themselves as well as global datasets of land cover, tree cover and tree height were
used to predict the map errors using a Random Forest model (Breiman, 2001), calibrated on the
basis of the reference dataset. Then, the error maps of the Saatchi and Baccini datasets were
clustered using the K-Means approach. The number of clusters was determined as a trade-off
between homogeneity of the errors of the input maps (quantified as variance (sum of squares)
within groups of the error maps) and number of reference observations available per stratum. Eight
clusters was considered as a sensible compromise between these two parameters, with a larger
number of clusters providing only a marginal increase in homogeneity but leading to a small
number of reference data in some strata (Fig. S1). The resulting stratification map presented missing
values where the predictors presented no data, i.e. areas without coverage of the Baccini map, or for
classes of the categorical predictor (i.e., land cover) without reference data. The missing values
were estimated using an additional Random Forest model that predicted the strata (instead of the
errors of the input maps) based on 10,000 randomly selected training data. These ‘secondary’
training data were extracted from the stratification map and included only the predictors without
missing values (i.e., Saatchi map, tree cover and tree height). The performance of this approach was
assessed on the basis of the RMSE of the models that predicted the errors of the input maps, and on
the error rate of the models that predicted the strata for the areas with missing values (Fig. S2, Fig.
S3). The performance statistics were computed as an average of 100 model repetitions to account
for a certain level of randomness inherent in the ensemble modelling approach used in this study
(Random Forest). The importance of the predictor variables was assessed on the basis of the total
increase in node purities (measured by residual sum of squares) from splitting on the variable,
29
averaged over all trees (Liaw and Wiener, 2002) (Table S1). According to this variable, the main
predictors of the errors of the input maps were in most cases the biomass values of the maps
themselves, followed by tree cover and tree height, while land cover was always the least important
predictor.
Figure S1: Relation between the number of clusters (x axis), the average number of reference data per cluster
(red line) and their homogeneity (within groups sum of square) (black line) when the error maps of the Saatchi
and Baccini datasets are clustered using the K-Means approach.
30
Figure S2: Performance of the models that predicted the error of the input maps (RMSE) and of the models that
predicted the strata for the areas not covered by the Baccini map (Error rate). Error statistics are computed as
mean of 100 random model repetitions, with the error bars indicating 1 standard deviation.
Figure S3: Comparison of predicted errors on the Out-Of-Bag data (i.e., data not used for training the model)
with observed errors of the Random Forest models that estimated the errors of the input maps.
Table S1: Importance of the variables used to predict the errors of the input maps by the Random Forest models.
The importance is represented by the increase in node purity (x 1000). The variables are the biomass of Saatchi
31
map (“AGB Saatchi”), biomass of the Baccini map (“AGB Baccini”), forest height (“Height”), tree cover (“VCF”)
and land cover according to the ESA CCI map (“CCI”).
Africa S. America C. America Asia Australia
Saatchi Baccini Saatchi Baccini Saatchi Baccini Saatchi Baccini Saatchi
AGB Saatchi 2,607 2,963 1,321 839 14,684 963 1,293 618 1,536
AGB Baccini 2,365 2,737 839 923 1,832 6,219 890 766 -
VCF 2,346 3,223 1,170 1,237 2,469 3,071 614 534 4,971
Height 1,629 1,710 784 748 3,320 1,990 586 537 949
CCI 471 937 22 12 433 239 164 104 1,702
Pre-processing of covariate maps
The maps used to predict the error strata were obtained as follows. The land cover information was
derived from the ESA CCI 2005 Land Cover map (ESA, 2014) by first reclassifying it into eight
classes representing the main vegetation types (Evergreen forest, Deciduous forest, Woodland,
Mosaic Vegetation, Shrubland, Grassland, Cropland, Other land) and then resampling it from its
original resolution (330 m) to the resolution of the Saatchi map (1 km) on the basis of the majority
criterion. The tree cover value was obtained from the annual MODIS VCF tree cover composites at
250m for the years 2000 – 2010 (DiMiceli et al., 2011). The eleven annual datasets were averaged
to a mean decadal composite, spatially aggregated to 1 km resolution and warped to the geographic
projection and grid of the Saatchi map (WGS-84). Tree height was directly derived from the Global
3D Vegetation Height map at 1km resolution (Simard et al., 2011). Additional potential land cover
and ecosystem predictors, namely the GLC2000 map (Mayaux et al., 2004), the Synmap map (Jung
et al., 2006) and the Global Ecological Zone (GEZ) map (FAO, 2000), were tested but discarded
after preliminary analysis because they did not provide additional explanatory power to the error
models. All maps were resampled to match the grid of the Saatchi map using the nearest neighbor
method.
Stratification map and alternative approaches
The stratification map obtained with the approach described above identified eight strata depicting
homogeneous error patterns of the input maps for each continent (Central America, South America,
Africa, Asia and Australia). The map, reported in Figure S4, was used to identify the parameters
(bias and weights) of the fusion model.
32
Alternative stratification approaches using individual datasets to predict the errors of the input maps,
namely land cover, tree cover or tree height, were tested and their performance assessed by
validating the respective fused maps. The fused maps based on each stratification map were trained
using a calibration dataset (random selection of 70% of the reference data) and compared with the
validation dataset (remaining 30% of the reference data) to compute the respective RMSE. To
account for any potential impacts of the random selection of validation data, the procedure was
repeated 100 times, computing each time a new random selection of the calibration and validation
datasets. This procedure allowed to compute the mean RMSE and assess its standard deviation for
each map.
Figure S4: Stratification map, depicting eight strata per continent with homogeneous error patterns of the input
maps.
33
Figure S5: Performance of all stratification approaches, computed as RMSE of the respective fused maps
assessed using independent reference data. The “Model” stratification refers to the approach used in this study
(“Error strata”). The alternative stratification approaches consist of individual datasets used to predict the
errors of the input maps, i.e., forest height (“HEI”), tree cover (“VCF”), land cover according to the GLC2000
map (“GLC”) and land cover according to the ESA CCI map (“CCI”).
Reference dataset
Visual selection of the reference field data
After the preliminary selection of field plots based on metadata information (tree parameters
measured in the field, allometric model applied, year of measurement, plot geolocation and extent),
the ground data were further screened to discard the plots not representative of the biomass density
within the 1 km cells of the input maps. Considering that the two input maps are not aligned and
therefore their pixels do not correspond to the same geographic area, the plots needed to be
representative of the area of both pixels identified before their resampling. For this reason, the plot
representativeness was assessed on the area obtained by merging the pixel extent of the Saatchi map
with the corresponding pixel extent of the Baccini map after its aggregation to 1 km but before its
resampling. The representativeness was assessed by means of visual interpretation of high
resolution images provided on the Google Earth platform, discarding the plots located in pixels with
34
heterogeneous tree cover and tree crowns. Pixels were considered heterogeneous if more than 10%
of the area presented forest structure different than that of the majority of the area (i.e., pixels
needed to be homogeneous for at least 90% of their area). Moreover, if the sum of the area of all
plots located within a pixel was smaller than 0.1 ha, these plots were removed because the area
measured on the ground was considered insufficient to represent 1km pixels even in homogeneous
contexts. Even though biomass may still vary substantially within an area having similar forest
structure, using the texture and context information assessed through visual interpretation allowed
for higher confidence than assessing pixel homogeneity using fixed thresholds of tree cover and
texture variability automatically derived from coarser resolution images (i.e., MODIS or Landsat).
Examples of plots selected or discarded through the visual analysis of Google Earth images are
provided in Fig. S6.
Figure S6: visual selection of the field plots. Plots are discarded if not representative of the biomass density in the
1 km2 area of the input maps. The yellow and red polygons represent the area of the corresponding pixels of the
Saatchi and Baccini maps and the blue circles represent the extent of the field plots, superimposed on the Google
Earth images. The right plots were selected (tree cover and texture are homogeneous) and the left plot was
discarded (heterogeneous pixels). FIGURE TO BE ENHANCED: plot area is not visible
The pan-tropical reference dataset
The reference data selected after the screening, upscaling and consolidating procedure, for each
individual dataset are reported in Table S2. The field plots underpinning the reference biomass
maps are not included among the “Available plots” to avoid double-counting, since the respective
reference data are already considered as “Selected pixels”. Table S3 indicates the main type of
training data used by the high resolution biomass maps and the criteria used to select the 1-km
35
reference pixels for this study. A complete description of the metadata and the literature reference
of the ground reference data and reference biomass maps are provided in Table S7 – S10.
Table S2: Number of reference data (plots and 1-km pixels) selected after the screening, upscaling and
consolidating procedure, for each individual dataset.
ID Continent Type Country/Region Available Selected Consolidated
Plots Plots Pixels Pixels
AFR1 Africa Plots DRC 1,157 1,067 227 227
AFR2 Africa Plots Sierra Leone 609 576 162 162
AFR3 Africa Plots Tropical Africa 260 175(a) 161 161
AFR4 Africa Plots Ethiopia 119 60 42 42
AFR5 Africa Plots Ghana 74 65 12 12
AFR6 Africa Plots Tanzania 42 23 9 9
AFR7 Africa Plots DRC 20 10 9 9
AFR8 Africa Map Uganda 223 223
AFR9 Africa Map Madagascar 60 60
AFR10 Africa Map Mozambique 38 38
AFR11 Africa Map Cameroon 10 10
TOTAL AFRICA 2,281 1,976 953 953
SAM1 S. America Plots Amazon basin 413 287(b) 221 221
SAM2 S. America Plots Brazil 124 101 48 48
SAM3 S. America Plots Guyana 111 86 30 30
SAM4 S. America Map Peru 100 100
SAM5 S. America Map Colombia 50 50
TOTAL S. AMERICA 648 474 449 449
CAM1 C. America Map Mexico 4,263 6,072
CAM2 C. America Map Panama 997 3,095
TOTAL C. AMERICA 0 0 5,260 9,167
ASI1 Asia Plots Vietnam 3,197 1,547 101 101
ASI2 Asia Plots Laos 122 68 65 65
ASI3 Asia Plots Sabah 104 74 53 53
ASI4 Asia Plots Indonesia 82 39 36 36
ASI5 Asia Plots SE Asia 132 77 77 77
ASI6 Asia Plots Indonesia 25 17 11 11
36
ASI7 Asia Plots SE Asia 25 4 4 4
ASI8 Asia Plots Indonesia 11 7 6 6
ASI9 Asia Map Asia 0 47(c)
TOTAL ASIA 3,698 1,833 353 400
AUS1 Australia Map Australia 5,000 5,000
TOTAL AUSTRALIA 0 0 5,000 5,000
TOTAL TROPICS 6,627 4,283 12,015 15,969
(a) 22 plots (DOU, OVG, EKO, LOP-01, CVL, GBO) were removed because they were used by Saatchi et al. (2011) to
calibrate the LiDAR/biomass relationships, and therefore highly correlated with the map values.
(b) 264 plots measured for the last time after the year 2000 were selected. In addition, 23 plots measured from 1990 in
undisturbed forests were included because the visual analysis of high-resolution images did not indicate any sign of
disturbance.
(c) Additional training data representing areas with no biomass (e.g., bare soil) were included only for Asia, where 47
pixels were selected using visual analysis of Google Earth images.
Table S3: Type of calibration data used by the high resolution biomass maps (identified by the “ID”) and criteria
used to select the 1-km reference pixels for this study (“Selected pixels”).
ID Calibration data Selected pixels
AFR8 Field plots Pixels with plots representative of >40% of pixel area
AFR9 Airborne LiDAR Random pixels in an amount proportional to other datasets
AFR10 Field plots All pixels with plots
AFR11 Field plots All pixels with plots
SAM4 Airborne LiDAR Random pixels in an amount proportional to other datasets
SAM5 Airborne LiDAR Random pixels in an amount proportional to other datasets
CAM2 Field plots Pixels with available plots
CAM3 Airborne LiDAR Random pixels in an amount proportional to the map area
AUS1 Field plots Random pixels in an amount proportional to other continents
The reference dataset described above was used to assess the errors of the input maps (RMSE) and
the linear correlation (r) of the errors for each continent (Table S4).
Table S4: Linear correlation of the errors (r) and RMSE of the input maps computed using the complete
reference dataset, per continent.
Africa S. America C. America Asia Australia
37
Saatchi Baccini Saatchi Baccini Saatchi Baccini Saatchi Baccini Saatchi
r 0.77 0.72 0.61 0.65 0.60 0.75 0.64 0.72 0.69
RMSE 105 116 104 96 97 101 124 103 45
The fusion model
The biases and weights computed by the fusion model per stratum and continent are reported in
Table S5, along with the number of reference data and relative area of each stratum (in percentage).
Table S5: Bias values and weights applied to the input maps, number of reference data and percentage relative
area per stratum and continent
Bias Weight Ref. data (N) Area
Strata Saatchi Baccini Saatchi Baccini
AFRICA
1 -11 -55 0.29 0.71 111 13%
2 -26 -17 0.57 0.43 123 21%
3 113 203 0.54 0.46 163 1%
4 200 146 0.00 1.00 77 2%
5 77 -51 0.93 0.07 66 3%
6 -91 -49 0.07 0.93 146 7%
7 -56 -87 0.45 0.55 89 8%
8 -7 16 0.60 0.40 178 45%
ASIA
1 -42 -22 0.75 0.25 83 69%
2 82 52 0.00 1.00 19 2%
3 43 116 0.45 0.55 26 2%
4 174 190 0.00 1.00 38 2%
5 -186 -74 0.84 0.16 74 2%
6 -125 -117 0.43 0.57 61 2%
7 11 -31 1.00 0.00 55 14%
8 -77 20 1.00 0.00 44 6%
CENTRAL AMERICA
1 -16 -67 0.43 0.57 1574 20%
2 -12 -112 0.34 0.66 1155 7%
3 -119 -80 0.62 0.38 1511 6%
4 -76 -163 0.53 0.47 748 2%
5 -204 -101 0.45 0.55 617 3%
6 -63 -74 0.65 0.35 2402 9%
7 -14 -27 0.38 0.62 637 51%
8 -178 -178 0.33 0.67 506 1%
38
SOUTH AMERICA
1 -37 -112 0.38 0.62 72 6%
2 27 5 0.84 0.16 43 10%
3 -18 -44 0.96 0.04 37 56%
4 38 -64 0.39 0.61 51 7%
5 161 139 0.17 0.83 79 3%
6 -102 -79 0.75 0.25 59 4%
7 -23 3 0.33 0.67 48 10%
8 124 40 0.97 0.03 60 5%
AUSTRALIA
1 -21 NA 1.00 0.00 847 16%
2 -37 NA 1.00 0.00 449 6%
3 79 NA 1.00 0.00 411 2%
4 2 NA 1.00 0.00 859 29%
5 141 NA 1.00 0.00 220 2%
6 41 NA 1.00 0.00 564 4%
7 -11 NA 1.00 0.00 1145 35%
8 16 NA 1.00 0.00 505 6%
Biomass map
Biomass stocks of the fused and input maps
The total biomass stocks (Pg) of all land cover types per continent and biomass map are reported in
Table S6. For comparability among the three maps (Saatchi, Baccini and fused map), the stocks are
computed for the extent of the Baccini map while the values in brackets represent the stocks relative
to the larger extent of the Saatchi map. The values reported for the Saatchi map are higher than
those published by Saatchi et al. (2011) because the latter refer only to areas with tree canopy
cover >10%.
Table S6: Total biomass stocks (Pg) of all land cover types per continent and biomass map. The values refer to
the extent of the Baccini map while the values in brackets represent the stocks relative to the larger extent of the
Saatchi map.
Africa S. America C. America Asia Australia Total
Baccini 129 216 18 93 - 457
Saatchi 113 (117) 178 (196) 15 (20) 107 (192) (21) 413 (545)
Fused 96 (97) 179 (191) 7 (9) 78 (134) (19) 360 (451)
39
Comparison of forest biomass in intact and non-intact landscapes
The biomass values per continent for intact and non-intact forests were computed as the area-
weighted mean of the contributing ecozones (see related section in “Data and Methods”). In Africa
the fused map presented higher mean biomass values than the Saatchi and Baccini maps in intact
forests (+86 and +65 Mg ha-1, respectively) and lower values in non-intact forests (-15 and -36 Mg
ha-1, respectively), in Asia and in Central America the fused map was consistently lower than the
input maps in both intact (between -74 and -125 Mg ha-1) and non-intact forests (between -51 and -
65 Mg ha-1), in South America the fused map was similar to the Saatchi map in both intact and non-
intact forests (+20 and +3 Mg ha-1, respectively) and lower than the Baccini map (-18 and -28 Mg
ha-1, respectively), and in Australia the fused map was higher than the Saatchi map in both intact
and non-intact forests (+72 and +30 Mg ha-1, respectively). As expected, intact forests had
consistently higher mean biomass than non-intact forest in all continents and major ecozones. When
considering the average difference between intact and non-intact forests, this was larger in the fused
map than in the Saatchi and Baccini maps in Africa (236, 143 and 154 Mg ha-1 respectively),
similar or slightly larger in South America (78, 63 and 79 Mg ha-1 respectively), and smaller in
Central America (58, 74 and 86 Mg ha-1 respectively) and Asia (39, 59 and 55 Mg ha-1 respectively).
40
Figure S7: Mean biomass density of the fused and input maps in intact (I) and non-intact (NI) forests stratified
by continent and ecozone. Only ecozones with intact forest larger than 1,000 km2 are reported
41
Table S7: Metadata of the ground reference datasets (part 1)
ID Continent Country / Region Location Extent Vegetation type(s) Year(s) N.
plots
Area -
Range (ha)
Area - Mean
(ha)
Min. DBH
(cm)
Reference
AFR1 Africa DRC Lukenie Local Forest (concession) NA 1157 0.5 0.5 10 Hirsch et al., 2013
AFR2 Africa Sierra Leone Gola Forest Local Forest 2005-2007 609 0.125 0.125 10 Lindsell and Klop, 2013
AFR3 Africa Tropical Africa Tropical Africa Regional Forest (Intact) 1970s - 2013 260 0.2 - 10 1.2 10 Lewis et al., 2013
AFR4 Africa Ethiopia Kafa Local Forest - Woodland 2011-2013 119 0.126 0.126 5 De Vries et al., 2012
AFR5 Africa Ghana Ankasa Local Forest 2012 34 0.05 0.05 10 Laurin et al., 2013
AFR5 Africa Ghana Bia Boin, Dadieso Local Forest 2012-2013 40 0.16 0.16 5 Pirotti et al., 2014
AFR6 Africa Tanzania Eastern Arc Mountain Local Forest 2007-10 42 0.08 - 1 0.66 10 Lopez-Gonzalez et al., 2009, 2011
AFR7 Africa DRC Yangambi Local Forest (Intact) NA 20 1 1 10 Kearsley et al., 2013
SAM1 S. America Amazon Amazon Regional Forest 1956 - 2013 413 0.25 - 9 1 10 Mitchard et al., 2013; Lopez-Gonzalez et al., 2014
SAM2 S. America Brazil Brazil National Forest 2009 - 2013 124 0.16 - 1 0.42 5 - 10 Embrapa, 2014
SAM3 S. America Guyana Guyana Local Forest 2010 - 2011 111 5 NA
CAM1 C. America Mexico Mexico National Forest 2004-2008 4136 7.5 de Jong, 2013
ASI1a Asia Vietnam Quang Nam Province Forest 2007-2009 3035 0.05 0.05 6 Avitabile et al., 2014
ASI1b Asia Vietnam Quang Nam Province Forest 2011-2012 162 0.01 - 0.126 0.08 5 Avitabile et al., 2014
ASI2 Asia Laos Xe Pian Local Forest 2011-2012 122 0.1 - 0.126 0.11 5 WWF and OBf, 2013
ASI3 Asia Indonesia Sabah Local Forest (concession) 2005 - 2008 104 0.5 - 1.5 1 10 Morel et al., 2011
ASI4 Asia Indonesia Riau province Local Forest 2009-2010 82 0.015 0.015 5 Wijaya et al., 2015
ASI5 Asia Asia India, China, Indonesia Local Forest (Intact) circa-2010 132 0.25 - 20 1.5 10 Slik et al., 2013, 2014
ASI6 Asia Malaysia, Indonesia Sarawak, C. Kalimantan Regional Forest (Intact) 2013 - 2014 25 0.25 - 1 0.57 10 Phillips et al., in prep.
ASI7 Asia Indo-Pacific Indo-Pacific Regional Mangrove 2008 - 2009 25 0.015 0.015 5 Donato et al., 2011
ASI8 Asia Indonesia Kalimantan Local Mangrove 2008-2009 11 0.015 0.015 5 Murdiyarso et al., 2010 (a)
(a) The metadata for this dataset are provided in Amira (2008) and Kauffman and Donato (2012)
42
Table S8: Metadata of the ground reference datasets (part 2)
ID Parameter
measured
Tree Height Allometric equation Parameters of
allom. eq.
Plot type Permanent plot
AFR1 Dbh, Sp, Hei not used Chave (2005) Moist Dbh, wd, hei For. Inv. No
AFR2 Dbh, Sp, Hei local eq. Chave (2005) Moist Dbh,wd, hei For. Inv. No
AFR3 Dbh, Sp Feldpausch (2012) Chave (2005) Moist Dbh,wd, hei Res. plots Yes
AFR4 Dbh, Sp not used Chave (2005) Wet dbh, wd Res. plots No
AFR5 Dbh, Sp, Hei measured for all trees Chave (2005) Moist Dbh,wd, hei Res. plots No
AFR5 Dbh, Sp, Hei measured for all trees Chave (2005) Moist Dbh,wd, hei Res. plots No
AFR6 Dbh, Sp Feldpausch (2012) Chave (2005) Moist Dbh,wd, hei Res. plots Yes
AFR7 Dbh, Sp, Hei stand-specific eq. Chave (2005) Moist Dbh,wd, hei Res. plots Yes
SAM1 Dbh, Sp Feldpausch (2012) Chave (2005) Moist Dbh,wd, hei Res. plots Yes
SAM2 Dbh, Sp, Hei measured for all trees Chave (2005) Moist Dbh,wd, hei For. Inv. No
SAM3 Dbh, Sp not used Chave (2005) Moist dbh, wd For. Inv. No
CAM1 Dbh, Sp, Hei measured for all trees Urquiza-Haas et al. (2007) Dbh,wd, hei For. Inv. Yes
ASI1a Dbh, Sp, Hei local eq. Chave (2005) Moist Dbh,wd, hei For. Inv. No
ASI1b Dbh, Sp not used Chave (2005) Moist dbh, wd Res. plots No
ASI2 Dbh, Sp not used Chave (2005) Dry/Moist dbh, wd Res. plots No
ASI3 Dbh, Sp, Hei stand-specific eq. Chave (2005) Moist Dbh,wd, hei Res. plots No
ASI4 Dbh, Sp not used Komiyama et al. (2008), Chave (2005) Moist dbh, wd Res. plots No
ASI5 Dbh, sp Feldpausch (2012) Chave (2005) Dry/Moist/Wet Dbh,wd, hei Res. plots No
ASI6 Dbh, sp Feldpausch (2012) Chave (2005) Moist Dbh,wd, hei Res. plots Yes
ASI7 Dbh, sp not used Komiyama et al. (2008) dbh, wd Res. plots No
ASI8 Dbh, Sp not used Komiyama et al. (2008) dbh, wd Res. plots No
43
Table S9: Metadata of the reference biomass maps (part 1)
ID Continent Country/Region Location Extent Vegetation types (a)
Year (map) Resolution (m)
Accuracy dataset
RMSE (Mg/ha)
R2 RS data Reference
AFR8 Africa Uganda Uganda National For – Wood – Sav.
1999-2003 30 Validation 13 0.81 Landsat, LC
Avitabile et al., 2012
AFR9 Africa Madagascar North Madagascar
Local Forest 2010 100 Calibration 42 0.88 Landsat, LiDAR
Asner et al., 2012
AFR10 Africa Mozambique Gorongosa Local Wood – Sav 2007 50 Validation 20 0.49 ALOS Ryan et al., 2012
AFR11 Africa Cameroon Mbam Djerem Local For - Sav 2007 100 Calibration 29 NA ALOS Mitchard et al., 2011
SAM4 S. America Peru Peru National For – Wood – Grass
NA 100 Calibration 55 0.82 Landsat, LiDAR
Asner et al., 2014
SAM5 S. America Colombia Colombian amazon
Regional Forest 2010 100 Validation 58 NA Landsat, LiDAR
Asner et al., 2012
CAM2 C. America Mexico Mexico National Forest 2007 30 Validation 28 0.52 Landsat, ALOS
Cartus et al., 2014
CAM3 C. America Panama Panama National For – Wood – Grass
2008 - 2012
100 45 0.62 Landsat, LiDAR
Asner et al., 2013
AUS1 Australia Australia West Australia Local Wood – Sav 50 NA NA NA ALOS Lucas et al., 2010
(a) Forest (For), woodland (Wood), Savannah (Sav), Grassland (Grass)
44
Table S10: Metadata of the reference biomass maps (part 2)
ID N. plots Years (Plots) Plot size -
Range (ha)
Plot size -
Mean (ha)
Min. DBH
(cm)
Parameter
measured
Tree Height Allometric equation Parameters of
allometric eq.
Plot type
AFR8 2527 1995-2005 0.25 0.25 3 Dbh, Sp, Crown measured Drichi (2003) Dbh, wd,
crown
For. Inv.
AFR9 19 NA 0.28 0.28 0 Dbh, Sp, Hei local eq. Chave (2005) Wet Dbh, wd, hei Res. plots
AFR10 96 2006-2009 0.1 - 2.2 0.63 5 Dbh not used Ryan et al. (2011) dbh Res. plots
AFR11 25 2007 0.2 - 1 0.6 10 Dbh, Sp, Hei local eq. Chave (2005) Dry/Moist/Wet Dbh, wd, hei Res. plots
SAM4 272 NA 0.3 - 1 0.33 NA Dbh, Sp, Hei local eq. Chave et al., 2014 Dbh, wd, hei For. Inv.
SAM5 11 NA 0.28 0.28 10 Dbh, Sp local eq. Chave (2005) Moist Dbh, wd, hei Res. plots
CAM2 16906 2004 - 2007 1 1 7.5 Dbh, Sp, Hei measured National species-specific eq. Dbh, wd, hei For. Inv.
CAM3 228 NA 0.1 - 0.36 NA 10 Dbh, Sp, Hei local eq. Chave (2005) Dry/Moist/Wet Dbh, wd, hei Res. plots
AUS1 2781
45
Additional References
Amira S (2008). An estimation of Rhizophora apiculata Bl. biomass in mangrove forest in Batu
Ampar Kubu Raya Regency, West Kalimantan. Undergraduate Thesis, Bogor Agricultural
University, Indonesia
Avitabile V, 2014. Carbon stocks of vegetation in the Vu Gia Thu Bon river basin, central Vietnam.
Technical Report. Land Use and Climate Change Interactions in Central Vietnam (LUCCi)
project. http://leutra.geogr.uni-jena.de/vgtbRBIS/metadata/start.php
DeVries B, Avitabile V, Kooistra L, Herold M, 2012. Monitoring the impact of REDD+
implementation in the Unesco Kafa biosphere reserve, Ethiopia. Proceedings of the “Sensing a
Changing World” Workshop (2012).http://www.wageningenur.nl/upload_mm/9/d/c/f80b6db7-
9c3c-4957-8717-6fdc6e46e60f_deVries.pdf
Donato DC, Kauffman JB, Murdiyarso D, Kurnianto S, Stidham M, Kanninen M (2011) Mangroves
among the most carbon-rich forests in the tropics. Nature Geoscience, 4, 293–297.
EMBRAPA, 2014. Sustainable Landscape Brazil. http://geoinfo.cnpm.embrapa.br/geonetwork/srv/
eng/main.home
ESA, 2014. Global land cover map for the epoch 2005. http://www.esa-landcover-cci.org/
DiMiceli, C.M., Carroll, M.L., Sohlberg, R.A., Huang, C., Hansen, M.C. and Townshend J.R.G.
(2011). Annual Global Automated MODIS Vegetation Continuous Fields (MOD44B) at 250 m
Spatial Resolution for Data Years Beginning Day 65, 2000 - 2010, Collection 5 Percent Tree
Cover, University of Maryland, College Park, MD, USA
Hirsh, F.; Jourget, J. G.; Feintrenie, L.; Bayol, N.; Atyi, R. E., 2013. REDD+ pilot project in
Lukenie. Center for International Forestry Research (CIFOR), Jakarta, Indonesia, CIFOR
Working Paper, 2013, 111, pp vi + 48 .
Jung M, Henkel K, Herold M, Churkina G (2006) Exploiting synergies of global land cover
products for carbon cycle modeling. Remote Sensing of Environment, 101, 534–553.
Kauffman JB and Donato DC (2012). Protocols for the measurement, monitoring and reporting of
structure, biomass and carbon stocks in mangrove forests. Working Paper 86. CIFOR, Bogor,
Indonesia.
Liaw, A., Wiener, M. (2002). Classification and regression by random forest. R News, 2
(3), 18–22.
Lindsell J a., Klop E (2013) Spatial and temporal variation of carbon stocks in a lowland tropical
forest in West Africa. Forest Ecology and Management, 289, 10–17.
Lopez-Gonzalez, G., Lewis, S.L., Burkitt, M., Baker T.R. and Phillips, O.L. 2009. ForestPlots.net
Database. www.forestplots.net. Date of extraction [09,09,13]
46
Lopez-Gonzalez G, Lewis SL, Burkitt M and Phillips OL (2011). ForestPlots.net: a web application
and research tool to manage and analyse tropical forest plot data. Journal of Vegetation
Science 22: 610–613. doi: 10.1111/j.1654-1103.2011.01312.x
Lopez-Gonzalez G, Mitchard ETA, Feldpausch TR et al. (2014) Amazon forest biomass measured
in inventory plots. Plot Data from "Markedly divergent estimates of Amazon forest carbon
density from ground plots and satellites." doi: 10.5521/FORESTPLOTS.NET/2014_1
Lucas R, Armston J, Fairfax R et al. (2010). An evaluation of the ALOS PALSAR L-band
backscatter – above ground biomass relationship over Queensland, Australia. IEEE Journal of
Selected Topics in Earth Observations and Remote Sensing, 3(4): 576-593.
Mayaux P, Bartholome E, Fritz S, Belward A (2004) A New Land Cover Map of Africa for the
Year 2000. Journal of Biogeography, 31, 861–877.
Mitchard ET, Saatchi SS, Lewis SL et al. (2011) Measuring biomass changes due to woody
encroachment and deforestation/degradation in a forest-savanna boundary region of central
Africa using multi-temporal L-band radar backscatter. Remote Sensing of Environment, 115,
2861–2873.
Morel AC, Saatchi SS, Malhi Y et al. (2011) Estimating aboveground biomass in forest and oil
palm plantation in Sabah, Malaysian Borneo using ALOS PALSAR data. Forest Ecology and
Management, 262, 1786–1798.
Murdiyarso D, Donato D, Kauffman JB, Kurnianto S, Stidham M, Kanninen M (2010). Carbon
storage in mangrove and peatland ecosystems: a preliminary account from plots in Indonesia.
CIFOR Working Paper no. 48. 35pp. Bogor – Indonesia.
Phillips O.L., Lewis S.L., Qie L., Banin L., et al. (in prep.) Tropical Forests in the Changing Earth
System Borneo forest dataset'.
Pirotti F, Laurin G, Vettore A, Masiero A, Valentini R (2014) Small Footprint Full-Waveform
Metrics Contribution to the Prediction of Biomass in Tropical Forests. Remote Sensing, 9576–
9599.
Ryan CM, Hill T, Woollen E et al. (2012) Quantifying small-scale deforestation and forest
degradation in African woodlands using radar imagery. Global Change Biology, 18, 243–257.
Simard M, Pinto N, Fisher JB, Baccini A (2011) Mapping forest canopy height globally with
spaceborne lidar. Journal of Geophysical Research: Biogeosciences, 116, 1–12.
Vaglio Laurin G, Chen Q, Lindsell J, Coomes D, Cazzolla-Gatti R, Grieco E, Valentini R (2013)
Above ground biomass estimation from lidar and hyperspectral airbone data in West African
moist forests. EGU General Assembly Conference Abstracts, 15, 6227.
Wijaya A, Liesenberg V, Susanti A, Karyanto O, Verchot LV (2015). Estimation of Biomass
Carbon Stocks over Peat Swamp Forests using Multi-Temporal and Multi-Polarizations SAR
Data. Proceeding of the 36th International Symposium on Remote Sensing of Environment, 11-
15 May 2015, Berlin, Germany.
47
WWF and ÖBf. 2013. Xe Pian REDD+ project document. Gland, Switzerland.
http://www.leafasia.org/sites/default/files/public/resources/WWF-REDD-pres-July-2013-
v3.pdf