Spatiotemporal Analysis of Rice Yield Variability in two California Fields
Alvaro Roel and Richard E. Plant*
Alvaro Roel1, Graduate Group in Ecology, University of California, Davis, CA 95616,
USA; Richard E. Plant, Departments of Agronomy and Range Science and Biological
and Agricultural Engineering, University of California, Davis, CA 95616, USA
1Present address: Instituto Nacional de Investigaciones Agropecuarias, Treinta y Tres,
Uruguay
Received -- -- -- *Corresponding author: [email protected]
Acknowledgements
We are very grateful to the cooperating farmers, Charley Mathews and Charley
Mathews, Jr., for allowing us to carry out research on their farm. We are grateful to
David Clay and to an anonymous reviewer for many helpful comments. This research
was supported by the California Rice Research Board, by the Instituto Nacional de
Investigaciones Agropecuarias of Uruguay, and by a William F. Golden Fellowship to
A. Roel.
Abstract
Currently, little is known about the spatial and temporal variability of rice (Oryza
sativa L.) yield patterns. This information is needed before implementing any
site-specific management strategy. The objective of this research was to characterize the
spatial and temporal yield variability of rice grown in commercial fields in California.
Rice cultivars M-202 and Koshihikari were grown and managed by a cooperating farmer,
who collected yield monitor data over a 4-year period. Alternative methods of data
quality analysis were applied to the data. To evaluate temporal variability, yields from
different years must be placed on a common grid. The appropriate size for these grids
was tested. Large-scale spatial structure was determined using median polish, while
small-scale spatial structure was evaluated by computing variograms of the yield
residuals after subtracting the trends. Temporal variability was determined using two
approaches: 1) computing the variance among years of the original, trend, and residual
yield values at fixed points in the field; and 2) cluster analysis of the standardized trend
yield values. Results from the study showed that the grid density necessary to capture the
spatial variability was site and year dependent. Trend surface spatial behaviors were
year-dependent, indicating a lack of temporal stability. Variograms showed strong spatial
structure of yield residuals. Cluster analysis reduced the considerable complexity in a
sequence of yield maps of these fields to a few general patterns of among-year variation.
Abbreviations: DGPS: differential global positioning system; GR: grid resolution; MC:
Moran coefficient; RIC: relative information criterion; UTM: Universal Transverse
Mercator.
Introduction
Yield monitoring and mapping technology that can measure, georeference, and
record grain yields makes it possible to document the location and magnitude of yield
variability with a spatial precision of meters. If the causes of this variability can be
identified, then corrective action may be implemented to reduce costs, increase yield,
and/or reduce environmental impacts by adopting site-specific management practices
(Lowenberg-DeBoer and Erickson, 2000). Moreover, the availability of high precision
measurements may permit researchers to test hypotheses more efficiently by precisely
measuring crop response to environmental conditions as these conditions vary in the field
(Bhatti et al., 1991; Grondona and Cressie, 1991; Long, 1998). Both of these uses of
precision measurement technology require statistical methods that until now have more
commonly been employed in ecological, epidemiological, and econometric research
(Long, 1998; Griffith and Layne, 1999; Bongiovani and Lowenberg-DeBoer, 2001).
Yield map data quality is an important initial issue for farmers and even more so
for scientists and engineers who wish to use these data in the course of a scientific study.
Blackmore and Marshall (1996) identify six primary sources of error in yield map data:
unknown crop width at the header, time lag in the harvester, GPS error, grain surge, grain
losses, and sensor accuracy and calibration. Birrell et al. (1996), Blackmore and Marshall
(1996), Doerge (1999), Colvin and Arslan (2001), and Haneklaus et al. (2001) provide
discussions of the errors associated with yield maps.
It is a well-known property of spatial data that their statistical properties depend
on the scale at which they are represented. This is often called the “modifiable areal unit
problem” (Openshaw and Taylor, 1981; Wong, 1995). Yield monitor data are generally
recorded periodically (e.g., once per second). To compare data from different sources
(e.g., yield monitor data from different years, bulk soil electrical conductivity data, and
aerial image data), these data must be converted to a common grid. In choosing the grid
size there is a tradeoff between maintaining spatial precision by selecting a fine grid and
reducing noise and making the data more manageable by selecting a coarser grid (Wong,
1995; Long, 1998). Since variability may be studied at any spatial scale, the choice of
grid size depends on the aims of the investigation. In making this choice the
investigator is aided by knowledge of how much information is lost in moving from one
scale to a larger one. One way of determining the effect of increasing grid size is to
examine the experimental variograms of the measured quantities (Isaaks and Srivastava,
1989). Long (1998) examined the change in correlation coefficients between yield and a
second variable as the grid size increased. Chen et al. (1995) developed the Relative
Information Criterion (RIC) to quantify efficiency of data representation. They defined
the RIC as the correlation coefficient between block-kriged estimates based on the
highest grid density and estimates obtained at the same points based on reduced grid
densities.
One important application of yield map data is the study of the spatiotemporal
properties of yield distribution. These properties should be understood before
implementing any site-specific management strategy (Bakhsh et al., 2000). From a
scientific perspective, understanding spatiotemporal variation in yield may help to
determine the factors underlying yield variability. Several studies of spatiotemporal
patterns in terrestrial crops have been carried out. In corn (Zea mays L.) and soybean
(Glycine max L.), Jaynes and Colvin (1997) reported that yields displayed substantial
spatial and temporal variability. This variability may be due to interactions between
climatic growing conditions, soil properties, and the crop (Porter et al., 1998).
There have been few studies on the spatiotemporal variability of rice yields
(Doberman et al., 1995). Rice is one of the world’s most important staple crops. The
development of effective site-specific management techniques for rice production could,
if it increases production efficiency, contribute to increased yields (Doberman et al.,
2002). The study of spatiotemporal variability in rice production also provides
scientifically useful information as a comparison with terrestrial crops. Eghball and
Power (1995) found that rice yields displayed less year-to-year variability than terrestrial
crops. Since rice is grown as a monoculture in California, this production system
provides a particularly good environment to analyze the evolution of yields in space and
time.
Yield variability is often defined in terms of summary statistics such as temporal
variance and spatial variance (Whelan and McBratney, 2000). To fully characterize the
spatiotemporal behavior of a field, however, one must also understand the tendency of
yield patterns to persist season after season, that is, the temporal stability of the spatial
pattern. Correlation coefficients between years are often relatively small (Jaynes and
Colvin, 1997). Moreover, determining the significance of a correlation coefficient is
complicated by the spatial autocorrelation of the data (Cliff and Ord, 1981).
One way to evaluate temporal variability is to compute the temporal variances of yield
values at fixed points in the field (Whelan and McBratney, 2000). By comparing the
among-season (temporal) variances with the within-season (spatial) variances, the
importance of both components can be estimated. Cluster analysis of annual yields can
also be used to assess temporal patterns (Lark and Stafford, 1997; Lark et al., 1999;
Perez-Quezada et al., 2003). Cluster analysis may provide an objective quantification of
the spatial structure of yield patterns as well as an indication of the consistency of these
patterns from year to year (Perez-Quezada et al., 2003).
Our study was carried out to perform a spatial and temporal analysis of four years
of rice yield monitor data collected in two California fields. The first objective of this
study was to examine different methods for evaluating the quality of the data set. The
second objective was to determine the grid resolution that captures enough information to
represent yield spatial variability at a scale appropriate to management. The third
objective was to assess the usefulness of summary statistics in quantifying the
spatiotemporal variability. Finally, the fourth objective was to examine the use of
clustering to delineate areas in the field with different spatiotemporal yield behaviors.
Materials and Methods
The study was carried out from 1998 through 2001 in two rice fields
approximately 2 km apart, one of 38 ha (denoted Field 1) and one of 52 ha (denoted Field
2), located near Marysville, CA (UTM Zone 10, coordinates: E: 627,102, N: 4,340,769;
and E: 624,970, N: 4,341,076 for Fields 1 and 2, respectively). The soils of the study fields
consist of approximately 45% Kimball loam (fine, mixed, active, thermic Mollic
Palexeralfs), 30% San Joaquin loam (fine, mixed, active, thermic, Abruptic Durixeralfs),
and 25% Bruella loam (fine-loamy, mixed, Ultic Palexeralfs). Medium grain rice (Oryza
sativa L.) cultivar M-202 and short grain cultivar Koshihikari were grown and managed
by the cooperator in Fields 1 and 2, respectively, using standard practices for the area
(Hill et al., 1992). The fields were selected based on conversations with the cooperating
grower with the object that one field would tend to have spatially uniform properties and
the second would be very heterogeneous. Fig. 1 shows gray scale renditions of false color
infrared aerial images taken of the two fields during the first year of the experiment.
These images were taken using Kodak 2443 infrared film on August 18, 1998 and are
shown prior to georeferencing. The dark areas in Field 2 correspond to areas of very
sparse vegetation. The field had recently been laser leveled and brought into production,
and these dark areas in the image were areas where considerable topsoil had been cut off
in the leveling process. The fields were managed without regard to spatial variability with
one exception: the consistently poor yielding portion of Field 2 was observed by the
grower to mature earlier than the rest of the field, and so the grower harvested this area
first. In 2001 the grower harvested this portion of the field five days earlier than the rest
of the field.
Rice was harvested using a stripper harvester combine equipped with a John
Deere Green Star® yield mapping system with a real-time differential global positioning
system (DGPS) receiver in all years. The harvester followed a back-and-forth north to
south harvest pattern in Field 1 and a series of concentric patterns in Field 2. Combine
speed ranged from 1.1 to 1.25 m s⁻¹, header width ranged from 5.5 to 6.7 m, grain
flows were recorded once per second, and moisture content was recorded once every 15 s.
Yield map data files (yield, grain, moisture, longitude and latitude) were collected and
imported into the ArcView® (ESRI, Redlands, CA) geographic information system for
analysis. The yield monitor was calibrated at the start of each harvest season by
comparing with a sample of known weight, and yield measurements were adjusted based
on this calibration. Yield monitor data points for a distance of approximately 50 m from
field edges were deleted from the data set to remove border effects and end-of-field yield
monitor errors. Aerial images were taken in approximately mid July and mid August of each
year, using Kodak 2443 infrared film in the first year and a four-band multispectral digital
camera in each other year, and were georeferenced to boundary files made using a
Trimble Ag 132 GPS.
Data quality analysis
Yield monitor data were imported into ArcView GIS as point shapefiles (each
point representing the grain flow and moisture measurements in an area of size
approximately 1 m by 6 m). GPS accuracy and consistent header width were verified by
visual inspection of the records. The John Deere Green Star yield monitor contains
built-in proprietary software to correct for time lag in the harvester (D. Goebel, John Deere
Co., personal communication), so we did not attempt to further correct for this. We could
not measure errors due to grain surge or losses, so this left sensor calibration errors as the
remaining source of inaccuracy we could detect.
Grain flow rates and moisture content were converted to yield at constant 14%
moisture. Considering the sensor record as a time series, we tested for two ways in which
sensor calibration could change during the course of the harvest of a single field: a
gradual trend and a sudden change. A convenient method of trend analysis in time series
data is to study the differences between successive records (Kendall and Ord, 1990).
The data sets consisted of between approximately 50,000 and 90,000 values. We first
removed outliers by visual inspection of the data record. To make the data more
manageable and remove the statistical problems associated with very large data sets
(Matloff, 1991), we next selected every tenth point from the data records for time series
analysis.
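The moisture conversion formula is not given in the text. The following is a minimal sketch, assuming the common wet-basis correction that conserves dry matter; the function names `yield_at_standard_moisture` and `thin_records` are hypothetical and are not taken from the yield monitor software.

```python
def yield_at_standard_moisture(wet_yield, moisture_pct, standard_pct=14.0):
    """Rescale a wet-basis yield to a standard moisture content.

    Assumption: moisture is expressed on a wet basis, so dry matter is
    wet_yield * (100 - moisture_pct) / 100 and is conserved by the conversion.
    """
    dry_matter = wet_yield * (100.0 - moisture_pct) / 100.0
    return dry_matter / ((100.0 - standard_pct) / 100.0)


def thin_records(records, step=10):
    """Keep every `step`-th record (every tenth point by default)."""
    return records[::step]


# Example: 10 Mg/ha harvested at 20% moisture, expressed at 14% moisture.
adjusted = yield_at_standard_moisture(10.0, 20.0)
subset = thin_records(list(range(100)))
```

Grain wetter than 14% is scaled down and drier grain is scaled up, so records taken at different moisture contents become comparable.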
The resulting data sets each consisted of a sequence of time series in which
discontinuities in the GPS clock time indicated points at which the harvest had been
stopped and then restarted. Yield data were plotted against GPS clock time. The resulting
plot was then inspected first for sudden changes in calibration and second for evidence of
a trend (Haneklaus et al., 2001). We assumed that sudden changes in calibration would
occur at discontinuities in the GPS clock time (indicating that the harvest had been stopped
and restarted). Because in Field 2 the cooperating grower harvested the low yielding
areas first and then the better yielding areas, an abrupt change in yield at a gap in GPS
time in Figure 2 does not necessarily represent a calibration artifact. To determine which,
if any, of the records showed an evident change in yield monitor calibration, the yield
records were visually compared with false color infrared aerial images of the field, with a
change in calibration being indicated if the change in yield did not correspond to a
change in vegetation intensity in the aerial image.
The comparison was carried out as follows. In each year and for each field the
raw data values were displayed as points in the GIS. The display was examined for
evidence of abrupt changes in yield trend that occurred at discontinuities in the GPS time.
A false color aerial image of the field was then examined to determine if there was a
corresponding change in infrared reflectance at this location. Because of the difficulties
in estimating yield from aerial images (Plant et al., 2001), this process was done visually
rather than statistically. If there was no change in reflectance properties corresponding to
a change in yield, the change was assumed to be caused by a sudden change in calibration
of the yield monitor. In this case the yield data after the jump in GPS time were adjusted
by multiplying by a constant value to bring the yield trends before and after the change
into visual alignment.
To test for gradual drift or trend in the data, the yield records were differenced and
the differences were tested against the null hypothesis of zero median using the sign test.
The third data quality test was carried out in Field 1 to take advantage of the back-and-forth
harvest pattern. The test was carried out at 10 randomly selected locations in the
field. It consisted of comparing the mean of a sequence of 10 yield values with the mean
of the sequence of 10 values at points immediately to the left of the first 10. Confidence
intervals were computed for the means based on the standard deviation of the 10
sequential yield values. These confidence intervals are not exact due to the
autocorrelation of the data.
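The drift test can be illustrated as follows. This is a sketch only, assuming an exact two-sided binomial sign test with zero differences dropped; the text does not say which variant was used, and the function name `sign_test_p_value` is hypothetical.

```python
from math import comb


def sign_test_p_value(values):
    """Two-sided exact sign test of H0: median of successive differences is 0.

    Assumption: differences equal to zero (ties) are discarded.
    """
    diffs = [b - a for a, b in zip(values, values[1:])]
    positives = sum(d > 0 for d in diffs)
    negatives = sum(d < 0 for d in diffs)
    n = positives + negatives
    if n == 0:
        return 1.0
    k = min(positives, negatives)
    # Probability of k or fewer of the rarer sign under Binomial(n, 1/2).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# A steadily increasing record has only positive differences.
p_drift = sign_test_p_value([float(v) for v in range(11)])
# A record that alternates up and down shows no evidence of drift.
p_flat = sign_test_p_value([0.0, 1.0, 0.0, 1.0, 0.0])
```

A small p-value indicates that increases and decreases between successive records are unbalanced, which is consistent with either sensor drift or a real trend in the field.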
Spatial resolution analysis
Yield point shapefile data were interpolated to a fixed 5 × 5 m grid using inverse
distance weighted interpolation with power 2 and number of neighbors 12. This grid
provided a set of locations of the yield data that was consistent over the four years and
approximated the spacing of the original yield data. These interpolated grids were used as
the starting point for the analyses. Our primary interest was in studying variability on a
scale of tens of meters to eliminate very short-range effects while still maintaining the
ability to distinguish important patterns of spatial variability. Therefore, three different
regularly spaced square point grids of size 90 m, 60 m, and 30 m were used to extract the
values from the interpolated yield maps in both fields. The numbers of grid points for the
90, 60, and 30 m grids were n = 36, n = 76, and n = 292 in Field 1 and n = 42, n = 93, and
n = 402 in Field 2, respectively. Each coarser grid was made up of a subset of
the finer grids.
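Inverse distance weighting with power 2 and 12 neighbors can be sketched as below. This is an illustration under stated assumptions, not the ArcView implementation: the function name `idw_estimate` is hypothetical, and a brute-force neighbor search is used for clarity rather than speed.

```python
import math


def idw_estimate(points, x, y, power=2.0, n_neighbors=12):
    """Inverse distance weighted estimate at (x, y).

    points is a list of (px, py, value) tuples. Only the n_neighbors
    nearest points contribute, each with weight 1 / distance**power.
    """
    by_distance = sorted(
        (math.hypot(px - x, py - y), value) for px, py, value in points
    )
    nearest = by_distance[:n_neighbors]
    # A data point that coincides with the target location is returned as is.
    if nearest[0][0] == 0.0:
        return nearest[0][1]
    weights = [1.0 / d ** power for d, _ in nearest]
    weighted_sum = sum(w * v for w, (_, v) in zip(weights, nearest))
    return weighted_sum / sum(weights)


samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
midpoint = idw_estimate(samples, 1.0, 0.0)
near_first = idw_estimate(samples, 0.5, 0.0)
```

With power 2 the influence of a point falls off with the square of its distance, so the estimate at 0.5 m from the first sample stays close to that sample's value.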
Large-scale variability (trend) and small-scale variability were separated by
detrending the data using the median polish technique (Cressie, 1991; Jaynes and Colvin,
1997; Bakhsh et al., 2000). Median polish by rows and columns may not capture the
entire large-scale trend, as the trend orientation is not known a priori (Cressie, 1991).
Therefore, an additional term was included in the median polish equation to detect any
further trend in the polished data, as described by Cressie (1991) and Jaynes and Colvin
(1997). No significant further trends were detected in any case in which this procedure
was used in our data sets.
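The basic row and column median polish can be sketched as follows. This is a minimal illustration, assuming a complete rectangular grid with no missing cells and a fixed number of sweeps; it omits the additional trend term mentioned above, and the function name `median_polish` is hypothetical.

```python
from statistics import median


def median_polish(grid, n_iter=10):
    """Split a rectangular grid into an additive trend and residuals.

    Returns (trend, residuals), where
    trend[i][j] = overall + row_effect[i] + col_effect[j]
    and grid[i][j] = trend[i][j] + residuals[i][j].
    """
    n_rows, n_cols = len(grid), len(grid[0])
    resid = [list(row) for row in grid]
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols
    for _ in range(n_iter):
        # Sweep row medians out of the residuals into the row effects.
        for i in range(n_rows):
            m = median(resid[i])
            row_eff[i] += m
            resid[i] = [v - m for v in resid[i]]
        m = median(col_eff)
        overall += m
        col_eff = [c - m for c in col_eff]
        # Sweep column medians out of the residuals into the column effects.
        for j in range(n_cols):
            m = median(resid[i][j] for i in range(n_rows))
            col_eff[j] += m
            for i in range(n_rows):
                resid[i][j] -= m
        m = median(row_eff)
        overall += m
        row_eff = [r - m for r in row_eff]
    trend = [[overall + row_eff[i] + col_eff[j] for j in range(n_cols)]
             for i in range(n_rows)]
    return trend, resid


# The last cell is an outlier; medians keep it out of the trend.
yields = [[5.0, 7.0, 9.0], [6.0, 8.0, 10.0], [7.0, 9.0, 50.0]]
trend, residuals = median_polish(yields)
```

Because medians rather than means are swept out, an isolated extreme value ends up in the residuals instead of distorting the trend surface.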
For the characterization of the small-scale variation, experimental variograms
were calculated using the residual yields from the median polish. Variography analyses
assume the data have a Gaussian distribution. Therefore, the distributions of the yield
residuals for the four years were checked using stem and leaf plots and normal
probability plots. Outliers, indicated by stem and leaf plots, were removed and replaced
by kriged estimates following the analysis of Bakhsh et al. (2000). In Field 1, 15, 22, 11,
and 12 values for the 30 m grid and 9, 4, 0, and 0 values for the 60 m grid were removed in
the 1998, 1999, 2000, and 2001 yield data sets, respectively. In Field 2, 11, 13, 9, and 22
values for the 30 m grid were removed in the 1998, 1999, 2000, and 2001 yield data sets,
respectively. In the 60 m grid, 4 values in the 1998 and 13 in the 2001 yield data sets were
removed. Experimental variograms were calculated from the yield values. All sample
variogram computations and model fitting were performed using GS+ v. 5.0 software
(Gamma Design Software, Plainwell, MI) assuming isotropic conditions. The isotropic
assumptions were verified by variography analysis in different orientations. Only lags
less than half the total grid length were computed. A theoretical variogram model was fit
to each experimental variogram. Different models were tested for fitting the data (Isaaks
and Srivastava, 1989), and the isotropic spherical model provided a good fit in each case.
One method for estimating the appropriate resolution for spatial analysis was to base it on
the ranges of the fitted variograms.
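An isotropic experimental variogram and the spherical model can be sketched as below. This is not the GS+ implementation: the lag binning rule (bins of equal width starting at zero) and the function names `empirical_variogram` and `spherical_model` are assumptions made for illustration, and no model fitting is shown.

```python
import math


def empirical_variogram(points, lag_width, max_lag):
    """Isotropic experimental variogram of (x, y, residual) points.

    Returns (mean distance, semivariance, pair count) for each non-empty
    lag bin. Pairs separated by max_lag or more are ignored.
    """
    n_bins = int(max_lag / lag_width)
    sq_sum = [0.0] * n_bins
    dist_sum = [0.0] * n_bins
    count = [0] * n_bins
    for a in range(len(points)):
        xa, ya, za = points[a]
        for b in range(a + 1, len(points)):
            xb, yb, zb = points[b]
            h = math.hypot(xa - xb, ya - yb)
            k = int(h / lag_width)
            if k < n_bins:
                sq_sum[k] += (za - zb) ** 2
                dist_sum[k] += h
                count[k] += 1
    return [(dist_sum[k] / count[k], sq_sum[k] / (2 * count[k]), count[k])
            for k in range(n_bins) if count[k] > 0]


def spherical_model(h, nugget, partial_sill, range_):
    """Spherical variogram: rises from the nugget to the sill at the range."""
    if h <= 0.0:
        return 0.0
    if h >= range_:
        return nugget + partial_sill
    r = h / range_
    return nugget + partial_sill * (1.5 * r - 0.5 * r ** 3)


# Three residuals on a 30 m transect.
transect = [(0.0, 0.0, 1.0), (30.0, 0.0, 3.0), (60.0, 0.0, 1.0)]
bins = empirical_variogram(transect, lag_width=30.0, max_lag=90.0)
```

The `max_lag` argument plays the role of the half-grid-length cutoff described above. The fitted range, where the model reaches its sill, is the distance beyond which residuals are effectively uncorrelated.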
A second method for determining the appropriate cell size for analysis is the
relative information criterion (RIC) (Chen et al., 1995). In this paper we use the square of
the RIC as defined by Chen et al. (1995). The RIC² is the square of the product moment
correlation coefficient of kriged residuals from the same locations in the field, viz.:
\[ \mathrm{RIC}^2(i) = r^2(k_i, k_3), \quad i = 1, 2 \]
In this equation $r^2(k_j, k_i)$ denotes the square of the correlation coefficient of the
block-kriged yield residual estimates using GR i and j. The RIC² as we use it gives the fraction
of the sample variation using GR j that is “explained” if it is related to GR i using a
simple linear regression.
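Given two sets of block-kriged residual estimates at the same locations, the RIC² reduces to a squared Pearson correlation. A minimal sketch follows; the kriging step itself is not shown, and the function name `ric_squared` and the sample values are hypothetical.

```python
def ric_squared(fine_estimates, coarse_estimates):
    """Squared product moment correlation between two sets of estimates.

    Both sequences hold block-kriged residuals at the same locations,
    one from the finest grid and one from a reduced grid.
    """
    n = len(fine_estimates)
    mean_f = sum(fine_estimates) / n
    mean_c = sum(coarse_estimates) / n
    s_ff = sum((f - mean_f) ** 2 for f in fine_estimates)
    s_cc = sum((c - mean_c) ** 2 for c in coarse_estimates)
    s_fc = sum((f - mean_f) * (c - mean_c)
               for f, c in zip(fine_estimates, coarse_estimates))
    return s_fc ** 2 / (s_ff * s_cc)


# Hypothetical kriged residuals at four shared locations.
fine = [1.0, 2.0, 3.0, 4.0]
coarse = [1.0, -1.0, 1.0, -1.0]
value = ric_squared(fine, coarse)
```

A value near 1 means the reduced grid reproduces nearly all of the variation seen on the finest grid, while a value near 0 means most of that information is lost.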
Block kriging was used to interpolate the residuals from the different grids.
Following Chen et al. (1995), block kriging with a block size equal to a 4 × 4 m plot was
used to produce block means of yield residuals, from which distribution maps of yield
residuals were developed. The block size was chosen to approximate as closely as
possible the original cell size of the interpolated grid, which could not be fit exactly due
to limitations of the software. The search radius in the kriging procedure was 350 m in
both fields. In this procedure we first estimated the variogram as described above and
block-kriged using all locations in both fields, corresponding to a grid resolution (GR) of
30 m (i = 3). This procedure was repeated at GR of 60 and 90 m (i = 2 and 1).
Computation of summary statistics
All temporal analysis was carried out on the 30 m grid. Four seasons' worth of
data do not provide the opportunity to study large versus small scale temporal variability;
any long term variability would only be evident with a longer temporal record (Eghball
and Power, 1995). We therefore focused on the stability of the spatial structure over the
four years. We computed the mean temporal variance and the mean spatial variance of
each field. These statistics are defined as follows. Let N be the total number of grid cells
and T the total number of years. Let Y(n,t) be the yield in grid cell n in year t, and let
$\bar{Y}_T(n)$ be the mean yield of cell n over all years. The temporal yield variance
$\sigma_T^2(n)$ of cell n is the variance in yield over the T years,
\[ \sigma_T^2(n) = \frac{1}{T-1} \sum_{t=1}^{T} \left[ Y(n,t) - \bar{Y}_T(n) \right]^2 , \tag{1} \]
and the mean temporal variance is the average of these grid cell variances over all cells,
\[ \bar{\sigma}_T^2 = \frac{1}{N} \sum_{n=1}^{N} \sigma_T^2(n) . \tag{2} \]
The spatial yield variance is the yield variance over all grid cells,
\[ \sigma_S^2(t) = \frac{1}{N-1} \sum_{n=1}^{N} \left[ Y(n,t) - \bar{Y}_N(t) \right]^2 , \tag{3} \]
where $\bar{Y}_N(t)$ is the mean yield in year t over the N cells, and the mean spatial variance is
the average of these spatial variances over all years, given by
\[ \bar{\sigma}_S^2 = \frac{1}{T} \sum_{t=1}^{T} \sigma_S^2(t) . \tag{4} \]
Finally, it is of interest to determine the extent to which temporal variability is distributed
over a field. The summary statistic for this is the standard deviation of the temporal
variance, given by
\[ s_T = \left\{ \frac{1}{N-1} \sum_{n=1}^{N} \left[ \sigma_T^2(n) - \bar{\sigma}_T^2 \right]^2 \right\}^{1/2} . \tag{5} \]
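Equations 1 through 5 can be computed directly from a table of yields with one row per grid cell and one column per year. The following is a minimal sketch using the Python standard library; the function name `summary_statistics`, the dictionary keys, and the sample values are hypothetical.

```python
from statistics import mean, stdev, variance


def summary_statistics(yields):
    """Compute equations 1-5 for yields[n][t] (cell n, year t)."""
    # Eq. 1: sample variance over years, one value per grid cell.
    temporal_var = [variance(cell) for cell in yields]
    # Eq. 2: mean of the cell variances.
    mean_temporal_var = mean(temporal_var)
    # Eq. 3: sample variance over cells, one value per year.
    spatial_var = [variance(year) for year in zip(*yields)]
    # Eq. 4: mean of the yearly variances.
    mean_spatial_var = mean(spatial_var)
    # Eq. 5: sample standard deviation of the cell variances.
    sd_temporal_var = stdev(temporal_var)
    return {
        "temporal_var": temporal_var,
        "mean_temporal_var": mean_temporal_var,
        "spatial_var": spatial_var,
        "mean_spatial_var": mean_spatial_var,
        "sd_temporal_var": sd_temporal_var,
        "ratio": mean_temporal_var / mean_spatial_var,
    }


# Two cells observed in two years (hypothetical values, Mg/ha).
stats = summary_statistics([[1.0, 3.0], [2.0, 6.0]])
```

A ratio above 1 indicates that year-to-year variation at a fixed location exceeds the variation across the field within a season.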
Cluster Analysis
We used the k-means clustering algorithm (Jain and Dubes, 1984). This algorithm
forms a fixed number k of cluster groups by selecting those clusters that minimize
within-group variances and maximize the between-group variance. Cluster analysis was carried
out on the trend data obtained by median polish. Yields were standardized by subtracting
the mean and dividing by the standard deviation to avoid emphasis on data from higher
yielding years. Cluster analysis was carried out using Statistica (StatSoft, Tulsa, OK).
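The standardization and clustering steps can be sketched as follows. This is a plain Lloyd-style k-means with a simple deterministic start, written for illustration only; the initialization and stopping rules used by Statistica are not described in the text, and the function names `standardize_by_year` and `k_means` and the sample values are hypothetical.

```python
import math
from statistics import mean, stdev


def standardize_by_year(yields):
    """Convert yields[n][t] to z-scores within each year (column)."""
    years = list(zip(*yields))
    mu = [mean(year) for year in years]
    sd = [stdev(year) for year in years]
    return [[(v - m) / s for v, m, s in zip(cell, mu, sd)] for cell in yields]


def k_means(rows, k=3, n_iter=100):
    """Assign each row to the nearest of k centers, updating until stable.

    Assumption: starting centers are k rows spaced evenly through the data.
    """
    step = len(rows) // k
    centers = [list(rows[i * step]) for i in range(k)]
    labels = [-1] * len(rows)
    for _ in range(n_iter):
        new_labels = [
            min(range(k), key=lambda c: math.dist(row, centers[c]))
            for row in rows
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:
                centers[c] = [mean(col) for col in zip(*members)]
    return labels, centers


# Six cells, two years (hypothetical trend yields, Mg/ha).
trend_yields = [[1.0, 1.0], [1.1, 0.9], [5.0, 5.0],
                [5.1, 4.9], [9.0, 9.0], [9.1, 8.9]]
z_scores = standardize_by_year(trend_yields)
labels, centers = k_means(z_scores, k=3)
```

Standardizing within each year gives every season equal weight, so cells are grouped by their relative position in the field's yield distribution rather than by absolute yield.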
Results and Discussion
Data quality analysis
Figures 2 and 3 show the yield monitor records presented as time series, or, more
precisely, as connected sequences of time series, plotted in the order in which the data
were entered into the record. The ordinate of each axis is the difference between the GPS
time in seconds of the corresponding record and the lowest GPS time of all the records in
the data set. Arrows mark the points at which there is an abrupt change in the GPS time,
indicating that the harvest had been stopped and resumed. Visual inspection of the figures
indicates no obvious anomalies in any of the data sets except that of Field 2 in 2001. This
data set shows evidence of poor quality, including apparently truncated values in the
early part of the data set. Based on this evidence we elected to remove the Field 2 2001
data set from most of the analysis.
Based on the analysis for abrupt changes in calibration we concluded that the
abrupt changes in Field 2 all corresponded to actual abrupt changes in yield, but that the
abrupt change in yield in Field 1 in 1998 at the second GPS time gap did represent a
calibration change. This record was adjusted accordingly. Specifically, all data values
occurring after the second GPS time gap were multiplied by 1.37, which brought the data
set into visual alignment.
The second test for data quality was the check for gradual drift of the calibration.
Tests of the null hypothesis m = 0, where m is the median of the yield differences
$y_{t+1} - y_t$, were not significant (p > 0.05) for Field 1 in any year. They were significant
(p < 0.05) for Field 2 in 1999 and 2001. The 2001 data set had already been removed from analysis.
Comparison of the 1999 data with the corresponding aerial image led us to conclude that the
trend was due to actual conditions in the field rather than sensor drift.
Figure 4 shows the means and confidence intervals for the years with the most (5
in 2000) and fewest (2 in 1998) overlapping confidence intervals in 10 randomly selected
sequences of 10 points. In general the agreement between subsequent passes of the yield
monitor at the same location was poor. This same high level of local spatial variability in
the data may be observed in the data presented by Searcy et al. (1989) for grain yield and
was reported by Searcy et al. (2000) for cotton yield and mentioned by Perez-Munoz and
Colvin (1996) for grain yield. It was also observed in data recorded for a variety of crops
by Perez-Quezada et al. (2003), and indicates that yield monitor data may be an imprecise
predictor of yield at specific locations in the field.
Grid Resolution
Figures 5 and 6 show the sequences of yield maps of the four years for Fields 1
and 2, respectively, with data corrected as described in the Methods section and
interpolated to a 30 m grid. Yield in Field 1 has no consistent visually apparent pattern
over the years, while in Field 2 there are persistent low and high yielding areas. The low
yielding areas correspond to cut areas of the laser leveling process, which are the dark
areas in Figure 1(b). In both fields the yield patterns on the 30 m grid corresponded visually to
patterns of spatial variability observed in infrared aerial images of the field.
Figure 7 shows the variograms of the median polish residuals for the 30 and 60 m
GR of the 1998 data sets in Field 1. The variograms for Field 2 are almost identical. These
are consistent with the variograms for the other years, which are not shown. To facilitate
comparison the sill and nugget are scaled to the sample variance for each variogram. The
variogram for the 90 m grid, which consisted of only 3 points, displayed no structure
(pure nugget). The 90 m grid was therefore not considered adequate for representing the
spatial structure of yield variability.
Table 1 gives the parameters of the variogram fit to the median polish residuals
using the spherical model for each of the years for the 30 and 60 m grid resolutions for
Fields 1 and 2, respectively. Table 2 gives the RIC² values for each field. RIC² values in
both fields vary from year to year in a manner that corresponds to the variogram ranges
displayed in Table 1. The values in Table 2 may be interpreted as the portion of the
variance of the data in the 30 m grid explained by a simple linear regression of the 60 m on
the 30 m grid. Comparison of the data in Table 1 with the data of Table 2 indicates
that the efficiency of representation of the grid is directly related to the spatial continuity
of the variable, i.e., that the 60 m grid captures more of the information of the finer grid
in cases where the range of the variogram is smaller. Our results agree with those of Chen
et al. (1995). Since RIC² analysis indicated that a substantial amount of information may
be lost in some years by reducing the grid density, all further analyses were done only with
the 30 m grid.
Summary statistics
Table 3 contains the spatial variances $\sigma_S^2(t)$ of the total yield, the median polish
trend, and the residuals, and the covariance between median polish and residual values. The variances
of the residuals, which are a combination of sampling error and short-range effects, are of
the same magnitude as those of the median polish. There is a negative relationship
between large- and small-scale values in Field 2, reflecting a tendency of the lower
yielding areas to have smaller short-range variability. Table 4 gives the mean temporal
variance (equation 2), the standard deviation of the temporal variance (equation 5), and the
ratio between the mean temporal and the mean spatial variance (equation 4) for the
original, trend, and residual yield values. Computations for Field 2 were carried out using
the years 1998 to 2000 only. Mean temporal variance values over the seasons are larger
than the mean spatial variances for the original and trend (large-scale) yield values in
Field 1. Field 2 presented both higher spatial and temporal variance values.
Figures 7a and b show the spatial distribution of the temporal variance (equation
1) computed with the trend yield values in Fields 1 and 2, respectively. To visualize
different variance behaviors, the variances were classified into three
categories: values less than one, between one and two, and greater than two (Mg
ha⁻¹)². These may be considered as representing low, intermediate, and high variability
behaviors, respectively. To aid in visualization the figures are drawn using Thiessen
polygons surrounding each grid point. Visual inspection of the figures shows that the
number of locations with high temporal variability was very small in Field 1 and that
most of the points with high temporal variability were concentrated in the southwest
corner of the field. Most of the rest of the field (184 of the 292 polygons or 63%) had a
small temporal variance (< 1 (Mg ha⁻¹)²). Figure 7b indicates that Field 2 displays very
different behavior from Field 1. In this field most of the polygons (296 of the 402 or
73%) had a large temporal variance (> 2 (Mg ha⁻¹)²).
Cluster analysis
Figure 8 shows the results of the k-means cluster analysis, with k equal to 3. The
figure gives the values of the standardized yields in each of the clusters. In Field 1, cluster
1 is composed of locations with yield performance above the field average
in 1998, around the field average in 1999, and below the field average in 2000 and
2001; cluster 2 is composed of locations that presented low yield performance
in 1998 and 1999, average performance in 2000, and above-average performance in
2001; and cluster 3 is composed of locations with yield performance similar
to or above the annual field average in all years. The three clusters in Field 2 consist of a
consistently high yielding area, a consistently moderate yielding area, and a consistently
low yielding area (Figure 8b).
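A k-means grouping of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the study: the function `kmeans`, the choice of the first k points as initial centers, and the standardized yield values are all hypothetical.

```python
def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; the first k points seed the centers (an assumption)."""
    centers = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest center (squared Euclidean distance).
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Hypothetical standardized yields at six grid points over three years.
points = [
    (1.0, 0.9, 1.1),     # consistently high
    (0.0, 0.1, -0.1),    # consistently moderate
    (-1.0, -1.1, -0.9),  # consistently low
    (1.2, 1.0, 0.8),
    (-0.1, 0.0, 0.2),
    (-0.9, -1.2, -1.0),
]

labels, centers = kmeans(points, k=3)
print(labels)
```

Each row of `centers` is a cluster's mean standardized yield in each year, which is the quantity plotted for each cluster in the figure. Note that the grid point coordinates never enter the calculation.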
Figures 9a and b show the spatial location of the three cluster groups in
Figures 8a and 8b, respectively. These figures allow the visualization of the locations in the
field classified in each cluster and give an indication of the importance of the spatial
configuration of these clusters. The clustering procedure does not use any information
regarding the spatial location of the data. Nevertheless, Figure 9 shows that the three
cluster groups tend to be spatially grouped. The Moran Coefficient (MC) can be
used to test the null hypothesis of zero spatial autocorrelation (Long, 1998). The
distribution of cluster values was significantly different from random in both fields
(MC = 0.673, Z value = 72.0, p = 0.0029 in Field 1 and MC = 0.620, Z value = 69.0, p =
0.0031 in Field 2).
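For readers unfamiliar with the Moran Coefficient, the following Python sketch computes it for values on a small rectangular grid. It is only an illustration: the neighbor definition (cells sharing an edge, with unit weights) is an assumption, since the weights used in the study are not given in this section, and the significance test (Z value) is not reproduced.

```python
def moran_coefficient(grid):
    """Moran coefficient for a rectangular grid with edge-sharing (rook) neighbors."""
    rows, cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    cross = 0.0  # sum over neighbor pairs of (x_i - mean) * (x_j - mean)
    s0 = 0       # sum of all weights (number of directed neighbor pairs)
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    cross += (grid[r][c] - mean) * (grid[rr][cc] - mean)
                    s0 += 1
    return (n / s0) * cross / denom

# Hypothetical cluster labels: spatially grouped versus alternating.
clustered = [[1, 1, 3, 3] for _ in range(4)]
checker = [[(r + c) % 2 for c in range(4)] for r in range(4)]

mc_clustered = moran_coefficient(clustered)
mc_checker = moran_coefficient(checker)
print(mc_clustered, mc_checker)
```

Spatially grouped values give a positive coefficient, while a perfectly alternating pattern gives a strongly negative one; values near zero correspond to the null hypothesis of no spatial autocorrelation.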
Comparing Figures 5, 8a, and 9a shows that although Field 1 appears to have
highly disorganized yield behavior, there are yield patterns that emerge from the cluster
analysis. Comparing Figures 6, 8b, and 9b indicates that the organization of clusters in
Field 2 closely matches the consistent pattern observed in the original data. This
indicates that yield patterns in Field 1 were associated with transient phenomena, while
yield patterns in Field 2 were associated with more permanent factors.
Summary and Conclusions
The study of spatial yield variation of field crops in commercial fields may be
addressed at several scales. Yield data collected with a commercial yield monitor clearly
cannot be used to study variation at a scale smaller than the resolution of the harvesting
equipment. The results of our study, in agreement with those reported by others, indicate
that commercial yield monitors may not provide useful information at the scale of one or
two multiples of the width of the harvester. Nevertheless, our results provide an
indication that at a somewhat larger scale (30 m in our case) the commercial yield
monitor can provide useful information about patterns of relative yield. In Field 2 the
pattern of consistently low yield in certain areas matched observations from aerial infrared
photographs and was explainable in terms of the laser leveling history of the field. The
behavior in Field 1 was more complex and dynamic, but cluster analysis indicated a
strong spatial pattern.
Our analysis was based on interpolating yield monitor data onto a regular
grid and then examining the properties of this grid. This is a commonly used procedure
(Birrell et al., 1996; Swindell, 1997; Long, 1998; Perez-Quezada et al., 2003). Although
this is technically a geostatistical procedure, and has been analyzed by Long (1998) as
such, the dense pattern of the data also justifies considering it from a lattice perspective. In
this context each data point may be considered as an estimate of the mean of the yield
distribution within a lattice cell whose width is the width of the harvester and whose
length is the distance traveled. The interpolation process then becomes one of changing
the scale and extent of the lattice cells to a regular grid. Openshaw and Taylor (1981) and
Long (1998) have investigated the effects of such changes and have shown that they
profoundly affect the autocovariance structure. Thus conclusions about spatial
autocorrelation of such resampled data apply only to aggregation at the scale at which the
study was carried out.
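The effect of changing the lattice scale can be seen in a small Python sketch. Simple cell averaging is used here as a stand-in for the interpolation step, which is an assumption made for illustration; the function `aggregate`, the coordinates, and the yield values are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pvariance

def aggregate(points, cell_size):
    """Average the yields of all points falling in each square cell of side cell_size (m)."""
    cells = defaultdict(list)
    for x, y, yield_value in points:
        cells[(int(x // cell_size), int(y // cell_size))].append(yield_value)
    return {cell: mean(vals) for cell, vals in cells.items()}

# Hypothetical yield monitor points: (easting m, northing m, yield Mg ha-1).
points = [
    (5, 5, 8.0), (25, 5, 10.0), (35, 5, 9.0), (55, 5, 11.0),
    (5, 35, 7.0), (25, 35, 9.0), (35, 35, 10.0), (55, 35, 12.0),
]

grid_30 = aggregate(points, 30)  # four 30 m cells
grid_60 = aggregate(points, 60)  # one 60 m cell

var_raw = pvariance([p[2] for p in points])
var_30 = pvariance(list(grid_30.values()))
print(var_raw, var_30, grid_60)
```

In this toy data set the variance among the 30 m cell means is smaller than the variance among the original points, and at 60 m all spatial structure is averaged away, which is why statements about spatial structure are tied to the aggregation scale.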
Variogram and RIC2 analysis indicated that the grid density required to capture
the spatial variability was different for different years. The low RIC2 values in both fields
coincide with the smaller variogram ranges observed in those years. Visual inspection of
the yield maps of the two fields (Figures 5 and 6) supports the idea that Field 1 had a less
consistent spatial structure than Field 2. Trend surfaces, which followed the same patterns
as Figures 5 and 6, were very dissimilar among years for Field 1. For Field 2 trend
surfaces were quite similar from 1998 to 2000. The estimates of temporal variance over
the four seasons are larger than or similar to the mean spatial variances for the original
and trend (large-scale) yield values in both fields. This indicates that
both components of the spatiotemporal variance can be of equal importance in these
fields. Thus, the temporal component of the spatial variability must be taken into account
to make effective site-specific management decisions. Although both fields presented
similar temporal to spatial ratios, there were differences between them. Field 2 presented
both higher spatial and temporal variance values than Field 1, although it had an
obviously persistent spatial pattern. This indicates that simple summary statistics may not
capture the spatiotemporal structure of the field in enough detail to be
useful by themselves.
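The limitation of a single summary ratio can be illustrated with a short Python sketch. It assumes that the mean temporal variance is the average over grid points of the variance across years, and that the mean spatial variance is the average over years of the variance across grid points; these definitions, the n - 1 divisor, and the yield values are assumptions for illustration only.

```python
from statistics import mean, variance

# Hypothetical yields (Mg ha-1): rows are years, columns are grid points.
yields = [
    [8.0, 9.0, 10.0],
    [9.0, 9.0, 9.0],
    [10.0, 9.0, 8.0],
]

# Temporal variance at each point (down each column), then its field mean.
mean_temporal = mean(variance(col) for col in zip(*yields))
# Spatial variance within each year (along each row), then its mean over years.
mean_spatial = mean(variance(row) for row in yields)
ratio = mean_temporal / mean_spatial

print(mean_temporal, mean_spatial, ratio)
```

In this example the ratio equals one even though the spatial pattern reverses completely between the first and last year, so the ratio alone says nothing about whether the pattern is persistent.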
The visually persistent spatiotemporal pattern displayed by Field 2 was captured
by a k-means cluster analysis. Moreover, the cluster analysis also captured a significant
spatial structure in Field 1, even though this field did not display a corresponding
persistent pattern. This indicates that cluster analysis may play a useful role in organizing
and simplifying the complex spatiotemporal behavior of yield trends over time. This may
aid in identifying the underlying causes of this behavior.
REFERENCES
Bakhsh, A., D. B. Jaynes, T. S. Colvin, and R. S. Kanwar. 2000. Spatio-temporal analysis
of yield variability for a corn-soybean field in Iowa. Trans. ASAE 43:31-38.
Bhatti, A. U., D. J. Mulla, F. E. Koehler, and A. H. Gurmani. 1991. Identifying and
removing spatial correlation from yield experiments. Soil Sci. Soc. of Am. J.
55:1523-1528.
Birrell, S. J., K. A. Sudduth, and S. C. Borgelt. 1996. Comparison of sensors and
techniques for crop yield mapping. Computers and Electronics in Agric. 14:215-233.
Blackmore, B. S., and C. J. Marshall. 1996. Yield mapping: errors and algorithms. p.
403-416 In P. C. Robert et al. (ed.), Proc. Third Int. Conf. on Precision
Agriculture. ASA, Madison, WI.
Bongiovanni, R., and J. Lowenberg-DeBoer. 2001. Nitrogen management in corn using
site-specific crop response estimates from a spatial regression model. unpaginated
CD In P. C. Robert et al. (ed.), Proc. Fifth Int. Conf. on Precision Agriculture.
ASA, Madison, WI.
Chen, J., J. W. Hopmans, and G. E. Fogg. 1995. Sampling design for soil moisture
measurements in large field trials. Soil Sci. 159:155-161.
Cliff, A. D., and J. K. Ord. 1981. Spatial Processes: Models and Applications. Pion Ltd.,
London, UK.
Colvin, T. S., and S. Arslan. 2001. A review of yield reconstruction and sources of errors
in yield maps. unpaginated CD In P. C. Robert et al. (ed.), Proc. Fifth Int. Conf.
on Precision Agriculture. ASA, Madison, WI.
Cressie, N. A. C. 1991. Statistics for Spatial Data. Wiley, New York.
Dobermann, A., M. F. Pampolino, and H. U. Neue. 1995. Spatial and temporal variability
of transplanted rice at the field scale. Agron. J. 87:712-720.
Dobermann, A., C. Witt, D. Dawe, S. Abdulrachman, H. C. Gines, R. Nagarajan, S.
Satawathananont, T. T. Son, P. S. Tan, G. H. Wang, N. V. Chien, V. T. K. Thoa,
C. V. Phung, P. Stalin, P. Muthukrishnam, V. Ravi, M. Babu, S. Chatuporn, J.
Sookthongsa, Q. Sun, R. Fu, G. C. Simbahan, and M. A. A. Adviento. 2002.
Site-specific nutrient management for intensive rice cropping systems in Asia. Field
Crops Res. 74:37-66.
Doerge, T. A. 1999. Yield map interpretation. J. Prod. Agric. 12:54-61.
Eghball, B., and J. F. Power. 1995. Fractal description of temporal yield variability of 10
crops in the United States. Agron. J. 87:152-156.
Griffith, D. A., and L. J. Layne. 1999. A Casebook for Spatial Statistical Data Analysis.
Oxford University Press, Oxford, UK.
Grondona, M. O., and N. Cressie. 1991. Using spatial considerations in the analysis of
experiments. Technometrics 33:381-392.
Haneklaus, S., H. Lilienthal, E. Schnug, K. Panten, and E. Haveresch. 2001. Routines for
efficient yield mapping. unpaginated CD In P. C. Robert et al. (ed.), Proc. Fifth
Int. Conf. on Precision Agriculture. ASA, Madison, WI.
Hill, J. E., S. R. Roberts, D. M. Brandon, S. C. Scardaci, J. F. Williams, C. M. Wick, W.
M. Canevari, and B. L. Weir. 1992. Rice Production in California. University of
California Division of Agriculture and Natural Resources Publication 21498,
Oakland, CA.
Isaaks, E. H., and R. M. Srivastava. 1989. An Introduction to Applied Geostatistics.
Oxford University Press, Oxford, UK.
Jain, A. K., and R. C. Dubes. 1984. Algorithms for Clustering Data. Prentice-Hall, Upper
Saddle River, NJ.
Jaynes, D. B., and T. S. Colvin. 1997. Spatiotemporal variability of corn and soybean
yield. Agron. J. 89:30-37.
Kendall, M., and J. K. Ord. 1990. Time Series. Edward Arnold Publishing, Kent, Great
Britain.
Lark, R. M., H. C. Bolam, T. Mayr, R. I. Bradley, R. G. O. Burton, and P. M. R.
Dampney. 1999. Analysis of yield maps in support of field investigations of soil
variation. p. 151-161 In J. V. Stafford (ed.), Precision Agriculture '99. Sheffield
Academic Press, Sheffield, UK.
Lark, R. M., and J. V. Stafford. 1997. Classification as a first step in the interpretation of
temporal and spatial variability of crop yield. Annals of Applied Biol. 130:111-121.
Long, D. S. 1998. Spatial autoregression modeling of site-specific wheat yield. Geoderma
85:181-197.
Lowenberg-DeBoer, J., and K. Erickson. 2000. Precision Farming Profitability. Purdue
University, West Lafayette, IN.
Matloff, N. 1991. Statistical hypothesis testing: problems and alternatives. Environmental
Entomology 20:1246-1250.
Openshaw, S., and P. Taylor. 1981. The modifiable areal unit problem. p. 60-69 In N.
Wrigley and R. J. Bennett (ed.), Quantitative Geography: A British View.
Routledge and Kegan Paul, London, UK.
Perez-Munoz, F., and T. S. Colvin. 1996. Continuous grain yield monitoring. Trans.
ASAE 39:775-783.
Perez-Quezada, J. F., G. S. Pettygrove, and R. E. Plant. 2003. Spatial-temporal analysis
of yield and the influence of soil factors in two four-crop-rotation fields in the
Sacramento Valley, California. Agron. J. 95:676-687.
Plant, R. E., D. S. Munk, B. A. Roberts, R. N. Vargas, D. W. Rains, R. L. Travis, and
R. B. Hutmacher. 2001. Application of remote sensing to strategic cotton
management. J. Cotton Sci. 5:30-41.
Porter, P. M., J. G. Lauer, D. R. Huggins, E. S. Oplinger, and R. K. Crookston. 1998.
Assessing spatial and temporal variability of corn and soybean yields. J. Prod.
Agric. 11:359-363.
Searcy, S. W., J. K. Schueller, Y. H. Bae, S. C. Borgelt, and B. A. Stout. 1989. Mapping
of spatially variable yield during grain combining. Trans. ASAE 32:826-829.
Searcy, S. W., A. D. Beck, and J. P. Roades. 2001. Cotton yield mapping: Texas
experiences in 2000. p. 342-344 In Proc. 2001 Beltwide Cotton Conf. National
Cotton Council, Memphis, TN.
Swindell, J. E. G. 1997. Mapping the spatial variability in the yield potential of arable
land through GIS analysis of sequential yield maps. p. 827-834 In J. V. Stafford
(ed.), Precision Agriculture '97: Papers Presented at the First European
Conference on Precision Agriculture. BIOS Scientific Publishers Ltd., Oxford,
UK.
Whelan, B. M., and A. B. McBratney. 2000. The "null hypothesis" of precision
agriculture management. Prec. Agric. 2:265-279.
Wong, D. W. S. 1995. Aggregation effects in geo-referenced data. p. 83-106 In S. L.
Arlinghaus et al. (ed.), Practical Handbook of Spatial Statistics. CRC Press, Boca
Raton, FL.
Table 1. Range, sill, and nugget of variograms fit using the spherical model to 30 m and 60 m grid resolutions of the yield median
polish residuals for both fields.
                     Field 1                             Field 2
Year  Grid Res. (m)  Nugget  Sill  Range (m)  R2         Nugget  Sill  Range (m)  R2
1998  30             0.26    0.99  53.7       0.75       0.55    1.00  100.0      0.89
1998  60             0.22    1.05  88.0       0.29       0.15    1.10  100.0      0.92
1999  30             0.50    1.10  181.0      0.85       0.45    1.09  160.0      0.99
1999  60             0.57    1.16  211.0      0.85       0.45    1.00  160.0      0.95
2000  30             0.44    1.02  90.9       0.93       0.51    1.03  200.0      0.98
2000  60             0.24    0.95  89.0       0.78       0.45    1.01  244.6      0.97
2001  30             0.20    1.17  156.7      0.93       0.59    0.95  95.0       0.80
2001  60             0.40    1.11  143.4      0.91       0.50    0.98  120.0      0.95
Table 2. Squared relative information criteria (RIC2) between the 30 m and 60 m grid
resolutions for Fields 1 and 2 for each of the 4 years.
         1998  1999  2000  2001
Field 1  0.15  0.63  0.52  0.70
Field 2  0.32  0.60  0.61  0.13
Table 3. Spatial variance in total yield, yield trend obtained through median polish, and
small scale yield variation obtained as median polish residuals, and covariance between
median polish and residuals. Units are (Mg ha-1)2.
      Field 1                                    Field 2
Year  Yield  Med. Polish  Residuals  Covariance  Yield  Med. Polish  Residuals  Covariance
1998  2.65   1.02         1.86       -0.12       2.41   1.00         1.79       -0.19
1999  1.47   0.64         0.99       -0.08       6.28   2.39         4.71       -0.41
2000  0.38   0.18         0.22       -0.01       3.85   1.63         2.74       -0.26
2001  3.70   1.75         1.54        0.41
Avg.  2.05   0.90         1.15        0.0        4.18   1.67         3.08       -0.39
Table 4. Mean temporal variance (eq. 2), standard deviation (eq. 5) and the ratio between
the mean temporal and the mean spatial variance (eq. 4) for the years 1998-2001 for Field
1, and 1998-2000 for Field 2.
                                Mean Temporal         Std. Dev.†   Ratio:
                                Variance (Mg ha-1)2   (Mg ha-1)    (Temporal/Spatial)
Field 1 (1998-2001)
Grain Yield                     2.46                  2.04         2.46 / 2.05 = 1.20
Median Polished (Large-scale)   1.38                  0.85         1.38 / 0.90 = 1.53
Residuals (Small-scale)         1.13                  1.56         1.13 / 1.15 = 0.72
Field 2 (1998-2000)
Grain Yield                     4.07                  5.00         4.07 / 4.18 = 0.97
Median Polish                   1.87                  2.73         1.87 / 1.67 = 1.12
Residuals                       2.18                  1.78         2.18 / 3.08 = 0.70
† Standard deviation of the temporal variance.
List of Figures
Figure 1. Gray scale renditions of false color infrared aerial images taken on August 8,
1998 of the two study fields. (a) Field 1. Note that in the image the field becomes darker
as one moves from west to east. This corresponds to higher near infrared values in the
original image. (b) Field 2. The dark areas in this image correspond to areas of very
sparse vegetation in the field. The field had recently been laser leveled and brought into
production, and these dark areas in the image were areas where considerable topsoil had
been cut off in the leveling process.
Figure 2. Yield monitor data (every tenth data point) of Field 1 plotted in the order in
which records were entered into the database. Abscissa is the difference between the GPS
clock time for that record and the lowest GPS time of the data set. Ordinate is yield (kg
ha-1) converted to common moisture content. Arrows are placed at gaps in the GPS time
record, indicating that the harvest had been stopped and restarted.
Figure 3. Yield monitor data (every tenth data point) of Field 2 plotted in the order in
which records were entered into the database. Abscissa is the difference between the GPS
clock time for that record and the lowest GPS time of the data set. Ordinate is yield (kg
ha-1) converted to common moisture content. Arrows are placed at gaps in the GPS time
record, indicating that the harvest had been stopped and restarted.
Figure 4. Relationship between the means of sequences of 10 yield monitor data points
recorded next to each other in Field 1. Error bars show 95% confidence intervals based
on standard deviation of the 10 data values.
Figure 5. Sequence of maps of Field 1 data as analyzed for each of the years 1998
through 2001. Yield data in kg ha-1. Outliers have been deleted and data have been
resampled to a common grid representing 30 meter square cells (Grid 1).
Figure 6. Sequence of maps of Field 2 data as analyzed for each of the years 1998
through 2001. Yield data in kg ha-1. Outliers have been deleted and data have been
resampled to a common grid representing 30 meter square cells (Grid 1).
Figure 7. Experimental and fitted variograms of yield data in Field 1. Values of the fitted
variogram sill, co+c1, range, Ao and model fitness, r2, are given in the figures. a) 30m, b)
60m sampling grid. Variograms for Field 2 are very similar.
Figure 8. Spatial distribution of the temporal variance (Mg ha-1)2. (a) Field 1, (b) Field 2.
Each figure consists of Thiessen polygons surrounding each sample point.
Figure 9. (a) Cluster behaviors defined by standardized yields in four years in Field 1. (b)
Cluster behaviors defined by standardized yields over the three years 1998 – 2000 in
Field 2.
Figure 10. (a) Thiessen polygons interpolated clusters in Field 1 (1998 – 2001). (b)
Thiessen polygons interpolated clusters in Field 2 (1998 – 2000).
[Figure 1: panels (a) and (b)]
[Figure 2: four panels (Field 1, 1998; Field 1, 1999; Field 1, 2000; Field 1, 2001); x-axis: Time (sec); y-axis: Yield (kg/ha)]
[Figure 3: four panels (Field 2, 1998; Field 2, 1999; Field 2, 2000; Field 2, 2001); x-axis: Time (sec); y-axis: Yield (kg/ha)]
[Figure 4: two panels (Field 1, 1998; Field 1, 2000); x-axis: Sample Number; y-axis: Mean Yield (kg/ha); series: # 1, # 2]
[Figure 5]
[Figure 6]
[Figure 7: panels a) and b)]
[Figure 8: panels a) and b)]
[Figure 9: panel a) years 1998 to 2001; panel b) years 1998 to 2000; x-axis: Year; y-axis: Standardized Yield; series: Cluster 1, Cluster 2, Cluster 3]
[Figure 10: panels (a) and (b)]