Temporal Aggregation and Persistence of the S&P/Case-Shiller Indices: An Empirical Study
By
Antoine Giannetti*
July 2016
* College of Business, Florida Atlantic University, Boca Raton, FL 33431-0991, E-mail: [email protected]. The author would like to thank the discussant Zack McGurk as well as seminar participants at the American Real Estate Society meetings, ARES Denver 2016. All remaining errors are my own responsibility.
Temporal Aggregation and Persistence of the S&P/Case-Shiller Indices: An Empirical Study
Abstract
Extant research, e.g. Cotter and Roll (2014), suggests that the S&P/Case-Shiller indices’ monthly persistence may be partly due to their moving average computation rule (temporal aggregation). The paper specializes the insights of the literature on statistical measurement error to investigate this conjecture. It is found that temporal aggregation is a transient phenomenon that selectively distorts OLS predictive regressions’ slopes but does not impact the uniform, robust persistence of the S&P/Case-Shiller indices for all 14 surveyed cities. Economically, the results are consistent with the argument that a similar trading mechanism induces momentum across local housing markets.
Keywords: Housing Momentum; Repeat Sale Indices; Instrumental Variable
1. Introduction
Casual evidence suggests that, in recent years, residential real estate has become a distinct financial asset class. As a consequence, investors who acquire properties, lenders who finance those purchases and regulators who monitor the lenders increasingly require frequent price monitoring. Unfortunately, homes do not transact as often as stocks. As a result, residential housing indices, which computationally require a substantial number of paired transactions, are constrained by data scarcity. A convenient remedy, implemented by Standard & Poor’s through the S&P/Case-Shiller indices, is temporal aggregation1: transactions for a given month are combined with those from the previous two months in a moving-average window2. Temporal aggregation effectively increases sample size but it also induces spurious correlation, as the same transaction affects the index calculation for at least three consecutive months. In
extant literature, the statistical impact of temporal aggregation is not clearly delineated. For
instance, Ghysels, Plazzi, Torous and Valkanov (2013) argue that persistence in repeat-sale
index returns may be due “partly to the construction of the index and partly to market
inefficiencies in real estate markets”. Cotter and Roll (2014) make a similar point and further
propose an intuitive filtering procedure to recover the “true” underlying index level. The main
contribution of this paper is to develop a rigorous econometric framework to dissect the impact
of temporal aggregation on the S&P/Case-Shiller indices’ persistence.
1 Temporal aggregation is not the only solution to high-frequency data scarcity. The Federal Housing Finance Agency (FHFA) instead calculates monthly home price indices (HPI) by aggregating transactions in nine large US census divisions. Because the census divisions cover several MSAs (Metropolitan Statistical Areas), enough repeat sale pairs are available to update the HPI indices monthly. A limitation is that US Census divisions are arbitrary statistical constructs that may cover distinct residential markets. Alternatively, Korteweg and Sorensen (2016) propose a Bayesian approach that allows reconstructing a continuous home price path, hence bypassing the need for an increased observable sample.
2 In technical documentation, Standard & Poor’s (February 2015) explains that the three-month moving average window allows for late reported transactions and helps “keep sample sizes large enough to create meaningful price change averages”.
Ever since Case and Shiller (1989), empirical research on housing predictability has
proceeded by estimating the slope of a predictive model that regresses annual log-index changes
on a one year lag of itself. In recent work, Guren (2015) has comprehensively updated earlier figures for a set of 103 US cities and finds a median slope coefficient of 0.6: a 1% index change this year is followed by a 0.6% change in the same direction next year3. A quick “back-of-the-envelope” calculation shows that, for a first-order autoregressive process, a 0.6 annual slope is roughly equivalent to a 0.96 monthly slope (0.96^12 ≈ 0.61). In words, one would expect robust economic persistence in monthly S&P/Case-Shiller index returns. For a number of S&P/Case-Shiller cities, this is roughly what herein estimated OLS predictive regressions show. There are also a number of cities for which the slope point-estimates stand substantially lower than predicted. The paper conjectures that, at the monthly horizon, cross-sectional differences in OLS estimated persistence may not be of economic nature but rather driven by temporal aggregation.
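The back-of-the-envelope conversion above is easy to verify numerically. The sketch below is purely illustrative (it is not part of the paper’s estimation) and simply compounds an AR(1) slope across horizons:

```python
# For an AR(1) process, k-step-ahead persistence equals the one-step slope
# raised to the power k; 12 monthly steps make one annual step.
def annual_slope(monthly: float) -> float:
    """Annual persistence implied by a monthly AR(1) slope."""
    return monthly ** 12

def monthly_slope(annual: float) -> float:
    """Monthly AR(1) slope implied by an annual persistence figure."""
    return annual ** (1.0 / 12.0)

print(round(monthly_slope(0.6), 2))   # 0.96: from Guren's 0.6 annual median
print(round(annual_slope(0.96), 2))   # 0.61: back to roughly the annual figure
```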
This work assumes that the observed S&P/Case-Shiller index return may be decomposed
into the sum of two unobserved, orthogonal components, the “true” index return and the
temporal aggregation process. In this setting, it is straightforward to show that the OLS predictive regression of the observed index return on its own lag is biased and asymptotically inconsistent (i.e. the slope may not converge to its “true” value even in large samples). The reason is that, in the presence of temporal aggregation, the regressor and the regression residuals are, by construction, correlated. The ensuing endogeneity of the regressor violates the standard OLS assumptions for consistent estimation. Furthermore, as previously alluded to, temporal
aggregation induces a short-lived moving-average structure. The current work proposes to jointly
3 Guren also reports evidence of mean-reversion at the two to three year horizon but the latter is not statistically as strong for it necessarily relies on a smaller number of non-overlapping time periods.
address those estimation issues (i.e. endogeneity and serial correlation) by developing a
comprehensive econometric framework based on the classical literature on measurement error4.
A broad predicament of the literature on measurement error is that, in a linear regression,
a contaminated observed regressor induces dampened slope estimates. This effect is traditionally
referred to as the attenuation bias. The magnitude of this bias is functionally related to the ratio of the variance of the unobservable regressor (the “true” index) to the variance of the measurement error. This statistic is known as the signal-to-noise ratio. The empirical work will
demonstrate that statistical distortion is indeed stronger for cities where temporal aggregation
induces a low signal to noise ratio. Additionally, the paper marginally extends the classical
framework to accommodate the predictive regression as a dynamic model. This alteration results
in an additive bias that accounts for the serial correlation in temporal aggregation.
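The attenuation mechanism can be illustrated with a small simulation. All parameter values below (the AR slope, the MA coefficients, the noise scale) are illustrative choices, not estimates from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
T, beta = 200_000, 0.95        # long sample: approximates the probability limit

# "True" AR(1) return process
eps = rng.normal(0.0, 1.0, T)
r_true = np.zeros(T)
for t in range(1, T):
    r_true[t] = beta * r_true[t - 1] + eps[t]

# Serially correlated measurement error: an MA(3) in an i.i.d. shock,
# mimicking a three-month moving-average aggregation window
eta = rng.normal(0.0, 3.0, T)  # deliberately noisy: low signal-to-noise ratio
u = eta.copy()
for i, th in enumerate([-0.5, -0.3, -0.1], start=1):
    u[i:] += th * eta[:-i]

r_obs = r_true + u             # observed return = true return + aggregation noise

def ols_slope(x, y):
    """Slope of a univariate OLS regression of y on x (with intercept)."""
    xd, yd = x - x.mean(), y - y.mean()
    return (xd @ yd) / (xd @ xd)

b_true = ols_slope(r_true[:-1], r_true[1:])   # close to beta = 0.95
b_obs = ols_slope(r_obs[:-1], r_obs[1:])      # well below beta: attenuated
print(f"true-return slope: {b_true:.2f}, observed-return slope: {b_obs:.2f}")
```

With a noisy measurement error the slope on observed returns collapses well below the true persistence coefficient, which is exactly the distortion the paper attributes to temporal aggregation.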
In the empirical work, the paper specializes the two-step FGLS (feasible GLS) estimator
introduced by Hatanaka (1974). The first step addresses the regressor endogeneity problem by
performing an instrumental variable (IV) estimation. It delivers predictive slopes that are “purged” of the temporal aggregation disturbance and, as such, are consistent estimates of “true”
economic persistence. Additionally, the first step allows for consistent estimation of the
correlation structure imparted by temporal aggregation. The second step performs FGLS
estimation by using first step estimated residuals’ correlation. It recovers the predictive slopes
that best explain the observed S&P/Case-Shiller index returns. In short, accounting for temporal
aggregation, the two-step procedure delivers efficient estimates of the indices’ persistence.
Technical details are relegated to the body of the manuscript.
4 See Greene (1993) Section 9.5 for an introduction on measurement error. Estimation methods when data is corrupted by measurement error have long been studied in Econometrics. As a matter of fact, the method of instrumental variables (IV) is typically introduced as the tool of choice to deal with measurement error.
The main empirical finding is that, under FGLS estimation, residential real estate
persistence is uniformly robust for all 14 S&P/Case-Shiller surveyed markets. This is in sharp
contrast with monthly OLS predictive regressions which are plagued by downward biases related
to the indices’ moving-average computation rule. Indeed, with OLS estimation, several cities
appear less persistent than others, but this is essentially a distortion caused by a low signal-to-noise
ratio. From the economic standpoint, the results support the argument that a similar trading
mechanism (see Guren (2015) for a theoretical model) drives persistence across local residential
markets. At low frequency, the literature has emphasized the importance of local factors. For
instance, Novy-Marx (2009) argues that: “We would expect, for example, to see a more volatile
real estate market, both in terms of transaction prices and expected time-to-sale, in San
Francisco, where the supply response is relatively inelastic due to geography on the extensive
margin and regulations on the intensive margin, than in Phoenix, which is situated in a relatively
flat, featureless plain and where developers are comparatively unencumbered by regulatory
concerns.” In contrast, at high-frequency, this work uncovers scant heterogeneity in momentum
across local housing markets. Indeed, irrespective of geographical location, monthly housing market patterns display robust economic persistence.
The S&P/Case-Shiller indices are trusted residential real estate benchmarks routinely
used by regulators5, academics and the general public alike. They endeavor to best capture the
current state of their respective local market. In order to achieve such a technical feat at high
frequency, they must expand their estimation window through temporal aggregation. The main
message of the paper is that those computational constraints have broad implications for the
5 Corelogic, the company that currently calculates S&P/Case-Shiller values reports that “During the financial crisis, the Federal Reserve used the S&P/Case-Shiller Composite-10 Index to assess residential mortgage-asset risk at the nation’s largest banks. These results—and related stress tests—helped determine actual lending viability at a time of deep financial uncertainty”. See https://www.corelogic.com/downloadable-docs/corelogic_case-shiller_indexes_datasheet.pdf
measurement of the local indices’ returns. The outline of the paper is as follows. Section 2
reviews relevant literature. Section 3 develops the conceptual framework. Section 4 reports
empirical results. Section 5 provides concluding remarks.
2. Literature Review
Home price persistence (or momentum) is probably the most distinctive empirical feature
of real estate data. In his Nobel Prize lecture, Shiller (2013) argues that: “… home prices are
generally extremely smooth through time, except for a small amount of seasonality. Home prices
do indeed go through years of price increases and then years of price decreases.” The economic mechanism underlying this momentum is still actively debated in the literature. The traditional
viewpoint, dating back to the seminal contribution of Case and Shiller (1989, 1990) and
reiterated by Shiller (2013) is that real estate markets are “wildly inefficient” all over the world.
According to Shiller, inefficiency of the private real estate market (relative to the stock market) results from the higher cost of trading homes and, more specifically, of short-selling them. There are
however several theories to explain home price momentum. Most of them introduce a self-
reinforcing mechanism that amplifies shocks in economic fundamentals. Stein (1995) proposes a
model where home buyers are constrained by a minimum down-payment. He argues that the latter produces self-reinforcing effects by which macroeconomic shocks are magnified as
financially constrained repeat-buyers are unable to purchase homes. Novy-Marx (2009) adapts
the standard Diamond-Mortensen-Pissarides search model by assuming that entry/exit for both
buyers and sellers is not infinitely elastic. He constructs an equilibrium where, as the ratio of
buyers to sellers changes, a self-reinforcing mechanism triggers “hot and cold markets”. Guren
(2015) proposes a model with concave demand curves in relative price to explain the momentum effect. His story is that homes priced above the current average are unlikely to transact, while setting the price below average will only slightly increase the probability of a sale but, at the same time,
“leave money on the table”. He finds empirical support for his model using micro-data listings
from the San Francisco Bay, Los Angeles and San Diego metropolitan areas from 2008 to 2013.
In the urban economics literature, Glaeser, Gyourko, Morales and Nathanson (2014) estimate a dynamic urban real estate model and are unable to find a calibration consistent with the high-frequency momentum effect, which they even dub “the most important puzzle for
housing economists to explain”.
On the purely empirical front, Case and Shiller (1989) were among the first to exploit the
predictive regression model to uncover momentum in repeat-sale indices. In their work, they
entertain the possibility that spurious correlation in repeat-sale indices may be induced by the
time-overlapping patterns of paired-transactions. They address the problem by randomly
partitioning their sample into two sub-samples and compute distinct repeat-sale indices for each
sub-sample. Case and Shiller further estimate year-to-year predictive regressions by using, as
dependent variable, returns from one sub-sample and, as explanatory variable, lagged-returns
from the other. They find that “a change in real city-wide housing prices in a given year, tends to
predict a change in the same direction, and one quarter to one half in magnitude, the following
year”.
The work perhaps most related to the current paper is by Cotter and Roll (2014). While
Case and Shiller (1989) and more recently Guren (2015) have mostly considered year-to-year
predictive regressions6, Cotter and Roll focus on monthly S&P/Case-Shiller returns. They point
6 Case and Shiller (1989) also argue that the quarterly first differences of their indexes are noisy while annual differences are less so. A fortiori, the implication is that monthly repeat sale returns may be even noisier. Additionally, note that temporal aggregation of the type considered in this paper is not an issue with their data-set since the moving-average rule was presumably not used to compute the indices in their study.
out that those monthly returns “are highly autocorrelated, probably because they are constructed
as three-month moving averages”. They also propose an ingenious procedure to filter the
S&P/Case-Shiller series. Indeed, assuming that the “true” but unobservable repeat-sale index
follows a random walk, they extract its level by simulating two index values and solving for the
third unknown price whose weighted moving average equals the observed S&P/Case-Shiller level. They refer to their method as “reverse engineering” of filtered series whose moving average results in the observed S&P/Case-Shiller original data. There are two main limitations to
the Cotter and Roll procedure. The first concern is that it relies on simulated index levels from an
estimated historical distribution whose parameters are necessarily subject to sampling error. The
second and perhaps more problematic issue is the implicit assumption that the unobservable
index follows a random walk while, since Case and Shiller (1989), the bulk of the empirical real
estate literature suggests that it does not. Nevertheless, using their procedure, Cotter and Roll
adjust the S&P/Case-Shiller 10-City Composite index (CS10) series and report that the first-order serial correlation drops from positive 0.968 down to negative 0.206 (see Cotter and Roll (2014), Table 3, p. 219) for the adjusted series. Besides the fact that the Composite index is
likely even more serially correlated than local markets due to the capitalization weighting
scheme, it is difficult to reconcile the Cotter and Roll finding with extant research on home price
momentum at low frequency.
3. Conceptual Framework
3.1 A Model for Temporal Aggregation
Let p_t denote the natural logarithm of a repeat sale index level at time t. The observed index log-level change r_t = Δp_t is referred to as price growth, price appreciation or simply return. Assume that the “true” but unobservable predictive model for estimation of the persistence coefficient β, with |β| < 1, is given by:

r*_t = β r*_{t−1} + ε_t (1)
where r*_t is the unobservable “true” index return and ε_t is a white noise such that all of the assumptions of the classic linear regression model hold. From the economic standpoint, the above model says that the current “true” return r*_t may be construed as the infinite sum of geometrically declining shocks in past housing returns (infinite moving average MA(∞) representation). The observable return r_t is represented as the sum of two orthogonal components, the unobservable return r*_t and a measurement error u_t, such that:

r_t = r*_t + u_t (2)
where u_t denotes the unobservable temporal aggregation error, represented by the following moving-average process:

u_t = η_t + θ_1 η_{t−1} + θ_2 η_{t−2} + … + θ_M η_{t−M} (3)

and η_t is a white noise uncorrelated with ε_t. The above specification allows for current measurement errors to be serially correlated with past measurement errors u_{t−1}, …, u_{t−M}. As such, (3) is a convenient way to capture the impact of the time-aggregation window7 on the computation of the S&P/Case-Shiller indices. Straightforward algebra shows that (1) may be expressed in terms of the observable return as:

r_t = β r_{t−1} + v_t, with v_t = ε_t + u_t − β u_{t−1} (4)

Observe that (4) cannot generally be estimated by OLS for, in the presence of measurement error, the lagged regressor is endogenous. Indeed, the covariance between the regressor and the residual in (4) is expressed8 as:

Cov(r_{t−1}, v_t) = (ρ_1 − β) σ_u² (5)

where ρ_1 is the first-order serial correlation of the measurement error and σ_u² is the unconditional variance of the measurement error. Unless (5) is zero (which is trivially satisfied in the absence of measurement error), it is well known that the OLS estimator is asymptotically inconsistent. Furthermore, one may express9 the slope of the predictive regression in (4) as:

plim β̂ = β / (1 + σ_u²/σ*²) + ρ_1 / (1 + σ*²/σ_u²) (6)

where σ*² is the unconditional variance of the “true” return. A few points are in order. For starters, when σ*² ≪ σ_u², the first term on the right-hand side of (6) causes the estimated slope to shrink downward toward zero. As such, it produces the so-called attenuation bias from the classical literature on measurement error (see Greene (1993) pp. 281). In particular, slope distortion may be extreme when the signal-to-noise ratio σ*²/σ_u² is close to zero. Additionally, the second term on the right-hand side of (6) imparts an additive bias that is large when σ*² ≪ σ_u². Intuitively, the extra term is needed because, in a dynamic model, measurement errors are necessarily serially correlated. This situation is not traditionally considered in the classical literature, for no correlation is assumed between measurement errors in the dependent and independent variables. Finally, perhaps the case against OLS should not be overstated, as expression (6) also says that when the signal-to-noise ratio is large enough, both the attenuation bias and the additive bias may still be statistically negligible. In short, this is ultimately an estimation issue.

7 Lag length M is implied by the computation rule followed by S&P (see S&P/Case-Shiller Home Price Indices Methodology (2015), pp. 6): “The index point for each reporting month is based on sales pairs for that month and the preceding two months”. It may appear that M = 2 is appropriate but this choice overlooks transactions from month t reported late in month t+1. Setting M = 3 addresses the issue.
8 Cov(r_{t−1}, v_t) = Cov(r*_{t−1} + u_{t−1}, ε_t + u_t − β u_{t−1}) = ρ_1 σ_u² − β σ_u², and (5) follows.
9 Indeed, Cov(r_t, r_{t−1}) = Cov(r*_t + u_t, r*_{t−1} + u_{t−1}) = β σ*² + ρ_1 σ_u² and Var(r_{t−1}) = σ*² + σ_u², so that plim β̂ = (β σ*² + ρ_1 σ_u²) / (σ*² + σ_u²). Rearranging yields (6).
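Expression (6) can be checked against simulated data. The sketch below picks illustrative values for β, the MA coefficients and the shock scales (none of them taken from the paper), computes the population moments they imply, and compares the formula to the OLS slope on a long simulated sample:

```python
import numpy as np

rng = np.random.default_rng(1)
T, beta = 400_000, 0.9
theta = [-0.4, -0.2, -0.1]                  # illustrative MA(3) coefficients
sd_eps, sd_eta = 1.0, 2.0                   # illustrative shock scales

# Simulate the true AR(1) return and the MA(3) measurement error
eps = rng.normal(0.0, sd_eps, T)
r_true = np.zeros(T)
for t in range(1, T):
    r_true[t] = beta * r_true[t - 1] + eps[t]
eta = rng.normal(0.0, sd_eta, T)
u = eta.copy()
for i, th in enumerate(theta, start=1):
    u[i:] += th * eta[:-i]
r_obs = r_true + u

# Population moments implied by the parametrization
var_star = sd_eps**2 / (1.0 - beta**2)                   # Var(r*), AR(1)
var_u = sd_eta**2 * (1.0 + sum(th**2 for th in theta))   # Var(u), MA(3)
rho1 = (theta[0] + theta[0]*theta[1] + theta[1]*theta[2]) \
       / (1.0 + sum(th**2 for th in theta))              # corr(u_t, u_{t-1})

# Right-hand side of expression (6): attenuation term plus additive term
plim_slope = beta / (1.0 + var_u / var_star) + rho1 / (1.0 + var_star / var_u)

x, y = r_obs[:-1] - r_obs[:-1].mean(), r_obs[1:] - r_obs[1:].mean()
b_ols = (x @ y) / (x @ x)
print(f"formula (6): {plim_slope:.3f}   simulated OLS slope: {b_ols:.3f}")
```

Under these parameter values both numbers land well below β, and close to each other, which is the sense in which (6) describes the probability limit of the OLS slope.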
3.2 The two-step FGLS approach
Given the concerns with OLS estimation in the presence of temporal aggregation, the
paper proposes an alternative approach to estimate the predictive regression in (4). This
alternative approach allows to deal with both lagged regressor endogeneity and serial correlation
in the unobserved measurement error. The entertained estimation methodology adapts the two-
step efficient FGLS (feasible GLS) procedure first suggested by Hatanaka (1974). To proceed, it
is convenient to reparametrize expression (4) as:

r_t = β r_{t−1} + ξ_t + γ_1 ξ_{t−1} + γ_2 ξ_{t−2} + … + γ_{M+1} ξ_{t−M−1} (7)

where (3) yields the following parametric restrictions: γ_1 = θ_1 − β, γ_i = θ_i − β θ_{i−1} for i = 2, …, M, γ_{M+1} = −β θ_M, and ξ_t is a white noise. The above parametrization implies that the observable return admits an ARMA(1, M+1) representation. This is consistent with aggregation results from the time series literature (see e.g. Harvey (1981), pp. 42-43, who points out that, generically, the combination of an AR(P) and an MA(Q) process yields a mixed ARMA(P, P + Q) process). From the estimation standpoint, an important issue is the determination of the moving-average lag length. The S&P/Case-Shiller Home Price Indices Methodology (2015), pp. 6 indicates that: “The index point for each reporting month is based on sales pairs for that month and the preceding two months” so that M = 2 would appear adequate. But this choice may overlook spurious correlation induced by transactions from month t reported late in month t+1. The paper settles for M = 3, which addresses the issue10. Finally, it is important to note that the resulting ARMA(1, 4) is not unrestricted. Rather, the parametrization in (7) imposes a testable non-linear constraint11 on the MA coefficients γ_i, i = 1, 2, 3, 4. In spite of this non-linear constraint, the ARMA representation is convenient for its unrestricted estimation may provide a valuable robustness check of specifications. This route is further explored in Section 4.2.
Regardless, FGLS estimation may further proceed along the two steps described by
Greene (1993) Section 15.7.3 pp. 435. The first step estimates the predictive regression in (7) by
10 The empirical work also shows more economically plausible estimation results when using M = 3.
11 More specifically, algebra shows that γ_4 + β γ_3 + β² γ_2 + β³ γ_1 + β⁴ = 0.
using the instrumental variable (IV) method. In this particular setting, an instrument is a variable that is both correlated with the observable lagged index return r_{t−1} (relevance) and uncorrelated with the disturbance (viability). A good instrument should be both relevant and viable. For each S&P/Case-Shiller city, the paper proposes to use contemporaneous returns on the other S&P/Case-Shiller cities as instruments (hence 13 instruments for each city).
The economic rationale for this choice is as follows. First, relevance is likely to be met as long as S&P/Case-Shiller cities are contemporaneously cross-correlated (see Cotter and Roll (2015) for related empirical evidence on “increased market integration”). Practically, relevance is usually assessed by examining the statistical fit of contemporaneous regressions of each city’s returns on its 13 counterparts. Viability, on the other hand, is always a delicate matter with the IV method, for the correlation between lagged instrument returns and the city-specific disturbance is, by construction, not observable. Conceptually however, one may still make an intuitive argument as to why lagged instruments’ returns constitute viable instruments. Indeed, temporal aggregation may be construed as an idiosyncratic phenomenon related to city-specific patterns in paired transactions. Consequently, measurement error embedded in observed instruments’ returns from other cities should be minimally correlated with the unobservable local index’s disturbances. Section 4.2 buttresses this argument by reporting statistical sample evidence of orthogonality in the form of standard over-identification tests.
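The logic of instrumenting one city’s lagged return with another, cross-correlated city can be sketched on synthetic data. Everything below is illustrative (a single instrument city instead of 13, made-up parameter values, and city-specific aggregation noise so the instrument is viable by construction):

```python
import numpy as np

rng = np.random.default_rng(2)
T, beta = 100_000, 0.9

def ols_slope(x, y):
    """Slope of a univariate OLS regression of y on x (with intercept)."""
    xd, yd = x - x.mean(), y - y.mean()
    return (xd @ yd) / (xd @ xd)

# Two cities share a common shock, so their returns are cross-correlated
common = rng.normal(0.0, 1.0, T)
def true_city():
    shocks = common + rng.normal(0.0, 0.5, T)   # common + idiosyncratic part
    r = np.zeros(T)
    for t in range(1, T):
        r[t] = beta * r[t - 1] + shocks[t]
    return r

def aggregate(r):
    """Add city-specific MA(2) aggregation noise (illustrative coefficients)."""
    eta = rng.normal(0.0, 1.5, T)
    u = eta.copy()
    u[1:] += -0.5 * eta[:-1]
    u[2:] += -0.3 * eta[:-2]
    return r + u

ra_obs = aggregate(true_city())     # city of interest
rb_obs = aggregate(true_city())     # "instrument city"

y, x, z = ra_obs[1:], ra_obs[:-1], rb_obs[:-1]

# 2SLS: stage one projects the endogenous lag on the instrument (reduced form);
# stage two regresses the return on the stage-one fitted values
stage1 = ols_slope(z, x)
xhat = z * stage1                   # fitted values (intercept drops out)
b_iv = ols_slope(xhat, y)
b_ols = ols_slope(x, y)
print(f"OLS: {b_ols:.2f}   IV: {b_iv:.2f}   true beta: {beta}")
```

Because the instrument city’s aggregation noise is independent of the local disturbance while its true return loads on the common factor, the IV slope recovers the persistence coefficient that OLS attenuates.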
Theoretically, in contrast to OLS, the first step IV method delivers slope estimates that are asymptotically consistent albeit biased in small samples. Consequently, estimated residuals from the IV first pass estimation may be used to construct estimates of the γ_i’s. Those estimates, denoted γ̂_i, are obtained by regressing the estimated residuals on their four lagged values.
With those estimates in hand, the second step regression consists in estimating by OLS the following modified linear model:

r_t = β r_{t−1} + γ_1 v̂_{t−1} + γ_2 v̂_{t−2} + γ_3 v̂_{t−3} + γ_4 v̂_{t−4} + ξ_t (8)

where v̂_t denotes the first step IV residuals. Expression (8) is reminiscent of the traditional Cochrane-Orcutt (1949) modified regression procedure that is typically estimated to account for serial correlation in the residuals of a standard OLS regression. The FGLS procedure delivers efficient estimates of the persistence slope β as well as the temporal aggregation moving average coefficients γ_i. Those efficient estimates are denoted β̃ and γ̃_i. Correct asymptotic standard errors for all estimators, including the efficient β̃, are obtained as usual from the OLS regression. Hatanaka shows that the estimates are asymptotically equivalent to maximum likelihood estimates. Estimation results are further reported.
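A schematic sketch of the two-step procedure on synthetic data follows. It is deliberately simplified relative to Hatanaka’s exact recipe (single instrument, no small-sample refinements), and every parameter value is illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
T, beta, L = 120_000, 0.9, 4

def ols(X, y):
    """OLS coefficients with an intercept prepended."""
    X1 = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef

# Observed returns for two cross-correlated cities: AR(1) truth plus
# city-specific MA(3) aggregation noise (all values illustrative)
common = rng.normal(0.0, 1.0, T)
def observed_city():
    shocks = common + rng.normal(0.0, 0.5, T)
    r = np.zeros(T)
    for t in range(1, T):
        r[t] = beta * r[t - 1] + shocks[t]
    eta = rng.normal(0.0, 1.5, T)
    u = eta.copy()
    for i, th in enumerate([-0.5, -0.3, -0.1], start=1):
        u[i:] += th * eta[:-i]
    return r + u

ra, rb = observed_city(), observed_city()
y, x, z = ra[1:], ra[:-1], rb[:-1]          # return, its lag, the instrument

# Step 1: single-instrument IV slope (ratio of covariances) and its residuals
zd = z - z.mean()
b_iv = (zd @ (y - y.mean())) / (zd @ (x - x.mean()))
resid = y - b_iv * x

# Auxiliary regression: residuals on their four lags -> gamma-hat estimates
lags = np.column_stack([resid[L - i: len(resid) - i] for i in range(1, L + 1)])
gam = ols(lags, resid[L:])[1:]

# Step 2: OLS of the return on its lag and the four lagged residuals
coef = ols(np.column_stack([x[L:], lags]), y[L:])
b_fgls = coef[1]
print(f"step 1 (IV) slope: {b_iv:.2f}   step 2 (FGLS) slope: {b_fgls:.2f}")
```

The step-1 slope is close to the true β, and the step-2 slope stays in the same neighborhood while the lagged-residual terms absorb the serial correlation that temporal aggregation leaves in the errors.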
4. Empirical Results
4.1 Estimation results
The data used in the empirical work consists of seasonally adjusted returns on 14 S&P/Case-Shiller city indices from inception in January 1987 to December 2015 (N = 348). The
first group of 10 cities includes Boston, Chicago, Denver, Las Vegas, Los Angeles, Miami, New
York, San Diego, San Francisco, Washington DC all members of the capitalization weighted
S&P/Case-Shiller CS10 Composite index. The second group of 4 cities includes Charlotte,
Cleveland, Portland and Tampa.
The empirical work begins by estimating the OLS predictive regression of monthly S&P/Case-Shiller indices’ returns on their first lagged value. Results are displayed in Table 1.
All cities show statistically significant persistence with slopes ranging from 0.35 (R2 =0.12) for
Cleveland to 0.94 (R2 =0.88) for Los Angeles. The monthly median slope across markets stands
at 0.83. By comparison, Guren (2015) reports a median annual slope across cities of 0.60.
Despite the fact that Guren’s study covers more markets (103 cities) and uses a different repeat
sale index (namely the Corelogic index), the two figures are difficult to reconcile. Indeed, for a first-order autoregressive process, a monthly slope of 0.83 translates into an annual slope of 0.83^12 ≈ 0.11. In this respect, the estimated monthly OLS slopes appear downward biased with respect to their annual counterparts.
Table 2 displays estimation results for first step IV estimation of the predictive
regression. The striking result is that all slopes are sharply shifted closer to unity. Technically,
instrumental variable estimation follows the two-stage least squares (2SLS) procedure. In the first stage, typically referred to as the reduced-form model, lagged returns for each city are projected on the other 13 cities’ lagged returns. In the second stage, the projections (i.e. the fitted values) from the first stage are used with OLS as regressors in place of the lagged returns. Those projections may be construed as “purged” versions of the endogenous variable (i.e. the lagged
returns). Figuratively, first step IV estimation filters out the temporal aggregation noise which is
transferred into the residuals. As a consequence, robust economic persistence, consistent with
low-frequency figures reported in the literature, is revealed. More specifically, all S&P/Case-
Shiller cities show highly significant estimated slopes ranging from 0.79 (R2 =0.29) for
Cleveland to 1.00 (R2 =0.68) for New York City where the latter is likely due to sampling
variability as the standard error of about 0.025 does not preclude an actual slope below unity.
However, notice that the statistical fit of the reduced-form model as measured by its R2 statistic appears weaker for two cities, namely Charlotte (R2 = 0.32) and Cleveland (R2 = 0.29). The latter is confirmed by the reduced-form model’s F-test statistic, which stands at 8.12 for Charlotte and 6.77 for Cleveland, definitely lower than the value of 10 informally suggested by Staiger and Stock (1997) to indicate weak instruments. This is a concern as the technical literature on IV estimation (e.g. Stock and Yogo (2005)) broadly establishes that weak (irrelevant) instruments result in both severely biased IV slope point-estimates and asymptotic standard errors that have poor finite sample accuracy.
Table 3 shows coefficients for the auxiliary regression of first step IV residuals on their four lagged values (i.e. estimates of the γ_i’s). The estimates are typically negative and significant. Additionally, the first-order serial correlation ρ̂_1 of the residuals is calculated using the MA autocovariance function12 formula. As one may expect, ρ̂_1 is negative for all cities, confirming the economic intuition that temporal aggregation is mostly a short-lived phenomenon.
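The autocovariance formula of footnote 12 is mechanical to apply. The coefficient values below are hypothetical, chosen only to show that negative MA coefficients map into a negative first-order serial correlation:

```python
# First-order serial correlation implied by MA(4) coefficients g1..g4
# (theta_0 = 1 is implicit): rho_1 = gamma(1) / gamma(0).
def ma_rho1(g):
    num = g[0] + sum(g[i] * g[i + 1] for i in range(len(g) - 1))
    den = 1.0 + sum(c * c for c in g)
    return num / den

print(round(ma_rho1([-0.4, -0.2, -0.1, -0.05]), 3))   # -0.243
```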
Table 4 displays results for second step FGLS estimation. Several observations are in order. First, the efficient slope estimates for persistence are slightly lower than for first step IV in Table 2. Intuitively, in the modified regression, the γ̃_i coefficients, i = 1, 2, 3, 4, pick up some of the variability attributed to the biases in expression (6). It is nevertheless the case that most S&P/Case-Shiller cities show slopes above 0.90 with a median standing at about 0.94, which more closely matches the annual median slope of 0.60 from Guren (2015) as 0.94^12 ≈ 0.48.
12 Following Harvey (1981) pp. 36, ρ_1 = (γ_1 + γ_1 γ_2 + γ_2 γ_3 + γ_3 γ_4) / (1 + γ_1² + γ_2² + γ_3² + γ_4²). The variance may also be computed as σ̂_u² = σ̂_η² (1 + γ_1² + γ_2² + γ_3² + γ_4²), where σ̂_η² is the sample estimate of the residuals’ variance from the first step IV auxiliary regression.
Second, for most cities, the γ̃_i coefficients are typically not significant, which indicates that the second step efficient estimates are statistically not much different from the first step ones. Table 4 also reports estimates of the signal-to-noise ratio13 for each S&P/Case-Shiller city. Observe that the estimated ratio is close to zero for two cities, namely Charlotte and Cleveland. Those results are consistent with the low F-tests reported in Table 2. Taken at face value, they suggest that, perhaps, the chosen instruments are weak for Charlotte and, even more so, Cleveland. This conjecture is further explored in Section 4.2, where an unconstrained ARMA(1, 4) specification is estimated.
Third, a Wald test of specifications is reported in Table 4. This test uses the γ̃ estimates to test the nonlinear constraint attached to the specification restrictions on the MA coefficients14 from expression (7), Section 3.2. Overall, the displayed P-values suggest that the restrictions imposed on the MA coefficients are not rejected by the data. Caution is in order, however, for it is well known that non-linear Wald tests perform poorly in small samples. Overall, the picture that
emerges from FGLS estimation is that, after filtering for statistical noise induced by temporal
aggregation, high-frequency persistence is remarkably uniformly robust across the 14 surveyed
S&P/Case-Shiller cities.
4.2 Robustness Checks
This section entertains two distinct robustness checks. First, the modelling premises under which the analysis is carried out in this work are statistically tested. To this end, Table 5 presents both a test of endogeneity of the regressor in the predictive regression (the robust DWH, Durbin-Wu-Hausman, test) as well as a test of over-identification (the robust score test). The
13 Using the orthogonality assumption from (2), one obtains σ*²/σ_u² = σ_r²/σ_u² − 1, so that sample estimates for σ_r² and σ_u² may be substituted.
14 For the modified model, the restriction condition is: γ̃_4 + β̃ γ̃_3 + β̃² γ̃_2 + β̃³ γ̃_1 + β̃⁴ = 0.
endogeneity test statistically validates the main premise of the paper: temporal aggregation induced by the moving-average computation rule results in regressor endogeneity. At the 5% significance level, the displayed P-values reject the null hypothesis of exogeneity for all surveyed S&P/Case-Shiller cities. The over-identification test assesses the proposed instruments’ validity by testing their orthogonality with post-estimation residuals. For all surveyed cities, the reported P-values show that the orthogonality constraints are not rejected at the 5% significance level.
Next, based on the parametrization discussion of expression (7) in Section 2.2, an
unconstrained ARMA (1, 4) model is estimated for each of the 14 surveyed cities. The
unconstrained ARMA specification is almost surely misspecified but, unlike the two-step
procedure, it does not require the explicit use of instruments. It thus avoids the weak
instrument problem suspected for some cities. Table 6 reports robust estimation results of
the ARMA (1, 4) specifications. It is noteworthy that the AR coefficient stands at 0.95 for
Charlotte and 0.98 for Cleveland, further cementing the view that their (relatively) lower FGLS
slope estimates were due to weak instruments15. Also, the median AR coefficient stands at about
0.96, which, as previously argued, is almost perfectly in line with Guren's (2015) low-frequency
estimate. Finally, Table 7 displays the usual information criteria for three alternative specifications:
OLS, two-step FGLS and ARMA (1, 4). As always with information criteria, lower is
better. The salient point here is that the parsimonious ARMA (1, 4) typically does best, followed
by FGLS and lastly OLS. Hence, for forecasting purposes, the ARMA (1, 4) specification should be
favored.
4. Concluding Remarks
15 Economically, the implication is that both cities are mildly correlated with other S&P/Case-Shiller cities.
The paper provides a comprehensive econometric framework to analyze the impact of
temporal aggregation induced by the moving-average computation rule on the S&P/Case-Shiller
indices’ monthly returns. The main contribution is to uncover that temporal aggregation is a
short-lived phenomenon that does not impact the robust economic persistence of the indices.
This runs counter to the extant literature (e.g. Cotter and Roll (2014)), which argues that purging the
indices of temporal aggregation should result in subdued persistence. The findings also
reconcile the high frequency evidence of monthly predictive regressions with their low
frequency, annual counterpart. The paper argues that temporal aggregation lowers monthly OLS
predictive slopes by inducing both an attenuation bias and an additive bias. The magnitude of
those biases is city-specific. Once the noise has been filtered out however, the big picture is that,
at high frequency, all S&P/Case-Shiller indices are robustly persistent.
There are some empirical issues that could be addressed on the basis of the methodology
developed by this paper. For instance, Glaeser, Gyourko, Morales and Nathanson (2014) claim
that “most variation in housing price changes is local, not national”. Using the advocated FGLS
two-step approach, or its companion ARMA parametrization, to orthogonalize the S&P/Case-
Shiller indices’ returns into signal and shock, such a view may be further investigated. The focus
of the literature has legitimately been on rationalizing the tremendous persistence of housing
markets. However, from the empirical standpoint, shocks are nearly as important. Indeed,
because of the documented high-frequency momentum, a large shock to the housing market
today translates into a slowly decaying trend for many months to come. Consequently, a better
understanding of the economic nature of shocks in terms of whether they are driven by a
common factor structure and/or conditioned by economic fundamentals may be fruitful. In the
end, even if substantial inefficiencies plague them, it may still be the case that, marginally,
residential real estate markets react to the same broad economic stimuli. This is left for future
research.
Table 1: OLS Monthly Predictive Regressions
This table shows OLS coefficients of SCS monthly log-level changes regressed on their first lag. Robust t-statistics (in parentheses) are calculated using the Newey-West HAC covariance matrix with 4 lags.
City Cons. Slope Adj.R2
Boston 0.00066 0.76 0.58
(2.80) (17.59)
Chicago 0.00076 0.69 0.48
(2.04) (11.20)
Denver 0.00092 0.75 0.57
(3.73) (18.73)
Las Vegas 0.00027 0.87 0.76
(0.83) (19.65)
Los Angeles 0.00025 0.94 0.88
(1.13) (44.52)
Miami 0.00031 0.90 0.81
(1.09) (23.23)
New York City 0.00032 0.86 0.75
(1.89) (29.78)
San Diego 0.00063 0.84 0.71
(1.62) (14.68)
San Francisco 0.00054 0.88 0.77
(1.50) (27.87)
Washington DC 0.00041 0.87 0.77
(1.72) (26.90)
Charlotte 0.00116 0.47 0.22
(3.29) (5.67)
Cleveland 0.00133 0.35 0.12
(3.37) (4.62)
Portland 0.00114 0.75 0.55
(2.94) (13.21)
Tampa 0.00043 0.83 0.68
(1.47) (19.45)
Table 2: Instrumental Variable Regression: First Step
This table shows 2SLS coefficients of SCS monthly log-level changes regressed on their first lag. The instruments are the lagged log-level changes of all cities but the estimated one. Robust t-statistics (in parentheses) are calculated using the Newey-West HAC covariance matrix with 4 lags. The F-Stat and Adj.R2 statistics are from the reduced-form model estimation (the “first stage” of the 2SLS estimation).
City Cons. Beta F-Stat Adj.R2
Boston 0.00004 0.99 38.14 0.59
(0.19) (26.83)
Chicago 0.00001 0.99 30.03 0.54
(-0.01) (20.81)
Denver 0.00009 0.98 21.44 0.49
(0.34) (18.09)
Las Vegas -0.00001 0.99 18.33 0.66
(-0.05) (33.06)
Los Angeles 0.00016 0.96 74.48 0.87
(0.79) (45.85)
Miami 0.00004 0.98 61.76 0.82
(0.18) (34.79)
New York City -0.00005 1.00 30.87 0.68
(-0.31) (39.38)
San Diego 0.00007 0.98 126.82 0.82
(0.35) (39.00)
San Francisco 0.00023 0.95 35.83 0.75
(0.72) (32.25)
Washington DC 0.00012 0.96 51.83 0.79
(0.57) (36.23)
Charlotte 0.00016 0.93 8.12 0.32
(0.71) (12.29)
Cleveland 0.00043 0.79 6.77 0.29
(1.12) (6.37)
Portland 0.00012 0.98 23.53 0.56
(0.57) (31.35)
Tampa 0.0001 0.96 59.87 0.78
(0.44) (27.73)
Table 3: Instrumental Variable Lagged Residuals Regression
This table reports coefficients of the first-step Instrumental Variable residuals regressed on their first 4 lags. The last column is an estimate of the first-order serial correlation of the unobserved temporal aggregation process. Student t-statistics are reported in parentheses.
City Lag 1 Lag 2 Lag 3 Lag 4 Ser. Corr.
Boston -0.30 -0.14 -0.42 -0.09 -0.12
(-5.62) (-2.77) (-8.19) (-1.69)
Chicago -0.32 -0.06 -0.33 -0.13 -0.19
(-5.95) (-1.10) (-6.27) (-2.48)
Denver -0.37 -0.05 -0.37 -0.07 -0.24
(-6.8) (-1.00) (-6.94) (-1.30)
Las Vegas -0.33 -0.11 -0.21 -0.10 -0.21
(-6.22) (-2.03) (-3.89) (-1.82)
Los Angeles -0.10 0.01 -0.10 0.17 -0.12
(-1.91) (0.25) (-1.95) (3.14)
Miami -0.26 0.00 -0.40 -0.16 -0.16
(-4.86) (0.08) (-7.70) (-2.91)
New York City -0.34 -0.01 -0.22 -0.11 -0.26
(-6.20) (-0.20) (-4.02) (-1.95)
San Diego -0.47 -0.07 -0.22 -0.05 -0.32
(-8.62) (-1.22) (-3.72) (-0.85)
San Francisco -0.11 0.05 -0.22 0.01 -0.12
(-2.01) (0.85) (-4.11) (0.23)
Washington DC -0.29 0.02 -0.20 -0.07 -0.25
(-5.3) (0.40) (-3.56) (-1.23)
Charlotte -0.56 -0.18 -0.45 -0.21 -0.18
(-10.63) (-3.17) (-7.93) (-4.03)
Cleveland -0.43 0.01 -0.42 -0.24 -0.24
(-8.21) (0.16) (-7.90) (-4.53)
Portland -0.43 -0.14 -0.42 -0.21 -0.16
(-8.13) (-2.58) (-7.89) (-3.95)
Tampa -0.35 -0.10 -0.41 -0.10 -0.18
(-6.40) (-1.86) (-7.89) (-1.77)
Table 4: Instrumental Variable FGLS Estimation: Second Step
This table shows feasible GLS coefficients where transformed variables are used to account for fourth-order serial correlation in the first-step residuals. Robust t-statistics (in parentheses) are calculated using the Newey-West HAC covariance matrix with 4 lags. A non-linear Wald test (Chi-Square (1)) of the specification restrictions is displayed, with P-values reported in parentheses.
City Cons. Slope Lag 1 Lag 2 Lag 3 Lag 4 Wald SNR Adj.R2
Boston 0.00036 0.93 0.03 0.03 0.03 0.03 0.41 1.17 0.88
(1.46) (44.39) (0.44) (0.40) (0.45) (0.56) (0.52)
Chicago 0.00049 0.89 0.05 0.05 0.05 0.04 1.34 0.83 0.81
(1.40) (25.39) (0.76) (0.81) (0.72) (0.75) (0.25)
Denver 0.00052 0.93 0.03 0.03 0.03 0.03 0.52 1.50 0.87
(1.93) (35.75) (0.53) (0.54) (0.46) (0.50) (0.47)
Las Vegas 0.00024 0.96 0.02 0.02 0.02 0.02 0.14 3.13 0.92
(0.70) (42.58) (0.26) (0.21) (0.28) (0.40) (0.70)
Los Angeles 0.00022 0.94 0.02 0.02 0.02 0.02 0.21 7.66 0.90
(0.97) (37.84) (0.28) (0.30) (0.28) (0.29) (0.65)
Miami 0.00018 0.97 0.01 0.01 0.01 0.01 0.01 4.10 0.95
(0.61) (52.19) (0.07) (0.07) (0.06) (0.08) (0.91)
New York City 0.00017 0.95 0.02 0.02 0.02 0.02 0.32 2.84 0.91
(0.84) (48.68) (0.32) (0.36) (0.34) (0.35) (0.57)
San Diego 0.00031 0.96 0.02 0.02 0.02 0.02 0.12 2.36 0.92
(0.99) (59.46) (0.17) (0.23) (0.22) (0.38) (0.73)
San Francisco 0.00046 0.92 0.03 0.02 0.02 0.03 0.33 3.34 0.85
(1.26) (27.37) (0.38) (0.28) (0.39) (0.44) (0.56)
Washington DC 0.00024 0.94 0.01 0.02 0.02 0.01 0.15 3.25 0.90
(0.95) (45.04) (0.22) (0.25) (0.25) (0.24) (0.7)
Charlotte 0.00052 0.90 0.02 0.02 0.02 0.02 0.17 0.04 0.81
(1.76) (27.75) (0.26) (0.35) (0.36) (0.32) (0.68)
Cleveland 0.00081 0.81 -0.01 -0.01 -0.01 -0.01 0.03 0.03 0.64
(1.80) (11.00) (-0.12) (-0.12) (-0.12) (-0.16) (0.86)
Portland 0.00056 0.95 0.02 0.02 0.02 0.02 0.13 0.92 0.90
(1.73) (46.54) (0.22) (0.23) (0.27) (0.27) (0.71)
Tampa 0.00022 0.96 0.00 0.00 0.00 0.00 0.01 2.06 0.92
(0.77) (45.81) (0.03) (0.03) (0.03) (0.03) (0.97)
Table 5: Endogeneity and Overidentification Specification Tests
This table reports specification tests of endogeneity and of overidentifying restrictions. The endogeneity test is the robust Durbin-Wu-Hausman (DWH) test; the null hypothesis is that the independent variable is exogenous. The over-identifying restriction test is the Wooldridge (1995) robust score test; the null hypothesis is that the instruments are valid. Each test statistic is followed by its P-value, calculated using the Newey-West HAC covariance matrix with 4 lags.
City Endo. P-val. Overid. P-val.
Boston 36.98 0.00000 17.64 0.12698
Chicago 58.77 0.00000 10.29 0.59048
Denver 21.78 0.00000 15.93 0.19441
Las Vegas 23.09 0.00000 17.97 0.11675
Los Angeles 6.04 0.01452 14.07 0.29658
Miami 28.21 0.00000 11.29 0.50448
New York City 54.15 0.00000 13.31 0.34694
San Diego 19.35 0.00001 11.41 0.49407
San Francisco 11.4 0.00082 20.51 0.05804
Washington DC 21.7 0.00000 15.22 0.2297
Charlotte 28.68 0.00000 15.16 0.23297
Cleveland 10.53 0.00129 13.97 0.30288
Portland 74.77 0.00000 13.92 0.30584
Tampa 74.32 0.00000 19.39 0.07959
Table 6: ARMA Estimation
This table reports ARMA (1, 4) estimation results. Robust t-statistics are displayed in parentheses.
City Cons. AR MA 1 MA 2 MA 3 MA 4 Std. Err.
Boston 0.0031 0.97 -0.32 -0.12 -0.47 0.22 0.00379
(1.59) (68.92) (-5.17) (-2.16) (-9.33) (2.6) (6.60)
Chicago 0.00314 0.96 -0.38 -0.02 -0.36 0.04 0.00521
(1.26) (33.91) (-5.71) (-0.29) (-6.36) (0.75) (4.79)
Denver 0.00352 0.97 -0.37 0.00 -0.43 0.15 0.00311
(1.88) (48.34) (-6.06) (0.00) (-7.86) (3.03) (7.60)
Las Vegas 0.00285 0.94 -0.29 0.08 -0.19 0.11 0.00618
(0.78) (32.6) (-3.71) (0.80) (-2.01) (1.87) (3.07)
Los Angeles 0.00445 0.94 -0.07 0.04 -0.08 0.18 0.00375
(1.44) (31.27) (-0.95) (0.60) (-1.12) (3.33) (7.79)
Miami 0.00334 0.97 -0.23 0.16 -0.44 0.12 0.00418
(0.99) (47.47) (-2.36) (2.42) (-6.26) (1.47) (4.74)
New York City 0.00358 0.96 -0.30 0.12 -0.35 0.11 0.00302
(1.21) (32.41) (-4.27) (1.73) (-3.91) (1.73) (0.05)
San Diego 0.00437 0.94 -0.45 0.16 -0.26 0.19 0.0052
(1.44) (48.34) (-4.35) (1.81) (-2.42) (2.55) (7.67)
San Francisco 0.00468 0.9 -0.07 0.07 -0.20 0.09 0.00567
(1.73) (19.35) (-0.80) (0.84) (-3.85) (1.49) (3.09)
Washington DC 0.00392 0.96 -0.28 0.08 -0.24 0.05 0.00381
(1.55) (43.9) (-4.01) (1.23) (-4.04) (0.96) (5.49)
Charlotte 0.00249 0.95 -0.58 0.08 -0.46 0.25 0.00356
(2.59) (27.38) (-6.88) (0.92) (-9.65) (4.34) (7.23)
Cleveland 0.00237 0.98 -0.69 0.10 -0.49 0.18 0.00493
(2.03) (63.78) (-6.70) (0.58) (-5.35) (1.20) (1.12)
Portland 0.00464 0.95 -0.41 0.10 -0.43 0.20 0.00432
(2.49) (35.38) (-4.59) (1.60) (-6.71) (2.18) (9.49)
Tampa 0.00303 0.96 -0.33 0.05 -0.42 0.22 0.00462
(1.07) (41.78) (-5.00) (0.74) (-8.31) (4.70) (3.37)
Table 7: Information Criteria
This table reports the AIC (Akaike Information Criterion) and the BIC (Bayesian Information Criterion) for the OLS, FGLS and ARMA (1, 4) specifications respectively.
OLS FGLS ARMA(1,4)
City AIC BIC AIC BIC AIC BIC
Boston -2796.28 -2788.58 -2801.93 -2786.59 -2868.45 -2841.51
Chicago -2616.20 -2608.51 -2619.96 -2604.63 -2649.18 -2622.23
Denver -2929.27 -2921.58 -2967.33 -2951.99 -3006.16 -2979.21
Las Vegas -2503.14 -2495.45 -2501.47 -2486.13 -2529.04 -2502.09
Los Angeles -2858.97 -2851.28 -2842.12 -2826.78 -2875.4 -2848.45
Miami -2722.11 -2714.42 -2759.92 -2744.58 -2799.97 -2773.02
New York City -2978.15 -2970.45 -2971.97 -2956.63 -3026.62 -2999.68
San Diego -2566.44 -2558.75 -2609.22 -2593.88 -2649.31 -2622.37
San Francisco -2570.54 -2562.85 -2554.19 -2538.85 -2589.18 -2562.24
Washington DC -2828.38 -2820.69 -2828.50 -2813.16 -2864.96 -2838.02
Charlotte -2819.67 -2811.98 -2870.50 -2855.16 -2912.67 -2885.73
Cleveland -2602.47 -2594.78 -2655.42 -2640.08 -2687.09 -2660.14
Portland -2701.78 -2694.09 -2735.67 -2720.33 -2778.36 -2751.42
Tampa -2645.99 -2638.30 -2697.04 -2681.70 -2731.80 -2704.85
References
Case, K., and R., Shiller, 1987, “Price for Single-Family Homes since 1970: New Indexes for Four Cities,” New England Economic Review, September/October, 45-56
Case, K. and R., Shiller, 1989, “The Efficiency of the Market for Single Family Homes”, American Economic Review, 1, 125-137
Case, K. and R., Shiller, 1990, "Forecasting Prices and Excess Returns in the Housing Market" AREUEA Journal, 18, 253–273.
Cochrane, D. and G. H. Orcutt, 1949, "Application of Least Squares Regression to Relationships Containing Auto-Correlated Error Terms". Journal of the American Statistical Association, 44, 32–61
Cotter, J., S. Gabriel and R. Roll, 2015, “Can Housing Risk be diversified? A Cautionary Tale from the Housing Boom and Bust”, Review of Financial Studies, 28, 913-936
Cotter, J. and R., Roll, 2014, “A Comparative Anatomy of Residential REITS and Private Real Estate Markets: Returns, Risks and Distributional Characteristics”, Real Estate Economics, 1-32
Ghysels, E., A. Plazzi, W. Torous and R. Valkanov, 2013, “Forecasting Real Estate Prices”, Handbook of Economic Forecasting, 2 Amsterdam: North-Holland
Glaeser, E., J. Gyourko, E. Morales and C. Nathanson, 2014, “Housing dynamics: An urban approach”, Journal of Urban Economics, 81, 45-56
Glaeser, E. and C. Nathanson, 2015, “An Extrapolative Model of House Price Dynamics”, National Bureau of Economic Research Working Paper
Greene, W. H., 1993, “Econometric Analysis”, 2nd Edition, McMillan Publishing Company
Guren, A. M., 2015, “The Causes and Consequences of House Price Momentum”, Working Paper
Harvey, A.C., 1981, “Time Series Models”, John Wiley & Sons, New York.
Harvey, A.C., 1990, “The Econometric Analysis of Time Series”, MIT Press, Cambridge, Massachusetts.
Hatanaka, M., 1974,”An Efficient Estimator for the Dynamic Adjustment Model with Autoregressive Disturbances”, Journal of Econometrics, 2, 199-220
Korteweg, A. and M. Sorensen, 2016, “Estimating Loan to Value Distributions”, Real Estate Economics, 44, 41-86
Novy-Marx, R., 2009, “Hot and Cold Markets”. Real Estate Economics, 37, 1-22
Shiller, R., 1991, “Arithmetic Repeat Sale Price Estimators”, Journal of Housing Economics, 1, 110-126
Shiller, R., 2013, “Speculative Asset Prices”, Nobel Prize Lecture
Silverstein, J., 2014, “House Price Indexes: Methodology and Revisions”, Federal Reserve Bank of Philadelphia, Working Paper
Stein, J., 1995, “Prices and Trading Volume in the Housing Market: A Model with Down-Payment Effects”, The Quarterly Journal of Economics, 110, 379-406
S&P Dow Jones Indices, 2015, “S&P/Case-Shiller Home Price Indices Methodology”, Technical Documentation
Staiger, D. and J. H. Stock, 1997, “Instrumental Variable Regression with Weak Instruments”, Econometrica, 65, 557-586
Stock, J. H. and M. Yogo, 2005, “Testing for Weak Instruments in Linear IV Regression”, in Identification and Inference for Econometric Models: Essays in Honor of Thomas Rothenberg, ed. D. W. K. Andrews and J. H. Stock, 80-108, Cambridge University Press
Wooldridge, J. M., 1995, “Score Diagnostics for Linear Models Estimated by Two Stage Least Squares”, in Advances in Econometrics and Quantitative Economics: Essays in Honor of Professor C. R. Rao, ed. G. S. Maddala, P. C. B. Phillips and T. N. Srinivasan, 66-87, Blackwell, Oxford