Hedonic analysis with locally weighted regression - Agricultural and

Hedonic analysis with locally weighted regression: An application to the shadow cost of housing regulation in Southern California

David L. Sunding a, Aaron M. Swoboda b,⁎

a University of California, Berkeley, United States
b Carleton College, United States

Article info

Article history:
Received 16 September 2008
Received in revised form 7 July 2010
Accepted 12 July 2010
Available online 13 August 2010

JEL classification: R14, R21, R31, R38

Keywords: Hedonic analysis; Housing regulation; Land price; Locally weighted regression; Geographically weighted regression; Geo-referenced data; Semi-parametric econometrics

Abstract

This paper investigates the role of hedonic model misspecification through inappropriate geographic aggregation in the debate over the effects of housing regulation. We use locally weighted regression (LWR) techniques and geo-referenced data to allow the housing hedonic parameters to vary over space. This modeling strategy better represents micro-market realities and the importance of location as a prime determinant of housing prices. Our results, based on a unique dataset of almost 14,000 single-family home sales between 1993 and 2001 in Southern California, suggest regulation had strong direct impacts on the housing market as suggested by Glaeser and Gyourko (2003) and Cheung et al. (2009a) and not indirectly through increased land scarcity as suggested by Davis and Palumbo (2007).

© 2010 Elsevier B.V. All rights reserved.

1. Hedonic regression and housing regulation

Economists commonly apply the hedonic valuation method in housing markets to measure consumers' willingness to pay for non-market goods to aid in the policy-making process. A recent search for “hedonic” and “hous*” in EconLit¹ yielded over 600 peer-reviewed articles, over half of which have been published in the last ten years. Researchers perform hedonic analyses to value clean air (Harrison and Rubinfeld, 1978), open space (Irwin, 2002; Anderson and West, 2006), and access to quality schools (Brasington, 1999), to name but a few examples. Hedonic regression techniques also play an important role in the debate over the impact of land use regulation in the housing market.

The housing regulation debate is currently split into two camps: those who believe that the scarcity of developable land is the driving force in housing markets and those who believe that regulation directly affects the price of housing through such policies as limits on the number of houses constructed. The “land scarcity” researchers, such as Davis and Palumbo (2007), claim that house prices are simply the sum of construction costs and land prices, with land prices capitalizing the value of local (dis)amenities and scarcity (through geographic and regulatory constraints). Under this scenario regulation will affect housing prices through the land market, and changes in housing prices are primarily a function of changing land values. Glaeser and Gyourko (2003) typify the other camp, claiming that in many areas, notably the east and west coasts of the United States, regulation increases the price of housing by rationing the number of houses that can be built in an area. Such regulations drive a wedge between the price of housing and the sum of the land and structure value, creating an implicit tax, or shadow price, in the housing market.

Researchers and policy makers can determine whether a particular housing market is land- or housing-scarce by comparing the value of land at competing margins. A formal model is presented in the next section, but for now note that in a land-scarce market consumers' value of land for additional lot size will equal the value of land for additional housing, while in a housing-scarce market consumers' value of land for additional lot size will be lower than the value of land for additional housing construction. Glaeser and Gyourko (2003) estimate consumers' valuation of land for increased lot size by conducting hedonic regression analysis of housing prices as a function of lot size and other variables at the metropolitan level.

Regional Science and Urban Economics 40 (2010) 550–573

⁎ Corresponding author. 1 North College St, Northfield, MN 55057, United States.
E-mail address: [email protected] (A.M. Swoboda).

1 http://www.econlit.org.

0166-0462/$ – see front matter © 2010 Elsevier B.V. All rights reserved.
doi:10.1016/j.regsciurbeco.2010.07.002


They conclude that regulation has created a significant shadow price in housing markets for which their hedonic estimates are lower than their estimates of the value of land used to build additional housing. Thus, appropriate hedonic analysis is crucial to proper estimation and inference regarding the qualitative and quantitative impacts of regulation in the housing market.

While there is little theoretical guidance as to the appropriate level of data aggregation for the hedonic analysis, inappropriate data aggregation can lead to biased estimates of the marginal effect of land on housing prices. The two important conditions for this bias are: 1) variation in the price of land within an area in which it is assumed to be constant, and 2) consumers respond to higher land prices by purchasing houses with smaller lots. The neo-classical urban economic model² predicts exactly these conditions: continuously varying land prices to reflect changing local amenities and smaller lot sizes in areas with higher land prices, all other things equal. Section 5 of this paper shows over-aggregation bias will lead to over-rejection of the null hypothesis of no regulatory impacts, which means the claims of regulatory impacts by Glaeser and Gyourko (2003) and Cheung et al. (2009b) may in fact be the result of misspecified hedonic models.
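The two conditions for aggregation bias can be illustrated with a small numerical sketch. The two neighborhoods, their marginal land values, and their lot sizes below are hypothetical numbers chosen only for illustration; they are not drawn from the data used in this paper. Within each neighborhood the price of a house rises with lot size at the local land value, but the neighborhood with expensive land has smaller lots.

```python
import numpy as np

# Hypothetical neighborhoods (illustrative values, not from the paper's data):
# a high-amenity area with expensive land and small lots, and a
# low-amenity area with cheap land and large lots.
lots_a = np.array([4000.0, 5000.0, 6000.0])    # ft^2
lots_b = np.array([8000.0, 10000.0, 12000.0])  # ft^2
price_a = 300_000 + 10.0 * lots_a  # land worth $10/ft^2 at the margin
price_b = 200_000 + 2.0 * lots_b   # land worth $2/ft^2 at the margin


def ols_slope(x, y):
    """Slope from a least-squares fit of y on x with an intercept."""
    slope, _intercept = np.polyfit(x, y, 1)
    return slope


slope_a = ols_slope(lots_a, price_a)
slope_b = ols_slope(lots_b, price_b)

# Pooling imposes a single land value on both neighborhoods.
pooled_slope = ols_slope(
    np.concatenate([lots_a, lots_b]),
    np.concatenate([price_a, price_b]),
)

print(f"local slopes: {slope_a:.2f}, {slope_b:.2f}")
print(f"pooled slope: {pooled_slope:.2f}")
```

In this constructed example the pooled slope falls below both local slopes, so any shadow price computed from it would be overstated. How strong the effect is in practice depends on the actual spatial variation in land values.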

This paper advances the debate over the impact of regulation on the housing market by relaxing the common but important assumption that researchers know how to spatially parameterize the hedonic analysis. We use locally weighted regression (LWR) to allow consumers' marginal willingness to pay for land to explicitly vary over space. This spatially disaggregated econometric modeling technique more closely mimics real estate markets wherein land capitalizes local amenities rather than assuming a constant value of land across municipalities or metropolitan areas. As far as we can tell this paper is the first to explore the use of LWR to better understand regulatory impacts in the housing market. By allowing for hedonic parameter heterogeneity, we show that estimates of the shadow price of housing can be dramatically different than those predicted under other techniques, which suggests that future research should continue to investigate the other assumptions implicit in the empirical housing regulation literature.

2. A simple model of housing regulation

The implementation of LWR has the potential to substantially improve research on the economic impacts of housing regulation. We begin our application of LWR to this question by constructing a simple model of housing regulation and the land development problem. Imagine a profit-maximizing housing developer choosing the number of houses to build and the amount of land associated with each house in a particular neighborhood. Let $H$ be the total number of houses produced, $L$ be the quantity of land used per house, and $P$ represent the price of a housing bundle, including the house and lot. The function $P(\cdot)$ relates the price of the home to the amount of land associated with each house, a vector of neighborhood characteristics, $\alpha$, and the total number of houses built in the neighborhood, although we assume producers only directly affect the price through changes in lot sizes. Let the land market be defined by a fixed amount of developable land, $\bar{N}$. The developer's profit maximization problem is

$$\max_{L,H,\mu}\; \pi = P(L,\alpha)\,H - K(H) + \mu\left(\bar{N} - LH\right), \tag{1}$$

where $\mu$ is the price of land and $K(H)$ represents the structural construction costs as a function of the number of houses in the development. Subscripts denote partial derivatives. Optimal production occurs where $P - K_H - \mu L = 0$ and $P_L H - \mu H = 0$. The equilibrium house price is equal to the marginal construction and development costs plus the equilibrium price of the land it is built upon. Further, the price of land, denoted by $\mu$, is equal to consumers' marginal willingness to pay for lot space. Combining these equations yields the condition for optimal subdivision, namely,

$$\frac{P - K_H}{L} - P_L = 0. \tag{2}$$

In words, this condition requires that the value of land used to produce additional housing is equal to consumers' marginal valuation of yard space. In effect, consumers purchase yard space by bidding land away from new housing construction until their marginal valuation of land is equal to the returns to land from housing construction. If this equality did not hold, a profit-maximizing housing developer could increase its profits by devoting more land to the higher-valued use. Note that $P_L$ is a function of the local amenities, and to the degree that amenities vary over space (such as access to quality schools, transit, views of the surrounding landscape, noise, pollution, etc.) the value of land will also vary.
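For readers who want the intermediate step, the subdivision condition follows directly from the two first-order conditions of Eq. (1). The sketch below only rearranges them; it assumes an interior solution with $H > 0$ and $L > 0$.

```latex
\begin{align*}
\frac{\partial \pi}{\partial H} = 0 &\;\Rightarrow\; P - K_H - \mu L = 0
  \;\Rightarrow\; \mu = \frac{P - K_H}{L}, \\
\frac{\partial \pi}{\partial L} = 0 &\;\Rightarrow\; P_L H - \mu H = 0
  \;\Rightarrow\; \mu = P_L, \\
\text{so}\quad \frac{P - K_H}{L} - P_L &= 0 .
\end{align*}
```

Both expressions price the same unit of land, once as an input to an additional house and once as additional yard space, which is why they must coincide when only land is scarce.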

The equilibrium results of Brueckner (1983) can be used to show this subdivision condition is a direct result of extending the urban economics literature developed by Alonso (1964), Mills (1967, 1972), Muth (1969) and Beckmann (1969). This literature assumes that land is the scarce resource and accrues Ricardian rents. Somerville (1996) examines the contribution of land and structure to builder profits using a model of land scarcity and a micro-dataset of homebuilder land and structure expenditures. He finds a smaller markup on land expenditures than predicted under a system of land scarcity. The coefficient on land expenditures is not significantly different from one for a majority of the Metropolitan Statistical Areas in his study. Rosenthal (1999) concludes the implicit market for residential buildings is efficient in Vancouver, British Columbia from 1979 to 1989 since deviations between the price of new buildings and their construction costs are dissipated more quickly than the time required for new construction. He concludes that any inefficiency in the housing market must lie in the market for land.

We now examine the possibility that regulation places direct limits on the amount of new housing produced, a condition that we refer to in this paper as “regulatory rationing.” Under this scenario binding constraints create economic rents that cannot be captured in the land market, contrary to the standard assumption. Assume regulators constrain the maximum number of houses produced in a neighborhood to be $\bar{H}$ and let $\lambda$ represent the resulting shadow price of housing, the “implicit tax” of regulation. Eq. (1) is now replaced by

$$\max_{L,H,\mu,\lambda}\; \pi = P(L,\alpha)\,H - K(H) + \mu\left(\bar{N} - LH\right) + \lambda\left(\bar{H} - H\right). \tag{3}$$

The equilibrium relationships between the variables of interest are shown in Eq. (4). Consumers' willingness to pay for more land (which some papers refer to as the intensive margin value) and the value of land for new housing construction (the extensive margin value) are no longer equal. Rather, because regulation limits the number of houses produced, the returns to land used to construct additional housing are greater than consumers' valuation of land for additional yard space,³ or

$$\frac{P - K_H}{L} - P_L = \frac{\lambda}{L} > 0. \tag{4}$$

Rearranging Eq. (4) yields

$$\text{Shadow Price of Housing} = \text{House Price} - \text{Construction Costs} - P_L \times \text{Lot Size}. \tag{5}$$
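The step from the constrained problem in Eq. (3) to Eqs. (4) and (5) can be written out as follows. This is a sketch of the algebra only, assuming the housing constraint binds so that $\lambda > 0$, and reading $P$ as the house price, $K_H$ as marginal construction cost, and $L$ as lot size.

```latex
\begin{align*}
\frac{\partial \pi}{\partial H} = 0 &\;\Rightarrow\; P - K_H - \mu L - \lambda = 0, \\
\frac{\partial \pi}{\partial L} = 0 &\;\Rightarrow\; P_L H - \mu H = 0
  \;\Rightarrow\; \mu = P_L, \\
\text{hence}\quad \lambda &= P - K_H - P_L L
  \quad\Longleftrightarrow\quad \frac{P - K_H}{L} - P_L = \frac{\lambda}{L}.
\end{align*}
```

The middle expression for $\lambda$ is Eq. (5) in symbols: the shadow price is whatever remains of the house price after paying for construction and for the lot valued at consumers' marginal willingness to pay.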

Any regulation that affects the housing market by reducing the amount of buildable land while still allowing the developer to choose the construction density will affect the market price for housing, but we cannot test for the presence of such policies using Eq. (4). Many common types of land use regulation, such as urban growth boundaries, building location restrictions and land taxes, can have large impacts on the equilibrium price of housing, but will not create a wedge between the extensive and intensive margin values of land. The test in Eq. (4) is better characterized as a test for rationing of the housing stock. To acknowledge this difference, we refer to a gap between housing price and the sum of construction costs and intensive margin land values as the shadow price of housing rather than the implicit tax of regulation, as do Glaeser and Gyourko (2003) and Cheung et al. (2009a,b).

2 See, for example, Alonso (1964), Mills (1967, 1972), Muth (1969) and Beckmann (1969).

3 Producer market power will cause $(P - K_H)/L > P_L$; however, there seems to be little reason to suspect significant producer market power in the absence of regulations that limit entry and production.

Constructing a better measure of the shadow price of housing has large implications for the housing regulation literature and the debate over whether land scarcity (whether real or artificial via regulation) and construction costs are the prime determinants of housing prices, or whether other types of regulation are having direct impacts on house prices outside of the land market. In particular, knowing which model does a better job of describing the housing market can help policy makers construct efficient future policies with higher benefits and lower overall costs to society. For instance, Eq. (4) shows taxes on housing production will have no effect on output as long as the tax is less than λ. Land restrictions or taxes will cause nothing more than an increase in housing density, if allowed by the regulatory regime. However, policies that further limit housing production will have marginal costs equal to the existing shadow price of housing, much larger than that predicted under the standard economic model. It should also be noted that land preservation for biodiversity protection or open space provision may be accomplished at lower cost through better instrument choice when the underlying market structure is better understood.

3. Southern California housing data description

We construct a unique dataset of 13,857 single-family home transactions between 1993 and 2001 in the Inland Empire area of Southern California, including parts of Los Angeles, San Bernardino and Riverside Counties (see Fig. 1). This region contained a substantial portion of Los Angeles' exurban growth over the time period.

Each observation represents an arm's-length market transaction of a house within two years of its construction. In contrast, Glaeser and Gyourko (2003) use homeowner estimates of the house price and include houses of all ages. Cheung et al. (2009b) use arm's-length market transactions from Florida but also include older houses (in fact, they include not only age, but also age² in their hedonic regression, but the coefficients on the age variables are not reported). We believe the inclusion of older houses introduces problems of depreciation and vintage effects.

Our data consist of the basic sales characteristics and the physical attributes of the home, including its exact latitude and longitude, allowing for the use of Geographic Information Systems. A “median house” for the entire region would have four bedrooms, two-and-a-half bathrooms, 2032 ft² of living space, a lot size of 7200 ft², and sell for approximately $267,000 in year-2003 dollars. The data are commercially available through DataQuick.com, a company that aggregates and distributes real estate data. Table 1 presents further descriptive statistics for the relevant variables within the study region. Table 2 presents a raw correlation matrix for the variables in question. The strongest correlations exist between the price, living space, number of bathrooms and number of bedrooms; all other correlations are less than 0.35.

Fig. 1. Overview of the study area.

Table 1
Basic summary statistics for the housing transaction data.

| | Mean | St. Dev. | Min | 25% | Median | 75% | Max |
|---|---|---|---|---|---|---|---|
| Price (2003 $) | 283,936 | 88,481 | 70,766 | 223,211 | 266,724 | 324,704 | 669,327 |
| Lot Size (ft²) | 7466 | 2807 | 2178 | 5663 | 7200 | 8400 | 21,600 |
| Living Space (ft²) | 2112 | 561 | 773 | 1687 | 2032 | 2482 | 4112 |
| # Bedrooms | 3.7 | 0.8 | 2.0 | 3.0 | 4.0 | 4.0 | 6.0 |
| # Bathrooms | 2.5 | 0.5 | 1.0 | 2.0 | 2.5 | 3.0 | 4.2 |
| Year of Sale | 1997 | 2 | 1993 | 1995 | 1997 | 1999 | 2001 |

Table 2
Raw correlation matrix for variables of interest.

| | Price | Lot | Live | Bath | Bed |
|---|---|---|---|---|---|
| Lot size | 0.30 | | | | |
| Living area | 0.72 | 0.33 | | | |
| Bathrooms | 0.56 | 0.16 | 0.66 | | |
| Bedrooms | 0.40 | 0.24 | 0.60 | 0.56 | |
| Year | 0.05 | 0.02 | 0.26 | 0.02 | 0.03 |

All analysis is in year 2003 US dollars; sales prices are converted using the Conventional Mortgage Home Price Index (CMHPI) by Freddie Mac.⁴ This index uses repeat home sales to establish a price index for Metropolitan Statistical Areas (defined by the Office of Management and Budget) over time in order to compare home prices across time. The indices for Los Angeles and San Bernardino/Riverside are used for this analysis. The mean (median) adjusted sale price is $284,000 ($267,000) with a standard deviation of roughly $88,500.
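A minimal sketch of this kind of price adjustment is given below. It assumes the usual index-ratio conversion, which the text does not spell out, and the index levels are hypothetical placeholders rather than actual CMHPI values. The function name `to_2003_dollars` is likewise introduced only for illustration.

```python
def to_2003_dollars(nominal_price, index_at_sale, index_2003):
    """Rescale a nominal sale price by the ratio of index levels.

    Assumes the standard index-ratio adjustment; the paper does not
    state its exact formula.
    """
    if index_at_sale <= 0 or index_2003 <= 0:
        raise ValueError("index levels must be positive")
    return nominal_price * index_2003 / index_at_sale


# Hypothetical index levels for one metropolitan area (not real CMHPI values).
hypothetical_index = {"1995Q2": 100.0, "2003": 160.0}

adjusted = to_2003_dollars(
    nominal_price=150_000,
    index_at_sale=hypothetical_index["1995Q2"],
    index_2003=hypothetical_index["2003"],
)
print(f"${adjusted:,.0f}")
```

Because the index is specific to a metropolitan area, a sale would be matched to the Los Angeles or the San Bernardino/Riverside series before applying the ratio.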

The mean lot size for the study region is roughly 7500 ft², with a median of 7200 ft² and a standard deviation of approximately 2800 ft². The mean (median) living space across the study region is just over 2100 (2000) ft², with a standard deviation of approximately 560 ft². The mean (median) number of bedrooms is 3.7 (4) with a standard deviation of 0.8. The mean (median) number of bathrooms is 2.5 (2.5) with a standard deviation of 0.5. Figs. 2–4 show the spatial distribution of the adjusted sales price, lot size, and living area across the study area. Neo-classical urban economic theory would suggest that houses closer to Los Angeles have higher prices, smaller lots and less living space. The sales price data roughly follow the predicted spatial pattern of higher prices closer to Los Angeles and lower prices to the east (Fig. 2). However, lot size and living area do not follow such an obvious pattern. Figs. 5, 6 and 7 display the spatial distribution of the year built, number of bedrooms and bathrooms for our data.⁵

Fig. 2. Univariate and spatial distribution of sales prices. Note the spatial pattern of higher prices to the West and lower prices to the East.

4 The CMHPI data are published quarterly and are available at http://www.freddiemac.com/finance/cmhpi/.

Fig. 3. Univariate and spatial distribution of lot size.

5 The maps in Figs. 2–7 represent smoothed surfaces as a result of Inverse Distance Weighting techniques in ArcGIS. A cell size of 200 m² and radius of 1 mile were used.

The Residential Cost Handbook by Marshall and Swift (2002) provides detailed construction cost estimates for our study region. Cost estimates are a function of the location, construction materials used, overall quality level, house size, and amenities such as pools, fireplaces, and numbers of floors. We make the simplifying assumption that all houses are of average quality.⁶ We also include other housing production costs, including site preparation, design, regulatory fees, and marketing costs. We estimate such “soft costs” as $35/ft² of living space for average quality homes, which is consistent with other work done in the area by Economic and Planning Systems (2005). Under these assumptions the structural costs are roughly $100/ft² of living space.⁷

4. Global regression model

Estimation results from an ordinary least squares hedonic model are presented in Table 3 (R² = 0.56 and σ = 58,709). Across the study area housing prices increase an average of $2.15 for every square foot increase in lot size, $112 for every square foot increase in living area, and almost $27,000 for each additional bathroom. The negative coefficient on “number of bedrooms” seems unintuitive, but the concomitant positive “living area” coefficient implies that, conditional on the living area, houses with fewer bedrooms have higher prices: put another way, consumers prefer larger bedrooms. The negative and significant “Time Trend” is worrisome because adjusting prices by the Conventional Mortgage Home Price Index should have eliminated price differences across time. Regardless, the coefficient on the time trend suggests that a one-year increase in the sales date is associated with an approximately $5000 decrease in adjusted sales price.

Fig. 4. Univariate and spatial distribution of house living area.

Fig. 5. Univariate and spatial distribution of sale year.

6 Marshall and Swift (2002) describe “average quality” homes as the most frequently encountered residence type. They are often mass produced and exceed the minimum construction requirements of lending institutions, mortgage insuring agencies and building codes. Workmanship is acceptable, but does not reflect custom craftsmanship. Cabinets, doors, hardware and plumbing are usually stock items. Architectural design includes ample fenestration and some ornamentation. Carpet, hardwood, vinyl composition tile or sheet vinyl floor cover is used. Interior walls are taped and painted drywall with some inexpensive wallpaper or paneling. Kitchen and bath have enamel painted walls and ceilings. Countertops are laminated plastic or ceramic tile. Adequate number of electrical outlets with some luminous fixtures in kitchen and bath areas. Eight average quality plumbing fixtures are included.

7 We relax the assumption of average quality housing construction in Section 7.
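The structure of such a global hedonic regression can be sketched as follows. The data here are synthetic: covariates are drawn from arbitrary uniform ranges loosely scaled like Table 1, and the "true" coefficients are simply the rounded Table 3 estimates, so the example shows the mechanics of the estimator and is not a replication. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Synthetic covariates loosely scaled like Table 1 (not the paper's data).
lot_size = rng.uniform(2_000, 20_000, n)          # ft^2
living_space = rng.uniform(800, 4_000, n)         # ft^2
baths = rng.integers(1, 5, n).astype(float)
bedrooms = rng.integers(2, 7, n).astype(float)
time_trend = rng.integers(0, 9, n).astype(float)  # years since 1993

# "True" coefficients borrowed from the rounded Table 3 estimates.
true_beta = np.array([30_700, 2.15, 112, 26_800, -12_500, -4_820], dtype=float)

X = np.column_stack(
    [np.ones(n), lot_size, living_space, baths, bedrooms, time_trend]
)
price = X @ true_beta + rng.normal(0, 58_709, n)

# Ordinary least squares via a least-squares solver.
beta_hat, *_ = np.linalg.lstsq(X, price, rcond=None)

names = ["constant", "lot_size", "living_space", "baths", "bedrooms", "time_trend"]
for name, estimate in zip(names, beta_hat):
    print(f"{name:>12}: {estimate:,.2f}")
```

A single coefficient vector is fitted to all observations, which is exactly the restriction that the locally weighted approach later relaxes.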

Table 4 presents the OLS estimates for the four different combinations of these two potentially troublesome variables. The coefficient of interest, the change in the price of housing due to a one square foot change in the size of the lot associated with a house, varies between $1.93 and $2.53, with a mean of $2.23.

4.1. The shadow price of housing

The OLS regression results for consumers' willingness to pay for lot size, in combination with house price and construction data, can be used to estimate the shadow price of housing according to Eq. (6). The mean estimated shadow price of housing is roughly $64,200 with a standard deviation of $63,000. Fig. 8 displays the distribution of the shadow price of housing estimates. The distribution shows a strong spatial pattern, with larger estimates of the shadow price of housing near Los Angeles and smaller estimates farther away and generally to the east.

$$\text{Shadow Price of Housing} = \text{House Price} - \text{Construction Costs} - P_L \times \text{Lot Size}. \tag{6}$$

Fig. 6. Univariate and spatial distribution of # of bedrooms.

Fig. 7. Univariate and spatial distribution of # of bathrooms.

Table 3
Ordinary least squares regression results for house prices (three significant digits), R² = 0.56, σ = 58,709.

| | Estimate | St. Err. | t-stat | p-value |
|---|---|---|---|---|
| Constant | 30,700 | 3140 | 9.8 | 0.00 |
| Lot Size (ft²) | 2.15 | 0.2 | 11.3 | 0.00 |
| Living Space (ft²) | 112 | 1.4 | 79.5 | 0.00 |
| # Baths | 26,800 | 1500 | 17.8 | 0.00 |
| # Bedrooms | −12,500 | 804 | −15.6 | 0.00 |
| Time Trend | −4820 | 216 | −22.3 | 0.00 |
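Eq. (6) can be evaluated for a single house as in the sketch below. It treats the rough $100/ft² figure from Section 3 as the full per-square-foot construction cost, which is an assumption of this example, and it uses the median house from Table 1 with the lot-size estimate from Table 3. The function name and argument names are hypothetical.

```python
def shadow_price_of_housing(house_price, living_space_ft2, lot_size_ft2,
                            land_value_per_ft2, cost_per_ft2=100.0):
    """Eq. (6): house price - construction costs - P_L * lot size.

    cost_per_ft2 defaults to the paper's rough $100/ft^2 figure;
    multiplying it by living space is a simplification for this sketch.
    """
    construction_costs = cost_per_ft2 * living_space_ft2
    land_value = land_value_per_ft2 * lot_size_ft2
    return house_price - construction_costs - land_value


# Median house from Table 1 with the OLS lot-size estimate from Table 3.
estimate = shadow_price_of_housing(
    house_price=266_724,
    living_space_ft2=2_032,
    lot_size_ft2=7_200,
    land_value_per_ft2=2.15,
)
print(f"${estimate:,.0f}")
```

The result, about $48,000, describes one stylized house and need not match the reported mean, which averages house-level estimates computed with house-specific construction costs.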

4.2. Alternative specifications

This section uses alternative model specifications to estimate the consumers' marginal willingness to pay for land across the study region. We estimate a log–log hedonic model, a model that includes a quadratic term for lot size, as well as a model that disaggregates the data to the census tract-level.

4.2.1. Log–log specification

We estimate a log–log hedonic model specification, in which logarithms are taken for the house price and the continuous and positive independent variables as in Eq. (7). Table 5 displays the results of this regression model. The estimated marginal effect of lot size on the price of housing is no longer constant across the study area, but rather a function of the price and lot size of the house as shown in Eq. (8).

The marginal price effect of increasing the lot size is substantially smaller in the log–log specification than under the linear model. This is shown in Fig. 9, where a vast majority of the estimated marginal effects are less than the OLS estimate of $2.15. The decrease in the estimated marginal effect will result in an increase in the estimated shadow price of housing. The mean estimated marginal effect is $1.16 and the standard deviation is $0.46. Fig. A.1 shows the spatial distribution of the estimated shadow price of housing; the mean is $72,312 and the standard deviation is $61,266.

$$\ln(\text{Price}) = \beta_0 + \beta_1 \ln(\text{Lot Size}) + \dots + \varepsilon \tag{7}$$

$$P_L = \beta_1 \cdot \frac{\text{Price}}{\text{Lot Size}}. \tag{8}$$
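As a worked instance of Eq. (8), the sketch below evaluates the implied marginal effect for the median house in Table 1 using the ln(lot size) coefficient from Table 5. The coefficient is reported only to two decimals, so the result is approximate, and the function name is introduced here purely for illustration.

```python
def loglog_marginal_effect(beta_lot, price, lot_size_ft2):
    """Eq. (8): P_L = beta_1 * Price / Lot Size (dollars per ft^2)."""
    if lot_size_ft2 <= 0:
        raise ValueError("lot size must be positive")
    return beta_lot * price / lot_size_ft2


# Rounded ln(lot size) coefficient from Table 5, median house from Table 1.
effect = loglog_marginal_effect(beta_lot=0.03, price=266_724, lot_size_ft2=7_200)
print(f"${effect:.2f} per ft^2")
```

Because the effect scales with the price-to-lot-size ratio, expensive houses on small lots receive the largest implied land values under this specification.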

4.2.2. Quadratic formulation

Including a quadratic term in the hedonic model also incorporates non-constant marginal effects of lot size on the price of housing. Table 6 displays the results of this regression. While many of the other coefficients remain similar to Table 3, the coefficient on “Lot Size” has become negative. This value, −3.85, combined with the positive coefficient for the quadratic term, 0.0003, suggests that increasing lot sizes are correlated with increased housing prices above roughly six thousand square feet, and decreased housing prices for smaller lots. Fig. 9 displays the estimated marginal effects for our lot size data; the mean marginal effect is $0.62 and the standard deviation is $1.68. The shadow prices of housing resulting from the quadratic hedonic model marginal effects are shown in Fig. A.2. The mean shadow price is $70,893 and the standard deviation is $63,676.
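The sign change described above follows from differentiating the quadratic specification with respect to lot size. The sketch below uses the rounded Table 6 coefficients as defaults, so the numbers are approximate; the function names are hypothetical.

```python
def quadratic_marginal_effect(lot_size_ft2, beta_lot=-3.85, beta_lot_sq=0.0003):
    """d(Price)/d(Lot Size) for Price = ... + b1*Lot + b2*Lot**2."""
    return beta_lot + 2.0 * beta_lot_sq * lot_size_ft2


def turning_point(beta_lot=-3.85, beta_lot_sq=0.0003):
    """Lot size at which the marginal effect changes sign."""
    return -beta_lot / (2.0 * beta_lot_sq)


print(f"sign change near {turning_point():,.0f} ft^2")
print(f"effect at 7,200 ft^2: ${quadratic_marginal_effect(7_200):.2f} per ft^2")
```

With these rounded coefficients the marginal effect is negative below roughly 6,400 ft² and positive above it, consistent with the "roughly six thousand square feet" threshold in the text.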

Table 4
OLS hedonic regression results for various models (dependent variable = house price). Coefficients are reported to three significant digits.

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| Lot size | 2.15 | 2.53 | 1.93 | 2.32 |
| Living space | 112 | 101 | 104 | 95 |
| Baths | 26,800 | 32,700 | 20,600 | 27,000 |
| Bedrooms | −12,500 | −10,700 | | |
| Time trend | −4820 | | −4480 | |
| Constant | 30,700 | 9560 | 15,500 | −2,330 |
| R² | 0.56 | 0.54 | 0.55 | 0.54 |
| σ | 58,709 | 59,751 | 59,220 | 60,124 |

Fig. 8. The spatial and univariate distribution of the shadow price of housing estimates (standard model). The mean shadow price of housing is roughly $64,000 and the standard deviation is $63,000.

Table 5
Log–log ordinary least squares regression results (three significant digits), R² = 0.6, σ = 0.185.

| | Estimate | St. Err. | t-stat | p-value |
|---|---|---|---|---|
| Constant | 5.94 | 0.064 | 93.3 | 0.000 |
| ln(lot size) | 0.03 | 0.005 | 5.6 | 0.000 |
| ln(living space) | 0.83 | 0.009 | 87.6 | 0.000 |
| # Baths | −0.04 | 0.003 | −16.5 | 0.000 |
| # Bedrooms | 0.08 | 0.005 | 17.0 | 0.000 |
| Time trend | −0.01 | 0.001 | −21.9 | 0.000 |


4.3. Census tract analysis

Rather than conduct analysis at the global level, we can attempt to account for spatial heterogeneity by disaggregating the data. This section presents the results of one such specification: analysis at the census tract-level. Attempting to run separate OLS regressions for each of the 156 census tracts in our data creates sample size problems in certain census tracts. Table 7 displays the variation in the observation frequency for census tracts: 17 census tracts have only one observation, while one census tract (G0600710008702) has 1345 observations. For our analysis we eliminate all census tracts with fewer than 30 observations, leaving a total of 61 census tracts and 13,008 observations. The mean estimated marginal effect of an increase in the lot size by one square foot is $2.57 with a standard deviation of $3.53. These marginal effects are manifested in the estimated shadow price of housing in Fig. A.3; the mean shadow price of housing is $60,271 and the standard deviation is $64,706.
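The tract-level procedure can be sketched as a loop of separate regressions with a minimum sample size. The example below runs on synthetic data for three hypothetical tracts; the tract labels, land values, and the reduced covariate set (lot size and living space only) are assumptions made to keep the illustration short.

```python
import numpy as np


def tract_lot_size_effects(tract_ids, lot_size, other_covariates, price,
                           min_obs=30):
    """Fit a separate OLS per tract; return {tract: lot-size coefficient}.

    Tracts with fewer than min_obs sales are skipped, mirroring the
    30-observation cutoff described in the text.
    """
    tract_ids = np.asarray(tract_ids)
    effects = {}
    for tract in np.unique(tract_ids):
        mask = tract_ids == tract
        if mask.sum() < min_obs:
            continue
        X = np.column_stack(
            [np.ones(mask.sum()), lot_size[mask], other_covariates[mask]]
        )
        beta, *_ = np.linalg.lstsq(X, price[mask], rcond=None)
        effects[str(tract)] = beta[1]  # coefficient on lot size
    return effects


# Synthetic illustration: three hypothetical tracts with different land values.
rng = np.random.default_rng(1)
sizes = {"tract_a": 200, "tract_b": 150, "tract_c": 10}  # tract_c is too small
land_value = {"tract_a": 5.0, "tract_b": 1.0, "tract_c": 3.0}

ids, lots, living, prices = [], [], [], []
for tract, n in sizes.items():
    lot = rng.uniform(3_000, 12_000, n)
    space = rng.uniform(1_000, 3_500, n)
    ids += [tract] * n
    lots.append(lot)
    living.append(space)
    prices.append(50_000 + land_value[tract] * lot + 100 * space
                  + rng.normal(0, 5_000, n))

effects = tract_lot_size_effects(
    ids, np.concatenate(lots), np.concatenate(living), np.concatenate(prices)
)
for tract, effect in effects.items():
    print(f"{tract}: ${effect:.2f} per ft^2")
```

Each tract's regression ignores all sales outside its boundary, which is the hard-edged treatment of space that motivates the question raised in the next paragraph.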

Researchers might choose to aggregate census tracts with small numbers of observations, but the question remains: are census tracts the appropriate level of aggregation? Is there any reason to believe that regression models across census tracts should be independent and that the information contained in data just across a census tract boundary should be excluded from the regression model? For comparison, Fig. 9 displays the distribution of the estimated coefficients associated with the house lot size for all of the models discussed in this section.⁸ Similarly, Fig. 10 displays the distribution of shadow prices of housing for the four global models.

5. Locally weighted regression model

5.1. Intuition

Estimates of the “average” consumer willingness to pay for land through hedonic analysis performed at the incorrect level of aggregation have the potential to be severely biased. Fig. 11 is a hypothetical but

[Figure: Distributions of Lot Size Marginal Effect Estimates. Horizontal axis: marginal effect of increased lot size (per ft²), −$5 to $15; vertical axis: relative frequency. Panels: Quadratic (x̄ = 0.62, σ = 1.66), Log–Log (x̄ = 1.16, σ = 0.45), Standard Model (x̄ = 2.15, σ = 0), Census Tract (x̄ = 2.57, σ = 3.44).]

Fig. 9. The distribution of the marginal effect of increased lot sizes on the price of housing resulting from alternative regression model specifications: the standard model, inclusion of a quadratic term, a log–log formulation, and census tract-level data aggregation. (The census tract-level regressions are very similar whether the standard linear, quadratic, or log–log model was used. For the sake of brevity we only present the linear model results.)

Table 6. Quadratic lot size ordinary least squares regression of housing prices. Note that the quadratic lot size coefficient is positive, reflecting an increasing marginal willingness to pay for lot size, rather than a declining willingness to pay. R2 = 0.562, σ = 58,500.

Variable  Estimate  St. Err.  t-stat  p-value
Constant  54,600  4100  13.3  0.000
Lot size  −3.85  0.69  −5.6  0.000
Lot size²  0.0003  0.00003  9.0  0.000
Living space  113  1.4  80.2  0.000
# Baths  25,100  1510  16.6  0.000
# Bedrooms  −11,600  807  −14.4  0.000
Time trend  −4850  216  −22.5  0.000

Table 7. Distribution of observations across census tracts.

# Observations in census tract  0–10  11–30  31–50  51–100  101–300  301–600  600–1350
# Census tracts  63  32  13  13  22  7  6

[Figure: Distributions of Shadow Price of Housing Estimates. Horizontal axis: shadow price of housing, −$100,000 to $200,000. Panels, in the order shown: Census Tract, Standard Model, Log–Log, Quadratic. Annotated means: x̄ = 60,133; 64,101; 72,201; 70,821. Annotated standard deviations: σ = 63,574; 61,776; 60,347; 62,660.]

Fig. 10. The distribution of the estimated shadow price of housing resulting from alternative regression model specifications: the standard model, inclusion of a quadratic term, a log–log formulation, and census tract-level data aggregation.

[Figure: Stylized Relationship Between Land and House Prices. Horizontal axis: quantity of land; vertical axis: house price.]

Fig. 11. The “global” estimated marginal effect of land on housing prices (the thin black line) need not be the “average” marginal effect when the data are analyzed at the neighborhood level. Note that the slopes of the lines of best fit for each of the three subsets (red “+”, purple “•”, and blue “▽”) are greater than the aggregated data line of best fit.

8 Note that we have assigned the coefficients to observations, and that Fig. 9 displays the coefficients for all of the observations, not simply the 61 estimates for each census tract. This better accounts for the variation in the number of observations across census tracts.


illustrative example of heterogeneous land prices across a region combined with inappropriate aggregation leading to biased estimates. Sales price is plotted against lot size using house-level data; the three colors represent three different neighborhoods in the region. A researcher estimating the marginal effect of increased lot size on housing prices after aggregating all of the observations into a “global” regression will estimate the black line as the line of best fit. However, this line does not represent the true relationship between the two variables of interest, nor even “the average” across the study area. The red, blue, and purple lines show the “true” relationships between house price and lot size across the three neighborhoods. Controlling for neighborhood heterogeneity can yield higher marginal willingness to pay estimates for every neighborhood. The “average” marginal effect estimated with the global model is anything but.
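The aggregation problem in this stylized example can be reproduced numerically. The sketch below is a hypothetical illustration, not the data behind Fig. 11: three neighborhoods share a within-neighborhood slope of $10 per square foot, but neighborhoods with larger lots have lower intercepts, so the pooled line of best fit is flatter than every neighborhood line. All numbers and variable names are assumptions chosen for the illustration.

```python
import numpy as np

# Three hypothetical neighborhoods: same within-neighborhood slope,
# but neighborhoods with larger lots have lower intercepts.
offsets = np.linspace(-1000.0, 1000.0, 21)      # lot-size spread within a neighborhood
centers = [5000.0, 8000.0, 11000.0]             # typical lot size (ft^2)
intercepts = [200_000.0, 180_000.0, 160_000.0]  # neighborhood price level ($)
true_slope = 10.0                               # $ per ft^2 in every neighborhood

x_groups = [c + offsets for c in centers]
y_groups = [a + true_slope * x for a, x in zip(intercepts, x_groups)]

# Separate regressions by neighborhood recover the common slope.
local_slopes = [np.polyfit(x, y, 1)[0] for x, y in zip(x_groups, y_groups)]

# A single "global" regression on the pooled data does not.
pooled_slope = np.polyfit(np.concatenate(x_groups), np.concatenate(y_groups), 1)[0]

print(local_slopes, pooled_slope)
```

In this construction the pooled slope is a blend of the within-neighborhood slope and the much flatter between-neighborhood relationship, so it lies below every local slope rather than averaging them.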

The problem of inappropriately aggregated data is easily corrected in the example in Fig. 11 by running separate regressions (or interactions) by color. However, empirical data may not be so easily categorized. For instance, under the monocentric city model of Alonso (1964), Muth (1969) and Mills (1967) there is a smoothly varying relationship between location and land prices.⁹ Locally weighted regression (LWR) represents a research strategy to incorporate spatial heterogeneity in hedonic parameters such as land prices. LWR allows for micro-market effects in the land and housing market such as access to transit, pollution, schools, and other spatially correlated observable and unobservable characteristics.¹⁰ LWR, combined with cross validation techniques, frees the researcher from imposing a level of data aggregation and instead allows the data to “tell” the researcher the appropriate level of aggregation.

Locally weighted regression (LWR) allows model parameters to vary over space in order to reflect spatial heterogeneity. LWR estimates model parameters like any other weighted least squares (WLS) model, as can be seen in Eq. (9), but the weights matrix, W, varies based on location. That is, LWR parameter estimates are a function of the “local” data. The nature of locally weighted regression is that the vector of model parameters, β, is allowed to vary over space, because the weights attached to different observations within the data are allowed to vary depending on the location in question.

$$\hat{\beta}(\text{location}_i) = \left[ X^{T} W(\text{location}_i) X \right]^{-1} X^{T} W(\text{location}_i) Y \tag{9}$$

where X is an n×m matrix of independent variables, Y is an n×1 vector of the dependent variable, and W(location_i) is an n×n matrix with zeros on the off-diagonal and jj-th element w_ij, with 0 ≤ w_ij ≤ 1, giving the weight the jth observation should have in the regression at location_i (the ii-th element is set to one). Several modeling choices of the weights matrix are available, but they can essentially be broken into two categories: which observations in the dataset should receive positive weights (the bandwidth), and what value those weights should be (the kernel function). Examples of LWR include Cleveland and Devlin (1988), McMillen (1996), Fotheringham et al. (1998), and Pavlov (2000), among others. LWR represents a natural extension of the “spatial expansion” methods in which space is parameterized using polynomials, for example as in Casetti (1972).
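Eq. (9) is an ordinary weighted least squares solve repeated once per location. The following is a minimal sketch on synthetic data, not the authors' implementation; the function name `lwr_coefficients`, the data-generating numbers, and the choice to measure lot size in thousands of square feet are assumptions made for the illustration.

```python
import numpy as np

def lwr_coefficients(X, y, weights):
    """Weighted least squares estimate from Eq. (9) for a single location.

    X: (n, m) regressors, y: (n,) outcomes,
    weights: (n,) diagonal of W(location_i).
    """
    XtW = X.T * weights  # same as X.T @ diag(weights), without forming the n x n matrix
    return np.linalg.solve(XtW @ X, XtW @ y)

# Synthetic data (hypothetical): price = 50,000 + 2,000 per thousand ft^2 + noise.
rng = np.random.default_rng(0)
n = 200
lot_k = rng.uniform(4.0, 12.0, size=n)  # lot size in thousands of ft^2
X = np.column_stack([np.ones(n), lot_k])
y = 50_000 + 2_000 * lot_k + rng.normal(0, 5_000, size=n)

# With every weight equal to one, Eq. (9) is ordinary least squares.
beta_all = lwr_coefficients(X, y, np.ones(n))
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]

# With 0/1 weights, it is OLS on the observations that received weight one.
mask = lot_k < 8.0
beta_subset = lwr_coefficients(X, y, mask.astype(float))
beta_subset_ols = np.linalg.lstsq(X[mask], y[mask], rcond=None)[0]
```

A full LWR run would call such a function once per location with that location's weight vector, which is why the choice of bandwidth and kernel is the main modeling decision.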

5.2. LWR bandwidth

The bandwidth of the LWR system determines which observations in the dataset will receive positive weights in the regressions. Observations outside of the bandwidth receive zero weight and are therefore ignored in the regression. In this paper we adopt a “nearest neighbor” approach to setting the LWR bandwidth: the observations included in a regression estimating the model parameters at a given location are the k nearest observations. Alternatively, we could have set the bandwidth to include all observations within a certain distance. A distance-based bandwidth will always cover the same area, but will include varying numbers of observations over space and cannot be estimated in locations with sparse data densities. The nearest neighbor approach has intuitive appeal because the LWR model approaches ordinary least squares regression as the number of included neighbors approaches n−1.
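The difference between the two bandwidth rules can be sketched as follows. This is an illustration on random coordinates, not the paper's data; the function names are hypothetical, and the sketch assumes that the observation at the regression location is always included in addition to its k nearest neighbors.

```python
import numpy as np

def nearest_neighbor_weights(coords, i, k):
    """Uniform 0/1 weights for location i: the observation itself plus its k nearest neighbors."""
    dist = np.linalg.norm(coords - coords[i], axis=1)
    d_ik = np.sort(dist)[k]  # index 0 is the observation itself (distance zero)
    return (dist <= d_ik).astype(float)

def fixed_radius_weights(coords, i, radius):
    """Uniform 0/1 weights for location i: every observation within `radius`."""
    dist = np.linalg.norm(coords - coords[i], axis=1)
    return (dist <= radius).astype(float)

rng = np.random.default_rng(1)
coords = rng.uniform(0, 10, size=(50, 2))  # hypothetical house locations

# Nearest-neighbor bandwidth: the same number of observations everywhere.
knn_counts = [int(nearest_neighbor_weights(coords, i, 5).sum()) for i in range(50)]

# Distance bandwidth: the same area everywhere, so counts depend on local density.
radius_counts = [int(fixed_radius_weights(coords, i, 1.0).sum()) for i in range(50)]

# With k = n - 1 every observation is included, as in a global regression.
all_in = nearest_neighbor_weights(coords, 0, len(coords) - 1)
```

With tied distances the nearest-neighbor rule as written would admit more than k neighbors; a production implementation would need an explicit tie-breaking rule.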

Fig. 12 displays the distribution of regression residuals for bandwidths between 13,856 (all of the dataset) and 20 nearest neighbors. There is a strong inverse relationship between the standard deviation of the regression residuals and the number of observations included in the bandwidth. For comparison, the distribution of regression residuals for the census tract-level regressions is also included. Whereas the standard deviation of the standard OLS regression (13,856 observations in the bandwidth) is almost $60,000, the smaller bandwidths achieve standard deviations of approximately $20,000 (and for comparison the census tract-level regression has a standard deviation of roughly $30,000). The fact that smaller bandwidths result in more accurate predictions is not surprising. In fact, we could perfectly predict every single house price if the bandwidth were to be decreased to five nearest neighbors (the number of parameters to estimate, six, would be equal to the number of observations, six).
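The limiting case mentioned above, where the number of observations equals the number of parameters, can be checked directly: a six-parameter linear model fitted to six observations reproduces them exactly. The sketch uses random numbers purely for illustration; nothing here comes from the paper's data.

```python
import numpy as np

rng = np.random.default_rng(2)
n_params = 6  # constant plus five regressors, as in the standard model

# Six hypothetical observations: a constant and five random regressors.
X = np.column_stack([np.ones(n_params), rng.normal(size=(n_params, n_params - 1))])
y = rng.normal(size=n_params)

# With a square design matrix, least squares reduces to solving X beta = y.
beta = np.linalg.solve(X, y)
residuals = y - X @ beta
max_abs_residual = float(np.max(np.abs(residuals)))

print(max_abs_residual)
```

The residuals vanish up to floating-point error, which is why residual size alone cannot be used to choose the bandwidth.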

5.3. Cross validation

Given that the regression residuals will naturally tend toward zero with smaller LWR bandwidths, a diagnostic statistic should be used to choose the appropriate LWR model. We compare LWR models via the use of a generalized cross validation (GCV) score, a regression diagnostic statistic that is closely related to the familiar standard error of the regression, σ̂,

$$\hat{\sigma} = \sqrt{\frac{\text{Sum of the Squared Residuals}}{n - m}}, \tag{10}$$

Fig. 12. Distribution of regression residuals for locally weighted regression models with varying bandwidths. Shrinking the LWR bandwidth from the entire sample (equivalent to a global model) results in residuals closer to zero. The Census Tract regression residuals are similar to the LWR model with a bandwidth of 250 observations. Note that the “LWR” model with a bandwidth of 13,856 observations is exactly the same as the standard global regression model.

9 Cheung et al. (2009b) allow the “amenity value” of land to vary with distance to the Central Business District, as predicted by the neo-classical urban economic model, but find that in most cases this locational value is small.

10 For example, Redfearn (2009) uses LWR to value access to light rail in Los Angeles.


where n is the total sample size and m is the number of parameters estimated in the model. In contrast, Fotheringham et al. (2002) calculate the GCV score as

$$\text{GCV} = \frac{n \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{(n - \upsilon_1)^2}, \tag{11}$$

where y_i is the dependent variable value, ŷ_i is the predicted dependent variable value for observation i, and υ1 is the “effective number of model parameters.”¹¹ In a LWR model, the number of variables in the model is no longer equal to the number of parameters to be estimated because different regressions are estimated over space, yielding more parameter estimates. The GCV score is increasing in both the sum of squared residuals and the effective number of parameters estimated. The challenge, then, is choosing a model that achieves accurate prediction without sacrificing parsimony. Taking the square root of Eq. (11) and rearranging yields

$$\sqrt{\text{GCV}} = \sqrt{\frac{n}{n - \upsilon_1}} \, \sqrt{\frac{\text{Sum of Squared Residuals}}{n - \upsilon_1}}, \tag{12}$$

which approaches σ̂ as υ1 approaches m for large n. Henceforth, throughout the paper we report the square root of Eq. (11) because of its similarity to σ̂. This similarity between the GCV score and the standard error of the regression (as approximated by the standard deviation of the regression residuals) can be seen graphically in Fig. 13. For large bandwidths the GCV score and the standard deviation of the regression residuals are virtually indistinguishable because the effective number of parameters estimated is so small relative to the total size of the dataset. In fact, when the bandwidth is 13,856 neighbors, the number of parameters estimated is exactly equal to six, the number of “true” parameters. However, smaller bandwidths result in larger and larger numbers of parameters to be estimated (because there is less and less overlap amongst regressions). Further decreases in the bandwidth beyond 40 nearest neighbors result in increased GCV scores (the increased number of parameters to be estimated overwhelms the decreased sum of squared residuals).
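The relationship between the GCV score, the effective number of parameters υ1 = tr(S), and σ̂ can be verified on a small synthetic example. The sketch below is an illustration under stated assumptions rather than the authors' code: the function names `lwr_hat_matrix` and `sqrt_gcv` are hypothetical, the data are simulated, and the weights are set to one everywhere so that the LWR model collapses to the global regression.

```python
import numpy as np

def lwr_hat_matrix(X, W):
    """Hat matrix S with rows r_i = X_i (X' W_i X)^{-1} X' W_i; W[i] holds the weights for location i."""
    n = X.shape[0]
    S = np.empty((n, n))
    for i in range(n):
        XtW = X.T * W[i]
        S[i] = X[i] @ np.linalg.solve(XtW @ X, XtW)
    return S

def sqrt_gcv(y, y_hat, v1):
    """Square root of Eq. (11): sqrt(n * SSR) / (n - v1)."""
    n = len(y)
    ssr = float(np.sum((y - y_hat) ** 2))
    return np.sqrt(n * ssr) / (n - v1)

# Simulated data with m = 3 parameters (hypothetical, not the paper's data).
rng = np.random.default_rng(3)
n, m = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, m - 1))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

# Uniform weights over the entire sample: every local regression is the global one.
S = lwr_hat_matrix(X, np.ones((n, n)))
y_hat = S @ y
v1 = float(np.trace(S))  # effective number of parameters

sigma_hat = np.sqrt(np.sum((y - y_hat) ** 2) / (n - m))  # Eq. (10)
score = sqrt_gcv(y, y_hat, v1)
```

In this limiting case υ1 equals m, so the score equals σ̂ scaled by sqrt(n/(n − m)), a factor that tends to one as n grows. With narrower bandwidths, tr(S) rises and the penalty in the denominator grows.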

5.3.1. LWR kernel shape

In the previous example every observation in the bandwidth was given equal weight. An alternative to this uniform weighting procedure is to allow the weights to vary continuously between zero and one, giving nearer observations more weight relative to distant observations, rather than an “in-or-out” methodology. The underlying function determining the weights given to observations in the LWR is commonly referred to as the kernel. Eqs. (13)–(15) display the functional form for the three kernel shapes used in this paper: uniform, bi-square, and tri-cubic. Each of these functions produces weights based on the geographic distance between observations i and j, d_ij. Letting d_ik represent the distance to the kth nearest neighbor to observation i, then for a nearest neighbor based LWR system, the uniform weights function is

$$w_{ij} = \begin{cases} 1 & \text{if } d_{ij} \le d_{ik} \\ 0 & \text{if } d_{ij} > d_{ik}, \end{cases} \tag{13}$$

the bi-square weights function is

$$w_{ij} = \begin{cases} \left[ 1 - \left( \dfrac{d_{ij}}{d_{ik}} \right)^{2} \right]^{2} & \text{if } d_{ij} \le d_{ik} \\ 0 & \text{if } d_{ij} > d_{ik}, \end{cases} \tag{14}$$

and the tri-cubic weights function is

$$w_{ij} = \begin{cases} \left[ 1 - \left( \dfrac{d_{ij}}{d_{ik}} \right)^{3} \right]^{3} & \text{if } d_{ij} \le d_{ik} \\ 0 & \text{if } d_{ij} > d_{ik}. \end{cases} \tag{15}$$

The bi-square and tri-cubic functions are continuous and differentiable functions leading to smoothly declining weights, while the uniform weights function has a discontinuity. The measure of distance need not be the Euclidean distance between two observations, and may include other measures of distance such as differences in predetermined boundaries. In fact, one can achieve the same results as a regression by census tract using a weights function that is 1 for observation j if it is in the same census tract as observation i, and 0 if it is not. The regressions for any observation in a given census tract would include the same data and would yield the same parameter estimates. Under this scenario the model yields spatial variation in the regression parameters but exactly correlated with census tract boundaries.
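The three kernel shapes in Eqs. (13)–(15) translate directly into code. The sketch below uses scalar distances and hypothetical function names; it is meant only to make the weight profiles concrete.

```python
def uniform_weight(d_ij, d_ik):
    """Eq. (13): weight one inside the bandwidth, zero outside."""
    return 1.0 if d_ij <= d_ik else 0.0

def bisquare_weight(d_ij, d_ik):
    """Eq. (14): [1 - (d_ij / d_ik)^2]^2 inside the bandwidth, zero outside."""
    if d_ij > d_ik:
        return 0.0
    return (1.0 - (d_ij / d_ik) ** 2) ** 2

def tricubic_weight(d_ij, d_ik):
    """Eq. (15): [1 - (d_ij / d_ik)^3]^3 inside the bandwidth, zero outside."""
    if d_ij > d_ik:
        return 0.0
    return (1.0 - (d_ij / d_ik) ** 3) ** 3

# Weights at several distances when the kth nearest neighbor is 2.0 units away.
d_ik = 2.0
for d_ij in (0.0, 1.0, 2.0, 3.0):
    print(d_ij,
          uniform_weight(d_ij, d_ik),
          bisquare_weight(d_ij, d_ik),
          tricubic_weight(d_ij, d_ik))
```

Note that under the bi-square and tri-cubic kernels the kth nearest neighbor itself receives a weight of zero, whereas the uniform kernel gives it full weight. The smooth kernels therefore use slightly less information than the uniform kernel at the same nominal bandwidth.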

[Figure: Comparison of GCV Score to St. Dev. of Residuals. Lower horizontal axis: # of neighboring observations in the locally weighted regression bandwidth (20 to entire sample); upper horizontal axis: υ1, the effective # of parameters estimated in the LWR models (4000 down to 6). Series: GCV score, sqrt(# of obs./(# of obs. − υ1)) × sqrt(sum of squared residuals/(# of obs. − υ1)), and st. dev. of residuals, sqrt(sum of squared residuals/(# of obs. − 1)). Annotation: a bandwidth of 40 nearest neighbors minimizes the GCV score at 26,947.]

Fig. 13. LWR outperforms OLS (GCV≈60,000) and census tract-level (GCV≈33,000) regressions.

[Figure: Generalized Cross Validation Score Comparisons. Horizontal axis: locally weighted regression bandwidth (# of neighbors), 20 to entire sample; vertical axis: GCV score, with reference levels marked for the census tract and global OLS models. Series: uniform, bi-square, and tri-cubic kernels. Annotation: minimum GCV score = 26,132 at a bandwidth of 60 with the bi-square kernel.]

Fig. 14. The distribution of generalized cross validation scores for varying bandwidths (20 to 10,000 nearest observations) and three different kernel shapes (uniform, bi-square, and tri-cubic). The minimum GCV score is 26,132 when the bandwidth is 60 for the bi-square kernel. For comparison, the GCV score is 58,722 for the global regression and 30,808 for the census tract-level regression model. The bi-square kernel has marginally smaller GCV scores than the tri-cubic kernel for all bandwidths and substantially smaller scores than the uniform kernel for most bandwidths.

11 υ1 = tr(S), where the matrix S is the “hat matrix” which maps y onto ŷ,

$$\hat{y} = S y,$$

and each row of S, r_i, is given by:

$$r_i = X_i \left[ X^{T} W(\text{location}_i) X \right]^{-1} X^{T} W(\text{location}_i).$$


5.4. LWR cross validation results

This section describes the LWR cross validation results for our study. We calculate the GCV score at over 30 different bandwidths ranging from 20 to 13,856 nearest neighbors and three different kernel shapes (uniform, bi-square, and tri-cubic).¹² The minimum GCV score (26,132) is achieved with a bandwidth of 60 nearest neighbors and a bi-square kernel function. This GCV score is substantially lower than the GCV scores resulting from a global ordinary least squares model (58,722) and the census tract-level regressions (30,808). Fig. 14 plots the GCV scores for the different bandwidths (half of which include fewer than 400 nearest neighbors) and the three kernel shapes. In general, the bi-square kernel shape appears to consistently outperform the tri-cubic and uniform kernel shapes (Fig. 14).¹³

Fig. 15. The spatial distribution of regression residuals for the LWR model with 60 nearest neighbors and a bi-square kernel function. Note the difference as compared to Fig. 16.

Fig. 16. The spatial distribution of global OLS regression residuals. Note the difference as compared to Fig. 15.

12 This means we estimated 30 × 3 × 13,856 = 1,247,040 separate regressions.

13 We also estimated LWR models under alternative specifications, similar to Section 4.2, in which we include a lot size quadratic term, drop the time trend, and both at the same time. Our initial specification (including the time trend and no quadratic term for lot size) at a bandwidth of 60 nearest neighbors and a bi-square kernel continues to outperform all other models. These results are available from the authors upon request.


6. Results

This section presents the results of the LWR model estimation. In particular, we present the estimated regression residuals, marginal effects of increased lot size, and resulting shadow price of housing from our preferred model of 60 nearest neighbors and a bi-square kernel function. Special attention is paid to comparing the LWR results to the global regression results from Section 4.

6.1. Regression residuals

The LWR regression residuals using a bandwidth of 60 nearest neighbors and a bi-square kernel are displayed in Fig. 15. The mean residual is $163, with a standard deviation of $21,340. Note the stark difference between Figs. 15 and 16, the latter showing the residuals from the global OLS regression. The LWR residuals are on average closer to zero and display much less of a spatial pattern. Whereas all of the large and positive global regression residuals are clustered in the northwest part of the region, the largest and smallest LWR residuals are scattered throughout the study region with no discernible spatial pattern. Fig. 17 compares the distributions of regression residuals for the Global, Census Tract, and LWR models.

6.2. The marginal effect of lot size on house prices

In contrast to the global regression model, which imposes the restriction that the marginal effect of lot size be constant across the study region, or the census tract-level model, which only allows the marginal effect to vary across census tracts, the LWR model allows the marginal effect to vary at each and every location within the dataset. Fig. 19 shows the distribution of the estimated marginal effects of lot size for varying bandwidths. Fig. 18 shows the spatial distribution of the estimated marginal effects for the LWR model with a bandwidth of 60 nearest neighbors and a bi-square kernel, while Fig. 19 compares the univariate distribution of these LWR marginal effects estimates to the Global and Census Tract model estimates. The mean estimated marginal effect is $1.7 and the standard deviation is $5.1. We should note that we fail to reject the null hypothesis that LWR marginal effects are different from the global estimate of $2.15 for only 25% of the observations.

Fig. 17. Comparison of regression residual distributions for LWR model (bi-square kernel and bandwidth of nearest 60 neighbors), global regression and census tract-level regression models.

Fig. 18. The spatial and univariate distribution of the marginal effect of increased lot size on house price for the LWR model with 60 nearest neighbors and a bi-square kernel function. This figure also displays the t-statistics from testing the null hypothesis that the LWR marginal effects are equal to $2.15 (the global model estimate). Those areas with positive sloped black hatching have a marginal effect that is more than two standard errors above the global estimate, while areas with negative sloped hatching are areas with marginal effects more than two standard errors below the global estimated marginal effect.

Fig. 19. Comparison of estimated marginal price effects of increased lot size for LWR model (bi-square kernel and bandwidth of nearest 60 neighbors), global regression and census tract-level regression.
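The hatching rule described in the caption of Fig. 18 (a local marginal effect more than two standard errors above or below the global estimate of $2.15) can be written as a small classification function. The function name and the example estimates and standard errors below are hypothetical; they are not values from the paper.

```python
def hatch_class(local_effect, std_error, global_effect=2.15):
    """Classify a local lot-size effect relative to the global estimate using a two-standard-error rule."""
    t_stat = (local_effect - global_effect) / std_error
    if t_stat > 2.0:
        return "above"
    if t_stat < -2.0:
        return "below"
    return "not distinguishable"

# Hypothetical (estimate, standard error) pairs for three locations.
examples = [(5.0, 1.0), (1.7, 1.0), (-1.0, 1.2)]
labels = [hatch_class(effect, se) for effect, se in examples]
print(labels)
```

Because the classification depends on the standard error as well as the point estimate, a location with a very different local effect can still be statistically indistinguishable from the global estimate when its local regression is imprecise.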

6.3. The shadow price of housing

The mean estimated shadow price of housing using the LWR model with a bandwidth of 60 nearest neighbors and a bi-square kernel is $66,901 and the standard deviation is $67,851, as shown in Fig. 20. One of the main purposes of this paper is to correct for the potential bias of over-aggregation in the hedonic estimation used by other researchers. In Table 8 we regress our LWR estimates of the shadow price of housing on the previous estimates of the shadow price of housing from the global hedonic model. The estimated regression parameters allow us to reject, at the 1% significance level, the null hypothesis that the true linear relationship between the two sets of estimates has a slope of one. The standard error of the regression is 35,000, which represents over 50% of the mean estimated shadow price of housing of just under $70,000.

Fig. 20. The spatial and univariate distribution of the shadow price of housing for the LWR model with 60 nearest neighbors and a bi-square kernel function.

Table 8. Regression of LWR shadow price of housing estimates on global OLS model estimates. N = 13,857, R2 = 0.74, σ = 34,672.

Variable  Estimate  Std. error  t value  Pr(>|t|)
Constant  7178  422  17.0  0.0000
Global OLS shadow price  0.93  0.0047  198.0  0.0000

Fig. 21. The spatial and univariate distribution of the shadow price of housing for the LWR model with 1000 nearest neighbors and a bi-square kernel function.
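The test of a unit slope can be reproduced approximately from the rounded figures reported in Table 8. The sketch below assumes a large-sample normal approximation for the critical value (2.576 for a two-sided test at the 1% level); because the inputs are rounded, the resulting statistic is only indicative.

```python
# Rounded values from Table 8.
slope_estimate = 0.93
slope_std_error = 0.0047
null_slope = 1.0

# t-statistic for H0: slope = 1.
t_stat = (slope_estimate - null_slope) / slope_std_error

# Two-sided 1% critical value under a normal approximation (assumption: large sample).
critical_value = 2.576
reject_unit_slope = abs(t_stat) > critical_value

print(round(t_stat, 1), reject_unit_slope)
```

Note that the t value of 198 printed in Table 8 tests a slope of zero, not a slope of one, so the unit-slope test has to be computed separately as above.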

7. Sensitivity analysis

7.1. Uncertainty over LWR bandwidth

Thus far we have reported locally weighted regression results with a bandwidth of 60 nearest neighbors, which was chosen by minimizing the generalized cross validation score as calculated in Eq. (11). Our goal is to allow for spatial heterogeneity in the marginal effect of lot size on housing prices, while the GCV metric is primarily concerned with the tradeoff between regression residuals and the number of model parameters. McMillen and Redfearn (2010) and Pagan and Ullah (1999) suggest that when we are concerned with the derivative of the hedonic function (as we are with our interest in the marginal effect of increased lot sizes) the optimal LWR bandwidth may be bigger than suggested by GCV minimization. Here we show that, in our data, changes in the LWR bandwidth do not substantially change our results. Fig. 21 shows the univariate and spatial distribution of the shadow price of housing estimates with a bandwidth of 1000 nearest neighbors. The mean, $59,195, and standard deviation, $60,042, are very similar to the estimates with a bandwidth of only 60 neighbors. The estimated marginal effects of lot size on house prices are shown in Fig. 22. As can be seen by comparing these results to Fig. 18, we can actually reject the hypothesis that the LWR marginal effects for lot size are equal to the global ordinary least squares estimate across a greater proportion of the study area (this is due to the larger standard errors in the LWR model of 60 neighbors rather than estimates of the marginal effect being closer to $2.15). The mean marginal effect is $2.74, and the standard deviation is $2.64.

7.2. Is the shadow price of housing correlated with demographics?

Cheung et al. (2009a) measure the incidence of their “implicit tax” estimates by regressing their measure against the price of housing and racial characteristics of the neighborhood. They find the tax to be regressive (it increases an average of roughly $.50 for every $1 increase in housing prices) and over $10,000 higher on average in neighborhoods with predominantly Black populations compared to White and Latino neighborhoods.

In Table 9 we perform a regression similar to that of Cheung et al. (2009a). We find that the shadow price of housing increases an average of $0.48 for every dollar increase in the selling price, a result strikingly similar to Cheung et al. (2009a). The demographic characteristics for the area are obtained from the US Census at the place level. Approximately 10,000 of our houses are within 35 places (the other roughly 4000 houses are outside of census designated place boundaries and are not included in this analysis). We characterize each census designated place by the most populous race in the municipality and obtain the following characterizations: Asian=2, Black=0, Latino=17, and White (non-Latino)=16. In Table 9 race=“White” is used as the reference case. The results suggest that the two places with predominantly Asian demographics experience statistically significantly higher estimated shadow prices of housing (by almost $100,000). However, with these two places comprising only roughly 100 of the nearly 10,000 observations, these results should be taken with a grain of salt. We attempt to corroborate these results by including the demographic characteristics as percentages of the total population rather than discrete variables in Table 10.

The regression results from Table 9 appear to hold in Table 10 as well: places with an additional one percent of the population characterized as Asian have estimated shadow prices of housing on average roughly $3000 higher, all other things equal (note that the percentage of the population identifying themselves as White and non-Latino is not included in the regression). Increases in the percentage of the population identified as Black and Latino appear to be correlated with reduced shadow price estimates, although the economic effect is small.

Fig. 22. The spatial and univariate distribution of the marginal effect of increased lot size on house price for the LWR model with 1000 nearest neighbors and a bi-square kernel function. This figure also displays the t-statistics from testing the null hypothesis that the LWR marginal effects are equal to $2.15 (the global model estimate). Those areas with positive sloped hatching have a marginal effect that is more than two standard deviations above the global estimate, while areas with negative sloped hatching are areas with marginal effects more than two standard deviations below the global estimated marginal effect.

Table 9. Regression of shadow price of housing on price and race data. N = 9617, R2 = 0.52, σ = 44,318. We use predominantly white municipalities as the reference case in this regression (note that there are zero predominantly black municipalities among the 35 in our data set).

Variable  Estimate  Std. error  t value  Pr(>|t|)
(Intercept)  −73,873  1770  −42  0.0000
House Price  0.48  0.006  83  0.0000
Race: Asian  92,792  4296  22  0.0000
Race: Latino  −448  965  −0.46  0.6422

Adding the average household income for the census place to the analysis and the percentage of homes that are owner-occupied, as in Table 11, does not seriously change the results. A one dollar increase in the average household income for the census place is correlated with a $0.45 change in shadow price and a one unit increase in the percentage of owner-occupied homes increases the shadow price of housing by $600 on average. The effect of a one percentage increase in the Asian population is further reduced to just under $3000, although it still appears to be economically significant as compared to the other race variables. The percentages of the population identified as Black and Latino are now positively correlated with the shadow price of housing, although the magnitudes are still small compared to the Asian parameter estimates. Adding population variables such as total population or population density does not significantly change the results and as such are not reported here (they are available upon request).

7.3. Regulation and the shadow price of housing

In this section we check for correlation between our shadow price of housing estimates and regulation indices constructed just prior to the start of our study period. Rosenthal (2000) constructed two regulation indices for some census places in California using data from Glickfeld and Levine (1992). The “pro-growth” index measures housing industry perceptions of municipal attitudes toward residential and commercial growth (values range between 0 and 7), while the “exclusion” index measures attitudes towards minorities, low-income residents and issues of affordable housing (values range from 0 to 19). Regulations that determine the measures include restrictions on the number of building permits issued, population growth restrictions, density restrictions, growth management plans, and urban growth boundaries. Higher values of the pro-growth index should be interpreted as municipalities being more friendly to growth, and higher values of the “exclusion” index as municipalities being more likely to exclude low-income and minority housing candidates. Note that some of these regulations may be considered to be “anti-growth” or even “exclusionary,” but because they will affect the housing market via land scarcity (such as growth boundaries), we do not necessarily expect these variables to be strongly correlated with our measure of the shadow price of housing. Further description of these regulatory variables can be found in Quigley et al. (2004).

We do not have regulation data for every municipality in our study and our sample size is further reduced to just under 9000 observations by including the regulation measures in the regression. Table 12 shows that we can explain roughly 50% of the variation in our estimated shadow price of housing using this reduced data set. (Note that this value is smaller than the R2=0.58 in Table 11 with an additional 700 observations, suggesting our model can better predict variation in the shadow price of housing in those places for which we do not have regulation data.) The race parameters are similar in both Tables 11 and 12. One unit increases in the pro-growth index, which varies in our data from 0 to 7, are associated with a $1839 increase in the shadow price of housing. However, the coefficient for the exclusionary index, $1087, actually suggests that more exclusionary communities have smaller shadow prices of housing. These results are robust to many combinations of included and excluded variables. Interestingly, if we regress our estimates of consumer marginal willingness to pay for land (our LWR lot size coefficients) on race and regulation variables, the coefficients are of the expected sign, although the explanatory power is exceptionally low. Table 13 displays these results.

7.4. Construction cost uncertainty

For every dollar we underestimate housing construction costs for our housing data we will overestimate the shadow price of housing. While we obtained construction data in the most detailed manner possible, our estimates of the shadow price of housing may be the result of inaccurate construction data (or inappropriate assumptions regarding the construction quality). In this section we ask the questions, “what construction costs are necessary to eliminate the shadow price of housing?” and “are these costs plausible?” Dividing the shadow price of housing by the living area of the house yields the change in per square foot construction costs needed to set the shadow price of housing equal to zero,

$$\Delta\,\text{construction costs per ft}^2 = \frac{\widehat{\text{Shadow Price of Housing}}}{\text{Living Area}}. \tag{16}$$

The first column in Table 14 displays the change in construction costs per square foot of living space calculated in Eq. (16). The mean increase in construction costs necessary to set the shadow price of housing for each house to zero is $31 per square foot, almost exactly equivalent to the change in construction costs from increasing the construction quality assumption from “average” to “good” using the Marshall and Swift (2002) data. Eq. (16) reminds us that the marginal effect of increased lot size is an estimate and as such contains uncertainty.
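Eq. (16) is a single division, which the following minimal sketch makes explicit. The function name and the example house are hypothetical and chosen only so that the result matches the mean of 31 reported above; they are not drawn from our data.

```python
def delta_cost_per_sqft(shadow_price, living_area):
    """Eq. (16): extra construction cost per ft^2 that would set the shadow price to zero."""
    if living_area <= 0:
        raise ValueError("living area must be positive")
    return shadow_price / living_area


# Hypothetical house: $62,000 estimated shadow price, 2000 ft^2 of living area.
print(delta_cost_per_sqft(62_000, 2_000))  # 31.0
```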

Eq. (17) re-estimates the changes in construction costs necessary to set the shadow price of housing equal to zero while allowing the marginal effect of lot size to be two standard errors higher than the LWR estimates.

Table 13. Regression of marginal effect of lot size on price, race, and regulation indices. N = 8920, R² = 0.04, σ = 5.

| | Estimate | Std. error | t value | Pr(>\|t\|) |
|---|---|---|---|---|
| (Intercept) | 1.35 | 0.410 | 3.3 | 0.0010 |
| House Price | 0.0000052 | 0.00000081 | 6.5 | 0.0000 |
| Percent Asian Pop. | 0.037 | 0.0149 | 2.5 | 0.0139 |
| Percent Black Pop. | −0.077 | 0.0130 | −6.0 | 0.0000 |
| Percent Latino Pop. | −0.013 | 0.0055 | −2.4 | 0.0182 |
| Pro-Growth Index | −0.184 | 0.0500 | −3.7 | 0.0002 |
| Exclusionary Index | 0.079 | 0.0120 | 6.6 | 0.0000 |

Table 11. Regression of shadow price of housing on price, race, owner-occupancy, and income data. N = 9617, R² = 0.58, σ = 41,626.

| | Estimate | Std. error | t value | Pr(>\|t\|) |
|---|---|---|---|---|
| (Intercept) | −136,194 | 5406 | −25.2 | 0.0000 |
| House Price | 0.36 | 0.007 | 54.3 | 0.0000 |
| Percent Asian Pop. | 2787 | 79 | 35.1 | 0.0000 |
| Percent Black Pop. | 189 | 100 | 1.9 | 0.0591 |
| Percent Latino Pop. | 201 | 41 | 4.9 | 0.0000 |
| Average Household Inc. | 0.45 | 0.053 | 8.6 | 0.0000 |
| Percent Owner Occ. Homes | 633 | 68 | 9.4 | 0.0000 |

Table 12. Regression of shadow price of housing on price, race, and regulation indices. N = 8920, R² = 0.5, σ = 42,474.

| | Estimate | Std. error | t value | Pr(>\|t\|) |
|---|---|---|---|---|
| (Intercept) | −51,802 | 3351 | −15.5 | 0.0000 |
| House Price | 0.38 | 0.007 | 58.1 | 0.0000 |
| Percent Asian Pop. | 3310 | 122 | 27.2 | 0.0000 |
| Percent Black Pop. | 233 | 106 | 2.2 | 0.0275 |
| Percent Latino Pop. | 90 | 45 | 2.0 | 0.0428 |
| Pro-Growth Index | −1839 | 408 | −4.5 | 0.0000 |
| Exclusionary Index | −1087 | 98 | −11.1 | 0.0000 |

Table 10. Regression of shadow price of housing on price and race data. N = 9617, R² = 0.57, σ = 42,162.

| | Estimate | Std. error | t value | Pr(>\|t\|) |
|---|---|---|---|---|
| (Intercept) | −63,616 | 2117 | −30.1 | 0.0000 |
| House Price | 0.38 | 0.006 | 61.9 | 0.0000 |
| Percent Asian Pop. | 3012 | 78 | 38.8 | 0.0000 |
| Percent Black Pop. | 208 | 102 | 2.1 | 0.0402 |
| Percent Latino Pop. | −73 | 38 | −1.9 | 0.0559 |


A value of zero implies that increasing the marginal effect of lot size eliminated the shadow price of housing. Values in the second column of Table 14 greater than 35 imply that even with both an increase in the construction quality from “average” to “good” and increasing consumers' marginal willingness to pay for land by two standard errors there would still be an estimated shadow price of housing greater than zero. Table 14 suggests that this is the case for roughly 25% of the houses in our study area.

$$\Delta\,\text{construction costs per ft}^2\ (\text{low}) = \frac{\widehat{\text{Shadow Price of Housing}} - 2 \times \sigma_{P_L} \times \text{Lot Size}}{\text{Living Area}}. \tag{17}$$
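A minimal sketch of the Eq. (17) adjustment follows. It assumes the standard error of the LWR lot size coefficient is expressed in dollars per square foot of lot; the function name and the example inputs are hypothetical, and the comparison value of 35 is the threshold quoted in the text.

```python
def delta_cost_low(shadow_price, lot_coef_se, lot_size, living_area):
    """Eq. (17): Eq. (16) after raising the lot size effect by two standard errors."""
    if living_area <= 0:
        raise ValueError("living area must be positive")
    return (shadow_price - 2.0 * lot_coef_se * lot_size) / living_area


# Hypothetical house: $62,000 shadow price, standard error of $1.50 per ft^2
# on the lot size coefficient, 8000 ft^2 lot, 2000 ft^2 of living area.
low = delta_cost_low(62_000, 1.5, 8_000, 2_000)
print(low)       # 19.0
print(low > 35)  # False
```

For this made-up house the adjusted value falls below 35, so the quality upgrade combined with the higher land valuation would be enough to remove the estimated shadow price.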

Another way to check and see if we are generating positive shadow price of housing estimates by underestimating the construction costs is to regress the shadow price of housing on the LWR coefficients for house living area. In areas with higher quality housing construction we should expect to see larger coefficients associated with house living area. Larger shadow prices of housing in areas with higher quality housing then suggest that at least part of the estimated shadow price of housing is due to underestimation of the construction costs. Table 15 shows that houses with higher estimated marginal values of living space, which may serve as a proxy for construction quality, tend to have higher shadow prices. However, whereas each additional dollar increase in the living space hedonic coefficient will reduce the shadow price of housing by the size of the house (roughly 2000 ft²), the regression results indicate that in our data those houses with one dollar higher living space valuations are only associated with shadow prices a few hundred dollars higher. Thus we conclude that inaccuracies concerning housing construction quality are not the driving force behind our positive shadow price estimates.
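The comparison in this paragraph is simple arithmetic and can be checked directly. In the sketch below, the 2000 ft² house size is the rough figure quoted above and 298 is the LWR living area coefficient from Table 15; the variable names are ours.

```python
house_size_sqft = 2000   # rough house size quoted in the text
table_15_coef = 298      # dollars of shadow price per $1 of living area coefficient

# If a $1 higher living-space valuation were entirely unmeasured construction
# cost, the shadow price would be misstated by $1 for every square foot.
implied_by_cost_error = 1 * house_size_sqft

share = table_15_coef / implied_by_cost_error
print(implied_by_cost_error, share)  # 2000 0.149
```

The estimated association is roughly 15% of what a pure construction cost error would imply.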

8. Future work

While our work has explored the effect of relaxing the hedonic analysis assumption of globally invariant parameters, we have had to make other assumptions that future researchers should test. For instance, this work has thus far ignored the temporal aspects of our data, instead choosing to use the Conventional Mortgage Housing Price Index and cross-section econometric tools. Future work may want to explicitly model the time dimension of housing data with LWR models to explore such questions as whether the effects of regulation are changing over time and whether the optimal level of data aggregation has changed over time. Future researchers should attempt to collect micro-level construction cost data, as the previous section showed that our results can be changed with different construction cost assumptions.

Further work is also necessary regarding the relationship between individual regulations and the shadow price of housing. The regulation indices used in this analysis are old and given the extensive growth that occurred in this area after the data were collected, the measures may be out of date. The regulatory indices were not designed to differentiate between policies expected to influence the housing market via the land market (in which case the regulation may have large impacts on the price of housing but no evidence of this will appear in our statistical test) and policies that will affect the housing market through the creation of a positive shadow price of housing (such as restrictions on the number of building permits). Future research should attempt to correlate the shadow price of housing with the presence/absence of such individual policies.

More work should explore the appropriateness of using Locally Weighted Regression models in housing hedonic research. The LWR model appears to be intuitively appropriate in the presence of micro-market fluctuations in which the value of land and other attributes vary smoothly over small scales. However, if the market for housing attributes is not smoothly varying over space, but is instead characterized by “boundary effects,” i.e. in the case of a high-performance school district located next to a low-performance district, the LWR model, while it may give better results than a global model, still represents a model misspecification. Future work may want to perform simulations with generated data to compare LWR results under such conditions. Such simulations could also further explore the question of appropriate bandwidth selection for estimating the partial derivative of the hedonic function (as is the case for our interest in the marginal willingness to pay for land) vs. minimized prediction error. In short, we believe that locally weighted regression is a powerful econometric tool that deserves continued study in the context of hedonic analysis.
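One way such a simulation might be set up is sketched below. It is a toy under strong assumptions and not the estimator used in this paper: locations lie on a line, there is a single regressor standing in for lot size, the data are noise-free, the bandwidth is taken to be the distance to the k-th nearest neighbor, and the true slopes of 2 and 6 on either side of the boundary are made up. The function names are hypothetical.

```python
def bisquare(d, h):
    """Bi-square kernel weight for distance d and bandwidth h."""
    if d >= h:
        return 0.0
    u = d / h
    return (1.0 - u * u) ** 2


def lwr_slope(target, locs, x, y, k):
    """Locally weighted slope of y on x (with intercept) at `target`."""
    nearest = sorted((abs(s - target), i) for i, s in enumerate(locs))[:k]
    h = nearest[-1][0]  # assumption: bandwidth = distance to the k-th neighbor
    w = [(bisquare(d, h), i) for d, i in nearest]
    sw = sum(wi for wi, _ in w)
    xbar = sum(wi * x[i] for wi, i in w) / sw
    ybar = sum(wi * y[i] for wi, i in w) / sw
    sxy = sum(wi * (x[i] - xbar) * (y[i] - ybar) for wi, i in w)
    sxx = sum(wi * (x[i] - xbar) ** 2 for wi, i in w)
    if sxx == 0:
        raise ValueError("no variation in x among the weighted neighbors")
    return sxy / sxx


# Generated data: 100 houses on a line with a sharp boundary between 49 and 50.
locs = list(range(100))
x = [1.0 + (min(i, 99 - i) % 5) for i in locs]       # regressor, symmetric about the boundary
true_slope = [2.0 if s < 50 else 6.0 for s in locs]  # hypothetical marginal values
y = [b * xi for b, xi in zip(true_slope, x)]

interior = lwr_slope(10, locs, x, y, k=20)    # all neighbors on the left side
boundary = lwr_slope(49.5, locs, x, y, k=20)  # neighbors straddle the boundary
print(round(interior, 6), round(boundary, 6))  # 2.0 4.0
```

Away from the boundary the local fit recovers the true slope, while at the boundary it returns a blend that is correct for neither side, which is the kind of misspecification described above.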

9. Conclusion

Recent work in urban economics has criticized the commonly used techniques of hedonic analysis, most notably the assumption of constant model parameters across the geographical study area. Researchers such as McMillen and Redfearn (2010), Redfearn (2009), and Pavlov (2000) have begun to explore the impacts of relaxing this unrealistic assumption through the use of locally weighted regression. This regression technique allows the regression parameters to continuously vary over space to reflect changing hedonic parameters. Such a formulation is consistent with the standard neo-classical approach to urban economics, which suggests the price of land varies across locations and access to amenities.

Locally weighted regression techniques result in substantially smaller average model prediction errors than other models in our study in the Inland Empire of Southern California. The generalized cross validation score (a regression diagnostic statistic closely related to the standard error of the regression) is roughly $26,000 per house for our preferred LWR model vs. almost $60,000 under the assumption of global homogeneity, or roughly $31,000 when the analysis is performed at the census tract-level. In a world in which Netflix awarded a $1 million prize for reducing the root mean square error (similar to the generalized cross validation score) for movie recommendations by 10%, we think reductions of over 50 and 15% for estimates related to a roughly $20 trillion residential real estate market¹⁴ are quite remarkable and important.
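The percentage reductions quoted above follow from the rounded generalized cross validation scores in this paragraph. The short check below uses those rounded figures, so its output is approximate; `pct_reduction` is a hypothetical helper name.

```python
def pct_reduction(baseline, improved):
    """Percentage by which `improved` falls below `baseline`."""
    return 100.0 * (baseline - improved) / baseline


lwr_gcv = 26_000  # preferred LWR model (rounded, from the text)
print(round(pct_reduction(60_000, lwr_gcv), 1))  # 56.7, vs. the global model
print(round(pct_reduction(31_000, lwr_gcv), 1))  # 16.1, vs. the census tract-level model
```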

Table 15. Regression of shadow price of housing on locally weighted regression house living area coefficients, house price, race, and regulation indices. N = 8920, R² = 0.53, σ = 41,239.

| | Estimate | Std. error | t value | Pr(>\|t\|) |
|---|---|---|---|---|
| Constant | −67,187 | 3320 | −20.2 | 0.0000 |
| House Price | 0.36 | 0.007 | 55.4 | 0.0000 |
| LWR Living Area Coefficient | 298 | 12.8 | 23.3 | 0.0000 |
| Percent Asian Pop. | 2662 | 121 | 21.9 | 0.0000 |
| Percent Black Pop. | 198 | 103 | 1.9 | 0.0541 |
| Percent Latino Pop. | 208 | 43.6 | 4.8 | 0.0000 |
| Pro-Growth Index | −2303 | 397 | −5.8 | 0.0000 |
| Exclusionary Index | −981 | 95.2 | −10.3 | 0.0000 |

Table 14. Change in per square foot housing construction costs necessary for the shadow price of housing to be zero. Column ∆ Cons. costs uses the marginal effects of lot size estimates from the LWR model with 60 nearest neighbors and ∆ Cons. costs low uses marginal effects estimates two standard errors larger than the LWR model with 60 nearest neighbors.

| Percentile | ∆ Cons. costs | ∆ Cons. costs low |
|---|---|---|
| 5th | −15 | −45 |
| 15th | 4 | −17 |
| 25th | 12 | −4 |
| 35th | 18 | 4 |
| 45th | 24 | 10 |
| 55th | 30 | 17 |
| 65th | 37 | 23 |
| 75th | 47 | 32 |
| 85th | 64 | 47 |
| 95th | 93 | 76 |

14 According to Table B.100 in the Flow of Funds Accounts of the United States, June 1, 2009 release, available at: http://www.federalreserve.gov/releases/z1/Current/z1r-5.pdf.


We take advantage of LWR techniques to test the hypothesis that the equilibrium returns to land for competing uses will be identical. This concept is both simple and appealing. Consumers will bid up land prices in areas with access to valuable amenities and producers will respond to these increased prices by trying to fit more people into these areas. If land is more valuable being devoted to the production of additional housing, then land should be subdivided to produce more housing. Regulatory constraints on the allowable number of housing units prevent such adjustments and create a positive shadow price of housing: a wedge between the price of housing and the value of the structure and land.

Knowing whether a positive shadow price of housing exists can help policy makers construct efficient future policies with higher benefits and lower overall costs to society. For instance, in a world of rationed housing production marginal taxes on housing production will have no effect on output, while land restrictions and taxes cause nothing more than an increase in housing density, if allowed by the regulatory regime. However, policies that further limit housing production will have marginal costs much larger than predicted under the standard economic model. Land preservation for biodiversity protection or provision of open space may be accomplished at lower cost through better instrument choice when the underlying market structure is better understood.

Our work finds that measures of the shadow price of housing are robust to varying levels of data aggregation and locally weighted regression techniques. We used a data set of almost 14,000 newly constructed and sold single-family houses in the Inland Empire of Southern California. Our preferred hedonic estimation strategy allows the value of land to vary smoothly over space rather than imposing exogenous and unsupported homogeneity restrictions on the data.

Our results suggest that municipalities with higher rates of owner-occupied housing had higher shadow prices of housing, while those considered to be more “pro-growth” by researchers in the 1990s had smaller estimated shadow prices of housing (but those same municipalities considered to be more “exclusionary” also had smaller shadow price estimates). The estimates of consumer willingness to pay for land were positively correlated with community measures of exclusion and negatively correlated with measures of pro-growth attitudes. Our results underscore the central role that regulation should play in models of the urban economy, and as noted in Quigley et al. (2004), future work is necessary to collect and catalogue regulation data across markets and time.

Acknowledgment

A special thanks to Jonathan Blaufuss for outstanding research assistance.

Appendix A

This Appendix contains additional figures exploring the results of our analysis. Figs. A.1–A.3 show the distribution of the shadow price of housing under alternative global models. While our work is mostly concerned with the model coefficient estimates for the lot size variable, in this section we also display the distribution of the other LWR model covariates: living area, number of bathrooms, number of bedrooms, time trend and the model intercept.

Fig. A.1. The spatial and univariate distribution of the shadow price of housing estimates (log–log model). The mean shadow price of housing is $72,312 and the standard deviation is $61,266.


Fig. A.2. The spatial and univariate distribution of the shadow price of housing estimates (quadratic model). The mean shadow price of housing is roughly $71,000 with a standard deviation of $64,000.

Fig. A.3. The spatial and univariate distribution of the shadow price of housing estimates (quadratic model). The mean shadow price of housing is $60,271 and the standard deviation is $64,706.


Fig. A.4. The spatial distributions of the number of bathrooms coefficient estimates for varying LWR bandwidths (clockwise from upper-left: 13,856, 6500, 1000, and 60 nearest neighbors) and a bi-square kernel.


Fig. A.5. The spatial distributions of the house living area coefficient estimates for varying LWR bandwidths (clockwise from upper-left: 13,856, 6500, 1000, and 60 nearest neighbors) and a bi-square kernel.


Fig. A.6. The spatial distributions of the number of bedrooms coefficient estimates for varying LWR bandwidths (clockwise from upper-left: 13,856, 6500, 1000, and 60 nearest neighbors) and a bi-square kernel.


Fig. A.7. The spatial distributions of the time trend coefficient estimates for varying LWR bandwidths (clockwise from upper-left: 13,856, 6500, 1000, and 60 nearest neighbors) and a bi-square kernel.


Fig. A.8. The spatial distributions of the model intercept estimates for varying LWR bandwidths (clockwise from upper-left: 13,856, 6500, 1000, and 60 nearest neighbors) and a bi-square kernel.


References

Alonso, William, 1964. Location and Land Use. Harvard University Press.

Anderson, Soren T., West, Sarah E., 2006. Open space, residential property values, and spatial context. Regional Science and Urban Economics 36 (6), 773–789.

Beckmann, Martin J., 1969. On the distribution of urban rent and residential density. Journal of Economic Theory 1, 60–67.

Brasington, David M., 1999. Which measures of school quality does the housing market value? Journal of Real Estate Research 18 (3), 395–413.

Brueckner, Jan K., 1983. The economics of urban yard space: an “implicit market” model for housing attributes. Journal of Urban Economics 13, 216–234.

Casetti, E., 1972. Generating models by the spatial expansion method: applications to geographical research. Geographical Analysis 4, 81–91.

Cheung, Ron, Ihlanfeldt, Keith, Mayock, Thomas, 2009a. The incidence of the land use regulatory tax. Real Estate Economics 37.

Cheung, Ron, Ihlanfeldt, Keith, Mayock, Thomas, 2009b. The regulatory tax and house price appreciation in Florida. Journal of Housing Economics 18, 34–48.

Cleveland, William S., Devlin, S.J., 1988. Locally weighted regression: an approach to regression analysis by local fitting. Journal of the American Statistical Association 83, 596–610.

Davis, Morris A., Palumbo, Michael G., 2007. The price of residential land in large US cities. Journal of Urban Economics.

Economic and Planning Systems, 2005. Economic Analysis of Critical Habitat Designation for the Arroyo Toad. Berkeley, CA.

Fotheringham, A. Stewart, Brunsdon, Chris, Charlton, Martin, 1998. Geographically weighted regression: a natural extension of the expansion method for spatial data analysis. Environment and Planning A 30 (11), 1905–1927.

Fotheringham, A. Stewart, Brunsdon, Chris, Charlton, Martin, 2002. Geographically Weighted Regression: The Analysis of Spatially Varying Relationships. John Wiley and Sons, Chichester, England.

Glaeser, Edward, Gyourko, Joseph, 2003. The impacts of building restrictions on housing affordability. Federal Reserve Bank of New York Economic Policy Review 9, 21–39.

Glickfeld, Madelyn, Levine, Ned, 1992. Regional Growth, Local Reaction: The Enactment and Effects of Local Growth Control and Management Measures in California. Lincoln Land Institute.

Harrison Jr., David, Rubinfeld, Daniel L., 1978. Hedonic housing prices and the demand for clean air. Journal of Environmental Economics and Management 5 (1), 81–102.

Irwin, Elena, 2002. The effects of open space on residential property values. Land Economics 78 (4), 465–480.

Marshall and Swift, 2002. Residential Cost Handbook. Marshall and Swift.

McMillen, Daniel P., 1996. One hundred fifty years of land values in Chicago: a nonparametric approach. Journal of Urban Economics 40, 100–124.

McMillen, D.P., Redfearn, C.L., 2010. Estimation and hypothesis testing for non-parametric hedonic house price functions. Journal of Regional Science 50 (3), 712–733.

Mills, Edwin S., 1967. An aggregative model of resource allocation in a metropolitan area. The American Economic Review 57, 197–210.

Mills, Edwin S., 1972. Studies in the Structure of the Urban Economy. The Johns Hopkins Press.

Muth, Richard, 1969. Cities and Housing. University of Chicago Press.

Pagan, Adrian, Ullah, Aman, 1999. Nonparametric Econometrics. Cambridge University Press.

Pavlov, Andrey D., 2000. Space-varying regression coefficients: a semi-parametric approach applied to real estate markets. Real Estate Economics 28, 249–283.

Quigley, John M., Raphael, Steven, Rosenthal, Larry, 2004. Local land-use controls and demographic outcomes in a booming economy. Urban Studies 41 (2), 389–421.

Redfearn, Christian L., 2009. How informative are average effects? Hedonic regression and amenity capitalization in complex urban housing markets. Regional Science and Urban Economics 39, 297–306.

Rosenthal, Stuart, 1999. Residential buildings and the cost of construction: new evidence on the efficiency of the housing market. The Review of Economics and Statistics 81 (2), 288–302.

Rosenthal, Larry, 2000. Long Division: California's Land Use Reform Policy and the Pursuit of Residential Integration. PhD thesis, Goldman School of Public Policy, University of California–Berkeley.

Somerville, C.T., 1996. The contribution of land and structure to builder profits and house prices. Journal of Housing Research 7 (1), 127–141.
