Predicting Rental Prices in NYC
Transcript of Predicting Rental Prices in NYC
predicting rental unit prices in NYC
Joel Carlson
http://joelcarlson.me
two hypotheses
1. Changes in the number of issued liquor licenses precede changes in rental prices
2. Changes in the number of taxi pickups and drop-offs precede changes in rental prices
motivation
1. Identify regions ripe for investment
2. Identify areas which may undergo gentrification
• Give policy makers an early chance to implement rent controls
the data
Rental Unit Prices
• Published by Zillow
Liquor Licenses
• NY Gov’t Liquor Authority Database*
Taxi pickups/drop-offs
• Published by NY City Gov’t
• ~30 GB / year
* Databases were, unfortunately, harmed in the creation of this project
data pipeline
Raw Data : Roughly oscillatory trend
Raw Data : Month-over-month changes are too noisy
STL Decomposition
• STL Data (Trend): Acquire oscillatory trend
• STL Data (Seasonal): Extract seasonal component
• STL Data (Remainder): Discard remainder
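The three-way split above can be illustrated with a minimal sketch. This is not STL itself (STL fits the trend and seasonal parts with LOESS smoothing); it swaps in a simpler classical additive decomposition based on a centered moving average, which yields the same three outputs. The function names, the 12-month period, and the toy series are assumptions made for illustration and are not taken from the project.

```python
from statistics import mean


def centered_moving_average(values, period):
    """Centered 2 x period moving average; None where the window does not fit."""
    half = period // 2
    trend = [None] * len(values)
    for t in range(half, len(values) - half):
        window = values[t - half : t + half + 1]
        # The two end points get half weight so an even period stays centered on t.
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        trend[t] = total / period
    return trend


def decompose(values, period=12):
    """Split a monthly series into trend, seasonal, and remainder lists.

    Simplified stand-in for STL (assumption): moving-average trend,
    per-month average of the detrended values as the seasonal part.
    """
    if period % 2 != 0:
        raise ValueError("this sketch only handles even periods")
    if len(values) < 2 * period:
        raise ValueError("need at least two full periods of data")
    trend = centered_moving_average(values, period)
    detrended = [v - tr if tr is not None else None for v, tr in zip(values, trend)]
    # Average the detrended values at each position in the yearly cycle.
    raw = [
        mean(d for d in detrended[k::period] if d is not None)
        for k in range(period)
    ]
    offset = mean(raw)  # force the seasonal pattern to sum to zero
    pattern = [s - offset for s in raw]
    seasonal = [pattern[t % period] for t in range(len(values))]
    remainder = [
        v - tr - s if tr is not None else None
        for v, tr, s in zip(values, trend, seasonal)
    ]
    return trend, seasonal, remainder


# Hypothetical toy series: a linear rise plus a repeating 12-month pattern.
toy_pattern = [3, 1, -2, -4, -1, 0, 2, 5, 1, -3, -2, 0]
toy_series = [100 + 2 * t + toy_pattern[t % 12] for t in range(48)]
toy_trend, toy_seasonal, toy_remainder = decompose(toy_series)
```

In the slides' terms, `toy_trend` corresponds to the component that is kept, `toy_seasonal` to the extracted seasonal component, and `toy_remainder` to the part that is discarded. The first and last six months have no trend value because the centered window does not fit there.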
Raw Data vs. Processed Data
Processed Data: Prediction Target!
Model Features (lagged 3 to 12 months):
• Monthly changes in number of liquor licenses issued
• Monthly changes in taxi pickups and drop-offs
• Historical changes in price
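As a rough illustration of how lagged features like these might be assembled, the sketch below builds one feature row per month from values observed 3 to 12 months earlier. It assumes the series are already aligned by month for a single zip code; the function names, column naming scheme, and dictionary layout are hypothetical and not taken from the project's code.

```python
def monthly_changes(levels):
    """Month-over-month differences of a list of monthly values."""
    return [later - earlier for earlier, later in zip(levels, levels[1:])]


def lagged_rows(target, features, min_lag=3, max_lag=12):
    """Build (columns, X, y) for a lagged-feature model.

    target:   list of monthly changes to predict (e.g. rental price changes)
    features: dict mapping a name to a list aligned month-by-month with target
    Each row for month t holds every feature at lags min_lag..max_lag,
    so no information from the most recent (min_lag - 1) months is used.
    """
    for name, series in features.items():
        if len(series) != len(target):
            raise ValueError(f"series {name!r} is not aligned with the target")
    names = sorted(features)
    lags = range(min_lag, max_lag + 1)
    columns = [f"{name}_lag{lag}" for name in names for lag in lags]
    X, y = [], []
    # The first max_lag months cannot be used: their lagged values do not exist.
    for t in range(max_lag, len(target)):
        X.append([features[name][t - lag] for name in names for lag in lags])
        y.append(target[t])
    return columns, X, y
```

Passing the target series itself as one of the features (for example under the key `"price"`) gives the "historical changes in price" inputs, and leaving out the liquor and taxi entries gives the reduced feature set.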
pipeline
Liquor Data
Taxi Data
Rental Price Data
Aggregate
Synchronize
Trend
Train/optimize Models
• Vector Autoregression (VAR)
• Random Forest
• Random Forest w/o L+T (liquor and taxi features)
Goal 1 : Test taxi and liquor license hypotheses
Goal 2 : Accurately forecast monthly changes
are the models accurate?
VAR found no statistically significant relationship between taxis + liquor licenses and rent increases
[Figure: Model Forecasts for a Single Zip-code. Series: Random Forests, VAR, Target. Periods: training, forecast]
how far can the models predict?
Forecast Accuracy for All NY Zip-codes
• Hoped to observe the full RF (with liquor and taxi features) outperform the RF without them on long-term predictions
• Failed to observe this
• Adding taxi and liquor data does not improve predictions
• Confirms VAR finding
what can the models tell us about NY?
NYU has identified a number of regions which have been undergoing gentrification
Three categories:
1. Gentrifying
• Low-income in 1990, experienced rent growth above the median between 1990 and 2014
2. Non-Gentrifying
• Started off as low-income in 1990 but experienced more modest growth than gentrifying areas
3. Higher Income
• Those that were already at high income levels in 1990
what can the models tell us about NY?
Bimodal accuracy distribution by zip code:
For some zip codes, models trained with liquor and taxi data clearly outperform models trained without them
These regions are almost all gentrifying
• Bed-Stuy and Crown Heights
• Bronx near Yankee Stadium
• Jackson Heights near Citi Field (Mets)
• Not included in NYU map
• Google results from 2016 indicate gentrification has just begun
Regions where Liquor and Taxi Models Perform Better
Perhaps there is some signal after all…
Thank you!
http://joelcarlson.me
github.com/joelcarlson
github.com/joelcarlson/CityPredictions