Soofi Demo

Post on 23-Feb-2017

Transcript of Soofi Demo

CrowdSkippr

Wafa Soofi

Yosemite trip, May 2011

We had a great time…

Though it might have been better if the scenery had looked less like this

Photo: alanak

and more like this.

Photo: Gianluca Vegetti

The problem

I want to go hiking at a time/day that works for me, but that also minimizes the size of the crowds.

I would like to predict the crowd size for a specific location and a range of future dates.

Then I can use that prediction to make an intelligent choice about when to take my trip.

How do we predict crowds right now?

Government data: often aggregated; not always immediately accessible

Check-ins: sparse coverage

Prior knowledge/intuition: not always validated

There’s another way.

We can crowdsource this problem!

CrowdSkippr: Inner workings

From flickr.com, extract the total number of photos taken at a given time/place.

From NOAA.gov, extract temperature data for a given time/place.

Using this information, create a prediction of how heavy the crowds will be at a given future time/place.
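The data-preparation step above can be sketched as follows. This is only an illustration, assuming the photo counts and temperatures have already been downloaded into per-day dictionaries for one location; the names `HOLIDAYS`, `make_features` and `build_dataset` are hypothetical and do not come from the slides.

```python
from datetime import date

# Hypothetical fixed-date holidays as (month, day); a real list would be longer.
HOLIDAYS = {(1, 1), (7, 4), (12, 25)}

def make_features(day, temperature):
    """Turn one calendar day plus its temperature into a predictor row."""
    return [
        float(day.weekday()),                     # day of week, Monday = 0
        float((day.month, day.day) in HOLIDAYS),  # holiday flag, 1.0 or 0.0
        float(day.timetuple().tm_yday),           # day of year, 1..366
        float(temperature),                       # daily temperature
    ]

def build_dataset(photo_counts, temperatures):
    """Join per-day photo counts and temperatures on their shared date key."""
    rows, targets = [], []
    for day in sorted(photo_counts):
        if day not in temperatures:  # skip days without a temperature reading
            continue
        rows.append(make_features(day, temperatures[day]))
        targets.append(photo_counts[day])
    return rows, targets
```

Each row pairs one day's calendar and temperature values with that day's photo count, which is the shape a regression model needs for training.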


Gradient Boosting Regression

Predictors:
Day of week (Flickr)
Holiday flag (Flickr)
Day of year (Flickr)
Daily temperature (NOAA)

Response:
Number of photos taken (Flickr) (proxy for size of crowd)
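To make the modelling idea concrete, here is a deliberately tiny gradient boosting regressor written in plain Python: each round fits a one-split tree (a stump) to the current residuals and adds a scaled copy of it to the model. The slides do not say which implementation or settings were used, so the function names, the stump learner, and the `n_rounds` and `learning_rate` defaults are all assumptions; in practice a library class such as scikit-learn's `GradientBoostingRegressor` would be the usual choice.

```python
def fit_stump(X, residuals):
    """Find the single-feature threshold split with the lowest squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    if best is None:  # no feature varies: fall back to a constant stump
        mean = sum(residuals) / len(residuals)
        return (0, float("inf"), mean, mean)
    return best[1:]  # (feature index, threshold, left value, right value)

def stump_predict(stump, row):
    j, t, left, right = stump
    return left if row[j] <= t else right

def fit_gbr(X, y, n_rounds=100, learning_rate=0.1):
    """Squared-loss gradient boosting: repeatedly fit stumps to residuals."""
    base = sum(y) / len(y)  # start from the mean response
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, row)
                 for p, row in zip(preds, X)]
    return base, learning_rate, stumps

def predict(model, row):
    """Predicted photo count (crowd proxy) for one predictor row."""
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, row) for s in stumps)
```

Here each row of `X` would hold the four predictors (day of week, holiday flag, day of year, daily temperature) and `y` the number of photos taken on that day.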

Wait: Is # photos a good proxy for # visitors?

[Figure: photos (or visitors) per month, normalized by total. Series: Photos, Visitors.]

[Figure: number of visitors (0 to 700,000) plotted against number of photos (0 to 14,000). R² = 0.89.]
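A check like this one can be reproduced in a few lines. The sketch below normalizes each monthly series by its total and computes R² as the squared Pearson correlation, which equals the R² of a simple linear fit of one series on the other. How the 0.89 on the slide was actually computed is not stated, so treat this as one plausible reading; `normalize` and `r_squared` are hypothetical helper names.

```python
def normalize(counts):
    """Express each month's count as a share of the yearly total."""
    total = sum(counts)
    return [c / total for c in counts]

def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)
```

Calling `r_squared(monthly_photos, monthly_visitors)` on the two monthly series gives a value between 0 and 1; the closer to 1, the better photo counts track visitor counts.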

[Figure: relative feature importance (scale 0 to 1), Yosemite National Park. Features: Day of Year, Temperature, Day of Week, Holiday.]

Thanks for your time! I’m Wafa.

For all 28-day windows in a given year, the median difference between crowd size on predicted and actual best days is 4.6%.

(On the days that are predicted to have the lowest crowds, the crowd size is 29% of the worst possible crowds within that window.)
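One way to compute a windowed metric of this kind is sketched below. For every 28-day window it picks the day the model predicts to be quietest, looks up the actual crowd on that day, and compares it with the actual quietest day in the same window. The slides do not say what the percentage is measured against; dividing by the worst (largest) actual crowd in the window is an assumption made here, and `window_gaps` and `median_gap` are hypothetical names.

```python
from statistics import median

def window_gaps(predicted, actual, window=28):
    """Per-window gap between the predicted-best and actual-best day."""
    gaps = []
    for start in range(len(actual) - window + 1):
        days = range(start, start + window)
        pred_best = min(days, key=lambda d: predicted[d])  # day the model picks
        true_best = min(days, key=lambda d: actual[d])     # truly quietest day
        worst = max(actual[d] for d in days)
        if worst == 0:  # no crowds at all in this window: nothing to compare
            continue
        # ASSUMPTION: the gap is expressed as a share of the window's worst crowd.
        gaps.append((actual[pred_best] - actual[true_best]) / worst)
    return gaps

def median_gap(predicted, actual, window=28):
    """Median of the per-window gaps, as a fraction between 0 and 1."""
    return median(window_gaps(predicted, actual, window))
```

A perfect model would give a median gap of 0, because its predicted best day would always coincide with the actual best day.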

Validation: Rocky Mountain National Park

[Figure: predicted crowd size and actual crowd size (test data).]