A Stochastic Model for Seasonal Cycles in Crime - UCLAbertozzi/WORKFORCE/REU 2013/Crime Fighters...A...

A Stochastic Model for Seasonal Cycles in Crime

Yunbai Cao Kun Dong Beatrice Siercke Matt Wilber

August 9, 2013

(UCLA, Harvey Mudd) August 9, 2013 1 / 43

Introduction

Introduction


Introduction


Objective

We would like to create a mathematical model for theseasonal variation of burglary in Los Angeles, CA andHouston, TX, in order to forecast future crime trends.While seasonality of crime data is widely agreed upon,it is poorly understood, and we seek to determine if amodel can reproduce seasonality for a wide variety ofdata sets without relying on environmental factors.

Introduction

Criminology Background


Inconsistent Seasonal Folklore

Disagreement over seasonality of crimes

Variation with crime type

Burglary peak in summer or winter

Geographical variation of behavior

Inconsistent hypotheses of what drives seasonality

Introduction

Data Sets


Los Angeles Houston

Data sets contain property crime rates

LA data 2005-2013, Houston 2009-2013

Would like to divide into long-term trend, seasonal,and noise components Xt = Tt + St + Nt

Methods

Methods


Methods

Singular Spectrum Analysis


Decomposition of timeseries X = (x1, x2, . . . , xN)into elementaryreconstructed series

Window lengthL = N K + 1Low-rank approximation oforiginal time series

Allows us to extractlong-term trend,seasonality, and noise

M =

x1 x2 x3 xKx2 x3 x4 xK+1x3 x4 x5 xK+2...

......

. . ....

xL xL+1 xL+2 xN

Methods



Interested in the first six or so singular values

Methods



First modecorresponds tolong-term trend

Modes 2 and 5display yearlyperiods

Methods

Trend Extraction

Los Angeles Houston


Methods

Seasonality Extraction

Los Angeles Houston


Methods

Crime Type Comparisons


Aggravated Assault,Murder, and Theft have aclear downward trend

Auto Theft and Robberydip then rise

Burglary and Rape brieflyincrease after two years,but otherwise decrease

Methods



Aggravated Assault, AutoTheft, Burglary, Rape, andTheft all have clear yearlyseasonal components

Methods



Murder has no seasonal component, only noise

Robbery has a semi-annual seasonal component

Methods


In summary,

Extracted a clear trend for each crime type

Most types have a yearly seasonal component

Murder and Robbery do not have yearly components


Methods

Stochastic Differential Equation (SDE)


We will consider two-dimensional SDEs of the form

dXt = f (Xt ,Yt) dt + g(Xt ,Yt) dWt

dYt = f (Xt ,Yt) dt + g(Xt ,Yt) dVt

White-noise terms dWt and dVt represent Brownianmotion

Methods

Stochastic Differential Equation (SDE)


Ito Calculus uses a different chain rule. If we consideru(t, x , y), then

du =u

tdt + u, dXt+

1

2Hess(u)dG , dG

where

dXt =[dXtdYt

]and dG =

[g(Xt ,Yt) dWtg(Xt ,Yt) dVt

].

Methods

Lotka-Volterra

Follows the equations

x = x( y), x(0) = x0y = y( x), y(0) = y0


Predator-PreyModel

Growth/decay rateparameters ,

Interaction effectparameters ,

Used incriminologyliterature

Natural periodicsolutions

Methods

Lotka-Volterra

Used a stochastic version of Lotka-Volterra,

dXt = Xt( Yt) dt + 1Xt dWt , X (0) = X0dYt = Yt( Xt) dt + 2Yt dVt , Y (0) = Y0.

White noise, IID terms dWt and dVt

Noise is proportional to size of population at time t

Xt = Available targets per criminal

Yt = Crime rate


Methods

Energy of the Model


Well-known energy of the Lotka-Volterra Model:

E = log y + log x y x

Using Itos Lemma, we find

Et = E0 +

t0

1( Xt) dWt + 2( Yt) dVt

1

2(1

2 + 22)t

Methods

Energy of the Model


Expected Energy

Taking expectation,

E(Et) = E(E0)1

2(1

2 + 22)t.

Energy decreases over time

Orbit tends to leave equilibrium

Built-in period variation

Can be used to validate weak order of scheme

Methods

Numerics For Stochastic Lotka-Volterra


Used semi-implicit scheme, derived by setting

Xt+1 Xt + (Xt t Xt+1Yt t + 1Xt Wt)

Yt+1 Yt + (XtYt t Yt+1 t + 2Yt Vt)(1)

where t is the timestep and Wt ,Vt

t N(0, 1).

Methods

Numerics For Stochastic Lotka-Volterra

Solving for Xt+1, Yt+1 results in thescheme

Xt+1 =1 + t + 1Wt

1 + YttXt

Yt+1 =1 + Xtt + 2Vt

1 + tYt .

Weak first order, but quick to compute

Maintains positivity with very highprobability


Results

Results


Results

Simulations


Plot1 = 0, 2 = 0.02

Least squareparameterestimation

Minimum leastsquares differencefrom data

Missing period

Results

Simulations


Los Angeles Houston

Reproduced missing period

Results

Simulations


Los Angeles Aggravated Assault Data in red,simulation in blue

Better fit withoutmissed period orwidely varyingpeak heights

Shows that ourmodel worksacross a variety ofdata

Results

Simulations


Houston Aggravated Assault

Data in red,simulation in blue

Criminologicalimplications

Results

LA Burglary Video


laclip.mpgMedia File (video/mpeg)

Results

Houston Burglary Video


HTclip.mpgMedia File (video/mpeg)

Results

Missing Period Statistics

Want to identifypeaks in time series

Use peakfinder.m,by Nathanael Yoder 1

Yoder, Nathanael (2009). PeakFinder(http://www.mathworks.com/matlabcentral/fileexchange/25500-peakfinder),MATLAB Central File Exchange. Retrieved July 2013.


Results

Missing Period Statistics


Peakfinder.m

Used for finding local peaks or valleys ... in anoisy vector.

Input: A real vector

Output: [pks, locs], two vectors with peak heightsand peak locations, respectively.

Options

sel: The amount above surrounding data for apeak to be identifiedthresh: A threshold above which maxima must beto be considered peaks

Results

Method A


Peaks must be 10units above nearbydata

Peaks must be atleast 180 units high

Results

Method A


If any pair of peaksare at leastminpeakdistance

days apart, a periodhas been missed

Results

Method


Peaks must be 7 unitsabove nearby data

Peaks must be atleast 179.6 units high

Peaks must be atleast 44 days apart

Peaks within 103days apart each otherare grouped into thesame peak cluster

Results

Method


If any pair of peaks are atleast minpeakdistance daysapart, a period has beenmissed

If the first peak appears afterminpeakdistance days, aperiod has been missed

If number of peak clusters isless than 8, a period has beenmissed

Results

Noise Parameter Analysis: Method A

Percentage relatively sensitive to noise

Varies little with 1

Decreases with 2


Percent Missed Periods vs. Noise Parameters

1/2 0.010 0.015 0.020 0.025 0.030

0.000 56.9 36.3 21.3 11.0 5.50.01 56.8 37.6 20.2 11.2 5.00.02 59.8 35.7 19.4 10.6 5.00.03 55.0 34.6 18.7 8.2 4.0

Results

Noise Parameter Analysis: Method

Varies little with 1Decreases with 2


Percent Missed Periods vs. Noise Parameters

1/2 0.010 0.015 0.020 0.025 0.030

0.000 32.4 27.9 19.5 12.1 6.90.01 38.9 30.6 18.0 13.9 6.70.02 39.5 27.2 17.4 13.4 8.10.03 40.5 28.0 19.3 12.6 8.1

Results

Minpeakdistance Analysis: Method A

Both percentages decrease as minpeakdistance increases

Percentages relatively insensitive to minpeakdistance parameter

1 = 0, 2 = 0.02


Percent Missed Periods vs. Minpeakdistance

minpeakdistance 450 475 500 525 550

Percent Missed Peaks 22.5 21.7 19.2 16.0 17.0

Results

Minpeakdistance Analysis: Method

Both percentages decrease as minpeakdistance increases

Percentages relatively sensitive to minpeakdistance parameter

1 = 0, 2 = 0.02


Percent Missed Periods vs. Minpeakdistance

minpeakdistance 450 460 470 480 490 500

Percent Missed Peaks 27.5 22.6 20.5 18.5 14.1 12.8

Results

Conclusions

Decomposed crime data into long-term trend, seasonal, and noisecomponents

Produced an SDE model to simulate burglary rates over several yearspans

Affirmed models ability to reproduce missed periods as in LA data


Results

Acknowledgements

Jeffrey Brantingham, for the Los Angeles crime data

Houston Police Department, for the Houston crime data

Scott McCalla and James von Brecht for lots of advice and mentoring


A Stochastic Model for Seasonal Cycles in Crime - UCLAbertozzi/WORKFORCE/REU 2013/Crime Fighters...A...

Documents

Transcript of A Stochastic Model for Seasonal Cycles in Crime - UCLAbertozzi/WORKFORCE/REU 2013/Crime Fighters...A...