A Bayesian mixture model for detecting unusual time trends Modelling burglary counts in Cambridge

A Bayesian mixture model for detecting unusual time trends Modelling burglary counts in Cambridge Guangquan (Philip) Li 4 th ESRC Research Methods Festival July 5-8, 2010 Joint work with Nicky Best, Sylvia Richardson and Robert Haining

Page 1: A Bayesian mixture model for detecting unusual time trends Modelling burglary counts in Cambridge

A Bayesian mixture model for detecting unusual time trends

Modelling burglary counts in Cambridge

Guangquan (Philip) Li

4th ESRC Research Methods Festival

July 5-8, 2010

Joint work with Nicky Best, Sylvia Richardson and Robert Haining

Page 2

Outline

1. Motivations

2. A Bayesian mixture model for detecting unusual time trends

3. Preliminary results from analysis of burglary data in Cambridge

Page 3

Reasons for detecting unusual time trends

Emergence of local risk factors?

Change of population composition?

Impact of a new policing scheme?

Modelling

Highlighting areas deserving of further scrutiny

Identifying possible risk factors

Informing policy making

Assessing effect of policy

Page 4

Preventing residential burglary in Cambridge (Police Research Series Paper 108)

• The report describes the work of the Domestic Burglary Task Force (DBTF) in Cambridge, which was established to examine the nature of residential burglary in Cambridge and to design and implement initiatives to prevent it.

• From an analysis of the burglary counts (1993-1994), the DBTF identified the largest ‘hot spot’ in the north of the city and designated the two wards containing it as the targeted area.

• After a series of seminars, a number of burglary prevention strategies were identified and implemented.

• Question: did the strategies help to reduce residential burglary rates?

Page 5

Trend comparisons

Cambridge city as a whole

Two targeted wards

Need modelling

Page 6

A Bayesian detection model

• We have proposed a detection method that, for each area, estimates the time trend separately under a common trend component and under an area-specific trend component, and then selects between the two estimates to describe the observed data.

• For each area, the posterior probability of selecting the common trend component is used to classify the area/trend as “unusual” or not.
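One way to write the selection idea down, as a hedged sketch rather than the authors' exact specification. Only z_i, p_i, α, η_i and ν_t appear in the slides; the Poisson likelihood, the offset n_i (number of houses in area i) and the area-specific terms u_i and ξ_{it} are assumptions made here for illustration.

```latex
% Sketch only: likelihood, offset n_i, u_i and \xi_{it} are assumed, not taken from the slides.
y_{it} \sim \mathrm{Poisson}(n_i \lambda_{it}), \qquad
\log \lambda_{it} =
\begin{cases}
\alpha + \eta_i + \nu_t, & z_i = 1 \quad \text{(common trend)} \\
u_i + \xi_{it}, & z_i = 0 \quad \text{(area-specific trend)}
\end{cases}
\qquad
p_i = \Pr(z_i = 1 \mid \text{data})
```

Under this reading, p_i is the posterior probability that area i is described by the common trend component.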

Page 7

A schematic diagram of the detection model

[Figure: space-time variations are decomposed into a space-time separable part (common spatial pattern and common time trend) and a space-time inseparable part (area-specific time trends).]

Page 8

Specific model components (1)

• A conditional autoregressive (CAR) model is used to impose the spatial correlation.

Spatial smoothing
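A minimal sketch of what CAR smoothing does, assuming the intrinsic (Besag-type) form, which the slide does not specify. Conditional on its neighbours, an area's spatial effect is centred on the neighbours' average, with a variance that shrinks as the number of neighbours grows. The adjacency map, the values of `eta` and `sigma2` are made up for illustration.

```python
def car_conditional(eta, neighbours, i, sigma2=1.0):
    """Conditional mean and variance of eta[i] given its neighbours
    under an intrinsic CAR prior (assumed form, not taken from the slides)."""
    nbrs = neighbours[i]
    mean = sum(eta[j] for j in nbrs) / len(nbrs)  # average of neighbouring effects
    var = sigma2 / len(nbrs)                      # more neighbours, tighter prior
    return mean, var

# Hypothetical 4-area map: area 0 borders areas 1, 2 and 3.
neighbours = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
eta = [0.0, 0.3, 0.6, 0.9]

mean0, var0 = car_conditional(eta, neighbours, 0)
```

Here area 0 is pulled towards the average of its three neighbours, which is the sense in which the prior imposes spatial correlation.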

Page 9

Specific model components (2)

• A random walk of order 1 is used to define the temporal structure.

Temporal smoothing

[Figure: temporal smoothing across adjacent time points t-1, t, t+1]

Non-informative priors are assigned to other parameters in the model.

1. S varies

2. S fixed (>1)
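The random walk of order 1 used for temporal smoothing can be illustrated with a short simulation. This is a sketch under assumed values: the starting level of 0, the standard deviation `sigma` and the number of time points are chosen only for illustration.

```python
import random

def simulate_rw1(n_steps, sigma, seed=0):
    """Draw a first-order random walk: nu[t] = nu[t-1] + Normal(0, sigma^2)."""
    rng = random.Random(seed)
    nu = [0.0]  # starting level fixed at 0 for illustration
    for _ in range(n_steps - 1):
        nu.append(nu[-1] + rng.gauss(0.0, sigma))
    return nu

trend = simulate_rw1(n_steps=8, sigma=0.1)  # e.g. 8 quarterly time points
increments = [b - a for a, b in zip(trend, trend[1:])]
```

Each value depends on the previous one only through an independent normal increment, so a smaller `sigma` gives a smoother trend.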

Page 10

Classification

• For each area, the posterior mean of z_i (denoted by p_i) provides evidence that area i follows the common trend pattern; a small value of p_i suggests that the area is unlikely to follow the common trend.

• The area is classified as unusual if this probability is less than some threshold, i.e., p_i < p_cut.
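As a sketch of this classification step, the snippet below computes the posterior mean of z_i from MCMC indicator draws and compares it with a threshold. The draws and the cutoff of 0.5 are invented for illustration and are not values from the analysis.

```python
def posterior_follow_prob(z_draws):
    """Posterior mean of z_i per area.
    z_draws[s][i] is 1 if area i follows the common trend in draw s, else 0."""
    n_draws = len(z_draws)
    n_areas = len(z_draws[0])
    return [sum(draw[i] for draw in z_draws) / n_draws for i in range(n_areas)]

def classify(p, cutoff):
    """Label an area 'unusual' when its probability falls below the cutoff."""
    return ["unusual" if p_i < cutoff else "usual" for p_i in p]

# Made-up indicator draws for 3 areas over 4 MCMC iterations.
z_draws = [[1, 0, 1], [1, 0, 1], [1, 0, 0], [1, 1, 1]]
p = posterior_follow_prob(z_draws)   # [1.0, 0.25, 0.75]
labels = classify(p, cutoff=0.5)     # ['usual', 'unusual', 'usual']
```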

Page 11

The idea of classification

[Figure: axis of p_i = Prob(an area follows the common trend pattern), divided by a cutoff into "Unusual" (low p_i) and "Usual" (high p_i).]

• Choose cutoff to achieve pre-specified false detection rate (FDR)

• Cutoff values cannot be obtained using conventional approaches such as Storey (2002), because the null hypothesis is specific to each area.

• We have proposed a novel simulation approach to obtain area-specific cutoffs so that we can maximize the sensitivity while controlling the FDR.

Page 12

A simulation study

Page 13

Simulation results

15 (out of 354) areas were selected according to their population sizes and spatial risks and assigned the unusual trend.

Comparing the gain/loss of sensitivity amongst the following 4 models:

1. S-vary

2. S=2 (the optimal setting, the reference)

3. S=5

4. SaTScan (space-time permutation test)

[Figure: Scenarios 1, 2 and 3, with small and large departures; results for S-vary, S=5 and SaTScan under Scenario 1.]

Page 14

Simulation results

[Figure: results for S-vary, S=5 and SaTScan, relative to the reference model S = 2.]

Page 15

Summary: Key features of the model

The comprehensive simulation study has shown some key features of our model:

1. Our model can detect various realistic departure patterns;

2. The performance is robust over different model settings;

3. Our model outperforms the popular SaTScan;

4. Our detection model works relatively well on sparse data.

Page 16

Burglary data in Cambridge

• Geo-referenced offence records in Cambridgeshire (2001-2008) are made available by the Cambridgeshire Constabulary;

• In this analysis, we focus on the burglary counts in Cambridge at the Lower Super Output Area (LSOA) level for each quarter from 2001 to 2002 (2584 reported burglary cases).

• Numbers of houses were taken from the 2001 Census and then aggregated to LSOA level (≈600 houses per LSOA).

Page 17

Overall spatial/temporal pattern

Page 18

Detected LSOA (FDR=0.01)

Page 19

High risks and unusual LSOA

Page 20

Future work

• We are currently working closely with the Cambridgeshire Police to assess effectiveness of possible policing schemes;

• The framework can be extended to a prospective surveillance system by applying the detection model sequentially to observed data;

• Incorporation of time-varying covariates (e.g., unemployment from surveys) can enrich the detection analysis.

Page 21

Summary

• We have proposed a Bayesian mixture model for detecting unusual time trends;

• The extensive simulation study has shown the superior performance of the model in detecting various “real” departures;

• Applying the model to the offence data can assist/inform policy making (by identifying abrupt changes) and help to assess policy.

Page 22

Acknowledgement

• Funded by ESRC

• The BIAS project (PI Nicky Best), based at Imperial College London, is a node of the Economic and Social Research Council’s National Centre for Research Methods (NCRM)

• The offences data are kindly provided by the Cambridgeshire Constabulary.

Page 23

References

SaTScan

• Kulldorff M, Heffernan R, Hartman J, Assunção RM, Mostashari F. A space-time permutation scan statistic for the early detection of disease outbreaks. PLoS Medicine, 2:216-224, 2005.

False discovery rate (FDR)

• Storey J. A direct approach to false discovery rates. JRSS(B), 64: 479-498, 2002.

• Newton M, Noueiry A, Sarkar D, Ahlquist P. Detecting differential gene expression with a semiparametric hierarchical mixture method. Biostatistics, 5:155-176, 2004.

Crime

• Bennett T. and Durie L. Preventing residential burglary in Cambridge: From crime audits to targeted strategies. Police research series Paper 108, 1998.

Page 24

False discovery rate

• The FDR measures the proportion of areas identified as unusual that are in fact usual.

• Choosing the FDR level involves a trade-off between sensitivity and specificity.
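A small worked example of the FDR definition, with sensitivity alongside it to show the two quantities being traded off. The area indices are invented; in a simulation study the truly unusual areas are known, which is what makes these quantities computable.

```python
def false_discovery_rate(flagged, truly_unusual):
    """Share of flagged areas that are in fact usual (false discoveries)."""
    flagged = set(flagged)
    if not flagged:
        return 0.0  # convention used here: no detections, no false discoveries
    false_pos = flagged - set(truly_unusual)
    return len(false_pos) / len(flagged)

def sensitivity(flagged, truly_unusual):
    """Share of truly unusual areas that were flagged."""
    truly_unusual = set(truly_unusual)
    return len(set(flagged) & truly_unusual) / len(truly_unusual)

flagged = [2, 5, 7, 9]             # areas declared unusual (made up)
truly_unusual = [2, 5, 7, 11, 13]  # areas given an unusual trend (made up)

fdr = false_discovery_rate(flagged, truly_unusual)  # 1 of 4 flagged is usual
sens = sensitivity(flagged, truly_unusual)          # 3 of 5 unusual areas found
```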

Page 25

A simulation approach to control for the FDR

• There are methods to approximate the FDR and hence the cutoffs based on posterior probabilities (e.g., Storey 2002 and Newton et al. 2004).

• However, these methods are not applicable in the current situation as here each area has its own alternative hypothesis.

• We have proposed a simulation based approach to estimate area-specific cutoffs that achieve the required level of confidence in the detection.

• If the distribution of the probability p_i under the null hypothesis is known, we can work out a cutoff value that corresponds to a pre-defined FDR level (5%, say).
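A hedged sketch of how a cutoff might be read off a known (or simulated) null distribution: take a lower empirical quantile, so that at most the chosen share of null values falls below the cutoff. Whether this matches the authors' exact rule is an assumption; the null values below are an evenly spaced placeholder, not simulation output.

```python
import math

def null_cutoff(null_p, level=0.05):
    """Cutoff c such that at most a share `level` of null values lie strictly below c."""
    ordered = sorted(null_p)
    k = math.floor(level * len(ordered))  # number of null values allowed below c
    return ordered[k]

# Placeholder null distribution for one area: 0.05, 0.10, ..., 1.00.
null_p = [i / 20 for i in range(1, 21)]

cutoff = null_cutoff(null_p, level=0.05)
share_below = sum(p < cutoff for p in null_p) / len(null_p)
```

Because the cutoff is computed from one area's null values, repeating this per area yields area-specific cutoffs.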

Page 26

Approximating the distribution of p_i under the null hypothesis

• H0: the area follows the common trend. That is, the model contains only the intercept, spatial and temporal components.

• We then fit the above model to real data to get the estimates for the intercept, α*, the spatial component, η_i*, and the temporal component, ν_t*.

Page 27

Approximating the distribution of p_i under the null hypothesis

The null: No areas are unusual!

• Step 1: generate data, y^sim_{i,t}, from the fitted null model (using α*, η_i* and ν_t*)

• Step 2: fit the full detection model to the simulated data to get p_i

• Step 3: repeat Steps 1 and 2 many times (≈200)

[Figure: distribution of p_i on the interval from 0 to 1]
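The three steps can be sketched as a loop. This is an illustration under stated assumptions, not the authors' implementation: counts are assumed Poisson, `expected` stands in for the null mean counts implied by the fitted null model, and `toy_fit` is a hypothetical placeholder for fitting the full detection model, which in practice would be an MCMC fit.

```python
import math
import random

def poisson_draw(rng, mean):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def simulate_null_p(expected, fit_model, n_rep=200, seed=1):
    """Simulate from the null, refit, and collect p_i for every area.
    expected[i][t] is the null mean count; fit_model is supplied by the caller."""
    rng = random.Random(seed)
    null_p = [[] for _ in expected]
    for _ in range(n_rep):                                                 # Step 3
        y_sim = [[poisson_draw(rng, m) for m in row] for row in expected]  # Step 1
        p = fit_model(y_sim)                                               # Step 2
        for i, p_i in enumerate(p):
            null_p[i].append(p_i)
    return null_p

def toy_fit(y_sim):
    """Hypothetical stand-in: maps each area's total count to a value in (0, 1]."""
    return [1.0 / (1.0 + abs(sum(row) - 8.0)) for row in y_sim]

expected = [[2.0] * 4, [2.0] * 4]  # 2 areas, 4 time points, made-up means
null_p = simulate_null_p(expected, toy_fit, n_rep=50)
```

Each inner list of `null_p` approximates the null distribution of p_i for one area, from which an area-specific cutoff can be taken.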

Page 28

Classification with control for FP

We have 95% confidence in our classification if the cutoff is obtained by controlling the false positive rate at the 5% level.

Obtain the test statistic, p_i, by fitting the full detection model to real data.

Example with cutoff p_cut = 0.2:

• p_i = 0.1: unusual

• p_i = 0.35: usual