Crime Forecasting Using Boosted Ensemble Classifiers

Chung-Hsien Yu, Department of Computer Science, University of Massachusetts Boston. 2012 Graduate Students Symposium. Presented by: Chung-Hsien Yu. Advisor: Prof. Wei Ding.


Page 1: Crime Forecasting Using Boosted Ensemble Classifiers

Crime Forecasting Using Boosted Ensemble Classifiers Chung-Hsien Yu

Crime Forecasting Using Boosted Ensemble Classifiers

Department of Computer Science University of Massachusetts Boston

2012 GRADUATE STUDENTS SYMPOSIUM

Presented by: Chung-Hsien Yu

Advisor: Prof. Wei Ding

Page 2: Crime Forecasting Using Boosted Ensemble Classifiers

Abstract

• Retaining spatiotemporal knowledge by applying multi-clustering to monthly aggregated crime data.

• Training baseline learners on the clusters obtained from the clustering step.

• Adapting a greedy algorithm to find a rule-based ensemble classifier during each boosting round.

• Pruning the ensemble classifier to prevent it from overfitting.

• Constructing a strong hypothesis from the ensemble classifiers obtained in each round.

Page 3: Crime Forecasting Using Boosted Ensemble Classifiers

Original Data

[Figure: original data layers: Residential Burglary, 911 Calls, Arrest, Foreclosure, Street Robbery]

Page 4: Crime Forecasting Using Boosted Ensemble Classifiers

Aggregated Data

[Figure: aggregated data, shown as counts per grid cell]

Page 5: Crime Forecasting Using Boosted Ensemble Classifiers

Monthly Data

[Figure: monthly grids of aggregated counts per cell]
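The aggregation step behind the monthly grids can be sketched as follows. This is a minimal illustration, not the presenter's code: the record layout `(month, x, y)`, the function name `monthly_grid_counts`, and the `cell_size` parameter are all assumptions introduced here.

```python
from collections import Counter

def monthly_grid_counts(incidents, cell_size):
    """Count incidents per (month, grid cell).

    `incidents` is a list of hypothetical (month, x, y) records; each point
    is assigned to the square grid cell of side `cell_size` that contains it.
    """
    counts = Counter()
    for month, x, y in incidents:
        cell = (int(x // cell_size), int(y // cell_size))
        counts[(month, cell)] += 1
    return counts

# Illustrative records only; these are not values from the slides.
incidents = [
    ("2011-01", 0.2, 0.7),
    ("2011-01", 0.4, 0.9),
    ("2011-01", 1.5, 0.1),
    ("2011-02", 0.3, 0.3),
]
counts = monthly_grid_counts(incidents, cell_size=1.0)
```

Each data layer (for example burglaries or 911 calls) would be aggregated separately in the same way, giving one count grid per layer and month.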

Page 6: Crime Forecasting Using Boosted Ensemble Classifiers

Monthly Clusters (k=3)

Page 7: Crime Forecasting Using Boosted Ensemble Classifiers

Monthly Clusters (k=4)
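Clustering the same monthly counts with several values of k can be sketched as below. The slides do not name the clustering algorithm, so the one-dimensional k-means used here is an assumption, as are the function name `kmeans_1d` and the sample counts.

```python
def kmeans_1d(values, k, iters=50):
    """Lloyd's algorithm on scalar values; returns one cluster label per value."""
    distinct = sorted(set(values))
    if not 2 <= k <= len(distinct):
        raise ValueError("k must be between 2 and the number of distinct values")
    # Deterministic initialisation: spread the centers over the sorted distinct values.
    centers = [float(distinct[j * (len(distinct) - 1) // (k - 1)]) for j in range(k)]
    labels = []
    for _ in range(iters):
        # Assign every value to its nearest center.
        new_labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Move each center to the mean of its members (empty clusters keep their center).
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:
                centers[j] = sum(members) / len(members)
    return labels

# Hypothetical per-cell counts for one month.
cell_counts = [0, 0, 1, 1, 0, 5, 6, 5, 9, 8, 9, 0]
clusterings = {k: kmeans_1d(cell_counts, k) for k in (3, 4)}
```

Keeping the labelings for both k=3 and k=4 gives each grid cell more than one cluster membership per month, which is one plausible reading of "multi-clustering" in the abstract.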

Page 8: Crime Forecasting Using Boosted Ensemble Classifiers

Flow Chart

Page 9: Crime Forecasting Using Boosted Ensemble Classifiers

Algorithm (Part I)

Page 10: Crime Forecasting Using Boosted Ensemble Classifiers

Algorithm (Part II)

Page 11: Crime Forecasting Using Boosted Ensemble Classifiers

Confidence Value

From AdaBoost (Schapire & Singer 1998) we have

$$Z_t = \sum_i w_t(i) \exp\left(-\alpha_t y_i h_t(x_i)\right)$$

Let $C_R = \alpha_t h_t(x_i)$ and ignore the boosting round $t$.

$$Z = \sum_i w(i) \exp\left(-C_R\, y_i\right)$$

$C_R$ is defined as the confidence value for the rule $R$, and $C_R = 0$ if $x_i \notin R$.

Page 12: Crime Forecasting Using Boosted Ensemble Classifiers

Objective Function

Therefore,

$$Z = W_0 + W_+ \exp(-C_R) + W_- \exp(C_R)$$

where

$$W_0 = \sum_{\{i \mid x_i \notin R\}} w(i), \qquad W_+ = \sum_{\{i \mid x_i \in R \text{ and } y_i = 1\}} w(i), \qquad W_- = \sum_{\{i \mid x_i \in R \text{ and } y_i = -1\}} w(i)$$

$$W_0 + W_+ + W_- = 1$$
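The grouping of the weights into W0, W+ and W− can be checked numerically. The sketch below assumes Z is the weighted sum of exp(−C_R·y_i) with C_R taken as 0 for examples the rule does not cover; the function names and the sample data are illustrative.

```python
import math

def weight_sums(weights, labels, covered):
    """Return (W0, W+, W-): uncovered, covered-positive, covered-negative weight."""
    w0 = sum(w for w, c in zip(weights, covered) if not c)
    w_pos = sum(w for w, y, c in zip(weights, labels, covered) if c and y == 1)
    w_neg = sum(w for w, y, c in zip(weights, labels, covered) if c and y == -1)
    return w0, w_pos, w_neg

def z_grouped(w0, w_pos, w_neg, c_r):
    """Z written in terms of the three weight sums."""
    return w0 + w_pos * math.exp(-c_r) + w_neg * math.exp(c_r)

def z_direct(weights, labels, covered, c_r):
    """Z as a sum over examples; the confidence is 0 outside the rule."""
    return sum(w * math.exp(-(c_r if c else 0.0) * y)
               for w, y, c in zip(weights, labels, covered))

# Five equally weighted examples; the rule covers the first three.
weights = [0.2, 0.2, 0.2, 0.2, 0.2]
labels = [1, 1, -1, 1, -1]
covered = [True, True, True, False, False]

w0, w_pos, w_neg = weight_sums(weights, labels, covered)
c_r = 0.3  # arbitrary confidence value for the check
```

Both forms of Z agree for any choice of `c_r`, because uncovered examples contribute exp(0) = 1 times their weight.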

Page 13: Crime Forecasting Using Boosted Ensemble Classifiers

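Minimum Z Value

$$\frac{dZ}{dC_R} = -W_+ \exp(-C_R) + W_- \exp(C_R) = 0$$

$$\rightarrow W_- \exp(C_R) = W_+ \exp(-C_R)$$

$$\rightarrow \ln\left(W_- \exp(C_R)\right) = \ln\left(W_+ \exp(-C_R)\right)$$

$$\rightarrow \ln(W_-) + C_R = \ln(W_+) - C_R$$

$$\rightarrow 2C_R = \ln\left(\frac{W_+}{W_-}\right)$$

$$\rightarrow C_R = \frac{1}{2} \ln\left(\frac{W_+}{W_-}\right)$$

$Z$ has its minimum value when $C_R = \frac{1}{2} \ln\left(\frac{W_+}{W_-}\right)$, because

$$\frac{d^2 Z}{dC_R^2} = W_+ \exp(-C_R) + W_- \exp(C_R) > 0$$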
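A quick numerical check of the closed form: the sketch below computes C_R = ½·ln(W+/W−) and compares Z at that point with nearby values. The function names and the sample weight sums are illustrative, and the code simply rejects zero weight sums rather than assuming any particular smoothing scheme.

```python
import math

def optimal_confidence(w_pos, w_neg):
    """C_R = 0.5 * ln(W+ / W-); both sums must be positive for the log to exist."""
    if w_pos <= 0 or w_neg <= 0:
        raise ValueError("W+ and W- must both be positive")
    return 0.5 * math.log(w_pos / w_neg)

def z_value(w0, w_pos, w_neg, c_r):
    """Z = W0 + W+ exp(-C_R) + W- exp(C_R)."""
    return w0 + w_pos * math.exp(-c_r) + w_neg * math.exp(c_r)

# Illustrative weight sums with W0 + W+ + W- = 1.
w0, w_pos, w_neg = 0.4, 0.4, 0.2
c_star = optimal_confidence(w_pos, w_neg)
z_min = z_value(w0, w_pos, w_neg, c_star)
```

At the optimum, Z equals W0 + 2·sqrt(W+·W−), which follows from substituting `c_star` back into the expression for Z.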

Page 14: Crime Forecasting Using Boosted Ensemble Classifiers

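BuildChain Function

$$W_0 + W_+ + W_- = 1$$

Substituting $C_R = \frac{1}{2} \ln\left(\frac{W_+}{W_-}\right)$ into $Z$ gives

$$Z = W_0 + 2\sqrt{W_+ W_-} = 1 - \left(\sqrt{W_+} - \sqrt{W_-}\right)^2$$

Repeatedly add a classifier to $R$ until it maximizes $\left|\sqrt{W_+} - \sqrt{W_-}\right|$. This will minimize $Z$ as well.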
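A greedy chain-building loop might look like the sketch below. It rests on several assumptions not stated on the slide: the quantity being maximized is |sqrt(W+) − sqrt(W−)| (which follows from substituting the optimal C_R into Z), each baseline classifier is a boolean predicate on an example, and a chain covers an example only when every classifier in it fires. The names `score`, `build_chain`, `high_burglary` and `many_calls` are hypothetical.

```python
import math

def score(chain, xs, ys, ws):
    """|sqrt(W+) - sqrt(W-)| over the examples covered by every classifier in `chain`."""
    w_pos = w_neg = 0.0
    for x, y, w in zip(xs, ys, ws):
        if all(clf(x) for clf in chain):
            if y == 1:
                w_pos += w
            else:
                w_neg += w
    return abs(math.sqrt(w_pos) - math.sqrt(w_neg))

def build_chain(candidates, xs, ys, ws):
    """Greedily append the classifier that most improves the score; stop when none does."""
    chain, best = [], score([], xs, ys, ws)
    remaining = list(candidates)
    while remaining:
        top = max(remaining, key=lambda c: score(chain + [c], xs, ys, ws))
        top_score = score(chain + [top], xs, ys, ws)
        if top_score <= best:
            break
        chain.append(top)
        remaining.remove(top)
        best = top_score
    return chain, best

# Hypothetical features per grid cell: (burglary count, 911-call count).
def high_burglary(x):
    return x[0] >= 3

def many_calls(x):
    return x[1] >= 5

xs = [(5, 9), (4, 8), (6, 7), (3, 6), (5, 1), (4, 0), (0, 7), (1, 1)]
ys = [1, 1, 1, 1, -1, -1, -1, -1]
ws = [0.125] * 8

chain, best = build_chain([high_burglary, many_calls], xs, ys, ws)
```

On this toy data `many_calls` is added first because it covers fewer negatives, and `high_burglary` is added second because the pair together covers only positive examples.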

Page 15: Crime Forecasting Using Boosted Ensemble Classifiers

PruneChain Function

Loss function:

$$\acute{Z} = W_0 + W_+ \exp(-\hat{C}_R) + W_- \exp(\hat{C}_R)$$

Minimize $\acute{Z}$ by removing the last classifier from $R$.

$\hat{C}_R$ is obtained from GrowSet. $W_0$, $W_+$ and $W_-$ are obtained from applying $R$ to PruneSet.
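The pruning step can be sketched as follows, under loudly stated assumptions: the loss is taken to have the same form as Z, with the confidence value computed on GrowSet and the weight sums computed on PruneSet; a chain covers an example when all of its classifiers fire; and a small constant `eps` (hypothetical, not from the slides) keeps the logarithm finite when a chain covers no negative examples.

```python
import math

def sums(chain, data):
    """Return (W0, W+, W-) of `chain` on data given as (x, y, w) triples."""
    w0 = w_pos = w_neg = 0.0
    for x, y, w in data:
        if not all(clf(x) for clf in chain):
            w0 += w
        elif y == 1:
            w_pos += w
        else:
            w_neg += w
    return w0, w_pos, w_neg

def confidence(chain, data, eps=1e-3):
    """Smoothed 0.5 * ln(W+ / W-); `eps` is an assumed smoothing constant."""
    _, w_pos, w_neg = sums(chain, data)
    return 0.5 * math.log((w_pos + eps) / (w_neg + eps))

def loss(chain, grow, prune):
    """Confidence from GrowSet, weight sums from PruneSet."""
    c_hat = confidence(chain, grow)
    w0, w_pos, w_neg = sums(chain, prune)
    return w0 + w_pos * math.exp(-c_hat) + w_neg * math.exp(c_hat)

def prune_chain(chain, grow, prune):
    """Drop the last classifier while doing so lowers the loss."""
    best = list(chain)
    while len(best) > 1 and loss(best[:-1], grow, prune) < loss(best, grow, prune):
        best = best[:-1]
    return best

# Hypothetical features: (911-call count, noise flag).
def many_calls(x):
    return x[0] >= 5

def spurious(x):
    return x[1] >= 1

grow = [((9, 1), 1, 0.25), ((8, 1), 1, 0.25), ((7, 0), -1, 0.25), ((1, 0), -1, 0.25)]
prune = [((9, 0), 1, 0.25), ((8, 0), 1, 0.25), ((7, 1), -1, 0.25), ((0, 0), -1, 0.25)]

pruned = prune_chain([many_calls, spurious], grow, prune)
```

Here `spurious` looks perfect on GrowSet but covers only a negative example on PruneSet, so its large confidence value inflates the loss and the classifier is removed.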

Page 16: Crime Forecasting Using Boosted Ensemble Classifiers

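Update Weights

Calculate $C_R$ with ensemble classifier $R$ on the entire data set.

$$w_{t+1}(i) = \frac{w_t(i) \exp\left(-C_{R_t}\, y_i\right)}{Z_t}$$

where

$$Z_t = \sum_i w_t(i) \exp\left(-C_{R_t}\, y_i\right), \qquad C_{R_t} = 0 \ \text{if} \ x_i \notin R_t$$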
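A minimal sketch of the weight update, assuming the standard confidence-rated AdaBoost rule: each covered example's weight is multiplied by exp(−C_R·y_i), uncovered examples keep their weight, and everything is divided by the normalizer so the weights again sum to 1. The function name and sample data are illustrative.

```python
import math

def update_weights(weights, labels, covered, c_r):
    """Reweight examples after one boosting round and renormalize to sum to 1."""
    unnormalized = [
        w * math.exp(-c_r * y) if c else w  # the confidence is 0 outside the rule
        for w, y, c in zip(weights, labels, covered)
    ]
    z = sum(unnormalized)
    return [u / z for u in unnormalized]

# Five equally weighted examples; the rule covers the first three.
weights = [0.2, 0.2, 0.2, 0.2, 0.2]
labels = [1, 1, -1, 1, -1]
covered = [True, True, True, False, False]
c_r = 0.5 * math.log(0.4 / 0.2)  # optimal confidence for W+ = 0.4, W- = 0.2

new_weights = update_weights(weights, labels, covered, c_r)
```

With a positive confidence value, the covered negative example (index 2) gains weight relative to the covered positives, so the next round focuses on the example this rule got wrong.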

Page 17: Crime Forecasting Using Boosted Ensemble Classifiers

Strong Hypothesis

At the end of boosting, there are $T$ chains, $R_1, \dots, R_T$.

$$H(x) = \operatorname{sign}\left(\sum_{t=1}^{T} \hat{C}_{R_t}\right), \qquad \hat{C}_{R_t} = 0 \ \text{if} \ x \notin R_t$$
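Evaluating the final classifier could look like the sketch below, assuming the strong hypothesis is the sign of the summed confidence values of all chains that cover x. The chains, predicates and confidence values are made up for illustration, and returning −1 when the sum is exactly 0 is an assumption.

```python
def strong_hypothesis(x, chains):
    """Sign of the summed confidences of the chains covering x.

    `chains` is a list of (classifiers, confidence) pairs; a chain that does
    not cover x contributes 0. A total of exactly 0 is mapped to -1 here,
    which is an assumed tie-breaking choice.
    """
    total = sum(c for clfs, c in chains if all(clf(x) for clf in clfs))
    return 1 if total > 0 else -1

# Hypothetical predicates on (burglary count, 911-call count).
def high_burglary(x):
    return x[0] >= 3

def many_calls(x):
    return x[1] >= 5

def quiet(x):
    return x[0] == 0

# Illustrative chains with made-up confidence values.
chains = [
    ([many_calls], 0.4),
    ([high_burglary, many_calls], 0.7),
    ([quiet], -0.9),
]

predictions = [strong_hypothesis(x, chains) for x in [(5, 9), (0, 7), (1, 1)]]
```

Only a sum over the chains that fire is needed per example, which matches the summary's remark that the strong hypothesis is easy to calculate.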

Page 18: Crime Forecasting Using Boosted Ensemble Classifiers

SUMMARY

1. Grid cells with similar crime counts that are clustered together are also geographically close to each other on the map. In addition, the clusters separate high-crime-rate areas from low-crime-rate areas.

2. The original data set is randomly divided into two subsets in each round. The greedy weak-learn algorithm uses confidence-value evaluation to "chain" the baseline classifiers on one subset, and then "trims" the chain on the other subset.

3. The strong hypothesis is easy to calculate.

Page 19: Crime Forecasting Using Boosted Ensemble Classifiers

Q & A

THANK YOU!!