Dr Sebastian Haan - University of Sydney

Data Science for Crime and Criminal Behaviour Dr Sebastian Haan (PI Dr Roman Marchant) Centre for Translational Data Science Sydney Institute of Criminology The University of Sydney

Transcript of Dr Sebastian Haan - University of Sydney

Page 1: Dr Sebastian Haan - University of Sydney

Data Science for Crime and Criminal Behaviour

Dr Sebastian Haan

(PI Dr Roman Marchant)

Centre for Translational Data Science

Sydney Institute of Criminology

The University of Sydney

Page 2: Dr Sebastian Haan - University of Sydney

Sebastian Haan | Centre for Translational Data Science

Outline


Overview of Data Science Methodology

Multivariate Model of Crime

Assessing Risk and Criminal Behaviour

Page 3: Dr Sebastian Haan - University of Sydney

The Centre for Translational Data Science

Page 4: Dr Sebastian Haan - University of Sydney

Data Science Methodology

Data: Quantification of real phenomena (Surveys, Sensors, Existing Records).

Research Question: Specific question; the fundamental core of a research project.

Inference: Exploratory Analysis, Statistical Modelling.

Evaluation: Field Application, Impact.

Page 5: Dr Sebastian Haan - University of Sydney

Multivariate Model of Crime

Crime: Realisation of an event which is against the law.

Research Questions:

1. What are the key drivers of crime? Demographics? Transport? Environmental characteristics?

2. What is the distribution/density of crimes in space and time?

3. What are the pathways of individuals towards crime?

Page 6: Dr Sebastian Haan - University of Sydney

Literature Review

Spatial Models: Grid Mapping, Heat Maps, Covering Ellipses, Kernel Density

Temporal Models: Seasonality, Time Series

Spatial-Temporal: Regression, Self-Exciting Point Process, Heuristics

Risk Models: Demographic Regression, Multi-Variate Crime Regression, Risk-Terrain Modelling

Page 7: Dr Sebastian Haan - University of Sydney

Probabilistic Crime Model

Information Sources: Historic Space-Time Criminal Records; Demographic Information

Probabilistic Regression Model

Prediction: Crime Density for each Crime Type

Inference: Importance of Risk Factors

Page 8: Dr Sebastian Haan - University of Sydney

Probabilistic Crime Model

1. Discretise space into unit elements (SA1, SA2, Postcode, LGA)

Page 9: Dr Sebastian Haan - University of Sydney

Probabilistic Crime Model

2. Accumulate Crime Occurrence per Geographic Unit

Source: COPS/BOCSAR Unit Record Incident Data

Counts per unit: Number of Assaults, Number of Robberies, Number of MVT (motor vehicle thefts)

Example: Simulated Data
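
The accumulation in step 2 can be sketched in a few lines. This is only an illustration under stated assumptions: the unit codes (`LGA_A`, ...), the crime type labels, and the records themselves are made up, and the function name `count_by_unit` is hypothetical. It does not reproduce the COPS/BOCSAR data format.

```python
from collections import Counter

# Hypothetical unit-record incidents: (geographic unit code, crime type).
# Codes and records are invented for illustration only.
incidents = [
    ("LGA_A", "assault"),
    ("LGA_A", "robbery"),
    ("LGA_A", "assault"),
    ("LGA_B", "mvt"),
    ("LGA_B", "assault"),
    ("LGA_C", "robbery"),
]


def count_by_unit(records):
    """Return {unit: Counter({crime_type: count})} from (unit, crime_type) pairs."""
    counts = {}
    for unit, crime_type in records:
        counts.setdefault(unit, Counter())[crime_type] += 1
    return counts


crime_counts = count_by_unit(incidents)
for unit in sorted(crime_counts):
    print(unit, dict(crime_counts[unit]))
```

A `Counter` returns 0 for crime types that never occurred in a unit, which is convenient when every unit needs a count for every crime type.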

Page 10: Dr Sebastian Haan - University of Sydney

Probabilistic Crime Model

3. Gather Explanatory Variables for each Geographic Unit

Sources: Australian Bureau of Statistics (Demographics); Domain Experts

Variables: Population, Indigenous, Rent, Income, Household_Income, Persons_per_Bedroom, Mortgage, Tenure, Language, Occupation, Education, Religion, Marital_Status, Disadvantage_Scores (IRSAD, IER, IEO), Total_Houses, Voluntary_Work, Age, English, Cultural Background, Motor_Vehicles

Page 11: Dr Sebastian Haan - University of Sydney

Probabilistic Crime Model

4. Ensemble Database

Columns: Geographic Area | Spatial Coordinates | Crime Counts | Demographics
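
One way to picture the database in step 4 is as one row per geographic area that combines coordinates, crime counts, and demographic variables. The sketch below assumes small in-memory dictionaries keyed by a unit code. All values, the unit codes, and the helper name `assemble` are hypothetical and chosen only for illustration.

```python
# Hypothetical inputs keyed by geographic unit code (values are invented).
crime_counts = {
    "LGA_A": {"assault": 2, "robbery": 1},
    "LGA_B": {"assault": 1, "mvt": 1},
}
coordinates = {
    "LGA_A": (-33.87, 151.21),
    "LGA_B": (-33.80, 151.00),
}
demographics = {
    "LGA_A": {"Population": 12000, "Income": 900},
    "LGA_B": {"Population": 8000, "Income": 1100},
}

CRIME_TYPES = ("assault", "robbery", "mvt")


def assemble(crime_counts, coordinates, demographics):
    """Build one row per unit: area code, coordinates, crime counts, demographics."""
    rows = []
    for unit in sorted(coordinates):
        lat, lon = coordinates[unit]
        row = {"unit": unit, "lat": lat, "lon": lon}
        for crime in CRIME_TYPES:
            # Units with no recorded incidents of a type get a zero count.
            row[crime] = crime_counts.get(unit, {}).get(crime, 0)
        row.update(demographics.get(unit, {}))
        rows.append(row)
    return rows


database = assemble(crime_counts, coordinates, demographics)
for row in database:
    print(row)
```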

Page 12: Dr Sebastian Haan - University of Sydney

Probabilistic Crime Model

5. Define Model

where

6. Parameter Learning Model

Unknown Parameters and Hyper-Parameters

Conduct sampling algorithm: Markov Chain Monte Carlo
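
The model equation on this slide did not survive in the transcript, so the following is not the authors' model. As a rough illustration of parameter learning by Markov Chain Monte Carlo, the sketch assumes a simple Poisson regression with a log link, log(rate) = a + b * x, wide Gaussian priors on a and b, and a random-walk Metropolis sampler. The data, the prior width, and the step size are all made-up assumptions.

```python
import math
import random

# Hypothetical data: one standardised covariate per unit and a crime count.
x = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
y = [1, 1, 2, 3, 4, 6, 9]


def log_posterior(a, b, prior_sd=10.0):
    """Log posterior (up to a constant) of the assumed Poisson log-linear model."""
    log_lik = 0.0
    for xi, yi in zip(x, y):
        eta = a + b * xi
        log_lik += yi * eta - math.exp(eta)  # Poisson log-likelihood term
    log_prior = -(a * a + b * b) / (2.0 * prior_sd ** 2)  # Gaussian priors
    return log_lik + log_prior


def metropolis(n_steps=6000, burn_in=1000, step=0.2, seed=0):
    """Random-walk Metropolis sampler; returns post-burn-in samples and acceptance rate."""
    rng = random.Random(seed)
    a, b = 0.0, 0.0
    current = log_posterior(a, b)
    samples, accepted = [], 0
    for i in range(n_steps):
        a_new = a + rng.gauss(0.0, step)
        b_new = b + rng.gauss(0.0, step)
        proposed = log_posterior(a_new, b_new)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposed - current)):
            a, b, current = a_new, b_new, proposed
            accepted += 1
        if i >= burn_in:
            samples.append((a, b))
    return samples, accepted / n_steps


samples, accept_rate = metropolis()
a_mean = sum(s[0] for s in samples) / len(samples)
b_mean = sum(s[1] for s in samples) / len(samples)
print(f"posterior mean intercept={a_mean:.2f}, slope={b_mean:.2f}, "
      f"acceptance rate={accept_rate:.2f}")
```

The spread of the sampled values gives an uncertainty estimate for each parameter. A real analysis would also need several covariates, hyper-parameters, and convergence checks, none of which are shown here.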

Page 13: Dr Sebastian Haan - University of Sydney

Probabilistic Crime Model

7. Evaluate Results

Model based on demographic factors, BUT spatial correlations NOT taken into account

Model based on demographic factors AND spatial correlations
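
The slides do not say how spatial correlation was measured. One common diagnostic, used here purely as an illustration, is Moran's I, which checks whether values in neighbouring units are more similar than chance would suggest. A model that ignores spatial structure often leaves residuals with a clearly positive Moran's I. The neighbour matrix and the residual values below are hypothetical.

```python
def morans_i(values, weights):
    """Moran's I for values on n units, given an n x n spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    variance_sum = sum(d * d for d in dev)
    return (n / total_weight) * (cross / variance_sum)


# Hypothetical layout: four units in a row, each adjacent to its neighbours.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

smooth_residuals = [1.0, 2.0, 3.0, 4.0]         # neighbours are similar
alternating_residuals = [1.0, -1.0, 1.0, -1.0]  # neighbours are opposite

i_smooth = morans_i(smooth_residuals, chain)
i_alternating = morans_i(alternating_residuals, chain)
print(i_smooth, i_alternating)
```

Positive values indicate that neighbouring units have similar residuals, and negative values indicate that they differ.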

Page 14: Dr Sebastian Haan - University of Sydney

Improvements and Future Work

Include other sources of information.

Environment: Parks, Drugstores, Liquor Stores, Stadium, Hotels, Bar, Night Club, Casino, Public Housing, Hospital, High Rise, Beach

Transport: Bus Stops, Train Stations, Car Density, Bicycle Density, Pedestrian Activity, Road Types, Sidewalk Size

Inputs to the Probabilistic Regression Model: Historic Space-Time Criminal Records; Demographic Information; Extra Crime Information (Alcohol, Drugs DUMA, DVSAT); Environment; Transport; Policing Levels; Health

Page 15: Dr Sebastian Haan - University of Sydney

Pathways towards Criminal Behaviour

1. What are the important factors that affect people's criminality levels over time?

• What drives young children to become involved in crime?

• How do life events diminish criminality among offenders? (Desistance)

• What factors characterise a specific type of criminal? Can we build clusters of criminals?

Page 16: Dr Sebastian Haan - University of Sydney

Criminal Behaviour Regression

Possible Information Sources: Criminal Records, Mental Health Records, Education, Housing, Parental History, Social Records, Social Media, Cultural Background

Probabilistic Model: Regression and Clustering

Outputs: Age Crime Curve, Clusters of Criminals, Individual Pathways

Page 17: Dr Sebastian Haan - University of Sydney

Criminal Behaviour/Risk Projects

1. Juvenile Justice Risk Assessment: Evaluate how risk assessment is conducted and how effective it is.

• Prior and Current Offences

• Education

• Substance Abuse

• Family

• Personality/Behaviour

• Peers

• Leisure/Recreation

• Attitudes/Orientation

Page 18: Dr Sebastian Haan - University of Sydney

Criminal Behaviour/Risk Projects

2. Police Citizens Youth Clubs: Effectiveness of Police Youth Clubs depending on engagement of individuals.

Figure: Status Level over Time (status levels: Finalised, Non Current, Active, Referenced, Declined, Rejected)

Page 19: Dr Sebastian Haan - University of Sydney

Acknowledgement

Blake Roles, Data-Science Master Student

Sally Cripps, Professor CTDS

Gordon McDonald, Research Engineer

Roman Marchant, Data Scientist CTDS

Garner Clancey

Tony Makkai

Page 20: Dr Sebastian Haan - University of Sydney

Data Science for Crime and Criminal Behaviour

Sebastian Haan

[email protected]

Centre for Translational Data Science

The University of Sydney

Thanks!

Questions?

Page 21: Dr Sebastian Haan - University of Sydney

Correlation Matrix