Post on 05-Apr-2017
Data Science for
Crime and Criminal
Behaviour
Dr Sebastian Haan
(PI Dr Roman Marchant)
Centre for Translational Data Science
Sydney Institute of Criminology
The University of Sydney
Sebastian Haan | Centre for Translational Data Science
Outline
Overview of Data Science Methodology
Multivariate Model of Crime
Assessing Risk and Criminal Behaviour
The Centre for Translational Data Science
Data Science Methodology
1. Data: Quantification of real phenomena (Surveys, Sensors, Existing Records).
2. Research Question: Specific question; the fundamental core of a research project.
3. Inference: Exploratory Analysis, Statistical Modelling.
4. Evaluation: Field Application, Impact.
Multivariate Model of Crime
Crime: Realisation of an event which is against the law.
Research Questions:
1. What are the key drivers of crime? Demographics? Transport? Environment characteristics?
2. What is the distribution/density of crimes in space and time?
3. What are the pathways of individuals towards crime?
Literature Review
• Spatial Models: Grid Mapping, Heat Maps, Covering Ellipses, Kernel Density
• Temporal Models: Seasonality, Time Series
• Spatial-Temporal: Regression, Self-Exciting Point Process, Heuristics
• Risk Models: Demographic Regression, Multi-Variate Crime Regression, Risk-Terrain Modelling
Probabilistic Crime Model
Information Sources: Historic Space-Time Criminal Records, Demographic Information
Probabilistic Regression Model
Prediction: Crime Density for each Crime Type
Inference: Importance of Risk Factors
1. Discretise space into unit elements (SA1, SA2, Postcode, LGA)
2. Accumulate Crime Occurrence per Geographic Unit
COPS/BOCSAR Unit Record Incident Data
• Number of Assaults
• Number of Robberies
• Number of Motor Vehicle Thefts (MVT)
…
Example Simulated Data
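The accumulation in step 2 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the project's actual pipeline: the incident records, the unit codes (`unit_A`, `unit_B`), and the crime-type labels are hypothetical stand-ins for unit record incident data.

```python
from collections import Counter

# Hypothetical unit-record incidents: (geographic unit code, crime type).
# Real records would come from COPS/BOCSAR; these values are made up.
incidents = [
    ("unit_A", "assault"),
    ("unit_A", "robbery"),
    ("unit_A", "assault"),
    ("unit_B", "mvt"),
    ("unit_B", "assault"),
]


def accumulate_counts(records):
    """Count incidents per (geographic unit, crime type) pair."""
    return Counter(records)


counts = accumulate_counts(incidents)
print(counts[("unit_A", "assault")])  # 2
```

A `Counter` returns 0 for pairs that never occur, so units with no recorded incidents of a given type need no special handling.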
3. Gather Explanatory Variables for each Geographic Unit
Sources: Australian Bureau of Statistics Demographics, Domain Experts
Variables: Population, Indigenous, Rent, Income, Household_Income, Persons_per_Bedroom, Mortgage, Tenure, Language, Occupation, Education, Religion, Marital_Status, Disadvantage_Scores (IRSAD, IER, IEO), Total_Houses, Voluntary_Work, Age, English, Cultural Background, Motor_Vehicles
4. Ensemble Database
Columns: Geographic Area, Spatial Coordinates, Crime Counts, Demographics
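One way to picture the combined database is as one row per geographic area that joins coordinates, crime counts, and demographic variables. The sketch below assumes simple in-memory dictionaries; every unit code, coordinate, and value in it is hypothetical, and the function name `assemble_rows` is introduced only for illustration.

```python
# Hypothetical inputs; all codes, coordinates, and values are made up.
crime_counts = {
    "unit_A": {"assault": 2, "robbery": 1},
    "unit_B": {"assault": 1, "mvt": 1},
}
centroids = {"unit_A": (151.20, -33.87), "unit_B": (151.10, -33.80)}
demographics = {
    "unit_A": {"Population": 400, "Income": 52000},
    "unit_B": {"Population": 350, "Income": 61000},
}
CRIME_TYPES = ["assault", "robbery", "mvt"]


def assemble_rows(crime_counts, centroids, demographics, crime_types):
    """Build one row per geographic unit: coordinates, counts, demographics."""
    rows = []
    for unit in sorted(demographics):
        lon, lat = centroids[unit]
        row = {"unit": unit, "lon": lon, "lat": lat}
        for crime in crime_types:
            # Units with no recorded incidents of a type get a count of 0.
            row[crime] = crime_counts.get(unit, {}).get(crime, 0)
        row.update(demographics[unit])
        rows.append(row)
    return rows


rows = assemble_rows(crime_counts, centroids, demographics, CRIME_TYPES)
for row in rows:
    print(row)
```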
5. Define Model
[Model equations not recoverable from the slide]
6. Parameter Learning
Learn the unknown parameters and hyper-parameters of the model with a sampling algorithm: Markov Chain Monte Carlo (MCMC).
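To illustrate what parameter learning by Markov Chain Monte Carlo can look like, here is a minimal random-walk Metropolis sketch. The slides do not preserve the model definition, so the likelihood is an assumption: a Poisson log-linear regression with one covariate and Normal priors. The data, step size, and function names are all hypothetical.

```python
import math
import random


def log_posterior(a, b, xs, ys, prior_sd=10.0):
    """Unnormalised log posterior of an assumed Poisson log-linear model.

    Assumption (not from the slides): y_i ~ Poisson(exp(a + b * x_i)),
    with independent Normal(0, prior_sd^2) priors on a and b.
    """
    log_lik = sum(y * (a + b * x) - math.exp(a + b * x) for x, y in zip(xs, ys))
    log_prior = -(a * a + b * b) / (2.0 * prior_sd ** 2)
    return log_lik + log_prior


def metropolis(xs, ys, n_iter=20000, step=0.2, seed=0):
    """Random-walk Metropolis sampler; returns a list of (a, b) draws."""
    rng = random.Random(seed)
    a, b = 0.0, 0.0
    current = log_posterior(a, b, xs, ys)
    samples = []
    for _ in range(n_iter):
        # Symmetric Gaussian proposal around the current state.
        a_new = a + rng.gauss(0.0, step)
        b_new = b + rng.gauss(0.0, step)
        proposed = log_posterior(a_new, b_new, xs, ys)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposed - current)):
            a, b, current = a_new, b_new, proposed
        samples.append((a, b))
    return samples


# Made-up data: one standardised covariate and a crime count per unit.
xs = [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
ys = [1, 1, 2, 3, 4, 7, 12]

samples = metropolis(xs, ys)
kept = samples[5000:]  # discard burn-in draws
slope_mean = sum(b for _, b in kept) / len(kept)
print(f"posterior mean slope: {slope_mean:.2f}")
```

The retained draws approximate the posterior of the coefficients, so the spread of `b` across `kept` gives a direct measure of uncertainty about the covariate's importance.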
7. Evaluate Results
• Model based on demographic factors, BUT spatial correlations NOT taken into account
• Model based on demographic factors AND spatial correlations
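One common way to check whether a demographics-only model leaves spatial structure unexplained is to measure the spatial autocorrelation of its residuals. The sketch below uses Moran's I, a standard statistic that the slides do not name, so treat it as a swapped-in diagnostic. The adjacency structure and residual values are hypothetical.

```python
def morans_i(values, neighbours):
    """Moran's I with binary adjacency weights.

    values: one number per geographic unit (e.g. model residuals).
    neighbours: dict mapping a unit index to the indices of adjacent units
    (assumed symmetric).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Sum of products of deviations over all adjacent (i, j) pairs.
    cross = sum(dev[i] * dev[j] for i, adj in neighbours.items() for j in adj)
    total_weight = sum(len(adj) for adj in neighbours.values())
    return (n / total_weight) * cross / sum(d * d for d in dev)


# Four hypothetical units arranged in a line: 0 - 1 - 2 - 3.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

smooth = morans_i([1.0, 2.0, 3.0, 4.0], chain)       # neighbours are similar
alternating = morans_i([1.0, 4.0, 1.0, 4.0], chain)  # neighbours are dissimilar
print(smooth, alternating)
```

Values well above zero indicate that neighbouring units have similar residuals, which suggests the model would benefit from a spatial correlation term.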
Improvements and Future Work
Include other sources of information in the Probabilistic Regression Model, alongside the Historic Space-Time Criminal Records and Demographic Information:
• Environment: Parks, Drugstores, Liquor Stores, Stadium, Hotels, Bar, Night Club, Casino, Public Housing, Hospital, High Rise, Beach
• Transport: Bus Stops, Train Stations, Car Density, Bicycle Density, Pedestrian Activity, Road Types, Sidewalk Size
• Extra Crime Information (Alcohol, Drugs DUMA, DVSAT)
• Policing Levels
• Health
Pathways towards Criminal Behaviour
1. What are the important factors that affect the criminality levels of people over time?
• What drives young children to become involved in crime?
• How do life events diminish criminality among criminals? (Desistance)
• What factors characterise a specific type of criminal? Can we build clusters of criminals?
Criminal Behaviour Regression
Possible Information Sources: Criminal Records, Mental Health Records, Education, Housing, Parental History, Social Records, Social Media, Cultural Background
Probabilistic Model: Regression and Clustering
Outputs: Age-Crime Curve, Clusters of Criminals, Individual Pathways
Criminal Behaviour/Risk Projects
1. Juvenile Justice Risk Assessment: Evaluate how risk assessment is
conducted and how effective it is.
• Prior and Current Offenses
• Education
• Substance Abuse
• Family
• Personality/Behaviour
• Peers
• Leisure/Recreation
• Attitudes/Orientation
2. Police Citizens Youth Clubs: Effectiveness of Police Youth Clubs
depending on engagement of individuals.
[Figure: Status Level over Time. Categories: Finalised, Non Current, Active, Referenced, Declined, Rejected]
Acknowledgement
Blake Roles, Data-Science Master Student
Sally Cripps, Professor CTDS
Gordon McDonald, Research Engineer
Roman Marchant, Data Scientist CTDS
Garner Clancey
Tony Makkai
Data Science for
Crime and Criminal
Behaviour
Sebastian Haan
Sebastian.Haan@sydney.edu.au
Centre for Translational Data Science
The University of Sydney
Thanks!
Questions?
Correlation Matrix