PREDICTIVE POLICING IS A MORAL TECHNOLOGY

THE CASE OF PREDPOL

THE CASE OF PREDICTIVE POLICING

Palantir Law Enforcement

Hitachi Data Systems

CrimeView (Omega Group)

Microsoft Public Safety

Motorola

Bair Analytics (LexisNexis)

HunchLab (Azavea)

Predpol

Information Builders

SOCIAL MEDIA DATA AND PREDICTIVE POLICING

• Predicting crime with Twitter is not operational; it is still at the research stage.

• Geotagged tweets can be turned into « topics » (unsupervised classification with topic modeling), and the topics then serve as « independent variables » in a binary logistic regression model (Gerber, 2014).

• Geotagged tweets can be used as a proxy for the ambient population (Andresen, 2016).

• Twitter users can be seen as sensors of offline phenomena: disorder-related posts on Twitter are associated with actual police-recorded crime rates (supervised classification of tweets) (Williams, Burnap and Sloan, 2016).

• Twitter is a good candidate for « event detection » (riots, for instance).
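To make the second bullet more concrete, here is a minimal sketch of the logistic regression step only. It assumes the topic proportions per grid cell have already been produced by some topic model. The toy data, the function names and the learning-rate settings are invented for illustration and are not taken from Gerber (2014).

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit a binary logistic regression by batch gradient descent.

    X: list of rows, each row = topic proportions for one grid cell (assumed input).
    y: list of 0/1 labels (1 = at least one crime recorded in the cell).
    """
    n, n_features = len(X), len(X[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for row, label in zip(X, y):
            p = sigmoid(bias + sum(w * v for w, v in zip(weights, row)))
            err = p - label  # gradient of the log-loss w.r.t. the linear score
            for j, v in enumerate(row):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def predict_proba(weights, bias, row):
    """Predicted probability that a crime occurs in a cell with these topic proportions."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, row)))

# Invented toy data: proportions of two hypothetical topics in the tweets of six cells.
X = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.3, 0.7], [0.2, 0.8], [0.1, 0.9]]
y = [1, 1, 1, 0, 0, 0]
weights, bias = fit_logistic(X, y)
```

In this toy setting, a cell dominated by the first topic receives a predicted probability above 0.5 and a cell dominated by the second topic a probability below 0.5.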

STARTING POINT

Algorithmic drama (Ziewitz, 2016): inscrutability, opacity, critical dispossession.

A specific method (Ingold, 2017): touching algorithms, and knowing them from the inside.

Hypothesis: Experts who practice machine learning can have different moral visions of their activity that can be understood by analysing the values and material consequences that participate in the assessment by which predictions are built. Prediction can be seen as an inseparably cognitive, moral and material problem.

PREDPOL

Slogan: “More than hotspot”

Since Predpol's researchers state that their algorithm is inspired by one used in seismology, and since the company refuses, for commercial reasons, to provide access to its algorithm, a promising alternative is to consult directly the seismologists who developed and use this algorithm.

SELF-EXCITING POINT PROCESS

Risk intensity

Background density (exogenous fluctuation or hotspot)

Kernel

Contagion
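The four labels above correspond to the terms of a conditional intensity. A generic form of such a self-exciting model can be written as below. This is an assumed textbook form, not Predpol's proprietary formula. Here $\lambda$ is the risk intensity, $\mu$ the background density, $g$ the kernel, and the sum over past events $(x_i, y_i, t_i)$ is the contagion term.

```latex
\lambda(x, y, t) = \mu(x, y) + \sum_{i:\, t_i < t} g(x - x_i,\; y - y_i,\; t - t_i)
```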

The background rate and kernel parameters are estimated by an expectation-maximization algorithm.
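As an illustration of what such an estimation can look like, here is a minimal sketch of expectation-maximization for a purely temporal self-exciting process. It makes several simplifying assumptions that are not in the slides: time only (no space), a constant background rate `mu`, an exponential kernel `theta * omega * exp(-omega * dt)`, and boundary effects ignored in the M-step. The event times are invented.

```python
import math

def em_hawkes(times, T, iters=100, mu=1.0, theta=0.5, omega=1.0):
    """EM for a temporal self-exciting process on [0, T] (simplified sketch).

    times: sorted event times; mu: background rate;
    theta: expected number of events triggered by one event (branching ratio);
    omega: decay rate of the exponential kernel.
    """
    n = len(times)
    for _ in range(iters):
        bg_sum = 0.0       # E-step: expected number of background events
        trig_sum = 0.0     # E-step: expected number of triggered events
        trig_dt_sum = 0.0  # E-step: expected sum of parent-to-child delays
        for i in range(n):
            contrib = [theta * omega * math.exp(-omega * (times[i] - times[j]))
                       for j in range(i)]
            lam = mu + sum(contrib)  # intensity at event i
            bg_sum += mu / lam       # probability that event i is background
            for j, c in enumerate(contrib):
                p = c / lam          # probability that event j triggered event i
                trig_sum += p
                trig_dt_sum += p * (times[i] - times[j])
        # M-step: closed-form updates from the expected counts
        mu = bg_sum / T
        theta = trig_sum / n
        if trig_dt_sum > 0:
            omega = trig_sum / trig_dt_sum
    return mu, theta, omega

# Invented event times (in days) over an observation window of length T.
times = [0.5, 0.6, 0.7, 3.0, 3.1, 6.0, 6.2, 6.3, 9.5]
T = 10.0
mu_hat, theta_hat, omega_hat = em_hawkes(times, T)
```

In this parametrization `theta_hat` plays the role of the branching ratio: the estimated share of events attributed to triggering rather than to the background.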

MARSAN’S COMMENTS (SEISMOLOGIST)

“These results cast strong doubts on the capacity of the models proposed here to outperform simple hotspot maps obtained by smoothing, for the dataset analyzed. The triggering contribution [branching ratio] to the occurrence of future events is small (it accounts only for 1.7 % for the best model). Accounting for memory in the system therefore can only provide a very modest contribution to the effectiveness of the prediction scheme.

More importantly, it is assumed that the dynamics of the process stays the same over time. Possible non-stationarity of the process is thus clearly an issue, as it will prevent the use of past information to predict the future. This is for example experienced in this analysis, as 2015 burglary events are clearly not distributed (in time and in space) as they were in 2014. This non-stationarity is likely due to uncontrolled evolutions in the way these acts are performed, but, in situations where new prediction algorithms are set up and exploited by police patrols, could also be a response by burglars to such a change. Unlike natural processes like earthquakes, analyses like the one presented here could therefore have the ability to modify the observed process, making it more difficult to correctly predict future events.”

David Marsan, draft paper published on Mediapart Journal, April 2015.

Predictive policing as a public issue

MOHLER’S REACTION (PREDPOL)

Thanks for your email and sending along the analysis. I have found your work on nonparametric point processes quite interesting and influential! We have certainly seen the branching ratio vary quite a lot from city to city and crime type to crime type (from 0 to .5). As you point out, it is important to pick such parameters using cross validation in which case it is certainly possible that a simpler model may be favored. It also may be the case that the nonparametric model you are using is over-parametrized (it looks like it has over 30 parameters), so it may be over-fitting the training data. You might need more regularization, or you might want to use a semi-parametric model (you mention using an exponential smoothing kernel, which is essentially a parametric Hawkes process without the background rate).

Another thing you bring up is the non-stationarity of the process. I think this is important and something we tried to estimate in the JASA paper (where the background rate depends on time). Disentangling endogenous contagion from exogenous fluctuations in the intensity is a somewhat open problem, though I have done a little work in this area. The non-stationarity of the background rate is one big difference between crime and earthquakes, and you often try to factor in seasonality and other explicit exogenous predictors.

$$
\lambda(x, y, t) = v(t)\,\mu(x, y) + \sum_{i:\, t_i < t} g(x - x_i,\; y - y_i,\; t - t_i,\; M_i)
$$
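A small sketch can show how the two terms of this intensity are evaluated separately. Everything specific here is an assumption made for illustration: the seasonal form of `v`, the flat `mu`, the exponential-in-time and Gaussian-in-space kernel, its parameter values, and the event list. The mark M_i is left out for simplicity.

```python
import math

def kernel(dx, dy, dt, theta=0.2, omega=0.1, sigma=0.5):
    """Hypothetical triggering kernel g: exponential decay in time, Gaussian in space."""
    temporal = omega * math.exp(-omega * dt)
    spatial = math.exp(-(dx * dx + dy * dy) / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)
    return theta * temporal * spatial

def intensity(x, y, t, events, v, mu):
    """Return the two terms of lambda(x, y, t): (background, contagion)."""
    background = v(t) * mu(x, y)  # exogenous part, modulated in time by v(t)
    contagion = sum(kernel(x - xi, y - yi, t - ti)
                    for xi, yi, ti in events if ti < t)  # only past events contribute
    return background, contagion

def v(t):
    """Hypothetical yearly seasonality of the background rate (t in days)."""
    return 1.0 + 0.3 * math.sin(2 * math.pi * t / 365.0)

def mu(x, y):
    """Hypothetical spatially flat background density."""
    return 0.05

# Invented past events as (x, y, t) triples.
events = [(0.0, 0.0, 1.0), (0.2, 0.1, 3.0), (5.0, 5.0, 4.0)]

bg_near, trig_near = intensity(0.1, 0.0, 5.0, events, v, mu)   # close to two recent events
bg_far, trig_far = intensity(10.0, 10.0, 5.0, events, v, mu)   # far from every event
```

With a flat `mu`, the background term is identical at both locations, so any difference in risk between them comes from the contagion term alone. Making `v` more flexible changes the first term only and leaves the kernel untouched.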

A SIMPLE VISUALISATION OF MEMORY

[Figure, two panels: EARTHQUAKES and CRIMES]

Why is Mohler not destabilized by Marsan’s criticism?

TAKING THE FOLD OF THE ALGORITHM

Bilel, you have to understand. You're a statistician, you don't know much about the problem you're faced with, and the police say: "We pay you, we give you the data, give us the best possible model." You get to work and you realize that your model does well for a year and, a year later, it no longer does. You're a stats person, you don't know much about the problem. What do you do? What do you do?

Marsan waits a minute while he looks me in the eye. As a good sociologist, I say nothing.

As a statistician, Mohler says: "My model is not flexible enough, I'll make it a little more flexible and I'll add v(t)." Well, I'd rather go to the Chicago police officers to find out what happened, what changed. Why is 2015 different from 2014? Is this a counting problem? Have the police changed their habits? Anyway, you try to figure out what makes it change from year to year. Maybe Mohler is trying to understand, but his attitude makes me think that it is not much of a concern for him. He is looking to improve the predictive efficiency of his algorithm. But since it doesn't work very well, he tries to loosen it up a bit so that things get better. His model isn't flexible enough, so he says: "My µ(x, y), I'll loosen it up a little to make it work better, by adding a temporal variation to the background rate." (David Marsan)

A MORAL STATEMENT

Maybe it's not the right approach. Maybe it's even the contagion that differs from one year to the next. You would have to change the contagion kernels again. But that is the most tedious part to adjust. It's simpler to add a temporal variable. What he does is very basic. In seismology, we do much more complex things to make the background rate evolve over time, to account for non-stationarity. The essential step after the Predpol article would be to understand the non-stationarity. Yet they are moving forward blind. I think you can't process your data without questioning the reality it represents. If you like, we are not driven by the same engine. What interests us in seismology is not making predictions, it's understanding the shape of the kernel. Contagion interests us because it gives us clues about the mechanisms by which one earthquake triggers another. It interests us because it teaches us something about the seismogenic process. We are not going to impose a shape a priori, because the shape is what interests us. He is not interested in the shape of the contagion. He doesn't want to understand how contagion will take place. He wants to make a prediction. That is something else entirely. In our field, we find the same type of researcher. We have people who make predictions but don't want to understand the process. Many of us think that this leads to a dead end.

TWO WAYS OF VALUING PREDICTION

Marsan assesses the accuracy of the algorithm: what is evaluated is the algorithm's ability to reveal a close link between the mathematical model and a coherent conception of the phenomenon.

According to Predpol, if the algorithm improves the precision of prediction scores, then the algorithm is good enough.

We now need to track the network that is deployed in these two different ways of valuing prediction.

Let us recall this basic principle of the sociology of science: phenomena are defined by the response they give to the tests that scientists make them undergo in their laboratory.

AFTERSHOCKS: TRACES TO UNDERSTAND SEISMICITY

"In his analyses of aftershocks, the seismologist doesn't just count them. First of all, it is in these turbulent periods that he is most likely to catch a major earthquake in the mesh of his measurement networks. If the recordings are of sufficient quality and number, he will be able to scan the fault rupture. Even without a large aftershock, he will learn a great deal from the small ones, especially about the directions of the tectonic stresses, which he can deduce from their mechanisms.

[…] as the images of the aftershocks became more accurate, their interpretation in detail seemed impossible, since it depended on uncontrollable parameters related to the unknowable strength and state of stress of the peripheral faults." (Bernard, p. 108; reading recommended by Marsan.)

MARSAN’S NOMINALISM

Even though great progress has been made in the last decade [on declustering algorithms], there are still many open questions, i.e., starting with the physical triggering of earthquakes (aftershocks), effects of uncertainties in the catalog on the results of declustering, or the effect of censored data (selection in time, space and magnitude range) on the outcome. In summary, care should be taken when interpreting results of declustering or results that depend on a declustered catalog, because these results cannot reflect the exact nature of foreshocks, mainshocks and aftershocks; indeed the exact nature of these events may not exist at all! (David Marsan)

TWO WAYS TO MOBILIZE REPETITIONS

Marsan is interested in aftershocks because they have the power to help him conceptualize the process of seismicity in a new way.

Repetitions are of interest to Predpol researchers because they add a further source of regularity to hotspot mapping. Repetitions of crimes are mobilized for their ability to capture the largest possible proportion of events.
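One common way to quantify "the largest possible proportion of events" is a capture rate: the share of later events that fall inside the cells flagged as highest risk. The sketch below is an assumed illustration of that idea, with invented cell names, scores and events. It is not Predpol's evaluation code.

```python
def capture_rate(risk_by_cell, test_events, k):
    """Share of test events located in the k cells with the highest predicted risk."""
    ranked = sorted(risk_by_cell, key=risk_by_cell.get, reverse=True)
    flagged = set(ranked[:k])  # the k cells a patrol would be sent to
    if not test_events:
        return 0.0
    hits = sum(1 for cell in test_events if cell in flagged)
    return hits / len(test_events)

# Invented risk scores per grid cell and invented later events (one cell label per event).
risk = {"A": 0.9, "B": 0.4, "C": 0.7, "D": 0.1}
test_events = ["A", "A", "C", "D", "B"]

rate = capture_rate(risk, test_events, k=2)  # cells A and C are flagged
```

Here three of the five events fall in the two flagged cells, so the capture rate is 0.6. A score of this kind says nothing about why events repeat, only how many of them are caught.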

Replicas (aftershocks or repeat victimizations) are what they do in response to what we try to make them do.

PRACTICAL CONSEQUENCE AND PREDICTION

In Chambéry, classes of measured entities exist in a domain where prediction refers to demonstrable consequences, which is not the case in the policing domain.

✓ The moral of this controversy is that the robustness of a prediction is inversely proportional to its practical consequences.

MORAL ANTHROPOLOGY

Different styles of moral reasoning are embedded in different kinds of social circumstances, and that forms of moral reasoning only flourish in those social circumstances that are well suited to them. Consequentialist moral reasoning, for example, only works where people have a sense that the social world they inhabit is relatively predictable, such that the probable consequences of an action appear relatively easy to gauge with certainty. Where such conditions do not hold, deontological approaches make much more sense – even in situations in which one cannot control the consequences of one’s actions, one can control whether or not they conform to a rule or set of rules.

Joel Robbins, 2010, “On the Pleasures and Dangers of Culpability.” Critique of Anthropology 30(1): 122-128.

CONSEQUENTIALIST VS DEONTOLOGICAL APPROACH

➢ A consequentialist approach in seismology, as the goal in this case is to measure the practical consequences of an action in the near future.

➢ A deontological approach in the fight against crime, as respect for the legal principles contained in the algorithm allows the police to concentrate on the immediacy of these acts.

THE DIVINATORY PART OF PREDICTIVE MACHINES

The attitudes of Pentecostals analyzed by Robbins and the integration of predictive machines into the police organization may be more closely related in this respect than one might think. Some place the future in the hands of gods, others in those of a machine in which police leaders hope to find salvation. When operating in this style of moral reasoning, the predictive machines of machine learning are not only made up of technology, science and organization, but also contain a part of divination. For future inquiries, an analysis of the modalities of prediction in the world of machine learning and the more occult world of witchcraft or astrology could prove fruitful.
