PREDICTIVE
POLICING IS A
MORAL
TECHNOLOGY
THE CASE OF PREDPOL
THE CASE OF
PREDICTIVE POLICING
Palantir Law Enforcement
Hitachi Data Systems
Crime View (Omega Group)
Microsoft Public Safety
Motorola
Bair Analytics (LexisNexis)
Hunchlab (Azavea)
Predpol
Information Builders
SOCIAL MEDIA DATA AND
PREDICTIVE POLICING
• Predicting crime with Twitter is not operational; it is still at the research stage.
• Geotagged tweets can be grouped into "topics" (unsupervised classification with topic modeling), and these topics serve as "independent variables" in a binary logistic regression model (Gerber, 2014).
• Geotagged tweets can be used as a proxy for the ambient population (Andresen, 2016).
• Twitter users can be seen as sensors of offline phenomena: disorder-related posts on Twitter are associated with actual police-recorded crime rates (supervised classification of tweets) (Williams, Burnap and Sloan, 2016).
• Twitter is a good candidate for "event detection" (riots, for instance).
STARTING POINT
Algorithmic drama (Ziewitz, 2016): inscrutability, opacity, critical
dispossession.
A specific method (Ingold, 2017): touching algorithms, and
knowing them from the inside.
Hypothesis: Experts who practice machine learning can have
different moral visions of their activity that can be understood by
analysing the values and material consequences that participate in
the assessment by which predictions are built. Prediction can be
seen as an inseparably cognitive, moral and material problem.
PREDPOL
Slogan: “More than hotspot”
Because Predpol's researchers state that their algorithm is
inspired by an algorithm used in seismology, and because the
company refuses, for commercial reasons, to provide access
to it, a promising alternative is to consult directly the
seismologists who developed and use this algorithm.
SELF EXCITING POINT
PROCESS
Risk intensity
Background density (exogenous fluctuation or
hotspot)
Kernel
Contagion
The background rate and kernel parameters are estimated by
an expectation-maximization algorithm.
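As an illustration of what such an estimation involves, here is a minimal sketch of expectation-maximization for a purely temporal self-exciting process. It is not Predpol's implementation, which is not public. The assumptions are: a constant background rate `mu`, an exponential kernel `theta * omega * exp(-omega * dt)`, no spatial component, and no correction for edge effects. The names `em_step`, `fit`, `horizon`, `theta` and `omega` are hypothetical.

```python
import math


def em_step(times, horizon, mu, theta, omega):
    """One EM update for a temporal self-exciting process on [0, horizon].

    Assumed model (illustrative only):
        lambda(t) = mu + sum over t_j < t of theta * omega * exp(-omega * (t - t_j))
    `times` must be sorted in increasing order and non-empty.
    """
    n = len(times)
    if n == 0:
        raise ValueError("at least one event is required")

    background_mass = 0.0  # sum of P(event i comes from the background)
    triggered_mass = 0.0   # sum of P(event i was triggered by an earlier event j)
    weighted_lags = 0.0    # sum of those probabilities times the lag t_i - t_j

    for i in range(n):
        # E-step: split the intensity at t_i into background and triggered parts.
        lags = [times[i] - times[j] for j in range(i)]
        kernel = [theta * omega * math.exp(-omega * lag) for lag in lags]
        lam = mu + sum(kernel)
        background_mass += mu / lam
        for lag, k in zip(lags, kernel):
            p = k / lam
            triggered_mass += p
            weighted_lags += p * lag

    # M-step: closed-form updates (edge effects at `horizon` are ignored).
    new_mu = background_mass / horizon
    new_theta = triggered_mass / n  # share of events attributed to contagion
    new_omega = triggered_mass / weighted_lags if weighted_lags > 0 else omega
    return new_mu, new_theta, new_omega


def fit(times, horizon, mu=1.0, theta=0.5, omega=1.0, iterations=100):
    """Repeat the EM update a fixed number of times from arbitrary starting values."""
    for _ in range(iterations):
        mu, theta, omega = em_step(times, horizon, mu, theta, omega)
    return mu, theta, omega
```

In this sketch `theta` plays the role of the branching ratio: the share of events that the model attributes to contagion rather than to the background.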
MARSAN’S COMMENTS
(SEISMOLOGIST)
“These results cast strong doubts on the capacity of the models proposed here to outperform simple hotspot maps obtained by smoothing, for the dataset analyzed. The triggering contribution [branching ratio] to the occurrence of future events is small (it accounts only for 1.7 % for the best model). Accounting for memory in the system therefore can only provide a very modest contribution to the effectiveness of the prediction scheme.
More importantly, it is assumed that the dynamics of the process stays the same over time. Possible non-stationarity of the process is thus clearly an issue, as it will prevent the use of past information to predict the future. This is for example experienced in this analysis, as 2015 burglary events are clearly not distributed (in time and in space) as they were in 2014. This non-stationarity is likely due to uncontrolled evolutions in the way these acts are performed, but, in situations where new prediction algorithms are set up and exploited by police patrols, could also be a response by burglars to such a change. Unlike natural processes like earthquakes, analyses like the one presented here could therefore have the ability to modify the observed process, making it more difficult to correctly predict future events.”
David Marsan, draft paper published in Mediapart Journal, April 2015.
Predictive policing as a public issue
MOHLER’S REACTION
(PREDPOL)
Thanks for your email and sending along the analysis. I have found your work on nonparametric point processes quite interesting and influential! We have certainly seen the branching ratio vary quite a lot from city to city and crime type to crime type (from 0 to .5). As you point out, it is important to pick such parameters using cross validation in which case it is certainly possible that a simpler model may be favored. It also may be the case that the nonparametric model you are using is over-parametrized (it looks like it has over 30 parameters), so it may be over-fitting the training data. You might need more regularization, or you might want to use a semi-parametric model (you mention using an exponential smoothing kernel, which is essentially a parametric Hawkes process without the background rate).
Another thing you bring up is the non-stationarity of the process. I think this is important and something we tried to estimate in the JASA paper (where the background rate depends on time). Disentangling endogenous contagion from exogenous fluctuations in the intensity is a somewhat open problem, though I have done a little work in this area. The non-stationarity of the background rate is one big difference between crime and earthquakes, and you often try to factor in seasonality and other explicit exogenous predictors.
$$\lambda(x, y, t) = \nu(t)\,\mu(x, y) + \sum_{i:\, t_i < t} g(x - x_i,\; y - y_i,\; t - t_i,\; M_i)$$
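To make the roles of the terms concrete, the following sketch evaluates an intensity of this form and the share of events it attributes to triggering rather than to the background. Every function body is an assumption chosen for illustration: a seasonal cosine for the temporal factor `nu`, a constant spatial density `mu`, and a kernel `g` that is Gaussian in space and exponential in time and ignores the magnitude term. None of this is taken from Predpol or from the JASA paper.

```python
import math


def nu(t, amplitude=0.3, period=365.0):
    """Hypothetical seasonal modulation of the background rate (mean 1 over a period)."""
    return 1.0 + amplitude * math.cos(2.0 * math.pi * t / period)


def mu(x, y):
    """Hypothetical spatial background density, constant for simplicity."""
    return 0.5


def g(dx, dy, dt, theta=0.1, omega=0.2, sigma=0.1):
    """Hypothetical triggering kernel: Gaussian in space, exponential decay in time.

    It integrates to `theta` over space and positive lags; magnitude is ignored.
    """
    if dt <= 0:
        return 0.0
    space = math.exp(-(dx * dx + dy * dy) / (2.0 * sigma ** 2)) / (2.0 * math.pi * sigma ** 2)
    return theta * omega * math.exp(-omega * dt) * space


def intensity(x, y, t, events):
    """Background term nu(t) * mu(x, y) plus contributions of strictly earlier events."""
    triggered = sum(g(x - xi, y - yi, t - ti) for xi, yi, ti in events if ti < t)
    return nu(t) * mu(x, y) + triggered


def triggered_share(events):
    """Average probability, under the model, that an event was triggered by an earlier one."""
    shares = []
    for x, y, t in events:
        lam = intensity(x, y, t, events)
        shares.append(1.0 - nu(t) * mu(x, y) / lam)
    return sum(shares) / len(shares)
```

Under these assumptions, events that are far apart in space and time yield a triggered share close to zero, while tightly clustered events yield a larger one. This is the sense in which a small branching ratio means that memory contributes little beyond the background map.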
A SIMPLE
VISUALISATION OF
MEMORY
[Figure panels: EARTHQUAKES; CRIMES]
Why is Mohler not destabilized by Marsan’s criticism?
TAKING THE FOLD OF
THE ALGORITHM
Bilel, you have to understand. You're a statistician, you don't know much about the problem you're faced with, and the police say: "We pay you, we give you the data, give us the best possible model". You get to work and you realize that your model does well for one year and, a year later, it does not. You're a stats person, you don't know much about the problem. What do you do? What do you do?
Marsan waits a minute while he looks me in the eye. As a good sociologist, I say nothing.
As a statistician, Mohler says: "My model is not flexible enough, so I'll make it a little more flexible and add ν(t)." Well, I would rather go to the Chicago police officers to find out what happened, what changed. Why is 2015 different from 2014? Is this a counting problem? Have the police changed their habits? In any case, you try to figure out what makes it change from year to year. Maybe Mohler is trying to understand, but his attitude makes me think that this is not much of a concern for him. He is looking to improve the predictive efficiency of his algorithm. Since it doesn't work very well, he tries to loosen it up a bit so that things go better. His model isn't flexible enough, so he says: "I'll loosen up my μ(x, y) a little by adding a temporal variation to the background rate." (David Marsan)
A MORAL STATEMENT
Maybe it is not the right approach. Maybe it is even the contagion that differs from one year to the next. The contagion kernels would have to be changed again. But that is the most tedious thing to adjust. It is simpler to add a time variable. What he does is very basic. In seismology, we do much more complex things to make the background rate evolve over time, to account for non-stationarity. The essential step after the Predpol paper would be to understand the non-stationarity. Yet they are moving forward blindly. Personally, I think you cannot process your data without questioning the reality they represent. You see, we are not driven by the same engine. What interests us in seismology is not making predictions, it is understanding the shape of the kernel. Contagion interests us because it gives us clues about the mechanisms by which one earthquake triggers another. It interests us because it teaches us something about the seismogenic process. We are not going to impose a shape a priori, because the shape is what interests us. He is not interested in the shape of the contagion. He does not want to understand how contagion takes place. He wants to make a prediction. The two have nothing to do with each other. In our field, we find the same type of researcher. We have people who make predictions but have no wish to understand the process. Many of us think that this leads to a dead end.
TWO WAYS TO
VALUING PREDICTION
Marsan assesses the accuracy of the algorithm: what is
evaluated is the algorithm's ability to reveal a close link
between the mathematical model and a coherent conception
of the phenomenon.
According to Predpol, if the algorithm improves the precision
of prediction scores, then the algorithm is good enough.
We now need to track the network that is deployed in these
two different ways of valuing prediction.
Let us recall this basic principle of the sociology of science:
phenomena are defined by the response they give to the
tests that scientists make them undergo in their laboratories.
AFTERSHOCKS: TRACES
TO UNDERSTAND
SEISMICITY
"In his analyses of aftershocks, the seismologist doesn't just count them. First of all, it is in these turbulent periods that he is most likely to catch a major earthquake in the mesh of his measurement networks. If the recordings are of sufficient quality and number, he will be able to scan the fault break. Even without a large aftershock, he will learn a great deal from the small ones, especially about the directions of the tectonic stresses, which he can deduce from their mechanisms.
[…] as the images of the aftershocks became more accurate, their detailed interpretation seemed impossible, as it depended on uncontrollable parameters related to the unknowable strength and state of stress of the peripheral faults." (Bernard, p. 108; reading recommended by Marsan.)
MARSAN’S
NOMINALISM
Even though great progress has been made in the last
decade [on declustering algorithms], there are still many
open questions, i.e., starting with the physical triggering of
earthquakes (aftershocks), effects of uncertainties in the
catalog on the results of declustering, or the effect of
censored data (selection in time, space and magnitude
range) on the outcome. In summary, care should be taken
when interpreting results of declustering or results that
depend on a declustered catalog, because these results
cannot reflect the exact nature of foreshocks, mainshocks
and aftershocks; indeed the exact nature of these events may
not exist at all! (David Marsan)
TWO WAYS TO MOBILIZE
REPETITIONS
Marsan is interested in aftershocks because they have the
power to help him conceptualize the process of seismicity in
a new way.
Repetitions are of interest to Predpol researchers because
they add a further source of regularity to hotspot mapping.
Repeated crimes are mobilized for their ability to capture
the largest possible proportion of events.
Repetitions (aftershocks or repeat victimization) are defined
by what they do in response to what we try to make them do.
PRACTICAL CONSEQUENCE AND
PREDICTION
In Chambéry, the classes of measured entities exist in a domain where
prediction refers to demonstrable consequences, which is not the
case in policing.
✓ The moral of this controversy is that the robustness of a
prediction is inversely proportional to its practical
consequences.
MORAL ANTHROPOLOGY
Different styles of moral reasoning are embedded in different kinds of social circumstances, and forms of moral reasoning only flourish in those social circumstances that are well suited to them. Consequentialist moral reasoning, for example, only works where people have a sense that the social world they inhabit is relatively predictable, such that the probable consequences of an action appear relatively easy to gauge with certainty. Where such conditions do not hold, deontological approaches make much more sense – even in situations in which one cannot control the consequences of one’s actions, one can control whether or not they conform to a rule or set of rules.
Joel Robbins, 2010, “On the Pleasures and Dangers of Culpability.” Critique of Anthropology 30(1): 122-128.
CONSEQUENTIALIST VS
DEONTOLOGICAL
APPROACH
➢ A consequentialist approach in seismology, as the goal in
this case is to measure the practical consequences of an
action in the near future.
➢ A deontological approach in the fight against crime, as
respect for the legal principle contained in the algorithm
allows the police to concentrate on the immediacy of
these acts.
THE DIVINATORY PART
OF PREDICTIVE
MACHINES
The attitudes of the Pentecostals analyzed by Robbins and the
integration of predictive machines into the police
organization may be more closely related in this respect than
one might think. Some place the future in the hands of
gods, others in those of a machine in which police
leaders hope to find salvation. When operating in this style of
moral reasoning, the predictive machines of machine
learning are not only made up of technique, science and
organization, but also contain a part of divination. For future
surveys, a comparative analysis of the modalities of prediction
in the world of machine learning and in the more occult world
of witchcraft or astrology could prove fruitful.