A green-fingered approach can improve the clinical utility of violence risk assessment tools

Invited editorial: A green-fingered approach can improve the clinical utility of violence risk assessment tools

STUART THOMAS AND MORVEN LEESE, PO29, Health Services Research Department, Institute of Psychiatry, London, UK

Statistical methods are now routinely used to determine the strength of association between various explanatory (risk) factors and violence. Such methods can be used both for short-term prediction using ‘dynamic’ factors and for long-term prediction based on more stable characteristics. The statistical issues are similar whatever the type of data and length of time over which predictions are made: the main difference is the relative availability of data and samples on which to fit and test the models, rather than the models themselves.

It is often recommended that, in developing such models, univariate methods to estimate odds ratios for the risk factors should be considered first. Then perhaps stratified analyses may be employed to control for confounders such as age and sex. Common practice then dictates that the final stage is to combine any significant explanatory variables and other a priori choices (risk factors that are perhaps non-significant but considered clinically relevant) in a joint multivariate model, usually logistic regression. However, an alternative approach based on classification trees has recently been introduced as a potential competitor and has some attractive features.

This editorial focuses on the relative strengths and weaknesses of these two contrasting methods in relation to improving the predictive accuracy and thus the clinical utility of violence risk assessment models.

Binary logistic regression

The strength of the relationship between a risk factor and the outcome can be expressed as an odds ratio: a value greater (less) than one indicating a risk (protective) factor. Logistic regression models usually include only these ‘main effects’ and are generally not aimed at identifying combinations of risk factors that interact with one another to produce a particularly high or low risk of violence over and above these simple multiplicative effects. Thus according to this approach, if two risk factors each independently double the odds of violence, for example, having both risk factors quadruples the odds.
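The multiplicative behaviour of main effects can be illustrated with a minimal sketch. The baseline odds of 0.10 and the function names are hypothetical, chosen only for illustration; the point is that log-odds add in a main-effects logistic model, so odds ratios multiply.

```python
import math


def odds_to_probability(odds):
    """Convert odds to a probability."""
    return odds / (1.0 + odds)


def combined_odds(baseline_odds, odds_ratios):
    """Odds when several main-effect risk factors are all present.

    In a main-effects logistic model the log-odds are additive,
    so the odds ratios of the individual factors multiply.
    """
    log_odds = math.log(baseline_odds) + sum(math.log(r) for r in odds_ratios)
    return math.exp(log_odds)


baseline = 0.10  # hypothetical baseline odds of violence (assumption)
both = combined_odds(baseline, [2.0, 2.0])  # two factors, each doubling the odds

print("odds multiplier with both factors:", round(both / baseline, 6))
print("probability with both factors:", round(odds_to_probability(both), 4))
```

Note that quadrupling the odds does not quadruple the probability; the conversion function makes that distinction explicit.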

Criminal Behaviour and Mental Health, 13, 153–158 2003 © Whurr Publishers Ltd

CBMH 13.3_crc2 24/11/03 2:55 pm Page 153

The idea that factors contribute independently to the overall risk may be over-simplistic, and is not consistent with the view of Steadman et al. (2000) that predictors of violence vary for different people in different situations. We may need to consider the possibility of interactions. Within the logistic regression framework it is of course possible to fit specific interactions to account for the modifying effect certain risk factors may have on others. However, due to the cumbersome process of fitting separate interaction terms, and additional problems associated with multi-level interactions, this is rarely done unless there is some a priori evidence that they should be fitted.
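As a hedged illustration of what fitting a specific interaction involves, a model with two binary risk factors can be written as below. The symbols (p for the probability of violence, x_1 and x_2 for the risk factors, and the beta coefficients) are notation introduced here and do not appear in the editorial.

```latex
\log\frac{p}{1-p} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12}\, x_1 x_2
\qquad
\frac{\operatorname{odds}(x_1 = 1,\, x_2 = 1)}{\operatorname{odds}(x_1 = 0,\, x_2 = 0)}
  = e^{\beta_1}\, e^{\beta_2}\, e^{\beta_{12}}
```

Setting the interaction coefficient to zero recovers the purely multiplicative main-effects case; a non-zero value means the joint effect is larger or smaller than the product of the two separate odds ratios.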

Classification trees: an alternative to logistic regression

An alternative procedure based on classification trees has recently been introduced into the field of violence prediction (e.g. Monahan et al., 2000). This has the potential to improve the clinical utility of risk assessment tools by automatically considering interactions. This type of technique (available in packages such as AnswerTree and SPLUS) splits the sample on the basis of those risk factors that best discriminate between violent and non-violent cases, where ‘best’ is defined, for example, as that producing the lowest misclassification rate. One starts with the most predictive risk factor and works through the others in turn, as they relate to the prior factors. For example, the sample might be split by age (at the age which best divides the sample into violent and non-violent). For older patients the next most predictive factor is their record of previous violence, so this provides the next split for them, but for younger patients substance abuse is the next most predictive factor, so for them the split is made on this basis. The process continues to produce a series of splits like a tree, with the final end points or ‘terminal nodes’ representing groups of people with similar combinations of characteristics and (ideally) similar levels of risk.
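The splitting step described above can be sketched for a single continuous risk factor. This is a toy illustration, not the algorithm used by any particular package: the data are invented, the function names are hypothetical, and a full tree would apply the same search recursively within each resulting group and across all available risk factors.

```python
def misclassified(labels):
    """Errors made if every case in a group is given the group's majority label."""
    violent = sum(labels)
    return min(violent, len(labels) - violent)


def best_split(values, labels):
    """Find the cut 'value <= threshold' with the lowest misclassification rate.

    Returns (threshold, error_rate); threshold is None if no split
    improves on leaving the sample undivided.
    """
    n = len(labels)
    best = (None, misclassified(labels) / n)
    distinct = sorted(set(values))
    # Candidate thresholds lie midway between neighbouring observed values.
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        rate = (misclassified(left) + misclassified(right)) / n
        if rate < best[1]:
            best = (threshold, rate)
    return best


# Invented example data: age in years and violence (1) or non-violence (0).
ages = [19, 22, 24, 27, 35, 41, 48, 56]
violent = [1, 1, 1, 0, 1, 0, 0, 0]

threshold, error_rate = best_split(ages, violent)
print("split at age", threshold, "with misclassification rate", error_rate)
```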

Important interactions that had not been envisaged in advance have a chance to reveal themselves and thus the model can be particularly useful for generating hypotheses about the association between various contingent risk factors and violence or non-violence (Steadman, 2000). A further practical advantage is that it provides a relatively simple way of including missing values by assigning them their own categories and allowing them to form a separate part of the tree. Furthermore, it could be claimed that the method is an attractive alternative for clinicians as it is a transparent model that reflects the way clinicians think.

Tree models must therefore be considered as real competitors to the standard logistic models that are so widespread in violence prediction. They do, however, have a number of potential weaknesses.

Robustness and interpretability of trees

A particular problem with tree models is that they can produce complex trees with many branches, each representing very specific risk profiles for groups containing a few individuals. This can entail fitting a very large number of parameters and so can capitalize on chance in fitting to the particular sample (‘over-fitting’), especially with small-to-moderate samples. While on the one hand the automatic inclusion of multi-level interactions can uncover pertinent associations between various risk factors, this apparent benefit may be countered by a lack of generalizability to other populations. This could limit the practical utility of the model. Furthermore, the trees can be very complex and difficult to interpret unless they are simplified in some way.

A combination of methods can produce interpretable trees and can at least partly address the over-fitting issue. Trees can be ‘pruned’ to create more parsimonious models, based on pre-determined size for the smallest defined risk groups (e.g. Monahan et al., 2000; Silver et al., 2000), clinical importance, or on statistical methods according to, for example, minimum classification error (e.g. Thomas et al., submitted). Weights can also be ascribed to ‘cases’ to provide an enhanced emphasis on predicting violent (or non-violent) cases if this is required.
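Two of the devices mentioned above, a pre-determined minimum group size and case weights, can be sketched in a few lines. The function names, the group sizes and the weight of 3 are all hypothetical choices made for illustration; real packages implement pruning and weighting in their own ways.

```python
def keep_split(left_size, right_size, min_group_size):
    """Size-based pruning rule: reject a split if either resulting
    risk group would be smaller than the pre-determined minimum."""
    return min(left_size, right_size) >= min_group_size


def node_label(n_violent, n_nonviolent, weight_violent=1.0):
    """Label a terminal node by weighted majority.

    A weight above 1 for violent cases makes missing a violent case
    count for more than missing a non-violent one.
    """
    weighted_violent = n_violent * weight_violent
    return "violent" if weighted_violent > n_nonviolent else "non-violent"


# A split producing groups of 40 and 12 is rejected when the minimum is 20.
print(keep_split(40, 12, min_group_size=20))

# A node with 3 violent and 7 non-violent cases changes label once
# violent cases are given three times the weight (illustrative value).
print(node_label(3, 7))
print(node_label(3, 7, weight_violent=3.0))
```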

The choices of the particular model, pruning method and system of weighting are many and varied. This can be seen as a strength in the context of exploratory analyses but can also be seen as a weakness because it introduces the need for further subjective choices.

Outcomes with more than two categories

Before dealing with the issue of assessing prediction accuracy, it is worth mentioning that there are some useful techniques for dealing with outcomes with more than two categories, both in logistic and tree models. These are potentially useful when a binary split is unrealistic, for example in the situation where violence can be categorized in three ways: none at all, verbal violence only and physical violence. In a logistic framework, ordered and multiple (multinomial) logistic regression can be used (Agresti, 1996) and the package Stata, among others, provides software routines. A significant increase in the fit of a multiple logistic model compared with an ordered model would indicate, in the above example, that different models describe verbal as opposed to physical violence. The analogous case for tree models can also be dealt with using more sophisticated versions of the standard tree model.

Predictive accuracy

Whatever method is chosen, it is important to have a realistic view of how well any statistical model will actually work when used to predict violence in real, clinical practice. Accuracy is usually expressed in terms of sensitivity (proportion of true positives identified as such) and specificity (proportion of true negatives identified as such). These measures are independent of the base rate (prevalence) of violence in the population under study and are thus suitable as a general method for comparing different predictive tools in new situations. The area under the ROC curve (Mossman, 1994) and the index of effectiveness (Hasselblad and Hedges, 1995), based on combining sensitivity and specificity, can also be used as global measures of accuracy. Ideally such measures would be applied to completely new data sets collected prospectively. This is often not feasible, however, and if the same data are used to estimate parameters and assess fit, the predictive accuracy of the model will be over-optimistic. This is a problem when there are many parameters in relation to the sample size, as in the case of tree models.
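The two accuracy measures defined above can be computed directly from observed and predicted outcomes. The following is a minimal sketch with invented data; the function name is hypothetical.

```python
def sensitivity_specificity(actual, predicted):
    """Return (sensitivity, specificity) for binary outcomes coded 1/0.

    Sensitivity: proportion of truly violent cases predicted as violent.
    Specificity: proportion of truly non-violent cases predicted as non-violent.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Invented example: 4 violent and 6 non-violent cases.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

sens, spec = sensitivity_specificity(actual, predicted)
print("sensitivity:", sens)
print("specificity:", round(spec, 4))
```

Because each measure is computed within the truly violent or truly non-violent group separately, neither changes if the ratio of the two groups in the sample is altered.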

Efron and Tibshirani (1993, p. 239), recognizing that new data sets are not always available, describe alternative statistical methods of addressing this issue, including cross-validation, which they specifically recommend for classification trees. Cross-validation works by splitting the data set into a number of parts, removing one part, developing the model on the remainder, then applying that model to the part originally removed. This process is repeated until the whole data set has been used for developing the model and for testing, the overall accuracy being estimated by combining the misclassification rates of each separate run. This method can be recommended in situations where prospective data are not available, as it will provide a conservative (if anything pessimistic) view of prediction accuracy.
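The cross-validation procedure just described can be sketched as follows. This is an illustration under stated assumptions: the data are invented, the ‘model’ is a deliberately simple single age cut-off standing in for a tree or logistic model, and all function names are hypothetical.

```python
def k_fold_indices(n, k):
    """Split the case indices 0..n-1 into k non-overlapping parts."""
    return [list(range(i, n, k)) for i in range(k)]


def fit_threshold(train_x, train_y):
    """Toy model: pick the cut-off with fewest training errors and
    predict violence (1) for values at or below it."""
    def errors_at(cut):
        return sum(1 for x, y in zip(train_x, train_y)
                   if (1 if x <= cut else 0) != y)
    cut = min(sorted(set(train_x)), key=errors_at)
    return lambda x: 1 if x <= cut else 0


def cross_validated_error(xs, ys, fit, k):
    """Remove each part in turn, develop the model on the remainder,
    apply it to the removed part, and combine the misclassifications."""
    errors = 0
    for fold in k_fold_indices(len(xs), k):
        held_out = set(fold)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        model = fit(train_x, train_y)
        errors += sum(1 for i in fold if model(xs[i]) != ys[i])
    return errors / len(xs)


# Invented example data: age in years and violence (1) or non-violence (0).
ages = [19, 22, 24, 27, 35, 41, 48, 56]
violent = [1, 1, 1, 0, 1, 0, 0, 0]

full_model = fit_threshold(ages, violent)
apparent = sum(1 for x, y in zip(ages, violent) if full_model(x) != y) / len(ages)
print("apparent error (same data for fitting and testing):", apparent)
print("cross-validated error:", cross_validated_error(ages, violent, fit_threshold, k=4))
```

On this toy sample the cross-validated error comes out higher than the apparent error, which is the over-optimism the text warns about when the same data are used to fit and to assess a model.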

Base rates and uncertain prediction

Base rates are important in two ways. First, they can inform the choice of probability cut-off in a logistic regression or the allocation of terminal nodes of a tree. The default is usually 0.5 and this would be appropriate if classifying a new case in which no prior information was available, only the measurements themselves. However, it might be appropriate to classify a case as ‘positive’ if its predicted risk fell above some other cut-off (and as ‘negative’ if below), e.g. one based on the base rate. A further development of this idea is to classify cases as positive if their risk is more than, say, twice the base rate and negative if less than half the rate, leaving a window of uncertainty in the middle of this range. Cases in the window may be left as unclassified or, if using tree models, ‘iterative’ trees may be employed to reclassify these uncertain cases by fitting a new tree.
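The ‘window of uncertainty’ rule can be expressed as a short function. The factors of two and one half follow the example in the text; the function name, the base rate of 0.2 and the example risks are hypothetical.

```python
def classify_with_window(risk, base_rate, upper_factor=2.0, lower_factor=0.5):
    """Classify a predicted risk relative to the base rate.

    Positive if the risk is more than upper_factor times the base rate,
    negative if less than lower_factor times the base rate, and
    unclassified in the window between the two.
    """
    if risk > upper_factor * base_rate:
        return "positive"
    if risk < lower_factor * base_rate:
        return "negative"
    return "unclassified"


base_rate = 0.2  # hypothetical prevalence of violence in the sample
for risk in (0.45, 0.25, 0.05):
    print(risk, classify_with_window(risk, base_rate))
```

Cases returned as unclassified are the ones that an ‘iterative’ tree would take forward for a further round of fitting.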

In addition to using the base rate to inform the choice of parameters in the model, one can also estimate positive and negative predictive values, which may be of more interest from a clinical point of view than sensitivity and specificity, which describe the accuracy of the tool in general. The predictive values are the proportions of those predicted as violent (non-violent) who will actually be violent (non-violent) and they may be most relevant to the clinician in a specific situation. Both these quantities depend not only on sensitivity and specificity but also on the base rate, so that lower base rates result in lower positive predictive values and higher negative ones.
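The dependence of the predictive values on the base rate follows from Bayes' theorem and can be checked numerically. The sensitivity and specificity of 0.8 and the two base rates below are illustrative values only, and the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, base_rate):
    """Return (positive predictive value, negative predictive value)."""
    true_pos = sensitivity * base_rate
    false_pos = (1 - specificity) * (1 - base_rate)
    true_neg = specificity * (1 - base_rate)
    false_neg = (1 - sensitivity) * base_rate
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Same tool (sensitivity 0.8, specificity 0.8) at two illustrative base rates.
for base_rate in (0.5, 0.1):
    ppv, npv = predictive_values(0.8, 0.8, base_rate)
    print("base rate", base_rate, "PPV", round(ppv, 3), "NPV", round(npv, 3))
```

With the tool's sensitivity and specificity held fixed, moving to the lower base rate reduces the positive predictive value and raises the negative one, as stated above.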

Making sense of the statistics and the way forward

Even more fundamental than the technical points discussed above is the need to decide on the aim of the analysis. For example, it can be argued that, because violence remains a relatively rare phenomenon, in some situations one should focus on predicting those who definitely will not be violent in the future rather than those who will. A helpful discussion of the general issues to do with choosing models is given by Silver et al. (2000).

Yet more sophisticated approaches are currently being developed. Silver et al. (2000) suggest that the relationships evident from the classification trees could be used to inform the choice of important interaction terms that can then be fitted in a logistic regression model. These interactions could improve predictive accuracy without over-fitting the model to the data set. Alternatively, Silver and Chow-Martin (2002) are now proposing a ‘multiple models’ approach, based on a number of different trees which separately model different groups of factors (for example sociodemographic and offence data), the results being combined to make an overall prediction. The SMILES software package (Ferri-Ramirez et al., 2002) adopts the slightly different method of combining different types of statistical model (for example a logistic regression and a tree model) on the same set of risk factors. These various approaches offer the scope for further advance by capitalizing on the best aspects of different data and/or techniques without having to rely on a single analysis.

To sum up, we can certainly view classification trees as an attractive alternative to more traditional approaches to risk assessment modelling. Notwithstanding the complex underlying statistics, and the caution that is required with regard to robustness, such a ‘green-fingered’ approach seems to offer further insights into the way in which certain risk factors relate to each other in their association with violence or non-violence. However, it may turn out that the combination of several approaches and types of data may be better than any one individual approach. The challenge will be to apply some of these more experimental approaches prospectively, the only real test of a predictive tool.

References

Agresti A (1996) An Introduction to Categorical Data Analysis. New York: Wiley.

Efron B, Tibshirani RJ (1993) An Introduction to the Bootstrap, Monographs on Statistics and Applied Probability 57. London: Chapman & Hall.

Ferri-Ramirez C, Hernández-Orallo J, Ramírez-Quintana MJ (2002) SMILES, A multi-purpose learning system. Technical Report. Dep. Sistemes Informàtics i Computació, Univ. Politècnica de València (Spain) [http://www.dsic.upv.es/~flip/smiles/].

Hasselblad V, Hedges L (1995) Meta-analysis of screening and diagnostic tests. Psychological Bulletin 117: 167–178.

Monahan J, Steadman HJ, Appelbaum PS, Robbins P, Mulvey EP, Silver E, Roth LH, Grisso T (2000) Developing a clinically useful actuarial tool for assessing violence risk. British Journal of Psychiatry 176: 312–319.

Mossman D (1994) Assessing predictions of violence: being accurate about accuracy. Journal of Consulting and Clinical Psychology 62: 783–792.

Silver E, Chow-Martin L (2002) A multiple models approach to assessing recidivism risk: implications for judicial decision making. Criminal Justice & Behavior 29(5): 538–568.

Silver E, Smith WR, Banks S (2000) Constructing actuarial devices for predicting recidivism: a comparison of methods. Criminal Justice and Behavior 27(6): 733–764.

Steadman HJ (2000) From dangerousness to risk assessment of community violence: taking stock at the turn of the century. Journal of the American Academy of Psychiatry & The Law 28(3): 265–271.

Steadman HJ, Silver E, Monahan J, Appelbaum PS, Clark Robbins P, Mulvey EP, Grisso T, Roth LH, Banks S (2000) A classification tree approach to the development of actuarial violence risk assessment tools. Law and Human Behavior 24(1): 83–100.

Thomas SD, Leese M, Walsh E, McCrone P, Moran P, Burns T, Creed F, Tyrer P, Fahy T (submitted) Comparison of statistical models in predicting violence in psychotic illness.

Address correspondence to: Stuart Thomas, Research Fellow, PO29 Health Services Research Department, Institute of Psychiatry, London SE5 8AF, UK. Tel: 020 7848 0711; fax: 020 7277 1462; email: [email protected]
