
Original Article

Coping with Nasty Surprises: Improving Risk Management in the Public Sector Using Simplified Bayesian Methods

Mark Matthews and Tom Kompas*

Abstract

Bayesian methods are particularly useful for informing decisions when information is sparse and ambiguous but decisions involving risks must still be made in a timely manner. Given the utility of these approaches to public policy, this article considers the case for refreshing the general practice of risk management in governance by using a simplified Bayesian approach based on raw data expressed as ‘natural frequencies’. This simplified Bayesian approach, which benefits from the technical advances made in signal processing and machine learning, is suitable for use by non-specialists, and focuses attention on the incidence and potential implications of false positives and false negatives in the diagnostic tests used to manage risk. The article concludes by showing how graphical plots of the incidence of true positives relative to false positives in test results can be used to assess diagnostic capabilities in an organisation—and also inform strategies for capability improvement.

Key words: risk, management, Bayesian inference, public sector

1. Introduction

This article considers the case for rethinking how risk is managed in the public sector in general, and in organisations responsible for handling security-related risks in particular. It performs a pragmatic role in drawing the attention of practitioners to practical concerns over the current effectiveness of risk management methods while also performing a ‘translational’ role by raising awareness of the relevance of Bayesian inference to these practical challenges and by explaining how these methods can be used by non-specialists as part of a reformed risk management agenda.

This question over the case for refreshing approaches to risk is posed because of the confluence of three factors that have, arguably, created a situation in which approaches to risk are increasingly pervasive—yet can be ineffective. As Michael Power has observed:

Risk talk and risk management practices, rather like auditing in the 1990s, embody the fundamentally contradictory nature of organisational and political life. On the one hand there is a functional and political need to maintain myths of control and manageability, because this is what various interested constituencies and stakeholders seem to demand. Risks must be made auditable and governable. On the other hand,

* Matthews: Australian Centre for Biosecurity and Environmental Economics, Crawford School of Public Policy, The Australian National University, Canberra, Australian Capital Territory 0200, Australia; Kompas: Australian Centre for Biosecurity and Environmental Economics, Crawford School of Public Policy, The Australian National University, Canberra, Australian Capital Territory 0200, Australia, and Centre of Excellence for Biosecurity Risk Analysis, The University of Melbourne, Victoria 3010, Australia. Corresponding author: Matthews, email [email protected].


Asia & the Pacific Policy Studies, vol. 2, no. 3, pp. 452–466. doi: 10.1002/app5.100

© 2015 The Authors. Asia and the Pacific Policy Studies published by Crawford School of Public Policy at The Australian National University and Wiley Publishing Asia Pty Ltd. This is an open access article under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made.


there is a consistent stream of failures, scandals and disasters which challenge and threaten organisations, suggesting a world which is out of control and where failure may be endemic, and in which the organisational interdependencies are so intricate that no single locus of control has a grasp of them. [Power 2004, p. 10]

These problems were especially evident in the actions addressed by Australia’s recent Royal Commission into the implementation of the home insulation scheme (Hanger 2014). The Royal Commission found that the Australian Government’s management of risk had been seriously deficient in a number of interrelated respects. These can be characterised as a failure to engage with risk as an integral aspect of the program, a preference for treating risk as an impediment to rapid policy delivery, and a situation in which no overall responsibility for managing risk was defined and acted upon. Risk management was treated, in effect, as a compliance ritual rather than as a serious approach to identifying and reducing risks.

Our focus and criticisms of current practice are not directed at existing technical work in risk analysis and assessment—approaches that are very useful in shedding light on, and informing decisions about, complex and challenging risk-related problems. Rather, we address the more general issue highlighted by Michael Power, framed in the broad context of how ‘managerialist’ approaches in the public sector, in combination with the dominant ethos of evidence-based policy-making and an increasingly pervasive emphasis on risk management (framed as risk avoidance), create a situation in which risk management involves voluminous standards and guidelines applied to what governance involves—but surprisingly sparse technical assistance as regards actually identifying, assessing and dealing with risks as a core aspect of governments’ role as uncertainty and risk managers of ‘last resort’. Indeed, the proposed solution to this general challenge of refreshing risk management relies upon finding practical ways of transforming the Bayesian approaches already being used by technical specialists for use by non-specialists.

At present, while the importance of adaptive learning based on keeping options open in an uncertain and changing world is a well-established managerial principle (see Klein and Meckling 1958 for an influential contribution), after nearly 60 years there are still pleas for governance to transition to a more adaptive learning-based and ‘experimentalist’ mode (Sabel and Zeitlin 2012). Although the framing may be different (Klein and Meckling were concerned with weapons system development decision-making, and Sabel and Zeitlin stress a multilevel governance context), the underlying principles, as reflected in governance, have been constant. The challenge is to find practical ways for the public sector to operate in an adaptive learning-based and generally experimentalist manner while still being able to demonstrate value for money.

This article seeks to respond to the challenge set out by this prior work that stresses shortcomings in risk management by considering practical responses that government departments and agencies can start to develop to improve their risk management effectiveness.

2. Appraisal of the Current Situation

Contemporary notions of good governance reflect the confluence of three policy narratives that describe how efficiency and effectiveness can be achieved in the public sector.

First, the ‘new public management’ ethos characterised by the privatisation of certain public services and a strong emphasis on the use of targets and performance measures to drive performance and demonstrate transparency and accountability (Hood 1991).

Second, the concept of evidence-based policy-making that places a high priority on the collection and analysis of data as a basis for making policy decisions and makes explicit claims about avoiding ‘ideological’ issues (Solesbury 2001).

Third, the strong and pervasive role of formal process compliance-based approaches to risk management reflected in various standards and guidelines, and especially in ISO 31000:2009.


In combination, this mix encourages decision-makers in the public sector to approach risk as:

• something that should be avoided and/or displaced onto others (Power 2004)—an aspect that can make it difficult to integrate risk into effective strategic plans

• a compliance-based impediment to policy delivery rather than an integral component of what delivering policy in a prudent manner actually involves doing (Hanger 2014)

• a failure to achieve clearly defined objectives, with risk defined as ‘uncertainty over objectives’ (ISO 31000:2009)

• a matter of maintaining ‘risk registers’ based on the assumed likelihood of occurrence and severity of potential impact—but with no overall measure of resulting risk exposure

• a tendency to overlook the implications, for decision-making, of treating substantive uncertainty (that is, incalculable risk) as if it is calculable risk, and as a result missing opportunities to learn-by-doing in coping with the inherent uncertainties in governing (Matthews 2015).

The result is a situation in which the management of real, and often unavoidable, uncertainties and risks by officials is hampered by a reluctance to speculate and conjecture (as that goes beyond the available ‘evidence’). Indeed, the concept of evidence used in the public sector tends to err towards a preference for ‘facts’, and especially simple ‘killer’ facts used to justify a desired and media-friendly intervention (Stevens 2011), rather than a more scientific approach based on the empirical testing of hypotheses.

This aversion to speculation and conjecture is problematic, as Andrew Stirling has observed:

Since contemplating the unknown necessarily requires imaginings beyond the available evidence, it is treated as unscientific in conventional risk regulation. What is truly unscientific, however, is this effective denial of the unknown. [Stirling 2009]

In the real world of risk management, adversaries and natural phenomena are unpredictable. However, this does not mean that resulting risks cannot be considered and prepared for—including situations in which an adversary may prepare to act to exploit, and amplify, the damage caused by an unpredictable natural crisis or crises. In short, effective risk management is best approached as an effort to reduce the potential for nasty surprises by combining creativity and conjectures about what could happen with a recognition that aiming simply to comply with prevailing risk management standards and guidelines can, in some circumstances, amplify rather than reduce the potential for nasty surprises (by engendering complacency and passivity once rules are complied with).

While the current emphasis within government is on basing decisions on robust evidence, this is not a basis for effective risk management. It is wise, therefore, to be clear about the differences between concepts of ‘proof’ using evidence and the tasks of intelligence, which operate under far more ambiguous, time-constrained and fluid conditions. As ex-CIA officer Bruce Berkowitz comments on the distinction between intelligence work and detective work:

Detective work and intelligence collection may resemble each other, but they are really completely different. Detectives aim at meeting a specific legal standard—‘probable cause’, for example, or ‘beyond a reasonable doubt’ or ‘preponderance of evidence’. It depends on whether you want to start an investigation, put a suspect in jail or win a civil suit. Intelligence, on the other hand, rarely tries to prove anything; its main purpose is to inform officials and military commanders. The clock runs differently for detectives and intelligence analysts, too. Intelligence analysts—one hopes—go to work before a crisis; detectives usually go to work after a crime. Law enforcement agencies take their time and doggedly pursue as many leads as they can. Intelligence analysts usually operate against the clock. There is a critical point in time where officials have to ‘go with what they’ve got’, ambiguous or not. But the biggest difference—important in all the current controversies—is that intelligence agencies have to deal with opponents who take countermeasures. Indeed, usually the longer one collects information against a target, the better the target becomes at evasion. So do other potential targets, who are free to watch. [Berkowitz 2003]

In short, if we base risk management exclusively on evidence, and if we compound this by relying on fixed performance targets that limit the capacity to learn and adapt to risk, then the outcome is likely to be amplified rather than reduced levels of risk.

3. Current Approaches to Risk and Their Consequences

There are presently two main ways in which risk is defined:

• as the likelihood that something may happen and the magnitude/severity of the consequences if it does happen;

• as the effect that uncertainty has on the achievement of objectives (ISO 31000:2009).

It is common for risk management in the public sector to be based on classifications using ‘risk matrices’ that relate likelihood to consequences, and use scoring techniques to aggregate across measures of probability and consequences.

The use of risk matrices, while useful in some settings, can also increase the potential for nasty surprises, and hence amplify risk.

There are five reasons for this concern. First, the matrices specifically downplay low probability and high consequence outcomes. The ‘red zones’, or the areas that score highest, are in the high probability and consequence categories. While this is fine in itself, it generally shifts attention away from the outcomes that are worthy of considerable attention.

Second, the matrices always result in ‘range compression’ (Cox 2008). In one category or box, for example, the designated probability range may run from, say, 0 to 20 per cent. The problem here is that for many especially high consequence outcomes, a change in the probability of a given outcome from 10 to 15 per cent can be crucial. But this distinction is simply buried in the given range or box being considered (the short code sketch that follows this list of concerns illustrates the point).

Third, the ranges themselves do not always map out in symmetric boxes (for example, 20 per cent blocks), and this can cause confusion over range intervals and what is being measured.

Fourth, a lack of a ‘common language’ often causes misunderstanding. Categories in the matrix designated as ‘catastrophic’ or ‘almost certain’ can mean very different things to those who do risk assessments.

Finally, risk matrices totally obscure problems with ‘false negatives’ and ‘false positives’ in security and risk measures, as discussed later, and can never account for ‘jumps’ in probability assessments or states of nature that are common with nasty surprises. This latter point is essential. Probability measures of potentially severe outcomes can not only change or ‘drift’ from one box in the matrix to another over time, but also take discrete jumps. Accounting for these jumps and mitigating against them is especially important for high consequence events.
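To make the range-compression concern concrete, the following minimal sketch in Python shows how two materially different probabilities of a severe outcome land in the same matrix cell. The bin edges and the 0 to 10 consequence scale are our own illustrative assumptions, not taken from any particular standard.

```python
# A minimal sketch of 'range compression' in a 5 x 5 risk matrix.
# The bin edges below are illustrative assumptions only.
import bisect

PROB_EDGES = [0.2, 0.4, 0.6, 0.8]  # probability bins in 20 per cent blocks
CONS_EDGES = [2, 4, 6, 8]          # consequence score bins (0-10 scale)

def matrix_cell(probability, consequence):
    """Map a (probability, consequence) pair to its risk matrix cell."""
    return (bisect.bisect(PROB_EDGES, probability),
            bisect.bisect(CONS_EDGES, consequence))

# A rise from 10 to 15 per cent in the likelihood of a severe outcome
# is material, yet both assessments land in the same cell, so the
# matrix cannot register the change.
print(matrix_cell(0.10, 9))  # -> (0, 4)
print(matrix_cell(0.15, 9))  # -> (0, 4)
```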

Risk matrices aside, it is worth stressing that current risk management standards and guidelines place a strong emphasis on the risks faced in achieving stated objectives—with these objectives treated as a given (in effect as an independent variable). While there is a useful emphasis on continuous improvement in risk management practices, there is not a strong emphasis on the ways in which the decisions made in setting objectives will influence the nature and extent of the risks faced.

This point can be summarised by inverting the ISO 31000:2009 definition of risk as: the effect of objectives on uncertainty. Clearly, the more ambitious the objectives, the greater the uncertainty faced in achieving these objectives. Engineers are familiar with the ways in which system complexity increases the likelihood of system failure due to complex interactions and cascading failure modes. This is why engineers prefer to simplify designs in order to reduce the likelihood of failures (and build in redundant/duplicated systems where failures are most likely).


This relationship between objectives, uncertainty and risk forms the basis of well-developed design, development and demonstration in engineering systems (most evident in NASA and U.S. Department of Defense structured program management methods for complex engineering systems). Those methods explicitly focus on attempting to balance the uncertainty (hence risk) associated with attempts to achieve technical objectives with the benefits of actually achieving those technical objectives, using a perspective laid out in Klein and Meckling (1958). This results in a system design tradeoff between aiming for the most technically demanding and potentially most useful objectives (‘stretch targets’) and the uncertainties and risks faced in attempting to meet those objectives.

In many circumstances (especially when major wars with technologically sophisticated adversaries are not happening), this tradeoff results in less ambitious objectives being set in order to reduce the likelihood of failure to acceptable levels. Scrutiny of this tradeoff between objectives and uncertainty/risk also results in efforts to develop more effective innovation pathways that minimise uncertainty and risk—notably in the readoption of NASA Apollo program-style incremental modular approaches to system evolution (known as rapid spiral development (RSD)).

RSD involves the deliberate prioritisation of modular systems designs that allow incremental advances and testing of discrete system components while holding other module designs constant. The best example is the way in which the Apollo space program tested discrete system components and their use in several successive missions—an explicit risk management technique. The phasing out of the space shuttle in preference for a return to the RSD approach, in the form of the Orion program currently being developed by NASA, reflects recognition that a modular system design allows functionality to evolve in each mission, and for new technologies and operational procedures to be tested and adopted far more easily and cheaply than when using a fixed system such as the non-modular space shuttle (which was forced to operate with very outdated computer systems because it was so costly to update subsystems within that rigid system design).

Consequently, the RSD methods have the potential to inform innovation and risk management in the public sector precisely because they reduce the risks faced at a given level of uncertainty over objectives.

It is also important to stress that while the national security community has key overarching objectives, such as reducing terrorist threats, the core capabilities (preparedness/readiness and rapid response to hard-to-predict acts) reflect objectives that are set by adversaries—not as part of ‘enterprise risk management’ frameworks oriented to well-defined, agreed and locked-in objectives. Arguably, this aspect of national security practice limits the utility of standards such as ISO 31000:2009.

4. Using a Simple Diagnostic Ratio Based on the Potential for Surprise

Writing shortly after the end of the Second World War, Claude Shannon distinguished between information and uncertainty (defined as entropy) and expressed the value of new information that might be received in terms of the assumed likelihood of an event happening and being observed. In that framework, which has been incredibly useful in information technology, the less likely an event is assumed to be, the greater the information gain if it is observed (Shannon 1948).1
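Shannon's measure of the information gained from observing an event that was assigned probability p is the 'surprisal', computed as the negative base-2 logarithm of p. A two-line sketch makes the inverse relationship between assumed likelihood and information gain explicit (the example probabilities are purely illustrative):

```python
import math

def surprisal(p):
    """Information gained, in bits, from observing an event that was
    assigned probability p (Shannon 1948)."""
    return -math.log2(p)

print(surprisal(0.5))   # 1.0 bit: a fair coin toss
print(surprisal(0.01))  # ~6.6 bits: a rare event is far more informative
```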

Shannon’s use of the concept of entropy in the Second World War had a very specific national security objective: the need to be able to calculate the minimum volume of information required to encrypt a message given the statistical frequencies with which different letters are expected to appear for linguistic reasons. Shannon entropy is therefore an expression of statistical uncertainty based only on available information (or noise) rather than being treated as a sample of additional but unobserved data (as in frequentist/sampling theory-based statistics).

1. See Pierce (1961) for a non-technical explanation of Shannon’s work.

The analytical value of Shannon entropy (as this approach is now known) lies in maintaining and using this distinction between information and uncertainty. In information theory, this distinction allows for tractable calculations of highly complex things, especially error identification and correction and signal-to-noise ratios, and has provided an analytical framework that has assisted a range of technological advances to be made (including noise and signal error correction in Wi-Fi).

In this context, Shannon’s principle that the less likely an event is assumed to be, the greater the information gain if it is observed (a surprise-based notion) can be framed explicitly in terms of the potential for surprise in a public policy context using the following ratio:

Risk amplification = Achieved potential for surprise / Unavoidable potential for surprise

This ratio provides a conceptually simple means of framing risk management more clearly in relation to the potential for nasty surprises—while also having the advantage of opening up an avenue for using measures of information entropy as part of the risk management toolkit.

The risk amplification ratio reflects the reality that governments face a range of complex and unavoidable factors that can surprise them, and that the real objective of risk management is to minimise this potential for nasty surprises (rather than simply comply with risk management standards and guidelines)—in particular by seeking to minimise missed opportunities to learn from practical experience in handling risks and failures to spot ‘weak signals’ of potential large and unexpected shifts in circumstances.
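One possible way to implement this ratio with entropy measures, offered here as an illustrative formalisation rather than a specification from the text, is to treat the unavoidable potential for surprise as the entropy of the true outcome frequencies and the achieved potential for surprise as the cross-entropy under the odds the organisation actually assumes. By Gibbs' inequality the ratio is then at least 1, and it equals 1 only when assumptions match reality:

```python
import math

def entropy(p):
    """Shannon entropy in bits: the unavoidable potential for surprise
    when outcomes genuinely occur with frequencies p."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    """Average surprise actually experienced when outcomes follow p but
    the organisation plans around assumed odds q."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

def risk_amplification(p, q):
    """>= 1 by Gibbs' inequality; equals 1 only when q matches p."""
    return cross_entropy(p, q) / entropy(p)

true_freq = [0.7, 0.2, 0.1]   # how outcomes actually occur (illustrative)
assumed = [0.9, 0.05, 0.05]   # the odds the organisation works with
print(round(risk_amplification(true_freq, assumed), 2))  # ~1.21
```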

This framework based on the potential for surprise (and implementable as entropy measures) provides a basis for using Bayesian tests of competing hypotheses as a risk management method. This is because it emphasises the utility of these hypothesis tests as a means of reducing the amplification of risk. Prudent risk management involves working with a range of hypotheses that, in combination, reduce the ratio of the achieved potential for surprise relative to the unavoidable potential for surprise. The result is, of course, exactly the sort of ‘bird’s-eye view’ of risk exposure that is so lacking in conventional risk register frameworks used in the public sector.

Significantly, risk is here defined in a manner aligned with the most recent incarnation of the international risk management standard ISO 31000:2009 as ‘uncertainty over objectives’—yet with the major advantage of fostering a constructive focus on the extent to which risk management practices and procedures actually reduce the potential for nasty surprises.

Consequently, this risk amplification ratio provides a clear high-level measure of both current capability challenges in governance and a basis for planning future improvements in capability and assessing the progress made. The simplicity of a bird’s-eye view metric of this type is important because one can easily get lost in the complexity and detail of real governance processes and procedures—approaches that seem to amplify rather than simplify complexity.

Governments should aim to minimise the extent to which the assumptions they hold over the range of likelihoods and consequences of wanted and unwanted events (that is, risks and opportunities) are distorted by failures to use all available information on those likelihoods and consequences.

Three examples of risk management failures can be used to illustrate how risk can be amplified by choices made over how to handle uncertainties and risks. First, as noted in the introduction, Australia’s recent Royal Commission into the implementation of the home insulation scheme highlighted the way in which a top-level political priority on ‘delivery’ resulted in risk management being sidelined and treated as a ‘speed bump’ to delivery, making it hard for officials to adopt a more prudent approach to risks (Hanger 2014). This was exacerbated by the prevalent tendency in government to treat risk management as a compliance ritual to be got out of the way as easily as possible, and where possible even outsourced to consultants for convenience. This ‘sidelined’ stance meant that prior information on relevant risks from the States and New Zealand was effectively ignored. The Royal Commission’s investigation revealed specific departmental capability shortcomings but also pointed to more general systemic limitations in risk management in the loosely federal Australian system of government: (i) risk treated as an inconvenience in delivering policy; (ii) a ‘tick box’ (do and forget) mentality rather than a basis for learning and adaptation; and (iii) poor flows of relevant information across administrative boundaries. In short, this tendency to treat risk management as a compliance ritual is itself an amplifier of risk.

Second, NASA’s Challenger space shuttle disaster in 1986 demonstrates how technical choices made over risk management methodologies can amplify risks. In the lead-up to the disaster, there were two alternative risk assessments for the space shuttle system (McGrayne 2011).2 Odds of a 1 in 100,000 risk of catastrophic system failure calculated by NASA differed markedly from the findings of a 1983 U.S. Air Force funded review by Teledyne Energy Systems that used Bayesian methods to put the risk of catastrophic system failure much higher (and dangerously so) at odds of 1 in 35. Teledyne had examined failure rates in similar solid fuel rocket systems (as used in Poseidon submarine missiles and Minuteman intercontinental ballistic missiles) and used these estimates as prior risk factors in the Bayesian analysis (with a prior of 32 confirmed failures out of 1,902 rocket launches), augmented by subjective probabilities based on lessons from real operating experience. Perhaps due to the Bayesian impact on cryptography in the Second World War, the Pentagon has always been more receptive to Bayesian concepts than NASA. Worryingly, NASA instructed the company it had hired to study shuttle risks to ignore such ‘prior’ data—even though the rockets were essentially identical. This was partly because NASA’s risk management methodology worked on system-specific technical engineering safety margins that sought to eliminate risk via system redundancy, rather than using probabilistic risk assessments that have stronger implications for operational decision-making able to respond to unusual events (such as cold weather)—a narrow ‘hard’ data-driven approach that, in fact, amplified risk.

2. This summary also draws on a range of internal NASA documents bearing on risk management shortcomings.

Third, there is the case of the Hoover Dam. This was designed on the basis of statistical data on rainfall and consequent river levels that later turned out to come from an unusually dry period. This leaves it at risk of collapse unless water is released to pre-empt a rapid rise in the dam level caused by the spring snow melt, in turn the combined consequence of rainfall and snow depositions earlier in the year and the rate at which the snow mass melts due to rising temperatures. As a result, the dam came close to failure in 1983 partly because this pre-emption was not carried out—a decision influenced by assumptions over expected deluge likelihoods (the likelihoods of actual water inflows threatening the dam are greater than the assumed likelihoods). These likelihoods are best updated and used to drive operational decision-making rather than sticking to the rule book (based on non-updated likelihoods). The Hoover Dam illustrates the way in which assumed statistical distributions used in design specifications, coupled with risk management treated solely as compliance (especially when used to define regulatory frameworks), can also amplify risk. The global financial crisis is another example of how regulatory stances based on assumed or unrepresentative statistical distributions can amplify risks in this way.

One key lesson from these examples of risk amplification is that the incidence of false positives and false negatives in the diagnostic tests used in risk management can be an important source of the amplification of risk. In each of the examples summarised above, there was a false negative conclusion to the test for system failure risk. In the case of the space shuttle, the false negative was caused by methodological restrictions imposed by NASA on itself that constrained risk assessment as regards prior data on failures in similar systems and avoided a probabilistic approach to risk management in preference for engineered safety margins (a methodological choice that impacted upon decision-making over risks). In the case of the Hoover Dam, design tolerances and operational guidelines that impact on risk were based on biased statistical data on rainfall patterns. In the case of the Australian home insulation scheme, false negatives arose because risk management was treated as a compliance ritual and an impediment to rapid program delivery (rather than being placed at the centre of the design and delivery of the intervention).

These false negatives amplified the potential for nasty surprises above unavoidable levels. In other words, risk was amplified because organisations made choices over risk management that increased the likelihood of false negatives in tests of potential system failure. In addition, as the discussion of the limitations of some forms of risk matrix has illustrated, this approach can also amplify risks by increasing the potential for nasty surprises.

Given the importance of this ‘amplification’ aspect of risk management, an analytical means of dealing with the challenge of reducing false negatives (and false positives) in tests relating to risk is provided in the following section. The key enabler of this approach is to frame risk management as binary (true or false) hypothesis tests and to break down complex sets of interrelationships into these binary links.

5. Using Bayesian Signal Processing and Machine Learning Concepts to Articulate the Potential for Surprise

Bayesian probability and the associated statistical methods differ in marked ways from the alternative (and more established) sampling theory approach (sometimes referred to as frequentist, classical or orthodox probability).3 In the sampling theory definition, probability is treated as the long-run relative frequency of the observed occurrence of an event. The sample set can be either a sequence of events through time or a set of identically prepared systems (Loredo 1990).

In contrast, Bayesians treat probability (in effect) as the relative plausibility of propositions when incomplete knowledge does not allow us to establish truth or falsehood in an absolute sense. This methodological distinction is useful because a Bayesian approach aligns better with learning processes (it is based on updating estimates of the odds that hypotheses are true whenever new data are obtained). Bayesian methods are also well suited to handling the potential for surprise for the same reason. Indeed, information theory itself is derived from Bayesian principles, hence Shannon’s emphasis on valuing information in inverse proportion to the estimated likelihood of receiving that information (the definition of surprise that defines the powerful concept of Shannon entropy).

However, a major problem is that, as Ferson comments, Bayesian approaches are rather like snowflakes in the sense that each one is unique (Ferson 2005). This heterogeneity makes it hard for non-specialists to adopt these methods in a public policy context. The resulting dilemma is that while Bayesian inference is, in principle, of great relevance to risk management in the public sector (and more general ‘risk-aware’ applications), the current reliance on complex and technically sophisticated bespoke applications greatly limits the ability of the public sector to use these methods in a day-to-day manner.

Our suggested response to this problem is to develop a simplified and standardised Bayesian expression of the familiar policy learning cycle specifically designed for use by non-specialists. This is a long-term project to be carried out, wherever possible, in partnership with public sector departments and agencies.

The following diagram explains the basic (and very simple) principle behind Bayesian analysis, namely that we are able to update the assumed odds of something happening when new information is obtained. The new information may either confirm that the initially assumed odds should be retained or may lead us to revise these odds.

3. Two useful introductory sources on Bayesian inference can be found in Jaynes (1984) and Fenton and Neil (2013).


For the purposes of relating Bayesian inference to the policy cycle, the simple equation expressing new odds as the product of the old odds and the analysis of new information is reframed as a circular learning process. This is illustrated in Figure 1. Estimated odds, via experience, generate new information that, when analysed, allows the estimated odds to be updated. This learning loop combines a real-world implementation phase (experiments in effect) with an analytical phase. Everything that government does is, in effect, in the implementation phase and consequently an experiment (either explicitly or implicitly). However, implementation/experimentation activities may not necessarily involve the new information being identified, collated and analysed. If the latter does not happen, then the odds cannot be updated, and in effect there has been a missed opportunity to learn (and manage risk in particular).

Figure 1 Linear and Circular Expressions of Bayesian Inference
Source: Matthews (forthcoming).
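In odds form, the updating step in this loop is a single multiplication. The sketch below is illustrative; the prior odds and likelihood ratio are invented numbers used only to show the mechanics:

```python
def update_odds(prior_odds, likelihood_ratio):
    """Bayes rule in odds form: posterior odds = prior odds x likelihood
    ratio, where the likelihood ratio is P(evidence | hypothesis true) /
    P(evidence | hypothesis false)."""
    return prior_odds * likelihood_ratio

# Illustrative numbers only: a hypothesis initially judged 1 to 4
# against, then supported by evidence three times more likely if the
# hypothesis is true than if it is false.
odds = update_odds(prior_odds=0.25, likelihood_ratio=3.0)
print(odds)               # 0.75, i.e. odds of 3 to 4
print(odds / (1 + odds))  # as a probability: ~0.43
```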

From this perspective, governments’ monitoring and evaluation (M&E) activities will have the greatest utility when the information obtained as a result of experience in implementing a policy intervention is related back to an initial (uncertainty and risk based) assumption of the odds of success assumed for the intervention. If M&E measures are not based on an explicit recognition of uncertainty and risk (that is, the odds of success are not made explicit), then it is unlikely that useful learning will be captured and used even if useful learning takes place. This is because uncertainty and risk are marginalised rather than centralised in the analysis.

Figure 2 contains the basic analytical taxonomy used in signal processing and machine learning; this is sometimes (usefully) referred to as a ‘confusion matrix’ by engineers because it draws out the ways in which binary test results can be wrong, and in combination contradictory, and hence cause confusion. In a machine learning context based on the use of algorithms, this confusion paralyses learning and adaptation. In a policy context, the impact on human judgement can be equally paralysing, or can lead to decisions being made that arbitrarily ignore this confusion. This can lock interventions into problematic developmental pathways if not corrected at later stages.

Figure 2 The ‘Confusion Matrix’ Used in Signal Processing and Machine Learning

Test result | Condition present: Yes | Condition present: No
Positive | (a) True positive rate (TP) | (b) False positive rate (FP)
Negative | (c) False negative rate (FN) | (d) True negative rate (TN)

As the confusion matrix highlights, we should prefer regulatory stances (if expressed as competing hypotheses with binary answers) that maximise the true positive rate and the true negative rate, but that also minimise the false positive and the false negative rates. Whenever there are false positive and false negative test results, the response of the regulatory framework is itself a risk to effective policy delivery (actions may be taken that are unnecessary, or actions that should be taken are not taken).
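The four rates in Figure 2 can be computed directly from raw cell counts; the counts in this minimal sketch are illustrative rather than drawn from the article:

```python
def confusion_rates(tp, fp, fn, tn):
    """The four rates of Figure 2, computed from raw cell counts."""
    return {
        "true positive rate": tp / (tp + fn),   # sensitivity
        "false negative rate": fn / (tp + fn),
        "false positive rate": fp / (fp + tn),
        "true negative rate": tn / (fp + tn),   # specificity
    }

# Illustrative counts for a binary hypothesis test.
print(confusion_rates(tp=45, fp=10, fn=5, tn=940))
```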



Figure 3 contains an illustration of the significance of test result errors in a clinical context (using data from Gigerenzer 2002). For many people, this ‘natural frequency’-based expression of the situation, which clearly communicates relative scale, is far easier to grasp than the standard Bayesian equation.

Figure 3 Does a Positive Test Result in Medicine Mean a Condition Is Actually Present?
Source: The authors using data from Gigerenzer (2002).

In presenting the data in this manner, it is clear that a positive test result (in this case for colorectal cancer) means that there is only a 4.8 per cent likelihood that a particular patient actually has the condition. This is simply because the 3 per cent false positive rate applied to the 9,970 in every 10,000 people who in statistical terms do not have the disease results in 300 cases of false positives relative to 15 true positives. Hence, for an individual patient, one must consider the implications of this ratio of 300 false positives against 15 true positives (the odds from which favour a particular test result being a false positive). This highlights the way in which the overall prevalence of a disease in the population, combined with the rates of true and false positives (and true and false negatives) in test results, generates this gap between a naïve interpretation of a particular test result and a more thoughtful and evidence-based interpretation.
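The arithmetic behind this example is easy to reproduce. In the sketch below, the prevalence (30 per 10,000) and the implied 50 per cent sensitivity are back-calculated from the counts quoted above (15 true positives, 9,970 people without the disease) rather than taken directly from Gigerenzer (2002):

```python
def positive_predictive_value(population, cases, sensitivity,
                              false_positive_rate):
    """Work through a test result using natural frequencies (counts of
    people) rather than conditional probabilities."""
    non_cases = population - cases
    true_positives = cases * sensitivity
    false_positives = non_cases * false_positive_rate
    return true_positives / (true_positives + false_positives)

# 30 cases per 10,000 and 50 per cent sensitivity reproduce the quoted
# counts: 15 true positives and ~300 false positives among the 9,970
# people without the disease.
ppv = positive_predictive_value(population=10_000, cases=30,
                                sensitivity=0.5, false_positive_rate=0.03)
print(f"{ppv:.1%}")  # ~4.8%
```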

Figure 4 contains the more conventional Bayesian expression of this same situation. Unless one is highly familiar with conditional probability (which many people working in government and in stakeholder organisations are not), this way of calculating test sensitivity is hard to grasp. This unnecessary complexity is created by avoiding the use of natural frequencies in preference for reliance on the mathematics of probability. As Figure 5 demonstrates, the use of natural frequencies makes Bayes rule far easier to understand—it is simply the sensitivity of the test relative to the rate of false positives given the prevalence of a condition.

Figure 6 contains an illustration of how this natural frequency approach can be used in a security risk context, in this case in detecting potential terrorist threats. This illustrates the way in which a very small (0.49 per cent) false positive rate in threat detections can result in a large number of cases treated as threats that are not threats. This diverts scarce resources to dealing with what are believed to be threats that are not in fact threats. These scarce resources would be more usefully directed at the dangerous incidence of false negative threat detections (threats that are not detected).

Figure 6 Application to Terrorist Epidemiology: Epidemiology = ‘Incidence, Distribution and Possibility for Control’
Source: The authors.
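The same natural frequency arithmetic shows the scale effect here. Only the 0.49 per cent false positive rate comes from the text; the population size, number of genuine threats and detection rate in this sketch are purely hypothetical:

```python
# Only the 0.49 per cent false positive rate comes from the text; the
# population, threat count and detection rate below are hypothetical.
population, genuine_threats = 1_000_000, 10
true_alerts = genuine_threats * 0.9                     # 9 detections
false_alerts = (population - genuine_threats) * 0.0049  # ~4,900 alerts
share_false = false_alerts / (true_alerts + false_alerts)
print(f"{share_false:.1%}")  # ~99.8% of alerts point at non-threats
```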

Figure 4 Conventional Conditional Probability Version of Bayes Rule

Figure 5 Natural Frequency Version of Bayes Rule
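The equation images for Figures 4 and 5 are not reproduced here. In standard notation, with C denoting the condition being present and + a positive test result, the conditional probability form and the natural frequency form they present are, respectively (the 15 and 300 echo the counts from Figure 3):

\[
P(C \mid +) = \frac{P(+ \mid C)\,P(C)}{P(+ \mid C)\,P(C) + P(+ \mid \neg C)\,P(\neg C)}
\]

\[
P(C \mid +) = \frac{\text{true positives}}{\text{true positives} + \text{false positives}} = \frac{15}{15 + 300} \approx 4.8\%
\]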


It is easy to see how this use of Bayesian signal processing and machine learning concepts provides a robust and intuitively straightforward basis for assessing aspects of the efficiency and the effectiveness of policy interventions. The approach makes clear where problems caused by test inaccuracies lie, highlights the implications for response decisions and provides a basis for measuring historical changes in diagnostic capability. Crucially, this approach places risk-related concerns centrally in the policy process.

This issue of diagnostic capability is formally expressed in signal processing and machine learning in the following manner (see Figure 7).

Figure 7 Measuring Diagnostic Capability Using Signal Processing Methods

For historical reasons, this is referred to as the receiver operating characteristic curve (an ROC curve in short). An ROC curve plots the false positive rate (on the X axis) against the true positive rate (on the Y axis) and was originally developed to assess the abilities of radar operators in the Second World War. Some versions, as presented here, also add the true negative and the false negative rates in order to provide a complete diagnostic profile. As such, ROC curves reflect the principles behind the use of randomised control trials (RCTs) in public policy—but in a more generally applicable framework (indeed ROC curves are used in medicine to assess the adequacy of RCT results). For a useful overview of the use of ROC curves in a range of contexts, see Swets et al. (2000).

The best possible performing hypothesis test lies in the top left-hand corner (a test that is 100 per cent sensitive and has a zero false positive rate). Random test results lie on the diagonal (for example, someone guessing the toss result of a coin would expect to eventually end up at the 0.5, 0.5 point in the middle of the diagonal). Test results that are worse than random lie below that diagonal (thus providing a particularly useful diagnostic).

In a public policy context, the potential to waste public funds increases the further that capabilities lie from the ideal diagnostic point in the top left-hand corner of the ROC space. Particular test capabilities can be represented as curves in this space: the further above the diagonal and the greater this curvature, the more reliable the hypothesis test is. Shifts in capability over time can be reflected as shifts in these curves.
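A short sketch shows how a test's error rates place it in ROC space relative to the chance diagonal. The counts reuse the clinical example above (50 per cent sensitivity, 3 per cent false positive rate); they are consistent with, but not quoted by, the text:

```python
def roc_point(tp, fp, fn, tn):
    """Place a diagnostic test in ROC space as (false positive rate,
    true positive rate)."""
    return fp / (fp + tn), tp / (tp + fn)

# Counts consistent with the clinical example above.
fpr, tpr = roc_point(tp=15, fp=300, fn=15, tn=9670)
print(fpr, tpr)   # ~0.03, 0.5
print(tpr > fpr)  # True: above the chance diagonal, better than random
```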

Finally, Figure 8 reinforces the potential utility of this diagnostic framework by indicating possible positions of organisational capability (strong, weak and harmful). In the latter case, test accuracy is worse than random in the sense that test results are negatively correlated with reality. This is not as rare an occurrence in risk management (and indeed public policy in general) as many would assume. It can be caused by cherry-picking ‘evidence’ to support political aims, weak analytical capacity and other shortcomings that can distort decision-making.

Figure 8 Characterising Diagnostic Capability Using Signal Processing Methods

6. Conclusions

This article has drawn attention to the potential that exists to use proven analytical methods widely used in signal processing and machine learning (that are derived from Bayesian inference) as a risk management tool for use in the public sector. The recommended approach, which encourages a ‘binary’ hypothesis testing-based approach to risk management, focuses attention on the incidence of diagnostic errors (false positives and false negatives) and their implications for risk management. The approach also highlights the ways in which the methodological choices made by an organisation can have the negative unintended consequence of amplifying risks and draws attention to a useful framework for assessing an organisation’s diagnostic capability regarding risk management.

July 2015. The authors would like to acknowledge support for the research upon which this article draws, provided by the Australian Centre for Biosecurity and Environmental Economics (ACBEE).

References

Berkowitz B (2003) The Big Difference between Intelligence and Evidence. Commentary in the Washington Post, republished by the RAND Corporation.

Cox LA (2008) What’s Wrong with Risk Matrices? Risk Analysis 28(2), 497–512.

Fenton N, Neil M (2013) Risk Assessment and Decision Analysis with Bayesian Networks. CRC Press, Boca Raton.

Ferson S (2005) Bayesian Methods in Risk Assessment. Paper prepared for Bureau de Recherches Géologiques et Minières (BRGM), France.

Gigerenzer G (2002) Reckoning with Risk: Learning to Live with Uncertainty. Penguin, London.

Hanger I (2014) Royal Commission into the Home Insulation Scheme. Attorney-General’s Department, Canberra.

Hood C (1991) A Public Management for All Seasons? Public Administration 69, 3–19.

Jaynes ET (1984) Bayesian Methods: General Background: An Introductory Tutorial. Paper presented at the Fourth Annual Workshop on Bayesian/Maximum Entropy Methods, Calgary, August 1984.

Klein B, Meckling W (1958) Application of Operations Research to Development Decisions. Operations Research 6(3), 352–63.

Loredo TJ (1990) From Laplace to Supernova SN 1987A: Bayesian Inference in Astrophysics. In: Fougere PF (ed) Maximum Entropy and Bayesian Methods, pp. 81–142. Kluwer Academic Publishers, The Netherlands.

Matthews M (2015) How Better Methods for Coping with Uncertainty and Ambiguity Can Strengthen Government–Civil Society Collaboration. In: Carey G, Landvogt K, Barraket J (eds) Designing and Implementing Public Policy: Cross-Sectoral Debates (Studies in Governance and Public Policy Series), pp. 75–89. Routledge, London.

Matthews M (forthcoming, 2016) Transformational Public Policy. Routledge, London.

McGrayne SB (2011) The Theory That Would Not Die. Yale University Press, New Haven.

Pierce JR (1961) Symbols, Signals and Noise: The Nature and Process of Communication. Harper & Brothers, New York.

Power M (2004) The Risk Management of Everything: Re-Thinking the Politics of Uncertainty. Demos, London.

Sabel C, Zeitlin J (2012) Experimentalist Governance. In: Levi-Faur D (ed) The Oxford Handbook of Governance, pp. 169–83. Oxford University Press, Oxford.

Shannon CE (1948) A Mathematical Theory of Communication. Bell System Technical Journal 27(3), 379–423.

Solesbury W (2001) Evidence Based Policy: Whence It Came and Where It’s Going. ESRC UK Centre for Evidence Based Policy and Practice Working Paper 1, Queen Mary College, University of London, London.

Stevens A (2011) Telling Policy Stories: An Ethnographic Study of the Use of Evidence in Policy-Making in the UK. Journal of Social Policy 40(2), 237–55.

Stirling A (2009) Risk, Uncertainty and Power. Seminar Magazine 597, 33–9.

Swets JA, Dawes RM, Monahan J (2000) Psychological Science Can Improve Diagnostic Decisions. Psychological Science in the Public Interest 1(1), 1–26.
