Resilience in Critical Infrastructures:

The Case of the Queensland Electricity Industry

By

Natalie Sinclair

Bachelor of Business (International Business - Distinction) /

Bachelor of Arts (International & Global Studies / Geography - Distinction)

School of Management - Faculty of Business

Queensland University of Technology

Brisbane – Queensland – Australia

Thesis Submitted for Master of Business (Research)

2009

Abstract

The reliability of Critical Infrastructure is considered to be a fundamental expectation

of modern societies. These large-scale socio-technical systems have always, due to

their complex nature, been faced with threats challenging their ongoing functioning.

However, increasing uncertainty in addition to the trend of infrastructure

fragmentation has made reliable service provision not only a key organisational goal,

but a major continuity challenge, especially given the highly interdependent network

conditions that exist both regionally and globally. The notion of resilience as an

adaptive capacity supporting infrastructure reliability under conditions of uncertainty

and change has emerged as a critical capacity for systems of infrastructure and the

organisations responsible for their reliable management.

This study explores infrastructure reliability through the lens of resilience from an

organisation and system perspective using two recognised resilience-enhancing

management practices, High Reliability Theory (HRT) and Business Continuity

Management (BCM), to better understand how this phenomenon manifests within a

partially fragmented (corporatised) critical infrastructure industry – The Queensland

Electricity Industry. The methodological approach involved a single case study

design (industry) with embedded sub-units of analysis (organisations), utilising in-

depth interviews and document analysis to elicit findings.

Derived from detailed assessment of BCM and Reliability-Enhancing characteristics,

findings suggest that the industry as a whole exhibits resilient functioning; however,

this was found to manifest at different levels across the industry and in different

combinations. Whilst there were distinct differences in respect to resilient

capabilities at the organisational level, differences were less marked at a systems

(industry) level, with many common understandings carried over from the pre-

corporatised operating environment. These Heritage Factors were central to

understanding the systems level cohesion noted in the work. The findings of this

study are intended to contribute to a body of knowledge encompassing resilience and

high reliability in critical infrastructure industries. The research also has value from a

practical perspective, as it suggests a range of opportunities to enhance resilient

functioning under increasingly interdependent, networked conditions.

Key Words:

Critical infrastructure, Resilience, Reliability, Electricity, Queensland Electricity Industry,

High Reliability Theory, Business Continuity Management, Risk and Crisis Management,

Institutional Fragmentation, Corporatisation, Government-Owned Corporations, Wildavsky,

Holling, Qualitative Research, Case Study Research, Interviews, Document Analysis,

Resilience-Enhancing Practices, Government Ownership, Inheritance, Sectoral Conditions,

Collective Commitment and Culture.

Table of Contents

Chapter 1: Introduction ..................................................................................... 1

Contextual Background ............................................................................... 1

The Research Setting - The Queensland Electricity Industry ..................... 4

Research Purpose and Significance............................................................. 5

Chapter 2: Literature Review ............................................................................ 7

Introduction ................................................................................................ 7

Historical Development of Risk and Crisis Research ................................. 8

The Concept of Risk ............................................................................ 8

Large-Scale Technical Systems: Growth & Prominence .................. 10

The Development of Sociological Risk & Crisis Management Research ............ 12

Theories of Reliability ....................................................................... 13

The Pessimists View: Normal Accident Theory ...................... 13

The Optimists View: High Reliability Theory ......................... 17

NAT versus HRT – The Great Debate ..................................... 22

Emergence of Varying Approaches to Crisis Preparedness .............. 23

Disaster Recovery Planning (DRP) – A Useful but Limited Approach ............ 24

Crisis Management – A Holistic Approach............................. 25

Resilience – A New Paradigm for Preparedness in Uncertain Times ............ 28

Business Continuity Management (BCM) – A Framework for Resilience ............ 32

Critical Infrastructure Protection (CIP) ..................................................... 43

Large-Scale Technical Systems, CIP and Criticality ......................... 43

CIP: Managing Vulnerability, Ensuring Reliability ............... 47

Limitations of Academic Research into CIP ........................... 48

A Revised Approach to CIP – From Protection to Resilience ............ 49

Challenges to Sustaining Reliability .................................................. 54

Interconnectivity of Critical Infrastructures ........................... 55

Institutional Fragmentation .................................................... 56

The Importance of Enhancing ‘Networked’ Reliability .......... 59

The Need for a Resilience-Based Approach to CIP ................ 61

Bringing It All Together ............................................................................ 63

Chapter 3: Theoretical Framework ................................................................ 64

Introduction ............................................................................................... 64

Theoretical Basis of This Study ................................................................ 65

The Notion of Resilience ................................................................... 65

Summary of Limitations in Existing CIP Research .................................. 67

An Innately Technical Approach ....................................................... 68

Defining a Resilience-based Approach to CIP .................................. 70

The Application of Existing Resilience-Enabling Management Strategies ............ 72

A Shift to Systems Research ............................................................. 73

Summary of Literature Gaps ............................................................. 74

Implications for Research .................................................................. 76

Research Context ...................................................................................... 76

Critical Infrastructure – The Queensland Electricity Industry ........... 76

Resilience as a Construct ................................................................... 78

Research Problem ..................................................................................... 81

Research Questions............................................................................ 82

Research Outcomes Sought ............................................................... 83

Chapter 4: Methodology ................................................................................. 84

Introduction ............................................................................................... 84

Justification of the Scientific Paradigm .................................................... 84

Appropriateness of Realism Paradigm ..................................... 86

Methodological Choice ............................................................................. 88

The Case Study Approach ........................................................ 89

The Case Study Design............................................................ 95

Data Collection Strategies ...................................................... 101

In-Depth Interviews ............................................................... 102

Document Analysis ................................................................ 104

Analysis of Case Study Data .................................................. 106

Trustworthiness of Qualitative Case Study Research ............. 108

Credibility ............................................................................. 109

Transferability ....................................................................... 110

Dependability ........................................................................ 110

Confirmability ....................................................................... 111

Limitations .............................................................................. 111

Ethics ...................................................................................... 112

Conclusion ............................................................................................... 113

Chapter 5: Results ........................................................................................... 114

Introduction ............................................................................................. 114

Organisation (“Wildavsky-ian”) Resilience ............................................ 115

Within Case Analysis ............................................................. 116

Organisation A ...................................................................... 116

Organisation B ...................................................................... 121

Organisation C ...................................................................... 127

Organisation D ..................................................................... 133

Organisation E ...................................................................... 139

Organisation F ...................................................................... 145

Embedded Cross-Case Findings ............................................. 152

Business Continuity Management ......................................... 155

High Reliability Theory ......................................................... 165

Summary ................................................................................ 170

Whole-of-System Resilience (“Holling-ian”) ......................................... 171

Industry Structure and Governance Characteristics ............... 171

The Role of the Government as Shareholder ........................ 173

Attitude and Ethos Characteristics.......................................... 177

Industry Commitment and Culture of Reliability .................. 177

Conclusion ............................................................................................... 184

Chapter 6: Discussion and Conclusions ........................................................ 186

“Wildavsky-ian”: Organisational Level (Research Question 1) .............. 187

BCM and High-Reliability Practices ...................................... 188

“Holling-ian”: Industry Level (Research Question 2) ............................ 190

Collective Commitment and Culture ...................................... 191

The Critical Nature of Operations ........................................ 191

Government Ownership ........................................................ 192

Sectoral Differences................................................................ 194

Nature of Operating Environment: Market-Based vs. Non-Competitive ............ 194

Nature of Infrastructure ........................................................ 195

Summary of Findings – Implications for the Research Problem ............ 196

Resilience-Enhancing Characteristics .................................... 198

Potential Threats and Opportunities for Improvement to Resilient Functioning ............ 200

Implications for Theory and Future Research Directions ....... 203

Research Limitations .............................................................. 208

Conclusion ............................................................................................... 209

Appendices ........................................................................................................ 210

Reference List ................................................................................................... 223

List of Tables

Table 2.1: Systematic Classification of Risk Perspectives

Table 2.2: Characteristics of Complex vs. Linear Systems

Table 2.3: Characteristics of Tightly Coupled vs. Loosely Coupled Systems

Table 2.4: Reliability-Enhancing Characteristics of HROs

Table 2.5: Crisis Management and BCM Shared Assumptions

Table 2.6: Factors Contributing to the Development of a BCM Capability

Table 3.1: Varying Conceptualisations of Resilience

Table 3.2: Overview of Literature Gaps

Table 3.3: Resilience-Enhancing Characteristics

Table 4.1: Categories of Scientific Paradigms and their Philosophical Assumptions

Table 4.2: Firms in Queensland’s Electricity Industry

Table 4.3: Sampled Firms and Criteria

Table 4.4: Interview Schedule (Number of Sessions)

Table 4.5: Relevant Corporate Documentation Utilised

Table 5.1: BCM Capability - Organisation (A)

Table 5.2: HRT Capability - Organisation (A)

Table 5.3: BCM Capability - Organisation (B)

Table 5.4: HRT Capability - Organisation (B)

Table 5.5: BCM Capability - Organisation (C)

Table 5.6: HRT Capability - Organisation (C)

Table 5.7: BCM Capability - Organisation (D)

Table 5.8: HRT Capability - Organisation (D)

Table 5.9: Reliability Performance - Organisation (D)

Table 5.10: BCM Capability - Organisation (E)

Table 5.11: HRT Capability - Organisation (E)

Table 5.12: Reliability Performance - Organisation (E)

Table 5.13: BCM Capability - Organisation (F)

Table 5.14: HRT Capability - Organisation (F)

Table 5.15: Summary of BCM Capability Ratings for Organisations

Table 5.16: Organisation Rank for BCM Capability Ratings

Table 5.17: Summary of HRT Capability Ratings for Organisations

Table 5.18: Organisation Rank for HRT Capability Ratings

Table 5.19: A Selection of Collaboration Measures

List of Figures

Figure 2.1: Perrow’s System Characteristics – Complexity and Coupling

Figure 2.2: The Crisis Management Cycle

Figure 2.3: Evolution of BCM Approach

Figure 3.1: Examples of Electric Power Infrastructure Dependencies

Figure 3.2: Overview of the Structure of the Queensland Electricity Industry

Figure 3.3: Conceptual Framework for Characterising Different Approaches to Resilience

Figure 3.4: Factors/Processes Contributing to Resilient Outcomes

Figure 4.1: Appropriate Methodologies by Paradigm

Figure 4.2: Position of Case Study Research

Figure 5.1: Composite Benchmark of Performance – Weighted Average Organisation (F)

Figure 5.2: Combined HRT & BCM Capability Rating

Figure 5.3: Overall Resilient Capabilities

Figure 5.4: Industry Collaborative Relationships

Figure 5.5: System Attitude and Ethos Characteristics

Figure 6.1: Summary of Key Findings

Figure 6.2: Key Relationships Between Emergent Themes

Figure 6.3: Strength of Presence & Emergent Relationships Between Resilience-Enhancing Characteristics

Figure 6.4: A Balanced Approach to Reliability

List of Common Acronyms

AEMO – Australian Energy Market Operator

AER – Australian Energy Regulator

BIA – Business Impact Assessment

BCI – Business Continuity Institute

BCM – Business Continuity Management

BCP – Business Continuity Plan

CIP – Critical Infrastructure Protection

CIIP – Critical Information Infrastructure Protection

CEO – Chief Executive Officer

CPRS – Carbon Pollution Reduction Scheme

CRN – Comprehensive Risk Analysis and Management Network

DRP – Disaster Recovery Planning

ERP – Emergency Response Planning

GFC – Global Financial Crisis

GOC – Government-Owned Corporation

HRO – High Reliability Organisation

HRT – High Reliability Theory

IS – Information System

IT – Information Technology

KPIs – Key Performance Indicators

MAO – Maximum Acceptable Outage

MSS – Minimum Service Standards

NAT – Normal Accident Theory

NEMMCO – National Electricity Market Management Company

NFPA – National Fire Protection Association

RTO – Recovery Time Objectives

SAIDI – System Average Interruption Duration Index

SAIFI – System Average Interruption Frequency Index

UK – United Kingdom

US – United States

Statement of Original Authorship

I hereby declare that the work contained in this thesis has not previously been

submitted to meet the requirements for an award at this or any other higher education

institution. To the best of my knowledge and belief, the thesis contains no material

previously published or written by another person except where due reference is

made.

Natalie Sinclair

20th November 2009

Acknowledgements

This certainly has been a journey and there are many people who have been central

to its completion and ensuring that I made it to the finish line. First and foremost the

biggest thank you goes to my supervisor Dr Paul Barnes - without your continued

support, guidance, patience and wisdom this dissertation would not have been

possible. I would also like to sincerely thank the organisations and wonderful

informants who devoted their time and insights to this study. Your valuable

contributions made this dissertation possible.

Another big thank you goes to my dear partner Michael – your support has been

fundamental in allowing me to undertake this journey, particularly for putting up

with all of the delays, my lack of income, and also for the piles of paper that have

cluttered our study. From the bottom of my heart – thank you. To my family – I wish

you were closer and could have shared this journey with me but I am thankful for

your blessing, your support, and most of all your love. To my best friend Kristina –

thank you for lending me your ear, for your words of encouragement, and for

sticking by me even when I was a ‘hermit’ friend MIA. I look forward to celebrating

with you – and a bottle of Verve. To Gavin – you have truly been a massive help

and a dear friend. I have valued our coffee chats and sincerely appreciate your

assistance and time proof-reading my thesis. And finally, to my little Cooper – your

funny antics and unconditional love have kept me smiling.

I would also like to thank Dr Fran Finn and Mr Col McCowan for the very important

role you have played in getting me to this point in my life. Your support and words

of wisdom will never be forgotten. Also, a very special thank you to Professor Lisa

Bradley, who willingly came on board during the revision period to lend her

expertise in this document’s review. Your time and advice are greatly appreciated.

Last but not least, a big thank you to all of the amazing people in Z701 (past and

present) that have been there to share this journey with me, and the Research Support

Staff who have made it possible for all of us.

Chapter 1: Introduction

Critical infrastructures are the contemporary large-scale technical systems which

provide the vital services that support our complex modern societies. Their continued

reliability is paramount; however, the escalating complexity and vulnerability of

these systems has been demonstrated in recent years, evidenced by significant

structural changes, a rise in terrorist attacks, and in some instances, notable failures.

The challenge of managing and protecting these infrastructures has emerged as a

critical issue for consideration, with research necessary to better understand and

prepare these complex systems and the organisations operating within them to

respond flexibly to uncertainty and business not-as-usual. Resilience, as a flexible,

adaptive capacity, is now considered a key capability for organisations responsible

for the delivery of reliable service provision so as to ensure their continued

availability in the face of adversity. This Chapter introduces the research undertaken

in this investigation, providing a discussion on the contextual background of the

research issue and setting. It will also identify the research problem and supporting

research questions, and conclude with a discussion regarding the significance of the

research investigation.

Contextual Background

Critical infrastructures are the lifeblood of contemporary society; central to

continued economic growth, prosperity and indeed social stability. Their importance

has been further augmented as the world continues to globalise and with the

emergence of a global information economy that is increasingly dependent on

technology. Because society is accustomed to their 24/7 ‘always-on’ availability, the reliability of these

contemporary industrial systems is thus paramount. Recent failures and attacks

however have demonstrated their escalating vulnerability and hence society’s

vulnerability to an ever growing spectrum of threats. Critical infrastructure protection

is now considered a matter of national security (Holmgren 2007). This is combined

with an institutional shift that has seen many infrastructures that were previously the

exclusive purview of governments and delivered as centralised essential services

now increasingly delivered under unbundled, networked conditions. Thus the

task of managing infrastructures for reliability has become even more difficult, with

the industries and organisations operating within them needing to be more flexible to

respond to uncertain and complex operating conditions.

Under conditions of complexity and uncertainty, resilience as a flexible, adaptive

capacity has emerged as a critical consideration for the organisations responsible for

the reliable management of critical infrastructures. This is fundamentally different to

traditional considerations of the concept of resilience, as it has been applied

historically from an engineering perspective in the context of critical infrastructures

to maintain efficiency and a constant operating state. Such a conceptualisation of

resilience is indeed important from a reliability perspective, but it does not bode well

under conditions of uncertainty and increased complexity. Under such conditions,

other conceptualisations of the term, drawn from the Social Science and Ecological

disciplines, treat resilience as an adaptive capacity to deal with uncertainty and

change. These are increasingly considered to hold cross-disciplinary weight, and

indeed relevance to ensuring the protection of these lifeline systems as a

complementary capacity to traditional engineering approaches.

The fundamental problem this research seeks to examine is: how do

networked critical infrastructure systems operating in an increasingly institutionally

fragmented environment foster resilient capabilities to ensure the reliable provision

of essential services? In particular the work seeks to explore and better understand

how resilience, from a business management perspective, manifests within the

organisations responsible for the reliable provision of essential services and more

broadly, what industry-wide conditions exist to ensure resilient functioning across

the network of organisations. To achieve this end, the research will draw upon two

analytical lenses of resilience that will provide the conceptual framework for this

investigation. Firstly, this research examines resilience from an organisational

perspective, drawing on the conceptualisation of resilience by organisational scientist

Wildavsky (1988), which for the purpose of this investigation, will be referred to as

“Wildavsky-ian” resilience. The second conceptualisation based on the work of

ecological scientist Holling (1973), examines resilience from a systems perspective

and will be referred to as “Holling-ian” resilience.

Although there is little consensus as to how organisations may achieve greater

resilience in the face of increasing threats (McManus 2008), this investigation looks

towards the sociological risk literature which has evolved largely from the study of

industrial crises and failures in large-scale technical systems. This large body of

work holds considerable value in the study of critical infrastructures to enhance their

protection and indeed reliable functioning from a business management perspective.

Two approaches have emerged as viable resilience-enhancing practices from the

broader body of literature aimed at dealing with risk and business not-as-usual. The

first is High Reliability Theory (HRT) and its associated Reliability-Enhancing

characteristics. The second, Business Continuity Management (BCM), is an evolution

of earlier approaches to Crisis Management and a resilience-enhancing

management practice aimed at preparing organisations for disturbances. BCM

provides organisations with the capacity to cope with unanticipated changes and

continue functioning under stress. When used together effectively, BCM and HRT

can be posited to enhance resilient functioning, and thus offer significant value to

assist with the challenge of ensuring the reliable management of organisations

responsible for the delivery of essential services. In addition to these organisational

level practices, it is also useful to explore whether similar approaches or any other

characteristics may be contributing to resilient functioning at the industry-wide level.

To explore the research problem and elicit a deeper understanding of this complex

phenomenon (resilience) as it occurs in fragmented critical infrastructure systems, a

qualitative case study design has been utilised. This involved the utilisation of a

single case with embedded units of analysis to explore both conceptualisations of

resilience; within individual organisations (“Wildavsky-ian”), and also the networked

conditions that exist within the broader infrastructure system (“Holling-ian”). To do

this, a variety of data collection methods were employed including one-on-one and

small group interviews, as well as an analysis of relevant industry and organisational

documentation. The following section will detail the selected case: the Queensland

Electricity Industry.

The Research Setting - The Queensland Electricity Industry

An examination of critical infrastructure is central to this research and the

Queensland Electricity Industry provides the context for the investigation. In this

Australian State, which is experiencing considerable population and economic

growth, and similarly has a large business and industrial base that is highly

dependent on electricity supply, the reliable functioning of this industry is of great

strategic importance now and into the future. This industry has also experienced

significant institutional change in recent years, transitioning from a relatively

centralised essential service provided by government, to a corporatised environment

with six independent Government-Owned Corporations (GOCs) now responsible for

electricity service provision across three interconnected sectors along the distributed

electricity supply chain.

Furthermore, whilst the Transmission and Distribution functions remain

Government-owned monopolies, the Generation sector is now operating in a

regulated competitive marketplace with the introduction of privately owned

Generators. Accordingly, this changed institutional environment provides an

interesting backdrop to the research investigation, exploring the reliability challenges

under corporatised rather than fully privatised environments that have been the focus

of previous research (e.g. de Bruijne 2006).

Although it has not experienced the same degree of privatisation or scale of

disruptions to service as noted in places such as California, the restructured

Queensland Electricity Industry is characterised by significant uncertainty and has

experienced its own share of problems under the recently changed operating

conditions. In the wake of a number of electricity service disruptions, an independent

panel was commissioned to investigate concerns about the performance of the State’s

Distribution assets. The resultant Electricity Distribution and Service Delivery

(EDSD) Report, more commonly known as the Somerville Report and released in

2003, highlighted problems affecting reliability and made recommendations to

improve reliability of service provision (State of Queensland 2004).

The report served as a catalyst for change in the industry, as the Queensland

Government and organisations have since been actively engaged in implementing the

recommendations of this report. In light of these recommendations and the

subsequent improvements, the industry provides an interesting case study to explore

how resilience, from a business management perspective, manifests in the context of

a corporatised critical infrastructure system.

Research Purpose and Significance

The reliability of critical infrastructures has emerged as a fundamental concern for

modern societies, which in an age of technology are dependent on their ‘always-on’

availability. Similarly, in the wake of events such as the September 11 terrorist

attacks, the protection of critical infrastructure has become a matter of national

security in many nations. This work addresses key components of the national

research priorities set out by the Australian Federal Government in respect to the

protection of national critical infrastructure to ensure their continued resilient

functioning. In particular, the contribution of Electricity Industries to the economy

and social well-being of many Western countries, including Australia, makes the

prevention of crises within them, or the minimisation of their impacts if they occur,

an issue of considerable strategic importance.

Accordingly, there is an identified need to enhance the management practices of

those organisations responsible for the reliable provision of essential services to

ensure their resilient functioning. Given the changed institutional environment and

greater uncertainty, research is needed to better understand how these systems and

the networks of organisations that operate within them can continue to be managed

reliably through the development of resilience-enhancing capabilities. The aim of

this research is to provide a description of gaps and opportunities for enhancing

resilient functioning in critical infrastructure from an organisational and systemic

perspective.

To do this, the dissertation will first provide a review of the relevant bodies of literature.

This will be followed by a presentation of the literature gaps this research seeks to

address and the conceptual frame underpinning the investigation in Chapter 3.

Chapter 4 will detail the methodology employed to address the research problem and

associated research questions. This will provide the foundation for the presentation

of results in Chapter 5. Chapter 6 will conclude with a discussion of key themes

emerging from the research investigation, provide conclusions regarding the two

research questions, and detail a number of directions for future research.

Chapter 2: Literature Review

Introduction

Systems of infrastructure were fundamental in the development of industrial societies

and their importance has been further augmented with the advent of the technological

age. These large-scale technical systems today underpin the economic and social

stability of modern societies, where 24/7 reliable service provision is paramount (de

Bruijne 2006; Egan 2007). Although once the exclusive purview of governments

with whom responsibility for their reliable operation traditionally resided,

increasingly these vital services such as electricity, water and telecommunications

are being delivered in unbundled, fragmented supply chains characterised by

networks of independent organisations as a result of the global trend towards their

privatisation and deregulation. Under such conditions the achievement of reliability

is said to be compromised, with instances of unreliability highlighted most explicitly

in the failures of the fully privatised California Electricity Industry (de Bruijne

2006). This, combined with a more complex and uncertain operating environment in

the wake of events such as September 11, places greater pressure on the reliable

provision of service in these lifeline systems. As such, the importance of fostering

resilience to ensure their continued reliability cannot be emphasised enough.

This research explores, in practice, how electricity supply chains can be more

resilient. It examines an electricity industry that is neither fully privatised, as in the instance

of California, nor one that has experienced such widespread loss of functionality, but

instead a corporatised industry that is both complex and services a large industrial

base that is a significant user of electricity. The basis of this study is to explore how

this new corporatised structure can be operated and function in a more resilient

manner. The key conceptualisation of resilience as applied in this work is the

capacity of a system or organisation to continue functioning when put under pressure

or impacted by some significant disturbance.

Resilience, as a phenomenon of highly effective systems such as infrastructures

providing essential services, has evolved from the study of industrial failures and

more recently commercial failures. This form of resilience will be approached by

applying two recognised practices said to contribute to resilience, in the form of

Normal Accident Theory (NAT) and High Reliability Theory (HRT), in addition to

Business Continuity Management (BCM) in the broader context of how Risk and

Crisis Management literature has evolved.

Accordingly, this Chapter will examine two dominant literature streams within the

sociological Risk and Crisis Management literature. The first will detail the evolution

of theories of reliability in NAT and HRT. Although often pitted against one another

in the literature, these theories are not incompatible but rather take a different view

on how complex organisations can be made more reliable with HRT said to

contribute to resilience in organisations. The second will explore the evolution of the

various approaches to crisis preparedness. This will focus on how they have evolved

from traditional reactive, technical approaches, towards contemporary, socio-

technical approaches such as BCM, which take a more flexible approach preparing

organisations to deal with uncertainty and contribute to building resilient capabilities.

This Chapter will conclude by contextualising this in the milieu of Critical

Infrastructure, whereby these lifeline systems are operating in an increasingly

uncertain environment in addition to the added challenge of delivering high

reliability under networked conditions. Thus, BCM and HRT, as resilience-enhancing

practices, provide a useful basis for examining how the resilience of critical infrastructures may be

augmented under conditions of complexity and uncertainty to ensure the reliability of

service provision.

Historical Development of Risk and Crisis Research

The Concept of Risk

The concepts of danger and hazard can be dated back to the ancient Sumerians,

associated with the notion of involuntariness (i.e. unforeseen events) as life was

viewed as being in the hands of the Gods. In comparison, ‘risk’ is a more recent

concept thought to date back to around 2000 B.C., with the antonyms of safety and

risk emerging in various literatures from across the world including Chinese,

Indian and Greek, as well as the continuing lexicology of Egypt and the ancient

civilisations of the Middle East including Babylon. These concepts developed with

overtones of voluntariness (i.e. foreseeable events) which could be accepted or

avoided according to a human choice, in contrast to the parallel ideas of hazard and

danger which continued to be associated with ‘acts of god’ (Ingles 1991). Despite

this rich history in ancient texts, the concepts of risk and indeed safety are a more

recent addition to the English language, although their origins are largely unclear.

Ingles (1991) suggests that risk most likely has its origins around the 14th Century

from the Greek word riskos, later emerging in Spanish, Arabic and German with

all meanings gradually emphasising a sense of voluntary human control as a

dominant feature, with connotations to money and wealth.

Although its origins are somewhat unclear, the concept of risk has since emerged as a

fundamental issue in modern society, as it lies at all levels of human and business

activity. The inherent uncertainty and possibility of negative effects associated with

the term, combined with the human desire to reduce these conditions, has seen it rise

to the top of the public policy agenda and become a topic of immense interest across

many academic disciplines including finance and engineering, and later

psychology and management. Consequently, the academic literature provides an

array of classifications of this broad concept, and depending on one’s academic

discipline, it can have a vastly different definition and meaning (Renn 1992;

Shrivastava 1995). Recognising the diversity amongst classifications in the literature

and a need to distinguish between them, Renn (1992) developed a transdisciplinary

taxonomy of risk perspectives in what is commonly referred to as a ‘systematic

classification of risk perspectives’. In his taxonomy, Renn (1992) identified seven

approaches to the conception and assessment of risk that are largely grounded in the

various academic disciplines.

Table 2.1: Systematic Classification of Risk Perspectives

Spectrum                       Category                        Perspective
Risk as a Physical Attribute   Technical Risk Analyses         Actuarial
                                                               Toxicological & Epidemiological
                                                               Engineering
                               Social Science Risk Analyses    Economic
                                                               Psychological
                                                               Sociological
Risk as a Social Construct                                     Cultural

Adapted from Renn (1992)

Whilst perhaps not as well researched as the technical risk analyses, the sociological

perspective has become an important aspect of organisational research. This has

shifted the focus of higher level issues in organisations towards the study of risk and

uncertainty, in recognition of an increasingly uncertain operating environment, and

the growing prevalence of large-scale technical systems (Renn 1992; Shrivastava

1995).

Large-Scale Technical Systems: Growth & Prominence

‘Human societies have always been faced with risks and hazards’ (Quarantelli,

Lagadec and Boin 2007, 17), but as they have evolved, new threats have emerged.

Since the Industrial Revolution, the development of modern technologies has

allowed for the evolution of large and complex technical systems – the spatially

extended and functionally integrated socio-technical networks (Manion and Evan

2002). These systems have created vast efficiencies that have allowed for significant

labour and lifestyle changes, and are today the cornerstone of industrialised

countries. While such systems have created many benefits for the socio-economic

systems they support, they have also produced new challenges by increasing

the likelihood of failure and crisis across organisational contexts (Shrivastava 1995;

Egan 2007).

In line with the rapid economic development experienced in the latter half of the

twentieth century, these high-risk systems¹ have expanded in number and now

dominate the post-industrial social landscape of developed and developing

economies. As technology continues to evolve at a remarkable rate, the complexity

of modern society compounds and the potential risks for individuals and

organisations multiply rendering them more susceptible to accidents that result from

unforeseen consequences and misunderstood interventions. Such complex

organisational conditions are not only increasing the risks and uncertainty for system

operators, but also for society at large (Shrivastava 1987; Weick 1987; Roberts 1990;

Shrivastava 1994a; Mannarelli, Roberts and Bea 1996; Perrow 1999a; Weick and

Sutcliffe 2001; Weick and Sutcliffe 2007). Social theorists now consider risk and

uncertainty as a central plank of analyses of the modern world (Shrivastava 1995).

1 High-risk organisations or systems can be defined as those operating technologies sufficiently

complex to be subject to catastrophic accidents (Roberts and Rousseau 1989). This description is

supported by Perrow (1999a) who suggests that these complex socio-technical systems including

nuclear power plants, space missions, and aircraft and air traffic control have catastrophic potential,

and therefore require extraordinary attention to avoid major errors.

As suggested by Beck (1992) we are living in a ‘risk society’: a consequence of post-

industrial modernisation where the spectrum of threats faced continues to broaden.

For the first time in history, human-induced crises have the potential to rival natural

disasters in both scope and magnitude (Pearson and Mitroff 1993; Shrivastava 1995;

Stead and Smallman 1999; Amin 2002; International Risk Governance Council

2006). The growth of these complex, large-scale and geographically dispersed

technical systems has corresponded with a marked increase in the frequency and

magnitude of organisational failures since the 1970s. This alarming trend is

highlighted by major incidents that have occurred in complex socio-technical

systems including the Bhopal crisis, the Chernobyl and Three Mile Island nuclear

accidents, the NASA Challenger Space Shuttle tragedy, the Exxon Valdez oil spill, the

Hillsborough stadium disaster, the Kegworth air disaster, and Johnson and Johnson’s Tylenol

poisonings. These tragic events sparked a wave of organisational research in the field

of Risk and Crisis Management (Tenner 1997).

Despite such events forcing us to question our confidence in the reliability of large-

scale technical systems, our dependence has not wavered and their complexity

continues to increase as technology is pushed beyond its limits, increasing the

potential for catastrophic crises and fostering ongoing support for sociological risk

and crisis research examining the safety and reliability of these complex systems

(Mitroff 1988; Smith 1990; Richardson 1994; Shrivastava 1994a; Shrivastava 1994b;

Kovoor-Misra, Zammuto and Mitroff 2000; Weick and Sutcliffe 2001; Manion and

Evan 2002; Perrow 2007).

The Development of Sociological Risk & Crisis Management Research

According to Shrivastava (1994a, 12), ‘the realisation of the high-crisis proneness of

post-industrial risk societies created a need for new research on crises’. The concept

of crisis was re-introduced and formalised in the field of management in the 1960s,

but it was not until the 1970s and 1980s amidst the increasing number of major

natural and socio-technical disasters, that sociological research in disaster studies

became a major research theme. Major contributions were made to the literature

during this period which saw research in this growing social science discipline move

beyond the traditional work in the natural disaster research paradigm, which was

popularised and well researched by the likes of Quarantelli (1970), Dynes (1970a;

1970b) and Drabek (1970), to take an organisational-centric approach. This was

popularised by the seminal work of Turner (1976), who was the first to argue that

organisations themselves could incubate the potential for crisis.

Once it was established that the safety and reliability of complex systems were not

based solely on physical characteristics or external events, but were also contingent on

the people and processes operating them, sociological studies of risk examining this

dark side of management have evolved to encompass many sub-fields and attract

many disciplines. Early theory in this realm developed through the study of industrial

disasters and socio-technical failures, and a perceptibly coherent theory of the nature

and structure of socio-technical crises has evolved (Pauchant and Douville 1993;

Sagan 1993; Shrivastava 1994a; de Bruijne 2006).

The significant socio-technical crises which occurred during this period including

Seveso (1976), Bhopal (1984), Three Mile Island (1979), and Chernobyl (1986),

provided fertile ground for this new wave of research, raising many questions about

the safety and reliability of increasingly complex, high-risk organisations: Why

have such tragedies occurred? Are such accidents inevitable in complex

technological systems? What can be done to prevent these accidents from occurring

(Smith 1990; Pauchant and Douville 1993; Shrivastava 1994a)? Such questions

ultimately led to the development of two competing schools of thought examining

the issue of reliability and safety in organisations – the inherently more optimistic

High Reliability Theory (HRT), and the more pessimistic view of Normal Accident

Theory (NAT), both of which will now be discussed.

Theories of Reliability

HRT and NAT emerged as competing sociological schools of thought in the 1980s

and have since developed into a significant body of scholarly literature. With

intellectual roots in different traditions within the organisational theory literature

(Sagan 1993), they have presented very different but equally valuable views on the

safety and reliability of complex, high-risk organisations. Both have contributed

significantly to organisational Risk and Crisis literature, with regard to developing a

better understanding about the reliable management and operation of complex, high-

risk systems. The first, NAT, takes an inherently more pessimistic view, arguing that

accidents are inevitable in organisations operating high-risk, hazardous technologies.

On the other hand, HRT takes a more optimistic approach, with proponents arguing

that high levels of safety and reliability are possible despite operating high-risk

technologies through appropriate organisational design and management techniques.

The following section will explore the two theories.

The Pessimists’ View: Normal Accident Theory

Since the mid-1980s many researchers have raised concern about the industrialised

world’s growing reliance on complex technological systems to manage increasingly

hazardous operations reliably (Egan 2007). Contributions to this commentary include

Tenner (1997) whose work demonstrated that an increasing reliance on large-scale

technical systems has increased society’s vulnerability, while Perrow (1984; 1999a)

and Sagan (1993) both published detailed works about the increased potential for

failure in large and complex organisations. Both Perrow and Sagan suggest that these

systems will eventually fail because they are growing increasingly complex and

difficult for their human operators to manage.

Spurred by concern that our ‘complex systems threaten to bring us down’

unexpectedly, unpredictably, unintentionally but perfectly normally, Yale-based

organisational sociologist Perrow (1984, vii) in his seminal work, ‘Normal Accidents

– Living with High Risk Technologies’, describes numerous failures of what he

refers to as tightly coupled, complex systems – organisational causes of

technological disasters. The basic thesis of NAT holds that accidents are inevitable in

interactively complex², tightly coupled³ technological systems, such as nuclear

power plants (Rijpma 1997; Jerimer 2004; Marias, Dulac and Leveson 2004; Barnes

2005).

It is argued that complex systems are vulnerable to failures in reliability because of

their complexity and tight coupling, with these system characteristics interacting to

pose risks and influence overall system reliability (Roe, Schulman, van Eeten and de

Bruijne 2005; de Bruijne 2006; Wolf and Sampson 2007). Where systems are both

interactively complex and tightly coupled, they can be considered vulnerable and

accidents ‘normal’ as there is an increased potential for system accidents that cannot

be foreseen or prevented (Barnes 2005). The following tables (2.2 and 2.3) highlight

key differences Perrow (1984) identified between complex and linear systems, and

those that are tightly coupled and loosely coupled.

Table 2.2: Characteristics of Complex vs. Linear Systems

Complex Systems                   Linear Systems
Components closely packed         Components spatially segregated
Non-varying sequences             Sequence order can be changed
Interconnected sub-systems        Segregated sub-systems
Many feedback loops               Few feedback loops
Multiple/interacting controls     Segregated controls
Indirect Information              Direct Information

Table 2.3: Characteristics of Tightly Coupled vs. Loosely Coupled Systems

Tight Coupling                                 Loose Coupling
Processing delays not possible                 Processing delays are possible
Non-varying sequences                          Sequence order can be changed
Single methods used                            Multiple methods are available
Little slack possible in supplies,             Slack in resources possible
personnel or equipment
Buffers & redundancies are deliberate          Buffers & redundancies available &
& designed in                                  are applied as needed

Source: (Perrow 1984; Barnes 2005)

2 Interactive Complexity refers to the ability of system parts to interact in unanticipated, non-linear

ways creating uncertainty.

3 Tight Coupling refers to systems lacking spatial, temporal, or other patterns of buffering among

components.

As evident in Figure 2.1, both characteristics have been used as dimensions to plot

large-scale technical systems in order to represent their vulnerability to system

accidents (de Bruijne 2006).

Figure 2.1: Perrow’s System Characteristics – Complexity and Coupling

Coupling \ Complexity    Linear Interactions    Complex Interactions
Tight Coupling           I: Dam                 II: Nuclear Power Plant
Loose Coupling           III: Post Office       IV: University

Arrow label: Increasing Vulnerability to System Accidents

Source: (de Bruijne 2006)

The premise of NAT holds that some systems are sufficiently complex to allow

unexpected interactions of individual failures in such a way that safety systems are

defeated. The theory further suggests that some systems are tightly coupled enough

to allow for cascading failures – a chain of events causing failures that increase in

scale and quickly ripple through large-scale, complex systems that are significant

enough to bring the whole system down and cause widespread disruptions to service

(Perrow 1984; Marias et al. 2004). Using Three Mile Island as his case in point,

Perrow (1984) illustrates how multiple failures, each small and insignificant on its

own, can cause major accidents when they occur in unanticipated sequences over

time. He suggests that large-scale system accidents are the result of simultaneous and

interactive failure among various system components including equipment,

procedures, operators, supplies and materials, environment, and design (Perrow

1984).

Increased System Capacity, Increased Risk – Vulnerability & Complexity

A further claim by NAT is that the size and scope of a system also influences the

propensity for failure. This makes large systems more vulnerable to unavoidable

accidents, with any system increases resulting in more incomprehensible or

unexpected interactions. It is argued that the bigger and more complex the physical

system and the organisations that run them, the greater the propensity for

simultaneous failures resulting in disaster. This is because the complexity of the

system, with its tight coupling and massive scale, simply extends beyond operators’

capabilities to anticipate and understand sequences of events and how to effectively

react once an incident occurs (Sagan 1993; de Bruijne 2006).

This is an important consideration for large-scale technical systems, such as critical

infrastructure, as technological advances have allowed these organisations to

significantly expand their operational capacity to cater for increased societal

demands. Critical infrastructure systems are becoming increasingly complex to

enhance the speed of delivery and efficiency of operations, but in doing so they are

inadvertently increasing the fragility and vulnerability of the systems and the services

they provide (Zimmerman 2001; Boin, Lagadec, Michel-Kerjan and Overdijk 2003).

Perrow (1999b) has argued that in the search for speed, volume, efficiency and the

ability to operate in hostile environments, system designs that can provide reliability

and security are being neglected, reducing operational reliability. While providing

many of the aforementioned benefits, an unfortunate characteristic of these modern,

increasingly complex and tightly coupled systems is that they will predictably fail,

but in unpredictable ways (Perrow 1999a; Little 2002). Supporting this view, Sagan

(1993) later extended NAT, pointing out that complex organisations produce

accidents because official safety goals are rendered obsolete by production pressures

and parochial interests. That is, effectiveness and system safety are often being

compromised for efficiency, an alarming trend in contemporary systems that was

highlighted by the Bhopal tragedy (Shrivastava 1987; Shrivastava 1994b; Rijpma

1997; Jarman 2001). This is a startling trend, as safety scientist Rasmussen (1997)

contends that the goal of reliability conflicts with efficiency.

While rather pessimistic in its approach, Perrow’s work has significantly shaped

subsequent research in the field of risk and crisis management.

This is evidenced by the fact that his work has been extended by and

continues to influence many other prominent authors from diverse backgrounds

including Scott Sagan, Karl Weick, Paul Shrivastava, Thierry Pauchant, Ian Mitroff,

Denis Smith and Patrick Lagadec, and importantly, the authors of the high reliability

school of thought, including Karlene Roberts and Todd La Porte (Shrivastava

1994a). Moving away from research examining why failures such as Bhopal

(Shrivastava 1987) and the Three Mile Island incident (Perrow 1984) had occurred,

authors from the high reliability school saw that there was a clear need to extend

management theory. They sought to complement this with an understanding of how

high-risk organisations can instead employ strategies (e.g. purposeful design and

management practices) to enhance the reliability of their operations to avoid, or at

least reduce, the impact of normal accidents (Roberts 1990; Rochlin 1993).

The Optimists’ View: High Reliability Theory

In response to discussions surrounding the potential for failure in increasingly

complex, high-risk systems, a multidisciplinary group of researchers further

advanced the literature with the study of what they termed High Reliability

Organisations⁴ (HROs) – organisations being successfully operated despite their

high-risk nature. Thus, the proponents of HRT take an inherently more optimistic

view of reliability⁵ in high-risk organisations than their NAT counterparts, arguing

that such organisations can either accept the inevitable and wait for these normal

accidents to occur, or they can take proactive measures to help avoid disastrous

accidents by building a capacity for resilience (Roberts and Rousseau 1989; Roberts

and Bea 2001).

4 HROs can be identified by asking the question: ‘How many times could the organization have failed

resulting in catastrophic consequences and did not?’ If the answer given is many thousands of times,

the organisation can be considered highly reliable (Roberts 1990).

5 Reliability, in the context here, is based on the work of Roberts (1990) and others who consider it to be

an attribute of organisations that have organisational capacities that limit errors and failure, and in turn

consistently deliver on stated objectives.

According to Frederickson and La Porte (2002), this research was initiated to explain

how high-risk organisations achieve nearly error-free operations over long periods of

time, despite operating systems that display NAT’s characteristics of tight coupling

and interactive complexity; a phenomenon which could not be explained by existing

literature on organisational reliability. Since the late 1980s, this body of literature has

explored the conditions present in a range of high-risk organisations already

operating reliably and performing at an extraordinary level of safety and productive

capacity. Over the last two decades this growing body of work has tracked

organisational responses developed to overcome the limits of complexity and tight

coupling associated with high-risk operations. It is now widely acknowledged that

such organisations can avoid cascading failures and catastrophic errors, and instead

maintain a high level of service reliability through intelligent design and appropriate

management practices (La Porte 1996; Weick and Sutcliffe 2007).

The development of a theory of high reliability is based on many years of direct

observation of error-intolerant systems by a multidisciplinary group of scholars from

the University of California, Berkeley (Frederickson and La Porte 2002). This group

of researchers including La Porte, Rochlin, Roberts, Weick, and Consolini have been

involved in making observations of, and theorising about this sub-set of

organisations which operate large, high-risk technical systems that are potentially

hazardous to society, but are able to maintain safe and reliable operations (La Porte

and Consolini 1998; Marias et al. 2004).

Initial empirical research into HROs was conducted in three high-risk organisations

in the United States (US). These included a large scale electric power generation and

distribution system (Pacific Gas and Electric Company), an air traffic control system

(the Federal Aviation Administration’s Air Traffic Control), and two nuclear-powered

aircraft carriers (the US Navy) (Roberts 1990; Sagan 1993). Subsequent studies have

been extended to a variety of other high-risk organisations including emergency

medical treatment teams, NASA, hostage negotiation teams, and wildland

firefighting teams, all reinforcing this theory of high reliability (Weick and Sutcliffe

2001). All of the organisations studied share characteristics that differentiate them

from normal organisations due to the high-risk nature of their operations. For

example, all are expected to perform at high tempo for sustained periods of time and

must have the ability to do so repeatedly. Furthermore, while efficiency remains a

concern in all cases, reliability is the primary objective (Roberts, Rousseau and La

Porte 1994), a point that is supported by Weick and Sutcliffe (2007) who contend

that the diverse organisations studied share a single demand – they have no choice

but to function reliably, because if reliability is compromised, severe damage results.

With major operational errors in these organisations likely to produce catastrophic

consequences, HROs take on the dual goals of sustaining delivery at maximum

capacity and operating nearly error free. HROs’ central day-to-day preoccupation is

therefore to operate complex and demanding technologies without failures, whilst

maintaining the capacity for meeting intermittent periods of very high peak

production; for example, peak demand loads for electricity (La Porte 1996; La Porte

and Consolini 1998). If an organisation performs hazardous operations repeatedly

without incident it can be considered to be highly reliable (La Porte and Consolini

1998). Rochlin (1993) contends that what distinguishes HROs is not their absolute

error or accident rate, but rather their effective management of inherently risky

technologies so that errors do not disable them. But understanding that no system is

perfect, HROs in their quest to deal with the unexpected, display a commitment to

resilience, accomplished through organisational and managerial mechanisms (Weick

and Sutcliffe 2001; Kendra and Wachtendorf 2003; Schulman, Roe, van Eeten and

de Bruijne 2004; Boin and Smith 2006); in what Sagan (1993, 14) describes as

‘intelligent organizational design and management’.

After extensive research, it was found that such organisations rarely fail even though

they frequently encounter unexpected problems which they face due to their complex

technologies, varied constituencies, and because employees have an incomplete

understanding of the complex systems they are operating (Weick and Sutcliffe 2007).

The organisations studied have actively managed to avoid failures in an environment

plagued with the potential for error (Rochlin 1993). The research found that such

organisations are so effective at managing their complex operations that the

probability of serious error is very low, and by any measure the safety and reliability

of these systems can be considered remarkable (La Porte and Consolini 1998;

Frederickson and La Porte 2002). It was through this initial research and subsequent

studies that researchers have been able to identify common characteristics and

techniques to improve the reliability of high-risk organisations (Roberts and Bea

2001).

Reliability-Enhancing Characteristics

The extensive empirical case study research has led high-reliability researchers to

conclude that the design of organisations and the way they are managed influence

their ability to mitigate negative effects of complexity and tight coupling, what are

referred to as ‘reliability-enhancing characteristics’ (Perrow 1999a; de Bruijne 2006).

According to Lagadec (1993), an organisation’s ability to deal with crisis is largely

dependent on the structures that have been developed before a problem emerges.

Accordingly, HROs have distinguishable structural characteristics that enhance the

reliability and resilience of their operations. To achieve reliability under all

conditions, HROs not only have unique structural features, but they are also said to

think and act differently (Weick and Sutcliffe 2001; 2007).

According to de Bruijne (2006), the authors of HRT acknowledge that there is no

checklist to follow to ensure fail-safe solutions to counter the risk of service

disruption caused by interactive complexity and tight coupling. They have however

established an impressive set of conditions that are said to be reliability-enhancing.

They see these conditions as not so much the source of highly reliable performance,

but the result of more subtle and unexplained dynamics that enable organisations to

achieve and maintain continuously high levels of reliability (de Bruijne 2006). There

is however little agreement amongst HRT researchers on the actual number of

reliability-enhancing characteristics.

Many authors have used a different combination of characteristics in their research.

For example, Weick and Sutcliffe (2001) present five variables, whilst Sagan (1993)

suggests that four critical causal factors have been identified [6]. De Bruijne (2006) extensively reviewed the existing HRT literature and the available lists of reliability-enhancing characteristics, and developed a summarised list of what he considers to be the main characteristics and conditions of HROs. For the purpose of this research these characteristics are further divided into two areas, Process and Design Characteristics and Goal and Commitment Characteristics, as set out in the following table.

[6] The four critical causal factors identified by Sagan (1993) from the HRT literature are: the prioritisation of safety and reliability as a goal by political elites and the organisation’s leadership; high levels of redundancy in personnel and technical safety measures; the development of a high reliability culture in decentralised and continually practiced operations; and sophisticated forms of trial and error organisational learning.

Table 2.4: Reliability-Enhancing Characteristics of HROs

Process and Design Characteristics

1. Technical Performance
   Sustained high technical performance by highly trained staff and exceptional equipment

2. Structural Flexibility & Redundancy
   System designs increasing organisational slack to enhance reliability and resilience to the unexpected

3. Responsibility and Accountability
   High degrees of accountability and autonomy amongst staff with a no-blame approach to encourage timely error discovery and reporting

4. Decision Making & Patterns of Hierarchy
   Flexible decision making processes & collegial patterns of hierarchy to encourage quick and flexible decision making at the level required for immediate action supported by a team based environment

5. Training and Learning
   Continual search for system improvement with organisational learning through performance and incident reviews, and regular staff training

Goal and Commitment Characteristics

6. Importance of Reliability
   Reliability not marginalisable, not fungible (cannot be traded off for another commodity such as money). Every possible effort is made to maintain high levels of reliability

7. Organisational Culture of Reliability
   Instituting mindfulness: everyone understands and is working towards the same goal of reliability – integrates familiar norms of mission accomplishment and production efficiency with those of a strong safety culture

8. Commitment to Reliability
   Commitment to reliable operations in missions and goals and ingrained in organisational culture

9. External Oversight
   Strong presence of external groups with access to credible and timely information maintains focus on the goal of reliable service provision

(Adapted from de Bruijne 2006)


NAT versus HRT – The Great Debate

Based on the preceding discussion it is evident that HRT takes an inherently more

optimistic approach to the reliable management of large-scale complex systems than

NAT, with characteristics such as those outlined in Table 2.4 contributing to safe and

reliable operations. Since the emergence of the theories in the 1980s, a great debate

has evolved between the two contrasting views on the ability of organisations to

manage tightly coupled and complex technologies reliably. Given that both theories

address the issue of reliable management of technology in high-risk organisations yet

come to seemingly different conclusions, NAT and HRT have frequently been set

against each other in the literature (Sagan 1993; La Porte 1994; Rijpma 1997; Rijpma

2003). This however has proven to be a rather unproductive debate that has failed to

advance the literature and has ultimately resulted in a stalemate (Jarman 2001;

Rijpma 2003; de Bruijne 2006).

Proponents of NAT tend to view the theories as being quintessentially opposite,

proposing many arguments against HRT. One such argument posits reliability as

having to compete with more benign values such as efficiency and profitability. In

this regard, NAT distinguishes powerful organisational pressures, including those

related to production, profit, growth, prestige and departmental power struggles, as

forces that will affect concerns for reliability in organisations (Sagan 1993; Perrow

1994). Alternatively, HRT assumes that systems might display certain conditions that

could lead to a higher level of reliability than might be theoretically expected based

on NAT. It does however acknowledge doubts about whether organisations are

capable of continuously meeting these conditions over longer periods of time and

thus shares common concerns with NAT (Rochlin 1993; La Porte 1994).

Thus, proponents of HRT contend that rather than viewing the theories as

contradictory and posited against one another, they should instead be seen as

complementary because they simply provide different explanations as to why

complex technologies fail or not, and how these complex technologies are managed

(La Porte 1994). It is increasingly being recognised in the literature that a fruitful

approach to research in the management of hazardous technologies would be one that

merges the two approaches, with some promising avenues for research already

attempting to integrate both theories in their work (Rijpma 1997; de Bruijne 2006).


Despite the ongoing debate between the two schools of thought about the viability of

strategies to reduce internally induced failures, both NAT and HRT have contributed

significantly to the literature on sociological approaches to risk. Although there is

recognition that despite the best intentions it is impossible to eliminate all errors or

suppress every threat, reliability is an outcome that is still worth striving for. It can

therefore be argued that HRT remains a promising strategy for securing the reliable

management and operation of high-risk organisations by reducing the possibility of

internally induced failures. However, with an awareness that organisations will

always be faced with threats with the potential to escalate into crisis situations as

evidenced by the significant socio-technical crises that occurred in the 1970s and

1980s, another significant area of research has evolved examining how organisations

manage risks and prepare for crises.

Emergence of Varying Approaches to Crisis Preparedness

Like reliability theories, research into organisational crises evolved in the wake of

socio-technical crises that occurred in the latter half of the Twentieth Century as

academics sought to answer questions about how organisations can manage risks and

prepare to effectively respond to events should they occur. Although the

development of organisational crisis research as a discipline evolved significantly

between the 1960s and 1980s, evidenced by the large number of articles published

during this period, its evolution has been highly speculative and much of it has

lacked empirical testing (Shrivastava 1994a). This is supported by authors such as

Boin (2004, 167) who describe the risk and crisis management literature as ‘ill-defined,

resembling a hodge-podge quilt of specialist academics that are scattered

over many disciplines’, including Information Technology (IT), Psychology, and the

Organisational Sciences.

A number of researchers have recognised the need for further empirical exploration,

arguing that a great deal of the existing literature is fraught with limitations,

fragmentation, and lack of rigour (Shrivastava 1994a; Pearson and Clair 1998). Well

acknowledged issues hampering the development of a cohesive school of thought

have been the infancy of the research field, the lack of empirical rigour, the use of

diverse research methodologies, and the inadequate integration of research from


multidisciplinary fields which has also led to the contentious use of definitions

(Pearson and Clair 1998). This fragmentation has kept research on organisational

crises at the periphery of management theory, and led to different applications and

interpretations of key concepts (Smith 1990; Pauchant and Douville 1993;

Shrivastava 1994a; Pearson and Clair 1998). As a result of this fragmentation, a

number of distinct approaches for preparing and responding to organisational crises

emerged in the literature. Among these are Disaster Recovery Planning (DRP), Crisis

Management, and Business Continuity Management (BCM), which will now be

discussed in turn.

Disaster Recovery Planning – A Useful but Limited Approach

Research into DRP emerged in the 1980s and 1990s as a result of the increasing

dependency on complex IT and Information System (IS) applications within

organisations. While DRP has provided a valuable function, its limitations have

prevented it from becoming a comprehensive strategy for safeguarding organisations

as this approach is characteristically focused on natural and IT based hazards, giving

rise to a number of inherent problems for dealing with threats and potential business

interruptions (Elliott, Swartz and Herbane 2002; Herbane, Elliott and Swartz 2004).

Firstly, DRP is seen to be a reactive approach to organisational recovery due to its

focus on natural disasters, where the cause of the incident is perceived to be beyond

the organisation’s control. Further, the focus on IT or IS disruptions yields a very

narrow, function-centric approach to the identification of and preparation for crises.

These characteristics limit the capacity of the organisation to consider non-technical

internal vulnerabilities, and also its ability to effectively respond to external threats

other than natural disasters. It also results in organisations focusing on recovery,

while potential improvements that could be made to prepare for and mitigate the

occurrence of incidents that can threaten operations, are instead neglected (Herbane

et al. 2004). Thus, the limitations of DRP are evident and have seen it fall somewhat

by the wayside in both research and practice to the broader, more holistic Crisis

Management approach (Elliott et al. 2002).


Crisis Management – A Holistic Approach

In contrast to DRP’s reactive approach, Crisis Management recognises the potential

for crises to manifest in organisations, and is therefore focused on identifying ways

to prevent crises from occurring and to manage effectively those

that do transpire (Pearson and Clair 1998; Herbane 2004). Accordingly, this

approach has developed as a more holistic, organisation-wide focus, recognising both

the social and technical characteristics, as well as the systemic nature of business

interruptions. Crisis Management is strategic in its approach in that it seeks to

support the development of organisational capabilities to manage interruptions before

they escalate into crisis situations, and there are emerging arguments that link the

need for Crisis Management with Strategic Management as crisis events ultimately

threaten the strategic goals of organisations (Preble 1997; Herbane 2004; Pollard and

Hotho 2006).

This approach is grounded in a process that has been well defined in the literature,

referred to as the ‘crisis cycle’ (Shrivastava et al. 1988; Smith 1990; Pearson and

Mitroff 1993; Pearson and Clair 1998; Stead and Smallman 1999) that involves a

pre-crisis, incident (trigger), crisis containment, post-crisis consequences and

recovery, and organisational learning phase for effective understanding of the

structure of crisis events. This cycle is depicted in the following figure:

Figure 2.2: The Crisis Management Cycle

[Figure: cycle diagram with the labels Pre-Crisis, Trigger, Crisis, Post Crisis, Consequences and Learning]

Source: (Stead and Smallman 1999)


The pre-crisis phase involves proactive prevention and preparation efforts. This

includes measures to avert or prevent crisis events by detecting and treating early

warning signals before they materialise into a crisis (i.e. risk assessment and risk

treatment). Recognising that all crises cannot be prevented, it also involves

preparation for the management of anticipated crises by way of crisis planning, also

referred to as contingency planning. Following a triggering event, the crisis phase

involves a response to minimise the consequences by isolating and containing the

crisis event. This will involve the initialisation and execution of plans developed

during the pre-crisis phase with the measures to assist recovery and return to

normalcy post-crisis. This should then be followed by a period of organisational

learning to identify where improvements need to be made (Mitroff, Shrivastava and

Udwadia 1987; Pearson, Kovoor-Misra, Clair and Mitroff 1997).

There is strong recognition within the Crisis Management literature that not only the

organisational strategies and design but also the organisational culture and

managerial perceptions are of critical importance in terms of an organisation’s

perceived ability to effectively prevent or respond to a crisis event (Mitroff et al.

1989; Pauchant and Douville 1993). The recognition that internal, non-technical

characteristics of organisations also contribute to crisis events has enabled the Crisis

Management literature to establish common characteristics found in ‘crisis prone’

organisations or ‘failing firms’, as well as ‘crisis prepared’ organisations. Such

characteristics are used to judge an organisation’s ability to cope with a crisis event

or their ‘crisis profile’ (Pauchant and Mitroff 1988; Mitroff et al. 1989; Pauchant and

Douville 1993; Pearson and Mitroff 1993; Smith and Sipika 1993).

Crisis preparedness or proneness is certainly amongst the most extensively theorised

aspects in the realm of organisational crises (Pauchant and Douville 1993; Pearson

and Clair 1998) and business failures (Shrivastava 1988). Such factors (i.e.

preparedness and proneness) have been derived and empirically validated in the

literature through extensive case studies examining organisational failures including

Bhopal (Shrivastava et al. 1988; Mitroff 1994), Tylenol (Pearson and Mitroff 1993),

the Challenger disaster (Shrivastava et al. 1988; Starbuck and Milliken 1988;

Pearson and Mitroff 1993) and the Barings Bank collapse (Sheaffer, Richardson and

Rosenblatt 1998). While Crisis Management has received much attention in the


literature, and provided many useful insights and tools for preparing and responding

to incidents, it has also been criticised for being too focused on rational planning

processes for prevention and recovery of specific breakdowns at the expense of

preparing for unexpected events (Smith 1990; Boin et al. 2003; Boin 2004; Lagadec

2007).

Increasingly, questions are being raised regarding the ongoing effectiveness of

conventional approaches to Crisis Management amidst an increasingly uncertain

operating environment. For example, Boin et al. (2003) contend that expectations

about capacities to deal with such ‘surprises’ may be too optimistic for many of

today’s complex organisations, with the general focus on anticipation strategies in

Crisis Management leaving organisations unable to adequately cope with uncertainty

and unexpected events. Similarly, Lagadec (2005, 1) suggests that one precondition

has been taken for granted in the Crisis Management literature – that triggering

events are ‘generally identifiable, and occurring in relatively stable and delineated

contexts’. Such authors see Crisis Management as having long been approached in terms of

finding or generating certainties for emerging uncertainty. It is argued however that

during breakdowns, the products of rational management become the modifiers of

cascading crises, with the text-book techniques and tools of Crisis Management often

not working anymore (Boin et al. 2003). This is further highlighted by Weick and

Sutcliffe (2001) and Seville (2009) who suggest that restricting attention to what is

expected in plans and procedures weakens the ability to respond to the unexpected.

In the midst of mounting uncertainty and complexity there is a recognised need to

move beyond the conventional approach, towards new trends in crisis research and

practice, where concepts such as inconceivability and ‘out of the box thinking’ are

now key terms (Boin et al. 2003; Lagadec and Rosenthal 2003; Boin 2004; Lagadec

2007) because the ways to manage risks have changed (Starr, Newfrock and Delurey

2003). To remain useful, there is recognition in both literature and practice of the

need for Crisis Management to move past simplistic anticipatory planning as an end

in itself, and embrace new approaches to organisational preparedness that balance

anticipation strategies with those that emphasise resilience (Perrow 1999a; Boin and

Lagadec 2000; Boin et al. 2003; Auerswald et al. 2005).


Resilience – A New Paradigm for Preparedness in Uncertain Times

With greater recognition of the need to manage unexpected events, the concept of

resilience has taken on new meaning and importance in the organisational risk

literature (Starr et al. 2003), with management theorists increasingly identifying the

need for resilience (Hamel and Valikangas 2003). Although much of the

organisational theory literature has tended to focus on the negative consequences of

crises such as threat rigidity (Staw, Sandelands and Dutton 1981), there has been

increased attention on resilience in organisations – a capacity that enables

organisations to maintain functionality in the midst of crisis (Hoffer-Gittel et al.

2008). According to McManus (2008, i) however, the term resilience has been used

across a range of disciplines with a myriad of conceptual approaches emerging in the

literature and thus resulting in ‘little consensus regarding what resilience is, what it

means for organisations and, more importantly, how they may achieve greater

resilience in the face of increasing threats.’

Conceptual approaches to resilience, relevant to the area of this study, were first

developed through the study of ecology in examinations of social-ecological systems

by Holling (1973). Holling’s approach to resilience was founded on the tendency of

systems to absorb the impact of disturbance and maintain persistence of function by

moving through multiple stability domains for survival, in what he termed ‘adaptive

capacity’ (Holling 1973). Various conceptualisations of resilience have since been

developed and applied across a range of disciplines from ecology and engineering to

organisational science. Whilst most definitions involve some idea of adapting to and

bouncing back from disruption (Kendra and Wachtendorf 2003), the traditional

engineering view considers resilience as maintaining the efficiency of function near a

stable equilibrium (Holling 1996; Folke 2006).

Holling’s conceptualisation of resilience moved away from the stable equilibrium

view, ‘shifting emphasis from the equilibrium states to the conditions for persistence’

(Holling 1973, 2), suggesting that ecological systems can be defined by two distinct

properties: resilience and stability. According to Holling (1973, 17), ‘resilience

determines the persistence of relationships within a system, and is a measure of the

ability of these systems to absorb changes of state variables... and still persist’. On

the other hand, stability ‘is the ability of a system to return to an equilibrium state

after a temporary disturbance. The more rapidly it returns, and with the least

fluctuation, the more stable it is’ (Holling 1973, 17). Thus, Holling argued that a

system can fluctuate greatly but still be resilient, as evidenced by his studies of forest

communities where conditions of low stability were found to invoke high resilience.

Holling (1973) posits that instead of aiming for a precise capacity to handle a

predicted scenario as suggested by traditional engineering views of resilience, system

stability can also be achieved through a qualitative capacity to absorb and persist

through unexpected changes.

Although originally applied to ecological systems, this idea has since been extended

to human systems (Dolan and Walker 2003), and has been instrumental in natural

hazard research (see for example Handmer and Dovers 1993), and increasingly

organisational research (see for example Dalziell and McManus 2004; Dekker and

Hollnagel 2006). Wildavsky (1988) was an early commentator on organisational

resilience who, building on the seminal work of Holling, contrasted anticipation

(stability) with resilience as a means for dealing with unexpected events in his book

‘Searching for Safety’. For Wildavsky (1988, 220), anticipation meant a careful

assessment of vulnerability, with prudent action taken to limit obvious danger by

‘sinking resources into specific defences against particular anticipated risks’ while

resilience meant a ‘flexible’ response to actual danger, demonstrating an ability to

cope with unexpected threats after they have become manifest by providing the

capability to ‘bounce back’.

Thus, Wildavsky (1988, 220) saw resilience as a dynamic capacity of organisational

adaptability that grows and develops over time resulting from the processes and

dynamics that retain ‘resources in a form sufficiently flexible – storable, convertible,

malleable – to cope with whatever harms might emerge’. Where conditions are

stable, risks are highly predictable and verifiable, and remedies are relatively safe,

Wildavsky (1988, 221) suggests that ‘anticipation makes sense’. However, when

they are uncertain and speculative, and remedies may do harm, ‘resilience makes

more sense because we cannot know which possible risks (risk sources) will actually

become manifest’ (Wildavsky 1988, 221).


Wildavsky (1988) proposed creating a balance between anticipation and resilience as

a strategy for reducing risk under conditions of uncertainty, relying on a combination

of systematic actions to reduce known risk and the capacity to act quickly in the

event of uncertain danger (Vogus and Sutcliffe 2007). According to Weick and

Sutcliffe (2001) and La Porte (2005), this ability to balance anticipation and

resilience to effectively deal with the unexpected is a capacity that has been achieved

by HROs (discussed earlier in this Chapter). HROs are said to enact a number of

processes to improve their capabilities to anticipate and become aware of the

unexpected earlier, so people can act before problems become severe. Aware of the

limitations of foresight and anticipation, they also enact processes that enable them to

contain and bounce back from problems mindfully through a commitment to

resilience (Weick and Sutcliffe 2001) with resilience achieved through organisational

and managerial mechanisms (Rochlin 1993; Boin and Smith 2006). Thus, it has been

argued that Wildavsky’s work on safety in organisations has been central to the high

reliability school of thought (Sagan 1993) and can equally be viewed as an early

bridge between organisational research and ecological conceptualisations of

resilience.

According to Fiksel (2006, 16), organisational resilience can be defined as „the

capacity for an enterprise to survive, adapt and grow in the face of turbulent change‟.

Like Holling’s ecological view of resilience, organisational resilience can be seen to

involve the capacity to maintain desirable functions and outcomes in the midst of

strain and the ability to bounce back from disturbance (Sutcliffe and Vogus 2003;

Hoffer-Gittell 2008). Thus, organisations and other socio-economic entities (i.e.

industrial systems) are increasingly being viewed as analogous to living organisms

with ‘adaptive capacity’, sharing much in common with their ecological counterparts

(Dekker and Hollnagel 2006; Fiksel 2006). Fiksel (2006) further contends that there

is an urgent need to draw on the concept of adaptive capacity as discussed in the

ecological resilience literature to better understand the dynamic, adaptive behaviour

of complex industrial systems and their resilience when faced with disruptions.


The Need to Balance Anticipation with Resilience

With organisations facing increasing costs and potential losses due to operational

downtime, there is growing recognition of a need to adopt more strategic mitigation

strategies to safeguard key business operations from unexpected interruptions, by

better balancing anticipation with resilience (Auerswald 2005; La Porte 2005). This

is supported by Starr, Newfrock and Delurey (2003), who argue that enhancing

resilience in organisations has become increasingly necessary given the current

economic and security climate which is posing a new set of challenges to executives

and boards. They contend that because not all risks can be anticipated, there is a

growing need for organisations to have this adaptive imperative to respond flexibly

to uncertainty (Starr et al. 2003).

Conventional strategies and tools for managing risks and crises have in the past

tended to prepare for known and expected contingencies on the basis of a priori

risk assessments. Such an approach prepares organisations to cope with and

bounce back from anticipated events but it does not prepare them to respond flexibly

to the unexpected. This structured approach may have been useful in the past when

risks were predictable. For example, in response to the Y2K Millennium Bug threat

billions of dollars were spent worldwide on specific defences for continuity against

an anticipated threat that thankfully never manifested (Manion and Evan 2000). It is

argued however that sinking resources into specific defences is inefficient when

faced with unexpected threats (Wildavsky 1988; de Bruijne and van Eeten 2007).

Furthermore, traditional anticipatory planning strategies are also argued to inhibit a

flexible response for systems to cope and ‘bounce back’ following unexpected crisis

events, as they can lead to the loss of capacity to adapt to changing conditions or

threats ultimately leaving organisations vulnerable (La Porte 2005).

Whilst the importance of managerial perceptions, organisational culture, and

structure cannot be underscored enough, with these prescriptions remaining critical

for facilitating effective Crisis Management, conventional anticipatory mitigation

strategies are now not enough to ensure system reliability amidst an increasingly

uncertain and complex operating environment when employed alone. This is

supported by Fiksel (2006) who argues that, faced with a dynamic and unpredictable

environment, management theorists are increasingly identifying the need for


resilience. That is, systems must now be prepared to deal with adversity and the

unexpected, by also employing strategies that emphasise resilience as highlighted by

the devastating events of September 11 (Hamel and Valikangas 2003; Fiksel 2006;

Lagadec 2007).

According to Lagadec (2007), what happened on September 11 was a ‘watershed’ in

the experience of and approach to threats, with authors in their assessment of the

New York City response arguing that resilient systems fared the best. In the wake of

this event, there is growing recognition of the need for organisations to have the

ability to protect key business functions by ensuring their continuity or resilience,

during interruptions. It must be noted however that the abstract and multidimensional

nature of the concept of resilience makes it difficult to operationalise, and it remains

largely unclear what factors contribute to resilience in complex systems or which

variables should be measured when studying resilience (Cumming et al. 2005). What

is known however is that like HRT, BCM as an evolution of traditional DRP and

Crisis Management approaches, has emerged as a critical management tool aimed at

fostering resilient capabilities in organisations to ensure their continuity in the face of

disturbance (Elliott et al. 2002; Herbane et al. 2004; Charters 2007; Drennan and

McConnell 2007), through the central notion of adaptive capacity which is at the

core of contemporary business continuity strategies (Dalziell and McManus 2004).

Business Continuity Management (BCM) – A Framework for Resilience

The Business Continuity Institute (BCI) defines contemporary BCM as a ‘holistic

management process that identifies potential impacts that threaten an organisation

and provides a framework for building resilience and the capability for an effective

response’ (Charters 2007, 2). According to Standards Australia’s handbook for

Business Continuity Management (2004a, 2), it has the important role of providing

‘the availability of processes and resources in order to ensure the continued

achievement of critical objectives’. This ongoing process is seen to be an effective

means for managing the unexpected, by not only anticipating and becoming aware of

unexpected threats earlier, but also providing a capacity for resilience that enables

systems to flexibly contain, and bounce back from interruptions.


BCM as a Resilience-Enabling Management Tool

Rather than simply focusing on preparing for anticipated events, BCM is a

recognised resilience-enabling management tool. It is a process that assists

organisations in preparing for disturbances, providing them with the capacity to cope

with and respond flexibly to unanticipated changes in internal and external operating

environments, and ultimately enabling them to absorb the consequences of disruptive

events and continue functioning when under pressure. By identifying and

understanding exposure to both internal and external threats, the BCM process

focuses on implementing effective protection and recovery mechanisms for critical

business processes to ensure the continuity of their operation. This proactive, focused

approach sees steps taken to ensure that critical business processes can continue to be

delivered uninterrupted in the face of unexpected interruptions (Gibb and Buchanan

2006; Robb 2006; Charters 2008). A capacity for dealing with the unknown is now

considered the key to an adequate response to crises in increasingly complex and

uncertain times, and may also provide significant long-term value creation and

organisational advantage (Boin et al. 2003).

The Evolution of Business Continuity Strategies

Both Crisis Management and DRP can be seen in the broader context of BCM as

antecedents, contributing significantly to its development and influencing its

contemporary approach (Herbane et al. 2004). Earlier business continuity

publications had restricted attention to information systems management and

protection. However, in recent years, the BCM approach has moved from the DRP

realm, to be more aligned with the Crisis Management literature (Herbane, Elliott

and Swartz 1997; Elliott et al. 2002; Herbane et al. 2004; Drennan and McConnell

2007), as highlighted in Figure 2.3.


Figure 2.3: Evolution of BCM Approach

Source: Herbane et al. 2004

Where it was once just computer mainframes and IT that were the focus of

traditional continuity strategies (i.e. DRP), now every aspect of business activity is

considered with BCM focused on recovering the entire business (Elliott et al. 2002;

Standards Australia 2004a; Drennan and McConnell 2007). Moving beyond the

outdated DRP literature, contemporary approaches to business continuity are said to

provide a holistic, socio-technical approach to safeguarding organisations and

dealing with the unexpected by including an organisation-wide function for the

identification of causes, preparation for, and response to incidents, by taking a Crisis

Management approach (Elliott et al. 2002).

The development of Crisis Management as an anticipatory and strategic

organisational approach is considered to be at the roots of BCM, establishing its core

assumptions. In contrast, DRP can be considered to be the reactive, less socio-

technical predecessor with its focus on IT activities, whilst contemporary BCM

shares many parallels with the Crisis Management literature. Both Crisis

Management and BCM are more proactive and anticipatory in their approach, taking

a broader view of operations by focusing on the internal and external threat

environment, as well as recognising the importance of using and protecting both hard

(i.e. physical) and soft (i.e. culture, staff) assets within organisations (Elliott et al.

2002; Herbane et al. 2004; Drennan and McConnell 2007). While the traditional

DRP management approach has been incorporated as an integral component of the

contemporary BCM framework, the broadening of business continuity reflects a

number of underlying assumptions about business interruptions that have been well

established within the field of Crisis Management, as evident in Table 2.5.

Figure 2.3 content (DRP compared with BCM):

Disaster Recovery Planning (DRP) - Old
• Reactive
• Recovery Emphasis
• Focus on IT
• IT Staff
• Technical, no Social Considerations
• Importance of Hard Assets
• External Threats
• Protect Core Operations
• Sustain Current Position

Business Continuity Management (BCM) - New
• Proactive
• Prevention & Recovery
• Organisation-wide Focus
• Multi-disciplinary Teams
• Socio-technical
• Importance of Hard & Soft Assets
• Internal & External Threats
• Considers Stakeholder Impacts
• Strategic Function for Sustainable Advantage

Table 2.5: Crisis Management and BCM Shared Assumptions

Organisations themselves can incubate the potential for business interruptions

Business interruptions are systemic in nature

There can be both social and technical characteristics of business interruptions

Business interruptions require organisation wide considerations

Managers play an important role in the resolution of business interruptions

Business interruption will not inevitably result in crises if managed properly

The impact of business interruptions will be felt on a wide range of stakeholders

Source: Elliott et al. 2002; Herbane et al. 2004; Drennan and McConnell 2007

While the DRP focus is purely on the post-crisis recovery phase, both Crisis

Management and BCM place emphasis on the pre-, trans- and post-crisis phases,

although they vary in the emphasis they place on each (Herbane et al. 2004; Robb

2006). Crisis Management and BCM both see accidents as being ‘normal’ events

within organisations that can either be prevented from materialising into a crisis

situation, or if a crisis emerges, it can be swiftly recovered from, thus reinforcing the

need for implementing not only recovery, but also prevention measures. Sharing

many similarities, the terms Continuity Management and Crisis Management are said

to be becoming interchangeable (Herbane et al. 2004). Despite sharing many

parallels, Herbane et al. (2004) contend that Crisis Management tends to be more

socio-centric in its approach focusing on generating certainties, whereas BCM has

developed into a more business-orientated approach, attuned to dealing with

uncertainty. Neither DRP nor the anticipatory planning tools utilised in the Crisis

Management approach adequately prepare organisations for managing unexpected

events, emphasising the importance of BCM in the contemporary business

environment.

Standards and Guidelines

When implementing business continuity activities, many variations are often adopted

by organisations. Despite its recognised benefits, there is as yet no internationally

defined or accepted standard for BCM. Varying approaches have been put forward

by a variety of authors from private, public and academic fields giving rise to

discrepancies over appropriate terminology and structure. At present there are

divisions between the National Standards, which are supported by Government and

are often used for compliance purposes in certain industries, and the academic

literature presented in the business, IT and other social science disciplines (see for

example Botha and Von Solms 2004; Gibb and Buchanan 2006).

In Australia, the governing guideline for BCM, HB221:2004 (Standards Australia

2004a) is based on generally accepted practice principles from within Australasia and

internationally, and is supported by the Practitioners Guide to BCM – HB292:2006

(Standards Australia 2006a). It takes a top-down approach and presents a linear

framework for the effective adoption and implementation of BCM, characterised by

nine stages, with mandatory monitoring and reviewing at each stage of the process.

This process also needs to be supported by effective communication at each stage of

the BCM program. This is a logical, reiterative process that has a number of features

common to all acknowledged international approaches, including the British

Standard BS25999-1:2006 and the US National Fire Protection Association (NFPA)

1600 Standard (Standards Australia 2006a).

In the United Kingdom (UK), the British Standards Institute published the PAS 56:

2003 Guide to Business Continuity Management which was replaced by an official

standard in BS25999-1:2006 Code of Practice for BCM (British Standards Institute

2006). In the US, there is the NFPA 1600 Standard, which is revised periodically

(National Fire Protection Association 2008). In essence, each demonstrates a similar

approach to business continuity with only slight differences with regard to the

terminology employed and the flowchart components in which key elements are

aligned. A significant difference between the three methodologies, however, lies in

how they view the relationship between Risk Management and BCM. The Australian

and US Standards both acknowledge the integral interrelationship between Risk

Management and BCM, while the UK Standard treats the two as separate procedures

(Standards Australia 2006a).

Academic work has established many different approaches to BCM, most of which

have been largely prescriptive in nature and focused on the protection of IT assets.

One of the most recent frameworks put forth was by Botha and Von Solms (2004),

who developed a cyclical seven-stage methodology for Business Continuity Planning

by critically analysing existing studies. Similarly, Gibb and Buchanan (2006),

drawing on the different approaches and experiences in the field, distinguish their

research by integrating various development cycles of BCM proposed in the

literature, offering an all-encompassing nine-step framework. The two approaches

differ considerably, with Gibb and Buchanan’s (2006) appearing to be more aligned

with the National Standards. Both, however, focus their work on the IT realm, limiting

the scope and transferability of their research.

Whilst there are consistencies across the varying frameworks, in terms of the

importance of ongoing testing, training, and reviewing and monitoring, they each

vary in the elements incorporated in their approach and in the terminology employed.

As a result, a number of gaps between academic and National Standards are evident.

Although significant contributions have been made by the likes of Gibb and

Buchanan (2006), the academic literature details largely prescriptive BCM initiatives

in the form of plans and directions, and is substantially silent with regard to

performance-based guidance. The current National Standards, particularly the

Australian and British, appear to be comprehensive and robust for better practice

guidelines, as both take a holistic management approach, positioning BCM as a

whole-of-management approach firmly integrated and embedded across the whole

organisation (across all functions).

Critical Success Factors of BCM

Given that there is a lack of clarity over what constitutes the best approach to BCM,

no concrete measures exist in the literature for assessing the maturity of

BCM in organisations. However, based on an extensive review of the various

International Standards and the existing, albeit limited, academic literature on BCM, a

number of common factors or themes emerge that appear to collectively contribute to

the development of a mature BCM capability and thus resilient capacities within

organisations. The themes identified, which are by no means exhaustive, are evident

in the following table.

Table 2.6: Factors Contributing to the Development of a BCM Capability

• Scope of Implementation
• Embeddedness
• Role of Senior Management
• Corporate Governance
• Alignment with Standards
• Testing & Review Activities
• Threat Assessment
• Business Impact Assessment (BIA)
• Integration with Risk Management

Implementation

Contemporary approaches to BCM mark an evolution from the narrow specialist

areas of DRP with its IT focus, and rational, anticipatory Crisis Management,

towards a holistic and coordinated approach to dealing with crises that embraces all

aspects of strategic and operational areas of organisations (Herbane et al. 2004).

While many organisations may have had crisis and emergency response capabilities

in place, contemporary BCM can be considered to be a relatively new process in

organisations and accordingly the level of implementation and sophistication varies

across industry sectors and organisations. Increasingly, though,

organisations are recognising the value of implementing a contemporary, enterprise-

wide approach to BCM, as it may deliver value preservation in the form of increased

resistance to crises and higher reliability configurations. The ability to recover from

crises and resume operations faster than competitors can also be considered to be a

source of competitive advantage. However, if BCM is not embedded throughout an

organisation, it cannot contribute to the long term strategic goals of the company

(Herbane et al. 2004).

Degree of BCM Embeddedness

Embedding BCM in the business-as-usual operations of an organisation is one of the

most challenging aspects of achieving enterprise-wide Risk Management, but getting

it right delivers significant benefits to organisations. According to the BCI (Charters

2008, 5), BCM must be ‘owned and fully integrated into the organisation as an

embedded management process’. For it to be successfully implemented within an

organisation and effectively utilised and embedded, BCM should be an accepted

management process, requiring a top-down approach with Senior Management

support and commitment, and its adoption and practice across the whole organisation

in every function, as well as liaison with external agencies. Furthermore, BCM

should be considered an ongoing practice that requires continual monitoring and

review, reinforced by regular testing (Standards Australia 2006a; Charters 2008).

Testing and Review Activities

According to Standards Australia (2006a), a mature BCM program requires regular

and effective maintenance through testing and review activities. This generates and

maintains awareness and understanding to ensure that those activities remain fit for

purpose, and to ensure that the organisation and its people are prepared and

supported by appropriate resources and processes to respond effectively to a range of

disturbances. Regular testing and review activities can therefore be considered

central to the development of a robust BCM program, contributing to the ongoing

maturity of a BCM capability through the process of organisational learning and

continuous improvement. This is supported by Charters (2008), who suggests that

testing and review exercises are part of an overall life-cycle approach to ensure the

development and maintenance of a mature BCM capability.

Threat Assessment and the BIA

Contemporary BCM with its Crisis Management approach strikes a balance between

anticipation and resilience for dealing with both anticipated and unexpected threats.

BCM entails preparing the organisation for a range of threats, by systematically

identifying, evaluating and treating potential threats as is done in Crisis Management

by way of a risk and vulnerability assessment (or ‘threat assessment’). This addresses

the likelihood and consequence of a variety of threats that could potentially cause an

interruption. However, it also identifies what to protect in terms of critical business

processes and resources depended on to maintain service continuity in the face of

future uncertainty as part of the integral Business Impact Assessment (BIA)

(Standards Australia 2006a; Charters 2008).

The BIA involves establishing Maximum Acceptable Outage Times (MAOs) and

Recovery Time Objectives (RTOs) to help ensure the swift restoration of system

functionality no matter the interruption encountered. The BIA adds an extra

dimension (time) to the usual Threat Impact × Likelihood equation, so that risk

treatment efforts can be reduced to a more manageable scope and be prioritised on

those activities that may quickly disrupt the business. It also identifies single points

of failure and unacceptable concentrations of risk in an organisation. These tasks

help to generate better understanding of the organisation and guide the focus of the

Business Continuity Plans (BCPs) (Standards Australia 2006a; Charters 2008).
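
As a minimal sketch of how adding time to the Impact × Likelihood calculation can change priorities, assume each critical process is scored on illustrative 1 to 5 scales and given an MAO and RTO in hours. The names `Process`, `risk_score`, `prioritise` and `rto_exceeds_mao`, the scales, and the example processes are hypothetical and are not drawn from Standards Australia (2006a) or Charters (2008).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Process:
    """A critical business process as it might appear in a BIA (hypothetical fields)."""
    name: str
    impact: int        # assumed scale: 1 (minor) to 5 (severe)
    likelihood: int    # assumed scale: 1 (rare) to 5 (almost certain)
    mao_hours: float   # Maximum Acceptable Outage, in hours
    rto_hours: float   # Recovery Time Objective, in hours


def risk_score(p: Process) -> int:
    """The usual two-dimensional score: impact multiplied by likelihood."""
    return p.impact * p.likelihood


def prioritise(processes: List[Process]) -> List[Process]:
    """Order processes by time criticality (shortest MAO first).

    Ties on MAO are broken by the higher impact x likelihood score.
    """
    return sorted(processes, key=lambda p: (p.mao_hours, -risk_score(p)))


def rto_exceeds_mao(processes: List[Process]) -> List[str]:
    """Names of processes whose planned recovery time is longer than the tolerable outage."""
    return [p.name for p in processes if p.rto_hours > p.mao_hours]


if __name__ == "__main__":
    # Invented example data, for illustration only.
    example = [
        Process("billing", impact=3, likelihood=2, mao_hours=72, rto_hours=48),
        Process("network control", impact=5, likelihood=2, mao_hours=1, rto_hours=0.5),
        Process("call centre", impact=3, likelihood=3, mao_hours=8, rto_hours=12),
    ]
    for p in prioritise(example):
        print(p.name, risk_score(p), p.mao_hours)
    print("RTO longer than MAO:", rto_exceeds_mao(example))
```

In this sketch a process with a modest risk score can still rank first if its MAO is short, which is the sense in which the time dimension narrows risk treatment to the activities that may quickly disrupt the business.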

Common to all frameworks is the important role of Business Continuity Planning

which can be defined as ‘planning which identifies an organization’s exposure to

internal and external threats and synthesizes hard and soft assets to provide effective

prevention and recovery for the organization, whilst maintaining competitive

advantage and value system integrity’ (Elliott et al. 2002, 2). BCPs are developed and

documented in a comprehensive and simple manner, allowing an organisation to

respond flexibly to a wide range of potential disruptions. They address business

interruptions from the initial response to the point at which normal business activities

are resumed by detailing an appropriate set of recovery resources and services,

including roles and responsibilities to be deployed to provide for the restoration of

acceptable functionality for business activities (Charters 2008).

Managing continuity involves recognising that disruptions may occur that have not

been considered through formal risk assessment; BCPs must therefore maintain a

high degree of flexibility, allowing organisations to rapidly respond to changing

circumstances. The development and implementation of BCPs, while integral to the

success of the overall BCM process, is just one recognised step conducted as a part

of a broader BCM program. Despite BCM being a more embracing concept than

Business Continuity Planning, the terms are often confused and used interchangeably

in the Crisis Management literature (Standards Australia 2006a; Charters 2008).

Enterprise Risk Management and BCM

There is also recognition of the need for BCM to be firmly integrated with an

organisation’s broader Risk Management framework. Although there is some

contention that the functions must be kept separate as the focus and methods of BCM

differ significantly from those of Risk Management (Drennan and McConnell 2007;

Charters 2008), the two are more commonly viewed as complementary for ‘best

practice’ (Standards Australia 2004a; Griffiths 2008). The complementary nature of

the processes is supported by the handbook HB221:2004 Business Continuity

Management (Standards Australia 2004a), which suggests that Risk Management

provides the grounding for an effective BCM process by establishing the scope,

needs and priorities. It also suggests that BCM is an integral risk treatment function,

by providing the capability for an organisation to adequately plan for and manage

business disruptions and protect critical business functions, as an important

mitigation outcome of the wider Risk Management process (Standards Australia

2004a).

By combining and integrating the two approaches, Standards Australia (2004a)

suggests that organisations may create efficiencies and flexibility that may not be

realised if kept separate. This complementary relationship is further supported by

Griffiths (2008), who contends that BCM forms an essential part of the Risk

Management process, assisting organisations in mitigating disruptions to business

continuity by identifying critical objectives and processes, potential disruptions, and

mitigation strategies. Seville (2009) also supports this view but adds a further

element, arguing that Risk Management and BCM combined with strategic planning

are an integral part of an organisation’s toolkit for achieving greater resilience.

Corporate Governance and the Role of Senior Management

BCM’s growing recognition as a critical management tool has been reinforced by the

rising issue of corporate governance on organisational agendas. With increased

pressure in recent years on company directors and management to demonstrate that

they are actively managing the business’s risks and ensuring the ongoing

sustainability of the organisation, BCM is increasingly being considered as an

integral element of an effective corporate governance framework. BCM helps to

protect shareholder, investor, and other stakeholder interests by having processes in

place to effectively mitigate risks that threaten the ongoing sustainability of the

organisation (Standards Australia 2006a; Drennan and McConnell 2007; Charters

2008; Freestone and Lee 2008). As mentioned earlier, it is also widely acknowledged

in the literature that Senior Management and the Board play a fundamental role in

implementing and sustaining a successful BCM program. Without their full support

and commitment to ensure sufficient resourcing and attentiveness, developing a

successful and sustainable BCM program will be an extremely difficult feat, even

with a BCM champion (Standards Australia 2006a).

Despite little consensus in the literature on what constitutes the best approach to

BCM, Herbane et al. (2004, 454) believe that it has potential value for any

organisation that would benefit from a ‘socio-technical organisational wide,

strategically-cognisant approach to Crisis Management’. Indeed any organisation that

cannot afford to experience business interruptions and losses should have a BCM

capability in place.

The preceding discussion has detailed the evolution of two different streams of the

Risk and Crisis Management literature, highlighting diverse strategies that have

emerged from the study of large scale socio-technical systems and industrial failures.

Although all of the strategies discussed have been useful in their own right, the

discussion highlighted the increasing recognition of specific strategic views that

balance anticipation with resilience. Events such as September 11 have highlighted

the importance of being prepared for the unexpected, with BCM and HRT identified

as two resilience-enhancing practices. Whilst resilience is indeed an important

capacity for all organisations to develop, its role in organisations responsible for

managing critical infrastructures cannot be overstated, as building resilient

capacities to ensure the continuity and reliability of their essential operations is

paramount. The following discussion will therefore set the context for this research,

introducing the literature surrounding the protection of critical infrastructures,

highlighting the need for the development of resilient capabilities in these vital large-

scale technical systems, as well as challenges that are impacting the achievement of

resilient outcomes given increased interconnectivity and the introduction of

networked conditions.

Critical Infrastructure Protection (CIP)

Many major natural and socio-technical events during the last decade, such as

Hurricane Katrina and the September 11 terrorist attacks, have raised questions about

the reliability of critical infrastructure systems in industrial nations and their ability

to provide routine and emergency services under stress (de Bruijne and van Eeten

2007; Egan 2007). Recent failures and attacks have sufficiently demonstrated their

escalating vulnerability[7] and hence society’s vulnerability to an ever-growing

spectrum of threats (International Risk Governance Council 2006). In response, CIP

has become a matter of national security for many nations (Farrell, Zerriffi and

Dowlatabadi 2004) and subject to growing academic interest, with the aim of

building more resilient infrastructure systems by providing higher reliability

configurations to maintain service continuity.

The following section will set the context for this research investigation, detailing the

importance of critical infrastructures to modern industrial societies, and also

outlining the need for reliable service provision and improved infrastructure

resilience amidst growing uncertainty. In addition to growing uncertainty in the

operating environment, critical infrastructures are faced with a number of additional

challenges with the potential to affect their future reliability, particularly the system

issues of interconnectivity and institutional fragmentation.

Large-Scale Technical Systems, CIP and Criticality

While dangerous factories and plants were key sources of risk in the past

(Shrivastava et al. 1988), concern in the literature and society at large has now

shifted to networks of critical infrastructure – the contemporary large-scale technical

systems which provide vital services that support our complex modern societies.

While considered vital, they are also considered to be the source of societies’

potential destabilisation (Lagadec and Rosenthal 2003; de Bruijne and van Eeten

2007). In the wake of the Y2K problem, the September 11 terrorist attacks and

Hurricane Katrina, concern has risen about the vulnerability of critical infrastructure,

as routine economic and societal functions are dependent on their secure and reliable

operation (Perrow 1999a; Amin 2001; Rinaldi, Peerenboom and Kelly 2001; Boin et

al. 2003; de Bruijne 2006; de Bruijne et al. 2006; Boin and McConnell 2007; de

Bruijne and van Eeten 2007; Egan 2007; Lagadec 2007).

[7] The vulnerability of a system can be seen as a sensitivity (susceptibility) to threats and hazards that may reduce the ability of the system to carry out its tasks and supply the intended services (Adger 2006; Holmgren 2006).

Traditional views of critical infrastructure have been revised in the last decade (Boin

and McConnell 2007), and while a range of definitions have been put forth,

discrepancies remain with regard to their classification (International Risk

Governance Council 2006). The term critical infrastructure has become widely used

in both the academic literature and Government publications, but it has largely been

defined by illustration and categorisation, impacting attempts to determine a generic

definition (Egan 2007). There is however some consensus, with critical infrastructure

deemed to be a vital part of a larger set of services and products that are considered

essential to the functioning of modern economies and societies, which include

electric power, banking and finance, transportation, telecommunications, water

supply, and oil and natural gas (de Bruijne 2006; de Bruijne and van Eeten 2007).

The Australian Government (2008) defines critical infrastructure as those ‘physical

facilities, supply chains, information technologies and communication networks that,

if destroyed, degraded or rendered unavailable for an extended period, would

significantly impact on the social or economic well-being of the nation or affect

Australia’s ability to conduct national defence and ensure national security’.

Recognising their importance to economic and social order, many countries

including Australia, the US, and the Netherlands have initiated Government inquiries

into the vulnerability of their critical infrastructures, with each country categorising

different infrastructures as vital depending on system characteristics and national

needs (Farrell et al. 2004).

From an academic perspective, de Bruijne and van Eeten (2007, 18) contend that at

the core of every list of critical infrastructure are the large-scale technical grids of

energy, water, communication and transportation, with this subset of networked

infrastructures making up the ‘arteries and veins of Western, urbanized societies’.

According to La Porte (1996), there are specific characteristics of these particular

systems that make critical infrastructures:

- tightly coupled technically, with complex organisational and management

imperatives driven by system operating requirements, consistent with the

assumptions of NAT (detailed earlier in the Chapter – see pages 13-17);

- non-substitutable, with few competing networks delivering the same service;

- driven to achieve maximum coverage of infrastructure (the operational tendency or

logic of networked systems);

- the source of public anxiety about the possibility of interruptions to service and the

consequences of serious operating failures; and

- critical to the effective functioning of our societies, thus the source of demands for

assurances of reliable operations.

Each of these infrastructure systems is interconnected to a different degree and must

be regarded as a system of systems, embedded in a broader socio-political-economic

network (International Risk Governance Council 2006). They form complex,

interrelated large-scale technical systems that reach into every aspect of modern

society and are so vital that their failure would have debilitating effects for all

involved. Such essential services underpin the economic prosperity, national security

and the quality of life of modern societies who have long depended on the

unimpeded availability of these networked lifelines. The critical infrastructures of

developed and developing countries are today the foundations upon which these

societies are built, and while they are central to the high quality of life enjoyed today,

their benefits have come with increased risk. Indeed, this growing reliance on

large-scale, complex infrastructure systems for critical services is increasing our

vulnerability to unanticipated events (Amin 2001; Amin 2002; International Risk

Governance Council 2006; de Bruijne and van Eeten 2007; Egan 2007).

Due to the importance of maintaining uninterrupted service continuity, critical

infrastructure must attain the highest level of reliability in performance (Roe et al.

2005). Reliability has become a popular catchphrase, and in some cases it means the

constancy of service, in others the safety of key activities and processes (La Porte

1996). Increasingly, in an age of uncertainty, it means the resilience of operations

which is the ability to absorb or recover from a disruption or attack (Schulman and

Roe 2007). Due to the systemic, networked nature of critical infrastructures,

achieving high reliability should be considered to be a process of providing essential

services across organisations, rather than a trait of individual organisations (Roe et

al. 2005).

Current levels of service reliability provided through various critical infrastructures

in Western societies can thus be considered impressive as they are usually described

in the order of 99% availability or better. Such impressive levels of reliability lie far

above the average reliability of service provision achieved by ordinary organisations

(de Bruijne et al. 2006). However, Schulman and Roe (2007) contend that a reliable

infrastructure is one whose output variation is relatively low, especially from the

standpoint of consumers. Electricity should always be on, and water, roads and

telecommunications always available (de Bruijne 2006; Schulman and Roe 2007).

Despite maintaining high levels of reliability to date, questions remain about how

sustainable this is in high-risk, complex systems like critical infrastructures (de

Bruijne 2006).
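
To put the availability figures above in concrete terms, the short sketch below converts an availability percentage into the implied hours of outage per year. It assumes a non-leap year of 8,760 hours, and the function name `annual_downtime_hours` is hypothetical. On that assumption, 99% availability still allows roughly 87.6 hours of interruption a year, which helps explain why consumers may judge reliability by the variation they experience rather than by the headline percentage.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years (an assumption)


def annual_downtime_hours(availability_percent: float) -> float:
    """Hours of outage per year implied by a given availability percentage."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    return (100 - availability_percent) * HOURS_PER_YEAR / 100


if __name__ == "__main__":
    # Illustrative availability levels only; none are taken from the cited studies
    # apart from the 99% figure quoted in the text.
    for level in (99, 99.9, 99.99):
        print(f"{level}% availability allows about "
              f"{annual_downtime_hours(level):.2f} hours of outage per year")
```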

Despite facing challenging conditions, critical infrastructures have to date been

highly reliable in their service provision, raising societal expectations. Accustomed

to the high levels of reliability experienced in the past, society has become

unaccustomed to coping with unreliability and begun to take reliable service

provision for granted. Society now demands that services provided by critical

infrastructure networks be available 24 hours a day, 7 days a week, year round in an

always-on global information economy, increasing the need for higher levels of

reliability (Boin et al. 2003; de Bruijne 2006; de Bruijne et al. 2006).

To meet these demands, critical infrastructures have grown in size and complexity,

but in doing so have inadvertently increased their vulnerability. Widespread attention

and concern about the dependence of modern Western societies on critical

infrastructure has resulted in infrastructure reliability becoming a priority issue in

recent years. In the wake of major terrorist attacks (e.g. September 11, Bali, Madrid

and London bombings, Anthrax letter attacks), Y2K and large-scale network failures

such as the California Electricity Crisis, concerns have been raised about the

vulnerability and possible unreliability of key infrastructure, with recognition of the

need to improve the resilience of these lifeline systems (Boin et al. 2003; de Bruijne

2006; de Bruijne et al. 2006).

CIP: Managing Vulnerability, Ensuring Reliability

With increased societal dependence on these vital systems, it is now of the utmost

importance to Government, business, and the public at large that the flow of services

provided by critical infrastructures continues unimpeded in the face of a diverse

spectrum of threats, ranging from natural hazards to acts of terrorism (Little 2002;

International Risk Governance Council 2006). CIP has become a general label for a

range of proactive measures to protect vital networks that are necessary for security,

economic stability, and public safety, undertaken to manage risk to ensure reliable

service provision (Rothery 2005). CIP can be defined as the actions and programs,

undertaken jointly by Government and the operators of key facilities, that identify

critical infrastructure and its components, assess their vulnerabilities, and take

mitigative and protective measures to reduce vulnerabilities (Auerswald et al. 2005).

CIP brings together a significant number of existing strategies, plans and procedures

that deal with the prevention, preparedness, response and recovery arrangements for

disasters and emergencies. These activities include counter terrorism, Business

Continuity Planning, Risk Management and Emergency Management (Australian

Government 2008), providing the capability needed to eliminate potential

vulnerabilities in critical infrastructures. It involves not only deploying a tougher

structure, but also developing reliability-enhancing characteristics that enable society

to be more resilient and robust in the face of new, dynamic, and uncertain threats

(Auerswald et al. 2005).

CIP Initiatives in Australia

Since the September 11 attacks, protecting critical infrastructure from terrorist attack

has become a high priority for the Australian Government, with these events bringing

the Australian Government‟s policy on CIP sharply into focus. In November 2001,

the Prime Minister announced the formation of the Business-Government Task Force

on Critical Infrastructure, with the role of examining what needed to be done to

ensure that Australia’s critical infrastructure is adequately protected. Following the

release of the Task Force’s recommendations, the Government launched a program

to build a partnership with businesses operating and managing critical infrastructure

to ensure that they are not just protected from acts of terrorism, but from all hazards

(Rothery 2005).

Global CIP Initiatives

Recognising the increasingly global nature of risks and vulnerabilities, many

transnational initiatives have also been launched to address CIP (Farrell et al. 2004).

One example is the Comprehensive Risk Analysis and Management Network (CRN),

launched in 2000 as a joint Swiss-Swedish initiative to foster

international dialogue on national-level security risks and

vulnerabilities. In 2002, the CRN published the International Critical Information

Infrastructure Protection (CIIP) Handbook (Abele-Wigert and Dunn 2006), which

provided a comprehensive inventory of national protection policies in eight countries

allowing for easy comparisons. This has since been updated to include twenty

countries and also provides an in-depth analysis of key issues relating to CIIP

(Abele-Wigert and Dunn 2006).

Other initiatives include the Zurich Centre for Security Policy, and the Geneva

Centre for Security Policy. In 2003 the Geneva Centre held a forum bringing

together 186 participants representing 28 countries from around the world to discuss

the issue of critical infrastructure and continuity of services in an interdependent

world (Narich 2005). What is evident from these examples is that many nations are

recognising the importance of CIP, and the benefits of global partnerships and

cooperation to ensure the service continuity of critical infrastructures. This work

seeks to contribute to the ongoing dialogue on this important issue.

Limitations of Academic Research into CIP

Significant research has been undertaken aimed at improving the understanding and

management of critical infrastructure systems to reduce vulnerabilities and enhance

reliability. For example, from mathematical modelling and analysis of system risk

and vulnerabilities in specific sectors (Haimes and Horowitz 2004), to case studies

examining the impact of specific incidents (Wallace et al. 2003; Mendonca, Lee and

Wallace 2004), a variety of methodologies and theoretical approaches have been

applied. The bulk of academic literature pertaining to CIP, however, lies within the

engineering and IT fields, as traditionally the safety and security of these physical

systems has been considered to be an engineering issue. Interest has surged since

September 11 with sophisticated analytical, mathematical modelling and forecasting

tools employed to improve understanding of these complex, interconnected systems

and identify system risks and points of vulnerability.

Although this technical approach has been instrumental in advancing system designs

for high reliability configurations, aimed at protecting physical assets and structures

to ensure that critical systems maintain a stable operating state near equilibrium, it

fails to take into consideration the equally important non-technical components and

processes within these critical systems (Haimes and Longstaff 2002). Accordingly, a

more diverse academic interest in CIP has surged in recent years, particularly

amongst Crisis Management academics with organisational theory and disaster

research providing vital insight into the behaviour and complexity of these essential

systems and the organisations that manage them (Pommerening 2007).

While concern about terrorism has previously been a primary focus, there is a strong

recognition in the academic literature and in Government policy that the effect of

terrorism is not the only issue requiring a revised policy approach and

outside-the-box thinking. What happened on September 11 was certainly the most spectacular, but

not the only, incident that projected the world into a new and profoundly unstable

orbit as far as crises are concerned, challenging the effectiveness of conventional

approaches to CIP (Little 2002; Little 2003).

A Revised Approach to CIP – From Protection to Resilience

Several academics have highlighted the limitations of conventional Risk and Crisis

Management approaches for effective CIP in recent literature (Boin and McConnell

2007; de Bruijne and van Eeten 2007). Although anticipatory Risk and Crisis

Management practices with a focus on security like the HB167:2006 (Standards

Australia 2006b) and AS/NZS 4360:2004 Risk Management Standard (Standards

Australia 2004b) utilised in Australia are indeed important strategies for the

protection of critical infrastructure, questions regarding their effectiveness have

emerged in recent years.

Such practices are said to be more effective when applied in a broader context, as

there is growing recognition of the benefits of promoting resilience in critical

systems to ensure that they can bounce back after experiencing unanticipated

disturbances (de Bruijne 2006). Academics including Boin and McConnell (2007,

53) warn that such prevention and planning efforts provided by traditional Risk and

Crisis Management approaches may not provide the capacity to prepare critical

infrastructure for the vast spectrum of ‘extraordinary, complex and critical threats

that they are sure to encounter in times of crisis’. They argue that to improve CIP,

approaches need to find a better balance between anticipation and resilience, given

that the conditions facilitating effective anticipation have deteriorated (La Porte

2005; de Bruijne and van Eeten 2007).

Balancing Anticipation with Resilience in Critical Infrastructure

In the face of mounting uncertainty, recent research into the protection of critical

infrastructures has stressed the importance of enhancing system resilience to

maintain service continuity after experiencing interruptions (see de Bruijne 2006; de

Bruijne and van Eeten 2007). While anticipation strategies are indeed important for

effective CIP, it is argued that critical infrastructures should develop more

resilience-based strategies in order to ensure security and service reliability in a dramatically

changed operating environment (Little 2003; La Porte 2005; de Bruijne 2006;

International Risk Governance Council 2006; Boin and McConnell 2007; de Bruijne

and van Eeten 2007). In the context of increasing natural and man-made threats and

vulnerabilities of modern societies, the concept of resilience seems particularly useful to inform

policies that mitigate the consequences of such adverse and potentially catastrophic

events.

de Bruijne and van Eeten (2007, 22) believe that the continued focus of current CIP

approaches on anticipation strategies and sinking valuable resources into ineffective

defences at the expense of investing in generalised resources for resilience to deal

with unanticipated risks is ‘ironic’ given that the major threats that have fuelled the

recent interest in CIP are those posed by terrorists who ‘make it a point to defeat

anticipation’, a view that is also supported by La Porte (2005).

Similarly, the International Risk Governance Council (2006) contends that while it is

tempting to focus exclusively on the prevention of major disruptions, it is important

to remain mindful that infrastructure failures cannot be ruled out and that it is the

essential services they provide, not the systems themselves, that are most valuable to

society. The implication of this, they suggest, is that in addition to doing what can

reasonably be performed to guarantee continued service, attention should be equally

directed at fault tolerance and increased resilience (International Risk Governance

Council 2006).

In spite of recommendations for revised approaches to CIP to promote resilience,

very little has been offered in the literature on how to effectively do this in the

context of critical infrastructure. Many authors have called for urgent research

examining how critical infrastructure can organise for resilience (see for example,

Dalziell and McManus 2004; La Porte 2005; de Bruijne 2006; de Bruijne and van

Eeten 2007). A recent example of where the concept of resilience was examined in

the context of critical infrastructure was by Boin and McConnell (2007), who

considered how to best prepare for critical infrastructure breakdowns. However,

instead of taking an organisational or system centric approach and examining how to

promote infrastructure resilience, they take a public policy perspective focusing on

how to effectively respond and manage critical infrastructure breakdowns, arguing

that societal resilience is imperative. Amongst the thirteen valuable strategies

outlined for promoting societal resilience, they acknowledge the importance of

Business Continuity Planning which is suggested to be important in promoting

societal resilience in that it assists in the rapid recovery of local businesses affected

by critical infrastructure breakdowns (Boin and McConnell 2007).

While a valid point in terms of societal resilience, the importance of BCM to the

protection of critical infrastructure systems, through its ability to build resilient

capacities which may prevent catastrophic system breakdowns, has not yet been

acknowledged in the academic literature. Further, Boin and McConnell (2007) also

narrowly focus on Business Continuity Planning and fail to recognise the importance

of the wider BCM approach in building resilient capacities. Business Continuity

Planning is often cited as an important element of good CIP practice; however, this

forms just one element of a comprehensive approach to BCM. Although various

National Standards (e.g. HB221:2004; BS 25999-1:2006) have clearly acknowledged

the difference between the two separate functions (Standards Australia 2006a), the

scholarly field still appears largely unclear about the distinction between the two

terms and the merits of the overarching BCM approach for effective CIP.

More recently, however, in a presentation to the 1st Australian Security and

Intelligence Conference, Griffiths (2008, 44) acknowledged the importance of Risk

and BCM to CIP, highlighting that the ‘consequences of risks that threaten continuity

must be given serious consideration’. This is because BCM guarantees ‘the

availability of processes and resources in order to ensure the continued achievement

of critical objectives’ (Standards Australia 2004a, 4), which Griffiths (2008) argues

is a necessary requirement for all forms of critical infrastructure. Due to the societal

consequences that could be incurred from disruptions to critical infrastructure service

provision, BCM should thus be viewed as a valuable management tool for

organisations operating and managing critical infrastructure as a means to build

resilient capacities and ensure service reliability amidst an increasingly uncertain

operating environment. Despite this acknowledgement, BCM remains under-

recognised in the academic literature surrounding CIP and is still often under-utilised

in organisations.

The previous discussion highlighted limitations of conventional Risk and Crisis

Management approaches to CIP acknowledged in the academic literature and

provided alternative directions for future research and practice by way of BCM.

Another important contribution to the academic literature surrounding CIP has been

the work by theorists in organisational reliability, advancing our understanding of

these complex systems and improving their management. Interest in organisational

reliability – the ability of organisations to manage hazardous technical systems safely

and without serious error – has grown dramatically in recent years, particularly in the

context of critical infrastructures (Schulman et al. 2004; Roe et al. 2005). This has

started to have a major influence on the direction of academic research into

improving the protection of these lifeline systems.

Like BCM, the HRT literature holds significant potential for improving the

protection of critical infrastructure systems by building resilient capacities. The

authors of the high reliability school contend that the characteristics outlined earlier

in this Chapter (see Table 2.4 – page 21) combine to make this special group of high-

risk organisations highly reliable, with such properties considered particularly

attractive for critical infrastructure operations (Boin et al. 2003; La Porte 2005). In

fact, a number of critical infrastructure organisations have been found to

independently possess these characteristics, which have been examined and

empirically validated in a number of critical infrastructure sectors. The sectors

studied include electric power generation and distribution (Schulman et al. 2004; Roe

et al. 2005; de Bruijne 2005), large scale water systems (Roe et al. 2005), and

telecommunication providers (de Bruijne 2006; de Bruijne et al. 2006; de Bruijne

and van Eeten 2007), with the cases assessed found to display a commitment to

resilience and higher reliability configurations.

This approach to management can thus be considered to be a promising avenue for

enhancing the protection and resilience of critical infrastructures by embedding

identified Reliability-Enhancing strategies and conditions across these lifeline

systems (Boin et al. 2003; La Porte 2005). HRT as a structural strategy for improving

long-term CIP has, however, according to La Porte (2005), been neglected in

infrastructure and homeland security policies as a viable avenue for their reliable

management. He suggests that to be effective, an overall CIP policy should include

attention to structural and organisational strategies, and that more research is needed

to learn how HROs develop and what can be done to stimulate their development to

help improve CIP (La Porte 2005).

The previous sections have discussed the influence of organisational issues on the

reliable management of large-scale technical systems, notably critical infrastructure

and the reliability of service provision. It is evident that both BCM and HRT can be

seen to have an integral application to the management and protection of critical

infrastructure systems as a means for developing resilient capacities, and for

achieving high reliability configurations. The following section, however, details a

number of new challenges affecting the reliability of service provision in critical

infrastructures and raises questions regarding the application of HRT and BCM in

institutionally fragmented, networked settings.

Challenges to Sustaining Reliability

Although reliability of service provision is paramount in these lifeline systems,

critical infrastructures are becoming increasingly difficult to manage for reliability

due to some unusual properties, in particular their networked condition characterised

by spatial dispersion, multiple organisations and varied interconnections (de Bruijne

et al. 2006; Schulman and Roe 2007). This is largely attributable to profound and

ongoing changes currently occurring in the organisational and market structure of

critical infrastructure sectors (Schulman et al. 2004). These changes have permeated

into the academic field, with current approaches to CIP being reconsidered as well as

the implications for the governance and management of these vital systems. Amongst the

most fundamental changes to have occurred affecting these systems has been the

increasing interconnectivity within and between critical infrastructure sectors.

Further, the notion of institutional fragmentation, a result of ongoing privatisation,

liberalisation and deregulation which has been occurring in many critical

infrastructure sectors around the world, has placed further pressure on the reliable

management of the systems that provide essential services.

Such changes have dramatically altered the environment in which reliability is

achieved and prompted criticism and questions about current abilities to effectively

safeguard and protect these infrastructures into the future. Dominant theories of

reliability (i.e. NAT and HRT) would suggest that high reliability is unlikely and

normal accidents more likely in these rapidly changing systems (Schulman et al.

2004; de Bruijne 2006). These ongoing developments create the need for a

reconsideration of how best to ensure infrastructure reliability, with academics

claiming that further research is urgently required, ranging from a better

understanding of networks and interconnections, to the impacts of deregulation and

privatisation to ensure their reliable and secure operation into the future (Little 2002;

La Porte 2005; de Bruijne 2006), both of which will now be discussed.

Interconnectivity of Critical Infrastructures

With increasing societal dependence on essential services, the infrastructure systems

that provide them have, with the help of technological advancements, grown in size

and complexity into enormous, large-scale technical networks. While technological

changes have improved the provision of infrastructure services, this has also

substantially increased their vulnerability by making critical infrastructures not only

more complex at the organisational level, but also complexly interconnected at the

systems level, conditions which Perrow’s NAT would suggest increase the likelihood

of normal accidents. Not only are all critical infrastructures complex in themselves,

but they are also increasingly complexly interrelated and dependent on each other’s

constant availability, ultimately increasing the need for higher levels of reliability.

This dependence and interdependence continues to escalate both functionally and

spatially as new capacity enhancing infrastructure technologies are developed and as

urban populations and their supporting infrastructures become more concentrated

(Rinaldi et al. 2001; Zimmerman 2001; Little 2002; de Bruijne 2004; de Bruijne

2006).

With increasing interconnectivity and thus interactive complexity, critical

infrastructures must now be regarded as systems of systems where there is a wider

scale of vulnerability and potential disruption than possible in any single system or

organisation. As critical infrastructure systems grow more interconnected and indeed

interactively complex, they become vulnerable to disruptive events that can

propagate from system to system in a domino-like effect, resulting in unexpected

interactions among systems and producing unanticipated consequences. As a result

of these interactive effects, critical infrastructures are becoming increasingly

vulnerable to large-scale, cascading failures both within and across sectoral

boundaries (Amin 2001; Amin 2002; Little 2002; Little 2003; Jiang and Haimes

2004; La Porte 2005; de Bruijne 2006; International Risk Governance Council 2006).
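
The domino-like propagation described above can be illustrated with a minimal sketch. It is not drawn from the cited literature: the dependency map and the sector names in it are hypothetical, and the model assumes, as a deliberate simplification, that any system relying on a failed system also fails.

```python
from collections import deque


def cascade(dependents, initial_failures):
    """Return the set of systems that fail once the initial failures propagate.

    `dependents` maps each system to the systems that rely on it, so a
    failure spreads along those edges (breadth-first) until no new system
    is affected. Assumption: every dependency is a hard dependency.
    """
    failed = set(initial_failures)
    queue = deque(initial_failures)
    while queue:
        system = queue.popleft()
        for downstream in dependents.get(system, ()):
            if downstream not in failed:
                failed.add(downstream)
                queue.append(downstream)
    return failed


# Hypothetical dependency map, for illustration only.
DEPENDENTS = {
    "electricity": ["telecommunications", "water"],
    "telecommunications": ["banking"],
    "water": [],
    "banking": [],
}

if __name__ == "__main__":
    # A failure in one sector reaches sectors that never depended on it directly.
    print(sorted(cascade(DEPENDENTS, ["electricity"])))
```

In this toy map, a failure that starts in the electricity system also reaches banking, although banking depends on electricity only indirectly through telecommunications. This is the kind of cross-sectoral effect that a single-system view of vulnerability would miss.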

The interconnectedness of contemporary infrastructure systems was highlighted by

the blackout that occurred in the North-East of the US in January 1998. An ice storm

reportedly set off a rapid cascading failure across the system that eventually broke

apart the entire synchronised and interconnected North-Eastern US electricity

system, causing blackouts in eleven States and two Canadian Provinces (Amin 2001;

Farrell, Lave and Morgan 2002). This problem was underscored in August

2003, when the North-Eastern US electricity grid failed again, this time reportedly

due to human error (Apt et al. 2004). This situation reinforces the significance of

vulnerability resulting from the interconnectedness of infrastructure systems.

Interconnectivity presents an inherently difficult challenge to the task of CIP, as

mitigating damage and ensuring continuity of these essential services is being further

complicated. One of the most frequently cited shortfalls in knowledge related to

enhancing CIP capabilities is the incomplete understanding of interdependencies

within and between infrastructures. According to La Porte (2005), systems of

interdependent organisations are only as reliable as their least reliable part. Further

exacerbating this interconnectivity has been the trend of institutional fragmentation,

placing greater pressure on the achievement of reliable service provision which is

now being demanded increasingly under networked conditions.
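
La Porte's observation that interdependent systems are only as reliable as their least reliable part can be read as an upper bound, sketched below. The reliability figures are hypothetical. The second function adds an assumption not made in the source: that every part is required for service and that parts fail independently, in which case overall reliability is the product of the parts and falls below even the weakest one.

```python
import math


def weakest_link(reliabilities):
    """Upper bound on system reliability: the least reliable part."""
    return min(reliabilities)


def series_reliability(reliabilities):
    """Reliability when every part is required and failures are independent.

    Both conditions are illustrative assumptions, not claims from the source.
    """
    return math.prod(reliabilities)


if __name__ == "__main__":
    # Hypothetical reliabilities for three interdependent organisations.
    parts = [0.99, 0.95, 0.90]
    print(weakest_link(parts))
    print(round(series_reliability(parts), 5))
```

Under these assumptions, adding organisations to the chain can only hold overall reliability steady or lower it, which is one way of seeing why fragmentation places greater pressure on reliable service provision.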

Institutional Fragmentation

In addition to the increased interactive complexity within and between critical

infrastructures, another important source of change that has taken place with

potential to affect infrastructure reliability has been the significant market reform that

has occurred in many industries since the 1980s (de Bruijne et al. 2006; de Bruijne et

al. 2007; de Bruijne and van Eeten 2007; Griffiths 2008). Within this restructuring

process, developments such as privatisation, liberalisation and deregulation have

been common, rapidly changing the operating environment of networked

infrastructures and leading to increased ‘splintering’, commonly referred to as

institutional fragmentation, in the delivery, management and development of critical

infrastructures (de Bruijne 2004; de Bruijne et al. 2006; de Bruijne and van Eeten

2007).

The provision of essential services was once the exclusive purview of

Governments, traditionally provided through large-scale integrated monopolies,

characterised by centralised, hierarchical control. However, to take advantage of the

widely publicised benefits of market mechanisms (i.e. greater efficiency, better

quality of service provision, reduced Government expenditure), essential services are

now increasingly being delivered in unbundled, competitive markets. As such,

Governments around the world have taken a step back towards a role of coordinating

markets and overseeing systems that are now often owned and managed by others

(Abbate 1999; Rothery 2005; de Bruijne and van Eeten 2007; Griffiths 2008),

although restructuring can take many forms ranging from full privatisation, to

corporatisation.

Not only does institutional fragmentation affect the organisational structure in these

industries, it also influences the way that infrastructures are operated and managed

(de Bruijne 2004; La Porte 2005; Rothery 2005; de Bruijne 2006; de Bruijne and van

Eeten 2007; Garnett and Kouzmin 2007). These changes have created a situation in

which the ownership and management of critical infrastructures is being shifted from

the tightly controlled and regulated hands of bureaucracy, to now be the

responsibility of large numbers of organisations where there is competition between

multiple service providers, in what de Bruijne and van Eeten (2007, 19) refer to as a

‘patchwork of public and private ownership’. Instead of one or comparatively few

public organisations employing large-scale technologies through vertically-integrated

utilities, large networks of organisations with competing interests are now involved

in the management of critical infrastructures and the reliable provision of essential

services. To maintain reliable service provision, fragmented organisations now have

to cooperate and communicate to ensure their actions are coordinated.

The adoption of more horizontal, market- and network-based industrial structures

with decentralisation of tasks and responsibilities has changed the means through

which to ensure reliability, creating problems associated with multi-organisational

coordination and communication (de Bruijne 2004; La Porte 2005; de Bruijne 2006;

de Bruijne and van Eeten 2007). The result, according to de Bruijne (2006, 11), ‘is an

inability, lack of power or lack of authority of any single organisation to compel

others to act’. The responsibility for the reliable provision of vital services in

infrastructures has thus changed from a primarily intra-organisational task to an

inter-organisational challenge. While many experts have recognised that CIP faces

the challenge of dealing with infrastructures that for the most part are in private

hands, few have thought through the implications of the ways in which reliability is

ensured under conditions of institutional fragmentation (La Porte 2005; de Bruijne

and van Eeten 2007).

It seems that many new problems have emerged as infrastructures have been

restructured and opened to market forces. Critics claim that restructuring reduces

safety and reliability and diminishes the quality of services provided by

infrastructures, a point which is reinforced by the growing list of examples of

‘unreliability’ in restructured infrastructures, which includes nearly every type of

infrastructure from railway accidents in Britain, to widespread power failures across

Europe and the US (Amin 2001; de Bruijne 2006). By far the most infamous and

costly event to occur in a restructured industry was California’s Electricity Crisis.

This case will now be discussed as it highlights the potential problems associated

with institutional fragmentation, which is a central consideration of this study.

A Case Example: California Electricity Crisis

In the mid-1990s, many states in the US introduced reform measures to restructure

their electricity industries. Among the first was California, which embarked on an

ambitious full scale reform process to create a sophisticated electricity market that

would make it one of the most competitive in the world and remove industry

inefficiencies that had developed. Restructuring had a huge impact on California’s

electricity market structure with new organisations, markets, and procedures

fundamentally changing the industry from a largely monopolistic, vertically

integrated, and centrally controlled organisational structure into a fragmented

network structure, sustained by highly volatile and competitive markets. What was to

follow became evident in the winter of 2000, with the State experiencing rolling

blackouts, high electricity prices, and unreliability of service provision, plunging it

into an electricity crisis and market meltdown that had severe social and economic

flow-on effects. The crisis left key actors in the electricity sector nearly bankrupt, not

to mention the State of California itself, which ultimately became the lender of last

resort to its nearly bankrupt utilities (Joskow 2001; Goldman, Barbose and Eto 2002;

Wolak 2003; de Bruijne 2006; de Bruijne and van Eeten 2007).

Despite facing severe electricity shortages, service disruptions were largely kept to a

minimum (de Bruijne 2006). According to de Bruijne et al. (2006, 240), the

aggregate amount of load that was shed during the crisis was ‘quite small, accounting

for no more than one hour’s worth of electricity for all residential homes in the

State’. Although service reliability was largely maintained in the end, debate raged as

to whether this crisis was a result of market restructuring (Joskow 2001; de Bruijne

2006). While this debate persists between the proponents and critics of institutional

reform, interest amongst organisational theorists in the effects of institutional

fragmentation on the reliability of service provision in networked critical

infrastructures has grown, with considerations examining how reliable services are

provided under restructured conditions with networks of organisations.

The Importance of Enhancing ‘Networked’ Reliability

As a result of such changes, CIP is no longer secured by a single organisation.

Instead it is increasingly the outcome of many organisations working together in a

concerted fashion (La Porte 2005). Networked conditions, however, have largely

been ignored by both NAT and HRT and are not well addressed in mainstream CIP

literature. Reliability theories had until recently not been applied at the systemic

level, with reliability theorists from both NAT and HRT acknowledging the void (see

for example La Porte 1996; Grabowski and Roberts 1997; Perrow 1999b). Existing

literature had simply listed a number of factors that make it harder to achieve

reliability in a network of organisations than in a single organisation (Grabowski and

Roberts 1996; Grabowski and Roberts 1997; Grabowski and Roberts 1999).

In response to the major changes highlighted above, networked conditions have

emerged as a central feature of modern critical infrastructure industries that needs to

be considered in terms of its influence on the reliability of service provision. A body

of work examining the issue of networked reliability in critical infrastructures has

gained momentum in recent years, with a number of articles and dissertations

published, including examinations of large-scale water systems (Roe et al. 2005),

electricity grids (Schulman et al. 2004; Roe et al. 2005), and telecommunications

networks (de Bruijne 2006; de Bruijne et al. 2006; de Bruijne and van Eeten 2007).

This work has been valuable in advancing our understanding of reliability

management in critical infrastructures using established organisational theories of

reliability. It has also been useful for bringing together the theories of NAT and

HRT, and is a valuable starting point in examining the reliability of critical

infrastructure under changed operating conditions.

Using the California Electricity Crisis (discussed earlier) as his case in point, along

with the mobile telecommunications industry in the Netherlands, de Bruijne (2006)

examined the impact of full restructuring on the reliability of service provision using

established reliability theories, extending HRT and NAT under networked

conditions. Both cases led de Bruijne (2006) to conclude that restructuring does pose

credible and significant threats to the ability to continuously provide reliable

services, with systems being pushed for greater efficiency at the expense of

reliability. Restructuring ultimately exposed the systems and those operating them

to potentially more reliability-threatening situations. His conclusion was that

restructuring most certainly ‘diminished the ability of operators to maintain reliable

service’ (de Bruijne 2006, 237).

The case analyses presented in his work demonstrated how institutional

fragmentation had undermined the preconditions that were theoretically considered

necessary to help those who manage critical infrastructures provide reliable services,

and increased the complexity, unpredictability and volatility of large-scale

socio-technical systems. Thus, his research supports NAT’s theoretically deduced

assumptions that institutional fragmentation increases the complex interactivity of

critical infrastructure technology, but does not significantly affect the electricity

sector’s already tight coupling. Similarly, his work also identified the

reliability-enhancing characteristics distinguished in HRT as being negatively affected by

institutional fragmentation in the way they were expected (i.e. decreased reliability)

(de Bruijne 2006).

But to explain why, by and large, the lights stayed on in California’s networked

environment, de Bruijne (2006) goes beyond NAT and HRT to show why

restructured networks might still be able to provide reliable service provision despite

evidence of a decline in conventional reliability-enhancing conditions. He identified

a number of different conditions that enabled the system to maintain reliability under

trying, new circumstances in an institutionally fragmented, networked setting. Rather

than considering NAT and HRT as universally applicable organisational theories of

reliability, he extends the theories arguing that different conditions exist in

networked infrastructures that help to explain their ability to provide reliability under

extremely demanding conditions (de Bruijne 2006).

de Bruijne (2006) presents a list of what he terms ‘networked reliability’ conditions

that emphasise the importance of real-time resilience and reliability management,

where control room operations feature prominently. Thus, Reliability-Enhancing

characteristics in networks of organisations can be seen to differ from those in

traditional HROs as a result of what he calls ‘the relative shift from anticipatory

long-term planning towards real-time resilience’ (de Bruijne 2006, 390). This finding

has also been supported by Roe et al. (2005), van Eeten and Roe (2002) and de

Bruijne et al. (2006) in other studies examining this issue.

This body of research has provided evidence of major changes in the management of

reliability under networked conditions. It suggests that infrastructure restructuring

and increased interconnectivity have increased the unpredictability and uncertainty

within the systems. The authors argue that policymakers should be aware that

unplanned, unexpected reliability-threatening events in restructured, institutionally

fragmented critical infrastructures will occur, often sooner rather than later. Under such

conditions it is widely advocated that the best possible strategy to maintain reliable

service provision is to prepare these infrastructures to deal with this uncertainty by promoting

system resilience (Roe et al. 2005; de Bruijne 2006; de Bruijne and van Eeten 2007).

The Need for a Resilience-Based Approach to CIP

de Bruijne and van Eeten (2007) and La Porte (2005) argue that institutional

fragmentation and the associated changes in reliability management provide

additional challenges to current CIP policies. As a consequence of the changes,

infrastructure systems are faced with even more surprises with the creation of more

unpredictable and volatile disturbances that may threaten the reliability of service

provision (de Bruijne 2004; La Porte 2005). de Bruijne and van Eeten (2007)

highlight specific characteristics shared by most CIP initiatives that do not fit

comfortably with networked conditions for reliability, one of which is the reliance on

anticipation as the dominant risk strategy for CIP. They instead advocate that a

revised approach with an emphasis on resilience is necessary given the new demands

placed on organisations responsible for ensuring the reliability of critical

infrastructures (de Bruijne and van Eeten 2007).

In light of such changes affecting the way in which reliable service is

provided in critical infrastructure systems, it is evident that current approaches to CIP

appear limited and infrastructures increasingly vulnerable. A revised approach with

an emphasis on enhancing system resilience and better cooperation is clearly needed

for the secure and reliable provision of essential services under such conditions.

Recognising this need, de Bruijne (2006) suggests that future research is needed to

examine how organisations responsible for the reliable management of critical

infrastructures should organise for resilience, particularly in networked settings.

This view is further supported by authors such as Fiksel (2006) and Starr et al.

(2003). Although their work is not focused on critical infrastructure specifically, they

argue that, due to the nature of modern organisations, which are becoming

increasingly interconnected as part of complex industrial value chains, there is a need

for research examining resilience. Accordingly, they suggest that this should not only

be from an organisational perspective, but also as a systems phenomenon drawing on

the ecological concept of adaptive capacity in order to better understand how

networks of organisations achieve resilient outcomes. Such a view can equally be

applied to critical infrastructure systems, which can be considered the most vital

industrial systems.

Further recognising this need to explore the concept of resilience in networked

settings is the work of the Resilient Organisations Group in New Zealand. In their

exploration of organisational resilience and its links to community resilience, this

body of work has been attempting to understand how a group of networked

infrastructure organisations interact to achieve systemic resilience. This exploratory

work has posited initial ideas about conditions contributing to resilient outcomes in

networked settings such as information sharing and collaboration measures, but

suggests that further work in this regard is indeed necessary to better understand how

resilient outcomes are achieved within lifeline systems under networked conditions

(Dalziell and McManus 2004; Seville et al. 2006; Seville 2009). This, in addition to

de Bruijne’s (2006) body of work regarding institutional fragmentation, therefore

provides an interesting platform for exploring the concept of resilience in networked

settings.

Bringing It All Together

Both streams of literature explored in this Chapter have emphasised the need for

building resilient capacities in organisations and indeed critical infrastructure

systems amidst an operating environment characterised increasingly by instability

and uncertainty. Building on this foundation, this research seeks to explore the

concept of resilience and its potential for ensuring the ongoing reliability of critical

infrastructure systems. It was also highlighted that critical infrastructures are

increasingly being delivered under unbundled, networked conditions where the

reliability of service provision is no longer the responsibility of a single organisation

or Government. Accordingly, it is necessary to consider resilience not just as an

organisational goal, but now as a system’s goal.

Recognising this, the current study used two different conceptual frames of resilience

established in the literature – the first an organisational strategy as advocated by

Wildavsky (1988) and the other a systems concept as proposed by Holling (1973),

both of which will be discussed in more detail in Chapter 3. Examining both

approaches to resilience provides a more comprehensive impression and

understanding of a critical infrastructure system’s capacity to function during times

of stress and perturbation, and maintain the reliable provision of essential services.

To do this, the research utilised BCM and HRT, two key practices with

acknowledged capacities to promote resilience which were further identified in this

Chapter as useful avenues for improving the management of organisations within

these complex systems to ensure their ongoing protection and reliability. Similarly,

the current research sought to explore and generate understanding about

characteristics that may be contributing to resilient outcomes at the industry level as

the preceding review highlighted that such characteristics are largely unknown.

The following Chapter will examine, in a deliberate way, inconsistencies and gaps in

the literature highlighted in this review. This treatment will provide the foundation

for establishing the purpose of the investigation and for developing the research

problem and specific questions addressed in the work. Chapter 3 combines and

contrasts the key literary domains to appropriately contextualise and formulate

research issues. Chapter 3 will also detail the theoretical and conceptual bases used

to guide the study.

Chapter 3: Theoretical Framework

Introduction

In our modern industrialised societies, large-scale socio-technical systems are now a

prominent feature, and none are considered more important than the critical

infrastructure systems that support economic security and social well-being.

Accordingly, the issue of CIP has gained prominence in both academic literature and

Government initiatives in recent years, with interest spurred by rising societal

dependence and public concern about the reliability of these essential services. This

is because infrastructure systems are now faced with an increasing spectrum of

threats and challenges that continue to raise questions concerning the efficacy of

preconceived ideas about their preparedness and reliability.

In Chapter 2, strong arguments were noted from relevant literature that critical

infrastructure systems are becoming more complex and interconnected, and are faced

with a growing range of threats increasing their vulnerability. A key element in these

diverse works was that to protect these vital systems and ensure the reliability of

service provision, research is needed to better understand and manage them under

changed conditions. In response, significant contributions have been made

particularly from engineering approaches that employ rigorous methods including

mathematical modelling and other quantitative approaches to examine system risk

and vulnerability, increasing our understanding of these complex systems.

The aim of this Chapter is to present the rationale for the approach to critical

infrastructure reliability used in this study. The approach taken here entails a

departure from previous reliability research that has primarily dealt with engineering

and IT responses to reduce vulnerability in infrastructure systems. This study instead

examined management responses from organisations responsible for the reliable

operation and management of a critical infrastructure system, specifically focusing

on ways of improving approaches to resilience in the industry. This Chapter

examines general themes and critical approaches to the concept of infrastructure

reliability derived from the literature presented in Chapter 2, summarising key

literature gaps. Following this, a conceptual framework supporting the basis and

scope of the study is discussed, as is the relevance of the study to enhanced

understanding of ‘system’ resilience for the electricity sector.

Theoretical Basis of This Study

Research into infrastructure protection to enhance service reliability has gained

prominence in recent years, with increased interest spurred by Government and

societal awareness. According to Little (2003, 64), ‘increasing the resilience and

reliability of critical infrastructure is not purely a developmental problem but one in

which basic research is necessary’. He suggests targeted research efforts have been

‘insufficient’ to date (Little 2003). Resilience is being increasingly recognised as an

essential capacity for critical infrastructures in order to maintain service reliability

amidst an increasingly uncertain and unpredictable task environment, challenging the

applicability of traditional approaches to CIP centred on maintaining stable

operational functionality.

The Notion of Resilience

Current thinking on resilience is a product of theoretical and practical constructs

applied in a multiplicity of disciplines. The available literature touching on resilience

varies across many disciplines, yet does not offer a clear and definitive theoretical and

operational basis, as was highlighted in Chapter 2. The lack of a generic definition

poses challenges for research and inhibits the development of a general

understanding (Manyena 2007). It was, however, useful to explore the concept within

the context of this work to contribute to the development of more robust CIP

programs and practices. The different domains surrounding conceptualisations of the

concept offer something of value when considering the resilience of critical

infrastructures. There is, however, a need for a shift in focus from the protection of

assets and structures through engineering resilience approaches, to the resilience of

organisations and systems which can be examined through the lens of the ecological

and social science conceptualisations of the term. The three varying

conceptualisations of resilience relevant to this investigation are evident in the

following table.

Table 3.1: Varying Conceptualisations of Resilience

| Resilience | Engineering | Organisational/Social Science (A. Wildavsky) | Ecological (C.S. Holling) |
| --- | --- | --- | --- |
| | Efficiency of Function: efficiency, constancy, predictability | Flexibility of Function: resistance, mindfulness, uncertainty | Persistence of Function: persistence, change, unpredictability |
| | Ability to Maintain Stability Near Equilibrium | Ability to Cope with & Respond to the Unexpected | Ability to Absorb Shocks & Adapt |
| | Single Operating State to be Maintained: other operating states should be avoided by applying safeguards & optimal designs (search for equilibrium) | As a Universal Strategy for dealing with Uncertainty: to persist (act reliably) in the face of change by having appropriate institutional structures and resources | Multiple Stability Domains: multiple regimes of behaviour for survival |

(Holling 1996; Fiksel 2006; Folke 2006; McDaniels et al. 2008)

The three conceptualisations of resilience presented here can be split into two

contrasting views of system stability with very different consequences for evaluating,

understanding, and managing complexity and change. On one side there is the

engineering perspective, a view that has traditionally been taken in the protection of

‘as-built’ systems of critical infrastructure functioning within expected parameters.

On the other lie the ecological and social science perspectives. The two latter

conceptualisations have a great deal in common in terms of how resilience is

conceptualised, as both challenge the dominant stable equilibrium view of the

conventional engineering perspective. As highlighted in Chapter 2, socio-technical

entities such as organisations and industrial systems (e.g. critical infrastructures) are

increasingly being viewed as living organisms, because like resilient ecological

systems as described by Holling (1973), they are able to survive, adapt and grow in

the face of uncertainty and unforeseen disruptions (Dekker and Hollnagel 2006; Fiksel

2006). Accordingly, this research investigation sought to move beyond the traditional

engineering view of resilience for CIP, and instead examined the applicability of

resilience from the ecological and social science perspectives as an alternative means

to ensure the reliability of these vital systems.

As highlighted in Chapter 2, the ecological discipline views resilience as a systems

concept. This is founded on the work of Holling (1973) and others where the

system’s ability to absorb and adapt to change (most often as disturbances) is a

critical factor. System resilience in the “Holling-ian” sense is the ability to absorb

perturbations and alter non-essential system attributes in an adaptive response to

adverse circumstances in order to survive by shifting the system into a different

stability domain or another regime of behaviour. Wildavsky’s (1988) view of

resilience has some similarities to the seminal work of Holling, but differs in that it

examines the concept from the human perspective, extending the theory to the study

of organisations. Resilience in the “Wildavsky-ian” sense is presented in terms of

how well an organisation can absorb unexpected challenges. This capacity is based

on a flexible response to danger, by enabling organisations to cope with unexpected

threats and having the capacity to bounce back.

The traditional engineering approaches to infrastructure protection through

anticipatory fail-safe designs aimed at maintaining optimal performance may be

considered to be less effective when dealing with uncertainty, and therefore may not

always ensure system reliability under conditions of stress or disturbance (Holling

1996; Fiksel 2006). Rather than striving to maintain stability near equilibrium, the

ecological and social science disciplines view resilience as the ability to cope with

uncertainty or unforeseen disruptions through ‘adaptive capacity’ (Dekker and

Hollnagel 2006, 3), and thus may be well suited to the protection of critical infrastructure

organisations in contemporary times (McDaniels et al. 2008). Such approaches also

take a more flexible view of resilience, considering the vital social elements of

critical infrastructure systems, in addition to the technical aspects (i.e. the protection

of physical assets and structures) that are a primary focus of the ‘harder’ engineering

approaches associated with built systems that ‘resist’ change, rather than ‘persist’

through change (Holling 1996).

Summary of Limitations in Existing CIP Research

The following section summarises limitations of existing CIP research and highlights

research gaps that this work has sought to address. It covers a range of issues from

the largely technical approach of existing research into critical infrastructure and the

resultant need for qualitative research exploring the emergent phenomenon of

resilience. Furthermore, it highlights the need for the development of a resilience-

based approach to the protection of critical infrastructure, particularly through the

application of BCM and Reliability-Enhancing characteristics (derived from HRT

literature), two recognised resilience-enhancing management practices. Similarly, it

also details the need for research exploring this phenomenon in critical infrastructure

from a systemic perspective.

An Innately Technical Approach

CIP has long been approached from an engineering perspective using a positivist

(reductionist) line of inquiry. The engineering field defines reliability as the

‘probability that an item [system] will perform a required function without failure

under stated conditions for a stated period of time’ (O’Connor, Newton and Bromley

2002, 2). As such, infrastructure reliability has typically been assessed using

deterministic criteria. Engineering approaches are indeed valuable for quantifying

vulnerability and systemic risk, and for advancing system designs through tangible

improvements for higher reliability configurations (Zio 2009). They are important in

improving understanding of complex socio-technical systems, but do exhibit certain

limitations that result from a purely technical perspective. Ensuring comprehensive

reliability as a systems phenomenon, however, requires more than probabilistic

prediction. While probability is a valid mathematical tool contributing to the

measurement of reliability (as a risk-related construct), the full dimension of the term

cannot be captured using these methods alone because they do not adequately

consider the human element (Barnes 2002).
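
The engineering definition quoted above can be made concrete with a minimal sketch. It is illustrative only and is not drawn from the cited sources: it assumes a constant failure rate (the exponential model), which is one common simplifying assumption, and the function name and the figures used are hypothetical.

```python
import math


def reliability(failure_rate_per_hour: float, hours: float) -> float:
    """Probability that an item runs for `hours` without failure.

    ASSUMPTION (not from the source): failures occur at a constant
    rate, so reliability follows R(t) = exp(-rate * t).
    """
    if failure_rate_per_hour < 0 or hours < 0:
        raise ValueError("failure rate and duration must be non-negative")
    return math.exp(-failure_rate_per_hour * hours)


if __name__ == "__main__":
    # Hypothetical figures: one failure per 1,000 hours on average,
    # evaluated over a 1,000-hour stated period.
    print(reliability(0.001, 1000.0))
```

A calculation of this kind yields a single performance figure for the stated conditions. It says nothing about how operators and organisations respond once those conditions no longer hold, which is the gap identified in the surrounding discussion.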

According to Little (2003), the task of making infrastructure systems inherently safer

when stressed requires more than just improved engineering and technology. The

events of September 11 and their aftermath demonstrated that these complex systems

also have critical institutional and human components that need to be understood and

integrated into design and operational procedures (Little 2003). Consequently, a

limitation of such approaches is that they do not examine the importance of non-

technical processes and components that are often inadequately measured or

understood using mathematical models or probabilistic methods.

Notwithstanding these constraints, the use of simulation techniques and analytical

methods such as factor analysis and mathematical modelling does allow effective

processing of raw data and consideration of organisational and human factors,

thus providing considerable insight into the complex nature of the behaviours and

operational characteristics of complex technical systems (Zio 2009). This, however, is

at the expense of contextual and descriptive meaning evident at the socio-technical

interface. Such studies cannot provide a comprehensive picture of how to improve

the reliability of these socially and technically complex systems. A focus on tangible

fixes for technical aspects fails to enhance our understanding of the equally

important social aspect of these essential systems, by ignoring the role that softer

components, such as the social, human and organisational elements and processes,

have in their effective functioning and reliability (Turner 1994).

The limitation of technical approaches is even recognised by engineers Haimes and

Longstaff (2002, 439), who state that while systemic and quantitative risk

modelling, assessment, and management, are useful for the design and operation of

complex technology-based systems, ‘they are far less effective in socioeconomic-

based critical infrastructure systems’. They acknowledge the myriad of dimensions to

the complexity associated with protecting critical infrastructures ‘from the technical,

managerial, organizational, institutional, cultural, and international-political

perspectives’ (Haimes and Longstaff 2002, 440). They go on to argue that modelling,

assessing, and managing the risks facing these infrastructures poses a formidable

task, acknowledging the importance of addressing each dimension so that all aspects

and perspectives can be analysed ‘in a holistic vision’ to make appreciable progress

towards their protection (Haimes and Longstaff 2002, 440).

Going beyond these limitations requires an examination of the socio-cultural

contexts in which reliability is sought in real world settings. Research is necessary to

move beyond a focus on reliability defined in terms of performance figures to better

understand how service provision is actually achieved in infrastructure industries in a

real world context, rather than in contrived settings. The need for research into the

protection of critical infrastructures from the social science disciplines is recognised

in order to better understand both the social and technical processes that contribute to

human capacities that support reliability. In particular, it is necessary to explore how

the organisational processes in these large-scale, socio-technical systems influence

their safety and reliability. To date, very little has been offered in the academic

literature on how to improve institutional capacities supporting reliability

management in critical infrastructures. Different analytical approaches are thus

required that allow access to deeper issues rather than mere shallow representations.

Instead of trying to determine what can be done to enhance the reliability of critical

infrastructures quantitatively, the problem may be recast by considering how

organisations that manage critical infrastructures organise for reliability (de Bruijne

2006). This was the purpose of this investigation which involved studying an

organisation directly, in an in situ context using a qualitative approach.

Defining a Resilience-based Approach to CIP

Theories of reliability from the field of business management, particularly in the

realm of organisational settings, offer promising avenues for examining and

improving reliability outcomes in critical infrastructure, and indeed their resilience.

Well-established theories, HRT and NAT in particular, have been recognised

in the literature for their relevance and application in enhancing the resilience

of critical infrastructures, as was noted in Chapter 2. Such theories offer well-defined

and empirically validated strategies and system characteristics that can influence the

reliable management of large-scale complex systems.

Using the complementary theories of NAT and HRT as their theoretical frame, a

group of authors have recently examined the issue of institutional fragmentation that

is currently affecting many critical infrastructure industries (Schulman et al. 2004;

Roe et al. 2005; de Bruijne 2006; de Bruijne and van Eeten 2007). The work has

examined the impact of this trend on management strategies for high reliability

configurations, and how reliability is achieved under such conditions. They contend

that the reliable management of these systems has shifted from anticipation strategies

based upon long-term planning, elaborate engineering models and standard operating

procedures, towards real-time operations in which experience, communication,

mindfulness and resilience are essential. The authors highlight the importance of

building resilience into critical infrastructure organisations, and that a focus on

preparing and redesigning critical infrastructures to manage specific problems is no

longer enough to ensure reliability in the midst of growing system complexity.

Although this research is a valuable contribution to this emerging field, one limitation

is that it only indirectly addresses the consequences of the changed

institutional and environmental conditions. The authors indicate a significant need

for future research focusing on the new demands placed on organisations responsible

for ensuring the reliable management of critical infrastructure. de Bruijne (2006)

specifically suggests the need for research examining how organisations that bear the

prime responsibility for the reliable management of critical infrastructures should

organise for resilience.

This is a view that is supported by other authors including some from the engineering

field who have proposed a new line of inquiry, termed ‘Resilience Engineering’ (not

to be confused with Engineering Resilience), arguing that systems should now be

made resilient rather than just reliable (see Hollnagel, Woods and Leveson 2006;

Dekker and Hollnagel 2006). Moving away from the stable equilibrium view

associated with conventional engineering approaches, this emergent theme derived

from a safety management perspective similarly views resilience as an adaptive

capacity and considers organisations and socio-technical systems as large living

organisms with varied functional components. Success or resilience from a

Resilience Engineering perspective is based on the ability of organisations, groups

and individuals to anticipate variations in exposure to risk before failures and harm

occur, acknowledging the important role that humans have in ensuring system safety

(Hollnagel et al. 2006).

Further echoing this call for greater attention to complementary studies on resilience

are academics from another area of social science research relevant to the study of

infrastructure protection and reliability - the field of Risk and Crisis Management

(e.g. de Bruijne 2004; Boin and McConnell 2007; McManus 2008). This field has

already made significant practical contributions to the protection of critical

infrastructure with established Risk and Crisis Management processes and

frameworks considered to be integral components of a comprehensive infrastructure

protection program. Such frameworks are considered central to the ongoing

reliability of service provision, as they assist organisations operating and managing

these vital systems in identifying and mitigating threats, and managing processes for

risk reduction. Recognising, however, that organisations are now operating in an

increasingly complex and uncertain environment, limitations of established

approaches and tools of Risk and Crisis Management have been acknowledged in the

literature – limitations which have direct implications for the organisations employing

them.

In support of the activity examining organisational reliability in critical infrastructure

systems, further needs have been defined within Risk and Crisis Management

literature for organisations to better balance anticipation and resilience as a strategy

to reduce the potential for impacts from disturbances in uncertain conditions.

Although a focus on resilience may not be a key issue for all organisations (e.g. those

operating in relatively stable contexts) (Vogus and Sutcliffe 2007), the operating

environment of critical infrastructures indeed makes resilience a necessary strategy.

Despite many academics advocating the need for a greater focus on strategies that

emphasise resilience to better prepare organisations for the unexpected, there have

been few ideas offered in the literature as to how organisations might do this, beyond

suggestions for them to look to the so-called HROs. More generally, Vogus and

Sutcliffe (2007) contend that given the dearth of empirical work exploring resilience

in organisational theory, there are many options open for future research examining

this phenomenon. This absence presents an inherent limitation in current research

examining the reliable management of critical infrastructures that needs to be

addressed.

The Application of Existing Resilience-Enabling Management Strategies

As was highlighted in Chapter 2, HRT and BCM are two recognised resilience-

enabling management tools and strategies that may, if implemented effectively,

improve the reliability of organisations operating within infrastructure systems. The

potential of high reliability operations to enhance Risk and BCM processes in

organisations has not been adequately explored. When used together in

organisations managing and operating critical infrastructures, the approaches may

help to improve capacities to enhance the reliability of these vital systems amidst an

increasingly uncertain and complex operating environment, by assisting

organisations responsible for their operation and management to better prepare for

and manage the unexpected – a critical capacity for organisations to possess in the

modern world. As was noted in Chapter 2, while HRT has been empirically tested

and validated in the academic literature, few studies have been conducted in the

context of critical infrastructure systems (de Bruijne 2006), particularly in

corporatised rather than fully privatised industries. Further, its use as

a practical strategy for improving long-term CIP in homeland security policies has

been neglected.

Similarly, the application of BCM has not been considered in other industry or

organisational contexts outside the IT realm, including critical infrastructures.

However, business continuity practices are generally starting to play a much larger

role practically in ensuring the continuity of organisations since September 11. In

particular, its recognition as a component of an effective CIP program has resulted in

the use of continuity practices becoming more widespread in critical infrastructures.

A lack of clarity, however, as to what constitutes effective BCM can lead to

inconsistencies in its application and the generation of questions regarding its

effectiveness. This is in spite of a well-grounded holistic approach being firmly

established in several National and International Standards and Guidelines.

A Shift to Systems Research

A further limitation of much of the existing research examining the reliability of

critical infrastructure has been that the few previous case studies have

been limited to individual organisations, when infrastructure reliability is increasingly

a function of an entire system – put simply, infrastructure systems are now only as

strong as their weakest link (Dalziell and McManus 2004; La Porte 2005). Little is

known about how large-scale technical systems such as critical infrastructures, and

the organisations that operate them, actually function in unison to promote system

reliability.

According to Roberts and Gargano (1990), societal problems are increasingly framed

in inter-organisational terms, and this research is no exception. Systems of critical

infrastructures have, over the decades, become large and interconnected, with

many organisations contributing to their management. Typically no single

organisation manages the provision of essential services to consumers (de Bruijne

2006). Instead, essential service infrastructures are controlled by multiple

organisations which assume responsibility for outcomes in certain geographically or

technically defined areas. As we now live in a networked society, taking a systems

approach to this research may help us to understand critical infrastructures, and how

they function as networks of interrelated organisations to achieve reliable outcomes

as advocated by de Bruijne (2006).

This notion of networks of interrelated organisations has been further recognised by

the Resilient Organisations group in New Zealand, which has identified some

networked conditions that may contribute to resilient outcomes (as identified in

Chapter 2). Although seeking to understand resilience in organisations and its links

to community resilience, and not always focused on critical infrastructure

organisations specifically, this research has highlighted the need for organisations to

work together to achieve resilient outcomes across traditional boundaries at a

systemic level, suggesting that further research in this regard is necessary to better

understand such conditions (Dalziell and McManus 2004; Seville et al. 2006). This

interesting research body therefore provides scope for further exploration and

analysis of resilient characteristics at the systemic level, particularly when applied in

the context of critical infrastructure systems.

Furthermore, Fiksel (2006) contends that there is an urgent need for research to

better understand the dynamic adaptive capacity of complex systems and their

behaviour during interruptions. Although coming from a sustainability perspective,

he highlights the need to view industrial systems as organic living organisms. This

view thus has application to other industrial systems such as critical infrastructure.

Summary of Literature Gaps

As identified in the critical analysis of the literature in Chapter 2 and reinforced by

the discussion of limitations above, a number of theoretical and practical gaps

relevant to the research are evident and are summarised in the following table.

Table 3.2: Overview of Literature Gaps

The Need for More Complementary Research Using Qualitative Techniques and Taking a Socio-Cultural Approach to Explore how Critical Infrastructures are Managed for Reliability

- A recognised need for further CIP research from the social science disciplines, using a qualitative approach to study the important socio-technical context of these essential systems.
- Necessary to understand how reliability is achieved in organisations by examining how softer, human-based system components such as organisational processes, elements and functions can enhance the reliability of infrastructure service provision.
- This will help to improve understanding as to how the organisational structure and management of these large-scale socio-technical systems influences their safety and reliability, and how organisations that are responsible for the reliable management and operation of these critical systems organise for reliability.

Individual Organisation Focus

The Need for Qualitative Research Examining how Critical Infrastructure Organisations can Organise for Resilience to Ensure Reliable Service Provision

- It has been noted in the academic literature from a number of relevant social science fields that organisations, particularly those operating critical infrastructures, increasingly need to harbour resilient capacities in order to manage the unexpected & maintain service reliability in an age of increasing complexity & uncertainty.
- Although this has been widely recommended, very few solutions have been offered as to how this can be achieved.

The Application of BCM and HRT as Resilience-Enhancing Strategies in the Context of Critical Infrastructures

- Both BCM and HRT are recognised resilience enabling management tools & strategies. They both however need to be explored for their application in the context of critical infrastructures to assist in improving the reliability of service provision.

Industry-Wide, System Focus

The Need for Qualitative Research Examining how Networks of Critical Infrastructure Organisations Promote System Resilience to Ensure Reliable Service Provision

- The need for a systems approach to CIP studies to examine how the reliability of service provision is achieved across an industry, as this is now the task of a network of multiple organisations, rather than any single organisation or entity.

Implications for Research

The aforementioned limitations of the existing literature are central to the subsequent

direction of this research investigation, strongly influencing the research context,

research problem and associated research questions, as well as the appropriate

methodological choice. This research sought to shed light on these gaps through its choice

of theoretical and methodological approach. The investigation has thus been

designed to explore these limitations in order to gain important insight into the

fundamental, yet seemingly neglected issue of how reliable service provision is

achieved in critical systems from a managerial perspective.

The research focused upon the resilience-enhancing strategies employed within and

across organisations operating and managing infrastructure, and how they can

potentially be improved so as to ensure reliable service provision along a critical

supply chain. The following section identifies the research context, research problem,

and finally the associated research questions that guided the direction of this research

investigation.

Research Context

This research examined resilience as a reliability-related phenomenon within critical

infrastructure systems. It focused on the socio-technical processes of critical

infrastructure systems during normal operations that help to build resilient capacities

to maintain functionality under times of stress or perturbation. Thus, there were two

considerations critical to the context of this research: firstly, the emergent

systems phenomenon of resilience, which is central to this investigation, and secondly, the

notion of critical infrastructure.

Critical Infrastructure – The Queensland Electricity Industry

An examination of critical infrastructure was central to this investigation, as research

in the context of critical infrastructures is required to help enhance our understanding

of these complex lifeline systems to ensure their continued reliable management and

operation. While there are numerous infrastructures that are deemed critical, the

electricity industry, as a component of the broader energy critical infrastructure,

constitutes perhaps the most complex and integral critical infrastructure supporting

modern industrialised society. Electric power systems represent the fundamental

infrastructure of modern society, with pressure from various stakeholders to

guarantee its always-on availability, as well as its many intricate connections with

other infrastructures which are dependent on the reliable provision of electricity.

Some of these dependencies are evident in Figure 3.1.

Figure 3.1: Examples of Electric Power Infrastructure Dependencies

[Figure 3.1 elements: Electric Power; Information Technology; Banking & Finance; Water Supply; Natural Gas Production; Oil Production; Transport (road, rail & air); Telecommunications]

Source: Adapted from Rinaldi et al. 2001

Furthermore, like other infrastructures, electricity providers have been faced with significant challenges in recent years, particularly the issue of institutional fragmentation which has altered the conditions under which reliability had traditionally been achieved, as established in Chapter 2. The Queensland Electricity Industry, which provided the context for this research investigation examining the nature of resilience within the industry, has not been immune to this global trend, undergoing a period of significant restructuring in the late 1990s.

Although not fully privatised, the industry was restructured from a vertically-integrated publicly-provided utility, to a corporatised model, with the establishment of six independent Government-Owned Corporations (GOCs) which operate as profitable entities returning dividends to the Government shareholder. A further result of this change has been the introduction of competition into the Generation sector, with this part of the industry now open to investment from private organisations. As a result of the disaggregation, the industry is now comprised of three distinct, yet interconnected functional sectors associated with the production and physical delivery of electricity to homes and businesses throughout the State: Generation, Transmission, and Distribution, as evident in Figure 3.2.

Figure 3.2: Overview of the Structure of the Queensland Electricity Industry

[Figure 3.2 elements: Structure of the Queensland Electricity Industry. Sector 1: Electricity Generation; Sector 2: Electricity Transmission; Sector 3: Electricity Distribution; Sector 4: Electricity Retailers. Groupings shown: The Production and Physical Delivery of Electricity; Sale of Electricity]

As depicted in Figure 3.2, there is also a fourth sector, Retail, which is responsible for the sale of electricity to customers. Although retail is an important component of the electricity industry, this research investigation is limited to the critical stakeholders involved directly in the production and physical delivery of electricity to homes and businesses. Accordingly, electricity retailers were not included in this investigation. Further, for the purpose of this investigation, the industry was split into two sections: firstly, those organisations that manage generation assets (i.e. Generation), and secondly, those that manage the network assets (i.e. Transmission and Distribution).

Resilience as a Construct

The importance of building resilience, particularly in critical infrastructures, has been highlighted in Chapter 2 and reinforced throughout this Chapter. The research questions addressed in this thesis sought to understand how resilient capacities might be generated, not only within individual infrastructure organisations, but also across an industry characterised by networked conditions of interconnected organisations.

In order to better analyse resilience on a conceptual basis, this work applied two constructs of resilience for analysis at both the organisational and systems level. Accordingly, this research investigation firstly examined resilience in a critical infrastructure system from an organisational perspective, and secondly from an industry-wide, systems perspective. As an aid in developing a deeper understanding of how resilience might be achieved in critical infrastructure systems, a conceptual framework proposing the use of the two different approaches was developed. Figure 3.3 displays the framework employed in this research.

Figure 3.3: Conceptual Framework for Characterising Different Approaches to Resilience

[Figure 3.3 elements: two panels, each showing elements labelled A to E. “Wildavsky-ian” Approach: Organisational View of Resilience (Individual Organisation Perspective); Flexibility of Organisations in Face of Danger/Uncertainty. “Holling-ian” Approach: Systems View of Resilience (Industry-Wide, Systems Perspective); Provision of Essential Services]

First, the “Wildavsky-ian” view of resilience, from an organisational perspective, sees resilience as an institutional goal achieved by processes in an organisation’s repertoire that ensure the capacity to flexibly respond to danger and bounce back from unexpected events. The framework also, however, considers the “Holling-ian” view of resilience, which, although from the socio-ecological realm, holds cross-disciplinary weight in the study of infrastructures because reliable service provision is no longer the task of any single organisation, but the responsibility of networks of organisations working in unison. Thus, effective industry performance is the result not only of the actions of individual component parts, but also of how the component parts work collectively towards the goal of system resilience to ensure the reliable provision of services. An analysis using the two different conceptual frames of resilience provides a more comprehensive impression of the industry’s capacity to function during times of stress and perturbation, and to maintain the reliable provision of essential services.

Resilience-Enhancing Characteristics

As indicated in Chapter 2, resilience is an abstract and multidisciplinary concept

which is difficult to operationalise. The variables that contribute to resilient

capacities in complex systems are largely unknown and thus there are few defined

variables that ‘should’ be measured when studying resilience (Cumming et al. 2005).

This research, however, drawing on existing management literature, has posited that

both institutional- and system-level resilient phenomena contribute to institutional

capacities for generating resilient outcomes, where there is a mutually supporting

relationship between the resilience-enhancing factors discussed in Chapter 2, which

is further evident in Figure 3.4. Therefore, to explore the concept of resilience within

a critical infrastructure industry the research drew on BCM and Reliability-

Enhancing Characteristics particularly from an organisational perspective. In

addition, it also sought to explore specific Industry Conditions that may be

contributing to resilient outcomes.

Figure 3.4: Factors/Processes Contributing to Resilient Outcomes

[Figure 3.4 elements: BCM Characteristics (as implemented); Reliability-Enhancing Characteristics (as evidenced); Industry Characteristics (as manifested)]

This research project explored the relationship between these resilience phenomena in critical infrastructure industries, by examining the degree of implementation of BCM and evidence of Reliability-Enhancing characteristics within individual organisations situated within the Electricity Industry, as well as the presence of certain factors supporting integration as manifested at the broader industry level. Key characteristics from each of the identified factors were used to contrast conditions in situ within specific organisations (BCM + Reliability-Enhancing Characteristics), and also across the industry (Industry + Reliability-Enhancing + BCM). The resilience-enhancing characteristics explored in this research as identified in existing literature are evident in Table 3.3.

Table 3.3: Resilience-Enhancing Characteristics

BCM Characteristics

- Implementation of BCM
- Threat Assessment Activities
- BIA
- Testing & Review Activities
- Futures Scenarios
- Role of Senior Management
- Corporate Governance
- Use of Standards
- Integration with Risk Management
- Embeddedness

Reliability-Enhancing Characteristics

Process & Design Characteristics

- Technical Performance
- Flexibility & Redundancy
- Autonomy & Accountability
- Decision Making & Hierarchy
- Training & Learning

Goal & Commitment Characteristics

- Importance of Reliability
- Culture of Reliability
- Commitment to Reliability
- External Oversight

Industry Characteristics (of potential interest)

- Ownership
- Regulation/Oversight
- Investment
- Industry Culture & Commitment
- Collaboration & Cooperation

Research Problem

The research problem investigated was derived to address identified limitations in the

existing literature, so as to explore the fundamental issue of infrastructure reliability

through the lens of resilience, as well as options for its improvement. As such, the

central research problem examined is as follows:

How do networked critical infrastructure systems engender resilience to ensure the

reliable provision of essential services in an increasingly institutionally fragmented

environment?

The research problem centres on the network’s ability to maintain reliability of

service provision in increasingly uncertain times by way of resilient management

practices. Accordingly, this research intended to move beyond the world of reliability

performance figures to address how reliable service provision is actually achieved in

critical infrastructure industries by focusing on how resilient capacities are developed

through appropriate management practices and industry conditions. Chapters 2 and 3,

however, have highlighted the lack of research examining the emerging phenomenon

of resilient critical infrastructure systems from a business management perspective.

With many of the potential variables of interest ill-defined in the literature, the

research problem can be considered to be complex in nature. Accordingly, research

is required to explore and describe the phenomena under investigation, with the

research problem lending itself to an exploratory research design supported by a

qualitative research approach.

Research Questions

In line with the need for an exploratory research design, the research questions for

this investigation are broad in scope in order to provide insight and understanding

into the complex phenomenon of critical infrastructure resilience. As previously

indicated, in order to address the identified limitations and the defined research

problem, the research investigation was divided into two distinct parts that are guided

by separate research questions.

Research Question 1

Firstly, this research investigation sought to examine this notion of resilience from

the perspective of individual organisations within the Queensland Electricity

Industry, in order to better understand how resilient capacities are engendered at the

organisational level within the supply chain.

How do organisations that bear responsibility for the reliable management of

electricity infrastructure organise for resilience?

Research Question 2

The research investigation also sought to examine resilience from the systemic level,

as electricity supply is now the task of multiple organisations working together in

unison towards the goal of reliable service provision. Accordingly, this part of the

research was aimed at better understanding how the organisations, operating in a

networked environment characterised by increasing institutional fragmentation,

engender resilience across the entire supply chain for end-to-end service reliability.

How do networks of critical infrastructure organisations foster system resilience to

ensure the reliable provision of essential services?

Research Outcomes Sought

This research has sought to foster a greater awareness of resilience thinking as

applied to the management of critical infrastructure in general, but also in contexts

where public infrastructure systems have been corporatised. More specifically, it has

pursued a goal to better understand how resilient capabilities can be engendered in

critical infrastructure industries that have undergone a process of institutional

fragmentation. The results are intended to enhance understanding of the application

of resilience-related theory to practice in a range of institutions within the Electricity

Industry. It has also sought to enhance understanding of how systems of critical

infrastructure function when disturbed, and thus impact the resultant reliability of

essential service provision. Such an examination of both organisation- and system-level

resilient functioning contributes to a more comprehensive understanding of this

emergent phenomenon in fragmented critical infrastructure.

Conclusion

This Chapter has introduced the important theoretical foundations underpinning this

research, highlighting the central notions of resilience (the “Wildavsky-ian” and

“Holling-ian” frames) and associated resilience-enhancing characteristics to be

explored in the context of the Queensland Electricity Industry. It has also highlighted

a number of literature gaps that this research sought to address, and consequently has

identified the need for qualitative research to elucidate meaning and generate a

clearer understanding about this emergent phenomenon in critical infrastructure

systems. This Chapter concluded by providing the research problem and associated

research questions and a discussion regarding the anticipated research outcomes. The

following Chapter will discuss the methodological choice in greater detail.

Chapter 4: Methodology

Introduction

The previous Chapter identified the research problem to be addressed in this research

investigation, in addition to a number of related research issues. Building on this

foundation, this Chapter details the methodology used to address this problem and

the associated research issues. Accordingly, the purpose of this Chapter is to align

the methodological choice with the theoretical position and framework derived from

the literature gaps detailed in the previous Chapter. This Chapter begins with an

examination of the scientific paradigm employed in this research investigation,

followed by a discussion justifying the methodological choice. This is then followed

by an examination of the manner in which rigour is addressed by way of validity and

reliability criteria. The Chapter will conclude with a discussion of the limitations and

ethical considerations pertinent to this research investigation.

Justification of the Scientific Paradigm

It is necessary to understand the philosophical positioning of the research

investigation in order to clarify the appropriate research design and methods.

According to Guba (1990, 17), this philosophical position or ‘basic set of beliefs’

that drives the research investigation may be termed a paradigm (i.e. positivist versus

non-positivist or phenomenological lines of inquiry). This is the framework that sets

the context of the investigation containing the researcher’s philosophical assumptions

at three levels relating to ontology (the nature of reality), epistemology (the

relationship between the researcher and that reality) and methodology (technique

used to investigate that reality) (Guba 1990; Guba and Lincoln 1994).

In essence, it is necessary to consider the inquiry paradigm in a research

investigation as it influences the researcher’s view of the nature of reality, and in

turn, how knowledge about that reality is sought and decisions pertaining to

methodological choice might be made (Guba and Lincoln 1994). Although

important, the selection of an appropriate paradigm is not without contention, with

great philosophical debates regarding the merits of one paradigm over another, and

contention regarding where to draw the boundary lines between them (Patton 2002).

There is also significant variation in the literature in terms of the different types of

inquiry paradigms available to social science researchers. Table 4.1 presents a

number of conventional inquiry paradigms in social science research.

Table 4.1: Categories of Scientific Paradigms and their Philosophical Assumptions

Each paradigm is described by its ontology (reality), epistemology (relationship between researcher and reality) and methodology (technique used to discover that reality).

Positivist Paradigm

Positivism

- Ontology: Naive Realism. Reality is ‘real’ & understandable; a single apprehensible reality.
- Epistemology: Objectivist. Finding truth; absolute objectivity.
- Methodology: Chiefly Quantitative. Controlled experiments / surveys; verification of hypotheses; theory testing / confirmatory (deduction).

Phenomenological Paradigms

Post Positivism (Realism): Critical Realism

- Ontology: Critical Realism. Reality is ‘real’ but only imperfectly & probabilistically understandable (provisionally true).
- Epistemology: Modified Objectivist. Findings probably true; objectivity worth striving for.
- Methodology: Mix Qualitative / Quantitative. Case studies / structural equation modelling; when complex phenomena are already sufficiently understood to warrant attempts at generalisation; triangulation of research issues by qualitative & some quantitative methods.

Post Positivism (Realism): Interpretive (Subtle) Realism

- Ontology: Interpretive (Subtle) Realism. Reality is ‘real’ but imperfect & complex; presumes the existence of an external world in which events and experiences are triggered by underlying mechanisms & structures.
- Epistemology: Inter-Subjectivist. Participate in real world life to understand; belief that abstract things are born of people’s minds but exist independently of any one person.
- Methodology: Qualitative. Instrumental case studies / convergent interviewing; focus groups; study perceptions because they provide a window on to a reality beyond those perceptions; triangulation of research issues by qualitative methods.

Constructivism (Interpretivism)

- Ontology: Relativism. Multiple local and specified socially ‘constructed’ realities; participant’s perceptions are reality; socially constructed reality of nature.
- Epistemology: Subjectivist. Created findings; the subjective world of minds.
- Methodology: Qualitative. Hermeneutical/dialectical: grounded theory; intrinsic case studies; participant’s perceptions studied for their own sake; researcher is a ‘passionate participant’ within the world being investigated.

Critical Theory (Post Modernism)

- Ontology: Historical Realism. Virtual reality shaped by social, political, cultural, economic, ethnic & gender values; crystallised over time; participant’s perceptions are reality.
- Epistemology: Subjectivist. Value mediated findings; no truth or true meaning about any aspect of existence is possible, it can only be constructed.
- Methodology: Qualitative. Dialogic/dialectical; researcher is a ‘transformative intellectual’ who changes the social world within which participants live.

Adapted from: Guba & Lincoln (1994); Hammersley (1995); Seale (1999)

So as not to get caught up in the philosophical debate that surrounds the justification

of an appropriate paradigm, this research investigation considered four of the

mainstream paradigms detailed in Table 4.1, with the realism (post-positivism)

paradigm considered to be the most suitable paradigm for examining the issue of

resilience in critical infrastructure systems. A discussion supporting this selection is

provided in the following section.

Appropriateness of Realism Paradigm

In the literature, the realism paradigm has also been referred to as post-positivism

(Guba and Lincoln 1994), critical realism, and neo-postpositivism (Miles and

Huberman 1994). This paradigm contains elements from both positivism and

constructivism, although there appear to be different opinions in regards to its

position between these two opposing paradigms. For instance, Guba and Lincoln

(1994) describe realism (post-positivism) as the close cousin to positivism. In

contrast, others including Perry, Riege and Brown (1999) and Amaratunga and

Baldry (2001) describe realism as being more closely aligned with other

phenomenological approaches (e.g. other non-positivist paradigms such as

constructivism).

The differences within the literature indicate that there are varying conceptualisations

and quite permeable rather than rigid boundaries to this paradigm. As such these

‘positions’ fit well to a continuum with research efforts placed accordingly

depending on the needs of the investigation. For the purposes of this investigation,

the realism paradigm was employed with a more interpretive persuasion (‘subtle

realism’; see footnote 8), given the aims and needs of the research (Hammersley 1992; Hammersley

1995; Seale 1999). This choice is represented in Table 4.1 (page 85) by the shaded

area.

8 According to Seale (1999, 470) subtle realism ‘involves maintaining a view of language as both constructing new worlds and as referring to a reality outside the text, a means of communicating past experience as well as imagining new experiences’. This is based on the work of Hammersley (1992, 1995) who presents subtle realism as a softer alternative for social researchers seeking a middle way between ‘naïve realism’ and ‘relativism’, and taking a more interpretative approach than critical realism.

Moving away from the naive realism of positivist research, the realism paradigm is

based on the assumption that there is a real external world to discover although

acknowledging that it may only be imperfectly understandable (Tsoukas 1989; Healy

and Perry 2000). That is, the realism paradigm holds that there is a single reality in

line with the positivist approach; however it differs in that it recognises the

importance of individual constructions. Although concerned with the abstract things

that are born of people’s minds, realists believe that reality exists independently of

any one person, and therefore develop a clearer picture of this reality by

triangulating multiple perceptions. Thus, within the realism paradigm perceptions are

a window onto reality, with this approach acknowledging the difference between the

world and particular perceptions of it (Healy and Perry 2000).

The purpose of this investigation is to better understand a complex socio-technical

phenomenon in a networked system (i.e. resilience in an electricity supply chain) that

is occurring in a natural, real world environment involving humans and their real life

experiences. Critical to this is an examination of the perceptions of those within this

networked critical infrastructure system in order to generate a clearer picture of the

reality of the complex phenomenon being investigated that lies beyond individual

perceptions. Accordingly, it can be said that this research investigation is suited to

the realism paradigm as it does not seek to study participants’ perceptions for their

own sake. Rather it aims to better understand the reality of a complex socio-technical

system as it occurs in its real world setting through the collective minds of those who

experience it (Perry et al. 1999). This is in contrast to the ontological assumptions of

the other three paradigms presented previously in Table 4.1 (page 85).

Furthermore, given the complex, pre-paradigmatic nature of the phenomenon under

investigation that has not yet been fully discovered or comprehended, emphasis lies

more on finding meaning and generating understanding of ideas emerging from the

data analysed. Resilience has been under-examined in the literature from the perspective

of the non-positivist paradigms. Thus a means to measure this socio-technical

phenomenon has not yet been firmly established and this style of examination may

assist in making inroads towards enhancing understanding of the concept.

Accordingly, the research requires a flexible, inductive approach involving

active participation in the real world so as to better understand and express its

emergent properties and features by interacting with informants, without going as far

as the constructivist paradigm focused wholly on the subjective ‘world of minds’

(Perry et al. 1999; Healy and Perry 2000).

Methodological Choice

Such conditions as those discussed in the preceding paragraphs reinforce the

appropriateness of the realism paradigm for this investigation, positioning the

research somewhere between the stark objectivity of positivism and the pure

subjectivity of constructivism (as was evident in Table 4.1 – page 85). As the choice

of methodology is guided by the ontological and epistemological assumptions, there

is a limited range of relevant methodological choices that will provide the best fit for

the realism paradigm, these being in-depth interviewing, focus groups, case studies,

and in some circumstances surveys and structural equation modelling, as is evident in

the following figure.

Figure 4.1: Appropriate Methodologies by Paradigm

Source: Perry (1998)

Each research strategy provides a means for collecting and analysing empirical

evidence, and all have their own advantages and disadvantages (Yin 2003). For the

purposes of this research investigation, a qualitative case study methodology was

considered to be most appropriate for a number of reasons which will be discussed in

turn.

The Case Study Approach

A case study is a strategy for conducting research which involves an empirical

investigation of a particular contemporary phenomenon within its real-life context

using multiple sources of evidence (Yin 2003). This is particularly valuable when the

boundaries between the phenomenon and the context are not clearly delineated,

therefore acknowledging the importance of contextual conditions to the phenomenon

under investigation. The case study methodology allows for investigation of a

complex situation in which there will be many more variables than data points, and

therefore relies on multiple sources of data and also benefits from prior development

of theoretical propositions to guide data collection and analysis (Yin 1994).

Accordingly, the case study methodology is a common strategy in business research

(Yin 2003), and has become an increasingly important qualitative approach in many

management disciplines (Gummesson 2007; Lee, Collier and Cullen 2007). It can be

either qualitative or quantitative in approach or a combination of both depending on

the nature and aims of the research investigation (Perry 1998); although for the

purpose of this study a purely qualitative approach has been employed.

Overarching Justification for this Methodological Approach

In addition to the philosophical arguments supporting the use of the realism

paradigm, there are a number of other conditions justifying the use of a

qualitative case study methodology as an appropriate methodological choice for this

investigation. These include the nature of the research aims, the pre-paradigmatic

nature of the research area, the need for deep understanding about the phenomenon,

the nature of the research issue and associated questions, the nature of the

phenomenon under investigation (a complex, contemporary phenomenon occurring in

a real life setting), and the need for multiple levels of analysis, which will each be

discussed in turn.

Lack of Existing Empirical Research – Pre-Paradigmatic Stage

As alluded to in the preceding discussion regarding the philosophical assumptions of

this investigation, the research area is pre-paradigmatic in nature as there is limited

existing empirical research examining the phenomenon of resilience in critical

infrastructure organisations. Clear measurement parameters for resilience do not

exist, thus the research area is not yet in a position to be developing and testing

hypotheses to generate concrete empirical evidence by way of deductive reasoning.

Instead, it is still in the stage of theory development, with the investigation guided by

research questions focused on observing phenomena within the electricity industry in

order to better understand resilience-enhancing strategies and processes that may

contribute to the reliability of service provision.

Therefore, given that the research area is in the early stages of development, an

inductive approach was adopted that was more exploratory than confirmatory in

nature (Perry 1998). The aim was to uncover patterns that help to explain

phenomena, with themes of interest emerging from informant experiences and

viewpoints (Perry 1998; Cavana, Delahaye and Sekaran 2001; Yin 2003). However,

in studies such as the one presented here, where some understanding has been

achieved but where theory building is still necessary, a purely inductive approach is

unsuitable, as prior theory (albeit limited) can have a pivotal function in the design of

the case study and in the analysis of its data (Perry 1998). The position of this

research investigation as a more exploratory rather than explanatory (confirmatory)

study is evident in Figure 4.2.

Figure 4.2: Position of Case Study Research

[Figure not reproduced. Axis labels: ‘Level of prior theory used in data collection’ (None to High) and ‘Number of Cases’ (from 0); the plot distinguishes Confirmatory from Exploratory research.]

Source: Perry (1998)

It has been argued that using both inductive and deductive reasoning offers

synergistic benefits as a purely deductive approach may limit the development of

new theories which may be important for the field of study, whilst purely inductive

research can prevent the utilisation of established theory that may be of potential

value to the researcher (Parkhe 1993; Miles and Huberman 1994; Perry 1998).

Although a more inductive research design was suited to this investigation given the

dearth of existing empirical research examining the phenomena, this research has not

sought to generate theory from data alone as some theory exists influencing the

direction of this research.

Accordingly, the investigation required the use of a methodology which could

facilitate the development of knowledge in a relatively poorly understood area by

way of inductive theory building, providing enough flexibility to change direction

with the research issues during the fieldwork process, whilst using some deductive

reasoning based on prior theory to inform the direction of the research investigation

and provide basic structure. Thus, the case study method is considered well suited for

inductively building a rich, deep description of new phenomena where there is

uncertainty in the definition of constructs (Benbasat, Goldstein and Mead 1987;

Perry et al. 1999; Christie et al. 2000; Voss, Tsikriktsis and Frohlich 2002;

Eisenhardt and Graebner 2007).

An Interpretive, Qualitative Approach to Better Understand the Phenomena

With the immature state of knowledge of resilience in critical infrastructure systems,

the current investigation required an approach that assisted in developing a deeper

understanding of this complex phenomenon by yielding detailed description of socio-

technical interaction as it occurs in its real world setting. Accordingly, a qualitative

case study methodology was considered most appropriate to assist with theory

development rather than theory testing and verification (Tsoukas 1989). In the early

stages of theory development, where phenomena are not well understood and the

relationships between them are not known, linear and rigid quantitative research

methods are considered inappropriate as they fail to enhance understanding. In

contrast, theory building techniques such as qualitative case studies enable a more

in-depth naturalistic inquiry, supported by a flexible emergent design that is

discovery and process orientated (Yin 1994; Perry et al. 1999; Amaratunga and

Baldry 2001; Cobb and Forbes 2002; Patton 2002; Creswell 2003).

The primary objective of case studies and related qualitative research is to better

understand the phenomenon being investigated, and interpret the respondents’

experiences and beliefs in their own terms, therefore making them appropriate for the

purposes of this research and consistent with the realism paradigm (Gilmore and

Carson 1996; Sobh and Perry 2006). This approach allowed the researcher to

uncover important themes, patterns and interrelationships within and across cases as

the fieldwork unfolded. It also involved personal engagement with the phenomenon

and those individuals involved directly with it to obtain the necessary depth, insight,

and rich understanding (Cavana et al. 2001; Patton 2002).

The Nature of the Research Issue, Questions and Phenomenon under Investigation

Further supporting the use of a qualitative case study methodology over other

alternatives is the nature of the research issue and associated research questions,

which according to Yin (2003) strongly influence the selection of an appropriate

methodology. The case study is the preferred methodology when ‘how’ or ‘why’

research questions are being posed in unexplored research areas, as this approach

allows for sufficient depth to capture the contextual richness to effectively address

the level of understanding required by this type of question (Benbasat et al. 1987;

Patton 2002; Rowley 2002; Yin 2003; Eisenhardt and Graebner 2007). Both Patton

(2002) and Creswell (2003) suggest that such questions require an in-depth study to

understand the issues surrounding the phenomena, therefore supporting the use of a

qualitative approach. Given that this investigation sought to answer ‘how’ questions

to increase understanding about the phenomenon of resilience occurring in critical

infrastructure systems, a qualitative case study methodology was considered to be

particularly well suited for the purposes of this research.

According to Yin (2003), the case study methodology is deemed most appropriate

when the phenomenon under investigation is considered to be complex in nature, is

contemporary and not historical in focus, and is occurring in a real world setting where

the researcher has little control over the behaviour or events. He also considers it to

be particularly relevant over alternative methods when multiple levels of analysis are

required, and when the boundaries between the phenomenon and the context are not

easily identifiable (Yin 2003).

Phenomenon Cannot be Separated from Context

The current research examined a complex contemporary phenomenon (resilience) as

it occurs in its real world context (critical infrastructure systems). Accordingly, the

two could not be studied independently without consideration of the other as the

context is critical to understanding the phenomenon. This is due to the fact that the

interest of this research lies in understanding how resilience occurs in critical

infrastructure systems so as to maintain the reliability of the services that they

provide. Thus, the boundaries between the phenomenon and the context could not be

clearly delineated, which according to Yin (1994; 2003) makes the case study

methodology the most suitable approach. This condition differentiates case study

research from quantitative techniques, including experimental and quasi-experimental

designs, which aim to isolate the phenomenon under study from its

context (Bergen and While 2000; Eisenhardt and Graebner 2007).

A Complex Phenomenon

The use of a case study approach in order to effectively study complex phenomena is

acknowledged in the literature, because it encompasses a holistic perspective which

is vital in obtaining in-depth knowledge about complex phenomena including

organisational processes and networks, making it a particularly relevant

methodological choice for the purposes of this study (Perry et al. 1999). Given the

complexity associated with the research problem and phenomenon, the case study

methodology is particularly well suited as it allowed the research to investigate a

variety of potential variables of interest, including those that are ill-defined or

unknown in the literature, yet emerged during the research process. It also allowed

for the classification of findings into categories, and the identification of

relationships between those categories (Perry et al. 1999). According to Perry et al.

(1999), for this reason it is particularly well suited to research into networks as the

details uncovered in a case can explore the complexities and processes of people and

organisations that this type of research requires.

A Contemporary Phenomenon Occurring in a Real World Context

Similarly, given the importance of exploring a contemporary phenomenon in its real

world setting, the research lent itself to a qualitative methodology that is flexible in

its approach. Under such conditions, historical methods examining events of the past

can be considered inappropriate to the needs of this investigation. Moreover, given

the level of depth and detail required when studying real life phenomena, in addition

to the lack of control over behavioural events that is characteristic of most real world

research, quantitative methods including controlled experiments were inappropriate.

This is because when working under real world conditions where circumstances are

subject to change and redirection, a more naturalistic inquiry is better suited (Rowley

2002; Yin 2003). A case study approach was considered appropriate as it allowed the

researcher to emphasise the ‘rich, real-world context in which the phenomena occur’

(Eisenhardt and Graebner 2007, 25).

Multiple Levels of Analysis to Understand the Phenomenon

The research problem under investigation also requires multiple levels of analysis,

with the research conducted at two levels: the individual firm level and

the industry level. The purpose of utilising data sources from differing levels in the

system is to build a more comprehensive picture of the nature of resilience across the

system by gaining multiple perspectives and insights. The case study methodology

supports this requirement, as it allowed the researcher to examine relationships

within and between organisations in the system in order to cover all necessary

contexts and develop converging lines of inquiry (Eisenhardt 1989; Yin 2003).

According to Lee, Collier and Cullen (2007), the capacity of case studies to draw

from different data sources to allow several levels of simultaneous analysis of the

dynamics in a single setting creates the potential for a richer understanding of

organisational phenomena than could be conveyed by statistical analysis.

Given these conditions, in conjunction with the complexity and contemporary nature

of the phenomenon being investigated, the case study methodology was considered

most appropriate (Yin 2003). A further strength of this method is its unique ability to

deal with a full variety of evidence, as it offers the flexibility to utilise various

data collection tools, including observation, interviews, questionnaires, and

document analysis, which enable researchers to undertake a deeper examination.

This also enables researchers to obtain a better understanding of the phenomenon

under investigation from multiple perspectives, as well as the ability to triangulate

multiple sources of evidence to further clarify understanding (Amaratunga and Baldry

2001; Yin 2003). The above discussion reinforces the use of a qualitative case study

design within the realism paradigm as the best methodological fit for the purpose of

this research investigation. Such requirements also had a direct impact on the

selection of an appropriate research design, which will be detailed in the following

section.

The Case Study Design

Following the choice of an appropriate methodology, it is also imperative to consider

the methodological design. According to Yin (2003), there are four different types of

case study design: single (holistic or embedded) and multiple (holistic or

embedded). These varying case designs can be distinguished according to the

number of cases they contain (single or multiple), in addition to the number of units

of analysis involved (holistic or embedded). Underpinning the selection of an

appropriate design to suit the needs of each research investigation is the

identification of the unit of analysis (Yin 2003; Stake 2006).

Multiple Units of Analysis

As elucidated earlier, this research is broken into two parts requiring two stages of

fieldwork to examine two approaches to resilience as they contribute to reliability in

a critical infrastructure system. Accordingly, the primary unit of analysis for this

investigation to enhance understanding about the phenomenon was the critical

infrastructure system as evident in Figure 4.3.

Figure 4.3: Description of Units of Analysis

Key                  Description
Phenomenon           Approaches to Resilience in Critical Infrastructure Systems
Unit of Analysis     Critical Infrastructure System
Embedded Sub-Units   Sub-Sectors (Generation, Transmission & Distribution); Organisations

Figure label: Queensland Electricity Industry

As depicted in the figure, the research investigation also requires an examination of

key sub-units embedded within the broader system to address the research questions.

The first is concerned with examining the “Wildavsky-ian”, organisational based

approach to resilience to understand individual firm responses to perturbations, with

firms a necessary sub-unit of analysis. The findings from the organisational-level

analysis then informed the direction of the system-level analysis where the focus

shifted to inter-firm relationships examining resilience from a “Holling-ian” based

perspective to better understand the networked system’s response to perturbations.

The embedded units of analysis will provide an overall aggregate picture of

resilience as it is manifested in a critical infrastructure system. An examination of the

two approaches contributes to an enhanced understanding of approaches to resilience

as applied to the reliability of service provision of a networked critical infrastructure

system.

A Single-Embedded Case Design

Given that the research required multiple levels of analysis within the one critical

infrastructure system, the study therefore lent itself to an embedded single case

design, which allowed the researcher to examine the phenomenon from both the firm

and inter-firm perspective to gain a better understanding of each approach to

resilience for defining pathways for achieving reliability in a broader critical

infrastructure system. The goal was to understand the case within its situational

context. Although this research supports the use of a single case study design, much

of the literature pertaining to the design of case study methodologies encourages the

use of multiple case designs as they are considered to provide a more rigorous and

sound approach to theory building, by way of analytical generalisability achieved

through the process of replication logic (Benbasat et al. 1987; Eisenhardt 1989; Miles

and Huberman 1994; Riege 2003; Yin 2003; Halinen and Tornroos 2005; Eisenhardt

and Graebner 2007).

The management field is not alone in its debate about the value of a small versus

large number of cases; however, even a single case can provide a rich

understanding of a specific phenomenon, given its ability to draw on different sources

of data to allow for several levels of simultaneous analysis of the dynamics within a

single setting (Lee et al. 2007). Thus, the use of a single case can be considered

valuable under certain circumstances and is supported under a number of conditions,

especially when unique contextual or phenomenological opportunities are presented.

For example, Siggelkow (2007, 20) contends that ‘a single case can be a very

powerful example’, particularly when there is interest in gaining inspiration for new

ideas (inductive theory building), or for refining a conceptual argument by providing

rich insight and understanding. Yin (2003) also highlights a number of circumstances

under which a single case design is appropriate including when a case is considered

to represent a typical, a unique, or a critical case, and also when the researcher has a

unique research opportunity to study a case on the grounds of its revelatory nature, or

for a longitudinal study examining the one case at multiple points in time.

The case (the Queensland Electricity Industry) was selected as there was an

opportunity for research access through institution based contacts to explore the

phenomenon in its context, what Yin (2003) refers to as a ‘revelatory case’. This

selection also provided the researcher with an opportunity to shed light on the

conceptual argument presented and maximise what could be learnt about the

phenomenon by yielding detailed understanding. Whilst it would have been useful to

examine more than one critical infrastructure system and employ a multiple case

design, this was not considered feasible given the needs of the research investigation

(i.e. examination of an entire critical infrastructure system with embedded sub-units)

in conjunction with the practical constraints associated with a Masters dissertation

(i.e. time, resources, report length). This is supported by Halinen and Tornroos

(2005), who contend that an embedded single case design is often the only option for

researchers examining networks, given the demands on the researcher to examine

multiple organisations and relationships within the one broader network.

Selection of Embedded Units – The Sample

Although only one case study was examined in this research examination, the

research issue and associated case study design required the selection of embedded

sub-units within the case for investigation. The population from which the sample of

sub-units is drawn comprises those firms operating in Queensland’s Electricity Supply

Industry. As the sample is bound to a particular system in a geographic location,

there were only a limited number of firms from which to choose, operating across the

three sectors within the ‘networked’ system in Queensland as evident in Table 4.2.

Table 4.2: Firms in Queensland’s Electricity Industry

Sector         Number of Firms
Generation     8
Transmission   1
Distribution   2
TOTAL          11

Although there are only a relatively small number of firms operating in the entire

networked electricity system, given the time and resource constraints associated with

a Masters Dissertation in addition to access issues associated with location and

proximity, this research examination could not examine all of the players within the

system. Therefore, although it would have been the most valuable approach,

comprehensive sampling was not a viable option. Accordingly, given the aims of the

research investigation, purposeful sampling was employed to select a small number

of information-rich cases nested in their context for study in depth. Patton (2002)

contends that information-rich cases are those that will enable the researcher to

discover the most about the issues of central importance to a research investigation.

The cases in this investigation were therefore selected to best illuminate the research

issue and associated phenomenon, as well as providing diversity across contexts

(Patton 2002; Stake 2006), that being the different sectors that make up the

Queensland Electricity Supply Industry.

As Table 4.2 highlights, the eleven firms that operate within the industry belong to

three dyads, based on the sectors along the supply chain (i.e. Generation,

Transmission, and Distribution) identified as being critical to the reliability of

electricity service provision. It is necessary to take these dyads into consideration

because, although they are not a specified unit of analysis directly related to the

research questions, they allow a focus on unique contextual conditions for

subsequent analysis and are important in the sampling process so as to provide

sufficient insight and coverage of the research issue across the wider system.

Therefore, the firms have been drawn from each of the different sectors so as to

enable industry-wide and important contextual comparisons.

However, simply taking an arbitrary number of firms from each sector was not a

viable option given that there are only one and two firms in the Transmission and

Distribution sectors respectively. Although there are numerous firms in the

Generation sector, there are just a few dominant players that if perturbed could have

a serious impact on the reliability of service provision. Accordingly, firms were also

selected because they had sector dominance and thus are important to the overall

reliability of the supply chain, which is an important consideration in terms of

network reliability. Such dominant firms within the system can be expected to

display exemplary outcomes with regard to the investigated phenomenon (resilience).

In the case of Transmission and Distribution, where there are a very small number of

firms (i.e. one and two respectively), all were important enough to the overall

reliability of the system to warrant investigation. However, in the case of

Generation, the firms were selected based on their total megawatt (MW) production

supplied to the electricity market, as their role in ensuring the reliability of service

provision can be considered far more significant than the smaller Generation firms.

Based on this criterion, three firms emerged as dominant within the Generation

sector. Collectively they are responsible for almost 75% of the State’s total electricity

production that is supplied into the market, with each operating multiple Generators

at various locations throughout the State. Although each has lost some of its

traditional market share in recent years with a substantial increase in generative

capacity from private providers, all remain responsible for fifteen percent (15%) or

more of electricity supplied into the market and are thus vital to the reliable

functioning of the Generation sector, and indeed the wider system.

The largest individual Generator, contributing around fifteen percent (1,680 MW) to

the State’s total generative capacity, is the Gladstone Power Station, which is privately

owned and operated via a joint venture between Comalco and NRG. Despite its

comparatively large generative capacity, approximately half of its total output (840

MW) is consumed by the adjacent Boyne Island aluminium smelter. Therefore its

relationship and contribution to the wider electricity system is minimal at

approximately eight percent (when adjusted). No other Generation firm comes close

to the identified top three in terms of power supplied into the wider electricity

market, solidifying the importance of these firms to the security of supply in the

wider electricity industry.
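
The dominance criterion described above can be expressed as a short worked calculation. The following Python sketch is illustrative only and was not part of the original sampling procedure: the firm labels and percentage shares are those reported in Table 4.3, the fifteen percent threshold is taken from the text, and the implied State total is an assumption derived from the statement that 1,680 MW represents around fifteen percent of capacity.

```python
# Illustrative sketch only. Firm labels and shares follow Table 4.3; the
# threshold and the Gladstone figures come from the surrounding text.

DOMINANCE_THRESHOLD = 15.0  # percent of electricity supplied into the market

# Shares (percent) reported for the three sampled Generation firms.
generation_shares = {"A": 35.5, "B": 22.2, "C": 15.4}

# Gladstone Power Station: 1,680 MW, of which about half goes to the smelter.
gladstone_capacity_mw = 1680
gladstone_smelter_mw = 840

# Assumption: 1,680 MW corresponds to exactly 15% of State capacity.
implied_state_total_mw = gladstone_capacity_mw / 0.15

# Share of State capacity that Gladstone actually supplies to the market.
gladstone_market_share = (
    (gladstone_capacity_mw - gladstone_smelter_mw) / implied_state_total_mw * 100
)

# Apply the dominant-firm criterion to every candidate generator.
candidates = dict(generation_shares, Gladstone=gladstone_market_share)
selected = sorted(
    name for name, share in candidates.items() if share >= DOMINANCE_THRESHOLD
)
combined_share = round(sum(candidates[name] for name in selected), 1)

print(selected)
print(combined_share)
print(round(gladstone_market_share, 1))
```

Under these assumptions the sketch selects firms A, B and C, with a combined share of 73.1%, and places the adjusted Gladstone contribution at about 7.5%, in line with the ‘approximately eight percent’ noted above.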

Although the sampling strategy was based on the identification of major sectors, it

can be considered to be stratified purposeful sampling combined with criterion

sampling, rather than quota sampling as taking an arbitrary number from each sector

was neither viable nor preferred. Instead dominant firms were purposefully selected

from each sector on their ability to best illustrate characteristics of the phenomenon

(i.e. dominant firm criterion – most integral firms to the reliability of service

provision) in order to give a representative picture and provide sufficient coverage of

the broader system (Miles and Huberman 1994; Patton 2002).

Based on the above discussion, six information-rich firms were purposefully selected

for study so as to understand and explain the conditions under which resilience in

critical infrastructure systems is likely to be found. The selected firms and relevant

selection criteria are detailed in the following table.

Table 4.3: Sampled Firms and Criteria

Firm   Sector                 Public / Private   Sector Dominance (Proportion of Segment)
A      Base-load Generation   Public (GOC)       35.5% (of the State’s generating capacity)
B      Base-load Generation   Public (GOC)       22.2% (of the State’s generating capacity)
C      Base-load Generation   Public (GOC)       15.4% (of the State’s generating capacity)
D      Network Provider       Public (GOC)       33.33% (serving 1/3 of the State’s population with 97% of distribution infrastructure)
E      Network Provider       Public (GOC)       66.66% (serving 2/3 of the State’s population with 3% of distribution infrastructure)
F      Network Provider       Public (GOC)       100% (serving the entire State)

All six firms operate as key players within the same system, under similar regulatory

conditions and are all GOCs, but have been carefully selected from different sectors

so as to provide coverage of the integral nodes of the broader system and allow for

comparisons based on contextual similarities or differences between these sectors

(diversity across contexts).

Informant Selection

Following the identification of the firms sampled for this investigation it was

necessary to identify the participants selected for data collection. Where possible,

two informants were purposefully selected from each of the firms based on their

experience and relevance to the investigated phenomenon to best help understand the

case and the phenomenon. The participants selected also represent various

managerial functions. In each of the identified firms a participant was selected from

the executive/strategic level and another from the operational/technical level to assist

with the analysis of the data. In addition, the investigation also employed a key

informant with significant industry experience but no direct affiliation with any of

the identified firms. The use of a key informant greatly facilitates understanding

about the case (Stake 1995). The data obtained from the key informant provided an

external overview perspective, offering a different view of the phenomenon and also

facilitated triangulation of evidence and data sources to enhance construct and

external validity of the research, by helping to limit bias emerging from the analysis.

The number of informants in this investigation was constrained by competing

obligations and required a trade-off between practical restrictions (i.e. time, resources

and report size restrictions) and the expected reasonable coverage and depth of the

research issue and related phenomenon.

Data Collection Strategies

Data from case studies can be drawn from a variety of sources to enhance the quality

of an investigation. The ability to employ multiple data collection methods is

considered to be a relative strength of the case study methodology, the purpose

of using multiple methods being to obtain a rich set of data surrounding the specific

research issue, as well as capturing the contextual complexity (Benbasat et al. 1987).

The use of multiple tools provides different types of data, which contributed to the

rigour of the study by way of data triangulation so as to partially overcome the

deficiencies associated with employing one method and lead to a more consistent,

objective picture of reality (Cho and Trent 2006). This enabled the cross-checking

and comparison of data collected, as different kinds of data could be brought together

to better clarify various aspects of a phenomenon. The data collection strategies

employed in this investigation included semi-structured in-depth interviews

(individual and small-group), and an analysis of relevant organisational and industry

documents, both of which will now be discussed in detail.

In-Depth Interviews

The primary method of data collection employed in this investigation was the in-

depth, open-ended interview. The interview is widely considered to be one of the

most important sources of qualitative case study information, particularly to gain a

better understanding of reality through the perspectives of those directly involved, as

interviews allow data to be reported and interpreted through the eyes of particular

respondents (Stake 1995; Patton 2002). While there are a number of variations to the

interview strategy that differ in their extent to which the wording and sequencing of

questions are predetermined (Devers and Frenkel 2000; Patton 2002; Cassell 2009),

this research investigation utilised a semi-structured, open-ended approach with

multiple informants to corroborate findings and search for contrary evidence.

It also utilised a combination of individual, one-on-one interviews (Patton 2002) and

small-group interviews, with the small-group sessions utilised in the final stage to

corroborate earlier findings and stimulate deeper discussion amongst participants.

The group interview sessions complemented the individual interviews, providing for

the triangulation of data (Frey and Fontana 1991; Begley 1996). Interview data

obtained from the one-on-one and group interview sessions was recorded by the

researcher in the form of written field notes, but was further supported by the audio

recording of informant responses to assist with the data collection process and

subsequent analysis (Patton 2002).

Given that this research utilised a semi-structured approach to interviews, primary

topics and themes were defined in advance; however, the wording and sequence of

questions were decided upon during the interview (Patton 2002). To provide an initial

structure and familiarise participants with the generic themes to be explored, the

investigation utilised informal questionnaires to support interview sessions one and

two. Participants were asked to rate how extensive their organisation’s preparedness

was on a continuum from Low (limited) to High (extensive) for a range of BCM and

HRT themes identified in the literature. Participant responses served as a guide to the

subsequent interview questions, which began by asking a standard question for each

theme: “Why did you rate your organisation as [High, Moderate or Low] for this

particular theme?” Additional questions were utilised depending on the participants’

initial response in order to elicit more information and explore interesting

comments that emerged. The supporting questionnaires are available in Appendix A,

whilst the interview guide for all sessions is available in Appendix B.

This open-ended, semi-structured approach enabled the researcher to collect

consistent information across all participants by way of prompting questions and

interview themes in order to maintain the relevance of the interview to the research

issue and associated phenomenon. An additional benefit, however, was its ability

to provide informants with the flexibility and autonomy to comment on related

factors and issues, which enabled the researcher to capture responses and information

of which the researcher was not previously aware (Patton 2002). Such flexibility enabled the

researcher to obtain richer data by probing for deeper responses from informants,

providing important insights into the phenomenon. It was the researcher’s task,

however, to ensure that the key interview objectives were met with each

informant to the best of their ability. Additional benefits of this approach lie in its

ability to capture contextual factors, to establish rapport and motivate respondents, to

anticipate and close logical gaps in the data, as well as to clarify questions and clear

doubts (Patton 2002).

The use of this particular approach can be justified for the following reasons. Firstly,

this approach is supported by the research design, which although influenced by

some prior theory, was still largely inductive in nature and thus focused on theory

development rather than theory testing. Furthermore, given that the study was

qualitative in its approach and more exploratory in nature where the aim was

discovery rather than soliciting a predetermined response, an open-ended protocol

was necessary to provide insight and understanding about important issues from the

perspective of individual informants. Similarly, given that there has been limited

research conducted within the research area and the research entailed gaining an in-

depth understanding of resilience in the electricity industry, it required substantial

interaction with those directly involved with this phenomenon, as well as flexibility

in interview protocol to allow for potentially important themes to emerge that may

not have previously been considered (Patton 2002).

Despite its relevance to this research investigation and its many strengths, this

approach is not without its limitations, which include potential for substantial

differentiation amongst responses from varying perspectives as a result of the

flexibility in questioning, reducing the comparability of responses. Furthermore, this

approach is subject to common problems such as bias and poor recall, where

important topics may be inadvertently omitted or incorrectly articulated. Finally,

such an approach is often considered more difficult than a fully structured interview,

particularly for someone inexperienced in conducting interviews due to the

improvisation and mental preparation required. The limitations associated with

interviews were addressed through the development of an interview

protocol or interview guide, as well as by corroborating interview data from multiple

informants and from the accompanying data collection strategy: document analysis

(Hoepfl 1997; Patton 2002).

Document Analysis

A second data collection method highly relevant to the purposes of this research was

employed to strengthen the findings of the investigation. Document analysis is

considered an important source of evidence in case study research (Yin 2003) and

was utilised in this research investigation for a number of reasons. Firstly, this

strategy is considered to provide a rich source of information about organisations,

and is thus considered to be a relevant strategy for case study research, particularly

where the firm is the unit of analysis as was the case in this investigation. Similarly,

this research investigation lent itself to documentary analysis, given that the

phenomenon under investigation (resilience) intrinsically involves planning, process,

and policy based elements which are often comprehensively detailed in written

documentation.

Furthermore, Patton (2002) suggests that documents can be a valuable source of not

only direct information supporting or disproving interview findings, but also indirect

information, as they may serve as stimulus for alternative paths of inquiry that can be

pursued during the accompanying interviews. An analysis of relevant organisational

and industry documents enhanced understanding of the phenomenon beyond what

could be provided from interviews alone. The data obtained from relevant

documentation within the case study organisations provided supplementary detail of

the relevant conditions, processes and strategies which may not have been easily

accessible from interviews given that information obtained from informants may

differ from what is actually happening in practice. This is also true of the reverse

situation, where paper-based procedures and plans may in fact differ from what is

happening in practice (i.e. documentation bias). Accordingly, when used in

conjunction with interviews, document analysis provided an important point of

reference in the triangulation of evidence, helping to overcome the limitations of

both approaches (e.g. potential biases) when used independently of one another,

which further facilitated the illumination of the research questions and associated

phenomena (Miles and Huberman 1994; Patton 1999; Patton 2002).

Data Collected

Although two informants were requested from each organisation to participate in the

data collection process, this was not possible in all cases, with two of the

organisations only having one employee available to participate (Organisations B and

F). Accordingly, this provided an overall sample of ten (n=10) informants, with each

interviewed across four sessions in a combination of individual and small-group

interviews. As evident in Table 4.4, a total of thirty interviews were conducted with

ten participants from the six organisations.

Table 4.4: Interview Schedule (Number of Sessions)

Sector             Organisation  Participant  Session 1     Session 2     Session 3   Session 4
                                              Individual    Individual    Group       Group
                                              (BCM)         (HRT)         (System)    (Follow-up)
Generation         A             A1           (1)           (11)          (19)        (25)
                                 A2           (2)           (12)
                   B             B1           (3)           (13)          (20)        (26)
                   C             C1           (4)           (14)          (21)        (27)
                                 C2           (5)           (15)
Network Providers  D             D1           (6)           (16)          (22)        (28)
                                 D2           (7)           (17)
                   E             E1           (8)           (18)          (23)        (29)
                                 E2           (9)           (19)
                   F             F1           (10)          (20)          (24)        (30)

The iterative interviews took place over the period of four months (May-August

2009). An additional two interviews were conducted with the key informant to

corroborate findings. Similarly, the documents reviewed to support interview

findings are detailed in Table 4.5.

Table 4.5: Relevant Corporate Documentation Utilised

Organisation A    1. Annual Report 2007/08
                  2. Statement of Corporate Intent (2008)

Organisation B    3. Annual Report 2007/08
                  4. Statement of Corporate Intent (2008)

Organisation C    5. Annual Report 2007/08
                  6. Statement of Corporate Intent (2008)

Organisation D    7. Annual Report 2007/08
                  8. Statement of Corporate Intent (2008)
                  9. Network Management Plan (2008-13)
                  10. Summer Preparedness Plan (2008-09)

Organisation E    11. Annual Report 2007/08
                  12. Statement of Corporate Intent (2008)
                  13. Network Management Plan (2009-14)
                  14. Summer Preparedness Plan (2008-09)
                  15. Emergency Management Handbook

Organisation F    16. Annual Report 2007/08
                  17. Statement of Corporate Intent (2008)
                  18. Annual Planning Report (2009)
                  19. Emergency Management Handbook
                  20. Testing Schedule (2009)

Miscellaneous     21. Electricity Act (1994)
                  22. The Electricity Industry Code
                  23. Somerville Report (2004)

Analysis of Case Study Data

The design of the current study was a single case with embedded units employing

qualitative data collection methods. The analysis therefore applied qualitative

techniques to make sense of the voluminous amounts of raw data generated by this

type of study and elicit meaning (Miles and Huberman 1994; Patton 2002). This

involved discerning significant patterns and developing a framework for

communicating the essence of what the collected data revealed about the

phenomenon (Boyatzis 1998; Patton 2002). The focus of data analysis within the

realism paradigm according to Sobh and Perry (2006) should be on interpretations of

the data regarding underlying structures and mechanisms.

Single case studies are typically considered quite restricted in their analytic

approach, particularly within the realism paradigm (Sobh and Perry 2006), as

replication logic cannot be employed to augment and draw strong conclusions. A

primary benefit of an embedded case design, however, is that it allows for some

comparison across embedded cases within the single case (Patton 2002; Yin 2003),

addressing the effects of micro-contextual issues within a macro-context, which is

critical in realism research (Sobh and Perry 2006). In the present study, where there

were cases nested (embedded) within a larger case, the analysis commenced with an

examination of each of the embedded units (within-case analysis). It then proceeded

with cross-case comparisons (pattern matching) across the individual embedded

cases (Patton 2002), in order to understand underlying structures and mechanisms by

examining the effects of the embedded contexts (Sobh and Perry 2006). While it was

important to maintain sensitivity to the context of the larger case, the inductive case

comparison allowed the researcher to deepen understanding of the phenomenon of

interest by also identifying similarities and differences across the embedded units

(Koulikoff-Souviron and Harrison 2006). These combine to give a holistic picture of

the larger system.

In the present study, familiarity and insight were sought by listening to and

transcribing the interview recordings verbatim and reading the transcripts. In addition,

the researcher organised and re-wrote all field notes, and read and made notes on

the documentation consulted (Miles and Huberman 1994;

Patton 2002). The process of organising and condensing the raw data collected

resulted in a carefully constructed, comprehensive write-up of each of the embedded

organisational units to assist with subsequent analysis (Miles and Huberman 1994;

Patton 2002). This process was particularly important in allowing preliminary

insights to develop, illuminating each of the sub-units as a holistic entity within the

larger case and also to gain an understanding of that particular entity as it is situated

(Patton 2002). These initial steps in the process of data reduction and organisation

were undertaken after each stage of data collection (Miles and Huberman 1994). This

formed an integral step in the overall process of content analysis which is important

in inductive studies, and involved discovering the core consistencies and meanings

within the data, commonly referred to as patterns and themes (Patton 2002).

With the aim of identifying emergent patterns and themes from the perceptions

relevant to the external reality being sought (Sobh and Perry 2006), the individual

write-ups of the embedded units were manually coded and categorised to dissect the

data meaningfully. During the coding process recurring regularities in the data about

relevant structures and mechanisms were identified, with these converging lines of

inquiry revealing patterns which were then sorted into categories (themes), some of

which were predetermined from existing literature. Most of the categories, however,

were either refined or developed through the data collection and analysis process to

ensure their meaningfulness and accuracy to the exploratory study (Miles and

Huberman 1994; Boyatzis 1998; Patton 2002; Sobh and Perry 2006). Another

analytical strategy employed was examining the data for divergence (irregularities) –

that is, identifying data that does not appear to fit the dominant identified patterns

(Eisenhardt 1989; Amaratunga and Baldry 2001).
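The coding in this study was carried out manually. Purely as an illustration of the logic described above, the following minimal Python sketch tallies coded segments by embedded unit and separates recurring themes from divergent ones. The theme labels, the unit letters attached to them and the `min_units` threshold are hypothetical assumptions introduced for the example and are not drawn from the case data.

```python
from collections import defaultdict

# Hypothetical coded segments as (embedded unit, theme) pairs.
# These labels are illustrative only and are not taken from the study.
coded_segments = [
    ("A", "testing and review"),
    ("A", "senior management support"),
    ("B", "testing and review"),
    ("C", "testing and review"),
    ("C", "informal workarounds"),
]


def units_per_theme(segments):
    """Map each theme to the set of embedded units in which it was coded."""
    themes = defaultdict(set)
    for unit, theme in segments:
        themes[theme].add(unit)
    return dict(themes)


def split_patterns(segments, min_units=2):
    """Separate recurring themes from divergent ones.

    A theme counts as recurring when it appears in at least `min_units`
    embedded units (an assumed threshold); all other themes are divergent.
    """
    themes = units_per_theme(segments)
    recurring = {t: sorted(u) for t, u in themes.items() if len(u) >= min_units}
    divergent = {t: sorted(u) for t, u in themes.items() if len(u) < min_units}
    return recurring, divergent


recurring, divergent = split_patterns(coded_segments)
```

Under these assumptions, themes found in several embedded units surface as candidate patterns, while themes confined to a single unit are flagged for closer inspection as possible divergence.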

Trustworthiness of Qualitative Case Study Research

The debate surrounding appropriate criteria to measure an investigation’s rigour is

contentious (Seale 1999). Case study research is the target of much criticism in the

scientific community (Parkhe 1993), often criticised for its lack of methodological

rigour, little basis for scientific generalisation, and potential for researcher bias (Yin

2003; Zach 2004). However, Riege (2003) argues that integrity and rigour can be

built into case study research at the research design, data collection, data analysis and

composition stages by employing a variety of techniques across a range of

established design tests.

On the one hand, he mentions the design tests typically associated with quantitative

studies which include construct validity, internal validity, external validity, and

reliability. On the other, he suggests the use of the qualitative techniques for the

corresponding design tests of confirmability, credibility, transferability and

dependability to establish quality in qualitative research designs (Riege 2003) as per

the framework established by Lincoln and Guba (1985). Each approach advocates

particular techniques which are often quite similar. Given the qualitative and largely

interpretive nature of this research, combined with its exploratory orientation and

single case design, efforts have been made to address each of the corresponding

criteria, and in the process have also covered a number of the traditional measures.

Credibility

Credibility is the corresponding design test for internal validity. Yin (2003) contends

that internal validity is only of concern for causal or explanatory studies. Given that

this research investigation was an exploratory qualitative study, the corresponding

qualitative design test of credibility was adopted to enhance the rigour of this

investigation. There are a number of strategies advocated to enhance research

credibility including triangulation, peer debriefing, member checks and researcher

self-monitoring.

In this investigation triangulation has been employed as a strategy to enhance the

credibility of the research findings by providing cross-data consistency checks,

verifying facts, and ultimately strengthening confidence in the conclusions drawn (Patton

2002). Consistent with the realism paradigm, this strategy involves the acquisition

and understanding of multiple perspectives on a single reality to assess the

consistency of findings and develop a deeper, more comprehensive picture. This

research investigation utilised a number of triangulation strategies to demonstrate

convergence (consistency) including triangulation of multiple data sources (data

triangulation), which involved sourcing data from different participants at different

levels in an organisational hierarchy, and also by employing multiple data collection

methods including in-depth interviews (individual and small-group), and document

analysis that provide cross-data consistency checks (Patton 2002).

To further strengthen credibility, this investigation has also employed the technique

of peer debriefing, whereby data analysis and conclusions were discussed with

colleagues on a regular basis to check for consistency (Lincoln and Guba 1985). A

similar technique utilised in this investigation, that is also said to enhance the

credibility of the research findings, is the process of member checking. This process,

according to Lincoln and Guba (1985), is the most crucial technique for establishing

credibility, and involved engaging informants in a review process to examine a draft

of the case report and confirm the information detailed in it, including categories,

interpretations and conclusions. This process enabled modifications where necessary

to correct any unclear or inconsistent aspects (Stake 1995; Patton 2002; Riege 2003;

Stake 2006).

Transferability

Similarly, given that this research investigation was an exploratory, qualitative case

study within a single setting, focusing on an opportunity to study the research issue

within the Queensland Electricity Supply Industry, traditional methods of achieving

external validity (e.g. replication logic) do not apply. Therefore, this research

investigation was not concerned with the generalisability of research findings, but

rather the extrapolation or ‘fittingness’ of findings when transferred to other similar

settings (Patton 2002). This, according to Patton (2002, 584), involves ‘modest

speculations on the likely applicability of findings to other situations under similar,

but not identical, conditions’. Therefore, it was the researcher’s responsibility to

provide dense description of the research context and sufficient descriptive data to

allow other researchers to determine if the study is fit for comparison. To enhance

the rigour of the investigation, transferability has been addressed through a number

of techniques including the establishment of a case study database, which provides a

thick description of each of the embedded cases, with the data documented and

organised for easy inspection. It also uses cross-case analysis (of the embedded sub-

units), to enhance the robustness of findings.

Dependability

Dependability (or trustworthiness) is met through securing credibility of the findings.

The goal of dependability is to minimise the errors and biases in a study. In the past,

case study research procedures have been poorly documented, making external

reviewers suspicious of the reliability of the case study. Accordingly, specific tactics

to overcome the shortcomings detailed in previous case study research have been

employed, including the use of a case study protocol and the development of a case

study database. The researcher has taken as many steps as operationally possible to

enhance dependability such as maintaining an audit trail of major research milestones

and stages so that the activities conducted may be audited by other researchers

wishing to replicate the study or selected parts of it (Lincoln and Guba 1985; Parkhe

1993; Patton 2002; Tobin and Begley 2004).

Confirmability

If a study demonstrates credibility and fittingness, it is also said to possess

confirmability (Lincoln and Guba 1985). Confirmability is a strategy to ensure that

the findings are neutral and free from bias. As mentioned earlier, triangulation

measures (i.e. multiple participants, multiple data collection methods) can assist with

enhancing the credibility of research. This strategy is also said to add to the integrity

and confirmability of the research by strengthening confidence in findings, through

the development of converging lines of inquiry (Patton 2002), by providing a

stronger substantiation of the research constructs (Eisenhardt 1989), by reducing the

likelihood of misinterpretation (Stake 2006), and also by reducing researcher bias

(Tobin and Begley 2004).

Another approach widely advocated in the literature to ensure confirmability is the

use of a confirmability audit (Riege 2003), or more specifically, establishing a chain

of evidence so that auditors can see how conclusions have been made. This is

particularly important as confirmability is enhanced when the integrity of the original

evidence is maintained and is clearly evident in the conclusions drawn. Accordingly,

this research investigation has preserved a strong chain of evidence to enhance the

reliability of information firstly by ensuring that sufficient links have been

maintained with the original data by using direct citations from the case study

database (i.e. the use of interview transcripts and notes made) to allow for cross-

checks of particular sources (Riege 2003). Furthermore, the data collection process

has been accurately documented indicating the circumstances under which it was

collected including reference to time and place, and has also been performed

according to the established case study protocol. Such strategies facilitate easier

cross-referencing between sections and also enable external observers to trace the

steps of the investigation by way of the evidentiary process.

Limitations

A number of limitations associated with the research design have been identified

throughout the Chapter including the use of a single case study design in addition to

the transferability of findings from the research investigation associated with case

studies in a single setting. Whilst the investigation would certainly have benefitted

from the inclusion of another industry case study, this was not feasible due to the

time and resource constraints associated with the preparation of a Masters

Dissertation. This practical constraint was reinforced by the research problem

selected, as the industry analysis required the examination of multiple

organisations within the one broader network. This,

according to Perry (1998), is well within the bounds of what is considered a

reasonable number of cases for postgraduate research. These constraints also, to

some extent, limited the depth and detail of the analysis reported to fit within the

reasonable bounds of a Masters Dissertation. The researcher has attempted, to the

best of their ability, to limit any negative factors intrinsic to the design and

implementation of the study.

Ethics

This research has been conducted in accordance with Queensland University of

Technology’s Research Ethics Process for studies involving human participants. In

line with this process, all respondents were required to give informed consent before

participating in the research investigation. In doing so, all participants were asked to

read and sign an ethical consent form, which highlighted relevant information

regarding the ethical considerations for this project and specified that they could

choose to withdraw their participation from the project at any point in time (see

Appendix C).

Similarly, given that interview sessions were audio recorded so that interviews could

be transcribed verbatim, participants were also asked for consent before this could

occur. A particular ethical concern relevant to this investigation was in respect to

maintaining the anonymity and confidentiality of respondents and their affiliated

organisations. Participant responses have been treated confidentially, and the

researcher, to the best of their ability, has implemented measures to protect the

identities of respondents and organisations. It must be noted, however, that due to

the nature of the industry anyone with a reasonable understanding of its structure and

the organisations within it may be able to ascertain their identities indirectly. To this

end, consideration may be given to seeking an embargo on open publication of the

thesis for a relevant period of time following completion of assessment processes.

Conclusion

This Chapter has discussed the choice of methodology,

putting forth arguments justifying the use of a qualitative case study methodology to

explore the phenomenon under investigation (resilience). It also discussed the data

collection tools and analytical process that were utilised to support this investigation.

The Chapter concluded with a discussion about the trustworthiness of qualitative

research and the measures undertaken by the researcher to enhance the

transferability, credibility, dependability, and confirmability, as well as the

limitations associated with the research design, and ethical considerations relevant to

this investigation. The following Chapter details the results of the data collection

process.

Chapter 5: Results

Introduction

This Chapter presents the results of the data collection process, which utilised a

combination of methods including individual and small group interviews, document

analysis, and supporting questionnaires. The data collected will be used to answer the

overarching research problem identified in Chapter 3.

How do networked critical infrastructure systems operating in an

increasingly institutionally fragmented environment foster resilient

capabilities to ensure the reliable provision of essential services?

To do this it will address two research questions, designed to explore the different

frames of resilience: organisation (“Wildavsky-ian”) resilience and system

(“Holling-ian”) resilience. The first, from a “Wildavsky-ian” perspective, uses

organisations as an embedded unit of analysis, to explore how they use BCM and

Reliability-Enhancing characteristics (as identified in the HRT literature) to

internally foster resilient capabilities. This will involve a within-case, and cross-case

analysis of organisation results.

RQ1: How do organisations that bear responsibility for the reliable

management of electricity infrastructure organise for resilience?

The other, from a “Holling-ian” perspective, uses the industry as a unit of analysis, to

explore how the network of organisations collectively fosters resilience from an

industry wide perspective. This will involve an analysis of industry level data.

RQ2: How do networks of critical infrastructure organisations foster

system resilience to ensure the reliable provision of essential services?

The results are presented in two sections. The Chapter begins with the

findings from the organisational level analysis, and focuses on the results of BCM

capability and HRT capability for each organisation studied. This will include

findings from the within-case and cross-case analyses. The second section presents

the findings of the analysis of whole-of-system factors.

Organisation (“Wildavsky-ian”) Resilience

At the outset of the data collection process all participants were provided with a

questionnaire and asked to rate how extensive their organisation’s preparedness was

on a continuum from Low (limited) to High (extensive) for ten BCM and ten

Reliability-Enhancing characteristics identified in the literature. This was conducted

in order to gauge participant perceptions of their organisation’s BCM and HRT

capability, and served as a basis for the subsequent interview questions. The

questionnaire ratings of the ten participants are evident in Appendix D.

The results illustrate that in general there was consistency between the ratings in the

organisations where two respondents were involved. It is also apparent that all of the

organisations were rated fairly high across the ten themes in most instances (albeit

with a few exceptions). However, when participants were probed further in the in-

depth interviews and asked to justify their responses (e.g. “Why did you rate the

organisation as high for this theme?”), it became apparent that on occasion their

response did not support their rating. For example, in Organisation (C) both

participants rated their preparedness as high for testing and review; however, their

verbal responses did not support this contention, which could indicate gaps in the

organisation’s testing activities. These ratings should be interpreted in light of their

function as an initial means for engaging informants with the content and approach

being taken in the work.

Participant responses from interviews were used in conjunction with an analysis of

available documents from each of the organisations to determine a capability rating

for key BCM and Reliability-Enhancing characteristics examined, as defined in

Chapter 2 and Chapter 3. Results of this analysis were then applied to criteria

reference tables (available in Appendix E and F), which were developed for both

BCM and HRT characteristics based on the existing literature and International

Standards.

This process assisted with the analysis of results for each participating organisation⁹

and provided the basis upon which each agency was rated across the BCM and HRT

characteristics. The criteria tables are designed as scales to provide for a more

consistent rating of the organisations’ capabilities across the BCM and

Reliability-Enhancing characteristics examined, and enable the organisations to be ranked by

maturity similar to a criteria assessment sheet. The BCM criteria scale utilises a five

point scale (High, Moderate-High, Moderate, Low-Moderate and Low) and is based

on recognised International Standards and best practice BCM. In contrast, the

HRT criteria scale utilises a three point scale (High, Moderate, Low) and has been

developed based on the characteristics discussed in the extant HRT literature.
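As a minimal sketch of how the two criteria scales could be represented for analysis, the following Python fragment encodes the five-point BCM scale and the three-point HRT scale as ordered values and ranks organisations by their mean rating. The class and function names, the example organisations and their ratings are hypothetical assumptions made for illustration. Averaging ordinal ratings is a simplification adopted only for this example and is not the procedure used in the study.

```python
from enum import IntEnum


class BCMRating(IntEnum):
    """Five-point BCM criteria scale, ordered from lowest to highest."""
    LOW = 1
    LOW_MODERATE = 2
    MODERATE = 3
    MODERATE_HIGH = 4
    HIGH = 5


class HRTRating(IntEnum):
    """Three-point HRT criteria scale, ordered from lowest to highest."""
    LOW = 1
    MODERATE = 2
    HIGH = 3


def rank_by_maturity(ratings):
    """Return organisation names ordered from highest to lowest mean rating.

    `ratings` maps an organisation name to a list of ratings on one scale.
    Taking the mean of ordinal ratings is an assumption made for brevity.
    """
    def mean(values):
        return sum(values) / len(values)

    return sorted(ratings, key=lambda org: mean(ratings[org]), reverse=True)


# Hypothetical ratings, not taken from the case study results.
example = {
    "Org Y": [BCMRating.MODERATE, BCMRating.LOW_MODERATE],
    "Org X": [BCMRating.HIGH, BCMRating.MODERATE_HIGH],
}
ranking = rank_by_maturity(example)
```

Keeping the two scales as separate types makes it harder to compare a BCM rating with an HRT rating by accident, which matters because the scales have different numbers of points.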

This Chapter will firstly rate each of the organisations against the BCM and HRT

criteria tables based on participant responses in interviews and information contained

in supporting company documentation in order to give a picture of their resilient

capabilities. This will be followed by a discussion of the emergent themes from the

within-case (organisational level) analysis, before presenting the cross-case analysis.

Within Case Analysis

Organisation A

Organisation (A) is one of Queensland’s largest base-load Generators, and

accordingly reliability of their operations is critical to security of electricity supply in

the State. Interviews revealed that the value of BCM to resilient outcomes is well

recognised within the organisation, and findings suggest that it has developed a

mature BCM capability since its initial implementation in the late 1990s, as evident

in Table 5.1.

9 All reliability-related data referenced in this Chapter is derived from annual reports and related

documentation and confirmed during interviews with representatives from each organisation.

Table 5.1: BCM Capability – Organisation (A)

[Table: columns are Use of Standards, Integration with RM, Threat Assessment, Business Impact Analysis, Testing & Review, Governance Structures, Senior Mgmt, Embedded.; rows are the rating levels HIGH, MOD/HIGH, MOD, LOW/MOD, LOW. Cell values not recoverable.]

Equally, it is clear from Table 5.2 that Reliability-Enhancing characteristics, as

identified in the literature, were also quite strong within the organisation. Thus, when

taking both BCM and HRT into consideration, it is apparent that Organisation (A)

exhibits the characteristics/capabilities of a resilient organisation.

Table 5.2: HRT Capability – Organisation (A)

[Table: under Process / Design Characteristics and Goal / Commitment Characteristics, the columns are Technical Perform., Redundanc., Flexibility, Autonomy, Account., Hierarchy/Decision, Training/Learning, Import. of Reliability, Culture of Reliability, Comm. to Reliability, External Oversight; rows are the rating levels HIGH, MOD, LOW. Cell values not recoverable.]

Business Continuity Management

The organisation currently exhibits a high level of maturity in respect to BCM, with

participants indicating that it has strengths across all BCM characteristics examined

(see Table 5.1). The organisation has implemented an enterprise-wide

approach that spans all functions, ranging from the traditional technical

focus, to the corporate and non-core functions. For example, participants described

well established processes that are externally audited and aligned to recognised

International Standards. In addition, formal integration of Risk Management and

BCM activities has been implemented. Similarly, the organisation has a broad and

extensive continuity plan testing regime, a thorough understanding of a broad range

of threats, and it has also conducted an enterprise-wide BIA with an excellent

understanding of critical business processes. This contributes to the mature state of

BCM capability, with BCM implemented consistently across the whole organisation.

The high maturity of testing and review activities is demonstrated by the following

quote:

“[We have an] annual test plan which takes shape not only as a desktop

exercise, but a fully blown real test, that is externally facilitated, involves a

debrief and is audited. This makes the organisation a lot more advanced than

other organisations I’ve worked for. The testing and gap analysis is well

developed and the testing of the BCPs is well understood within the business

units and...is very well anticipated [because] there’s an acknowledgement that

we need to test those plans [to be] as ready as we can for business

interruption.” (A)

Further analysis revealed that this high level of maturity has been achieved through

strong supporting governance structures and a very high level of involvement and

engagement by Senior Management, particularly the CEO, which participants

indicated has been instrumental in embedding BCM within the organisation and

driving the development of a continuity culture. Thus, participant responses

demonstrated that there are no significant gaps in the organisation’s BCM program,

with capability development supported across all areas. Despite this, participants

noted that there was room for improvement in regards to updating the BIA, and that

overall it is a process of continuous improvement where the organisation is always

seeking to enhance its BCM program. This recognition further reinforces the

maturity of their capability, and indeed their resilient capabilities.

High Reliability Theory

Organisation (A) again performs strongly across all Reliability-Enhancing

characteristics with exceptional reliability levels. This is supported by a

demonstrated moderate-to-high level of capability across all Process and Design, and

Goal and Commitment characteristics examined, as evident in Table 5.2.

Process and Design strengths include high sustained technical performance,

supported by a good understanding of the plant, a highly trained and experienced

workforce, combined with rigorous maintenance and asset management regimes of

technical equipment to ensure a high level of quality is continuously maintained.

This is best highlighted by the recent refit of the organisation’s control system “to

ensure plant availability into the future” and that in the 2008/09 financial year the

organisation proactively “invested in major plant upgrades and maintenance to

enhance the efficiency and reliability of generating assets” (Annual Report 2008/09).

Similarly, the organisation has a very high level of in-built physical redundancy by

virtue of its design, as well as a high level of personnel redundancy combining to

ensure quite high levels of flexibility, and therefore resilient capabilities.

Furthermore, participant responses indicated that the organisation has a robust

training and development program for all employees, which is undertaken at the time

of induction and on an ongoing basis, and linked to employee key performance

indicators. This is reflected in the following participant response:

“The induction program is very extensive and there is certainly an ongoing

acknowledgement and recognition of training requirements. All of the

operators go through a full training system offline...so they can practice and

get their skills up offline so that when they do come online there are no

mistakes...no trial and error.” (A)

Similarly, the organisation described a high level of capability across all of the Goal

and Commitment characteristics, including a very high recognition of the importance

of reliability with participants describing the security of supply as paramount and

central to all business decisions. Moreover, the same can be said in regards to an

organisational culture of reliability and a commitment to reliability in mission and

goals, with an acknowledged enterprise-wide understanding of the importance of

reliability and safety, which are driven by articulated goals and corporate values.

This serves to reinforce the importance of reliability and safety to all staff, and is

supported by the organisation’s vision which is “to provide safe, reliable and

efficient energy solutions for the people of Queensland” (Annual Report 07/08).

External oversight was also described as high, with significant oversight and

influence of the organisation’s activities, which participants suggested was by virtue

of it being a GOC. This was not the only driver for commitment to reliable outcomes,

with internal drivers quite strong regardless, as supported by the following quote:

“We’re not doing it for the purpose of complying with them, but we simply see

it as a good business practice. As a GOC we certainly have a number of very

distinct measures that we have to adhere to...[It] makes us cognisant of

course... because we not only have to do the right thing, but we also have to be

seen to be doing the right thing” (A)

The participants highlighted two Process and Design features which could be

considered reasonably high, but where there are still some acknowledged gaps in

capability. For example, in regards to autonomy and accountability the participants

noted that there is still room for improvement with a blame mentality evident,

although the organisation has taken distinct measures in recent years to encourage a

more open and friendly reporting culture. In particular, the organisation has

implemented a formal Whistleblower Protection Policy which formalises the

Organisation’s “commitment to protect the confidentiality and position of employees

who wish to raise matters that affect [the Organisation’s] integrity” (Annual Report

08/09). Equally, participant responses described a very hierarchical, traditional

structure, but one that is able to flexibly respond to incidents, with collegial decision

making and evidence of deference to expertise when required, particularly within

technical teams such as time-critical plant operations.

Summary

It is evident from the preceding analysis that the organisation can be considered

highly reliable with resilient capacities, due to a strong presence of Reliability-

Enhancing characteristics examined and only minor areas noted for improvement.

Similarly, the organisation has a strong capability across all of the BCM themes

examined indicating a very mature BCM program. Overall, the organisation

performs well when looking at both BCM and Reliability-Enhancing characteristics,

indicating a high degree of resilient capabilities.

Organisation B

As a major base-load Generator in Queensland, the reliability of operations is

important for this organisation, both from a supply and a financial perspective.

Whilst the value of BCM is well recognised within the organisation where a formal

structure has been implemented since the early 2000s, the level of maturity is

moderate with limitations due to apparent deficiencies across a number of the

identified themes, as evident in Table 5.3.

Table 5.3: BCM Capability – Organisation (B)

Characteristics rated: Use of Standards; Integration with RM; Threat Assessment; Business Impact Analysis; Testing & Review; Governance Structures; Senior Management; Embeddedness

Rating scale: HIGH; MOD/HIGH; MOD; LOW/MOD; LOW

In contrast, there is clear evidence of a strong presence of Reliability-Enhancing

characteristics within the organisation, with the organisation demonstrating a high

capability across Process and Design aspects, as well as Goal and Commitment

considerations, as is evident in Table 5.4.

Table 5.4: HRT Capability – Organisation (B)

Process and Design Characteristics: Technical Performance; Redundancy / Flexibility; Autonomy / Accountability; Hierarchy / Decision; Training / Learning

Goal and Commitment Characteristics: Importance of Reliability; Culture of Reliability; Commitment to Reliability; External Oversight

Rating scale: HIGH; MOD; LOW

Thus, with both Reliability-Enhancing and BCM characteristics taken into

consideration the organisation’s resilient capacities can be considered to be

moderate-to-high, though there are clear areas of strength and those where there is

room for improvement.

Business Continuity Management

An important observation can be made when comparing the participant’s self-ratings

of BCM variables to interview responses, with clear differences apparent between

perceived preparedness levels (ratings), and the organisation’s actual capability

described in verbal responses. Currently the organisation is at a moderate level of

BCM maturity, with participant responses suggesting an enterprise-wide approach

where the organisation recognises the importance of both technical and corporate

functions. This approach however does not appear to be systematic or

comprehensive, thus resulting in significant gaps in the organisation’s BCM

program. In fact, participant responses did not suggest a high level of capability on

any of the BCM themes examined, indicating that there is room for improvement

across all areas.

As evident in Table 5.3, deeper analysis identified the main gaps

as the use of Standards, the BIA, the role of Senior Management, and some aspects of

corporate governance. Other slightly more advanced areas, where there is room for

improvement, include the integration with Risk Management, threat assessment,

along with the organisation’s testing and review activities.

The participant was unable to identify which International Standard was used to

develop the organisation’s BCM framework and it was acknowledged that it does not

seek to align the process with a recognised standard, primarily because the existing

approach is said to work well. Very much linked to this issue is the apparent lack of

formal integration of BCM with Risk Management processes, although the

participant did indicate that some synergy is being achieved because they personally

manage the two frameworks and attempt to make them as integrated as possible, as

highlighted by the following statement:

“The most powerful integration you can ever do is give both systems to one

person to manage and that’s my role. [I] make them as integrated as

possible... They sit together and I don’t see business continuity as special in

any sense. You go through your risk management process and people on the

ground will be aware of vulnerabilities... when you point out business

continuity risk that will drive them to develop a control like a [BCP] or a more

resilient system without necessarily needing to be explicit about how you bolt

them together.” (B)

Similarly, the failure to apply a recognised BCM Standard is possibly indicative of

why the organisation has not conducted a formal enterprise-wide BIA, which forms

the cornerstone of any recognised approach to continuity management. When asked

about the BIA process as defined in HB221:2004, the following response was given

by the respondent, thus supporting the low-moderate rating for this activity:

“No. Not formally and not in that structure. IT have done that. If you look at

coal supply to the [plant and Payroll] they’ve essentially gone through that

process... but no there’s no formal structure to that.” (B)

Although the organisation has identified critical functions across the organisation,

this approach is not formal and not to a recognised structure. This is a significant gap

in the organisation’s overall approach to BCM and importantly to its overall resilient

capabilities, as the BIA contributes to resilient outcomes of a BCM program by

protecting critical functions.

The organisation’s broader threat assessment activities can certainly be described as

moderately mature, as it takes an enterprise-wide approach to this assessment. It was

quite apparent however that it has a far greater understanding of technical threats to

reliability as compared to other areas such as corporate functions, where a much

simpler approach is taken. Furthermore, whilst the organisation takes a broad

approach to testing and review, it was an acknowledged area for improvement due to

a number of gaps in the organisation’s testing capability. The participant suggested

that testing of BCPs is not done to a high level of frequency or detail, and is thus an

area where they would like to develop a stronger capability.

Another significant gap in the organisation’s approach to BCM was the role of

Senior Management, along with some aspects of governance structures. Participant

responses suggested that Senior Management do not have a strong continual

presence, and involvement is reactive following a serious incident rather than

proactive. It is also limited to higher level crisis and emergency activities such as

testing exercises rather than continuity activities specifically. A similar indication

was also given in relation to governance structures, because although there are robust

structures in place supporting response activities including continuity, they also

appear to be quite reactive in nature with infrequent reporting which occurs more on

a needs basis, rather than proactively. The organisation’s BCM capability could be

enhanced with a more proactive use of governance structures and involvement of

Senior Management.

High Reliability Theory

In relation to HRT, there was strong evidence of Reliability-Enhancing

characteristics within the organisation as well as exceptionally high reliability levels,

with company documentation indicating that plant availability is consistently high. In

fact, in the 2008/09 financial year, the organisation maintained “high reliability, with

an average of 94.53% availability” across its portfolio, 2.6% ahead of budget

(Annual Report 2008/09). Table 5.4 highlights that the organisation

has strengths across Process and Design characteristics including high technical

performance, whereby the participant described a high level of understanding of

plant related activities, an extensive capital works program to maintain plant at

exceptionally high levels, as well as a highly competent and experienced workforce.

Similarly, the organisation has a high level of in-built physical redundancy as the

plant was designed to be highly reliable, in addition to personnel and experience

redundancy, which was not a purposeful design but rather an accidental feature that

they take advantage of (i.e. duplication of functions). The combined features indicate

a high degree of flexibility and resilience.

In regards to hierarchy and decision making, participant responses indicated a very

flexible organisational structure that, despite being quite hierarchical during normal

operations, can devolve into a flatter, decentralised decision making structure under

time-critical circumstances. Furthermore, staff with time-critical roles such as energy

trading or plant operators are empowered to make decisions, and this is supported

by a collegial, team-based environment with a purposefully designed skill mix.

Equally, training and learning are highly regarded, with a range of continuous

improvement measures described such as a developed training system and thorough

processes for reviewing incidents. For example, technical staff are engaged in

ongoing training using a plant simulator to enable them to continuously improve

their awareness and understanding of plant processes, and are also subject to

performance reviews to ensure a high level of competency is maintained. Autonomy

and accountability was rated slightly lower due to acknowledged room for

improvement, but generally this capability is relatively high, with organisational

processes, management involvement, and a safety culture said to be supporting an

open, friendly environment that encourages error reporting and discovery. This is

highlighted by the following quote:

“We’ve got a training program called the Zero Incident Process which every

employee goes through and while it’s mainly targeted at safety it does

encourage all sorts of reporting. One of the key phrases in it is the ‘I see it, I

manage it’! So everyone is responsible for any problem they see. That’s

something we encourage and something we see as very important. Generally

we’re pretty happy but we definitely see areas to improve as well.” (B)

The organisation was strong across all Goal and Commitment themes with a clear

recognition of the importance of reliable outcomes, but balanced with other major

business concerns such as safety and profitability. Furthermore, participant responses

were suggestive of a strong organisational culture of reliability driven by the

organisation’s safety and engineering cultures. This is embedded within the

organisation and contributes to the deep awareness amongst staff of the need for

continuous improvement. Further embedding this culture are the organisation’s goals

that are clearly articulated in the Statement of Corporate Intent, reinforcing the

importance of, and commitment to, high reliability. For example, it states that the

organisation is committed to “ensuring that system security is maintained in

Queensland through continuing to invest in the existing generation portfolio to

improve performance, and ensure reliability and availability is maintained”

(Statement of Corporate Intent 2007/08).

The high rating for commitment to reliability is further suggested by the

organisation’s investment in Reliability-Enhancing activities, such as the capital

works budget, investment in specialist, highly trained technical staff, as well as the

participant’s indication that it is one of the organisation’s core competencies it seeks

to protect, as supported by the following:

“Reliability is really one of our core competencies. There’re the two pillars.

There’s being a smart trader and then there’s having a reliable product to

trade. It’s certainly an area that’s well resourced, and it is core to the

strategic process and the capital planning process. So reliability is vital [and]

financially a very large proportion of our total expenditure after fuel goes into

maintenance to ensure high reliability.” (B)

In addition, although company documentation indicates that the organisation is quite

responsive to external constituents, participant responses suggested the nature of

external oversight is such that it does not necessarily drive behaviour but provides

the framework for key business considerations, including reliability concerns.

Summary

It is evident from the preceding analysis that the organisation can be considered to be

highly reliable with resilient capacities, by virtue of design and process features, as

well as clear organisational goals dedicated to high reliability outcomes. Similarly, it

was apparent that the organisation has a reasonably mature BCM capability, although

there are a number of areas where the capabilities could be improved to enhance its

resilience. Despite identified areas for improvement, it appears overall that there is

clearly a high level of internal resilience within the organisation when taking both

Reliability-Enhancing and BCM characteristics into consideration.

Organisation C

There is evidence of a level of BCM implementation within this Government-owned

critical base-load Generator, with a formal program implemented since 2002. Whilst

the value of BCM is well acknowledged by participants, the current approach

appears unconventional with significant opportunities to improve the organisation’s

overall BCM capability. Similarly, the organisation is also currently experiencing

some issues in terms of its reliability levels which can be linked to deficiencies

across a number of the Reliability-Enhancing characteristics examined. Company

documentation and interviews indicate that whilst the organisation has capabilities in

terms of both BCM and Reliability-Enhancing characteristics, there are key

deficiencies in a number of areas across both spectrums that are contributing to the

company’s current reliability issues and lowering its resilient capacities overall, as is

evident in the following tables.

Table 5.5: BCM Capability – Organisation (C)

Characteristics rated: Use of Standards; Integration with RM; Threat Assessment; Business Impact Analysis; Testing & Review; Governance Structures; Senior Management; Embeddedness

Rating scale: HIGH; MOD/HIGH; MOD; LOW/MOD; LOW

Table 5.6: HRT Capability – Organisation (C)

Process / Design Characteristics: Technical Performance; Redundancy / Flexibility; Autonomy / Accountability; Hierarchy / Decision; Training / Learning

Goal / Commitment Characteristics: Importance of Reliability; Culture of Reliability; Commitment to Reliability; External Oversight

Rating scale: HIGH; MOD; LOW

Business Continuity Management

When comparing participant self-ratings of BCM variables with interview responses,

it became evident that there were differences between perceived preparedness and

the actual capability as described by participants across many themes. In addition,

clear gaps were also apparent in participants’ understanding of core BCM concepts

and what is meant by a mature approach. The organisation’s BCM capability can be

regarded as limited to moderate, as an inconsistent approach is employed, which

appears to be more of a traditional emergency response capability rather than a

contemporary, documented enterprise-wide approach to BCM. Responses

highlighted that there is room for improvement across all themes examined, thus

resulting in the lower maturity rating overall. Whilst the organisation has moderate

capabilities in the areas of integration with Risk Management, embeddedness, and

governance structures, the major areas for improvement are the use of Standards,

threat assessment, the BIA, as well as testing and review, and the role of Senior

Management.

Of particular concern, the organisation’s BCM framework is not aligned to an

International Standard. Instead management relies on advice from business

continuity groups and courses, as highlighted by the following statement:

“We haven’t really obsessed about the standard. We’ve just taken what the

experts think, business continuity groups... we go along to their courses...So in

terms of the rest [other than security threats] we follow AS4360 and have done

for a while.” (C)

It is evident that significant value could be gained by aligning their existing

framework with a recognised International Standard. This significant gap can also be

linked to the moderate integration with Risk Management, because although there is

evidence of integration through threat assessment activities, participants

acknowledged that BCM needs to be elevated within this process so continuity risks

are managed from start to finish. Such limitations can also be linked to the

organisation’s other major shortcoming in terms of this study, the absence of a BIA.

Although threat assessment activities appear relatively robust and supported by an

impressive threat register and management system, they are quite narrow in scope

and there was evidence of an immature connection between these activities and the

development of formal documented continuity controls across the whole

organisation. For example, a participant noted that the organisation does not have

formal documentation in relation to some key administrative processes like payroll

and the trading system, whilst another indicated that this was also evident in core

areas of the plant. Thus, while there is clearly a general awareness of critical business

processes, this has not been addressed formally nor has a formal BIA been

conducted. As indicated earlier, this is a key aspect of internationally recognised

approaches to BCM, and thus a major gap in the organisation’s resilient capabilities.

Furthermore, participants acknowledged that there is room for improvement in the

area of testing and review, particularly in regards to the scope of testing as the focus

is on higher level crisis and emergency response, rather than continuity specifically.

In fact, although participants acknowledged the importance of review activities, the

organisation does not have a regular regime for testing continuity of critical

processes as supported by the following:

“We exercise from an emergency response capability rather than a [BCM]

critical processes... the business continuity planning stuff we don’t tend to

exercise all that much.” (C)

Similarly, whilst Senior Management are actively engaged in higher level emergency

and crisis activities such as testing and live events, their direct involvement in

broader continuity is limited to oversight, as BCM is an operationally rather than top-down

driven process within the organisation, with site managers having the most

involvement in continuity activities. Furthermore, it was also evident that although

there was clear implicit support for BCM activities, the organisation’s tangible

support of BCM is limited, particularly in regards to the resourcing of the function as

the organisation does not have a dedicated BCM manager. This restricts the time and

attention devoted to improving the organisation’s BCM capability.

In contrast, governance structures are reasonably mature with a moderate-to-high

visibility of emergency and crisis management activities, although again participants

acknowledged that continuity specifically could be integrated further into this

process. Thus, it is evident that there are significant gaps in the organisation’s BCM

program where the process could be enhanced to improve overall resilient capacities.

The need to improve was recognised by both participants; it just needs to be

supported by appropriate resourcing and a higher level involvement from Senior

Management.

High Reliability Theory

The organisation has acknowledged challenges to the achievement of high reliability

at present. For example, Participant (C1) stated that, “our reliability is not good and

it is costing us a lot of money every year because we have a lot of forced

outages...That’s what’s driving a lot of what is going on.” Company documentation

further indicates that the organisation is failing to meet its reliability targets with

portfolio performance in the low 90% range (91.8%) for the 2007/08 financial year, a

3.7% drop from the previous financial year (Annual Report 2007/08). Further

highlighting this is the forced outage factor, which in many cases is exceeding the

budgeted level by as much as 14% (6% targeted, 18.1% actual), when on average it

should be below 4% in the medium term (Corporate Presentation – June 2009). This

is due to acknowledged technical issues, which were apparent in the Process and

Design themes examined, most notably technical performance, and flexibility and

redundancy. In contrast, there was a demonstrated strong capability across all of

the Goal and Commitment characteristics.
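
The reliability figures quoted in the preceding paragraph can be compared with simple arithmetic. The following is a minimal Python sketch only: the function and variable names are hypothetical, and it assumes that the reported 3.7% drop in portfolio performance is measured in percentage points rather than as a relative change. It computes the gap between the targeted and actual forced outage factor, and the prior-year portfolio performance implied by the reported drop.

```python
# Illustrative sketch only. The figures are those quoted in the text;
# all names are hypothetical and not drawn from company documentation.


def point_gap(actual_pct: float, reference_pct: float) -> float:
    """Return actual minus reference in percentage points (one decimal)."""
    return round(actual_pct - reference_pct, 1)


# Forced outage factor: 6% targeted, 18.1% actual.
forced_outage_gap = point_gap(18.1, 6.0)

# Portfolio performance: 91.8% in 2007/08, reported as a 3.7% drop.
# Assumption: the drop is read as percentage points, not a relative change.
implied_prior_performance = round(91.8 + 3.7, 1)

print(f"Forced outage factor above target: {forced_outage_gap} points")
print(f"Implied prior-year performance: {implied_prior_performance}%")
```

Under a relative-change reading of the 3.7% drop, the implied prior-year figure would differ slightly, so the second result should be treated as indicative only.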

Participants noted the explicit link between poor reliability performance and

technical performance issues, highlighting a raft of supporting examples such as an

ageing workforce, difficulties recruiting and retaining qualified, experienced staff,

issues with the skill level of technical staff and their performance, as well as

maintenance processes and staff not following established procedures. This link is

indicated in the following participant response:

“What we do day to day, and what we do in the major overhauls, and the time

we take to do it are all contributing to our reliability problems. It’s a

combination of things. We’re trying to tackle the people side, the technical side

with the aim of improving our reliability because it has been sagging.” (C)

Similar issues were raised in respect to redundancy with participants noting that the

problem is two-fold, with both plant and people factors. For example, the

organisation’s newer plant was built with cost considerations in mind, and thus it

does not have the same level of in-built redundancy as some of the older assets. The

following statement highlights this:

“It’s to do with the type of plant too... you can see that [the older plant] is a

hell of a lot more reliable because of [the experienced staff there] and it’s old

style plant with all of the redundancy engineers might hope to have if they had

an unlimited chequebook. But these days it‟s all about maximum bang for your

buck [and] this is what happens.” (C)

Equally, participants described the organisation as quite lean with room for

improvement in the area of job design processes that support experience redundancy.

Although participants noted many examples of initiatives designed to overcome

some of these issues and improve reliability performance, it is clear that the

combination of plant and people issues is contributing significantly to the

organisation’s current reliability problems.

Participants also noted that there was room for improvement in the areas of

accountability and autonomy, training and learning, and decision making and

hierarchy. For instance, it was indicated that although the organisation does not have

a formal ‘Whistle Blower’ process, there is an open environment with a strong

reporting culture for health and safety related issues, which is slowly starting to

expand to other business issues, as reflected in the following response:

“We have a database for reporting safety and environmental incidents, and

that’s being expanded to pick up all incidents. So we are getting there, but

we’re not there yet. Although we have a reporting process, we are not quite

there in terms of the collection of data, analysing the data. We’ve matured that

much more with health and safety than we have operationally and with

security, quality, and the other business issues. We have a level of maturity but

it just hasn’t transferred across all of the other business issues.” (C)

Similarly, reporting is actively encouraged, and in technical areas of the business

this, along with decision making, is supported by a collegial team-based environment

with deference to the level required. Participants described flexible structures and

decision making processes particularly during times of stress, but noted that they are

not strong during normal operations (due to hierarchy), nor is it formal during the

follow-up and review process to capture and share learnings following incidents.

This also affects training and learning, because despite noting the importance of

learning, this process is not as formal or rigorous as it could be. Training is also of

high regard within the organisation, with significant investment to improve employee

skills using centralised training, but this has proven challenging due to complexities

of the new plant.

In respect to Goal and Commitment characteristics, the organisation takes a balanced

approach to the importance of reliability, and has a culture of reliability supported by

health and safety initiatives, although the strength of this culture is said to vary

between sites. This message is further embedded by the CEO and is articulated via

measures such as the organisation’s mission statement, Strategic Plan, Statement of

Corporate Intent, and also in staff performance measures reinforcing the

organisation’s commitment to reliability. Moreover, the current high level of

investment in plant and people is indicative of this commitment, although it was not

the case a few years ago when the ‘problem child’ plant was commissioned, with

different business objectives in this era resulting in some of the reliability problems

the organisation is facing today. Similarly, whilst external oversight is relatively high

overall demonstrated by the level of reporting required by the Queensland

Government (e.g. Statement of Corporate Intent), participants described it more as an

informal pressure and not as important as internal drivers for commitment.

Summary

The preceding analysis identified significant areas for improvement across a number of BCM and Reliability-Enhancing characteristics, indicating clear gaps in the organisation’s internal resilience capabilities at present. This is recognised, and the organisation is currently engaged in a number of activities to further improve its resilient capabilities, particularly in regards to reliability, although the organisation’s resilience would also benefit from further work to improve BCM.

Organisation D

As one of the State’s primary network providers, Organisation (D) is critical to the reliability of electricity supply in Queensland. Participants acknowledged that whilst there has always been a response capability in place due to the nature of their operations, BCM was formally implemented within the organisation in the early 2000s. This capability has evolved over time and is currently at a moderate-to-high level of maturity, with ongoing measures to augment the process further. The organisation’s capability for BCM characteristics is demonstrated in Table 5.7.

Table 5.7: BCM Capability – Organisation (D)

Characteristics: Use of Standards; Integration with RM; Threat Assessment; Business Impact Analysis; Testing & Review; Governance Structures; Senior Mgmt; Embedded.

Rating scale: HIGH; MOD/HIGH; MOD; LOW/MOD; LOW

Equally, Reliability-Enhancing characteristics are evident within the organisation,

although there were a few aspects where there is room for improvement (as evident

in Table 5.8).

Table 5.8: HRT Capability – Organisation (D)

Process / Design Characteristics: Technical Performance; Redundancy / Flexibility; Autonomy / Accountability; Hierarchy / Decision; Training / Learning

Goal / Commitment Characteristics: Importance of Reliability; Culture of Reliability; Commitment to Reliability; External Oversight

Rating scale: HIGH; MOD; LOW

With participants acknowledging the need for building resilient capacities due to the

critical nature of their operations, and evidence of both BCM and Reliability-

Enhancing characteristics, the organisation currently exhibits moderate-to-high

resilience.

Business Continuity Management

Participant self-ratings were consistent with their verbal responses, reinforcing their understanding of BCM and its level of maturity within the organisation. Overall, responses indicate a moderate-to-high capability that is continuing to mature as the organisation transitions towards an enterprise-wide approach to BCM. The organisation’s clearest strength was in regards to the use of Standards, with its approach to BCM aligned with the Australian Standard. In contrast, participants noted that the integration with Risk Management has traditionally been limited, as BCM and Risk Management were managed by different areas of the organisation. However, efforts have recently been made to ensure further integration between the processes, in line with the Australian Standards. For example, measures were recently taken to move the frameworks and associated resources into the same area, in addition to the appointment of a dedicated Risk and BCM manager who is responsible for managing both frameworks. What precluded a high rating for most of the other themes was the scope of their capability, including the organisation’s threat assessment and testing and review activities, as well as governance structures and the role of Senior Management.

In regards to threat assessment, the organisation has not traditionally had a good

understanding or awareness of non-network related threats and vulnerabilities.

Similarly, whilst the organisation has conducted a formal BIA for some critical

processes, participants acknowledged that its scope is currently limited, primarily to

network functions, as highlighted by the following participant response:

“Over the next twelve months [we’re planning to do] a business wide... threat assessment and [BIA]. That’s a large piece of work that needs to be done... We’ve looked at the impact disruption events will have on some of our critical business functions but it’s not group wide.” (D)

Importantly, the organisation has acknowledged this deficiency and is now proactively engaged in gradually broadening its capabilities in this regard. Similarly, there were acknowledged gaps in the area of testing and review, where capability for network-related events is again very high, but the lack of testing for non-network functions has resulted in the lower rating for this theme, making it the organisation’s weakest capability overall. Accordingly, this is where there is the greatest room for improvement, and the organisation has indicated that there are plans in place to begin testing non-network related functions.

A similar trend was noted in terms of the role of Senior Management and supporting

governance structures. Although there are strong governance structures in place, as

well as clear management support and active involvement, these activities have

traditionally been focused on network-related functions, as the following statement

indicates:

“Senior Management play a very important role and get involved in

responding to supply driven events... the Executive Disaster Management

Committee convenes and works out the response to a weather driven event. Its

Charter historically has been just responding to weather driven events.” (D)

In addition, both participants suggested that whilst BCM has quite good visibility within the organisation, information provision to Senior Management and the Board could be improved. Thus, there is a clear need to broaden the scope of their involvement, and measures have recently been taken by the organisation to ensure this. For example, the Executive Disaster Management Team’s Charter has been broadened to extend their role to respond to non-network related events such as IT system failures and influenza pandemics, and there is evidence of increased tangible support by way of the resourcing of a BCM function. Whilst the organisation has traditionally had a limited approach to BCM, overall there is clear evidence that it has started transitioning its BCM program from a traditional, narrow, emergency response approach towards a contemporary, all hazards, enterprise-wide program, thus enhancing its resilient capabilities in this regard.

High Reliability Theory

For a network provider, achieving 100% reliability would be impossible due to the distributed nature of the organisation’s infrastructure, in addition to its vulnerability to external threat sources such as severe weather. Accordingly, the Electricity Industry Code sets a range of reliability targets and Minimum Service Standards (MSS), to reflect the network’s diversity and to ensure that network providers are achieving high reliability outcomes. As evident from Table 5.9, the organisation can be considered highly reliable, as its results are better than the MSS limits for all reliability measures.

Table 5.9: Reliability Performance – Organisation (D)

Network Reliability Performance             2007/08 Target    2007/08 Result

Duration Index (SAIDI) [10], minutes
  Urban Distribution                        ≤195              179
  Short Rural Distribution                  ≤550              447
  Long Rural Distribution                   ≤1,090            1,030

Frequency Index (SAIFI) [11], interruptions
  Urban Distribution                        ≤2.50             1.85
  Short Rural Distribution                  ≤5.00             3.49
  Long Rural Distribution                   ≤8.50             6.40

Source: Annual Report 2007/08

[10] System Average Interruption Duration Index (SAIDI)

[11] System Average Interruption Frequency Index (SAIFI)
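For readers unfamiliar with the two indices, their commonly used definitions (for example, as given in IEEE Std 1366) can be sketched as below. The symbols are introduced here for illustration only and do not appear in the thesis: r_i is the restoration time in minutes for sustained interruption i, N_i is the number of customers interrupted by it, and N_T is the total number of customers served. Any exclusions applied under the Electricity Industry Code are not stated in this section and are therefore not represented.

```latex
\[
\mathrm{SAIDI} = \frac{\sum_i r_i \, N_i}{N_T}
\qquad
\mathrm{SAIFI} = \frac{\sum_i N_i}{N_T}
\]
```

On these definitions SAIDI is expressed in minutes per customer, whereas SAIFI is a count of interruptions per customer, and lower values indicate better reliability for both.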

The organisation attributes the positive reliability outcomes and a 38% decline in

power interruptions over recent years to the improved resilience of the network that

is a direct result of significant investment (Annual Report 2007/08), following on

from the Somerville Report recommendations. This was further evidenced by a

strong presence of Reliability-Enhancing characteristics, particularly in regards to the

Goal and Commitment characteristics, and for most Design and Process

characteristics, with the exception of flexibility and redundancy, and autonomy and

accountability where participant responses acknowledged room for improvement.

For example, when asked about autonomy and accountability participants noted that

whilst the organisation actively encourages and promotes a no-blame reporting

culture, which is supported by an incident management system and reward and

recognition program, there is evidence that this is not as embedded as it could be as

many incidents go unreported.

Similarly, in regards to flexibility and redundancy, there is a significant amount of

in-built redundancy purposefully designed into critical processes, such as the

duplication of critical functions, as well as in job design aspects in terms of

experience and skill redundancy through cross skilling measures, and resource

mobility. This is supported by the following participant response:

“We have two locations for critical functions of the business. For example, we have two control centres so there’s that redundancy there. I won’t say things are perfect but we do have that ability and the same with the contact centres [and] HR. There is cross training that takes place in regard to technical positions in particular. [So redundancy from a] physical as well as from a personnel point of view.” (D)

However, the physical network, which constitutes approximately 80% of the business, is vulnerable to frequent disruption due to external events such as severe weather and does not have a lot of in-built redundancy by virtue of its design, although the organisation is continuing to do more in this regard. Measures mitigating potential impacts on reliability due to the network’s lack of in-built flexibility and redundancy include the purchase of back-up generators that help to ensure the continuity of supply, capital budgets to ‘underground’ wiring for security of supply to critical locations such as hospitals, and the Network Preparedness Plans, in addition to substantial investment in maintenance to minimise threats (such as vegetation) and ensure the network’s robustness.

Equally, on the people side, technical staff are highly skilled and carefully recruited, as well as strategically located across the network so that they can be quickly mobilised in the event of an incident to return the network to full functionality. Such measures to enhance technical performance combine to improve the organisation’s overall resilience. Training and learning is also a significant capability, with ongoing, comprehensive training and performance evaluations for technical staff, in addition to an established department dedicated to performance reviews and to the review and dissemination of learning. The following participant response indicates the organisation’s maturity in respect to training activities:

“All staff go through a rigorous training program. Training is ongoing and very important. For example, new technologies or processes aren’t implemented before training is conducted. We wouldn’t roll out a new system before all staff are trained to the high standards we expect and that it is safe for them to do so. Our training regime is very well developed and we spend very heavily in this area.” (D)

The high rating for all Goal and Commitment characteristics was demonstrated by participants noting that reliability of supply is paramount and balanced with safety and financial considerations. This situation was noted to be reinforced by the organisation’s Government ownership, as it was suggested that there is not the same level of tension between reliability and profitability as would be expected in a privatised entity. Government ownership was also a contributing factor to external oversight, because this pressure drives a high level of commitment to reliability, which has been noticeably stronger following the release of the Somerville Report. This commitment is further demonstrated by the organisation’s significant capital investment expenditure to enhance reliability outcomes, and is reinforced by the organisation’s mission and vision statements and strategic objectives. Similarly, participant responses suggested that the organisation has a strong culture of reliability supported by its safety culture, where security of supply is said to be at the front of all employee minds.

Summary

The previous analysis indicates that the organisation has demonstrated resilient

capabilities, with strengths across both BCM and Reliability-Enhancing

characteristics. In respect to HRT, the organisation can be considered to be highly

reliable, as the organisation has a moderate-to-high level of capability across all

Reliability-Enhancing characteristics. Similarly, the organisation is rated moderate-

to-high across all areas of BCM and is actively engaged in strengthening its existing

program by extending its approach to an enterprise-wide capability. Overall, the

organisation can be considered to have a moderately high level of resilient

capabilities, which will be further strengthened via ongoing improvement measures.

Organisation E

As one of Queensland’s primary network providers, Organisation (E) is critical to the reliability of electricity supply. The organisation recognises the importance of resilient capabilities, and in particular the importance of having a mature BCM program in place. Although it has had business continuity capability since its establishment, interview data indicates that the organisation has made significant advances in terms of maturity over the last year, taking the organisation to a reasonably high level of maturity within the industry. Although the organisation is at a moderate-to-high capability across all BCM characteristics, both participants acknowledged that there is still room for improvement across most areas of BCM. There is also clear evidence of Reliability-Enhancing characteristics within the organisation, although there were some acknowledged gaps due to the nature of their operations.

Overall, Tables 5.10 and 5.11 show that Organisation (E) has strong capabilities across both BCM and Reliability-Enhancing characteristics; taking these into consideration, it currently has a moderate-to-high level of resilient capabilities.

Table 5.10: BCM Capability – Organisation (E)

Characteristics: Use of Standards; Integration with RM; Threat Assessment; Business Impact Analysis; Testing & Review; Governance Structures; Senior Mgmt; Embedded.

Rating scale: HIGH; MOD/HIGH; MOD; LOW/MOD; LOW

Table 5.11: HRT Capability – Organisation (E)

Process / Design Characteristics: Technical Performance; Redundancy / Flexibility; Autonomy / Accountability; Hierarchy / Decision; Training / Learning

Goal / Commitment Characteristics: Importance of Reliability; Culture of Reliability; Commitment to Reliability; External Oversight

Rating scale: HIGH; MOD; LOW

Business Continuity Management

Participants demonstrated a comprehensive understanding of BCM, and this was further reflected in their self-ratings, which were consistent with their verbal responses. Their responses were indicative of a fairly mature approach to BCM that has advanced considerably in the last 18 months following measures to strengthen it as an enterprise-wide capability. Improvements are ongoing as the organisation continues to broaden the scope of its program, which has traditionally been narrow in focus and constrained to network considerations. Key areas of strength include the use of Standards, with the BCM framework closely aligned to the Australian Standard. Similarly, participant responses indicated that there is strong integration with Risk Management, as the organisation’s Enterprise Risk Management and BCM frameworks have been purposefully designed to feed into one another as suggested in the Australian Standard. Another area of considerable strength for the organisation is in regards to governance structures, with participants describing strong structures with a high frequency and level of reporting, ensuring the visibility of BCM to the Board level. This is reflected in the following participant response:

“In terms of governance, it rolls up into our risk management. We treat

[BCM] as a treatment for a type of corporate risk and we have a strong

governance structure over that and reporting through to the Board each month

and reports through to Management and Board sub-committees each quarter.”

(E)

In contrast, participants also identified a number of areas where capability is relatively high, but where additional room for improvement is acknowledged, thus slightly lowering their rating overall. The first is in regards to threat assessment, where gaps were highlighted in terms of the scope of these activities: threats to the network are well understood, whereas awareness of non-network threats is gradually improving. In respect to the BIA, the organisation has a comprehensive understanding of its critical business processes, and has conducted an enterprise-wide BIA. Participant responses, however, indicated that whilst the BIA is fairly comprehensive, further work needs to be conducted in terms of understanding critical interdependencies, as indicated by the following participant response:

“[We have] an understanding of our critical functions and resources they depend on but more work needs to be done in the area of the BIA which the organisation is looking at doing. [An external consultant] rated us as fairly well implemented for [BIA]. We’re looking to roll out [BIA] workshops so we can understand the end to end process. I think we’re very good standalone but holistically we need to understand the interrelationships which is something we are trying to improve.” (E)

Similarly, the organisation has a strong capability in terms of network testing and review due to a historical focus on network preparedness. More recently, the organisation has also started testing its non-network capabilities, further strengthening its approach. Another area for improvement acknowledged by both participants is in regards to the role of Senior Management. Again the scope of their involvement has been limited to a network focus, such as network-related testing and higher level crisis response activities. This is demonstrated by a participant’s response regarding Senior Management involvement in business continuity related activities:

“On the network side we are absolutely competent. On the non-network I think

we have some opportunities for improvement... I think management capacity is

purely focused on storm related events.” (E)

Despite a traditionally narrow focus, their role is starting to broaden, with non-network activities increasingly receiving greater visibility and engagement. Senior Management have also reinforced their support for BCM, both conceptually and tangibly, as evidenced by the creation of a dedicated BCM function. However, both participants acknowledged that support and engagement could be further augmented. Although the organisation’s BCM capabilities have traditionally been narrow, it has recognised the importance of an enterprise-wide, all hazards approach to BCM and is now actively engaged in evolving its program further through continuous improvement measures.

High Reliability Theory

Network reliability can be impacted by a range of external events such as severe weather. The organisation’s reliability performance is measured against a range of MSS measures set by the Government Regulators that take into consideration the network’s diversity. As Table 5.12 highlights, the organisation is performing better than the MSS targets for all measures, with the exception of Rural SAIFI, thus indicating a high level of reliability.

Table 5.12: Reliability Performance – Organisation (E)

Key Performance Measure      2007-2008 Actual Performance    2007-2008 Target
SAIDI – Total (minutes)      131.7                           171.0
SAIDI – CBD (minutes)        4.0                             20.0
SAIDI – Urban (minutes)      84.7                            134.0
SAIDI – Rural (minutes)      242.1                           244.0
SAIFI – Total                1.55                            1.91
SAIFI – CBD                  0.04                            0.33
SAIFI – Urban                1.05                            1.54
SAIFI – Rural                2.71                            2.63

Source: Annual Report 2007/2008
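The comparison summarised from Table 5.12 can be reproduced with a short check. The sketch below is illustrative only: the figures are transcribed from the table, while the dictionary layout, the function name `measures_over_target`, and the rule that a measure meets its target when the actual value is less than or equal to the target (lower SAIDI and SAIFI being better) are assumptions made here rather than anything specified in the thesis or the Annual Report.

```python
# Figures transcribed from Table 5.12 as (actual, target) pairs for 2007-2008.
# Assumption: a measure meets its target when actual <= target, because
# lower SAIDI/SAIFI values indicate fewer or shorter interruptions.
TABLE_5_12 = {
    "SAIDI - Total (minutes)": (131.7, 171.0),
    "SAIDI - CBD (minutes)": (4.0, 20.0),
    "SAIDI - Urban (minutes)": (84.7, 134.0),
    "SAIDI - Rural (minutes)": (242.1, 244.0),
    "SAIFI - Total": (1.55, 1.91),
    "SAIFI - CBD": (0.04, 0.33),
    "SAIFI - Urban": (1.05, 1.54),
    "SAIFI - Rural": (2.71, 2.63),
}


def measures_over_target(table):
    """Return the names of measures whose actual value exceeds the target."""
    return [name for name, (actual, target) in table.items() if actual > target]


print(measures_over_target(TABLE_5_12))  # ['SAIFI - Rural']
```

Only Rural SAIFI is returned, which matches the single exception noted for Organisation (E).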

The organisation’s reliability has not always been as high. Following on from the results of the Somerville Report released in 2004, the organisation has been implementing targeted measures to achieve the report’s recommendations and enhance reliability outcomes. The positive reliability results are “due to the continued investment in reliability programs” (Annual Report 2007/08), such as the organisation’s capacity building and maintenance budget, as well as the measures that come out of the organisation’s Summer Preparedness Plan. The organisation has also undergone a significant reorientation in its business objectives, with a greater commitment to reliability that is now more balanced with financial objectives.

This was reflected in interviews, which indicated a strong capability across most of the Reliability-Enhancing characteristics, except redundancy and flexibility, autonomy and accountability, and decision making and hierarchy, which had noted areas for improvement. For example, participants noted that whilst the organisation encourages and promotes error reporting and discovery at the corporate level, it has traditionally had a blame mentality and this new approach may not be consistent across the whole organisation, with subcultures present within divisions and teams depending on the leadership approach of the local manager. Participants also noted that some policies may even have the unintended consequence of encouraging non-reporting, which is thus a distinct area for improvement.

Similarly, in regards to decision making and hierarchy, the organisation has

traditionally been rather bureaucratic, although there is recognition of the need to

change this and measures have been implemented to devolve accountability to the

front line, empowering individuals and teams to encourage ownership and

responsibility for decision making. This is evidenced by the following participant

response:

“We are going through a maturity where as a bureaucratic organisation we

are trying to devolve accountability to the person that is best placed to make

that decision. To do this we have a strategy called an improved accountability

strategy which is about empowering the front line to make decisions and

empowering managers to make decisions. Generally what happens in

bureaucratic processes, people are afraid to make a decision; people hand it

off. So we are very big on making sure that people understand their

accountability.” (E)

There is also evidence of physical redundancy in terms of network design that

considers the points of failure in population dense or critical locations (such as

hospitals or Central Business Districts), mobile back-up generators, duplication of

some critical functions such as control and customer contact centres, and personnel

redundancy that participants suggested was through learned rather than formal

mechanisms. The current measures are not exhaustive and there is room for

improvement in terms of both physical and personnel redundancy measures, although

it would be far too costly to make the entire network redundant. To counter this

vulnerability, the organisation has strong technical performance measures including a

strict maintenance regime and extensive capital works and maintenance budget,

quality equipment and tools, structured policies and procedures, and the recruitment

of competent staff whose performance is regularly reviewed.

Similarly, the organisation recognises the importance of training and learning, which it refers to as continuous improvement measures. This is evidenced by a rigorous, continuous training regime for technical staff, as well as piloting and training measures before the roll-out of new equipment. Further measures include competency assessment management and performance appraisal systems with employees measured against key performance indicators, in addition to thorough incident review processes and the dissemination of findings to avoid a repeated situation, all of which is supported by the organisation’s safety culture.

In respect to Goal and Commitment characteristics, company documentation and interviews indicated that the organisation is very much committed to reliable outcomes and recognises their importance, with reliability carefully balanced with other key corporate performance objectives. The organisation’s primary function is to “provide a safe and reliable electricity supply” and this is recognised in company documentation, the organisation’s vision and mission statements, and was further reiterated in interview sessions. Moreover, the organisation has a strong safety culture. This, combined with the key performance indicators and the organisation’s strong community focus that is reinforced by powerful community expectations, drives a culture of reliability with every employee cognisant of the importance of achieving reliable outcomes, and ultimately keeping the lights on. This is reflected by the following participant response regarding the importance of reliability:

“It’s such a core function; a core purpose which is really in our vision which is powering lifestyles forever. It’s hardwired into us. Our culture is very much on the customer. The high expectation that the customer has on reliable electricity supply has almost become a basic human. Therefore in terms of the organisational culture we are about delivering that... it’s making sure that everything that we do comes back to that central purpose.” (E)

This commitment to reliable outcomes is further evidenced by the organisation’s exceptional capital and maintenance works budget, dedicated to enhancing the network’s reliability. However, participants noted that spending $6 billion every five years is unsustainable, and consequently reliability is now being looked at with sustainability considerations in mind.

Summary

Although there were a number of noted areas for improvement, the preceding results identified many positive resilience-enhancing characteristics. Overall, the organisation currently has a moderate-to-high level of resilient capability. With the organisation committed to further improving its BCM and HRT capabilities, such measures will continue to improve the organisation’s internal resilience capabilities into the future.

Organisation F

It is evident from interviews that this critical network provider has a continuity

response capability that has been formally implemented for approximately eight

years. Whilst the importance of a response capability is well recognised due to the

nature of their operations, the organisation is pursuing an unconventional approach to

BCM, with traditional emergency and crisis response capabilities rather than a

contemporary approach to business continuity. Although the participant indicated

that their current process serves them well, they did acknowledge that there is

significant room for improvement across a lot of the BCM characteristics (see Table

5.13), and work is ongoing in this space.

146

Table 5.13: BCM Capability - Organisation (F)

Characteristics: Use of Standards; Integration with RM; Threat Assessment; Business Impact Analysis; Testing & Review; Governance Structures; Senior Mgmt; Embedded.

Rating scale: HIGH; MOD/HIGH; MOD; LOW/MOD; LOW

Moreover, company documentation and interviews provided strong evidence of Reliability-Enhancing characteristics within the organisation, as shown in Table 5.14.

Table 5.14: HRT Capability - Organisation (F)

Process / Design Characteristics: Technical Performance; Redundancy / Flexibility; Autonomy / Accountability; Hierarchy / Decision; Training / Learning

Goal / Commitment Characteristics: Importance of Reliability; Culture of Reliability; Commitment to Reliability; External Oversight

Rating scale: HIGH; MOD; LOW

Discussions with the participant indicate that whilst there is evidence of both BCM and Reliability-Enhancing characteristics, the organisation’s overall resilient capacities can be considered to be moderate as there are a number of areas where there is significant room for improvement, particularly in regards to BCM where many of these capability gaps are acknowledged.

Business Continuity Management

It was apparent when comparing participant self-ratings of BCM themes with interview responses that the participant often had a higher perception of the organisation’s capability than the responses supported, although some themes were accurately rated (e.g. Use of Standards). It was also evident that whilst the participant has a strong understanding of emergency and crisis management concepts, their understanding of BCM and internationally recognised approaches was limited. When considering the various International Standards for BCM, the organisation can be considered to be at a limited-to-moderate maturity level, with participant responses indicating that there is quite significant room for improvement across all variables examined. Overall, it is evident that the organisation has a traditional emergency response capability rather than a contemporary all hazards, enterprise-wide approach to BCM that seeks to build resilience by protecting critical business functions.

In terms of its relative strengths, the organisation has a moderate level of maturity in regards to testing and review activities, and also in terms of governance structures. However, it was also evident that the organisation is limited in its capabilities in the areas of the use of Standards, integration with Risk Management, threat assessment and the BIA in particular, as well as the role of Senior Management, both in regards to the scope of their involvement and their tangible support of the BCM program.

One of the most significant gaps was in relation to the organisation’s use of Standards, with the existing approach to BCM based on personal experience and knowledge rather than a recognised International Standard. In the absence of a framework based on a recognised International Standard, it is easy to understand why there are significant gaps in other areas of BCM. For example, in terms of integration with Risk Management, although some integration was noted, Risk Management activities, which are security focused, remain quite separate from the organisation’s Emergency Management activities. The participant acknowledged this limitation and suggested that work is about to begin which may close the gap. The limited integration between Risk Management and BCM is reflected in the following interview response:

“We use that [4360] for doing the risk assessments for our substations which then determines the level of security that we will install. I don’t that we necessarily then use that for how we would respond to an emergency. It certainly was used to work out the level of security.” (F)

The organisation’s understanding of threats also appears to be more mature for network-related functions and focused on security, as compared to corporate and other non-network areas of the business, and threats are not recorded in a centralised location (such as a risk register). The scope of the organisation’s threat assessment activities is evident in the following statement:

“We have certainly done it very thoroughly with our networks. We’ve done it

fairly thoroughly with our IT. I wouldn’t say we have done it thoroughly with

places like this - the corporate site.” (F)

Whilst threat assessment activities were identified as a moderate gap, the

organisation’s major limitation is the absence of a BIA. Although there appears to be

a reasonably high level of understanding of key business activities generally, they are

not well documented, and this understanding instead appears to be based on intrinsic knowledge and

experience. Whilst this is currently a significant gap for the organisation, it is a

limitation that is well recognised and will hopefully soon be alleviated, with work

about to commence with the roll out of an externally facilitated enterprise-wide BIA

(as represented by the arrow in Table 5.13 – page 146). This recognition is evident in

the following participant response:

“We’re actually in the process... we’ve got some contractors in at the moment

because we recognise that this has been done based on experience and

intuition rather than a formal process. [They] are starting a Business Impact

Analysis which will go through all of that.” (F)

This is a significant piece of work that will substantially improve the organisation’s

overall BCM capability, not just in terms of the BIA itself, but also across other

aspects such as integration with Risk Management, threat assessment, and the use of

Standards. The informant indicated that the project will integrate existing capabilities

within a contemporary approach to BCM.

In contrast, the organisation is slightly more mature in terms of the role of Senior

Management, as there is a high level of active involvement and engagement ranging

from planning, to testing and review activities suggesting a moderate capability in

this regard. This involvement however is limited to Crisis and Emergency

Management level activities due to the absence of continuity activities specifically.

Nonetheless, this capability may mature if their role is extended to include the

continuity related activities implemented once the BIA is complete. This is also true

of the organisation’s corporate governance structures, which again are currently quite

robust for security and emergency response activities, but not continuity specifically,

thus resulting in a moderate rather than a high level of maturity.

Furthermore, whilst there is a high level of conceptual support and evidence of

tangible support with the funding of the BIA project, issues associated with the

resourcing of a BCM function were identified as the organisation does not have

a dedicated BCM role. The current manager noted issues relating to time and

resourcing that are hampering the further development of a BCM capability within

the organisation, thus lowering the organisation’s overall capability. In contrast, one

of the organisation’s apparent strengths was in regards to testing and review

activities with participant responses and company documents indicating a large

annual testing regime. Again what limited the organisation’s level of capability in

this regard is the scope of the testing activities, with the testing very much network

focused and at a higher Emergency and Crisis Management level rather than

functional level continuity processes.

High Reliability Theory

In contrast, company documentation and interview data indicate a high level of

capability across most of the Reliability-Enhancing characteristics examined, which is

reinforced by the organisation’s strong performance against the Australian Energy

Regulator’s (AER) reliability targets. In all instances, the organisation has met or

exceeded the targets. In fact, the organisation is benchmarked internationally and is a

world-leader in network performance measured in terms of cost efficiency and

customer service levels (reliability), with participants in the top-right quadrant of

Figure 5.1 delivering above-average reliability at below-average cost (Organisation F

is represented in Figure 5.1 by the large circle).

Figure 5.1: Composite Benchmark of Performance – Weighted Average

[Figure not reproduced. Legend: Organisation (F); Other entities that participated in ITOMS (footnote 12)]

Source: Annual Report 2007/08

Data analysed indicates that Process and Design strategies, combined with the

organisation’s Culture and Commitment to reliable outcomes have resulted in the

strong reliability performance outcomes. An area where capability was not as strong

was in regards to redundancy and flexibility. For example, whilst participant

responses indicated substantial in-built physical redundancy, and some flexibility in

respect to field-related activities, there is noted room for improvement in terms of

human resources. In regards to technical performance, the high rating was indicated

by strong maintenance and capital investment regimes that are supported by a

sophisticated, automated approach, in addition to a highly skilled workforce who

work to well defined processes and procedures.

There is also evidence of a moderate-to-high level of accountability and autonomy,

with a no-blame culture encouraged and instituted through the organisation’s safety

culture; however, this is not in writing. This is demonstrated by the following

statement:

[12] This acronym is not expanded to help maintain the anonymity of the organisation.

“We have certainly got an open and friendly environment that encourages

error discovery. We have a no blame sort of approach to things... We verbalise

it all the time to people. I don’t know that we have actually got it in writing

though. We certainly encourage that but it may not be documented as such

perhaps.” (F)

In addition, flexible decision making is supported by a relatively flat organisational

structure, and a strong team-based environment where deference to expertise is

promoted through rotating staffing structures. Equally, there is a very high level of

importance placed on training and learning activities, evidenced by examples such as

the organisation’s significant training budget, ongoing training and development

programs for all staff with a key focus on technical staff capabilities, as well as

performance targets and reviews for individuals and teams where reliability is a key

measure. The organisation has also recently improved its methods for identifying and

reviewing technical issues, with process improvements such as data recording and

reporting systems implemented (Annual Report 2007/08).

In respect to Goal and Commitment characteristics, the organisation was equally

strong, with a clear commitment to reliability and safety, evidenced in the

organisation’s corporate objectives, as well as their substantial investment in

reliability-enhancing projects, particularly technological initiatives. This is also

reflected consistently throughout company documentation and also articulated

through the organisation’s mission statement which is to be “committed to delivering

[...] network and related services at world-class levels of safety, reliability and cost

effectiveness” (Annual Report 2008/09).

Similarly, the organisation has a commitment to a strong safety culture that is

combined with a culture of reliability. This is promoted and reinforced by employee

Key Performance Indicators (KPIs) and the AER reliability standards, ensuring that

employees are aware of the importance of maintaining safe and reliable outcomes.

Similarly, participant responses and the international benchmarking indicate

that the organisation recognises the importance of reliability, and is successfully

balancing reliability with other corporate objectives such as financial and safety

considerations. The importance of reliability is reflected in the following participant

response:

“The importance of reliability, is very high on our agenda, and we have

organisation culture of reliability which I guess it is built into all of our KPIs

and everything we do. Our internal KPIs are linked into gain sharing for staff

and just the normal monthly reporting KPIs. We have also got this

international benchmarking... and the AER service standards... So we have got

a number of drivers. It is all there and it all comes from the Board down.” (F)

Summary

Overall, the organisation has demonstrated resilient characteristics, particularly in the

area of HRT. Whilst the organisation has a strong emergency and crisis response

capability, there is significant room for improvement in the area of BCM, in terms of

expanding the approach to an enterprise-wide capability that is aligned with an

internationally recognised Standard and focused on maintaining the resilience of

critical business functions. Following the roll-out of the planned enterprise-wide BIA

and external review of the existing program, the organisation will be well on its way

to strengthening its resilient capacities.

Embedded Cross-Case Findings

As is evident from the preceding analysis of the individual embedded cases, the

organisations studied have developed both BCM and HRT capabilities to promote

resilient outcomes. Whilst there is a strong presence of these capabilities across the

six organisations, it is clear that each displays a different combination with some

exhibiting more strongly the characteristics of a resilient organisation. Table 5.15

summarises the individual organisational ratings for BCM as presented in previous

capability tables.

Table 5.15: Summary of BCM Capability Ratings for Organisations

Rating    US        RM      TA      BIA     TR        GS      SM      EM
H (5)     A, D, E   A, E    A       A       A         A, E    A       A
M-H (4)   -         D       E, D    E       E         D       D, E    D, E
M (3)     -         B, C    B, F    D       B, D, F   C, F    B       -
L-M (2)   B, C, F   F       C       B       C         B       C, F    B, C, F
L (1)     -         -       -       C, F    -         -       -       -

Furthermore, by using the criteria reference table (available in Appendix E), an

overall BCM capability rating can be determined for each organisation by combining

the ratings for each of the BCM themes. As the combined results show in Table 5.16,

which ranks the organisations on their BCM capability, there are noticeable gaps

between those with the strongest BCM capability overall (A, E, D), and those with

room for improvement (B, C, F).

Table 5.16: Organisation Rank for BCM Capability Ratings

Organisation Rating

1. (A) 40/40

2. (E) 35/40

3. (D) 31/40

4. (B) 20/40

5. (F) 18/40

6. (C) 17/40
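
As an illustration of how the overall ratings in Table 5.16 relate to the theme ratings in Table 5.15, the following minimal sketch sums each organisation's eight theme ratings. It assumes an unweighted sum on the 1 to 5 scale shown in Table 5.15; the criteria reference table in Appendix E is not reproduced here, so the equal weighting is an assumption. The names `bcm_ratings`, `overall_scores` and `rank` are hypothetical, and the per-theme values are as read from Table 5.15.

```python
# Theme ratings per organisation, as read from Table 5.15
# (H = 5, M-H = 4, M = 3, L-M = 2, L = 1).
# Column order: US, RM, TA, BIA, TR, GS, SM, EM.
THEMES = ["US", "RM", "TA", "BIA", "TR", "GS", "SM", "EM"]

bcm_ratings = {
    "A": [5, 5, 5, 5, 5, 5, 5, 5],
    "B": [2, 3, 3, 2, 3, 2, 3, 2],
    "C": [2, 3, 2, 1, 2, 3, 2, 2],
    "D": [5, 4, 4, 3, 3, 4, 4, 4],
    "E": [5, 5, 4, 4, 4, 5, 4, 4],
    "F": [2, 2, 3, 1, 3, 3, 2, 2],
}


def overall_scores(ratings):
    """Sum the theme ratings per organisation (assumed equal weighting)."""
    return {org: sum(values) for org, values in ratings.items()}


def rank(scores):
    """Return (organisation, score) pairs, highest score first."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


bcm_totals = overall_scores(bcm_ratings)
max_score = 5 * len(THEMES)  # eight themes, each rated out of 5

for position, (org, score) in enumerate(rank(bcm_totals), start=1):
    print(f"{position}. ({org}) {score}/{max_score}")
```

Under these assumptions the totals agree with Table 5.16 (40, 35, 31, 20, 18 and 17 out of 40 for Organisations A, E, D, B, F and C respectively).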

Similarly, Table 5.17 provides a summary of the individual organisations’ ratings

across each of the Reliability-Enhancing characteristics, as presented in previous

HRT capability tables.

Table 5.17: Summary of HRT Capability Ratings for Organisations

Rating   TP              FR           AA    DH    TL              IR    RC    CR    EO
H (3)    A, B, D, E, F   A, B         -     A-F   A, B, D, E, F   A-F   A-F   A-F   A-F
M (2)    -               C, D, E, F   A-F   -     C               -     -     -     -
L (1)    C               -            -     -     -               -     -     -     -

Note: A-F denotes all six organisations (A, B, C, D, E, F).

Furthermore, by using the criteria reference table (available in Appendix F), an

overall HRT capability rating can be determined for each organisation by combining

the ratings for each of the Reliability-Enhancing characteristics. As the ratings

demonstrate in Table 5.18, all organisations are quite similar in their HRT capability,

albeit slightly lower for Organisation (C).

Table 5.18: Organisation Rank for HRT Capability Ratings

Organisation Rating

(A), (B) 26/27

(D), (E), (F) 25/27

(C) 23/27

Combining the above ratings for BCM and HRT capabilities gives a clear

impression of the six organisations’ overall resilient capabilities. Figures 5.2 and 5.3

graphically represent the overall capability ratings, with Organisation (A)

demonstrating the strongest presence of internal capabilities of a resilient

organisation.
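
The cumulative scores plotted in Figure 5.2 can be recomputed from the overall ratings in Tables 5.16 and 5.18. The following minimal sketch assumes the combined score is the simple, unweighted sum of the HRT total (out of 27) and the BCM total (out of 40); the variable names are illustrative only.

```python
# Overall ratings taken from Table 5.18 (HRT, out of 27)
# and Table 5.16 (BCM, out of 40).
hrt = {"A": 26, "B": 26, "C": 23, "D": 25, "E": 25, "F": 25}
bcm = {"A": 40, "B": 20, "C": 17, "D": 31, "E": 35, "F": 18}

# Assumption: the cumulative score is the unweighted sum of the two ratings.
combined = {org: hrt[org] + bcm[org] for org in hrt}
max_combined = 27 + 40

# List organisations from strongest to weakest combined capability.
for org, score in sorted(combined.items(), key=lambda kv: kv[1], reverse=True):
    print(f"Organisation ({org}): HRT {hrt[org]} + BCM {bcm[org]} = {score}/{max_combined}")
```

On this reading Organisation (A) scores 66 of a possible 67. Because the BCM scale has a higher maximum than the HRT scale, an unweighted sum gives differences in BCM capability more influence on the combined score.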

Figure 5.2: Combined HRT & BCM Capability Rating

[Bar chart not reproduced; vertical axis: Cumulative Score (0 to 70). Data table:]

        A    B    C    D    E    F
HRT    26   26   23   25   25   25
BCM    40   20   17   31   35   18

Figure 5.3: Overall Resilient Capabilities

Based on this comparative analysis it is evident that there are notable areas of

difference and similarity contributing to the overall resilient capability of the

organisations studied. Although the organisations are quite strong across most of the

Reliability-Enhancing characteristics, the most widespread differences are in regards

to BCM capability.

Business Continuity Management

Overall, BCM appears to be well implemented across the GOCs within the

Queensland Electricity Industry. Given that BCM is a relatively new phenomenon in

organisations, this in itself is exemplary with very few industries displaying

widespread implementation. Whilst all participants indicated that their organisations

have always had an emergency response capability due to the nature of their

operations, significant efforts have been made in recent years to formalise and

strengthen capabilities based on a contemporary approach to BCM. Participants

suggested that this can be largely attributed to the advent of critical infrastructure

protection regimes in the early 2000s, with the Government placing external pressure

on the organisations to implement and advance a business continuity program.

Similarly, participants also noted that with the Government as their primary

shareholder, there is an additional pressure to have a response capability in place.

[Figure 5.3 plot not reproduced: Organisations (A) to (F) plotted by BCM Capability (horizontal axis, 0 to 40) against High Reliability Capability (vertical axis, 0 to 30).]

In addition to this external push, all participants noted an internal pull, suggesting

that there is also a strong recognition of the importance of BCM within their

organisations, with it considered to be part of best practice or a necessity due to the

nature of their operations. Overall, it was evident that there is widespread recognition

of the importance of BCM across the organisations studied, with all having BCM

regimes implemented for at least 5-10 years. Whilst all of the organisations have had

a BCM program implemented for approximately the same period of time, this is

where the similarities cease. As evident in Table 5.15 (see page 152), the

organisations vary significantly in terms of the degree of capability across the BCM

themes examined. The combined results (see Figure 5.3 – page 155) indicate that

there are a number of BCM characteristics that clearly differentiated the strongest

performers. The most significant gaps in capability can be attributed to the use of

Standards, the scope of BCM activities, the scope of involvement and support of

Senior Management, and embeddedness, in addition to the usage of BIA. The

differences in results can be linked to the following overall emergent themes.

Understanding BCM: The Importance of a Framework

Of the six organisations examined, only (A), (D) and (E) are following an approach

to BCM based on an internationally recognised Standard. Similarly, they are the only

organisations to have conducted a BIA and to have formally integrated their

Enterprise Risk Management with BCM activities. The absence of this guidance is

suggestive of why Organisations (B), (C), and (F) comparatively have room for

improvement in regards to their overall BCM capability. Consulting a Standard

would have greatly assisted in framing the development of their BCM programs. The

following commentary details understanding from select informants with respect to

their organisation’s limited alignment with relevant International Standards,

highlighting the ad hoc nature of their approach to BCM. For instance, the participant

from Organisation (B) indicated that International Standards have been consulted but

they are unsure of the extent to which their program aligns with those Standards:

“In terms of Handbook 226 or the British Standard, we’ve looked at it but

haven’t done a gap analysis, probably because I don’t feel there’s a lot of

value. We seem to have a comprehensive system that works well. We’ve used it

in practice and it does assist us. We don’t seek to align ourselves to any other

standards.” (B)

This response indicates a level of complacency and overconfidence, and does

not suggest that the organisation’s BCM program is undergoing a process of

continuous improvement. Similarly, participant responses from Organisation

(F) described an approach to business continuity built on experience rather than

alignment with International Standards, as highlighted by the following quote:

“I have to admit this has come up from a lot of history and personal

experience. I inherited this and we didn’t delve into [the standards] too

much... We’ve done it based on experience. We’ve certainly followed the

concepts of 4360 for the risk assessment of the networks, but we’ve probably

taken a simplified approach to it... At this stage we’re not big users of it.” (F)

Furthermore, given that the BIA is a critical component of internationally recognised

approaches to BCM, it is not surprising that the organisations which have not

conducted a formal BIA are also those who do not align with a recognised

International Standard. Interestingly, the BIA also emerged as a clear indicator of

BCM maturity with the organisations’ BIA ranking order (see Table 5.15 – page

152) corresponding with the rank order for the overall BCM maturity (see Table 5.16

– page 153). The absence of a formal BIA was a critical gap for Organisations (B),

(C) and (F), significantly lowering their resilient capabilities overall. This absence is

best highlighted by the following participant response when questioned about their

BIA:

“No. That’s the straight answer. We know the impact of loss of trading is high

[so it is] identified as a critical process... but to say we’ve gone through the

whole organisation and say these are our critical processes and this is our

BIA. Have we done this? No, we haven’t and I don’t think that we will ever be

able to corner everybody to do that... Most of those terms [Maximum

Allowable Outage etc.] are... a complete unknown around this joint.” (C)

In contrast, and reflective of the overall maturity of their BCM program,

Organisation (A) has conducted a formal, enterprise-wide BIA, as highlighted by the

following statement:

“We’ve done a [BIA] of key areas within the plant... [we have BCPs] in place

for... loss of boiler, transformer, conveyor [etc.] and workarounds... We also

did it from a business perspective, so we went through key processes... like

payroll and key corporate functions.” (A)

The BIA serves to differentiate contemporary BCM from traditional response

strategies such as Emergency Response Planning (ERP) and DRP, by virtue of the

protection of critical business functions. By conducting a BIA, firms identify

business functions critical to the continuity of their operations and develop

appropriate workarounds to ensure their continued operation in the face of

disturbance, albeit not at full functionality. A formal and comprehensive BIA can

therefore be considered to be a key contributor to resilient outcomes, as it provides

an organisation with flexibility to recover full functionality ‘gracefully’. Therefore,

in the absence of a BIA, firms are not practising a contemporary approach to

BCM.

The same can be said in regards to integration with Risk Management, as

International Standards clearly indicate the importance of the relationship between

Risk and BCM. According to Standards Australia (2004a, 7-8), ‘mature organisations

display an increasingly integrated approach using common language, shared tools

and techniques, and periodic assessments of the total risk profile for the entire

organisation.’ Reflective of their use of International Standards, Organisations (A),

(D), and (E) clearly recognise this link and have measures supporting this

integration. This is highlighted by the following participant response from

Organisation (E):

“We look at business continuity as a risk and business continuity management

as a risk treatment... BCM is a subset of our risk management and we always

sort of think of it along those lines... the ability to continue operations is a risk

domain the same as any other so we don’t see a need to necessarily separate it

out from the other risk activities that we do... [it’s] inherently integrated with

our risk management.” (E)

This recognition was also evident in participant responses from Organisation (D),

although the level of integration was acknowledged to be improving, as reflected by

the following comment:

“There’s probably I’d say they’re partially integrated and that’s because I

mean up until recently we haven’t had the single point of accountability... what

I am interested in doing is ensuring that business continuity risks to the

organisation are factored in to the development of the corporate risk profile as

well as the business unit risk profile. Historically with our business risk

profiles you may not have seen business continuity type risks. But going

forward now you do or you will see that.” (D)

In contrast, interview responses from Organisations (B), (C), and (F) described a

situation where, despite signs of integration between the two processes, the level of

integration was considerably weaker. This is supported by the following commentary

from Organisations (B) and (C), which highlights room for improvement in this

regard:

“[I] make them as integrated as possible... They sit together and I don’t see

business continuity as special in any sense. You go through your risk

management process and people on the ground will be aware of

vulnerabilities... when you point out business continuity risk that will drive

them to develop a control like a [BCP] or a more resilient system without

necessarily needing to be... explicit about how you bolt them together. It’s just

another way of controlling a particular risk.” (B)

“One is not subordinate to the other. [They’re integrated] through... the risk

assessment... whatever threats we identify to business continuity... they’re

handled through our risk assessment and reporting process... [BCM] needs to

be elevated. We tend to talk about... our risk management, but not the [BCM]

process behind the risk management process... the risks that we’ve accepted

should have a [BCM] process, but I don’t think we’re there... We have this

template we apply, but I don’t think that we manage start to finish.” (C)

The use of a recognised International Standard would also further contribute to the

development of a clear and accurate understanding of what BCM is and what

constitutes a best practice approach for the greatest possible outcomes. Overall, it

was evident from interview responses that organisations following a recognised

International Standard have the best understanding of BCM, and were not confusing

this contemporary approach with traditional ERP and DRP activities. Consulting a

recognised framework is therefore important in garnering accurate understanding of

what constitutes an effective approach to contemporary BCM, and further supports

the development of a more robust business continuity program that will contribute to

building resilient capabilities within the organisations.

Scope of BCM Program

Similarly, the scope of the organisations’ BCM programs emerged as a critical theme

in the interviews, as it is an overall indicator of capability in regards to BCM

activities (i.e. the scope of testing and review activities, threat assessment and BIA),

in addition to management involvement (scope of Senior Management and

governance). In respect to BCM activities, there are major differences in regards to

the scope of testing and review, threat assessment and BIA activities, with many

organisations failing to take a holistic, enterprise-wide approach to these tasks.

Again, those organisations that are following an internationally recognised Standard

are those that demonstrate a contemporary enterprise-wide approach to their BCM

activities that encompasses the various levels of response strategies (i.e.

Organisations A, D and E), rather than a narrow, technical focus synonymous with

older approaches to incident response (e.g. DRP, ERP etc.) (i.e. Organisations B, C

and F).

Literature suggests that an enterprise-wide approach is a key consideration of a

contemporary BCM program, which considers the broad socio-technical aspects of

organisations including both hard and soft assets, as well as both internal and

external threats (Elliot et al. 2002). The narrow scope traditionally evident in many

of the organisations appears to be a result of the technical nature of the industry, with

those organisations historically focusing activities and developing strong capabilities

in respect to plant or network related functions. Whilst their technical capabilities can

be considered robust, in contrast non-technical functions, such as payroll, may have

been considered but have not received the same degree of attention in a number of

organisations (B, C, F). This is highlighted by the following participant response

from Organisation (C):

“We’re reasonably well prepared but I can see holes... in a couple of key

areas, namely IT Disaster Recovery and Recovery of Business Processes.... in

relation to some key administrative processes like payroll and our trading

systems, although we have processes in place it’s not documented [or] tested...

The big important thing is to have power going over the fence.” (C)

This was also true in Organisations (D) and (E), who until recently also had a very

narrow capability in regards to BCM activities, but have now recognised the

importance of an enterprise-wide approach and have been extending the scope of

their program accordingly. This is highlighted by the following statement:

“Historically [we’ve] been quite good at responding to weather driven events

that affect reliability of supply. That’s our number one focus... you might not

necessarily say that we have a high level of capability and resiliency across

the broad BCM spectrum... it hasn’t been organisational-wide and that’s

something my role is trying to do now to make it enterprise-wide like risk

management.” (D)

Furthermore, the scope of activities also influences the scope of management

involvement, with Senior Management and Board involvement in some cases limited

to higher level crisis activities and/or core technical functions rather than supporting

a holistic, enterprise-wide approach. Again, it was the organisations with the highest

capability overall that have the broadest level of engagement and involvement from

Senior Management and the Board. For instance, participants from Organisation (A)

described significant, hands-on involvement across all areas of BCM evidenced by

the following quote:

“The involvement of Senior Management including the CEO and Board is

quite hands on... they’re involved with the review of threats, review of

situations, and... the Crisis Management Team is usually managed by

Executive Managers... [They] provide a dual level involvement; input into the

[BIA]... as well as a review-point for the information provided by the direct

reports of the business units.” (A)

Similarly, Organisations (D) and (E) have recently expanded the scope of their BCM

programs, and in turn the scope of Senior Management and Board involvement has

broadened. Participants from these two organisations noted that management focus

had traditionally been limited to core technical functions, but indicated that a

reorientation had begun and will continue to evolve over time. The following quote

from a participant from Organisation (D) highlights this change in the scope of

Senior Management involvement, reflecting the broadening of the scope of their

BCM program:

“Senior Management play a very key and important role and obviously get

involved in responding to supply driven events... the Executive Disaster

Management Committee [EDMC], convenes and works out the appropriate

response to a weather driven event. Its Charter historically has been just

responding to weather driven events, so if there’s a major IT system failure

that committee may not necessarily have responded. What we’ve just done is...

the way the Charter is worded, its responsibilities or the role has been

broadened to respond to [other events such as] IT system failures, failures of

supply chains, and influenza pandemics.” (D)

In contrast, whilst Senior Management and the Board from Organisations (B), (C)

and (F) support and are involved in BCM activities, the scope or extent of this

involvement was limited when compared to Organisations (A), (D), and (E). For

example, when compared to the responses from Organisations (A), (D) and (E), the

following quote from a participant from Organisation (B) demonstrates the limited

scope of Senior Management involvement in BCM activities:

“[Senior Management] had a much stronger role at the initiation of the

sponsorship but the role is mainly now supporting specific initiatives when we

identify them and we find that is very good. Our Audit Risk Management

Committee and the Board gets a report from me once a year which says well

this is our business continuity management system and this is how it works,

and this is what has happened and all of those sorts of things. We have an

executive committee that looks at all of these things twice a year but in the

main it’s if I need something I go and ask for it rather than a strong continual

presence.” (B)

As the results and above discussion indicate, the scope of an organisation’s BCM

program, in regards to both activities and management involvement, greatly

influences the maturity of an organisation’s business continuity, and therefore resilient

capabilities.

Support of BCM

The nature of support for BCM also emerged as a clear indicator of overall maturity,

with the level of support, both tangible and conceptual, having a major influence on

the maturity of BCM capability. Although differences exist in regards to the nature

of Senior Management involvement and engagement, all of the organisations display

a clear commitment to BCM processes, with all participants acknowledging its

importance as it underpins the achievement of core business objectives. Whilst all

organisations have conceptual support of BCM and recognise the importance of it to

their operations, a number of participants described shortcomings in tangible support

such as a lack of time and resources, which in some instances was attributed to their

deficiencies across the range of BCM themes, including the absence of a BIA.

Of the six organisations examined, four have a dedicated risk and BCM role. The

organisations that currently display a narrow focus recognise the need for a more

holistic approach but spoke about a lack of resources and time to expand the

capability further. For example, in Organisation (B), the importance is recognised but

BCM could be allocated a greater share of time. Similarly, respondents from Organisation (C) and

Organisation (F) indicated that they were faced with competing responsibilities, as

BCM was only one of the many functions they are responsible for in their role.

Therefore, in these less mature organisations, BCM is only one of many activities

that are competing for the BCM manager’s attention. For example, respondents from

both Organisation (C) and (F) noted that things could be a lot better if they had the

time and resources, as highlighted by the following statement:

“The other issue is resources. It’s just me. This is a part time role for me. I’ve

got some other things to do... I’ve got all sorts of things to do. So it’s partly

resources too.” (F)

This sentiment is echoed by the following statement by a participant from

Organisation (C):

“I’d be sure that some of the industry players are much better at this than we

are because one thing is that they’ve had people dedicated to it. We haven’t.

We’re a couple of dabblers...We’re dabbling in this space because we both

overlap in it....but we haven’t got anyone driving this you and we haven’t got

anyone dedicated to it which is even more important.” (C)

Interestingly, Organisation (C) also noted that they were pursuing a cost based

approach to BCM and could do more but do not have the financial resources and

therefore “put off” work. Organisations (D) and (E) have also battled resourcing

issues in the past, but have recently increased the tangible support of their BCM

program, whereby both now have dedicated Risk and BCM functions and supporting

staff. This is reflective of their improved capability over the last 12-18 months, but

indicative of why there is still room for improvement to augment their BCM

capability further. This is reflected in the following statement from Organisation (E):

“Conceptually they support it. I could see it evolving a lot further. It’s one of

these issues that everybody will acknowledge is important but you’ll struggle

to get real management attention on it. It competes with all other things that

managers need to do. They want everything but in terms of resourcing

sometimes business pressures aren’t totally aligned with what we need to

achieve [pushing it] to a lower priority. [But] the Board have certainly been

engaged as well which has helped with the creation and resourcing of this

function.” (E)

The responses indicate that whilst conceptual support is indeed important for the

implementation and continuation of a BCM program, without tangible support to

ensure adequate resourcing and attention for the ongoing development of a

comprehensive BCM program, an organisation’s business continuity and thus

resilient capabilities will be constrained. Further supporting this observation is the

fact that of the six organisations, the one with the strongest BCM capability overall

(Organisation A) also described the highest level of tangible and conceptual support.

Participants from Organisation (A) indicated that this has been driven by Senior

Management, and has contributed to the development of a deep BCM culture which

is embedded throughout the organisation. This is evidenced by the following quote:

“There’s a culture of [BCM] at the Senior Management and CEO level. [The

CEO] saw it as a natural part of what we had to have in place. It’s now an

accepted part of what we do. There’s not a reluctance to deal with [BCM] in

general. The CEO has a lot of buy-in on [BCM], probably more than Risk

Management. We’ve had a number of real events where the value of being

prepared has been brought home to them. It’s certainly important to us to see

that involvement from the CEO and Board. When it comes to building a

[BCM] culture, fantastic.” (A)

This reinforces the contention that management support can be considered critical in

the development of a robust BCM capability. This finding is also supported by

Standards Australia (2006a), which suggests that gaining the support of Senior

Management is crucial for developing and embedding a successful BCM program, in

addition to regular communications and training and review activities. Participant

responses further reinforced this point, suggesting that reporting and visibility of a

BCM capability provides Senior Management and the Board with a level of

assurance that their organisation is well placed to mitigate risks and respond to

disturbances that threaten the ongoing sustainability of the organisation. Generating

visibility through information reporting and direct involvement helps to build a level

of assurance and contributes to reinforcing Senior Management support and

commitment to the BCM program. Overall, it is evident that whilst the implicit

support may be there across all of the organisations, tangible support varies

considerably, influencing the scope and maturity of BCM across the industry.

High Reliability Theory

Although there is a notable difference in regards to BCM capability across the six

organisations, this can be contrasted with the HRT results as all organisations

demonstrate a relatively strong presence of Reliability-Enhancing characteristics, as

described in the HRT literature. It is evident from their combined ratings in Tables

5.17 (page 153) and 5.18 (page 154) that, despite slight differences most notably

with regard to Organisation (C), all six organisations are quite similar, demonstrating

a strong HRT capability overall. A number of key themes emerged from the analysis,

including Sectoral Differences, Inheritance Considerations, and most notably, the

strength of Goal and Commitment characteristics.

Sectoral Differences

It is evident in Table 5.17 (see page 153) that the Generators (with the exception of

Organisation C) were consistent in their ratings across the Reliability-Enhancing

characteristics, and thus their overall HRT capability rating (see Table 5.18 – page

154). This was also true of the Network Providers (D, E, F), which were just a point

behind with an overall rating of (25/27). Interestingly this slightly lower rating is

attributed to flexibility and redundancy, with this theme emerging as a notable point

of difference amongst the organisations. Although all organisations demonstrate a

moderate-to-high level of flexibility and redundancy, the Generators (particularly

Organisations A and B), benefit from significant physical redundancy designed into

their plant. This is due to the concentrated nature of their operations and supporting

assets whereby the Generators’ overall reliability can be supported by implementing

considerable redundancy to protect critical points of failure.

In contrast, due to the geographically dispersed nature of the Network Providers’

infrastructure assets, further redundancy is somewhat precluded because of the

exorbitant costs associated with engineering the entire distributed infrastructure

network. Accordingly, physical redundancy measures are typically implemented on a

risk-based analysis, generally to protect critical locations such as hospitals and

Central Business Districts, rather than the whole network. Broader redundancy is

further supported via other measures such as mobile generators. Despite these

measures, the Network Providers will never enjoy the same level of physical

structural flexibility and redundancy as the Generators, given the dispersed nature of

their assets (as opposed to generation assets which are relatively centralised). The

nature of the Network Providers’ infrastructure is further reflected in their lower

reliability levels/targets than Generators, whereby their dispersed infrastructure, with

countless points of failure, is also more vulnerable to external threats such as

weather.

Furthermore, the Generation sector also has a higher degree of redundancy by virtue

of the excess of generation capacity currently in the Queensland Industry. This in

itself ensures that if one Generator experiences reliability troubles (as is currently the

case), the portfolio effect of the Generators combined ensures that security of supply

is not impacted. However, it was also indicated that the Generators’ output is

constrained by the fact that the network infrastructure is operating at capacity. Thus,

the Generators could produce more electricity, but are restricted by the capacity of

the network infrastructure. These features emerged as key Sectoral Differences.

Inheritance Considerations – The Critical Case

Due to Organisation (C)’s current reliability issues, the organisation emerged as a

divergent case differing slightly from its industry counterparts. Although Organisation

(C) does have a degree of in-built redundancy in respect to its broader generation

portfolio, the slightly lower rating than its “sister Generators” is reflected in a

number of noted shortcomings. As was evident in the individual analysis, most

organisations are currently exceeding their reliability targets, with the exception of

Organisation (C), which is facing unique challenges that are affecting its reliability

outcomes. Despite not meeting its reliability performance targets, which can be

attributed to limitations reflected in the slightly lower rating for a number of Design

and Processes characteristics, the organisation is still achieving relatively high

reliability performance outcomes; performance levels are just not as high as they

could be.

In particular, it was indicated that at the time of disaggregation the other GOC

Generators inherited infrastructure that was of varying quality, and this legacy

continues to impact the organisation’s resilient capacities. It was noted that

Organisation (C) was the smallest of the three GOC Generators at the time of

disaggregation and they have subsequently been the only GOC Generator to

commission significant new plant under the new corporatised operating environment.

Its newest and largest plant however, was not commissioned with the same level of

redundancy in mind, but was instead constrained by cost considerations and is now

the primary source of the organisation’s current reliability troubles. This is

highlighted by the following quote:

“We were the runt when they created this industry... the runt of the pack. We

got all of the ‘bomby’ plant and now we’re the biggest. We had to either

improve or die so we continued to build power stations. The others... they’re

still working pretty much with what they had and they’ve got some damn good

plant too mind you. It’s all been state of the art, built to last. [The reliability

problems with the new plant is] partly because of the way it came together. It

was a joint venture. It was built to a price rather than to a reliability target.

It’s still got its teething problems [and] if that comes offline it’s a huge hole in

our portfolio.” (C)

This also has flow-on effects to the organisation’s slightly lower training and

learning, and technical performance ratings, as the new plant is causing significant

difficulties in these areas due to its complex and problematic nature, thus culminating

in their lower rating for Reliability-Enhancing characteristics overall. Despite these

noted limitations in terms of some Process and Design characteristics, Organisation

(C) scored well across all Goal and Commitment characteristics, as did all of the

organisations examined, with this therefore emerging as a distinct strength.

Goal and Commitment Characteristics

The four Goal and Commitment characteristics can be considered to be a key

strength amongst all six organisations examined, with this clearly a dominant

contributor to ensuring resilient outcomes in each of the organisations. This strength

can similarly be attributed to Inheritance Considerations, by virtue of their common

heritage as a GOC, as well as an implicit understanding of their purpose for being as

an essential service provider. Accordingly, all described a very strong commitment to

reliability, and by consequence of their engineering base and safety focus, a culture

of reliability. This is reinforced by external oversight, predominantly their ownership

by the Queensland Government. This is supported by the following quote from

Organisation (A):

“Because we are critical infrastructure security of supply is paramount...

if we make a decision it has to take into account the security of supply or

essentially the reliability. That comes from the shareholder...the

Queensland Government. From a business perspective if it affects security

of supply then we make the decision from a security of supply

perspective.” (A)

This sentiment is also reflected in the following statement from an interview with

Organisation (D):

“[There’s] a strong safety culture and commitment to reliability from all

staff. This is embedded across the organisation. It is what we do. All staff

know this and are working towards this objective.” (D)

An important point to note was that although the HRT literature suggests that

reliability is not to be traded off for other more common organisational goals such as

profitability, the organisations examined interestingly suggest the importance is

rather on maintaining an appropriate balance between competing corporate

objectives. This is reflected in the following statement by a participant from

Organisation (D):

“Reliability is certainly the most important part of our business (along with

safety) but costs are also a consideration, as they are for any

organisation. We cannot ignore financial considerations, but we do

everything we can to ensure reliability, now and into the future. I think

that we have a balanced approach. We invest very heavily in reliability

and would love to do more but it is constrained by costs. We aren’t going

to gold plate anything. We simply couldn’t. But we are not in the business

of profiteering. We are in the business of reliability.” (D)

The following statements by participants from Organisation (A) and (B) also support

this contention:

“The reliability of our business is certainly very important... [it’s]

embedded. [Pressures to cut] costs are there, but there’s an

acknowledgement that reliability is of utmost importance to us and

therefore there is certainly a balance.” (A)

“After safety and environment and all that is preserved, our core business

is reliability... It is balanced with profitability seeking and we wouldn’t

see that there is a mismatch between those two goals.” (B)

Although a slight variation from the description in the literature, this ‘balanced

approach’ does not appear to be affecting the reliability of electricity supply in the

context of this industry, with all six organisations suggesting that they have got the

balance right. Some organisations, however, indicated that they have not always had a

balanced approach, as there was evidence of profit-seeking behaviour as recently as

2004 which did serve to impact the reliability of supply. The affected organisations

attributed this to the orientation of the Chief Executive Officer (CEO) and the nature

of external oversight at this time, particularly shareholder pressure. This is best

supported by the following statement:

“The catalyst for improving reliability was the 2004 storm... for the

Government to stop stripping dividends... The Somerville Report... forced

them to spend more money on building the necessary infrastructure to do

the proper risk [and] vegetation management because every year there

was a strong focus on cutting costs to increase the dividend and it’s gone

the other way now which is greater capital [and] operational

expenditure. Some of that was management driven rather than ownership

driven... so the CEO of the time saw producing bigger and better

dividends for the Government as the primary measure at the expense of

investing in long-term reliability.” (E)

This influence is further echoed in the following statement by a participant from one

of the Generators:

“When I first started, we were very much in a low margin return to

Government and there was a huge focus on [returning] the maximum amount

of profit. That culture... the Executive Management, and Board directive was

very much to return every ounce of profit back to Government and

[Organisations D, E and F] were in the same position. We benefitted from the

non-reliability of (The Network Providers) with the brownouts. That brought

about a different focus from our shareholding ministers and our Executive and

we benefitted.” (C)

The unreliability experienced during this period served to reinforce the importance of

reliable outcomes as this is their fundamental reason for being, but also indicates

how precarious this balance can be; an issue that was acknowledged by many

participants in their interview responses.

Overall, the strength of Goal and Commitment characteristics emerged as a dominant

theme in the organisational analysis, currently contributing significantly to resilient

outcomes within all GOCs across the industry. Given this consistency across all six

organisations, it is further suggestive of strong links to industry-wide resilient

phenomena.

Summary

Overall, there is evidence of strong Reliability-Enhancing characteristics as

described in the HRT literature across all of the GOCs examined, with Goal and

Commitment characteristics emerging as a dominant indicator, whilst Organisation

(C) emerged as a divergent case by virtue of Inheritance Considerations. Similarly,

the preceding discussion indicates that BCM is widely implemented across the GOCs;

however, there are significant differences between the six organisations, with the

Importance of a Framework based on a recognised International Standard, the Scope

of BCM Program, and Support of BCM emerging as critical themes affecting their

resilient capabilities from a BCM perspective.

Whole-of-System Resilience (“Holling-ian”)

The same participants were interviewed about broader industry-level characteristics

that may be contributing to resilient outcomes for the security of electricity supply in

Queensland from a whole-of-system perspective. Questions for discussion were

developed based on existing literature, albeit limited, in conjunction with the insights

gained from initial interview sessions. The major themes explored can be grouped

into two major areas: Industry Structure and Governance, and the Attitude and Ethos

of industry participants. The themes that emerged from the two major areas of

interest include the following and each will be discussed in turn.

1. Industry Structure and Governance

   o The Role of the Government as Shareholder

      Oversight

      Investment

2. Attitude and Ethos

   o Industry Commitment and Culture of Reliability

      Collaboration and Cooperation Measures

Industry Structure and Governance Characteristics

Whilst the Queensland Electricity Industry was previously a vertically integrated,

publicly provided essential service wholly-owned and run by the State Government,

it underwent a process of disaggregation in 1997 into six GOCs. Thus, one of the

biggest contextual issues is that it is a semi-privatised (corporatised) industry.

Although still owned by the Queensland Government and not fully privatised like

other utility industries around the world, the organisations have been corporatised

and now operate as independent, profitable entities providing returns on investment

to the shareholder.

While this process has entailed a deregulation, there are niche players within the

Generation sector that are fully privatised operators. The broader and dominant

context of the industry however is that of GOCs, with the Government the primary

shareholder for revenues generated. Thus, all of the State’s base-load generation

continues to be operated and managed by the three GOC Generators, whilst all

network functions remain dominated by the three GOC Network Providers that

operate in an uncompetitive environment.

Thus, there have been significant changes in Industry Structure and Governance, and

according to a participant from Organisation (B), the “corporatisation of the industry

has driven some interesting results.” On the one hand, some participants noted that

resilience may have improved as a result of the disaggregation, because in some

instances, it actually made coordination with Network Providers easier. This is

supported by the following commentary:

“[Previously] we had the Far North Queensland Electricity Board, the TERB

in Townsville, the Wide Bay Burnett Electricity Board, the Capricornia

Region Electricity Board and SEQEB, South West Queensland Electricity

Board and they all merged and formed two organisations and that was a

positive from business resilience because it was less people to interact with

and coordinate. So in terms of that coordination that has certainly been a

positive.” (F)

“I don’t think the disaggregation of the industry has had an adverse impact

on reliability, if anything it has probably improved reliability outcomes...

The breaking up of a vertically integrated entity into the relevant parts,

where there’s now Retail, Distribution, Transmission, and Generation... has

probably had a beneficial impact in that it’s allowed each entity within the

value chain to focus on what they do best... Distributors are no longer

distracted by what is going on in the Generation space.” (D)

In contrast, all participants (including those above) also described a situation under

this new structure where a profit motive in the quest to return dividends to the

Government shareholder had previously driven some of the GOCs to run down their

infrastructure, and subsequently impact the reliability of service provision. This is

highlighted by the following statement:

“One of the reasons why we are now spending so much money is for that

very reason that the Government demanded higher dividends from the

Government Owned Corporations. As a result, the expenditure on

maintenance etcetera was drastically reduced. There were reliability issues

when Somerville was set up and that unreliability was due to lack of

maintenance and ageing assets due to lack of funding allowed for

previously.” (D)

The result of this unreliability was detailed in the Somerville Report, commissioned

in 2004 by the Queensland Government to investigate the causes and provide

recommendations for improvement to secure the State’s electricity supply (State of

Queensland 2004). The results of this report served to have a major impact on the

industry’s resilience by reinforcing the importance of reliable outcomes, both within

the organisations and from the major shareholder, the Queensland Government, who

had to answer to the voting public. This sentiment is suggested by the following

participant response:

“We’ve been a little bit better off as a result of some of the (Distributors’)

infrastructure unreliability which is sensitised the Government to make sure

budgets are in place for plant, equipment. So a couple of years ago there was

some unreliability issues with both (The Distributors). Immediately after

those were in the marketplace the State Government certainly ingested skills,

resources, training and budgeting to those GOCs which sort of reflect back

on us to make sure that we had our reliability and our resources and budgets

in place for that.” (C)

The Role of the Government as Shareholder

Accordingly, it is evident that the industry structure has potential to affect resilient

outcomes by virtue of leadership decisions, not only from the Senior Management

level within the individual organisations, but also from a Government shareholder

perspective. As the previous quotes highlight, an unfavourable shareholder

orientation was evident and served to have a negative effect on a number of the

GOCs, most notably Organisation (E), which subsequently impacted the reliability of

supply to customers. Following on from the Somerville Report however, there has

been a reorientation of objectives, with the Queensland Government now focused on

ensuring reliable outcomes, allowing for increased, yet prudent investment, rather

than pressuring firms for dividends. This change has been strongly articulated to

firms through enhanced shareholder oversight and investment in reliability. This

change has been well received and reflected in the individual firms’ Goal and

Commitment characteristics, as described earlier in the Chapter (see pages 168-170).

Investment and Oversight

Whilst investment in reliable outcomes was in some instances limited before the

Somerville Report, this has improved markedly due to the changed shareholder

orientation. The Somerville Report was a catalyst for change in the industry with the

Queensland Government now encouraging increased capital expenditure. Over the

last five year period there has been evidence of significant investment in Reliability-

Enhancing activities across all GOC participants, further evidenced in company

documentation. This has had further benefits within the individual GOCs, with the

changed shareholder orientation ensuring greater emphasis is placed on investing in

BCM and other Reliability-Enhancing characteristics, ultimately encouraging firms

to develop these capabilities. This was demonstrated by the results of the

organisational analysis presented earlier in this Chapter.

In addition to investment, the changed nature of the shareholder’s orientation is also

evident in the extent of their oversight, with participant responses suggesting that this

is an important factor contributing to enhanced resilient outcomes within the

industry. In the wake of the Somerville Report, the Queensland Government has

increased its oversight of the industry through more stringent reporting processes.

For example, whilst all GOCs have always had to provide a Statement of Corporate

Intent and Annual Report to the Government annually, the Network Providers now

also have to provide Network and Summer Preparedness Plans. This is supported by

the following quote:

“Following the Somerville Report we have to provide a Network Planning

Report [and a] Summer Preparedness Plan... by May each year [that sets]

out how the organisation plans to prepare its supply network for the

upcoming summer to minimise outages to customer supply, manage and

minimise the impact to weather driven events to customer supply, identify

and respond to emergencies with potential to impact customer supply, and

protect customers from electricity supply issues. That goes to the

Government and Regulator.” (D)

This serves to reinforce the importance of ensuring the security of supply, and

provides the Government with a level of assurance that the organisations are

planning for, and investing in reliable outcomes. The change in shareholder

orientation has also been reinforced via the presence of politically appointed

company Boards of Directors, who are a key link for driving the Government’s

orientation home to the organisations. The change appears to have been well received

by the Senior Management of all industry GOCs, which are now focused on ensuring

reliable outcomes, as evidenced by the results of the organisational level analysis

discussed previously in this Chapter.

Whilst not related to the shareholder’s orientation, another source of constant positive

oversight is from external regulatory bodies that contribute to reliable outcomes

within the industry. Most participants noted the important role of the National

Electricity Market Management Company (NEMMCO), or its successor the

Australian Energy Market Operator (AEMO). Their authority is critical in

coordinating the industry, compelling the organisations to act reliably by ensuring

that they are accountable through the range of formal protocols that must be

followed. This driver is not unique to the GOCs, but exists as a powerful driver for

both private and Government-owned participants within the industry.

Thus there is currently a combination of oversight from the State Government and

the National Regulator contributing to resilient outcomes within the industry. Despite

playing an important role in ensuring reliable outcomes, most participants suggested

that the nature of oversight could be enhanced from both constituents. For example,

outside their reporting requirements, many participants indicated that the

Government’s direct intervention was limited and somewhat trusting, leaving the

industry largely to its own devices until required. On a similar note, participants also

suggested that the industry would benefit if there was a more concerted, proactive

effort by the Government to coordinate activities between the GOCs. Many

participants noted the recent Swine Flu threat as an example of the limited nature of

their coordination efforts. This is highlighted by the following response:

“During the early Swine Flu pandemic there was a question in my mind as to

whether Government should be taking a more proactive role. It probably

wasn’t necessary in the end but they were quite comfortable letting us run up

our own plans, put measures into place. We had already assured them that

we had them and that they were all under control...but it seemed a little

trusting. So maybe there could be more Government coordination.” (B)

Although the industry would benefit from further improvements in regards to these

external drivers, participants suggested that the Government’s expectations are well

understood by the GOCs and that there is an implicit understanding that they would

certainly intervene to coordinate activities should reliability of supply be affected, as

was evidenced by the commissioning of the Somerville Report. This contention is

highlighted by the following statement:

“Because of Government ownership you have some faith that if it got to a

state where there were reliability problems occurring regularly then there’d

be some sort of coordinated Government intervention to solve it” (E)

Similarly, participants noted that although AEMO serves to regulate the industry and

compels organisations to act reliably, it was described as a passive Regulator that has

never interjected to ensure system reliability. It has a presence mainly

through reporting and compliance mechanisms, but was suggested not to be as

hands-on or engaged as it could be. This is evident in the following quote:

“NEMMCO does have a role and the new regulator AEMO can actually

interject to compel. But I don’t think that has actually ever played a part to be

honest. Passive Regulators is probably the right term. [It is] not like in

financial services where your regulator could be turning up and saying run

these simulations.” (E)

Nonetheless, there was a strong awareness that like the Government, the Regulator

certainly could and would interject if necessary, but it was apparent that the industry

could benefit from greater involvement by both the National Regulator, and from the

primary shareholder, particularly in respect to the coordination of activities and the

nature of oversight, as supported by the following quote:

“The Government doesn’t really play a leading role in coordinating or

disseminating that information. That’s probably an area where we’re all going

along the same path and developing very similar strategies and if we shared

them you would get a lot of benefit from that.” (A)

Despite these noted limitations, it is clear that these external mechanisms still

contribute to resilient outcomes in the Queensland Electricity Industry. The GOCs

are compelled to act reliably through this combination of external regulatory

oversight and Government ownership, characterised by a positive shareholder

orientation that is reflected by the nature of their investment and oversight. External

regulation is a constant driver for firms, but the Somerville Report served to highlight

that without a positive shareholder orientation, resilient outcomes may be

compromised. Furthermore, as the previous quotes suggest, there is also scope for

improvement in respect to Government-led coordination measures which would

further augment the collaborative and cooperative relationships evident between the

organisations, which will now be discussed.

Attitude and Ethos Characteristics

Whilst Industry Structure and Governance factors are certainly important for

contributing to resilient outcomes from an industry-wide perspective, another major

contributor emerged in terms of the collective Attitude and Ethos of the GOCs,

which can similarly be attributed to Government ownership. In this regard,

participants noted two factors of interest. Firstly, that there is a powerful collective

commitment to reliable outcomes, and a strong culture of reliability amongst the

GOCs. This has subsequently driven the development of cooperative relationships

between the GOCs which have emerged as a powerful coordination mechanism to

ensure the achievement of resilient outcomes from an industry-wide perspective.

Industry Commitment and Culture of Reliability

Building on the results of the organisational analysis, participants described the

presence of an industry-wide commitment to reliable outcomes and a culture of

reliability. All of the organisations in the individual analysis indicated the presence

of a culture of reliability and commitment to reliable outcomes, yet further suggested

that this was not unique to their organisation but rather the broader GOC fraternity.

This does not however extend to the entire industry, with participants indicating that

there are two distinct parts to the Queensland Electricity Industry, which can be divided by

ownership. The following quote supports this contention:

“Within the public sector you’ve got a very high reliability culture, to the point

where it’s not commercial; a very different culture. We don’t have historically

the debt load that [private] projects have and the Government is a pretty good

shareholder in that if we don’t make the benchmark return in any particular

year that doesn’t immediately drive pressure to restructure the business. There

are really two parts to the industry. There’s the Government-owned sector and

the private sector. The Government-owned sector has a very high

commitment... [with] any number of drivers towards reliability and one is the

historical culture and two is shareholder pressure because the Government

will immediately get pressure if any essential service doesn’t work and they

are not shy about coming out and making big statements and giving us

direction and they have certainly got a balance between their profit motive and

their delivery of essential service motive. In the private sector they’ll run

commercially and they’re not so big in Queensland that they can significantly

affect reliability individually. They probably don’t feel that they have got a

system wide responsibility outside of their obligations.” (B)

This is further reinforced by the following statement:

“It’s the nature of the industry as it currently is and the nature of the people in

it. Most of the key players have come from the same organisation... same ethos

and ethics. So it’s very heavily embedded. Right now it’s working well for the

people of Queensland. I think that’s the culture of the Queensland Industry.

[The GOCs] all come out of the same egg. It will change now because other

players are coming in. All the retailers are becoming Generators but heaven

help us if we are left to them because they won’t have that culture. They don’t

even pretend to have it. They just want to manipulate the market. They’re

never going to provide reliable generation to the people of Queensland.

Private enterprises are just the cream in the market; targeted at the money

end. Presumably they’ve paid huge dollars [and] want to maximise their

returns, which usually means cutting costs in terms of maintenance.” (C)

As suggested by participant responses, on the one hand the private players are run

commercially, and do not consider their responsibility outside of their market

objectives. Their role within the broader industry, however, is limited in that they are

not large enough to affect reliability individually, with their capacity currently

limited to small-scale peaking-plant. In contrast, the GOCs who remain the major

players within the industry controlling all network assets and base-load generation

capacity were suggested to have a very high commitment to reliable outcomes and a

very different culture by virtue of unique drivers.

The drivers were identified as firstly the shared cultural heritage, in that all of the

GOCs were essentially created from the same ‘egg’, with the disaggregation into

independent corporatised organisations occurring just a decade ago. This common

cultural heritage has been maintained, carried by the industry’s long-serving

employees and today remains embedded throughout the GOC sector. As a result of

this common heritage there is a strong collective understanding of their purpose for

being, which is as a reliable essential service provider.

Whilst this is an important observation, it is critical to note that these are the views of

the participants of the GOCs. Whilst their experience in the industry is both lengthy

and deep, these views must be tempered with a degree of caution as they may be

fraught with a level of bias. The current study did not interview private players to

substantiate the comments, although an impartial key informant provided valuable

insights that supported these views. Similarly, a number of the participants

highlighted experiences with private players through joint venture operations with

their GOCs where they were able to witness such behaviour first hand. Such insights

provided further support for tempering the potential for bias of these views.

Similarly, participants suggested that the other driver is shareholder pressure from a

common owner contributing to this collective ethos, as the Government reinforces

the importance of reliable outcomes and the purpose of being an essential service

provider. Participants also indicated that Government ownership allows participants

to remain committed to reliable outcomes as they are not burdened with the same

high debt load as the private players who are in the market looking to make a return

on investment. Whilst the Queensland Government expects the GOCs to operate as

profitable entities, it takes a balanced approach to its objectives, and can do so because of

the flexibility engendered by the portfolio effect.

The presence of this collective commitment and culture of reliability was further

reinforced by the language used by the participants to describe the GOC sector, such

as “sister Generators”, “our GOC brothers”, and the “GOC fraternity”, all descriptive

terms that suggest a sense of collective community and strong relationships. The

collective commitment and culture of reliability generated from the common

ownership and cultural heritage has also contributed to the formation of strong

collaborative relationships between the GOCs.

Collaboration and Cooperation Measures

Participants also identified that there was a strong sense of collaboration amongst the

GOCs with this acting as a coordination mechanism amongst the organisations,

although there were distinct variations in the strength of these relationships,

particularly between the sectors. The presence of an open environment for dialogue

and sharing of information was stronger at the Network end of the value chain, due

to the competitive market environment within the Generation sector. Whilst

information is shared informally across the industry, there are variations in terms of

the level of formal collaboration and engagement between the Generators and the

Network Providers, as evident in the following figure which indicates the nature and

strength of collaborative relationships between the six GOCs.

Figure 5.4 highlights that formal relationships between the Generation Sector

(Organisations A, B and C) and Network Providers (Organisations D, E and F) are

relatively weak overall. In contrast, formal collaborative relationships are particularly

strong amongst the Network Providers. The following quote by a participant from

Organisation (F) highlights this observation:

Figure 5.4: Industry Collaborative Relationships (nodes: Organisations A, B, C, D, E and F; key: formal measures, informal measures; NB: line width indicates strength of formal measures)

“Not much with Generators because it’s a different thing... even though we

don’t get involved in group discussion [etc.] we help each other. [For

example] there were protestors at Swanbank climbing up the chimney. We

were helping them with our cameras... We certainly help each other... when we

think industry, we think more (the other Distributors) and us. So good

collaboration and close collaboration and regular collaboration between us

and the other two Distributors.” (F)

Similarly, the Generators suggested that there is very little formal collaboration

amongst the Generation sector outside of the market. Given that they are operating

within a competitive market, they are said to share information for different reasons,

with concerns regarding the threat of insider trading said to preclude further

formal collaboration and communication. This is highlighted by the following quote:

“We don’t have any sort of larger collaboration outside the market for reasons

of insider trading and collusion. [This occurs] at quite a low-level because

there continue to be lots of contacts throughout the industry. Our senior

engineers have worked at every power plant in the state. Because of that you

get some low level engagement. For example, our engineers would talk to

[Organisation A] power station’s engineers because [the plants] are really

built from the same template. They’ll share spares, information on reliability.

You will find that each of the disciplines will have their informal contacts.

There’re these informal communications. They are really based on networks

and people have those networks because the industry has been reasonably

consolidated for a long time. There’re some ideas but there is no real formal

mechanism. It’s a friendly environment. As long as [insider trading risk] is

managed there’re all sorts of ways that information is shared.” (B)

Although the Generators do have some low-level formal mechanisms in place

between the GOCs, particularly between Organisation (A) and (B) due to a similar

plant design, in addition to a number of loose cooperative arrangements with

Network Providers, the majority of formal interaction occurs through the market

mechanism via measures such as ancillary services and the scheduling of planned

outages, which is not unique to the GOC Generators. Beyond the market and low-

level examples of collaboration, the Generators instead predominately rely on

informal communication mechanisms to engage with each other, and the Network

Providers through colleagues within the GOC fraternity. This nature of collaboration

amongst the Generators is supported by the following quote:

“We have our sister Generators. We do things like have a shared spares

agreement... if a piece of plant fails, we share equipment. There’s certainly

lots of informal networking between the Government-owned Generators.” (A)

Further supporting this view is the following response by a Participant from

Organisation (C):

“Compared to when we were all under one banner as the Queensland

Electricity Commission, we wouldn’t share as much and the sites wouldn’t

benefit as much as they would have before because they’d have had far more

information available because it was one organisation... it’s less than where

we were but it’s still informal and it works... you’ve got this sort of club where

people move around... if you don’t get sharing any other way you’re going to

get people who share by virtue of changing employment... all the operators

and engineers know each other and they’re all probably on each other’s e-mail

list... it’s very open.” (C)

Similarly, the Network Providers also rely on this informal mechanism, although

formal collaboration and information sharing is certainly a lot stronger at this end of

the value chain, with participants describing a range of formal measures which are

continuously being strengthened. For instance, the formal relationship described

between Organisation (F) and Organisation (D) was the strongest, with significant

formal interaction between the two entities. Similarly, there was also quite strong

formal interaction between Organisation (D) and Organisation (E), with formal

measures increasing. The formal relationship between Organisation (F) and

Organisation (E) is slightly weaker by virtue of their physical connections. However,

efforts have been made to increase the formal relationship in recent years. For

example, joint simulation testing exercises have recently been undertaken. A

selection of the informal and formal collaborative measures described by

participants are summarised in the following table:

Table 5.19: A Selection of Collaboration Measures

Informal Collaboration Measures

Strong Informal Networks: there is a close knit pool of employees that work in the

Queensland Electricity Industry, in a lot of cases, for their entire lives (a ‘life industry’).

Employees often move between organisations. Presence of strong networks and bonds.

Everyone knows everyone else and has contacts within their different disciplines (e.g.

BCM managers, IT, Engineers, Health & Safety etc.). People will talk regularly with

peers from other organisations to share ideas, gain insight et cetera. A lot of one-on-one

collaboration occurs at the individual level

Regular Communications: via e-mail and telephone (evident during the recent H1N1

Pandemic)

Small Pool of External Consultants: provide informal advice about other firms within

the industry (‘cross pollinate’)

Formal Collaboration Measures

Amongst GOC Generators

Shared Spares Arrangements: strongest between Organisations (A) and (B), but

arrangements in place between all three GOC Generators

Collaboration within the Electricity Market: scheduling maintenance to reduce impact to

supply; ancillary services cover et cetera

Arrangements to Share Personnel: low-level arrangements, for example, to share Health

& Safety advisors during site outages; however, this has no real benefit to reliability

Amongst GOC Network-Providers

Joint Training / Scenario Planning Exercises: Significant joint training exercises

annually between Organisation (D) and (F). Fewer exercises between Organisation (D)

and (E), and Organisation (E) and (F). No exercises include the Generators

Emergency / Disaster Response Arrangements: arrangements in place between all three

Network-Providers including sharing of personnel, resources and equipment to restore

supply. This was highlighted by response to The Gap Storm in 2009 and Cyclone Larry

in 2008

Sub-Contracting Arrangements: between Organisation (D) and (F) ensures very close

contact and collaboration between the two organisations

Streamlining / Standardising Processes & Equipment: to ensure seamless changeover,

particularly between Organisation (D) and (E)

Shared Spares Arrangements: between all three Network Providers.

Joint Functions: For example, Organisation (D) and (E) have joint call centre

capabilities, in addition to joint IT service provider.

As the previous discussion highlights, there are clear differences between the

industry participants in terms of the industry Attitude and Ethos characteristics, most

notably between the GOCs and the private players, but also some noted differences

between the Generators and Network Providers in terms of formal collaboration

measures. The results are summarised in the following figure:

Figure 5.5: Industry Attitude and Ethos Characteristics

Conclusion

The preceding Chapter summarised the key results that emerged from the findings. It

firstly presented the results of the organisational-level (“Wildavsky-ian”) analysis,

whereby it was identified that there were distinct similarities and differences between

the six organisations examined in respect to their resilient capacities based on an

assessment of BCM and Reliability-Enhancing characteristics (as identified in the

HRT literature). In particular, it was identified that there was a significant disparity

in the level of BCM capability between the organisations. In contrast, it was

established that Goal and Commitment characteristics, as described in the HRT

literature, were consistently strong across all six organisations.

(Figure 5.5 labels: vertical axis, Shared Culture of Reliability, LOW to HIGH; horizontal axis, Formal Collaboration Measures, LOW to HIGH; groups plotted: GOC Network Providers (D) (E) (F), GOC Generators (A) (B) (C), Private Players)

Following on from the organisational-level findings, the Chapter detailed the results

of the system-wide analysis (“Holling-ian”). From an Industry Structure and

Governance perspective the orientation of the major shareholder (the State

Government) emerged as a critical consideration with significant potential to impact

resilient functioning. Similarly, from an Industry Attitude and Ethos perspective a

collective culture and commitment to reliable outcomes was evident across the

industry that has encouraged the maintenance of collaborative tendencies between

GOC participants. Of particular interest in relation to resilience at the systemic level,

was the emergence of key sectoral considerations related to cultural and structural

inheritance factors from the earlier pre-corporatised industry environment. Inheritance

Considerations were also found to manifest at the organisational and systemic levels.

Chapter 6: Discussion and Conclusions

As indicated in previous Chapters, this research has utilised two different frames of

reference to explore resilience in the context of critical infrastructure systems. The

first, “Wildavsky-ian”, is an organisational level analysis to examine how critical

infrastructure organisations manage for resilience (the focus of RQ1). The other,

“Holling-ian”, is a systems-level analysis examining more broadly how networks of

organisations foster resilience (the focus of RQ2). The use of both frames sought to

better understand the overall research problem, which dealt with how networked

critical infrastructure systems operating in an increasingly institutionally fragmented

environment seek to foster resilient capabilities to ensure the reliable provision of

essential services.

The results from the preceding Chapter indicate that drivers of resilience exist at two

different levels within the Queensland Electricity Industry, with overlaps evident

most notably between Reliability-Enhancing and Industry characteristics. Similarly,

there is evidence of resilient and reliable functioning along the supply chain,

although this does not manifest via expected means as suggested by the relevant

literature, nor with the same combination of contributory factors outlined in this body

of knowledge. Thus, while there were distinct industry-wide conditions, there were

also noted variances between the organisations themselves and also between the

sectors (Generators and Network Providers).

Figure 6.1 summarises the key findings presented in Chapter 5, with three distinct

categories emerging from the analysis – organisational, sectoral, and industry-wide

conditions, which all appear to be influenced by a common thread that is interwoven

between them.

This Chapter will examine the significance of these findings from the perspective of

better understanding how these factors and conditions influence a capacity to become

resilient, and how they might inform the enhancement and improvement of

performance to ensure the security of electricity supply.

“Wildavsky-ian”: Organisational Level (Research Question 1)

This aspect of the research sought to examine how critical infrastructure

organisations organise for resilience. As noted in the previous Chapter, there are a

number of distinct similarities and differences at the organisational level that were

identified between the six GOCs examined. BCM and HRT Practices together

contribute to resilience, but in different combinations within each of the individual

organisations across the industry. BCM and HRT Practices can therefore be

considered key strategic determinants of resilience in the industry at the

organisational level. It was also identified however that Heritage Factors, which

manifested in the transition from a Government monopoly to a semi-privatised

(corporatised) structure also influence resilient functioning within the organisations.

Thus, the combination of organisational Practices (BCM and High-Reliability) and

Heritage Factors are suggested to be key emergent factors for how the critical

infrastructure organisations examined organise for resilience.

Figure 6.1: Summary of Key Findings

(Figure labels: frames “Wildavsky-ian” and “Holling-ian”; Organisational Emergent Factors: High Reliability & BCM Practices (Heritage Factors), Organisations A, B, C, D, E and F; Sectoral Emergent Factors: Nature of Operating Environment (Regulation & Collaboration), Nature of Infrastructure (Heritage Factors), Generation Market, Network Providers (Non-competitive); Industry-Wide Emergent Factors: Collective Commitment & Culture of Reliability (Heritage Factors), Government-Owned)

BCM and High-Reliability Practices

As was demonstrated in the previous Chapter, there was strong evidence that BCM

has been adopted by all organisations within the industry and its importance to

assuring resilient outcomes is widely recognised. There are, however, significant

inconsistencies in the approach and application by the organisations, and thus

marked differences in the level of achieved resilience amongst them. It was apparent

that BCM has been inconsistently adopted by the various organisations, with

activities currently occurring largely in organisational silos, despite having a

common owner (the Queensland Government). Further, it was evident that not all

organisations follow an approach that is consistent with those prescribed in the

literature or recognised International Standards, and instead a number of the

organisations demonstrate a conventional emergency response capability that is

rather limited in scope when contrasted with contemporary enterprise-wide

approaches to BCM.

All of the organisations appear to have a strong traditional emergency response

capability engrained by virtue of the nature of their operations, which has been

maintained during the industry’s structural transition. Some however are clearly

more sophisticated in their approach to BCM, with a number of factors contributing

to this. In one instance, it was identified that the geographic spread of infrastructure

operated by the organisation has affected the development of an enterprise-wide and

meaningful BCM capacity. This can be attributed to the fact that the organisation

inherited six geographically dispersed entities in the corporatisation process, each

with their own unique management style and processes for emergency response in

place rather than a centralised, coordinated and enduring capability.

This was a single instance; however, the variability noted above can be more

generally attributed to variations in the level of BCM resourcing within the

individual organisations in regards to both the tangible and intangible support

available from management. As noted in Chapter 5, there were distinct variations in

respect to BCM resourcing between the organisations. The organisations with the

most robust or mature BCM capability were those with evidence of significant

resourcing. In comparison, those organisations which commented on resourcing

issues or limitations were those deemed to have opportunities for improvement.

Whilst there were significant differences between the organisations in respect to

BCM capability, there was far more consistency in the level of achieved resilience in

relation to the Reliability-Enhancing characteristics, with strong evidence of the

features described in the HRT literature across the organisations.

The notion of inheritance (as an artefact from common public sector ownership) also

emerged as a factor influencing BCM and HRT Practices in the organisations. This

was particularly evident in respect to Organisation (C)’s Reliability-Enhancing

characteristics when compared with other GOCs across the industry. Following the

disaggregation of the industry, Organisation (C) inherited plant that could be

considered limited in capacity and of sub-standard quality when compared to the

assets of other GOC base-load Generators. This deficiency led to a need for

investment in new plant; however, recent investment decisions were made with cost

considerations in mind by virtue of the new corporatised environment across the

industry, and have subsequently left that organisation with challenges to their

reliability. Such conditions affect HRT Process and Design characteristics, including

technical performance, and structural flexibility and redundancy. Training and

learning has also been impacted as a result of the new and problematic plant, which

operators are finding difficult to fully understand. Such issues are reflective of the

organisation’s lower overall rating for HRT capability, and supportive of the

emergent finding that conditions associated with inheritance (pre-corporatised

influences) have impacted resilience at the organisational level.

Whilst this was a notable difference at the organisational level, there was uniformity

across all six GOCs with regards to the Goal and Commitment characteristics, with

all describing conditions consistent with those identified in the HRT literature. With

such widespread presence of these conditions within the individual organisations, it

is apparent that Goal and Commitment characteristics are not just organisational-

level characteristics but, by extension, a distinct feature of the industry’s GOCs. In

contrast, although there is some evidence of formal collaboration in terms of

response testing, and informal collaboration in respect to information sharing, there

is no consistent industry-wide BCM capability which can be attributed to a lack of

guidance from the collective owner; the Queensland Government. Accordingly,

BCM is currently an organisational level resilience-enhancing phenomenon with room

for improvement from a systemic context, whilst Reliability-Enhancing

characteristics are evident at both the organisational and systemic levels. There are

however also some variations due to sectoral conditions (discussed later – see page

194), and inheritance of organisational and cultural norms.

“Holling-ian”: Industry Level (Research Question 2)

The “Holling-ian” view of resilience was utilised as a frame of reference for

exploring the whole-of-supply chain aspects of the industry as a contributory feature

to resilient functioning. The associated research question sought to examine how

networks of organisations foster system resilience. As shown in Figure 6.1 (page

187), resilience from this perspective not only manifests at the industry-wide level,

but also at the sectoral level. In fact, distinct differences emerged between the

Generators and Network Providers as each are faced with specific business drivers

that impact on capacity to function in a resilient manner. Such variations can be

primarily attributed to the presence of a different operating environment across

sectors, resulting in different regulatory conditions and levels of collaborative

engagement. Despite these differences, collaboration is influenced to some extent by

a Collective Commitment and Culture of Reliability that is evident across the

industry. This was a distinct theme to emerge from the organisational analysis that

appears to influence resilient functioning at the industry-wide level.

Overall, it is evident that heritable traits and a deep understanding of their reason for

being are contributing to how the network of critical infrastructure organisations

examined in this research foster system resilience to ensure the reliable provision of

essential services, whilst other factors (i.e. Nature of the Infrastructure and Nature of

the Operating Environment) emerged at the sectoral level, impacting how resilient

outcomes are fostered between different parts of the larger system.

Collective Commitment and Culture

As noted in the organisational analysis, all six GOCs described Goal and

Commitment characteristics that were consistent with the HRT literature. Upon closer

examination a reason for this consistency was evidenced by a Collective Commitment

and Culture evident within all GOCs that focused effort on resilient outcomes in the

industry. All six GOCs display a clear commitment to reliable outcomes and promote

a culture of reliability, which is said to differ from the private industry participants

by virtue of a set of unique drivers. These drivers can be attributed to heritage

phenomena as a result of common Government Ownership (as discussed earlier in

Chapter 5), in addition to the Critical Nature of their Operations which has resulted

in a fundamental understanding of their reason for being (Raison D'être) as providers

of essential services. Both of these factors combine to contribute to this Collective

Commitment and Culture and by extension, influence the presence of BCM and HRT

Practices.

The Critical Nature of Operations

Positive heritable traits such as recognition of the primacy of service delivery have

survived the transition into a corporatised industry structure and continue to

influence resilient outcomes in the industry. It became clear that the

Critical Nature of their Operations and Raison D'être as essential service providers

is well understood amongst all of the GOCs in the industry. This awareness of

organisational purpose and collective understanding contributes to the presence and

ongoing development of BCM and Reliability-Enhancing Practices within the

individual organisations. It also contributes to a Collective Commitment and Culture

amongst the GOCs as this shared understanding underlies their focused commitment

and reinforces the stability of such a culture of reliability. Whilst the Critical Nature

of their Operations is well understood and is an emergent theme at the industry-wide

level, another factor significantly influencing resilient outcomes for the GOCs in

particular is the shared Government Ownership and the common heritage associated

with that ownership.

In fact, the GOCs have a dual purpose. The first is to provide essential services to the

public, and the second is to provide a return on investment to the Government

shareholder. While the former has always been the case, the latter is a factor related

to the transition from a public sector institution to corporatised entities, as opposed to

privatised entities. Thus, organisational patterns (pre-corporatisation) have been

inherited, and to a strong-yet-subtle degree, maintained after the transformation.

Government Ownership

There are clear heritage factors at play in regards to the relationships between the

GOCs. The noted Collective Commitment and Culture of Reliability can also be

attributed to Government Ownership. The individual organisations as they currently

exist were established just a decade ago following the disaggregation of the industry,

but still essentially come from the same “egg” and continue to remain under the

common umbrella with the State Government as the dominant shareholder. Thus,

despite the fragmentation and shift to a corporatised structure, this change has not

fundamentally altered the objectives or outlook of the GOCs as their behaviour

continues to be influenced by the common owner and shared cultural heritage. The

presence of this fraternity or family-like mind-set that contributes to the existence of

a Collective Commitment and Culture of Reliability has survived the industry

fragmentation, and continues to be reinforced by the industry's long-serving

employees who have maintained strong connections across the newly erected

organisational boundaries.

There does however appear to be a sensitivity to temporal factors (i.e. the nature of

the current workforce). For example, long serving employees were very much aware

of this collective ethos and heritage phenomenon, whilst very recent entrants did not

seem to relate strongly to it. A respondent who had joined the industry 12 months

before involvement in the study was largely unaware of such conditions, whilst

another who joined 18 months ago had noticed signs of the phenomenon. There does

appear however to be a strong push by the organisations to maintain positive

heritable attributes, particularly within the technical roles, with traineeships and

mentor programs, career mobility across the organisations, and the general

perception that it is a lifetime industry for many in these professions. Such features,


coupled with a widespread understanding of their Raison D'être, contribute to the

maintenance of these heritable traits.

Similarly, the continued presence of the Government as a common owner, albeit a

few steps removed, further ensures the maintenance of these heritable traits. This

also serves to reinforce the importance of, and responsibility for, reliable service

provision, as the Government is the responsible authority of last resort and is thus

held accountable for disruptions to service by constituents (i.e. Queensland voters).

Whilst historical ownership has positively influenced the development of a

Collective Commitment and Culture of Reliability through heritable attributes,

another aspect associated with Government Ownership that can, and has served to,

negatively impact on resilient outcomes within the industry in the past, is the

Shareholder's Orientation. As has been indicated, under the new operating structure

the GOCs now have to provide a return on investment to the major shareholder, the

Queensland Government. Although the industry is currently highly reliable, with the

importance of a balance between investment and return reinforced by the Somerville

Report in 2004 (State of Queensland 2004), the orientation of the shareholder in

respect to oversight and investment has negatively impacted reliable service

provision in the past. The Shareholder's Orientation can therefore be considered to

be a major indicator of resilient outcomes within the industry, as it influences the

Nature of Oversight and ultimately the level of Investment in BCM and HRT

Practices by the GOCs. This industry-wide characteristic therefore serves to

influence resilient outcomes at both the industry-wide and organisational level.

Although the current balance between reliability and profitability has not always

been maintained by all of the GOCs, it is more easily achieved in this context than

amongst the fully privatised elements of the industry. Whilst private industry

participants would certainly be seeking reliable service provision, interview

responses indicated that their motivation differs given that they are not subject to the

same level of scrutiny or regulation as the GOCs. Further, their concern for reliable

outcomes was suggested to be primarily driven by acquisition of market share and to

ensure adequate return on investment. This is different to the approach of the GOCs

that operate as providers of an essential service with a community service obligation.


Thus, whilst the GOCs exhibit a Collective Commitment and Culture of Reliability, it

was suggested that the private players do not “even pretend to have it”, as their

objectives are very much skewed towards ensuring profitable outcomes – due to the

high investment costs of entering the market. Despite this, their capacity to influence

overall reliability of service provision was suggested to be minimal due to the small-scale

nature of their generation capacity, but is expected to become more influential as

demand for electricity increases and more suppliers enter the market.

Sectoral Differences

Building on both the organisational and industry-wide analyses, it is apparent that

there are notable sectoral differences influencing resilient functioning between the

GOC Generators and Network Providers. Flexibility and redundancy emerged as a

key sectoral consideration from the organisational analysis of Reliability-Enhancing

characteristics, while collaboration and regulation emerged as points of difference

between the sectors from the industry-wide analysis. These distinct sectoral

differences can be attributed to two emergent factors: differences in the Nature of

Infrastructure operated, and also the Nature of the Operating Environment.

Nature of Operating Environment: Market-Based vs. Non-Competitive

The Nature of the Operating Environment serves to impact resilient functioning

within the industry, which is most evident in differences in both Regulation

and Collaboration. This is because the GOC Generators are

competing with private players in a market-based environment. In contrast, all

network assets (Distribution and Transmission infrastructure) remain wholly-owned

by the Government. Thus, the two sectors have different regulatory conditions and

compliance regimes that affect their operations. For instance, the Generators noted

that they are subject to certain regulatory conditions due to the competitive market,

such as Competition Policy and other related elements of the Trade Practices Act

which preclude any behaviour that could be seen to be anti-competitive or collusive.

Accordingly, the differences between the operating environments and associated

regulatory conditions have also served to impact the nature of collaboration within,

and between industry sectors.


In general, the Collective Commitment and Culture of Reliability that has developed

from the common Government Ownership and associated heritage factors has also

contributed to the formation of strong collaborative relationships between the GOCs.

This is particularly evident in respect to informal collaboration which is strong

between all GOCs and promoted via informal networks. The strength and nature of

formal collaboration measures however were found to vary between the Generators

and Network Providers as a result of the different operating environments between

these sectors, as noted above. Under the market-based conditions of the Generation

sector, the level of formal and to some extent informal collaboration between the

Generators is constrained due to concerns about allegations of collusion.

Accordingly, this can reduce the opportunity for appropriate and suitable information

sharing and collaboration on matters such as BCM and HRT Practices, that may

otherwise increase efficiencies and indeed resilient outcomes. In contrast, the formal

collaboration between the Network Providers, who are not constrained by market

conditions, was clearly far more significant and important as an enabler of reliable

and resilient functioning.

Nature of Infrastructure

In addition to the operating environment, the Nature of Infrastructure from the

perspective of structural flexibility and redundancy also emerged as a key sectoral

difference in the context of Reliability-Enhancing characteristics in the individual

organisations. In particular, it was noted that Generators have assets that are

relatively centralised geographically, located in one or a few locations, and are also

less vulnerable to wide-area disruption often caused by destructive weather events.

In contrast, the Network Providers have geographically dispersed assets that are

highly vulnerable to external threat sources, particularly weather. The scope and

extent of their infrastructure precludes a comparatively high level of physical

redundancy (as compared to the Generators) due to the high costs of implementation,

and also has further implications in regard to the level of investment required to

maintain the assets at a high level of reliability. Although a Reliability-Enhancing

characteristic explored within individual organisations, this pattern emerged as a

distinct difference amongst the sectors and thus has links to both the “Wildavsky-

ian” and “Holling-ian” lenses of resilience.


Summary of Findings – Implications for the Research Problem

It is clear from the preceding discussion that resilience-enhancing characteristics

are evident in both the organisational (“Wildavsky-ian”) and industry-wide

(“Holling-ian”) contexts, the latter also including sectoral level considerations that

emerged in the analysis. The emergent themes identified and the relationships

between them contributing to understanding the research problem are evident in the

following figure:

Figure 6.2: Key Relationships Between Emergent Themes

[Figure omitted; labelled themes: Government Ownership; Collective Culture & Attitude; Shareholder Orientation; BCM & HR Practices (Resourcing); Critical Nature of Operations; Coordination; Collaboration; Investment; Oversight; Nature of Infrastructure; Operating Environment; National Regulator; Raison D'être; External Events (e.g. GFC).]

Figure 6.2 embodies the research problem of this thesis, with all of the factors

identified in this figure suggested to collectively foster / influence resilient outcomes

in networked critical infrastructure systems functioning in an increasingly

institutionally fragmented environment. These factors however, were found to

manifest in different combinations and at different levels across the electricity supply

chain, with organisational, sectoral and industry-wide considerations. As was

highlighted previously in Figure 6.1 (page 187), and discussed extensively in

preceding sections, heritage factors maintained in the transition to a corporatised

structure, were evident across all levels of the industry and subsequently can be

considered an umbrella theme to both Research Question 1 and 2, with links to many

of the emergent findings, most notably BCM and HRT Practices, Government

Ownership, Collective Culture and Commitment, and the Critical Nature of

Operations.

The differences in how resilient capacities could manifest were found to be

influenced by aspects such as the Nature of Infrastructure in addition to the Nature

of the Operating Environment, which also consequently impacts the degree of

Collaboration between the organisations. Such factors emerged as sectoral level

considerations. Further influencing the degree of Collaboration in the industry is the

Collective Culture and Commitment to reliable outcomes which was found to be an

industry-wide condition contributing to resilient functioning, that is associated with

the organisational level Goal and Commitment characteristics as identified in

the HRT literature.

This Collective Culture and Commitment phenomenon can more broadly be

attributed to the common Government Ownership, as well as the Critical Nature of

Operations which has resulted in a fundamental understanding of their Raison D'être

as providers of essential services. These emergent themes were all found to influence

resilient functioning at the industry-wide level, and can be considered heritage

factors. Further influencing resilient functioning at the industry-wide level, and also

associated with Government Ownership, is the Nature of Oversight and the level of

Investment in resilience-enhancing practices which can be influenced, either

positively or negatively, by the Shareholder's Orientation towards reliable outcomes.

External Events, such as the Global Financial Crisis (GFC) and the

Carbon Pollution Reduction Scheme (CPRS), were also found to have the potential

to influence resilient functioning within the industry.

All of these factors, by association, can be found to influence the nature of BCM and

HRT Practices in the individual GOCs. Thus, such practices contribute to resilient

capacities at the organisational level, but many were also found to manifest at the

systemic level, under either industry-wide or sectoral considerations. Although the

research identified that there is a degree of Collaboration amongst the GOCs in

regard to BCM and HRT Practices, Coordination by the collective owner (the


Queensland Government) was identified as a means for enhancing BCM and HRT

Practices at both the organisational and systemic level. Such Coordination would

facilitate greater Collaboration amongst participants, and also assist with Investment

considerations from a cost savings perspective in terms of pooling knowledge and

resources. A further point is that resourcing was found to influence the capacity

for achieving resilience, from a BCM perspective, within individual organisations.

Resilience-Enhancing Characteristics

A key analytical framework applied in this research combines both institutional and

systemic level factors as a means for understanding institutional capacities for

fostering resilient outcomes. It is this combined analysis which allows a greater

understanding of the research problem. As suggested by Figure 3.4 (see Chapter 3 –

page 80), the critical review of the literature suggests that there would be a logical

balance between the three resilience-enhancing characteristics, namely, BCM,

Reliability-Enhancing, and Industry characteristics. The findings however suggest

that they manifest in different combinations across the industry, varying from the

predicted pattern of the resilience-enhancing characteristics (BCM, Reliability-

Enhancing) as identified in the literature, and also those emergent characteristics

identifiable at the industry level. The strength of the presence of the characteristics,

and the relationships between them are evident in Figure 6.3.

Figure 6.3: Strength of Presence & Emergent Relationships Between Resilience-Enhancing Characteristics

[Figure omitted; three linked circles labelled BCM Characteristics, Reliability-Enhancing Characteristics and Industry Characteristics.]

The BCM and Reliability-Enhancing characteristics examined were directly derived

from those firmly established in existing literature or relevant International

Standards. Accordingly, their corresponding circles are represented by a solid line. In

contrast, the circle representing Industry level characteristics is denoted by a broken

line as they were emergent factors that have not been firmly established in existing

literature. To identify Industry characteristics, the research investigation explored

predicted characteristics based on the literature, in conjunction with the iterative

research process that allowed emergent themes to come to the fore.

The findings demonstrate that, whilst all three conditions or factors are present in

some capacity, it was evident that in the context of this study the Reliability-Enhancing

characteristics were generally most consistently aligned with the classical

conditions prescribed in the literature. Reliability-Enhancing characteristics, as well

as Industry level characteristics, were strong across all organisations examined (albeit

with a few exceptions), and are thus denoted by larger circles in Figure 6.3 (page 198). This is

in contrast to BCM characteristics, which are represented by a smaller circle, as they

were inconsistently adopted by the six organisations resulting in variations in the

level of achieved resilience.

Furthermore, whilst there were variations in how resilience is achieved across the

industry, there were also variations in the strength of emergent relationships between

the three resilience-enhancing characteristics. The strength of the relationship

between these characteristics is indicated in Figure 6.3 (page 198) by the weight of

the line connecting them. Firstly, there was an emergent relationship between BCM

and Reliability-Enhancing characteristics with evidence of a number of common

factors. Although it is not the purpose of this investigation to explore these

relationships in great detail, a few notable examples include an emergent relationship

between training and learning considerations from a HRT perspective, and BCM

testing and review activities. In this instance, many participants mentioned testing

and review activities in their response to the nature of training and learning (in the

context of HRT).


Similarly, in their discussion about structural flexibility and redundancy

characteristics, participants discussed workaround measures in place as a form of

redundancy to ensure continuity from a BCM perspective. Whilst the examples

provided are by no means exhaustive, the relationships identified between BCM and

Reliability-Enhancing characteristics are an emergent consideration and could serve

as an interesting avenue for future research to be explored in greater detail.

The strongest relationship was evident between Reliability-Enhancing and Industry-

wide characteristics, with many of the Reliability-Enhancing characteristics

manifesting at the Industry-wide (“Holling-ian”) level. Of particular interest were the

HRT Goal and Commitment characteristics which were so strong at the

organisational (“Wildavsky-ian”) level, that they emerged as an industry-level

characteristic. For example, culture of reliability and commitment to reliable

outcomes at the organisational level emerged as a Collective Commitment and

Culture from an industry-wide perspective. A further example is in relation to the

shared spare parts arrangements which were evident at the industry level, but also

have links to redundancy and flexibility measures from a HRT perspective at the

organisational level.

In contrast, the results demonstrated that there is a much weaker relationship between

Industry-wide characteristics and BCM activities. For example, although there is

evidence of collaboration in respect to testing activities, as well as informal

discussions regarding BCM, this is the current extent of BCM activities undertaken

at the industry level. Overall, BCM activities are currently conducted largely in

organisational silos and thus, there is significant room for improvement in this regard.

Potential Threats and Opportunities for Improvement to Resilient

Functioning

Although there are resilience-enhancing characteristics evident in the industry, both at

the organisational and industry level, there are a number of potential threats that have

been identified that may impact resilient functioning into the future. Conversely,

a number of opportunities for improvement have also been identified from the

preceding analysis that may further enhance resilient functioning and thus reliable

service provision.


Threats to Resilient Functioning

From an external perspective, participants noted the impact of the current GFC,

which has increased focus on cost considerations. This would logically influence

decisions to apply capital expenditure to the purchase of new infrastructure elements

and deep maintenance amongst other things. Although a concern, participants do not

believe that it will have an adverse effect on the reliability of supply, but it is making

them more prudent with their spending, particularly in regard to maintenance. In

contrast however, the CPRS was identified as the largest source of potential concern

within the industry, particularly for the Generators, with potential to impact

reliability of supply and the cost of electricity service provision. Such emergent

issues in the external environment are creating uncertainty for industry participants

and have thus resulted in more targeted risk-based decision making, particularly in

regard to maintenance in the context of a financially constrained commercial

environment. Such pressures on investment may have the potential to

impact resilience in the medium-to-long term despite the risk-based decision making.

A further source of potential disturbance that may impact the resilience of the

industry is the threat of privatisation which was cited by most participants. This is of

particular concern as ownership can be seen to underpin many of the existing

resilient capacities present within the Queensland Electricity Industry. Participant

responses suggested that a change of ownership arrangements could fundamentally

alter the drivers that currently contribute to resilient outcomes within the industry,

both at the organisational and industry-wide level. Although participants cited the negative

experiences of interstate counterparts who have undergone this level of

restructuring, the extent of this potential impact on the Queensland Electricity

Industry is unknown and is at this stage speculative.

Opportunities for Enhancement

Whilst there are a number of threats that may impact or challenge existing conditions

contributing to resilient outcomes, a number of opportunities have been discerned

from the analysis that may enhance resilient functioning in the industry. The first is

in respect to addressing the disparity in the level of BCM capability between the

organisations. Although all six organisations have the same owner, it was identified


that all vary in respect to their BCM capability. The first factor identified

contributing to this disparity is the level of resourcing of BCM functions within the

individual organisations. It was identified that those organisations with a centralised

BCM function at the corporate level and dedicated staff, have a more robust BCM

capability that is consistent with a recognised International Standard and is

enterprise-wide in nature.

Thus, it is clear that to enhance resilient outcomes in terms of BCM practices,

organisations would benefit from appropriate resourcing of a dedicated function

which requires Senior Management support. Whilst BCM activities may already

have implicit Senior Management support, it is critical that this is backed by tangible

support. Furthermore, participants also noted that although they have to provide an

assurance to the Government that they have a continuity response capability and

testing regime in place, there are very few policy guidelines offered by the

Government in terms of what constitutes a best practice approach to BCM. This

indicates a policy gap that is contributing to the disparate BCM capability.

More generally, there is evidence that significant value might be gained in terms of

resilient functioning through closer collaboration, particularly in respect to BCM.

Aside from informally sharing information, the organisations are currently

conducting their BCM activities as independent silos, as evidenced by the clear

differences between their BCM capabilities. According to the Australian BCM

Standard (Standards Australia 2004a, 12), a lack of collaboration is a common

practical problem, as the threat of competition often prevents organisations from

sharing information that would serve to benefit them collectively in developing their

BCM capabilities. This is possibly indicative of why there is more formal

collaboration with BCM and other activities at the non-competitive, network end of

the electricity supply chain. Differences between the organisations' BCM capabilities

within this sector indicate however that this collaboration is not comprehensive and

could be further augmented. Furthermore, despite operating in a competitive market,

it was also noted by the GOC Generators that they do not consider their “sister

Generators” as competition and would have greater formal interaction if they could.

In fact, the fear of collusion and the absence of a coordinating mechanism are factors

currently thwarting closer collaboration, particularly in the Generation sector.


Thus, given the apparent weakness or incompleteness of governance directives to the GOCs

and even the private players, the Government could play an enhanced role in

bringing everyone together to overcome fears of collusion and coordinate greater

collaborative engagement. Such validation and coordination by Government is a

critical emergent factor for consideration, whereby the industry would benefit from

Government policy directives defining clear parameters around how that engagement

should occur, and in particular more explicit guidelines and requirements for BCM

practices.

Enhanced collaboration through a coordinated mechanism could be of great benefit,

because each of the organisations independently possesses unique skills, and

pooling this knowledge and indeed resources together could add significant value. In

addition to benefits of collaborative learning and the potential cost savings of not

having to individually seek external advice, collaboration will further enhance the

potential for resilience in the industry by ensuring that higher level planning has been

undertaken across the value chain, and also ensure end-to-end service reliability.

Thus, enhanced coordination combined with clear guidelines would ensure

consistency and promote greater efficiency, which is of particular benefit given the

current climate of economic uncertainty and increased frugality. It is noted in

particular, that the current disparity in BCM capability at both the organisational and

industry level may be reduced with appropriate resourcing at the organisational

level and greater guidance at the industry level.

Implications for Theory and Future Research Directions

Resilience has been identified as an important capacity for critical infrastructure

systems, to ensure the uninterrupted delivery of essential services. This capacity is

now critical for not only the organisations responsible for the sustained delivery of

these vital services, but also increasingly the broader industry settings given the

interdependent and networked nature of contemporary infrastructure systems. This

research contributes significantly to our understanding of resilience in geographically

dispersed critical infrastructure by using HRT and BCM as key bodies of literature.

Addressing the literature gaps identified in Chapter 2 and summarised in Chapter 3,


this research has explored evidence of resilient functioning in critical infrastructure

systems from a sociological perspective. This is a critical contribution to the

literature as few studies have looked at resilience in the context of critical

infrastructures from a non-engineering perspective.

To do this it utilised two conceptualisations of resilience, referred to as “Holling-ian”

and “Wildavsky-ian”, which were found to be useful frames for exploring how

resilience manifests within critical infrastructure systems. These frames provided

insight into how this phenomenon manifests at multiple levels within an electricity

supply chain (i.e. organisational, and industry-wide including sectoral

considerations). These frames allowed for different insights to emerge into the

challenges of managing complex critical infrastructure, within the context of a

partially fragmented (corporatised) environment.

As noted in Chapter 2, resilience is a multi-disciplinary concept that remains ill-defined

and difficult to operationalise. The variables that contribute to resilient

capacities in complex systems remain largely unknown and thus there are few

defined variables that can be measured when studying resilience (Cumming et al.

2005). However, as detailed in the review of the literature, BCM and HRT are logical

enablers of resilient functioning in organisations. Both can contribute to resilient

functioning in organisations through their capacity to prevent disruptions from

occurring, and also to quickly respond to, and recover from them should they occur.

Whilst the link between these characteristics and resilient outcomes is pragmatic, it

cannot be measured.

For example, whilst HRT literature has identified a link between HROs and resilient

functioning, further research is required to operationalise this link. Similarly,

although firmly established in International Standards and progressed in academic

literature by the important contributions by the likes of Elliot et al. (2001), the

theoretical underpinnings of BCM have not been clearly delineated in relevant

literature, nor is the relationship between BCM and resilient functioning measurable.

Furthermore, few studies have explored the characteristics that contribute to resilient

functioning at the broader industry level; an important consideration given the


contemporary networked structure of critical infrastructures. Thus, this

research also sought to explore emergent characteristics that may be contributing to

resilient functioning at the industry-level (between interconnected organisations).

Emergent themes for exploration were identified in relevant literature and throughout

the research process. Building on exploratory work by Seville et al. (2006) which

posited a number of conditions and characteristics that may be contributing to

resilience between interconnected infrastructure organisations, this research has

sought to better understand how resilience manifests at this level, identifying a

number of similar and some additional conditions, that may contribute to resilient

functioning under networked conditions.

The aim of this research was not specifically to measure the characteristics identified

in the literature, but to explore and better understand how these characteristics

manifest in the organisations and the broader industry, and how they contribute to

resilient functioning. This research has identified that all three of these conditions

(BCM, HRT and Industry characteristics) are present and may contribute to resilient

functioning across the industry, but this occurs in different combinations. Whilst the

current research provides valuable insight and the findings elucidated the

relationships further, future confirmatory research is required before definitive

conclusions about their viability as indicators of resilient functioning can be

drawn. This work provides useful context for progressing this valuable line of

research.

Future studies may examine each of the identified resilience-enhancing

characteristics in greater detail (e.g. a study examining BCM, HRT or Industry

Characteristics), and/or in different contexts. For example, a future study may

examine these conditions in a different type of critical infrastructure (e.g.

Telecommunications industry) that has undergone a similar restructuring process, or

explore how these characteristics manifest under different industry structural

conditions (e.g. a fully-privatised as compared to a wholly Government-owned

critical infrastructure industry). Such research would contribute to a greater

understanding of the applications and limitations of the concept of resilience to large-

scale, interconnected systems of infrastructure.


In particular, this research has also validated the presence of Reliability-Enhancing

characteristics as prescribed in the HRT literature, and importantly in the context of

partially fragmented critical infrastructure systems. In his Masters Dissertation, de

Bruijne (2006) explored HRT in the context of two fully privatised industries and

found that Reliability-Enhancing characteristics were decreased under those

institutional conditions. This research has demonstrated that under

less extreme institutional fragmentation (i.e. corporatisation) there was little impact

on Reliability-Enhancing characteristics, although it does have potential to be

affected, as was evidenced by the results of the Somerville Report (as discussed in

Chapter 5). Future research is required to explore the presence of these conditions in

other critical infrastructure industries operating under similar institutional conditions

(e.g. corporatised structure) to verify or add to the findings presented in this study.

One finding differed slightly from the description provided in the literature: the

Importance of Reliability (a Reliability-Enhancing characteristic) as detailed by

de Bruijne (2006). Whilst the literature suggests that reliability, with a strong

safety focus, should be the primary objective and should not be marginalised in favour

of other more common organisational objectives such as

profitability, participants spoke about the importance of maintaining a balance

between the corporate objectives of profitability and those of reliability/safety,

instead of a one-dimensional approach. All six organisations were found to advocate

this balanced approach to corporate objectives, as reliability (safety) and profitability

(efficiency) were described as being mutually dependent. That is, an electricity

organisation cannot be profitable if it is not reliable, and similarly it cannot invest in

reliability if it is not profitable. This relationship is represented in Figure 6.4.

Figure 6.4: A Balanced Approach to Reliability

[Figure: Reliability (Safety) balanced against Profitability (Efficiency)]

Future studies may seek to explore how this characteristic contributes to reliable

functioning in critical infrastructure organisations, or similarly how organisations

work to balance these competing organisational goals to ensure reliable functioning

under a range of operational conditions.

As the results and discussion have identified, the Goal and Commitment

characteristics from the HRT literature were the most pronounced within the

industry, and can also be seen to manifest at the industry level. This is interesting in

the context of the HRT literature as it has not yet been widely explored under

industry level (supply chain) conditions. While the research presented here has

touched on this issue in a limited manner, more detailed assessments are required to

further extend this work from the current level of the organisation to broader foci at

an industry-wide level. With this in mind, HRT can also be seen to have value in

application to continuity and reliability in general supply chain contexts.

The research has also provided insight and understanding of BCM characteristics as

defined in International Standards (Standards Australia 2004a; British Standards

Institute 2006) from an academic perspective, as well as exploring the link between

this valuable management process and resilient functioning in the context of critical

infrastructure, from both organisational and broader industry perspectives. In

particular, the results highlighted the importance of appropriate resourcing, both in

terms of tangible and intangible support. Future studies may seek to examine in

greater detail how support (tangible and intangible) affects an organisation’s BCM

capability.

In addition, the research identified that there are increasing pressures on investment

in resilience-enhancing practices as a result of external events such as the GFC and

CPRS, which have increased the importance and prevalence of risk-based decision

making and futures planning. Given the constrained environmental conditions,

development of research exploring the role of scenario-based futures thinking from a

resilience perspective, and how it can help ensure reliable service provision in critical

infrastructure systems, is indeed essential.

Furthermore, the analysis provided some interesting results in that there was a clear

overlap evident between particular BCM and Reliability-Enhancing characteristics

which are both considered to be key, yet independent, resilience-enhancing

management processes in the literature. Although not explored in great detail in this

study, the relationships identified between BCM and HRT are an emergent

consideration and a promising avenue for future research. A greater understanding of

the synergy between these two resilience-enhancing management practices may offer

significant benefits for both the literature and professional practice.

Research Limitations

A limitation of this study is that the findings may be unique to, or constrained by, the

context of the industry as a corporatised, public sector industry. Thus, findings reported here,

while being informative, may not provide definitive direction or be applicable to a

fully privatised electricity supply chain. For instance, the heritable traits identified

may not be a notable phenomenon across other fully privatised electricity industries

and may be limited to the context of the industry studied or other electricity

industries where a similar corporatisation process has been undertaken. It would have

been useful to study multiple industries as cases, however this was not feasible due to

the constraints associated with a Masters Dissertation. This notion is supported by

Halinen and Törnroos (2005), who contend that an embedded single case design is

often the only option for research examining networks, given the demands to

examine multiple organisations and relationships within the one broader network.

Similarly, the level of detail and depth of analysis has been constrained by the

amount of time and resources available within the boundaries of a Masters

Dissertation. Accordingly, the current study has not been an in-depth examination of

any one group, or a detailed study of any one literature area (e.g. BCM, HRT and

Industry conditions). The exploratory nature of this work has examined the presence

(or absence) of these characteristics and how they contribute to resilient outcomes.

Accordingly, the study has not sought to define remedies for any gaps, but more to

describe them and elicit meaning of the phenomenon in the context of the problem at

hand. The findings may have been enhanced by a longer period of time and also by

the researcher being embedded within the six organisations, allowing for a more

detailed first-hand observation of the phenomenon.

Conclusion

A key goal of this research has been to better understand how resilience can be

engendered in networked critical infrastructure systems. The broader context of the

Queensland Electricity Industry is that it operates under increasingly institutionally

fragmented conditions, yet is expected to maintain reliability and essential supply of

service. In order to further explore this context, the research has utilised two different

analytical frames. The first focused on organisation-specific issues (“Wildavsky-ian”),

while the latter (“Holling-ian”) allowed for the exploration of whole of supply

chain conditions. At the organisational level, BCM and HRT Practices were found to

influence resilient functioning, but these characteristics manifested in different

combinations.

More broadly, at the systemic level a sense of Collective Culture and Group

Commitment to reliable outcomes and the Shareholder’s Orientation were identified

and linked to Government Ownership. Sectoral level considerations such as Nature of

Infrastructure and Operating Environment also emerged from the analysis

influencing resilient functioning at the systemic level. Heritage factors maintained in

the transition to a corporatised structure were also found to influence resilient

functioning across all levels.

Findings of the thesis suggest that for the purpose of this industry, the two analytical

frames (“Holling-ian” and “Wildavsky-ian”) have allowed the generation of findings

supporting the proposal that the industry as a whole exhibits resilient capacities. This

however was found to manifest at multiple levels across the Queensland Electricity

Industry and in different combinations, and can thus be considered an emergent

systems-level phenomenon. Further, individual segments of the industry exhibit

variable levels of resilience intrinsic to their function and role within the supply

chain. Overall, the emergent factors identified contribute to resilient functioning

within this critical industry, and indeed support reliable provision of this essential

service.

Appendices

Appendix A: Supporting Questionnaires

The Role of Business Continuity Management

The purpose of this questionnaire is to compare the organisation’s degree of preparedness

across themes of BCM. When responding, please consider how extensive your organisation’s

preparation is amongst each of the themes.

How extensive is your organisation’s operational capacity across the following themes? Please circle either: High (Extensive) / Medium (Moderate) / Low (Limited).

• The degree of implementation of BCM activities?

HIGH MEDIUM LOW

Implementation

• Comprehensive (both internal & external) assessments of threat environment?

HIGH MEDIUM LOW

Threat Assessment

• The degree of senior management input into BCM development and BIA?

HIGH MEDIUM LOW

Role of Senior Management

• The level of integration of information on BCM and Risk Issues with mandated audit and financial reporting to board level?

HIGH MEDIUM LOW

Corporate Governance

• Regular testing of plans and crisis simulations?

HIGH MEDIUM LOW

Testing of Plans

• The degree to which BCM activities are embedded and communicated across the entire organisation?

HIGH MEDIUM LOW

Embeddedness within Organisation

• The degree to which ‘futures’ scenarios are used as strategic planning tools?

HIGH MEDIUM LOW

Futures Scenarios

• The degree to which BCM is viewed as an integrated and major component of the organisation’s Risk Management Framework?

HIGH MEDIUM LOW

Integration with Risk Management Frameworks

• The degree to which the organisation’s BCM and Risk Management activities follow the Australian/New Zealand Standards: ANZS:4360 and HB226.

HIGH MEDIUM LOW

Australian / New Zealand Standards

• Overall detail and comprehensiveness of BCPs?

HIGH MEDIUM LOW

Sophistication of BCM Activities

High Reliability Theory in Practice

The purpose of this questionnaire is to identify Reliability-Enhancing characteristics within

the organisation using major themes that are well established within the academic literature

on High Reliability Theory. When responding, please consider the extent to which your

organisation exhibits the Reliability-Enhancing characteristics listed below.

To what extent do you think your organisation exhibits the following characteristics? Please circle either: High (Extensive) / Medium (Moderate) / Low (Limited).

• Exceptionally high reliability levels?

HIGH MEDIUM LOW

Reliability Levels

• Sustained high technical performance?

HIGH MEDIUM LOW

Technical Performance

• High levels of structural flexibility and redundancy measures?

HIGH MEDIUM LOW

Flexibility & Redundancy

• Encourage/promote high degree of responsibility and accountability amongst all employees?

HIGH MEDIUM LOW

Autonomy & Accountability

• Encourage/promote flexible decision making processes and collegial patterns of hierarchy?

HIGH MEDIUM LOW

Decision Making & Organisational Hierarchy

• A continual search for system improvement through organisational learning and regular training exercises (including for worst case scenarios)?

HIGH MEDIUM LOW

Training & Learning

• Reliability is not marginalisable (i.e. cannot be traded off; is the number one priority) within the organisation?

HIGH MEDIUM LOW

Importance of Reliability

• Promotion/existence of an organisational culture of reliability?

HIGH MEDIUM LOW

Organisational Culture of Reliability

• Commitment to maintaining highly reliable operations?

HIGH MEDIUM LOW

Commitment to Reliability

• External groups with access to timely and credible information provide guidance to the organisation?

HIGH MEDIUM LOW

External Oversight

Appendix B: Interview Guide

Session 1 – BCM Questions

What is your position within the organisation?

With this position, what activities are you responsible for?

Where are the organisation’s Risk Manager(s) positioned within the company

and who do they report to?

1. Implementation

Why did you rate the organisation low/medium/high for this theme?

When did the organisation first implement BCM activities?

Why did the organisation decide to implement BCM?

How did the organisation implement BCM activities?

2. Threat Assessment

Why did you rate the organisation low/medium/high for this theme?

What types of threat have been identified as being more critical for your

organisation?

Once threats are identified, what occurs?

3. Role of Senior Management

Why did you rate the organisation as low/medium/high for this theme?

How are senior management involved in BCM development and specifically in

the Business Impact Analysis?

4. Corporate Governance

Why did you rate the organisation as low/medium/high for this theme?

If so, how is continuity planning included in your organisation’s standard

patterns of corporate reporting?

5. Testing of Plans

Why did you rate the organisation as low/medium/high for this theme?

How regularly does the organisation test plans and conduct crisis simulations?

Within this theme I will also ask you about the review of your business

continuity activities. So do you have a BCM lifecycle (i.e. frequency of updating,

testing strategies, maintaining relevance)? How often do you re-do BCPs?

Do you have defined triggers to review and re-scope the plans? If so, what are

they? Are they internal, external, or a combination of both?

6. Embeddedness

Why did you rate the organisation as low/medium/high for this theme?

How does the organisation embed BCM across the entire organisation?

How does the organisation communicate BCM activities across the entire

organisation?

7. Futures Scenarios

Why did you rate the organisation as low/medium/high for this theme?

Is there any difference in emphasis placed on strategic planning as compared to

continuity planning?

8. Alignment / Integration of BCM with Risk Management Framework

Why did you rate the organisation as low/medium/high for this theme?

How are the organisation’s BCM activities integrated within its wider risk

management framework?

9. Australian / New Zealand Standards

Why did you rate the organisation as low/medium/high for this theme?

What BCM framework does your organisation align itself with?

10. Sophistication of BCM Activities

Why did you rate the organisation as low/medium/high for this theme?

How do you believe you could further improve the quality/comprehensiveness of

your BCM activities?

Session 2 – High Reliability Theory Questions

1. Overall Reliability Levels

Why did you rate the organisation as low/medium/high?

What are the organisation’s current reliability levels?

2. Technical Performance

Why did you rate the organisation as low/medium/high for this theme?

First class machinery and highly trained competent staff are said to be critical for

sustained high technical performance. How does the organisation go about

investing in these features?

3. Flexibility and Redundancy Measures

Why did you rate the organisation as low/medium/high for this theme?

Given that incidents are always likely to occur despite the best intentions,

redundancy measures are considered critical to organisational resilience. How

does the organisation use redundancy measures to help ensure

organisational resilience?

4. Autonomy and Accountability

Why did you rate the organisation as low/medium/high for this theme?

How does the organisation encourage and reward employees for the discovery

and reporting of errors?

5. Decision Making and Organisational Hierarchy

Why did you rate the organisation as low/medium/high for this theme?

Because errors in high risk organisations can propagate rapidly, a hierarchical,

bureaucratic structure cannot operate at all times. How does the organisation

encourage more collegial, decentralised, flexible decision making processes and

patterns of authority, particularly during emergencies and times of stress?

6. Training/Learning

Why did you rate the organisation as low/medium/high for this theme?

Highly reliable organisations continuously strive to improve their reliability

performance, expertise and their ability to deal with reliability threatening events

through regular training and organisational performance reviews. How does the

organisation support these characteristics?

7. Importance of Reliability

Why did you rate the organisation as low/medium/high for this theme?

Pressures for efficiency and economy are said to conflict with the goal of

reliability. How does the organisation ensure that the goal of reliability is not

compromised for other common organisational goals such as pressures for reduced

costs or increased profits?

8. Organisational Culture of Reliability

Why did you rate the organisation as low/medium/high for this theme?

How does (could) the organisation promote an organisational culture of

reliability?

9. Commitment to Reliability

Why did you rate the organisation as low/medium/high for this theme?

What organisational processes and features do you believe demonstrate the

organisation’s commitment to high levels of reliability?

10. External Oversight

Why did you rate the organisation as low/medium/high for this theme?

How does pressure from external stakeholders impact the organisation?

Closing Question

Are there any other issues/organisational aspects not yet discussed that you

believe could be facilitating or inhibiting resilience within your organisation?

Session 3 – Systems View Questions

As an overall question, how has industry restructuring (institutional fragmentation)

impacted industry reliability and resilience? Why?

How are generation, transmission and distribution functions coordinated / integrated

to ensure that the lights stay on in Queensland? What measures are in place to ensure

accountability for reliable service provision? How has the level of regulation

changed?

Investment is a major indicator of resilience. From an industry wide perspective,

have any issues over investment in reliability/resilience-enhancing activities arisen

under the new institutional environment?

How aware is your organisation of critical dependencies with upstream and

downstream dependents? How does your organisation identify and generate this

awareness & what is done to manage them?

Do you believe that the broader industry acknowledges the importance of

maintaining reliability? What do you think demonstrates this?

Do you believe that there is a culture of reliability and/or a shared commitment to

reliability in the Queensland Electricity Industry? Why?

What do you think could put pressure on this current mindset?

How does your organisation collaborate and engage with other industry participants

to ensure system reliability and resilience?

What mechanisms (formal or informal) are in place for industry participants to

communicate and share information?

Thinking about your upstream & downstream dependents – how does the

organisation engage & involve other industry participants in your Risk & BCM

activities? Please describe your organisation’s involvement (if any) in other industry

participants’ Risk & BCM activities? What about other industry stakeholders (e.g.

Government – Federal, State)?

How do industry participants collaborate in terms of scenario planning and

continuity response exercises? What are some of the arrangements that are currently

in place?

What collective measures or arrangements are in place to ensure system wide

flexibility and redundancy?

Have the recent institutional changes increased resistance to inter-organisational

cooperation, information sharing & engagement? If so, why?

Is this resistance attributable to privacy / reputation concerns in a more competitive

environment?

What other issues may be inhibiting or facilitating resilience in the industry?

Are there any possible issues that have potential to impact reliability into the future?

What improvements do you believe could be made to improve the overall resilience

of the industry?

Session 4 – Follow-Up Questions

Why do you believe BCM is important to your organisation? How do you

believe it contributes to the resilience of your operations?

Overall, would you consider your current BCM process to be quite mature?

Why?

o Would you describe your Business Continuity structure as formal or

informal? Why?

o Specifically, do you consider your governance & management structures

for BCM to be mature? Why? Would you say that you have the full

commitment & support of the Senior Management (EMT) & the Board?

Why?

o Do you consider your current BCM practices (in terms of training, threat

assessment, testing & review etc.) to be mature? Why?

One benefit of BCM is to improve operational resilience to unforeseen events (i.e.

preparing for the unexpected). Have you developed broad credible disruption

scenarios to frame your continuity planning activities? If so, could you please

explain how you identified those and how you use them?

All Standards mention the use of a Business Impact Analysis (BIA) as part of a

best practice model. Do you identify critical business functions or processes

within the organisation (i.e. an understanding of what is important to the

organisation)?

o Do you do a Business Impact Analysis (for critical business processes or

functions)? If so, how? (e.g. Resource Requirements, Interdependencies,

MAOs etc.)

o Has the organisation done a comprehensive analysis of interdependencies

for critical business processes?

o Do you have BCPs for your critical business processes or functions?

Do plant operators and trading staff, who have second by second responsibility

for ensuring reliability, have autonomy over decision making on a day to day

basis? Is this supported by a team environment?

Appendix C: Ethical Consent Form

PARTICIPANT INFORMATION for QUT RESEARCH PROJECT

“Resilience in the Queensland Electricity Industry”

Research Team Contacts

Ms. Natalie Sinclair, Masters Student: 0402 363 638, [email protected]

Dr. Paul Barnes, Senior Lecturer & Principal Supervisor: 0439 545 551, [email protected]

Description

This project is being undertaken as part of a Masters project for Ms. Natalie Sinclair. The purpose of this project is to assess impediments to, and opportunities for, the successful implementation of organisational resilience in the Queensland Electricity Industry. The research team requests your assistance because you represent a key area of the Electricity Industry in Queensland.

Participation

Your participation in this project is voluntary. If you do agree to participate, you can withdraw from participation at any time during the project without comment or penalty. Your decision to participate will in no way impact upon your current or future relationship with QUT. Your participation will involve a series of interviews and a short questionnaire task. The interviews will be conducted at a location deemed appropriate by you (the participant), and should take approximately an hour per session. Follow-up interviews may be required in some instances.

Expected benefits

It is expected that this project will not benefit you individually. However, results are likely to benefit the organisation and the wider Electricity Industry.

Risks

There are no risks beyond normal day-to-day living associated with your participation in this project.

Confidentiality

All comments and responses are anonymous and will be treated confidentially. The names of individual persons are not required in any of the responses. The interviews will involve audio recording, however participants can choose to participate without being recorded if they wish. All audio recordings will be destroyed after the contents have been transcribed. Comments obtained in the interviews can be verified by participants prior to final inclusion. All information obtained during interviews will remain confidential, accessed only by the interviewer, and will only be used for the purpose stated above.

Consent to Participate

We would like to ask you to sign a written consent form (enclosed) to confirm your agreement to participate.

Questions / further information about the project

Please contact the research team members named above to have any questions answered or

if you require further information about the project.

Concerns / complaints regarding the conduct of the project

QUT is committed to researcher integrity and the ethical conduct of research projects. However, if you do have any concerns or complaints about the ethical conduct of the project you may contact the QUT Research Ethics Officer on 3138 2340 or [email protected]. The Research Ethics Officer is not connected with the research project and can facilitate a resolution to your concern in an impartial manner.

CONSENT FORM for QUT RESEARCH PROJECT

“Resilience in the Queensland Electricity Industry”

Statement of consent

By signing below, you are indicating that you:

have read and understood the information document regarding this project

have had any questions answered to your satisfaction

understand that if you have any additional questions you can contact the research team

understand that you are free to withdraw at any time, without comment or penalty

understand that you can contact the Research Ethics Officer on 3138 2340 or [email protected] if you have concerns about the ethical conduct of the project

agree to participate in the project

understand that the project will include audio recording

Name

Signature

Date / /

Appendix D: Summary of Participant Questionnaire Responses

Participant Ratings

Themes                            A1   A2   B1   C1   C2   D1   D2   E1   E2   F1

Business Continuity Management

Implementation                    H    H    M-H  M-H  H    M    M    M-H  M    H
Embeddedness                      H    H    M-H  H    M    M    M    H    M    M-H
Sophistication                    H    H    M-H  M-H  M    M    M    M    M    H
Role of Senior Management         H    H    M    M-H  H    M    M    L-M  M    H
Corporate Governance              M    H    M    H    H    M    M-H  H    H    H
Futures Scenarios                 H    M    M    H    M    M    H    M    M    H
Threat Assessment                 M    H    M-H  M-H  M    M    H    H    M    M-H
Testing & Review                  H    H    M-H  H    H    M    M    M    L    H
Use of Standards                  H    H    M    H    M    H    M-H  H    H    L-M
Integration with Risk Management  H    H    M-H  H    H    M    M    M    M    M-H

High Reliability Theory

Reliability Levels                H    H    H    M-H  M-H  H    M-H  H    M    H
Technical Performance             H    H    H    M-H  M-H  H    H    H    M    H
Flexibility & Redundancy          H    M    H    M-H  M-H  H    H    H    H    M
Autonomy & Accountability         M    H    M-H  M-H  H    H    M    M    H    M-H
Decision Making & Hierarchy       M    H    H    H    H    H    H    M    M    H
Training & Learning               M    H    M-H  H    H    M    H    M    M    H
Importance of Reliability         H    H    M    H    H    H    H    M    H    H
Culture of Reliability            H    H    H    H    H    H    H    H    H    H
Commitment to Reliability         H    H    H    H    H    H    H    H    H    H
External Oversight                H    H    M-H  M-H  L-M  M    H    H    M    H

Key

H = High

M-H = Medium to High

M = Medium

L-M = Low to Medium

L = Low
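
The five-point key above is an ordinal scale, so a row of participant ratings can be summarised by its lowest, middle and highest values. The following is a minimal sketch only, not part of the study's method: the numeric codes 1 to 5 and the names RATING_SCALE, parse_row and summarise are assumptions introduced for illustration, and the two example rows are copied from the table in this appendix.

```python
from statistics import median

# Ordinal encoding of the rating labels in the Key.
# ASSUMPTION: the numeric codes 1-5 are illustrative only; the thesis
# does not assign numbers to participant ratings.
RATING_SCALE = {"L": 1, "L-M": 2, "M": 3, "M-H": 4, "H": 5}
LABELS = {value: label for label, value in RATING_SCALE.items()}

# Participant codes in the column order used in the table.
PARTICIPANTS = ["A1", "A2", "B1", "C1", "C2", "D1", "D2", "E1", "E2", "F1"]


def parse_row(ratings: str) -> dict[str, int]:
    """Map a whitespace-separated row of labels to {participant: code}."""
    labels = ratings.split()
    if len(labels) != len(PARTICIPANTS):
        raise ValueError(
            f"expected {len(PARTICIPANTS)} ratings, got {len(labels)}"
        )
    return {p: RATING_SCALE[label] for p, label in zip(PARTICIPANTS, labels)}


def summarise(ratings: str) -> dict[str, object]:
    """Return the lowest label, median code and highest label of a row."""
    scores = sorted(parse_row(ratings).values())
    return {
        "low": LABELS[scores[0]],
        "median": median(scores),
        "high": LABELS[scores[-1]],
    }


# Two rows copied from the table: Implementation and Culture of Reliability.
implementation = summarise("H H M-H M-H H M M M-H M H")
culture = summarise("H H H H H H H H H H")
```

A median is used rather than a mean because the labels are ordered categories, and the spacing between them is not known to be equal.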

Appendix E: Criteria Reference Table – BCM Capability

Criteria: Use of Standards (US); Integration with Risk Management (RM); Threat Assessment (TA); Business Impact Analysis (BIA); Testing & Review (TR); Governance Structures (GS); Senior Management (SM); Embeddedness (EM)

High Level of Maturity (5)

US: 4360 for RM; internationally recognised standard / approach for BCM
RM: Link well recognised; clear, formal integration; resources managed from the same area
TA: Comprehensive, enterprise-wide understanding of threats
BIA: Formal, enterprise-wide BIA – clear understanding of critical functions
TR: Enterprise wide testing regime; across all levels (crisis, emergency & continuity)
GS: Strong governance framework with broad scope – proactive, regular reporting
SM: Significant involvement in all aspects of BCM – conceptual & tangible support
EM: Strong awareness of BCM amongst employees (including induction, training courses etc.); regular, proactive communication

Moderately High Level of Maturity (4)

US: Use 4360 for RM; partial use of recognised BCM Standard
RM: Link recognised; some level of formal integration; resources managed from the same area
TA: Enterprise-wide with broadening understanding of all threats
BIA: Formal, enterprise-wide BIA but some gaps in understanding of interdependencies – understanding of critical functions
TR: Broadening scope of testing regime (towards enterprise wide focus); across all levels (crisis, emergency & continuity)
GS: Strong governance framework; proactive, regular reporting – but some limitations with continuity
SM: Broadening scope of involvement across a range of activities – conceptual & evidence of tangible support
EM: Awareness / understanding of BCM amongst employees; proactive communication measures in place

Moderate Level of Maturity (3)

US: Use of 4360 for RM; no Standard followed for BCM
RM: Link recognised; no formal integration; resources managed from the same area
TA: Narrow understanding of threats (not enterprise-wide)
BIA: Formal, but not enterprise wide BIA – understanding of critical functions
TR: Narrow testing regime (limited to technical); across all levels (crisis, emergency & continuity)
GS: Good governance framework & regular, proactive reporting but not continuity oriented (crisis / emergency focus)
SM: Narrow scope of involvement (limited to technical focus in a range of activities); conceptual support with limited tangible support
EM: Awareness / understanding limited to those responsible for BCM activities; some proactive communication measures, but often reactive

Moderate with some Limitations (2)

US: Partial use of 4360; no Standard for BCM
RM: Link recognised; not formally integrated; managed separately
TA: Very narrow understanding of threats (limited to technical focus)
BIA: Some work and understanding of critical functions but not a formal, enterprise-wide BIA
TR: Narrow testing regime; limited to higher level plans (crisis / emergency level)
GS: Governance framework but not supporting continuity activities – reactive &/or infrequent reporting
SM: Narrow scope of involvement (technical focus &/or limited to high level crisis activities) – conceptual support but not tangible support
EM: Awareness / understanding limited to those responsible for BCM activities; reactive rather than proactive communication

Limited Level of Maturity (1)

US: No Standards
RM: Link not recognised; little or no integration
TA: Poor or no understanding of threats
BIA: No formal BIA but some understanding of critical functions
TR: Limited testing regime overall
GS: Limited / weak governance structures – limited or no reporting
SM: Little or no involvement – conceptual support at most
EM: Limited awareness or understanding of BCM; limited / no BCM communication

Appendix F: Criteria Reference Table: HR Capability

Technical

Performance

(TP)

Flexibility &

Redundancy

(FR)

Autonomy &

Accountability

(AA)

Decision Making

& Hierarchy

(DH)

Training &

Learning

(TL)

Importance of

Reliability

(IR)

Commitment to

Reliability

(RC)

Culture of

Reliability

(CR)

External

Oversight

(EO)

HIGH (3)

Technical Performance (TP): Sustained high technical performance; Strong understanding of environment; Highly trained competent staff; Rigorous maintenance

Flexibility & Redundancy (FR): High level of technical & personnel redundancy; Can respond flexibly to disruptions

Autonomy & Accountability (AA): Friendly, open environment for proactive error reporting; Reporting culture evident

Decision Making & Hierarchy (DH): Collegial, team environment; Flexible organisational structure; Deference to expertise during time critical situations

Training & Learning (TL): Rigorous & ongoing training (technical staff); Process of organisational learning following an incident; Performance reviews

Importance of Reliability (IR): All decisions take into consideration reliability & safety and are carefully balanced with financial considerations; Reliability is paramount

Commitment to Reliability (RC): Strong commitment to reliability articulated in mission statement; Significant investment in Reliability-Enhancing attributes

Culture of Reliability (CR): Strong culture of reliability supported by a safety culture; Reinforced in company values

External Oversight (EO): Strong external oversight and governance mechanisms regulate & encourage commitment to reliable behaviour

MEDIUM (2)

Technical Performance (TP): Good technical performance; Some difficulties understanding operating environment; Qualified/trained staff, but may lack experience; Reasonable maintenance

Flexibility & Redundancy (FR): Moderate level of technical & personnel redundancy but room for improvement; Some ability to respond flexibly to disruption

Autonomy & Accountability (AA): Improving environment to encourage proactive error reporting; Reporting culture developing

Decision Making & Hierarchy (DH): Collegial, team environment; Capacity to devolve into a flexible organisational structure during times of crisis; Deference to expertise as a last resort during time critical situations

Training & Learning (TL): Opportunities for training & learning (technical staff); Process of organisational learning following an incident; Performance reviews

Importance of Reliability (IR): Some balance between corporate objectives of reliability & profitability; Reliability is important, but not paramount

Commitment to Reliability (RC): Commitment to reliability articulated in mission statement; Investment in Reliability-Enhancing attributes

Culture of Reliability (CR): Culture of reliability evident and supported by a safety culture

External Oversight (EO): Moderate external oversight and governance mechanisms that have some influence on organisation's commitment to reliable operations

LOW (1)

Technical Performance (TP): Problems with technical performance: trouble understanding operating environment; Trained staff but lack expertise / experience; Infrequent or inadequate maintenance

Flexibility & Redundancy (FR): Low level of technical and personnel redundancy – significant room for improvement; Limited ability to respond flexibly to disruption

Autonomy & Accountability (AA): Blame culture / mentality evident; No attempts to develop a reporting culture

Decision Making & Hierarchy (DH): Hierarchical organisational structure at all times with limited / no capacity for deference to expertise during time critical situations

Training & Learning (TL): Infrequent / limited opportunities for ongoing training (technical staff); Limited / no attempts at organisational learning following an incident; Limited / no attempts to review performance

Importance of Reliability (IR): Other corporate objectives come before reliability / safety (reliability can be traded-off)

Commitment to Reliability (RC): Limited or no commitment to reliability articulated in mission statement; Limited or no investment in reliability enhancing attributes

Culture of Reliability (CR): Limited or no culture of reliability; Disregard for reliability and safety amongst employees

External Oversight (EO): Limited or no influence from external oversight and governance mechanisms on organisation's commitment to reliable operations

Reference List

Abbate, J. 1999. From control to coordination: New governance models for

information networks and other large technical systems. In The governance of large

technical systems. Ed. O. Coutard. London: Routledge.

Abele-Wigert, I. and M. Dunn. 2006. International critical information

infrastructure handbook: Volume 1. Centre for Security Studies, ETH: Zurich.

Adger, W. N. 2006. Vulnerability. Global Environmental Change, 16 (3): 268-281.

Amaratunga, D. and D. Baldry. 2001. Case study methodology as a means of theory

building. Work Study, 50 (3): 95-104.

Amin, M. 2001. Toward self-healing energy infrastructure systems. Computer

Applications in Power, IEEE, 14 (1): 20-28.

Amin, M. 2002. Toward secure and resilient interdependent infrastructures. Journal

of Infrastructure Systems, 8 (3): 67-75.

Apt, J., L. B. Lave, S. Talukdar, M. G. Morgan and M. Ilic. 2004. Electrical

blackouts: A systemic problem. Issues in Science and Technology, 20 (4): 55-61.

Auerswald, P., L. M. Branscomb, T. M. La Porte and E. Michel-Kerjan. 2005. The

challenge of protecting critical infrastructure. Issues in Science and Technology, 22

(1): 77-83.

Australian Government. 2008. Trusted information sharing network for critical

infrastructure protection. Canberra: Attorney-General's Department.

Barnes, P. H. 2002. Approaches to community safety: Risk perception and social

meaning. The Australian Journal of Emergency Management, 17 (1): 15-23.

Barnes, P. H. 2005. Can organisational failures be prevented before they occur? A

discussion about corporate governance and risk management. 4th Global Conference

on Business and Economics 2005, June 26-28. St Hugh's College: Oxford University.

Beck, U. 1992. Risk society: Towards a new modernity. London: Sage.

Begley, C. M. 1996. Using triangulation in nursing research. Journal of Advanced

Nursing, 24 (1): 122-128.

Benbasat, I., D. K. Goldstein and M. Mead. 1987. The case research strategy in

studies of information systems. MIS Quarterly, 11 (3): 369-386.

Bergen, A. and A. While. 2000. A case for case studies: Exploring the use of case

study design in community nursing research. Journal of Advanced Nursing, 31 (4):

926-934.

Boin, A. 2004. Lessons from crisis research. International Studies Review, 6 (1):

165-194.

Boin, A. and P. Lagadec. 2000. Preparing for the future: Critical challenges in crisis

management. Journal of Contingencies and Crisis Management, 8 (4): 185-191.

Boin, A., P. Lagadec, E. Michel-Kerjan and W. Overdijk. 2003. Critical

infrastructures under threat: learning from the anthrax scare. Journal of

Contingencies and Crisis Management, 11 (3): 99-104.

Boin, A. and A. McConnell. 2007. Preparing for critical infrastructure breakdowns:

the limits of crisis management and the need for resilience. Journal of Contingencies

and Crisis Management, 15 (1): 50-59.

Boin, A. and D. Smith. 2006. Terrorism and critical infrastructures: Implications for

public-private crisis management. Public Money and Management, 26 (5): 295-304.

Botha, J. and R. Von Solms. 2004. A cyclic approach to business continuity

planning. Information Management and Computer Security, 12 (4): 328-337.

Boyatzis, R. E. 1998. Transforming qualitative information: Thematic analysis and

code development. Thousand Oaks, CA: Sage.

British Standards Institute. 2006. Code of Practice for Business Continuity

Management. BS25999-1. London: British Standards Institute.

Cassell, C. 2009. Interviews in organizational research. In The Sage Handbook of

Organizational Research Methods, ed. D. A. Buchanan and A. Bryman, 500-515.

London: Sage.

Cavana, R. Y., B. L. Delahaye and U. Sekaran. 2001. Applied business research:

Qualitative and quantitative methods. Milton, Queensland: John Wiley and Sons

Australia.

Charters, I. 2007. Business Continuity Management Good Practice Guidelines. The

Business Continuity Institute (BCI).

Charters, I. 2008. Business Continuity Management Good Practice Guidelines. The

Business Continuity Institute (BCI).

Cho, J. and A. Trent. 2006. Validity in qualitative research revisited. Qualitative

Research, 6 (3): 319-340.

Christie, M., M. Rowe, C. Perry and J. Charmard. 2000. Implementation of realism

in case study research methodology. In Entrepreneurial SME: Engines for growth in

the Millennium. ICSB World Conference 2000, June 7-10. Southbank, Brisbane.

Cobb, A. K. and S. Forbes. 2002. Qualitative research: What does it have to offer to

the gerontologist? Journal of Gerontology Series A: Biological Sciences and Medical

Sciences, 57 (4): 197-202.

Creswell, J. W. 2003. Research design: Qualitative, quantitative, and mixed method

approaches. 2nd ed. Thousand Oaks, CA: Sage.

Cumming, G. S., G. Barnes, S. Perz, M. Schmink, K. E. Sieving, J. Southworth, M.

Binford, R. D. Holt, C. Stickler and T. Van Holt. 2005. An exploratory framework

for the empirical measurement of resilience. Ecosystems, 8 (8): 975-987.

Dalziell, E. P. and S. T. McManus. 2004. Resilience, vulnerability and adaptive

capacity: Implications for system performance. International Forum for Engineering

Decision Making (IFED), December 6-8. Stoos, Switzerland.

de Bruijne, M. 2004. Next generation critical infrastructures: The push and pull to

real-time. IEEE International Conference on Systems, Man and Cybernetics 2004,

October 10-13. The Hague, Netherlands.

de Bruijne, M. 2006. Networked reliability: Institutional fragmentation and the

reliability of service provision in critical infrastructures. Faculty of Technology,

Policy and Management, Delft University of Technology, Netherlands.

de Bruijne, M. and M. van Eeten. 2007. Systems that should have failed: Critical

infrastructure protection in an institutionally fragmented environment. Journal of

Contingencies and Crisis Management, 15 (1): 18-29.

de Bruijne, M., M. van Eeten, E. Roe and P. Schulman. 2006. Assuring high

reliability of service provision in critical infrastructure. International Journal of

Critical Infrastructures, 2 (2/3): 231-246.

Dekker, S. and E. Hollnagel. 2006. Resilience engineering: New directions for

measuring and maintaining safety in complex systems. School of Aviation, Lund

University.

Devers, K. J. and R. M. Frenkel. 2000. Study design in qualitative research - 2:

Sampling and data collection strategies. Education for Health, 13 (2): 263-271.

Dolan, A. H. and I. J. Walker. 2003. Understanding vulnerability of coastal

communities to climate change related risks. Journal of Coastal Research,

Proceedings of the 8th International Coastal Symposium: Itajai, Brazil.

Drabek, T. E. 1970. Methodology of studying disasters past patterns and future

possibilities. The American Behavioral Scientist, 13 (3): 331-344.

Drennan, L. and A. McConnell. 2007. Risk and crisis management in the public

sector. New York: Routledge.

Dynes, R. R. 1970a. Organized behavior in disaster. Lexington, MA: Lexington

Books.

Dynes, R. R. 1970b. Organizational involvement and changes in community

structure in disaster. The American Behavioral Scientist (pre-1986), 13 (3): 430-440.

Egan, J. 2007. Anticipating future vulnerability: Defining characteristics of

increasingly critical infrastructure-like systems. Journal of Contingencies and Crisis

Management, 15 (1): 4-17.

Eisenhardt, K. M. 1989. Building theories from case study research. The Academy of

Management Review, 14 (4): 532-550.

Eisenhardt, K. M. and M. E. Graebner. 2007. Building theory from cases:

Opportunities and challenges. Academy of Management Journal, 50 (1): 25-32.

Elliott, D., E. Swartz and B. Herbane. 2002. Business continuity management: A crisis

management approach. London: Routledge.

Farrell, A. E., L. B. Lave and G. Morgan. 2002. Bolstering the security of the electric

power system: The infrastructure cannot be made invulnerable, but the industry can

improve its ability to provide service even when attacked. Issues in Science and

Technology, 18 (3): 49-56.

Farrell, A. E., H. Zerriffi and H. Dowlatabadi. 2004. Energy infrastructure and security.

Annual Review of Environment and Resources, 29: 421-469.

Fiksel, J. 2006. Sustainability and resilience: Towards a systems approach.

Sustainability: Science, Practice and Policy, 2 (2): 14-21.

Folke, C. 2006. Resilience: The emergence of a perspective for social-ecological

systems analyses. Global Environmental Change, 16 (3): 253-267.

Frederickson, H. G. and T. La Porte. 2002. Airport security, high reliability and the

problem of rationality. Public Administration Review, 62 (1): 33-43.

Freestone, M. and L. Lee. 2008. Planning for and surviving a BCM audit. Journal of

Business Continuity and Emergency Planning, 2 (2): 138-151.

Frey, J. H. and A. Fontana. 1991. The group interview in social research. Social

Science Journal, 28 (2): 175-187.

Garnett, J. L. and A. Kouzmin. 2007. Communicating throughout Katrina:

Competing and complementary conceptual lenses on crisis communication. Public

Administration Review, 67: 171-188.

Gibb, F. and S. Buchanan. 2006. A framework for business continuity management.

International Journal of Information Management, 26 (2): 128-141.

Gilmore, A. and D. Carson. 1996. Integrative qualitative researching: a services

context. Marketing Intelligence and Planning, 14 (6): 21-26.

Goldman, C. A., G. L. Barbose and J. H. Eto. 2002. California customer load

reductions during the electricity crisis: Did they help to keep the lights on? Journal

of Industry, Competition and Trade, 2 (1-2): 113-142.

Grabowski, M. R. and K. Roberts. 1996. Human and organizational error in large

scale systems. IEEE Transactions on Systems, Man and Cybernetics, 26 (1): 2-16.

Grabowski, M. R. and K. Roberts. 1997. Risk mitigation in large scale systems:

Lessons from high reliability organizations. California Management Review, 39 (4):

152-162.

Grabowski, M. R. and K. Roberts. 1999. Risk mitigation in virtual organizations.

Organization Science, 10 (6): 704-721.

Griffiths, M. 2008. Aviation infrastructure protection: Threats, contingency plans and

the importance of networks. Australian Security and Intelligence Conference 2008,

December 1-3. Edith Cowan University, Perth.

Guba, E. G. 1990. The alternative paradigm dialog. In The Paradigm Dialog. Ed. E.

G. Guba. 17-28. Newbury Park, CA: Sage.

Guba, E. G. and Y. S. Lincoln. 1994. Competing paradigms in qualitative research.

In Handbook of Qualitative Research, eds. N. K. Denzin and Y. S. Lincoln, 105-117.

Thousand Oaks, CA: Sage.

Gummesson, E. 2007. Case study research and network theory: Birds of a feather.

Qualitative Research in Organizations and Management: An International Journal,

2 (3): 226-248.

Haimes, Y. Y. and B. M. Horowitz. 2004. Modeling interdependent infrastructures

for sustainable counterterrorism. Journal of Infrastructure Systems, 10 (2): 33-42.

Haimes, Y. Y. and P. Longstaff. 2002. The role of risk analysis in the protection of

critical infrastructures against terrorism. Risk Analysis, 22 (3): 439-444.

Halinen, A. and J. Tornroos. 2005. Using case methods in the study of contemporary

business networks. Journal of Business Research, 58 (9): 1285-1297.

Hamel, G. and L. Valikangas. 2003. The quest for resilience. Harvard Business

Review, 81 (9): 52-57.

Hammersley, M. 1992. What's wrong with ethnography: Methodological

explorations. London: Routledge.

Hammersley, M. 1995. The politics of social research. London: Sage.

Handmer, J. and S. Dovers. 1996. A typology of resilience: Rethinking institutions

for sustainable development. Industrial and Environmental Quarterly, 9 (4): 482-

511.

Healy, M. and C. Perry. 2000. Comprehensive criteria to judge validity and

reliability of qualitative research within the realism paradigm. Qualitative Market

Research: An International Journal, 3 (3): 118-126.

Herbane, B., D. Elliott and E. Swartz. 1997. Contingency and continua: Achieving

excellence through business continuity planning. Business Horizons, 40 (6): 19-25.

Herbane, B., D. Elliott and E. Swartz. 2004. Business continuity management: time

for a strategic role? Long Range Planning, 37: 435-457.

Hoepfl, M. C. 1997. Choosing qualitative research: A primer for technology

education researchers. Journal of Technology Education, 9 (1): 41-63.

Hoffer-Gittell, J., K. Cameron, S. Lim and V. Rivas. 2008. Airline

industry responses to September 11th. In International terrorism and threats to

security: Managerial and organizational challenges, ed. R. J. Burke and C. L.

Cooper, 267-290. Cheltenham: Edward Elgar.

Holling, C. S. 1973. Resilience and stability of ecological systems. Annual Review of

Ecology and Systematics, 4: 1-23.

Holling, C. S. 1996. Engineering resilience versus ecological resilience. In

Engineering within ecological constraints, ed. P. Schulze, 31-44. Washington, DC:

National Academy Press.

Hollnagel, E., D. D. Woods and N. Leveson. 2006. Resilience engineering: Concepts

and precepts. Aldershot: Ashgate.

Holmgren, A. J. 2007. A framework for vulnerability assessment of electric power

systems. In Critical infrastructure: Reliability and vulnerability, ed. A. T. Murray

and T. H. Grubesic, 31-55. Berlin: Springer.

Ingles, O. 1991. A linguistic approach to hazard, risk and error. In New perspectives

on uncertainty and risk, ed. J. Handmer, B. Dutton, B. Guerin and M. Smithson, 66-

78. Canberra: Centre for Resource and Environmental Studies, Australian National

University.

International Risk Governance Council. 2006. Managing and reducing social

vulnerabilities from coupled critical infrastructures. IRGC White Paper No.3.

Geneva, Switzerland.

Jarman, A. 2001. Reliability reconsidered: A critique of the HRO-NAT debate.

Journal of Contingencies and Crisis Management, 9 (2): 98-107.

Jermier, J. M. 2004. Complex systems threaten to bring us down. Organization and

Environment, 17 (5): 5-8.

Jiang, P. and Y. Y. Haimes. 2004. Risk management for Leontief-based

interdependent systems. Risk Analysis, 24 (5): 1215-1229.

Joskow, P. L. 2001. California's electricity crisis. Oxford Review of Economic Policy,

17 (3): 365-388.

Kendra, J. M. and T. Wachtendorf. 2003. Elements of resilience after the World

Trade Center disaster: Reconstituting New York City's emergency operations centre.

Disasters, 27 (1): 37-53.

Koulikoff-Souviron, M. and A. Harrison. 2006. Using case study methods in

researching supply chains. In Research Methodologies in Supply Chain Management,

ed. H. Kotzab, S. A. Seuring, M. Muller and G. Reiner, 267-282. New York:

Physica-Verlag.

Kovoor-Misra, S., R. F. Zammuto and I. I. Mitroff. 2000. Crisis preparation in

organizations: prescription versus reality. Technological Forecasting and Social

Change, 63 (1): 43-62.

La Porte, T. 1994. A strawman speaks up: Comments on the limits of safety. Journal

of Contingencies and Crisis Management, 2 (4): 207-211.

La Porte, T. 1996. High reliability organizations: Unlikely, demanding and at risk.

Journal of Contingencies and Crisis Management, 4 (2): 60-71.

La Porte, T. 2005. Governance and the specter of infrastructure collapse. 8th

National Public Management Research Conference 2005. October 1. University of

Southern California.

La Porte, T. and P. Consolini. 1998. Theoretical and operational challenges of high

reliability organizations: Air traffic control and aircraft carriers. International

Journal of Public Administration, 21 (6-8): 847-852.

Lagadec, P. 1993. Preventing chaos in a crisis: Strategies for prevention, control and

damage limitation. London: McGraw-Hill.

Lagadec, P. 2005. Crisis management in the 21st Century: “Unthinkable” events in

“inconceivable” contexts. Paris: Ecole Polytechnique, Centre National de la Recherche

Scientifique.

Lagadec, P. 2007. Crisis management in the twenty-first century: Unthinkable events

in inconceivable contexts. In Handbook of Disaster Research, ed. H. Rodriguez, E.

L. Quarantelli and R. R. Dynes, 489-507. New York: Springer.

Lagadec, P. and U. Rosenthal. 2003. Critical networks and chaos prevention in

highly turbulent times. Journal of Contingencies and Crisis Management, 11 (3): 97-

98.

Lee, B., P. M. Collier and J. Cullen. 2007. Reflections on the use of case studies in

accounting, management and organizational disciplines. Qualitative Research in

Organizations and Management: An International Journal, 2 (3): 169-178.

Lincoln, Y. S. and E. G. Guba. 1985. Naturalistic inquiry. Newbury Park, CA: Sage.

Little, R. G. 2002. Controlling cascading failure: understanding the vulnerabilities of

interconnected infrastructures. Journal of Urban Technology, 9 (1): 109-123.

Little, R. G. 2003. Toward more robust infrastructure: Observations on improving

the resilience and reliability of critical systems. In 36th Annual Hawaii International

Conference on Systems Science 2003, January 6-9. 58-66. Hawaii.

Manion, M. and W. M. Evan. 2000. The Y2K problem and professional

responsibility: A retrospective analysis. Technology in Society, 22 (3): 361-387.

Manion, M. and W. M. Evan. 2002. Technological catastrophes: their causes and

prevention. Technology in Society, 24 (3): 207-224.

Mannarelli, T., K. Roberts and R. G. Bea. 1996. Learning how organizations mitigate

risk. Journal of Contingencies and Crisis Management, 4 (2): 83-92.

Manyena, S. B. 2007. The concept of resilience revisited. Disasters, 30 (4): 433-450.

Marais, K., N. Dulac and N. Leveson. 2004. Beyond normal accidents and high

reliability organizations: The need for an alternative approach to safety in complex

systems. Engineering Systems Division Symposium 2004. March 24. Cambridge,

MA: MIT.

McDaniels, T., S. Chang, D. Cole, J. Mikawoz and H. Longstaff. 2008. Fostering

resilience to extreme events within infrastructure systems: Characterizing decision

contexts for mitigation and adaptation. Global Environmental Change, 18 (2): 310-

318.

McManus, S. T. 2008. Organisational resilience in New Zealand. Civil Engineering.

Canterbury, University of Canterbury.

Mendonca, D., E. E. Lee and W. A. Wallace. 2004. Impact of the 2001 World Trade

Center attack on critical interdependent infrastructures. IEEE International

Conference on Systems, Man and Cybernetics 2004, October 10-13. The Hague,

Netherlands.

Miles, M. B. and A. M. Huberman. 1994. Qualitative data analysis: An expanded

sourcebook. Thousand Oaks, CA: Sage.

Mitroff, I. I. 1988. Crisis management: Cutting through the confusion. Sloan

Management Review, 29 (2): 15-20.

Mitroff, I. 1994. Crisis management and environmentalism: A natural fit. California

Management Review, 36 (2): 101.

Mitroff, I. I., T. C. Pauchant, M. Finney and C. Pearson. 1989. Do (some)

organizations cause their own crises? The cultural profiles of crisis-prone vs. crisis

prepared organizations. Industrial Crisis Quarterly, 3 (4): 269-283.

Mitroff, I. I., P. Shrivastava and F. E. Udwadia. 1987. Effective crisis management.

Academy of Management Executive, 1 (4): 283.

Narich, R. 2005. Critical infrastructure, continuity of services and international

cooperation. International Journal of Critical Infrastructures, 1 (2/3): 293-298.

National Fire Protection Association (NFPA). 2008. Standard on

Disaster/Emergency Management and Business Continuity Programs. Quincy, MA:

National Fire Protection Association.

O'Connor, P. D. T., D. Newton and R. Bromley. 2002. Practical reliability

engineering. Chichester, England: Wiley.

Parkhe, A. 1993. 'Messy' research, methodological predispositions, and theory. The

Academy of Management Review, 18 (2): 227-268.

Patton, M. Q. 1999. Enhancing quality and credibility of qualitative analysis. Health

Services Research, 34 (5): 1189-1208.

Patton, M. Q. 2002. Qualitative research and evaluation methods. Thousand Oaks,

CA: Sage.

Pauchant, T. C. and R. Douville. 1993. Recent research in crisis management: A

study of 24 authors' publications from 1986 to 1991. Industrial and Environmental

Crisis Quarterly, 7 (1): 43-66.

Pauchant, T. C. and I. I. Mitroff. 1988. Crisis prone versus crisis avoiding

organizations: is your company's culture its own worst enemy in creating crises?

Industrial Crisis Quarterly, 2 (1): 53-63.

Pearson, C. M. and J. A. Clair. 1998. Reframing crisis management. Academy of

Management Review, 23 (1): 59-76.

Pearson, C. M., S. Kovoor-Misra, J. A. Clair and I. I. Mitroff. 1997. Managing the

unthinkable. Organizational Dynamics, 26 (2): 51-64.

Pearson, C. M. and I. I. Mitroff. 1993. From crisis prone to crisis prepared: a

framework for crisis management. Academy of Management Executive, 7 (1): 48-59.

Perrow, C. 1984. Normal accidents: Living with high risk technologies. New York:

Basic Books.

Perrow, C. 1994. The limits of safety: The enhancement of a theory of accidents.

Journal of Contingencies and Crisis Management, 2 (4): 212-220.

Perrow, C. 1999a. Normal accidents: Living with high risk technologies. Princeton:

Princeton University Press.

Perrow, C. 1999b. Organizing to reduce the vulnerabilities of complexity. Journal of

Contingencies and Crisis Management, 7 (3): 150-155.

Perrow, C. 2007. Disasters ever more? Reducing U.S. vulnerabilities. In Handbook

of Disaster Research, ed. H. Rodriguez, E. L. Quarantelli and R. R. Dynes, 523-533.

New York: Springer.

Perry, C. 1998. Processes of a case study methodology for postgraduate research in

marketing. European Journal of Marketing, 32 (9/10): 785-802.

Perry, C., A. Riege and L. Brown. 1999. Realism's role among scientific paradigms.

Irish Marketing Review, 12 (2): 16-23.

Pollard, D. and S. Hotho. 2006. Crises, scenarios and the strategic management

process. Management Decision, 44 (6): 15.

Pommerening, C. 2007. Resilience in organizations and systems: Background and

trajectories of an emerging paradigm. In Critical Thinking: Moving from

Infrastructure Protection to Infrastructure Resilience: CIP Program Discussion

Paper Series, ed. J. A. McCarthy. George Mason University.

Preble, J. F. 1997. Integrating the crisis management perspective into the strategic

management process. Journal of Management Studies, 34 (5): 22.

Quarantelli, E. L. 1970. A selected annotated bibliography of social science studies

on disasters. The American Behavioral Scientist (pre-1986), 13 (3): 452-456.

Quarantelli, E. L., P. Lagadec and A. Boin. 2007. A heuristic approach to future

disasters and crises: New, old, and in-between types. In Handbook of Disaster

Research, ed. H. Q. Rodriguez, E. L. Quarantelli, and R. R. Dynes, 16-41. New

York: Springer.

Rasmussen, J. 1997. Risk management in a dynamic society: A modeling problem.

Safety Science, 27 (2-3): 183-213.

Renn, O. 1992. Concepts of risk: A classification. In Social theories of risk, ed. S.

Krimsky and D. Golding, 53-79. Westport, CT: Praeger.

Richardson, B. 1994. Socio-technical disasters: profile and prevalence. Disaster

Prevention and Management, 3 (4): 41-69.

Riege, A. 2003. Validity and reliability tests in case study research: A literature

review with "hands-on" applications for each research phase. Qualitative Market

Research: An International Journal, 6 (2): 75-86.

Rijpma, J. A. 1997. Complexity, tight–coupling and reliability: Connecting Normal

Accidents Theory and High Reliability Theory. Journal of Contingencies and Crisis

Management, 5 (1): 15-23.

Rijpma, J. A. 2003. From deadlock to dead end: The normal accidents-high

reliability debate revisited. Journal of Contingencies and Crisis Management, 11 (1):

37-45.

Rinaldi, S. M., J. P. Peerenboom and T. K. Kelly. 2001. Identifying, understanding,

and analysing critical infrastructure interdependencies. IEEE Control Systems

Magazine. December (2001): 11-25.

Robb, D. 2006. Lessons in business continuity and disaster recovery. Business

Communications Review, 36 (11): 52-55.

Roberts, K. 1990. Managing high reliability organizations. California Management

Review, 32 (4): 101-113.

Roberts, K. and R. Bea. 2001. Must accidents happen? Lessons from high reliability

organizations. The Academy of Management Executive, 15 (3): 70-79.

Roberts, K. and G. Gargano. 1990. Managing a high-reliability organization: A case

for interdependence. In Managing Complexity in High Technology Organizations, ed.

M. A. Von Glinow and S. A. Mohrman, 146-159. New York: Oxford University

Press.

Roberts, K. and D. M. Rousseau. 1989. Research in nearly failure-free, high

reliability organizations: Having the bubble. IEEE Transactions on Engineering

Management, 36 (2): 132-139.

Roberts, K., D. M. Rousseau and T. La Porte. 1994. The culture of high reliability:

Quantitative and qualitative assessment aboard nuclear powered aircraft carriers. The

Journal of High Technology Management Research, 5 (1): 141-161.

Rochlin, G. I. 1993. Defining high reliability organizations in practice: A taxonomic

prologue. In New challenges to understanding organizations, ed. K. Roberts, 11-32.

New York: Macmillan Publishing Company.

Roe, E., P. Schulman, M. van Eeten and M. de Bruijne. 2005. High-reliability

bandwidth management in large technical systems: Findings and implications of two

case studies. Journal of Public Administration Research and Theory, 15 (2): 263-

280.

Rothery, M. 2005. Critical infrastructure protection and the role of emergency

services. The Australian Journal of Emergency Management, 20 (2): 45-50.

Rowley, J. 2002. Using case studies in research. Management Research News, 25

(1): 16-27.

Sagan, S. 1993. The limits of safety: Organizations, accidents, and nuclear weapons.

Princeton, N.J: Princeton University Press.

Schulman, P., E. Roe, M. van Eeten and M. de Bruijne. 2004. High Reliability and the

management of critical infrastructures. Journal of Contingencies and Crisis

Management, 12 (1): 14-28.

Schulman, P. R. and E. Roe. 2007. Designing Infrastructures: Dilemmas of Design

and the Reliability of Critical Infrastructures. Journal of Contingencies and Crisis

Management, 15 (1): 42-49.

Seale, C. 1999. Quality in qualitative research. Qualitative Inquiry, 5 (4): 465-478.

Seville, E. 2009. The goal of resilient organisations. 3rd Annual Business Continuity

Summit 2009, March. Brisbane, Australia.

Seville, E., D. Brundson, A. Dantas, J. Le Masurier, S. Wilkinson and J. Vargo.

2006. Building organisational resilience: A summary of key research findings.

Resilient Organisations Programme, New Zealand.

Sheaffer, Z., B. Richardson and Z. Rosenblatt. 1998. Early-Warning Signals

Management: A lesson from the Barings Crisis. Journal of Contingencies and Crisis

Management, 6 (1): 1-22.

Shrivastava, P. 1987. Bhopal: Anatomy of a crisis. Cambridge, MA: Ballinger

Publishing Co.

Shrivastava, P. 1994a. The evolution of research on technological crises in the US.

Journal of Contingencies and Crisis Management, 2 (1): 10-20.

Shrivastava, P. 1994b. Technological and organizational roots of industrial crises:

lessons from Exxon Valdez and Bhopal. Technological Forecasting and Social

Change, 45 (3): 237-253.

Shrivastava, P. 1995. Ecocentric management for a risk society. Academy of

Management Review, 20 (1): 118-137.

Shrivastava, P., I. Mitroff, D. Miller and A. Miglani. 1988. Understanding Industrial

Crises. The Journal of Management Studies, 25 (4): 285.

Siggelkow, N. 2007. Persuasion with case studies. Academy of Management Journal,

50 (1): 20-24.

Smith, D. 1990. Beyond contingency planning: Towards a model of crisis

management. Industrial Crisis Quarterly, 4 (4): 263-275.

Smith, D. and C. Sipika. 1993. Back from the brink: Post-crisis management. Long

Range Planning, 26 (1): 28-38.

Sobh, R. and C. Perry. 2006. Research design and data analysis in realism research.

European Journal of Marketing, 40 (11/12): 1194-1209.

Stake, R. E. 1995. The art of case study research. Thousand Oaks, CA: Sage.

Stake, R. E. 2006. Multiple case study analysis. New York: Guilford Press.

Standards Australia. 2004a. Business Continuity Management. HB221:2004. Sydney:

Standards Australia.

Standards Australia. 2004b. Risk Management Standard. AS/NZS 4360:2004.

Sydney: Standards Australia / Standards New Zealand.

Standards Australia. 2006a. A practitioners guide to business continuity management.

HB292:2006. Sydney: Standards Australia.

Standards Australia. 2006b. Security Risk Management. HB167:2006. Sydney:

Standards Australia / Standards New Zealand.

Starbuck, W. H. and F. J. Milliken. 1988. Challenger: Fine-tuning the odds until

something breaks. The Journal of Management Studies, 25 (4): 319-340.

Starr, R., J. Newfrock and M. Delurey. 2003. Enterprise resilience: Managing risk in

the networked economy. Strategy and Business, 30 (1).

State of Queensland. 2004. Detailed Report of the Independent Panel Electricity

Distribution and Service Delivery for the 21st Century. Brisbane: Department of

Natural Resources Mines and Energy.

Staw, B. M., L. E. Sandelands and J. E Dutton. 1981. Threat-rigidity effects in

organizational behavior: A multilevel analysis. Administrative Science Quarterly, 26

(4): 501-524.

Stead, E. and C. Smallman. 1999. Understanding business failure: Learning and un-

learning lessons from industrial crises. Journal of Contingencies and Crisis

Management, 7 (1): 1-18.

Sutcliffe, K. M. and T. J. Vogus. 2003. Organizing for resilience. In Positive

organizational scholarship: Foundations of a new discipline, eds. K. S. Cameron, J.

E. Dutton and R. E. Quinn, 94-110. San Francisco: Berrett-Koehler.

Tenner, E. 1997. Why things bite back. New York: Vintage Books.

Tobin, G. A. and C. M. Begley. 2004. Methodological rigour within a qualitative

framework. Journal of Advanced Nursing, 48 (4): 388-396.

Tsoukas, H. 1989. The validity of idiographic research explanations. The Academy of

Management Review, 14 (4): 551-561.

Turner, B. A. 1976. The organizational and interorganizational development of

disasters. Administrative Science Quarterly, 21 (3): 378-397.

van Eeten, M. and E. Roe. 2002. Ecology, engineering and management:

Reconciling ecosystem rehabilitation and service reliability. New York: Oxford

University Press.

Vogus, T. J. and K. M. Sutcliffe. 2007. Organizational resilience: Towards a theory

and research agenda. In ISIC IEEE International Conference on Systems Man and

Cybernetics 2007, October 7-10, 3418-3422. Montreal, Quebec.

Voss, C., N. Tsikriktsis and M. Frohlich. 2002. Case research in operations

management. International Journal of Operations and Production Management, 22

(2): 195-219.

Wallace, W. A., D. Mendonca, E. Lee, J. Mitchell and J. Chow. 2003. Managing

disruptions to critical interdependent infrastructures in the context of the 2001 World

Trade Center attack. In Beyond September 11: An account of post-disaster research,

eds. M. F. Myers, 165-198. Boulder, CO: Natural Hazards Research and

Applications Information Center, University of Colorado.

Weick, K. E. 1987. Organizational culture as a source of high reliability. California

Management Review, 29 (2): 112-127.

Weick, K. E. and K. M. Sutcliffe. 2001. Managing the unexpected: Assuring high

performance in an age of complexity. San Francisco, CA: Jossey-Bass.

Weick, K. E. and K. M. Sutcliffe. 2007. Managing the unexpected: Resilient

performance in an age of uncertainty. San Francisco: John Wiley and Sons.

Wildavsky, A. 1988. Searching for safety. New Brunswick, NJ: Transaction

Publishers.

Wolak, F. A. 2003. Diagnosing the California electricity crisis. The Electricity

Journal, 16 (7): 11-37.

Wolf, F. and P. Sampson. 2007. Evidence of an interaction involving complexity and

coupling as predicted by Normal Accident Theory. Journal of Contingencies and

Crisis Management, 15 (3): 123-133.

Yin, R. K. 1994. Case study research: Design and methods. Thousand Oaks, CA:

Sage.

Yin, R. K. 2003. Applications of case study research. Thousand Oaks, CA: Sage.

Zach, L. 2004. Using multiple-case studies design to investigate the information-

seeking behavior of arts administrators. Library Trends, 55 (1): 4-21.

Zimmerman, R. 2001. Social implications of infrastructure network interactions.

Journal of Urban Technology, 8 (3): 97-119.

Zio, E. 2009. Reliability engineering: Old problems and new challenges. Reliability

Engineering and System Safety, 94 (2): 125-141.