Disaster recovery and business continuity
Your expert guide to Subtitle
In this e-guide
Disaster recovery: Risk
assessment and business
impact analysis
Disaster recovery training
and staffing strategies
Coming up with a new
configuration and change
management plan
Disaster recovery
awareness and testing
require training, strategic
plans
Evaluating BC/DR program
performance
Case study: Cloud
collaboration boosts
Cumbria County Council's
disaster response abilities
In this e-guide:
Investing in technologies and processes that can safeguard an enterprise
and its operations in the face of downtime should be a must for any
business, as end-users can be remarkably unforgiving when unable to
access the services they need during work and at play.
Not only can a solid business continuity strategy protect organisations
from reputational damage and lost trade, but for those operating in
regulated industries it can also prevent firms being hit with downtime-
related enforcement action.
But even the most diligently prepared disaster recovery plan should be
subject to review from time to time to ensure it delivers the expected
results.
In this guide, we take a look at the steps enterprises can and should take
to ensure that, should their infrastructure fail, they can continue to trade
and operate, and why it pays to regularly test the robustness of their
disaster recovery processes.
Caroline Donnelly, Datacentre Editor
Paul Kirvan, Guest Contributor
Disaster recovery risk assessment and business impact analysis (BIA) are
crucial steps in the development of a disaster recovery plan. But, before we
look at them in detail, we need to locate disaster recovery risk assessment
and business impact assessment in the overall planning process.
To do that, let us remind ourselves of the overall goals of disaster recovery
planning, which are to provide strategies and procedures that can help
return IT operations to an acceptable level of performance as quickly as
possible following a disruptive event. The speed at which IT assets can be
returned to normal or near-normal performance will impact how quickly the
organisation can return to business as usual or an acceptable interim state
of operations.
Having established our mission, and assuming we have management
approval and funding for a disaster recovery initiative, we can establish a
project plan.
A disaster recovery project has a fairly consistent structure, which makes it
easy to organise and conduct plan development activity.
Adapted with permission from the BCM Lifecycle developed by the Business Continuity Institute.
As you can see from The IT Disaster Recovery Lifecycle illustration, the IT
disaster recovery process has a standard process flow. In this, the BIA is
typically conducted before risk assessment. The BIA identifies the most
important business functions and the IT systems and assets that support
them. Next, the risk assessment examines the internal and external threats
and vulnerabilities that could negatively impact IT assets.
Following the BIA and risk assessment, the next steps are to define, build
and test detailed disaster recovery plans that can be invoked in case
a disaster affects the organisation's critical IT assets. Such plans
provide a step-by-step process for responding to a disruptive event, with
steps designed to provide an easy-to-use and repeatable process for
recovering damaged IT assets to normal operation as quickly as possible.
Detailed response planning and the other key parts of disaster recovery
planning, such as plan maintenance, are, however, outside the scope of this
article so let us get back to looking at disaster recovery risk assessment and
business impact assessment in detail.
Disaster recovery risk assessment
In the IT disaster recovery world, we typically focus on one or more of the
following four risk scenarios, each of which would have a negative impact
on the organisation:
Loss of access to premises
Loss of data
Loss of IT function
Loss of skills
Risk assessments focus on the risks that can lead to these outcomes.
Peter Barnes, FBCI, managing director of London-based 2C Consulting, said
[...] the impact on
the business if delivery of critical applications and services were to be
denied as a result of a fire or server failure, for example, and to assess the
risks [...]
A key aspect is to know what services run on which parts of the
infrastructure, said Andrew Hiles, FBCI, managing director of
Oxfordshire-based Kingswell International. [...]
company had grown by acquisi[...]
One easy way to create a risk assessment is illustrated by this table.
Working with IT managers and members of your building facilities staff as
well as risk management staff if you have them, you can identify the events
that could potentially impact data centre operations.
Based on experience and available statistics, you can estimate the likelihood
of specific events occurring on a scale of 0 to 1 (0.0 = will never occur, and
1.0 = will always occur). You can do the same with the impact of the event,
using a 0 to 1 range (0.0 = no impact at all, and 1.0 = total loss of operations).
The final column lists the product of likelihood x impact, and this becomes
your risk factor. Those events with the highest risk factor are the ones your
disaster recovery plan should primarily aim to address.
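As a rough illustration of the likelihood x impact calculation, the following Python sketch ranks a handful of events by risk factor. The event names and scores are invented for the example and are not taken from the table in this guide; the class and function names are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RiskEvent:
    name: str
    likelihood: float  # 0.0 = will never occur, 1.0 = will always occur
    impact: float      # 0.0 = no impact at all, 1.0 = total loss of operations

    @property
    def risk_factor(self) -> float:
        # The risk factor is the product of likelihood and impact.
        return self.likelihood * self.impact


def rank_by_risk(events):
    """Return the events sorted from highest to lowest risk factor."""
    return sorted(events, key=lambda e: e.risk_factor, reverse=True)


# Illustrative values only (assumptions, not figures from the guide).
events = [
    RiskEvent("Flood", likelihood=0.1, impact=0.9),
    RiskEvent("Server failure", likelihood=0.5, impact=0.6),
    RiskEvent("Power loss", likelihood=0.25, impact=0.8),
]

for event in rank_by_risk(events):
    print(f"{event.name}: {event.risk_factor:.2f}")
```

With these sample scores, server failure ranks first, which is the kind of event the plan would address as a priority. In practice the scores would come from workshops with IT, facilities and risk management staff.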
Another way to capture and display risk information is with a risk matrix.
Entries in each part of the above table can be plotted on a four-quadrant
matrix, as shown here.
A risk matrix, adapted with permission
from "Principles and Practice of Business
Continuity: Tools and Techniques," by Jim
Burtles, copyright 2007 by Rothstein
Associates; ISBN 1-931332-39-8
In terms of how we treat these risks, we can use the following
categorisation:
Prevent: High-probability/high-impact events (actively work to mitigate these)
Accept: Low-probability/low-impact events (maintain vigilance)
Contain: High-probability/low-impact events (minimize likelihood of occurrence)
Plan: Low-probability/high-impact events (plan steps to take if this occurs)
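The four-way categorisation can be expressed as a small lookup. The Python sketch below is one possible reading, assuming likelihood and impact are scored from 0 to 1 and that 0.5 is the dividing line between "low" and "high". That threshold and the function name are assumptions made for illustration, not values given in this guide.

```python
def treatment(likelihood: float, impact: float, threshold: float = 0.5) -> str:
    """Map a likelihood/impact pair to one of the four treatment categories.

    The 0.5 threshold is an assumed cut-off between "low" and "high".
    """
    high_likelihood = likelihood >= threshold
    high_impact = impact >= threshold
    if high_likelihood and high_impact:
        return "Prevent"   # actively work to mitigate
    if high_likelihood:
        return "Contain"   # minimise likelihood of occurrence
    if high_impact:
        return "Plan"      # plan steps to take if this occurs
    return "Accept"        # maintain vigilance


print(treatment(0.8, 0.9))
print(treatment(0.1, 0.9))
```

An organisation with a different risk appetite could move the threshold, or use separate cut-offs for likelihood and impact.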
Types of risks to consider
In the previous section we described a basic disaster recovery risk
assessment. But there are many types of risk, so which key ones should be
addressed from a UK IT perspective?
Supply chain disruptions present a key risk, said Susan Young, MBCI, a risk
management professional with a London-[...]
an IT standpoint, reliance on outsourced providers not only presents a pure
IT risk but also a supply chain risk. For example, in the Lloyd's insurance
market in London, all businesses depend on a firm called Xchanging to
provide premiums and claims processing. This is a huge dependency with [...]
Hardware failure is another key danger to UK organisations. Kingswell [...]
report on UK email downtime
showed hardware failure (server and SAN), connectivity loss and database
corruption (in that order) as the main causes of downtime. A 2010 SunGard
report said the most common cause of UK invocations was hardware,
followed by power and [...]
Water damage is a key risk to organisations in the UK, and sometimes the
area may be [...]
when taps are left running in the toilets two floors above when everyone [...]
The BIA
A BIA attempts to relate specific risks to their potential impact on things
such as business operations, financial performance, reputation, employees
and supply chains. The table below depicts the relationship between specific
risks and business factors.
Risks can affect the entire company or just small parts of it. Operational and
financial losses may be significant, and the impact of these events could [...]
BIAs are built on a series of questions that should be posed to key members
of each operating unit in the company, including IT. Questions should
address the following issues, as a minimum:
Understanding how each business unit operates
Identification of critical business unit processes that depend on IT
Financial value of critical business processes (for example, revenues generated per hour)
Dependencies on internal organisations
Dependencies on external organisations
Data requirements
Minimum time needed to recover data to its previous state of use
System requirements
Minimum time needed to return to normal or near-normal operations following an incident
Minimum number of staff needed to conduct business
Minimum technology needed to conduct business
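Answers to questions like these can be captured in a simple record per business process. The Python sketch below is one hypothetical layout; the field names, the sample processes and the assumption that revenue is lost at a constant rate for the whole recovery period are all illustrative simplifications rather than part of any BIA standard.

```python
from dataclasses import dataclass, field


@dataclass
class BIAEntry:
    business_unit: str
    process: str
    revenue_per_hour: float   # financial value of the process
    recovery_hours: float     # minimum time to return to near-normal operations
    min_staff: int            # minimum number of staff needed to conduct business
    internal_dependencies: list = field(default_factory=list)
    external_dependencies: list = field(default_factory=list)

    def estimated_loss(self) -> float:
        # Simplifying assumption: revenue is lost at a constant rate
        # for the entire recovery period.
        return self.revenue_per_hour * self.recovery_hours


# Hypothetical sample entries.
entries = [
    BIAEntry("Sales", "Online ordering", 5000.0, 4.0, 3,
             internal_dependencies=["IT", "Finance"],
             external_dependencies=["Payment provider"]),
    BIAEntry("Finance", "Payroll", 200.0, 24.0, 2,
             internal_dependencies=["IT", "HR"]),
]

# List processes from largest to smallest estimated loss.
for entry in sorted(entries, key=BIAEntry.estimated_loss, reverse=True):
    print(f"{entry.process}: {entry.estimated_loss():.0f}")
```

Sorting by estimated loss gives a first, rough view of which processes may need the most protection. Reputation and regulatory impacts would need to be weighed separately.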
BIA outputs should present a clear picture of the actual impacts on the
business, both in terms of potential problems and probable costs. The
results of the BIA should help determine which areas require which levels of
protection, the extent to which the business can tolerate disruptions and
the minimum IT service levels needed by the business.
[...] to define the [...]
the tolerances to an outage for critical applications or infrastructure [...]
and reduce the risk of service loss, such that you can provide service to the
business in an acceptable timeframe.
Paul Kirvan, Guest Contributor
What steps can companies take to mitigate downtime resulting from
a lack of trained IT staff in the aftermath of a disaster? Obviously, one
answer is "Train additional IT staff members to perform IT tasks," but how
realistic is that? And what if those staffers are also unable to respond
following a disaster?
Business continuity plans and disaster recovery training plans should
examine the staffing issue initially as part of the business impact analysis
(BIA) and risk assessment (RA) phases. These initiatives should identify
staffing issues that need to be addressed. From a budget perspective,
adding staff may not be an option. If that's the case, cross-training of
existing IT staff is highly recommended, as is rotating the alternate staff in
and out of production assignments, if possible, to ensure their skills are
current.
If your organization has only one data center and your budget cannot
underwrite a second data center, consider one of the many hosted data
center options currently available. These can be found under such headings
as Software as a Service (SaaS), Infrastructure as a Service (IaaS) or Data
Center as a Service (DCaaS). You can subscribe to as many (or as few)
resources as your budget can handle. You'll also be contracting with trained
IT professionals, who should be able (with advance training, knowledge and
suitable documentation) to step in and support your production systems if
your existing staff is unavailable.
If your recovery time objectives (RTOs) are aggressive, it may be necessary
to arrange for data backup and recovery services, in addition to other
managed IT services, to ensure that interruptions to your production
systems will be minimal. Of course, if your organization has more than one
data center, and if the data centers are sufficiently distant from each other
(e.g., at least 20-30 miles), you could replicate data from one data center to
the other and mitigate the impact of a staffing loss by spreading your IT
staff across sites and ensuring there is plenty of cross-training of all
employees.
Alex Barrett, Guest Contributor
In the context of information technology, the change management plan --
and its kissing cousin configuration management -- are usually thought of as
subsets of IT service management, or ITSM. They require configuration data
about an organization's IT infrastructure and the services running on it.
They say the only constant is change, and nowhere is that more true than in
the data center. Despite all our practice dealing with change, doing so
gracefully and efficiently is still one of the most challenging aspects of IT
operations.
Change management helps IT operations professionals follow established
procedures for making changes to an environment -- or discover the
changes that cause a service to go awry, said Rob England, an IT consultant
and blogger known as The IT Skeptic based in Wellington, New Zealand.
According to England, these tools and processes can help IT departments
answer two central questions: "How fast and how accurately can you
assess the impact [of a change] to your organization?" and "Does the cost
of downtime exceed the cost of adding more processes and tools?"
Indeed, no one does change management for the hell of it. IT organizations
follow established practices and procedures in the hopes of minimizing
outages and maximizing service levels (the metric by which many of them
are judged). But while we all want more uptime and the better outcomes that
change management promises, the number of organizations that have
effective processes in place is small.
The CMDB letdown
Part of the change management problem is of the industry's own making. Not
so long ago, IT management vendors and practitioners got it in their heads
that the first step toward change and configuration management was to
implement an IT Infrastructure Library (ITIL)-inspired configuration
management database (CMDB).
At its core, a CMDB is simply a database that stores so-called
configuration items (CIs). CIs describe and track individual assets, how they
are configured, and their relationships to one another. That data is often
used in support of other IT management tools such as a service desk and
incident management.
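To make the idea of CIs and their relationships concrete, here is a minimal Python sketch that stores a few configuration items as a dependency map and walks it to answer an impact question: if this asset changes or fails, what else is affected? The CI names and the dictionary layout are invented for illustration and do not reflect the schema of any particular CMDB product.

```python
from collections import deque

# Hypothetical configuration items. Each key is a CI; each value lists
# the CIs that depend directly on it.
dependents = {
    "san-01": ["db-server-01"],
    "db-server-01": ["order-app"],
    "order-app": ["warehouse-portal"],
    "warehouse-portal": [],
    "mail-server-01": [],
}


def impacted_by(ci, dependents):
    """Return every CI that depends on `ci`, directly or indirectly."""
    seen = set()
    queue = deque([ci])
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


print(sorted(impacted_by("san-01", dependents)))
# ['db-server-01', 'order-app', 'warehouse-portal']
```

A breadth-first walk is used here only because it is simple. The hard part in practice is keeping the relationship data accurate, not traversing it.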
This sounds straightforward enough, but depending on whom you ask,
adoption of CMDBs has been somewhere between modest and downright
disappointing. While CMDBs are commonplace in the Fortune 1,000, the
number of implementations trails off for smaller organizations, said Ronni
Colville, an IT operations management analyst at Gartner.
Among the problems that organizations have cited are high costs for
software and consulting, difficulty in populating the database, intergroup
politics, and inflated expectations about CMDB capabilities.
"A CMDB sounds like a good idea in theory. In practice, if you try and
implement every little nuance, it's like driving pins in your eyes," said Brian de
Haaff, Citrix Systems' senior product line director for GoToAssist, the
company's IT service management offering.
Indeed, in the early days of CMDBs, many organizations undertook initiatives
without properly analyzing the work involved or the business justification,
said Gartner's Colville. As a result, she said, "there were a lot of false
[...] 'It doesn't solve world hunger. It's not making
dinner. What the heck?'"
England calls shops that need a CMDB "The 5% Club."
"There are 5% of organizations that are so complex that they need a CMDB
-- and have the resources to actually do it," he said. But for the remaining
95%, implementing such a project is rarely worth the cost, time or effort,
England said.
"The main reason you would do a CMDB project is for impact assessment,"
England noted. "If people can answer questions about the impact of a
change fast enough, then you don't need to invest in a CMDB."
For that 5% of shops that have paid their dues implementing a CMDB,
however, it can be a beautiful thing.
In part two of this article, see how a large packaged foods corporation is
using CMDB to pinpoint problems to keep production flowing in its
warehouses.
Paul Kirvan, Guest Contributor
Once you have drawn up a detailed disaster recovery plan, the next stages
in the project are twofold: to prepare and deliver disaster recovery
awareness and training programmes so all employees are prepared to
respond as required by the plan in an emergency, and to carry out
disaster recovery testing to ensure the plan works properly and that DR
teams know their roles and responsibilities.
ISO/IEC 27031:2010, Information technology - Security techniques - Guidelines for information and communication technology readiness for business continuity
This is the global standard for IT disaster recovery as it applies to end users.
Another ISO standard, ISO/IEC 24762, addresses information and
communications technology disaster recovery from a service provider
perspective. Both these standards can help you develop and implement ICT
disaster recovery programmes.
Disaster recovery awareness and training strategies
[...] implemented to ensure that processes are in place to regularly promote ICT
DR awareness in general, as well as assess and enhance competency of all
relevant personnel key to the successful implementation of ICT DR [...]
Perhaps the most important strategy in raising disaster recovery awareness
is to secure senior management support and funding for DR programmes.
Visible and frequently occurring endorsements from senior management will
help raise awareness of and increase participation in the programme.
The next key strategy is to engage your human resources (HR) organisation
in the process. They have the expertise to help you organise and conduct
awareness activities, such as department briefings and messages on
employee bulletin boards. You can also encourage HR to incorporate
briefings on DR as well as business continuity into new employee induction
programmes.
Another important strategy is to leverage the Internet. If your organisation
has an intranet, launch a DR page that describes what your programme
does; answers FAQs; and provides links to forms and services, schedules,
and other relevant materials.
Be sure that any awareness activities are approved by management and HR,
as well as your own IT management. Your messages should be informative [...]
activities.
Building an awareness and training plan
Here are additional activities for successful disaster recovery awareness
and training programmes:
Conduct an awareness and training needs analysis.
Assess existing staff competencies regarding roles in DR plans.
Establish an ongoing awareness and training programme.
Establish record-keeping of staff training and awareness activities.
Establish competency levels for IT staff and how they should be maintained.
Conduct staff performance assessments post-disaster and re-evaluate training.
As part of these activities, you should develop and conduct training on:
Technical recovery activities
Emergency response activities, for example, situation assessment and evacuation
Specialised recovery, such as recovering to hot sites or cold sites or third-party managed DR services
Return-to-normal activities
Restoration of business systems and processes
Since you will be working with a variety of vendors and specialised service
providers, examine their training programmes to see if they can be
leveraged into your internally developed training activities.
Disaster recovery testing strategies
The most important strategy in disaster recovery testing is simply to test,
test and test again. Your organisation depends on the availability of IT [...]
operational but that they can survive an unplanned outage. Disaster
recovery testing will ensure that all your efforts to provide recovery and
resilience will indeed protect critical IT assets.
[...] instances, the whole set of IRBC [ICT readiness for business continuity]
elements and processes, including ICT recovery, cannot be proven in one [...]
that continually addresses the entire spectrum of operational and
administrative activities that an ICT organisation faces.
Based on the size and complexity of your IT infrastructure, disaster recovery
testing activities should address recovery of hardware, software, data and
databases, network services, data centre facilities, people (for example,
relocation of staff to an alternate site), and the business. For each of these
factors, critical information will be identified in the business impact analysis,
or BIA.
Types of tests
ISO 27031 makes some key points with regard to disaster recovery testing:
[...] should not expose the organisation to an unacceptable level of risk. The test
and exercise programme should define how the risk of individual exercise is
addressed. Top-management sign-off on the programme should be obtained
and a clear explanation of the ass[...]
[...] wider business continuity management scope and objectives and
complementary to the organisation's broader exercise programme. Each
test and exercise should have both business objectives (even where there is
no business involvement) and defined technical objectives to test or validate [...]
Since there are many aspects of an IT environment to be tested, there are
different kinds of tests to be initiated. This figure shows the three basic IT
DR tests.
Types of IT disaster recovery tests
Basic disaster recovery testing begins with a desktop walk-through activity,
in which DR team members review DR plans step by step to see if they make
sense and to fully understand their roles and responsibilities in a disaster.
The next kind of test, a simulated recovery, impacts specific systems and
infrastructure elements. Specifically, tests such as failover and failback of
critical servers are among the most frequently conducted. These tests not
only verify the recoverability of primary and backup servers but also the
network infrastructure that supports the failover/failback and the
specialised applications that effect failover and failback.
Operational exercises extend the simulated recovery test to a wider scale,
typically testing end-to-end recovery of multiple systems, both internal and
external, the associated network infrastructures that support connectivity of
those assets, and the facilities that house primary and backup systems.
These tests are highly complex and pose a higher level of risk than
other tests, as multiple systems will be affected. Loss of one or more
critical systems from this kind of test could result in a serious disruption to
the organisation.
Tests have several key goals, as stated in ISO 27031:
Build confidence throughout the organisation that resilience and recovery strategies will satisfy the business requirements.
Demonstrate that critical ICT services can be maintained and recovered within agreed service levels or recovery objectives regardless of the incident.
Demonstrate that critical ICT services can be restored to pre-test state in the event of an incident at the recovery location.
Provide staff members with an opportunity to familiarise themselves with the recovery process.
Train staff and ensure they have adequate knowledge of ICT DR plans and procedures.
Verify that ICT DR plans are synchronised with the ICT infrastructures and business environment.
Identify opportunities for improving ICT DR strategies or recovery processes.
Provide audit evidence and demonstrate the organisation's ICT service competence.
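The second goal above, recovery within agreed recovery objectives, lends itself to a simple automated check once test timings have been recorded. The following is a minimal sketch only: the `FailoverResult` record, the service names and the timings are hypothetical and are not taken from ISO 27031.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class FailoverResult:
    """Outcome of one failover/failback test (all fields are illustrative)."""
    service: str
    rto: timedelta        # agreed recovery time objective
    measured: timedelta   # time actually taken to recover during the test


def services_missing_rto(results):
    """Return the names of services whose measured recovery exceeded the RTO."""
    return [r.service for r in results if r.measured > r.rto]


# Hypothetical test results, for illustration only.
results = [
    FailoverResult("payroll-db", timedelta(hours=4), timedelta(hours=3, minutes=10)),
    FailoverResult("email", timedelta(hours=2), timedelta(hours=2, minutes=45)),
]
print(services_missing_rto(results))  # prints ['email']
```

A list like this could feed the post-test review as evidence of which services did not meet their recovery objectives.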
Developing disaster recovery testing plans
IT disaster recovery testing plans provide a step-by-step process for:
Setting the stage of the exercise by defining the test scope
Defining test objectives
Defining success criteria
Defining the ICT assets to be tested
Defining the roles and responsibilities of test participants
Defining exercise steps in a logical sequence, plus unannounced injects that challenge the delegates in how they respond to unanticipated changes
Conducting a post-test review of what worked, what did not and lessons learned
Revising the DR plans based on test results
If possible, retesting the plan to ensure the changes work as intended
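The planning elements listed above can be captured in a small record so that an incomplete test plan is spotted before the exercise begins. This is a sketch under stated assumptions: the `DRTestPlan` class, its field names and the example values are hypothetical and are not a prescribed template.

```python
from dataclasses import dataclass, field


@dataclass
class DRTestPlan:
    """Hypothetical container for the planning elements of a DR test."""
    scope: str
    objectives: list = field(default_factory=list)
    success_criteria: list = field(default_factory=list)
    assets: list = field(default_factory=list)
    roles: dict = field(default_factory=dict)   # participant -> responsibility
    steps: list = field(default_factory=list)   # exercise steps, in order

    def missing_sections(self):
        """List the plan sections that are still empty."""
        return [name for name, value in vars(self).items() if not value]


# Illustrative plan with several sections not yet filled in.
plan = DRTestPlan(
    scope="Failover of the finance database to the backup site",
    objectives=["Confirm the database can be recovered at the backup site"],
    assets=["finance-db-primary", "finance-db-backup"],
)
print(plan.missing_sections())  # prints ['success_criteria', 'roles', 'steps']
```

Post-test review notes and plan revisions are left out of the sketch because they are produced after the exercise rather than defined in advance.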
The following list provides a suggested table of contents for an IT DR test.
completed, such as researching the systems to be tested, researching
existing recovery procedures, identifying test scripts (if any), creating and
approving test scripts, coordinating with other IT departments and business
units in the company, and coordinating with external vendors and service
providers.
Next activities
Once your DR plans have been tested and your awareness and training
plans are under way, the next steps are to set up a maintenance
programme and an audit and review programme. The first ensures all
the previous DR activities we have been discussing are scheduled for annual
or semiannual review, testing and updating. The second ensures that all DR
programme activities are aligned with established policies and operational
controls. Another part of the audit process is to establish a process of
continuous improvement. This ensures that DR programmes remain aligned
to the business as well as international standards and good DR practice.
Paul Kirvan, Guest Contributor
How do you know your business continuity and disaster recovery (BC/DR)
programs and associated activities are performing up to expectations?
Setting metrics and expectations gives you the opportunity to check your
program's performance against your goals. For example, performance
metrics addressing the frequency of BC plan exercises and business impact
analysis (BIA) updates will help ensure proper plan performance. Be sure to
involve your quality assurance (QA) and internal audit (IA) departments in
performance evaluations.
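As an illustration of a frequency metric, the sketch below flags activities whose last completion is older than a target interval. The activity names and the intervals (roughly six months for exercises, one year for BIA updates) are assumptions chosen for the example, not values from ISO 22301.

```python
from datetime import date, timedelta

# Hypothetical target intervals; set these to match your own program's goals.
TARGET_INTERVALS = {
    "bc_plan_exercise": timedelta(days=182),
    "bia_update": timedelta(days=365),
}


def overdue_activities(last_done, today):
    """Return activities whose last completion is older than the target interval.

    Activities with no recorded completion date are treated as overdue.
    """
    overdue = []
    for activity, interval in TARGET_INTERVALS.items():
        done = last_done.get(activity)
        if done is None or today - done > interval:
            overdue.append(activity)
    return overdue


# Illustrative completion dates.
last_done = {
    "bc_plan_exercise": date(2016, 1, 15),
    "bia_update": date(2015, 3, 1),
}
print(overdue_activities(last_done, today=date(2016, 6, 1)))  # prints ['bia_update']
```

A report like this gives QA and internal audit a concrete, repeatable figure to review alongside the plan itself.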
In Section 9, Performance Evaluation, of the global business continuity
standard ISO 22301:2012, Business Continuity Management Systems --
Requirements, the following three subsections address performance
evaluation in detail:
9.1 -- Monitoring, Measurement, Analysis and Evaluation
9.2 -- Internal Audit
9.3 -- Management Review
It is important to examine what happens when something out of the ordinary
occurs, such as a minor operational disruption, system or technology
outage, or supply chain interruption, and use those lessons learned to
improve your ability to anticipate potential disruptions. It is also helpful to
study real-world examples of disaster response in organizations similar to
your own. The information that you gather will allow you to recommend
modifications to existing operational, strategic, planning, financial, legal,
technological, structural, physical, intellectual and human-based activities so
as to increase their reliability, resilience and recoverability from disruptive
incidents -- minimizing the impact to business operations.
Here's how this works:
In both cases, the business continuity staff examined key operations within
the company in detail. A business impact analysis (BIA) is typically used to
gather information. Data from a BIA and risk assessment (RA) should
identify what could happen if there was a disruption to the supply chain,
technology or other important business function. Analysis of other
companies' experiences can shed light on possible outcomes of a supply
chain and/or technology failure and will also identify strategies to prevent
these disasters from occurring.
By analyzing all elements in a supply chain, for example, and asking pointed
questions regarding the impact of a supply chain disruption, business
continuity analysts can pinpoint areas of greatest risk to a supply chain and
thereby also identify strategies to prevent disruptions and mitigate the
severity of disruptions that may occur. The same can be true of critical
technology operations.
Performance evaluation of BC/DR programs should be an ongoing activity.
An organization's BC staff should regularly examine all aspects of company
business operations, identify internal/external risks to those operations and
then identify potential solutions to address those risks. Outcomes may come
in the form of modifications to BC plan procedures, updates to BC policies,
revisions to IT infrastructure operations, changes to training programs and
revisions to plan exercises.
It's been said time and again that business continuity and disaster recovery
plans are living documents. They reflect current business operations and
requirements, and as such must be fluid enough to adapt quickly and
dynamically reflect changes in those operational attributes. A key part of the
performance evaluation process is that it is an ongoing activity. It's not
something that occurs annually or on an ad hoc basis.
Summary
By constantly looking for ways to improve business operations and reduce
the likelihood of emergencies, BC/DR professionals can ensure that their
efforts will keep the organization, its supply chain, its technology
infrastructure and its employees performing in the most resilient ways
possible.
Caroline Donnelly, Datacentre Editor
The unsanctioned use of cloud services by employees is a common problem
within many organisations, and one that Cumbria County Council found itself
facing up to in early 2014.
The use of consumer-grade cloud file-sharing services was pervasive within
the council at this time, as employees sought ways to sidestep the file size
restrictions on their email accounts when passing documents to colleagues and
external stakeholders.
In light of the sensitive nature of some of the information being shared, the
council knew it had to act, but issuing a blanket ban on using these services
was out of the question. At least, says Kevin Maxwell, service support
manager at Cumbria County Council, until a suitable and appropriate
alternative could be procured.
certain cloud services using the internal network, but we knew if we just did
that without offering an alternative it would have created resentment
atmosphere and people
Give and take
After assessing a range of enterprise-ready products and services, the
public sector-focused cloud-based collaboration system for regulatory compliance and
ease of use reasons.
-based public body, so we have to conform to PSN requirements
and other governmental security legislation, and we were specifically looking
for a solutio
says.
-sharing solutions people were finding for themselves were
hosted all over the world with no guarantee about the security measures in
Any file-sharing platform the council decided to use would need to let
employees share documents with external third-parties without them
requiring an account, he adds.
want to go through the overhead of setting people up with accounts on the
network for a one-
Maxwell says.
For example, Maxwell
regularly receives from members of the public conducting genealogical
research.
birth certificates, for example, which do not always fit in the limits of a
while the information
Objective Connect, with the service allowing team members to share
important documents, often at short notice, for use in court cases.
th hour to share
sensitive and important case material with a barrister who might be going to
court that afternoon. So it is essential for them to set up access for external
Storm clouds gather
The importance of being able to share huge files containing critical
information with external parties was reinforced in December 2015 when an
extratropical cyclone, dubbed Storm Desmond, hit Cumbria, leaving a trail of
destruction.
In 24 hours, 341.4mm of rain fell on Cumbria, flooding around 6,500 homes
and leaving 45,000 without power. Key roads and bridges within the region
were also severely damaged, prompting the local police to classify the freak
The strength and security of around 600 roads, bridges and other pieces of
key infrastructure within the area needed to be assessed afterwards to work
out how best to repair and restore them.
economy and highways team, responsible for overseeing this on-going
process, which involves compiling huge reports to detail the damage
inflicted.
time, and most of those files were 20MB to 30MB apiece, with photos in
them as well. It quickly became a huge beast of data we were moving
the asset and the cost, requiring input from external contractors and civil
engineers.
what resources and when, because you get updates when the other party
that information onto design so they can come up with solutions, and that
Meanwhile, the list of assets his team needs to keep a watchful eye on
continues to grow, as a result of subsequent weather events causing fresh
damage.
-survey some of the bridges because of high
around £5m to £6m of resurfacing work we need to get up and running on
the higher-level roads before the temperature starts dropping as we move
into autumn, because the work
Assessing the options
Maxwell says setting up a SharePoint site for Sheard and his team would
have bee
chaos as a result of Storm Desmond too.
Maxwell.
to work because their homes had been flooded.
supporting staff around
Onwards and upwards
While Objective Connect has proved a sound investment, Maxwell admits
the council has taken a tentative approach to adopting cloud technologies,
because of concerns about the maturity and reliability of off-premises
technologies.
the direction of travel is that we will start to go to the cloud more and more
-premise
felt the cloud market is mat
Images: Fotolia
© 2016 TechTarget. No part of this publication may be transmitted or reproduced in any form or by any means
without written permission from the publisher.