DISASTER RECOVERY PLANNING - Macquarie...
TABLE OF CONTENTS
EXECUTIVE SUMMARY
CHAPTER 01 WHAT IS A DISASTER RECOVERY PLAN (DRP)?
CHAPTER 02 WHY SHOULD MY COMPANY HAVE ONE?
CHAPTER 03 WHAT DISASTERS SHOULD WE PREPARE FOR?
CHAPTER 04 HOW TO CREATE AN EFFECTIVE DRP
CHAPTER 05 COMMON DRP MISTAKES
CHAPTER 06 NEED HELP PROTECTING YOUR BUSINESS?
EXECUTIVE SUMMARY
This comprehensive guide will help
you understand what a disaster
recovery plan is and how to
effectively implement one for your
business. It was designed to serve
as a primer in disaster management
and preparation.
DISASTERS CAN BE DEVASTATING IF NOT PREPARED FOR
Whether man made or naturally
occurring, disasters can be a real
threat to a company’s survival. Some
common disasters include:
» Catastrophic security compromise
» Fire
» Flood
» Earthquake
» Power failure
In many cases the negative effects
of disasters can be prevented or
greatly reduced if your company is
ready to act and the proper steps are
promptly taken.
Careful disaster recovery planning
can mitigate damage and reduce the
risks of major data and profit loss.
EVERY EFFECTIVE DISASTER RECOVERY PLAN SHOULD FOLLOW THESE 6 SPECIFIC STEPS
Although every disaster recovery plan
will be different, all effective disaster
recovery planners should follow these
key steps:
» Delegate responsibilities
» Perform risk assessment
» List your recovery objectives
» Formulate your plan
» Test your plan
» Implement your plan
ABOUT MACQUARIE TELECOM
Macquarie Telecom provides dynamic hosting and communications
platforms that companies can truly rely on for the delivery of their corporate
communications and applications. Combining business-grade full line
telecommunications (voice, data and mobile) with Australian owned and located
hosting services, Macquarie Telecom is not just a single solution – it is an end-
to-end communications platform that enables customers to make smarter
decisions on how to run and build their businesses.
CHAPTER 1
WHAT IS A DISASTER RECOVERY PLAN?
In the event of a man made or
natural disaster, it is critical that
your company be properly prepared.
A disaster recovery plan (DRP)
provides clear instructions for the
recovery and protection of your IT
infrastructure in disaster scenarios.
A DRP can take the form of written
or verbal instructions, but, in order
to maximise effectiveness, it is
usually explained to staff through
training sessions and distributed in
written form.
WHAT’S THE DIFFERENCE BETWEEN A DRP AND A BUSINESS CONTINUITY PLAN?
Although sometimes confused, the
disaster recovery plan and business
continuity plan (BCP) are separate
concepts. The DRP is a subset of the
more general BCP, which contains
five components:
» Business Resumption Plan
» Occupant Emergency Plan
» Continuity of Operations Plan
» Incident Management Plan
» Disaster Recovery Plan
The disaster recovery plan provides a
guide for returning IT infrastructure
to normalcy following a disaster,
whereas the other elements of the
BCP deal with non-IT related issues.
CHAPTER 2
WHY SHOULD MY COMPANY HAVE ONE?
Every second of IT downtime
can have devastating effects on
customer experience, revenue and
your company’s image. An effective
disaster recovery plan can greatly
reduce the costs associated with a
disaster event.
By defining clear guidelines before
disaster strikes, you give your IT
team the tools they need to react and
recover faster.
DISASTERS CAN BE COSTLY
The financial ramifications of even
short periods of downtime can be
staggering. Costs can be in the form
of lost revenue, lost productivity, and
costs associated with returning to
normalcy.
Companies that rely on e-commerce,
telecommunications or other IT
services for their revenue stream can
be particularly affected by downtime,
with losses averaging $5,600 per
minute of downtime and reaching as
high as $11,000 per minute.[1]
Downtime is also remarkably
prevalent, even in the largest
companies. 59% of Fortune 500
companies experience an average of
1.6 hours of downtime every week,
which translates to an average yearly
downtime cost of $2.79 million.[2]
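As a rough illustration of the figures above, the following Python sketch estimates the cost of a single outage from a flat per-minute rate and converts a weekly downtime figure into hours per year. The function and constant names are hypothetical, and the rates are simply the cited averages; your own company's cost per minute may differ considerably.

```python
# Illustrative only: the per-minute rates are the figures cited in the text [1].
AVERAGE_COST_PER_MINUTE = 5_600
MAX_COST_PER_MINUTE = 11_000


def outage_cost(minutes, cost_per_minute=AVERAGE_COST_PER_MINUTE):
    """Estimated cost of an outage lasting `minutes` at a flat per-minute rate."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return minutes * cost_per_minute


def annual_downtime_hours(hours_per_week, weeks_per_year=52):
    """Convert an average weekly downtime figure into hours per year."""
    return hours_per_week * weeks_per_year


if __name__ == "__main__":
    print(outage_cost(30))                       # 30-minute outage, average rate
    print(outage_cost(30, MAX_COST_PER_MINUTE))  # same outage, highest cited rate
    print(annual_downtime_hours(1.6))            # 1.6 hours per week over a year
```

At the average rate, a 30-minute outage comes to $168,000, which shows how quickly even a short interruption adds up.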
DISASTERS CAN NEGATIVELY AFFECT YOUR COMPANY’S IMAGE
Although much more difficult to
quantify than financial costs, the
effects of disaster related downtime
on a company’s reputation can be
significant. Half of all companies
surveyed reported that downtime has
a negative effect on a company’s
image, and 35% reported that they
believe downtime would negatively
affect their customers’ loyalty.[3]
When customers cannot access
your website or applications, or
cannot get proper assistance,
goodwill can be significantly
diminished.
DISASTER RECOVERY PLANS REDUCE THE NEGATIVE EFFECTS OF DISASTER DOWNTIME
DRPs can help a company deal with
disaster in two ways.
THEY REDUCE THE LENGTH OF
DOWNTIME
DRPs can allow your company to
return to normalcy much more
quickly. This can be invaluable when
downtime costs can be as high as
$11,000 per minute.
THEY MAKE DISASTERS LESS
COSTLY
Aside from reducing the length of
downtime, disaster recovery planning
can actually make the effects of
downtime less destructive.
If your company has a contingency
plan to recover lost data and
communicate with customers, even
in periods of disaster, losses will be
minimised.
CHAPTER 3
WHAT DISASTERS SHOULD WE PREPARE FOR?
Disasters are generally defined as
any man made or natural event that
causes substantial destruction.
As they relate to the protection of
your data and IT infrastructure, the
most common disasters you should
be prepared for are:
NATURAL DISASTERS
FIRE
Fires can cause loss of
telecommunications infrastructure,
electricity, structures and personnel.
This is one of the most devastating
and common disasters.
FLOOD
Flooding can be caused either by
large natural phenomena such as
rainstorms, or by more modest
man-made sources like a water leak.
If your company or IT infrastructure is
located in a flood-prone area, floods
can be extremely destructive.
EARTHQUAKE
Major earthquakes can strike without
warning and are one of the most
difficult disasters to prepare for.
TROPICAL CYCLONE
These are seasonal disasters that
can severely damage coastal IT
infrastructure.
MAN MADE DISASTERS
POWER FAILURE
Power failure, especially for extended
periods of time, can be very difficult to
manage.
CATASTROPHIC SECURITY
COMPROMISE
Although not always categorised as
a disaster, IT security compromises,
such as hacking or deliberate actions
by a rogue employee, can nonetheless
be incredibly destructive and result
in a loss of data or customer trust.
Your team should be prepared for all
malicious attacks.
RIOTING
Riots caused by civil unrest or
other disasters can be difficult to
predict or control. Make sure your
IT infrastructure is always properly
secured.
This is only a cursory list of the
most common disasters affecting
IT infrastructure. In order to be fully
prepared, make your own list of
potential disasters.
CHAPTER 4
HOW TO CREATE AN EFFECTIVE DRP
Every disaster recovery plan will be
unique, as it must be customised to
fit each company’s risks and needs.
However, there are certain steps that
must always be followed in order to
create effective DRPs.
DELEGATE RESPONSIBILITY
This is an often overlooked step, but
one of the most important. At the
outset of the planning process, it
is critical that a leader be selected
and that responsibilities be clearly
delegated before disaster strikes.
This increases accountability and
efficiency during times of crisis.
GET UPPER MANAGEMENT
SUPPORT
If there is no management
commitment to the creation and
follow through on a DRP, then the
plan will not succeed. The leaders
of your company must be fully
responsible for the success, or
failure, of the plan so that it can attain
the necessary financial and human
resources.[4]
FORM A COMMITTEE TO OVERSEE
DRP CREATION
Once management is fully committed,
the next step is to form a committee
to create and approve the plan. This
committee will likely be composed of
technical experts within the company,
who will be responsible for developing
the content of the plan, and
management, who will approve and
oversee the plan’s implementation.
PERFORM A RISK ANALYSIS
In order to be properly prepared
for disaster, it is important to first
identify which disasters will have the
most impact on your business and
which are most likely to occur.
The risk analysis process identifies
the likelihood of a disaster occurring
and analyses the possible results
should that disaster occur. This
gives your company an idea of which
disasters pose the greatest threats
and allows you to properly prioritise
your resources.
ORDER THREATS BY THEIR RISK
SCORE
In order to create an objectively
prioritised list of the greatest threats
to your organisation, it is necessary
for the planning committee to
quantify each possible disaster.
Cisco Systems disaster recovery
experts recommend you start by
assessing the probability that each
event will occur on a scale from 1-10,
then assessing the potential impact
on your business and time to return
to normalcy using the same scale.
Add the scores together to get the
risk score for each threat.[5]
Below is an example risk assessment
table with the typical scores
associated with each threat. Your
business’s own scores may vary
depending on location and industry.
Business Risk Component   Probability   Impact       Total Risk Score
Human Error(s)            (5) Medium    (10) High    (15) M / H
Software Bug              (3) Low       (10) High    (13) L / H
Hardware Failure          (8) High      (10) High    (18) H / H
Security Breach           (3) Low       (6) Medium   (9) L / M
Fibre Cut                 (10) High     (10) High    (20) H / H
Natural Disaster          (2) Low       (10) High    (12) L / H
Civil Unrest              (2) Low       (5) Medium   (7) L / M
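The scoring method described above (probability plus impact, each on a scale from 1 to 10) can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed tool: the `Threat` class and `rank_threats` function are hypothetical names, and the scores are the example values from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threat:
    name: str
    probability: int  # likelihood of the event, 1-10
    impact: int       # business impact and time to recover, 1-10

    @property
    def risk_score(self):
        # The risk score is the sum of the two ratings.
        return self.probability + self.impact


def rank_threats(threats):
    """Return threats ordered from highest to lowest risk score."""
    for t in threats:
        if not (1 <= t.probability <= 10 and 1 <= t.impact <= 10):
            raise ValueError(f"scores for {t.name!r} must be between 1 and 10")
    return sorted(threats, key=lambda t: t.risk_score, reverse=True)


# Example scores copied from the risk assessment table.
EXAMPLE_THREATS = [
    Threat("Human Error(s)", 5, 10),
    Threat("Software Bug", 3, 10),
    Threat("Hardware Failure", 8, 10),
    Threat("Security Breach", 3, 6),
    Threat("Fibre Cut", 10, 10),
    Threat("Natural Disaster", 2, 10),
    Threat("Civil Unrest", 2, 5),
]

if __name__ == "__main__":
    for threat in rank_threats(EXAMPLE_THREATS):
        print(f"{threat.risk_score:>2}  {threat.name}")
```

With the example scores, Fibre Cut (20) and Hardware Failure (18) come out on top, so those threats would be addressed first when prioritising resources.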
DETERMINE POSSIBLE OUTCOMES
OF EACH RISK
Once the risks have been assessed and
ordered, the committee must begin
the process of listing all the possible
outcomes should any of the major
threats occur. This list will be your
template when deciding what issues
need to be addressed in your plan.
LIST YOUR OBJECTIVES
After preparing the risk assessment,
it’s time to start listing your recovery
objectives. This will enumerate the
goals of the DRP and allow you to
create more effective recovery plans.
PRIORITISE YOUR RECOVERY
Determine which business
applications and systems are the
most valuable to your business
and which can be offline for longer
periods of time. Outline which services
and functions of the business need
continuity, and which do not.
This provides the information
necessary to properly respond and
reduce the impact of a disaster.
DETERMINE YOUR RECOVERY POINT
OBJECTIVE (RPO)
The recovery point objective is the
farthest point in the past to which
your data may have to be rolled back
after a disaster; in other words, the
maximum amount of recent data your
company is prepared to lose. For
example, if the RPO is one hour, data
must be backed up to a separate secure
location at least every hour in order to
meet the objective. Your company’s
choice of RPO will likely depend on the
value of the data stored and the rate at
which your company generates data.
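To make the relationship between backup frequency and RPO concrete, here is a small Python sketch. It assumes, for simplicity, that each backup completes instantly, so the worst-case data loss equals the interval between backups. The function names are hypothetical.

```python
from datetime import datetime, timedelta


def meets_rpo(backup_interval, rpo):
    """True if backups taken every `backup_interval` can satisfy `rpo`.

    Simplifying assumption: backups complete instantly, so the worst-case
    data loss is exactly one backup interval.
    """
    if backup_interval <= timedelta(0):
        raise ValueError("backup_interval must be positive")
    return backup_interval <= rpo


def data_loss_window(last_backup, failure_time):
    """Amount of recent data lost if a failure occurs at `failure_time`."""
    if failure_time < last_backup:
        raise ValueError("failure_time is before last_backup")
    return failure_time - last_backup


if __name__ == "__main__":
    rpo = timedelta(hours=1)
    print(meets_rpo(timedelta(minutes=30), rpo))  # half-hourly backups
    print(meets_rpo(timedelta(hours=4), rpo))     # four-hourly backups
    print(data_loss_window(datetime(2024, 1, 1, 9, 0),
                           datetime(2024, 1, 1, 9, 45)))
```

In practice backups take time to run and transfer, so the interval usually needs to be somewhat shorter than the RPO itself.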
DETERMINE YOUR RECOVERY TIME
OBJECTIVE (RTO)
Commonly confused with recovery
point objective, the recovery time
objective is the maximum amount of
time an IT system or application can
be offline. For example, a recovery
time objective of four hours indicates
that systems must be back online
within four hours. Your recovery time
objective might vary depending on the
severity and type of disaster. It need
not reflect the most likely recovery
time; rather, it is the optimal time in
which normal operations should
resume.
FORMULATE A DRP
Once the prep work is done, the
committee can start creating the
DRP itself. This should take the form
of a written document and be made
available to all relevant parties once
it is completed. Additional verbal
training is also recommended.
CREATE A DETECTION PLAN
Prior to the actual recovery process,
it is necessary to first determine that
a disaster has actually occurred.
It is important that the DRP not be
initiated until a full assessment of
the damage has taken place, so as to
limit false alarms and unnecessary
disruptions to work activity.
DEVELOP THE RECOVERY
PROCEDURE
This is the most important step in
your DRP as it will determine how
your team responds after a disaster
has been declared. The disaster
planning committee must make
several key decisions to ensure a
quick return to normalcy.
» Make a first response directive.
The extent of the damage is often
determined in the early stages of a
disaster. For this reason it is critical
that the DRP include first response
instructions for likely disaster
scenarios. This might include
shutting off utilities, assessing
damage, preparing backup power,
and/or powering off equipment.
» Plan backup sites. To ensure that
work can continue with minimal
interruption, the committee
must choose backup sites where
computing can resume. There are
several options. Hot sites are
near replicas of the original working
site, with real-time backups of
data and fully equipped hardware
ready to be used immediately. Cold
sites are simply separate spaces
to which work operations can be
moved. They do not contain backup
hardware or data, so operations
may take some time to resume.
Because of the cost associated
with hot sites and the slow recovery
time of cold sites, many companies
choose to operate warm sites,
which are smaller scale versions
of the original work site, with data
backups that may be hours to days
old and backup equipment that
is not as extensive as that at the
original site.
» Source replacement hardware. In
the event that necessary hardware
is damaged or destroyed, it is
important to have a reliable,
up-to-date source for replacements.
Make a list of all mission critical
devices along with a reliable
replacement source. In some cases
your company may find it necessary
to keep backup hardware on hand or
to leverage an Infrastructure as a
Service (IaaS) platform to speed the
recovery of certain applications.
» Source backup personnel. During
some disasters, personnel may be
unable to work. In these cases it
can be necessary to call in
additional help. To expedite this
process, your recovery plan should
include a source of off-site human
resources.
» Make your plan responsive.
The best plans recognise that
it is impossible to foresee every
situation. When drafting your
DRP, include instructions that
are adaptable to a wide variety
of scenarios. This will allow your
recovery team to quickly get
operations up and running again,
no matter what happens.
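The list of mission critical devices and replacement sources recommended above lends itself to a simple automated check. The Python sketch below is one hypothetical way to keep such a list and flag gaps; the `Device` class, its fields, and the sample entries are illustrative assumptions, not part of any particular tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Device:
    name: str
    mission_critical: bool
    # Supplier, spare kept on hand, IaaS platform, etc. None if not yet decided.
    replacement_source: Optional[str] = None


def missing_replacement_sources(devices):
    """Names of mission-critical devices with no replacement source listed."""
    return [
        d.name
        for d in devices
        if d.mission_critical and not d.replacement_source
    ]


if __name__ == "__main__":
    # Hypothetical inventory for illustration.
    inventory = [
        Device("core-router", True, "spare kept on hand"),
        Device("database-server", True),
        Device("office-printer", False),
    ]
    print(missing_replacement_sources(inventory))
```

Running a check like this during each plan review helps ensure that no critical device is left without a documented replacement path.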
PLAN FOR RECONSTRUCTION
After the disaster has passed, your
team will need instructions on how
to return to normalcy. This might
include work site inspections for
structural damage, purchasing new
equipment, installing new hardware,
and systems testing. It should also
include guidelines for how and when
staff should return to work.
COMPILE THE DRP DOCUMENT
After the disaster recovery plan has
been carefully formulated, it must
be formatted into a clear, concise
document. The instructions should
be simple and easily followed, but
detailed enough to cover any potential
issues that might arise. Creating a
document that is effective in times
of an actual emergency can be
challenging, so significant effort
should be made to ensure that it is
well made.
TEST THE PLAN
This step will reveal any flaws in the
DRP and offer insights into how it
can be improved. The plan should
first be carefully reviewed by the DRP
committee and checked for obvious
errors. After this initial evaluation has
been completed, a dry run should be
initiated in which testers simulate
potential disasters.
Plans should be judged by how well
they meet their RTO and RPO goals
in simulations. If the results
of testing are unsatisfactory, it may
be necessary to significantly revise
the DRP.
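One way to judge a dry run against its objectives is to record the downtime and data loss observed in each simulation and compare them with the RTO and RPO. The following Python sketch illustrates the idea; `DrillResult` and `evaluate_drill` are hypothetical names, and the sample figures are invented for the example.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class DrillResult:
    scenario: str
    downtime: timedelta   # how long systems were offline in the simulation
    data_loss: timedelta  # how far back the restored data was rolled


def evaluate_drill(result, rto, rpo):
    """Return the list of objectives the simulated recovery failed to meet."""
    failures = []
    if result.downtime > rto:
        failures.append("RTO")
    if result.data_loss > rpo:
        failures.append("RPO")
    return failures


if __name__ == "__main__":
    # Hypothetical drill: five hours offline, thirty minutes of data lost.
    drill = DrillResult("power failure", timedelta(hours=5), timedelta(minutes=30))
    print(evaluate_drill(drill, rto=timedelta(hours=4), rpo=timedelta(hours=1)))
```

An empty list means the simulated recovery met both objectives; any entry points to the part of the DRP that needs revision.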
IMPLEMENT THE PLAN
After a DRP has successfully passed
the testing phase, it must be approved
by management. At this point
additional changes informed by cost or
resource concerns may be made and
the plan may have to go through more
rounds of testing and revision.
If the plan is approved, it should
be immediately implemented
by management. This includes
distribution of the written plan to
all relevant employees and training.
Management should also create
a regular review schedule for the
DRP so that it can be updated to
address any changes that may occur
in the future.
CHAPTER 5
COMMON DRP MISTAKES
Disaster recovery planning is a
complicated and difficult process
involving many people and long hours.
As such, it frequently results in an
imperfect DRP. We’ve listed some of
the most common mistakes to avoid.
NOT TRAINING EMPLOYEES ON THE DRP
After the DRP document has
been created and distributed, it is
important that the whole team be
trained on it. This will make the plan
clearer, give employees the chance to
practise using it, and allow them to
ask any questions they may have.
AIMING TOO LOW WITH RTOS AND RPOS
It is important to remember that
recovery point objectives and recovery
time objectives are goals, not
requirements. As such, they should be
set to represent the optimal recovery
process, not necessarily the likely one.
This will encourage your team to work
harder and be more diligent in the
planning and recovery process.
NOT CONDUCTING END USER TESTING
Until the end user is able to use an
application, testing is not complete.
Services may start and appear to be
working properly on the back-end
but be inoperable on the user’s end.
This situation can be among the most
damaging if not prepared for, as the
IT team will be unaware that there is
a problem and unable to respond.
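The gap between a service that has started and a service that users can actually use can be captured by running two independent checks. The Python sketch below is a hypothetical illustration: `classify_recovery` and its status strings are assumptions, and the two checks are passed in as plain callables so that any real probe (a process check, a scripted login, a test transaction) could be substituted.

```python
def classify_recovery(backend_check, end_user_check):
    """Classify a restored service using two independent checks.

    Both arguments are zero-argument callables that return True on success.
    """
    backend_ok = backend_check()
    user_ok = end_user_check()
    if backend_ok and user_ok:
        return "recovered"
    if backend_ok:
        # The case described above: services run, but users cannot work.
        return "backend up, unusable for end users"
    return "not recovered"


if __name__ == "__main__":
    # Stand-in checks for illustration: the service started, but a user
    # login attempt fails.
    print(classify_recovery(lambda: True, lambda: False))
```

Treating recovery as complete only when the end-user check passes keeps this failure mode from going unnoticed.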
NOT UPDATING THE PLAN
DRPs should be updated at least
once a year, and whenever a business
application or process is changed.
This ensures the plan includes
updated hardware, evolving business
structure, and other changes. Many
companies overlook this step only to
find a previously effective DRP doesn’t
perform under current conditions.
MAKING THE PLAN TOO COMPLICATED
Although plans should be
thorough and include all necessary
information, they should not be
overly long or complicated. Prioritise
information and present it in clear,
concise steps so your team can react
quickly and properly, even in the most
stressful situations.
NOT DELEGATING PROPER RESOURCES TO THE DRP
This incredibly common mistake
can destroy the chances of disaster
recovery success. Many companies
believe that the risks of disaster are
slim and disaster recovery planning
is a low priority. However, disasters,
though rare, can threaten the
viability of your company. Only those
companies that are properly prepared
will suffer minimal losses in the
worst disasters. Under-resourcing
also includes relying entirely on a
handful of individuals who may
themselves be affected by the same
disaster, for example a bushfire or
flood.
STORING THE DRP ON THE NETWORK
This may sound like common sense,
but we do know of at least one
company that kept their Disaster
Recovery Plan on the network and
were unable to access the information
during a disaster. Think holistically
about what elements are required to
reinstate a failed IT service, including
installation media, servers, storage
and installation instructions.
CHAPTER 6
NEED HELP PROTECTING YOUR BUSINESS FROM DISASTER?
Macquarie Telecom’s LAUNCH
Disaster Recovery provides
completely outsourced disaster
recovery solutions at the hypervisor
level.
With one of the lowest downtimes
of any disaster recovery service,
LAUNCH can help your company
mitigate losses and get up and
running again faster.
WANT TO LEARN MORE ABOUT HOW
LAUNCH CAN HELP YOUR COMPANY
PREPARE FOR DISASTER?
Contact Macquarie Telecom
on 1800 004 943 or visit
macquarietelecom.com
REFERENCES:
» [1] Ponemon Institute Study Quantifies Cost of Data Center Downtime. Emerson Network Power. 2011.
» [2] Assessing the Financial Impact of Downtime. Business Computing World. 2011. http://www.businesscomputingworld.co.uk/assessing-the-financial-impact-of-downtime/
» [3] http://www.arcserve.com/us/lpg/~/media/Files/SupportingPieces/ARCserve/avoidable-cost-of-downtime-summary-phase-2.pdf
» [4] http://www.drj.com/new2dr/new2dr/w2_002.htm
» [5] http://www.cisco.com/en/US/technologies/collateral/tk869/tk769/white_paper_c11-453495.html