Disaster recovery. prepare.plan.perform.

Disaster Recovery: Prepare, Plan & Perform


Transcript of Disaster recovery. prepare.plan.perform.

Page 1: Disaster recovery. prepare.plan.perform.

Disaster Recovery: Prepare, Plan & Perform

Page 2: Disaster recovery. prepare.plan.perform.


Agenda

• Introductions

• Key things to know about disaster recovery requirements and expectations in today’s world

• How a large manufacturer moved to meet disaster recovery challenges

• Supporting multiple customers through data loss and recovery

• Q&A

We are live tweeting and answering questions using hash tag: #DellDP. Join us at tweetchat.com/room/DellDP

Page 3: Disaster recovery. prepare.plan.perform.


Introducing our speakers

Moderator: George Crump

Storage Switzerland, www.storage-switzerland.com

Rob MacCara,

System Administrator

Maritime Paper Products, Ltd.

Robby Wright

Chief Technical Consultant

Abtech Technologies

Page 4: Disaster recovery. prepare.plan.perform.

Key things to know about disaster recovery requirements and expectations in today’s world

Page 5: Disaster recovery. prepare.plan.perform.

About George Crump & Storage Switzerland

• Analyst firm covering storage, cloud and virtualization markets

• Knowledge of these markets is gained through product testing, real-world implementations and interactions with users and suppliers

• The results of this research are found in the articles, briefing reports, case studies and lab reports on www.storage-switzerland.com

George Crump, Chief Steward, Storage Switzerland

[email protected]

twitter.com/storageswiss

youtube.com/user/storageswiss

Page 6: Disaster recovery. prepare.plan.perform.

Meeting the Recovery Expectation


Page 7: Disaster recovery. prepare.plan.perform.

• Users expect systems to be up 100% of the time, just like Facebook, or they expect outages to be minimal: minutes of downtime, not hours

• Meeting this expectation means that data can no longer be "restored" over the network; the transfer is too time-consuming

• Data has to be recoverable in place, and it has to be readable the first time, which means it must be verified

Page 8: Disaster recovery. prepare.plan.perform.

Meeting the Zero Data Loss Expectation


Page 9: Disaster recovery. prepare.plan.perform.

• Users expect you to have every copy and version of their data protected all the time, just like Dropbox

• Traditional once-a-night backup is no longer enough. Too much data is created, modified and potentially deleted in a day

• Backup has to occur at multiple points throughout the day, potentially hourly, without impacting performance
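A rough illustration of why backup frequency matters, assuming the worst case in which a failure hits just before the next scheduled backup would have run. The function names are invented for this sketch and do not come from any backup product.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval: timedelta) -> timedelta:
    """Worst case: everything written since the last backup is lost,
    so the exposure equals the full interval between backups."""
    return backup_interval


def backups_per_day(backup_interval: timedelta) -> int:
    """Number of backup runs needed per day at a given interval."""
    return int(timedelta(days=1) / backup_interval)


nightly = timedelta(hours=24)
hourly = timedelta(hours=1)

for label, interval in (("nightly", nightly), ("hourly", hourly)):
    print(label, "worst-case loss:", worst_case_data_loss(interval),
          "runs per day:", backups_per_day(interval))
```

Moving from nightly to hourly shrinks the worst-case exposure from a day to an hour, at the price of 24 runs a day, which is why the impact on production performance becomes the limiting factor.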


Page 10: Disaster recovery. prepare.plan.perform.

Meeting the Keep It Forever Expectation


Page 11: Disaster recovery. prepare.plan.perform.

• Tape makes this difficult because you have to manage time expectations. Disk makes this difficult because you have to manage expense expectations. - Try Tape, Require scalable deduplicated disk

• Users expect you to keep all their data forever, near-instantly available...for free

• The reality is that 99% of the data will never be needed again. The problem is that you don't know where the 1% you do need is going to come from.

Page 12: Disaster recovery. prepare.plan.perform.

Meeting Cloud Expectations

• Backup may be “all about recovery,” but backups matter
  o Granular backups
  o Frequent backups
  o Validated backups

• Recovery needs to change
  o In-place
  o Virtual

Page 13: Disaster recovery. prepare.plan.perform.


How a large manufacturer moved to meet disaster recovery challenges

Page 14: Disaster recovery. prepare.plan.perform.

Rob MacCara

• System Administrator, Maritime Paper Products, Ltd., Nova Scotia

• 30 years of IT experience

• Built his own company building, selling, and maintaining Digital Video Surveillance systems for both small and large retail customers

• 20 years in the Royal Canadian Navy as a communications technician specializing in computer systems

Page 15: Disaster recovery. prepare.plan.perform.

Maritime Paper Products: manufacturing market leader

• Major corrugated box manufacturer with multi-continental reach

• Factories across the Canadian Atlantic provinces

• Forward-thinking with both automation and sustainable manufacturing

• Massive daily output

Page 16: Disaster recovery. prepare.plan.perform.


Maritime Paper’s business-critical applications

• Peak summer periods have us running 24 hours a day, with three shifts

• We depend on:
  o Domain controller
  o Microsoft SQL Server apps
  o Microsoft Exchange Server

• A server failure can stop everything and close plants, resulting in potentially very large revenue losses

Rob standing in front of one of Maritime Paper’s high-speed box fabricating machines

Page 17: Disaster recovery. prepare.plan.perform.


Challenge of a failing legacy backup scheme

• Time-consuming rotation of tapes, full backups to removable disks

• Older server population prone to major failures

• Day-long file recoveries

• File backup only

• Virtualization effort

In Maritime’s high-volume production environment, the pressure is on to deliver to customers on time so their customers are also on time.

Page 18: Disaster recovery. prepare.plan.perform.


Time for a change

Virtualization was the last straw

• Existing backup software wasn’t up to the task

• Choice seemed to be either an expensive upgrade or having to settle for greatly reduced backup capability

Maritime reached a point where its existing legacy system would require a costly upgrade.

Page 19: Disaster recovery. prepare.plan.perform.


Looking for solutions

Key requirements

We looked for an advanced backup and disaster recovery solution that:

• Worked across virtual and physical servers

• Reduced recovery times

• Restored to any type of machine

• Included replication

Page 20: Disaster recovery. prepare.plan.perform.

Why we decided on Dell AppAssure

A “night and day difference”

• Strong ROI and a wealth of advanced backup and recovery features

• No more tape

• Recoveries in minutes

• Simplified recovery offsite

Page 21: Disaster recovery. prepare.plan.perform.


Results

• 95% savings in storage space

• Minutes to recover a lost file, versus 24 hours with our previous data protection product

• $28,000 savings in software and hardware

Page 22: Disaster recovery. prepare.plan.perform.


Fast forward to today

A strong DR, migration and testing solution

• Pair of virtual AppAssure Core machines for local and offsite recoveries

• Push-button failover to virtual

• Simplified creation of DR site

• Fast restores at any level

• Rapid P2V and V2V migrations

• Faster DR testing

Page 23: Disaster recovery. prepare.plan.perform.

Recovery proof point

The primary domain controller’s RAID backplane recently failed

• Resulted in a major crash

• Up and running in less than an hour as a virtual machine, including troubleshooting the server

• Avoided:
  o All-nighter
  o Panic
  o Revenue loss

• Costs and staff time would have been much higher without Dell AppAssure

Page 24: Disaster recovery. prepare.plan.perform.


What we learned

Now we know:

• Our data is safe

• It’s being continually backed up

Page 25: Disaster recovery. prepare.plan.perform.

Supporting multiple customers through data loss and recovery

Page 26: Disaster recovery. prepare.plan.perform.

About Robby Wright & Abtech

Robby Wright, Chief Technical Consultant, Abtech Technologies

www.abtechtechnologies.com

Page 27: Disaster recovery. prepare.plan.perform.


Two Customers – Similar Solutions

• Customer #1 – A power company in Texas that needs to protect against storm damage at their primary site
  – In a hurricane area
  – Flat ground and near a river, which increases the possibility of flooding
  – Can’t afford to lose customer information and billing capabilities
  – Must be able to pay vendors after disaster damage

• Customer #2 – A web conference hosting company with global coverage
  – Hosting equipment sites on multiple continents
  – Hosts very large conferences with 40,000+ attendees
  – Sites back each other up

Page 28: Disaster recovery. prepare.plan.perform.

The Power Company

• Has two sites connected by dark fiber optic cables

– Set up with a 10 Gbit IP connection between sites

• Using NetVault Backup and SmartDisk for normal backup processes

• NetVault SmartDisk’s new replication feature allows painless replication of deduplicated backup data between sites

• Mix of approximately 60 physical servers plus virtual machines

• Many databases, email, portals

• Needs long backup chains for regulatory compliance – Tape allows this

Page 29: Disaster recovery. prepare.plan.perform.

How do we plan the backups?

• Determine what needs to be backed up

• Determine priorities for both backup and recovery

• Set RPO and RTO for each

• Determine backup window availability

• Do we need special handling for databases or other applications?

• Make sure we have what we need to recover

• Design backup system to meet requirements

Page 30: Disaster recovery. prepare.plan.perform.

All Jobs Start With A Server Survey

Customer Name:

Priority | Server or VM Name | Operating System | Application on Server | RTO | RPO | IP Address | Disk Allocated (GB) | Disk Used (GB) | Dependencies
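One way to hold the survey in code is a small record per server, with the same columns as the form. This is only a sketch: the class name, field names, units (hours for RTO and RPO) and the sample rows are assumptions made for illustration, and the IP addresses come from the reserved documentation range.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ServerSurveyRow:
    """One line of the server survey (field names are illustrative)."""
    priority: int                 # 1 = recover first
    name: str                     # server or VM name
    operating_system: str
    application: str
    rto_hours: float
    rpo_hours: float
    ip_address: str
    disk_allocated_gb: float
    disk_used_gb: float
    dependencies: List[str] = field(default_factory=list)


def recovery_order(rows):
    """Sort by priority, then by the tightest RTO within a priority."""
    return sorted(rows, key=lambda r: (r.priority, r.rto_hours))


def total_used_gb(rows):
    """Data volume that actually has to be backed up."""
    return sum(r.disk_used_gb for r in rows)


# Hypothetical sample rows, not taken from any real customer survey.
survey = [
    ServerSurveyRow(2, "mail01", "Windows Server", "Exchange", 8, 24,
                    "192.0.2.20", 500, 310, ["dc01"]),
    ServerSurveyRow(1, "dc01", "Windows Server", "Domain controller", 4, 24,
                    "192.0.2.10", 100, 40),
    ServerSurveyRow(2, "sql01", "Windows Server", "SQL Server", 4, 24,
                    "192.0.2.30", 1000, 650, ["dc01"]),
]

print([row.name for row in recovery_order(survey)])
print(total_used_gb(survey), "GB used")
```

Keeping dependencies on each row makes it easy to check that nothing is scheduled for recovery before the servers it relies on.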

Page 31: Disaster recovery. prepare.plan.perform.

Determine the RPO/RTO For Each Server

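RPO (Recovery Point Objective): How far back can data be lost?

RTO (Recovery Time Objective): How long after a failure until the system is up and usable?

[Timeline diagram: time runs through the failure point. RPO covers the lost data before the failure point; RTO covers the time down after it.]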
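The two objectives can be checked against an actual incident with simple timestamp arithmetic. The sketch below assumes three known times (last good backup, failure, service restored); the function names and the sample incident are hypothetical.

```python
from datetime import datetime, timedelta


def data_loss_window(last_good_backup: datetime, failure: datetime) -> timedelta:
    """Data written between the last good backup and the failure is lost."""
    return failure - last_good_backup


def downtime(failure: datetime, service_restored: datetime) -> timedelta:
    """Time from the failure until the system is up and usable again."""
    return service_restored - failure


def meets_objectives(last_good_backup, failure, service_restored,
                     rpo: timedelta, rto: timedelta) -> bool:
    """True when the incident stayed within both the RPO and the RTO."""
    return (data_loss_window(last_good_backup, failure) <= rpo
            and downtime(failure, service_restored) <= rto)


# Hypothetical incident: nightly backup at 02:00, failure at 14:30,
# service restored at 16:00 on the same day.
backup = datetime(2024, 1, 10, 2, 0)
failed = datetime(2024, 1, 10, 14, 30)
restored = datetime(2024, 1, 10, 16, 0)

print("lost data window:", data_loss_window(backup, failed))
print("time down:", downtime(failed, restored))
print("within RPO 24 h / RTO 8 h:",
      meets_objectives(backup, failed, restored,
                       rpo=timedelta(hours=24), rto=timedelta(hours=8)))
```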

Page 32: Disaster recovery. prepare.plan.perform.

The Power Company’s Recovery Objectives

• RPO – They can stand up to one day’s loss of data
  – Data can be re-input if necessary for the time lost
  – Backup to both disk and tape provides multiple fallbacks if necessary

• RTO – They need to be back up and running in 8 hours max.
  – Bare metal backup and recovery restores O/S and applications
  – SmartDisk and/or tape provides recovery of data

Page 33: Disaster recovery. prepare.plan.perform.


What is special handling?

• Exchange, SharePoint and databases either have to be stopped or need some method of taking a point-in-time backup so that they can be recovered.

• Active file systems require snapshot capabilities for accurate backup

• Virtual machines can be backed up as either a client or as a VM

• Use of plug-ins requires backup as a client, not a VM.

Page 34: Disaster recovery. prepare.plan.perform.

GB/Hr, Real Life

You must be able to get the data from the disk(s) to the tape drive fast enough!

File size x disk I/O rate = MB/sec.

My tape drive will record 100 MBytes/sec, so why is my backup slow?
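A small worked example of the formula, reading "disk I/O rate" as files read per second and "file size" as the average file size; both readings are assumptions, as are the sample figures. Decimal units are used (1 GB = 1000 MB).

```python
TAPE_DRIVE_MB_PER_S = 100.0  # drive speed quoted on the slide


def source_throughput_mb_per_s(avg_file_size_mb: float,
                               files_per_second: float) -> float:
    """File size x disk I/O rate = MB/sec delivered by the source disks."""
    return avg_file_size_mb * files_per_second


def gb_per_hour(mb_per_s: float) -> float:
    """Convert MB/sec to GB/hr using decimal units."""
    return mb_per_s * 3600 / 1000


# Hypothetical file server: many small files (50 KB average), 200 files/sec.
small_files = source_throughput_mb_per_s(avg_file_size_mb=0.05,
                                         files_per_second=200)

print("source delivers about", round(small_files, 1), "MB/sec")
print("tape drive could take", gb_per_hour(TAPE_DRIVE_MB_PER_S), "GB/hr")
print("source only supplies about", round(gb_per_hour(small_files), 1), "GB/hr")
```

With these numbers the disks feed the drive at roughly a tenth of its rated speed, so the backup is slow because of the source, not the tape drive.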

Page 35: Disaster recovery. prepare.plan.perform.

How do we speed up backups?

• Backing up to disk first allows multiple servers to back up at their own speed

• Backups can then be streamed to tape at full speed of the drive.

• Maximizes utility of and saves wear and tear on the tape drives
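The effect of staging to disk can be estimated with back-of-the-envelope arithmetic. The sketch below assumes a single tape drive rated at 100 MB/sec, a disk target able to absorb all server streams in parallel, and three hypothetical servers; none of these figures come from a real deployment.

```python
TAPE_MB_PER_S = 100.0  # assumed drive speed, as quoted on the earlier slide


def direct_to_tape_seconds(jobs):
    """Tape busy time when each server writes straight to the one drive,
    one after another, at that server's own (slower) speed."""
    return sum(size_mb / rate for size_mb, rate in jobs)


def staged_disk_window_seconds(jobs):
    """Backup window when all servers write to disk in parallel:
    the slowest server sets the finish time."""
    return max(size_mb / rate for size_mb, rate in jobs)


def staged_tape_seconds(jobs, tape_mb_per_s=TAPE_MB_PER_S):
    """Tape busy time when the staged data is streamed at full drive speed."""
    return sum(size_mb for size_mb, _ in jobs) / tape_mb_per_s


# (data size in MB, server send rate in MB/sec) for three hypothetical servers
jobs = [(100_000, 10.0), (100_000, 20.0), (100_000, 25.0)]

print("direct to tape, drive busy:", direct_to_tape_seconds(jobs), "s")
print("disk staging window:", staged_disk_window_seconds(jobs), "s")
print("tape streaming after staging:", staged_tape_seconds(jobs), "s")
```

In this example the drive is busy for 3,000 seconds instead of 19,000, which is the "wear and tear" saving the slide describes.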

Page 36: Disaster recovery. prepare.plan.perform.

Disk-To-Disk-To-Tape

[Diagram labels: NetVault Server (NVBK Server), Client (NV Client), SD Server, SD Agent, De-dup Process, SD Disk, Tape Library]

1. NV server tells client to perform backup to SD

2. Client sends data to SD server

3. Data is duplicated to tape library

4. If used, SD server de-dupes data

5. SD server stores data on disk

Page 37: Disaster recovery. prepare.plan.perform.


SmartDisk Replication

• SmartDisk can now replicate deduplicated data without rehydrating it

• This saves bandwidth

[Diagram labels: NetVault Server (NVBK Server); SD Server with SD Agent, De-dup Process and SD Disk #1; SD Server with SD Agent, De-dup Process and SD Disk #2]
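Why replicating without rehydration saves bandwidth can be shown with a generic chunk-hash sketch. This is not SmartDisk's actual algorithm or API; the fixed 4-byte chunks, the class and the function names are all assumptions chosen to keep the example tiny.

```python
import hashlib

CHUNK_SIZE = 4  # unrealistically small, for illustration only


def split_chunks(data: bytes, size: int = CHUNK_SIZE):
    """Cut the data into fixed-size chunks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class DedupStore:
    """Stores each unique chunk once, keyed by its SHA-256 digest."""

    def __init__(self):
        self.chunks = {}  # digest -> chunk bytes

    def ingest(self, data: bytes):
        """Store the data and return its recipe (ordered list of digests)."""
        recipe = []
        for piece in split_chunks(data):
            digest = hashlib.sha256(piece).hexdigest()
            self.chunks.setdefault(digest, piece)
            recipe.append(digest)
        return recipe

    def restore(self, recipe):
        """Rehydrate the original data from a recipe."""
        return b"".join(self.chunks[d] for d in recipe)


def replicate(source: DedupStore, target: DedupStore, recipe) -> int:
    """Copy only the chunks the target does not already hold.
    Returns the number of chunk bytes sent over the link."""
    sent = 0
    for digest in dict.fromkeys(recipe):  # each unique digest once
        if digest not in target.chunks:
            target.chunks[digest] = source.chunks[digest]
            sent += len(source.chunks[digest])
    return sent


primary, secondary = DedupStore(), DedupStore()

first = b"ABCDABCDABCDEFGH"           # 16 bytes, only 2 unique chunks
first_recipe = primary.ingest(first)
first_sent = replicate(primary, secondary, first_recipe)

second = b"ABCDEFGHIJKL"              # 12 bytes, 1 chunk is new
second_recipe = primary.ingest(second)
second_sent = replicate(primary, secondary, second_recipe)

print("first backup:", len(first), "bytes, sent", first_sent)
print("second backup:", len(second), "bytes, sent", second_sent)
```

A rehydrating replication would have sent all 28 bytes; here only the 12 bytes of unique chunks cross the link, and the second site can still restore both backups from its own store.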

Page 38: Disaster recovery. prepare.plan.perform.

How Well Does It Work?

• Customer had their main database server fail

• We recovered the OS in ½ hour

• Database data took another hour

• They were up and running in less than 2 hours

• Routine single file recoveries take seconds

• Second site recovery takes the same time

Page 39: Disaster recovery. prepare.plan.perform.


Customer #2: A global coverage web conference hosting company

• Customer has data centers in the U.S., England and Pacific Rim

• They wanted a backup system that would allow them to recover any office to another site

• They had very large network pipes between the sites

Page 40: Disaster recovery. prepare.plan.perform.

The Web Conference Company’s RPO/RTO

• RPO – They can stand up to one day’s loss of data
  – Clustered servers mean one can fail and others in the datacenter will fill in
  – Other sites can cover if a data center is lost
  – If data is available at another datacenter, it can be spun up quickly

• RTO – They need to be back up in 1 hour max.
  – Time is money!
  – Standardized server image allows quick duplication of a server
  – Customer setup data is the only important part of recovery

Page 41: Disaster recovery. prepare.plan.perform.


How We Did It....

• NetVault Backup has the capability of easily making a duplicate of a backup
  – It is built into the backup setup window
  – Just takes a few clicks
  – You can specify where you want the duplicate made
  – You can specify how you want the duplicate made
    – By the client
    – By the server

• SmartDisk became the target to allow very fast recovery

Page 42: Disaster recovery. prepare.plan.perform.


Making A Backup Copy – It’s Easy !!!


Page 43: Disaster recovery. prepare.plan.perform.


Site-To-Site Transfers

[Diagram: sites in San Jose, London and Hong Kong linked over the Internet. Labels: Client, NV Client, SD Agent and SD Disk (shown twice); NetVault Server (NVBK Server) with SD Agent and SD Disk]

Page 44: Disaster recovery. prepare.plan.perform.


Recovery....

• Because alternate sites have a backup copy, they can recover to other servers at the alternate site

• If the primary NetVault server is missing, a quick download and install of NetVault software creates a new NetVault server

• The SmartDisk at the alternate site is imported into NetVault

• NetVault allows you to recover the data to another server by simply selecting the server as the target

Page 45: Disaster recovery. prepare.plan.perform.

Q&A We are live tweeting and answering questions using hash tag: #DellDP

Join us at tweetchat.com/room/DellDP