Disaster Recovery with Amazon Web Services: A Technical Guide



An unplanned disruption to a business or data center can have devastating and far-reaching impact on productivity, customer service, supply chain, and revenue. As the marketplace becomes increasingly global and customers demand around-the-clock service, enterprises need to avoid business-disrupting events by implementing robust business continuity solutions.


Although cost-effective, multi-tenant, cloud-based disaster recovery (DR) solutions are available in the marketplace, many large enterprises have not yet adopted them. The limited adoption rate is primarily due to concerns about security, reliability, and performance.

Enterprises can reap the following benefits from cloud-based disaster recovery compared with traditional DR:

• Save up to 85% in reduced infrastructure capital expenditure for servers, storage, networking, and physical data center costs

• Lower ongoing recurring expenses such as co-location charges, power, and maintenance

• Increased efficiency for recovery time and recovery point objectives

• Greater speed to market: does not restrict growth in primary facility

• Entryway for organizations to gain familiarity and skills with AWS and cloud computing

In this paper, we will:

1. Define the challenges that enterprises face in adopting public cloud solutions for disaster recovery.

2. Describe the value that large enterprises can gain by adopting cloud-based DR with services such as Amazon Web Services (AWS).

3. Provide recommended disaster recovery architecture patterns.


ENTERPRISE CHALLENGES AND REQUIREMENTS

Enterprises face specific challenges in designing, deploying, and managing cloud-based DR solutions that satisfy their risk profile and business continuity requirements. Specifically:

Handling Large Oracle and SQL Databases: Deploying, replicating, and managing large databases over long distances presents a challenge in cloud computing. Enterprises must exercise care in designing their DR architecture and selecting database deployment locations, replication technologies, database designs optimized for cloud computing, and data deduplication strategies for minimizing database sizes.

Network Integration: One of the greatest challenges in cloud computing is the need to minimize latency between internal and cloud-based servers. Integration usually involves deployment of network optimization tools to monitor and manage both internal traffic and traffic between in-house infrastructure and data centers. External bottlenecks are particularly difficult to manage since they may be beyond the control of internal IT staff. IP addressing and network convergence are another challenge that enterprises face. Network woes have been eased with several AWS solutions such as Elastic IP addressing, Virtual Private Cloud (VPC), and Route 53, which have made it easier to manage internet addressing across data centers and the cloud.

Rapid Spin-up of Standby Machine Images: Minimizing business disruption in the event of a disaster requires an architecture that enables deployment of infrastructure resources reserved for rapid recovery. Selection and management of reliable technologies for this purpose is critical for business continuity and cost management.

Change Management: Maintaining systems, software, and services becomes a complex task for large enterprises. Moving infrastructure to the cloud does not remove this requirement, and change management must still be coordinated.

Lack of Cloud Computing Expertise: The rapid evolution of cloud technology outpaces availability of in-house subject matter experts who can assist enterprises in solutions architecture, SLA negotiations, and deployment. In addition to acquisition and development of internal capabilities, enterprises often need external experts with much broader and in-depth capabilities in cloud computing and data center optimization to develop and implement solutions.

Recovery Model: Businesses will need to scrutinize all aspects of their disaster recovery approach, even testing and overseeing all aspects of the recovery scenario, as a fully proven and valid recovery event is difficult to achieve.

Increasing Business Continuity Expectations: Enterprises are increasingly expected to provide round-the-clock service. High availability demands are pushing enterprises to blur the lines between disaster recovery and uninterruptible business continuity, which demands more innovative and challenging solutions.

Facing these challenges, an enterprise-level DR system must have the following characteristics and capabilities:

1. Ability to handle large, complex enterprise-scale databases.

2. Robust network connectivity between DR and production infrastructure, as well as other critical data sources.

3. Ability to rapidly scale to facilitate testing and accommodate fluctuations in business activity.

4. Automated orchestration of systems to minimize human error and increase speed of deployment.

5. Capability to provide large, permanent storage for both data and server images in multiple zones to protect against the effects of regional outages.

6. Access to expertise in designing, deploying, and operating high-availability DR systems in the event of wide-scale regional disasters.


DISASTER RECOVERY ARCHITECTURES

Enterprises have different levels of tolerance for business interruptions and therefore a wide variety of disaster recovery preferences, ranging from solutions that provide a few hours of downtime to seamless failover. Accenture’s deep expertise with infrastructure design and DR architectures, along with AWS’ leading service, provides a cost-efficient and optimized DR capability. Smart DR offers three design methods which meet the recovery needs of most enterprises using combinations of AWS services:

Method/Pattern    RTO                              Cost
Cold              Low RTO (>= 1 business day)      Lowest Cost
Pilot-Light       Moderate RTO (< 4 hours)         Moderate Cost
Warm              Aggressive RTO (< 1 hour)        Highest Cost

These solutions provide great versatility as they facilitate integration into current environments by leveraging current enterprise backup software. Several vendors, such as Commvault, Veritas, Backup Exec, etc., are developing and optimizing backup software integration with AWS. Environments that are already virtualized provide the best RTO with AWS due to ease of integration.

DISASTER RECOVERY COLD METHOD

The Cold Method is a traditional backup model that provides data replication to non-volatile media such as tape, which is then typically stored at a secure off-site location. As demand for storage and the need for faster recovery increase, traditional backup and recovery methods may no longer meet business requirements. The Cold Method enables an enterprise to realize the cost savings of cloud-based DR while addressing the challenges of flexibility and recovery time. This DR method also lowers DR costs by eliminating the need for duplicate infrastructure.

The Cold Method enables an enterprise to use its backup software of choice to replicate data into AWS’s cloud. In the event of a disaster, data could be restored from the cloud back onto in-house resources or made available to EC2 instances on AWS infrastructure. Additionally, backups of on-site virtual machine images could be restored for DR using the AWS Import function. This method suits enterprises that seek cost savings and can tolerate relatively low RPO requirements and slower recovery time objectives.

Figure 1: DR Cold Method (diagram labels: on-premise infrastructure in the corporate data center; replication options over the Internet, a VPN connection, AWS Import/Export, or AWS Storage Gateway; S3 bucket with objects; recovery/fail-over zones with scaling EC2 recovery instances; optional recovery with EC2)

Enterprise-level disaster recovery is primarily measured in terms of Recovery Time Objective (RTO) and Recovery Point Objective (RPO). RTO is a measure of the maximum amount of time within which operations are expected to be resumed after a disaster. RPO is a measure, in terms of time, of the maximum amount of data that can be lost as a result of a disaster.
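As a rough illustration of how an RTO target maps onto the three methods, the sketch below encodes the RTO bands from the method table (under 1 hour, under 4 hours, one business day or more). The function and class names are hypothetical, and treating every target of 4 hours or longer as Cold is an assumption, since the table leaves the range between 4 hours and one business day unspecified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrMethod:
    name: str
    relative_cost: str


# Cost labels come from the method table; the objects themselves are illustrative.
COLD = DrMethod("Cold", "Lowest")
PILOT_LIGHT = DrMethod("Pilot-Light", "Moderate")
WARM = DrMethod("Warm", "Highest")


def select_dr_method(rto_hours: float) -> DrMethod:
    """Return the cheapest method whose RTO band covers the target RTO."""
    if rto_hours <= 0:
        raise ValueError("RTO target must be positive")
    if rto_hours < 1:
        return WARM          # aggressive RTO, under 1 hour
    if rto_hours < 4:
        return PILOT_LIGHT   # moderate RTO, under 4 hours
    return COLD              # assumption: anything slower falls back to Cold


if __name__ == "__main__":
    for target in (0.5, 2, 24):
        method = select_dr_method(target)
        print(f"RTO {target}h -> {method.name} ({method.relative_cost} cost)")
```

An RPO target could be handled the same way, but the paper gives no RPO bands per method, so none are encoded here.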


Storage Layer: The Cold Method is supported by several replication methods as shown in Figure 1. Replication via the cloud, a VPN connection, the AWS Import/Export feature, or AWS’ Storage Gateway all facilitate a method in which data can be replicated from an on-premises system to AWS’s S3 or EBS storage. The replicated data is stored as objects in the S3 buckets that can be provisioned to an EC2 instance.
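Choosing between network replication and a shipped-media option such as AWS Import/Export usually comes down to how long the initial copy would take over the available link. The following is a back-of-the-envelope sketch only: the function names, the 80% link utilization default, and the shipping turnaround figure are assumptions for illustration, not values from this paper.

```python
def transfer_hours(data_tb: float, link_mbps: float, utilization: float = 0.8) -> float:
    """Hours needed to send data_tb (decimal terabytes) over a link_mbps link."""
    if data_tb < 0 or link_mbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("invalid input")
    bits = data_tb * 1e12 * 8                      # terabytes -> bits
    seconds = bits / (link_mbps * 1e6 * utilization)
    return seconds / 3600


def shipping_is_faster(data_tb: float, link_mbps: float, shipping_hours: float) -> bool:
    """True when shipping media (assumed turnaround) beats the network copy."""
    return transfer_hours(data_tb, link_mbps) > shipping_hours


if __name__ == "__main__":
    # Hypothetical case: 20 TB initial sync over a 100 Mbps VPN link,
    # compared with an assumed 5-day (120 h) shipping turnaround.
    print(round(transfer_hours(20, 100), 1), "hours over the network")
    print("ship media instead:", shipping_is_faster(20, 100, 120))
```

Incremental replication after the initial copy is typically far smaller, which is why the paper pairs a one-time bulk transfer with ongoing synchronization through the Storage Gateway.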

Application Layer: For the Cold Method, applications and critical data are stored in an S3 bucket. This repository provides a point of recovery for data to be recovered during a failure. At the point of disaster, an environment is provisioned at AWS. With cloud-based DR, automated processes can use AMIs, CloudFormation, or scripts along with the import function to help automate and speed up the Cold Method recovery capabilities.

Database Layer: With the expertise of Accenture resources, DR with AWS conducts initial assessments of tier 1 database systems and selects the resources that should be replicated into AWS. Certain database options would be decided on, such as taking backups of on-premises database servers or, in the event of a failure, restoring snapshots to an EC2 instance in AWS. Applications would then be re-pointed to the database server hosted in AWS.

Network Layer: The Cold Method uses a combination of AWS services to address the challenges of network integration and maximize business continuity:

• Route 53 and Elastic IP are configured to enable high availability and redundancy by diverting network traffic to AWS during a triggered event (disaster, testing, or load balancing need).

• Round Robin is configured to provide high availability, load balancing, and network addressing with no impact on loads in the local environment.
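One way the Route 53 traffic diversion described above might be expressed is as a pair of failover records, one for the primary site and one for the AWS recovery site. The sketch below only builds the change batch as a Python dictionary in the shape used by the Route 53 `ChangeResourceRecordSets` operation; it makes no API call. The host name, addresses (documentation ranges), and health check ID are placeholders, and the exact fields should be checked against the current Route 53 API reference before use.

```python
import json


def failover_change_batch(name, primary_ip, secondary_ip, health_check_id, ttl=60):
    """Build a change batch with PRIMARY and SECONDARY failover A records."""

    def record(role, ip, extra):
        rrset = {
            "Name": name,
            "Type": "A",
            "SetIdentifier": f"{role.lower()}-site",
            "Failover": role,               # "PRIMARY" or "SECONDARY"
            "TTL": ttl,                     # short TTL so clients re-resolve quickly
            "ResourceRecords": [{"Value": ip}],
        }
        rrset.update(extra)
        return {"Action": "UPSERT", "ResourceRecordSet": rrset}

    return {
        "Comment": "DR failover pair (illustrative)",
        "Changes": [
            # Only the primary carries the health check that triggers failover.
            record("PRIMARY", primary_ip, {"HealthCheckId": health_check_id}),
            record("SECONDARY", secondary_ip, {}),
        ],
    }


if __name__ == "__main__":
    batch = failover_change_batch(
        "app.example.com.", "192.0.2.10", "198.51.100.20", "placeholder-health-check-id"
    )
    print(json.dumps(batch, indent=2))
```

The secondary address would normally be an Elastic IP attached to the recovery instance, so the DNS record does not need to change when the instance is replaced.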

Management Layer: Basic management tools provided by AWS, such as AWS CloudFormation, offer developers and system administrators an easy way to create a collection of related AWS resources and provision them in an orderly and predictable fashion. Third-party tools such as Puppet Labs take automation to the next level with holistic infrastructure orchestration.
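To make the CloudFormation idea concrete, the sketch below generates a minimal template that declares a number of EC2 recovery instances from a stored AMI. It is an illustration under stated assumptions: the function name, the resource naming scheme, and the AMI ID and instance type passed in are hypothetical, and a real recovery stack would also declare networking, security groups, and storage.

```python
import json


def recovery_stack_template(ami_id: str, instance_type: str, count: int) -> str:
    """Return a minimal CloudFormation template (JSON) with `count` EC2 instances."""
    if count < 1:
        raise ValueError("count must be at least 1")
    resources = {}
    for index in range(1, count + 1):
        # Logical IDs must be alphanumeric; this naming scheme is an assumption.
        resources[f"RecoveryInstance{index}"] = {
            "Type": "AWS::EC2::Instance",
            "Properties": {"ImageId": ami_id, "InstanceType": instance_type},
        }
    template = {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative DR recovery stack",
        "Resources": resources,
    }
    return json.dumps(template, indent=2)


if __name__ == "__main__":
    # "ami-00000000" is a placeholder, not a real image ID.
    print(recovery_stack_template("ami-00000000", "m1.small", 2))
```

Keeping the template in version control alongside the DR runbook lets the same stack be created for a test and for a real event.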

The following AWS services, as shown in the figure above, can be leveraged in the Cold Method:

Snapshots: When a secure, optimized network connection is established between the enterprise and AWS infrastructure, snapshots of the on-premises systems can be made and replicated to an S3 bucket and provisioned into EC2 instances when needed.

S3: Data that is replicated via the Storage Gateway will be stored in an EBS volume or S3 bucket from which data can be copied back to the local site for recovery or easily provisioned into an AWS virtual machine (EC2). In a go-live recovery scenario, the S3 bucket would allocate the appropriate data objects into an EBS volume to which the EC2 instance would attach.

Direct Connect: AWS Direct Connect provides a dedicated connection from the enterprise’s LAN to AWS’ edge network. Direct Connect increases network reliability, resilience, and security, thus partly addressing the challenge of optimizing network connectivity. Direct Connect may also be excluded to further reduce cost.


DISASTER RECOVERY PILOT-LIGHT

The Pilot Light method satisfies most enterprise environments that require comprehensive back-up, relatively fast recovery, and redundancy. A Pilot Light solution consists of an up-to-date core infrastructure configured in AWS with the ability to quickly provision a full-scale environment during a recovery process. This method entails replication of tier 1 systems to AWS, as well as creation and maintenance of AMI templates. This solution addresses an enterprise’s need for flexible DR capabilities, seamless network integration, rapid activation of standby AMIs, and minimal business interruption.

With the Pilot Light configuration, it is necessary to set up automation to bring up the full environment using scripts and well-defined processes. Accenture’s expertise with these tools and systems helps to reduce the time to recovery and limits the amount of manual intervention. This solution fits businesses with moderate RPO requirements and slower recovery time objectives since core data and services will be replicated actively.

The traditional enterprise model consists of several co-existing layers that function together to support an IT environment:

Storage Layer: DR with AWS storage is supported by AWS’ Storage Gateway. The Gateway enables replication of data from an on-premises system to AWS’s S3 or EBS storage. The replicated data is stored as a data volume that could be provisioned to an EC2 instance. If replicated to EBS, the volume can easily be mounted to the EC2 instance. EBS volumes are best suited when performance matters. S3 provides low-cost storage that is durable and can be mounted to a host when IO performance is not critical.

Application Layer: A small application footprint is run in a warm configuration at AWS. This smaller footprint can maintain a minimum of business transactions. At the point of disaster, a larger environment is provisioned at AWS. With cloud-based DR, automated processes use AMIs and application virtual images along with the Import function to quickly provision the necessary infrastructure for the applications. By keeping dormant AMIs on hand, this method enables enterprises to address the challenge of being able to spin up server images quickly for DR purposes. The ability to quickly spin up AMIs also enables enterprises to test the DR system more often and more rigorously. The smaller warm environment also provides a cost savings compared to running a full live environment or self-maintained data center.

Database Layer: The Amazon Web Services RDS service enables enterprises to operate and scale relational databases safely within the cloud. Customizations need to be made for Oracle systems exceeding 3 TB and for out-of-box AWS solutions exceeding 16 TB. Storage traffic saturation will drive the performance targets and the ability to build past limitations. With these capabilities, DR with AWS addresses the challenge of handling large databases within cloud environments. Active replication of core database systems also addresses any RPO woes and maintains a dynamic backup ready to be used during a triggered event.

Management Layer: In addition to tools provided by AWS, DR with AWS leverages a combination of tools to address the challenges of spinning up images and optimizing business continuity in cloud environments. Basic management tools provided by AWS, such as AWS CloudFormation, will typically suffice, but holistic automation and orchestration can be achieved with tools such as Puppet Labs and Chef. These services can customize a complete environment to any level of granularity, so that a triggered event would not need human intervention.

Figure 2: DR Pilot-Light (diagram labels: on-premise infrastructure with DNS/proxy/query server, application server, and database; mirror replication to an actively replicated database and server in AWS; static DNS/proxy/query server and static application server in the recovery/fail-over zones, not running but active when triggered during a failover event if failure is detected)


Network Layer: Multiple AWS services and features can address the challenges of network integration and maximize business continuity:

• Direct Connect is configured to provide a dedicated, secure connection from the on-premises environment to the AWS cloud, facilitating network convergence and bandwidth throughput.

• VPC could be used to further customize a network topology by enabling granularity into network modifications such as VLANs, subnets, ports, and IP addresses.

• Route 53 and Elastic IP are configured to enable high availability and redundancy by diverting network traffic to AWS during a triggered event (disaster, testing, or load balancing need). Round Robin can be configured to provide high availability, load balancing, and network addressing with no impact in the local environment.

After the core infrastructure is spun up and configured in a Pilot Light DR Method, all other systems are activated via templates, automation tools, or newly built machines. These systems will be ready during a failure event, in which servers will provision and spin up to take over production systems. DR policy would dictate the level of urgency with which each application and service needs to be activated. Less critical systems could be configured via installation packages if the DR scenario lasts for a substantial amount of time.
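The activation order implied by such a DR policy can be sketched as grouping services into waves by urgency. The service names, tier numbers, and bring-up methods below are invented for illustration; a real plan would also account for dependencies between services, which this sketch deliberately ignores.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Service:
    name: str
    tier: int      # 1 = most urgent; larger numbers can wait longer
    method: str    # how it is brought up, e.g. "ami" or "install-package"


def activation_waves(services: List[Service]) -> Dict[int, List[str]]:
    """Group service names into waves keyed by tier, most urgent first."""
    waves: Dict[int, List[str]] = {}
    for svc in sorted(services, key=lambda s: (s.tier, s.name)):
        waves.setdefault(svc.tier, []).append(svc.name)
    return waves


# Hypothetical inventory used only to exercise the function.
inventory = [
    Service("reporting", 3, "install-package"),
    Service("order-db", 1, "ami"),
    Service("web-frontend", 1, "ami"),
    Service("intranet-wiki", 2, "ami"),
]

if __name__ == "__main__":
    for tier, names in activation_waves(inventory).items():
        print(f"wave {tier}: {', '.join(names)}")
```

An orchestration tool would then start each wave only after the previous one reports healthy.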

DISASTER RECOVERY WARM CONFIGURATION

Disaster Recovery in a warm configuration allows customers a near no-downtime solution with a near-to-100% uptime SLA arrangement. The Warm Method extends beyond the Pilot Light implementation by replicating and keeping systems up-to-date within AWS. With two mirrored environments, if the main site is interrupted in the event of a disaster, a network failover diverts traffic to the other location within a matter of seconds to provide near-perfect business continuity.
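The failover decision itself is usually driven by repeated health checks rather than a single missed probe. The sketch below shows one simple policy, assumed here for illustration and not taken from the paper: switch to the secondary site after a configurable number of consecutive failed checks, and stay there until an operator fails back. The class name and threshold are hypothetical.

```python
class FailoverMonitor:
    """Tracks primary-site health checks and decides which site serves traffic."""

    def __init__(self, failure_threshold: int = 3) -> None:
        if failure_threshold < 1:
            raise ValueError("failure_threshold must be at least 1")
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0
        self.active_site = "primary"

    def observe(self, primary_healthy: bool) -> str:
        """Record one health check result and return the active site."""
        if primary_healthy:
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
        if self.consecutive_failures >= self.failure_threshold:
            # Sticky by design: failing back is left as a manual decision.
            self.active_site = "secondary"
        return self.active_site


if __name__ == "__main__":
    monitor = FailoverMonitor(failure_threshold=3)
    for healthy in (True, False, False, True, False, False, False, True):
        print(healthy, "->", monitor.observe(healthy))
```

With checks every few seconds, a threshold of three keeps a single dropped probe from triggering a switchover while still reacting within the "matter of seconds" the warm configuration aims for.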

Warm DR utilizes tools from Accenture and AWS partners such as Puppet Labs or Chef to enable administrators to automate tasks, deploy applications, minimize human error, and manage infrastructure changes within the cloud and on-premises systems. The Warm Method can also be leveraged for availability purposes as well as load balancing if business activities require scalability across two environments.

Figure 3: DR Warm Method (diagram labels: on-premise infrastructure with production servers and database; mirror replication to an actively replicated database and server; EC2 copy of production in an active failover zone; Route 53 redirects traffic if failure is detected)


Each layer builds upon the respective Pilot Light layers and adds certain features, as well as limitations, that exist when building a comprehensive DR environment.

Storage Layer: The warm configuration would leverage the Storage Gateway to replicate all priority data. Although the warm configuration is not fully scaled to a production environment, it is fully functional and available on demand.

Application Layer: All applications would have a minimal base footprint in the warm DR environment, enabling an easier recovery compared to starting from a few core applications, as shown in Pilot Light. Multiple application instances can be running and receiving regular updates, so that the Warm standby site is available to immediately take over.

Database Layer: All tier 1 and tier 2 databases would have an active copy in AWS. Tier 3 could be included as well, depending on the business’s needs. Oracle systems could encounter limitations and may not fully support a cloud model, which would entail customizations not yet tested. Certain systems that prove not compatible can be configured in a Direct Connect facility to provide cloud integration (e.g., Oracle RAC).

Management Layer: Although AWS has built many tools and management features for disaster recovery scenarios, a warm environment adds several layers of management and automation complexity that are solved by using third-party tools such as Puppet Labs.

Network Layer: The warm configuration solution would take the Pilot Light design and incorporate an IP load balancer, Route 53, and Direct Connect to enhance and optimize network resiliency and availability. Routine upgrades and maintenance to the AWS system, along with a high-speed network running between sites for replication, add complexity and expense.

Since the Warm Method maintains an active copy of the production environment, RTO and RPO requirements are easily met, as the switchover is in a matter of seconds with no data loss. Although most of the critical systems are actively running, several services that have been deemed non-critical may be created in the DR environment; this would simply be done by importing the virtual machine image or having an AMI on hand to spin up during a triggered event. The use of Reserved Instances is also highly recommended for critical applications to ensure sufficient capacity.


ENTERPRISE READINESS

A critical step toward successful adoption of cloud-based DR is getting the enterprise ready for disaster recovery in a multi-tenant cloud environment. Preparedness involves developing experience in virtualization technologies, optimizing both internal network and external connection(s) to cloud resources, optimizing database implementations for cloud computing, aligning cloud security with internal security policies and groups, and acquiring deep cloud computing expertise.

Successful adoption of cloud-based DR solutions starts with aligning the enterprise IT infrastructure and functions with the infrastructure services offered by AWS. Migrating to a virtualized computing environment prepares the enterprise for cloud computing and also enables realization of the cost-saving benefits of virtualization technology, which include flexibility, scalability, and more efficient utilization of computing capacity. Deep understanding of virtualization technology also enables better planning and decision-making in preparation for migrating to cloud-based DR. Furthermore, being able to utilize computing capacity more effectively through virtualization potentially drives centralization and efficient provisioning of computing resources as services to business units. Such a change usually requires careful planning and change management support.

Optimizing network connectivity includes procuring more robust connectivity both to the Internet and to internal network resources, minimizing latency-inducing bottlenecks, implementing dedicated connectivity to public exchanges, and using content delivery services.

Minimizing, through deduplication, the volume of data that needs to be transmitted to remote DR resources is likely to result in cost savings, faster recovery speeds, fewer bandwidth bottlenecks, and lower chance of data loss.

Finally, enterprise-readiness must include acquisition of cloud computing skills through training of internal staff and acquisition of cloud computing experience through hiring and contracting of external expertise. In addition to technical expertise, the enterprise should develop or acquire strong capabilities in defining and negotiating enterprise-worthy SLAs for cloud-based services.

AWS allows you to test a DR scenario more often and at a lower cost than the traditional model as there is no physical hardware to purchase, configure, and maintain.

Leveraging Accenture’s knowledge will enable enterprises to prepare their decision-makers, environment, support staff, and users for successful adoption of public cloud-based disaster recovery.

AWS PARTNER SOLUTIONS

While AWS offers the best suites of services for enterprise-scale DR in the public cloud, the market also offers many complementary solutions from AWS partners.

There are a number of AWS partner solutions that help enterprises migrate and maintain DR in the cloud. The following are some of the functionalities in the marketplace that address specific enterprise challenges:

Storage Domain

Riverbed Technology: Whitewater Cloud Storage Gateway is an appliance that caches and replicates data from an in-house database through an on-site backup management application to either a private or public cloud, thus eliminating tape backup systems, minimizing administrative burden, reducing enterprise risk, and improving DR readiness. The appliance is also compatible with most enterprise data backup management systems and major database products.


NetApp: Private Storage for AWS is an enterprise storage solution that optimizes data replication between on-site systems and AWS. It uses an on-site storage appliance to replicate via a dedicated connection to NetApp Private Storage within a Direct Connect location. NetApp is leading in this area, with similar products from competitors expected in the near future.

CA: ARCserve Replication and High Availability is an enterprise solution that enables disk-to-disk replication of data and complete systems to EC2 instances. ARCserve Replication provides automatic or manual switchover and switchback. System high availability is provided only for Windows systems, but data replication is available.

Zadara: Zadara has created a separate private storage cloud through AWS’ Direct Connect. It enables an enterprise to build a Virtual Private Cloud Array (VPCA) within the Zadara private cloud and connect it to the enterprise’s AWS computing environment.

Diagram (CA ARCserve): corporate data center connected to the AWS cloud over a VPN connection; disk-to-disk replication with switchover and switchback to EC2 instances in the CA ARCserve VPC.

Diagram (Zadara): replication client in the corporate data center connected to a VPC Array (VPCA) and EC2 instances in the Zadara VPC in the AWS cloud.

Diagram (NetApp): corporate data center connected through AWS Direct Connect to NetApp Private Storage in an AWS Direct Connect facility and to AWS Simple Storage Service.


Application Domain

Technologies within this domain optimize application performance to maximize user experience.

• Riverbed Technology: The Stingray Traffic Manager is an example of cloud-based software designed to vastly improve application performance. Stingray provides Web Content Optimization (WCO), load balancing, improved scalability through offloading TCP and SSL connection overhead, and built-in performance monitoring and scripting.

Database Domain

Safe and efficient replication of databases is a particular challenge for cloud-based implementations. Solutions in this domain attempt to simplify and speed up database backup and replication processes to minimize risk of data corruption, network latency, and user experience degradation.

• Riverbed Technology: By caching and managing replication of databases from in-house to cloud-based resources, Riverbed Technology’s Whitewater Cloud Storage Gateway appliance attempts to mitigate these risks and ease adoption of cloud-based DR.

• SAP and AWS have been working together designing solutions that can meet all business demands. New and existing SAP customers can deploy their SAP system on AWS EC2 instances in production environments that have been fully vetted, tested, and certified by both vendors for use. Services such as SAP RDS take management to another level by automating a deployment and making it much quicker.

• Oracle RAC is not supported by AWS and would entail placing the RAC system in an AWS Direct Connect facility to integrate with AWS services.

• Oracle Cloud Backup: Integrating RMAN with AWS S3, the Oracle Cloud Backup module enables enterprises to stream database backups to AWS S3 using Oracle RMAN commands and programs. Compared to on-site tape backup, this solution is more reliable since it is based on disk instead of tape, more accessible for restore operations, and cheaper in terms of upfront capital costs.

• Avnet Cloud Backup for Oracle Databases uses Oracle Recovery Manager (RMAN) to enable database backup to AWS S3.

• Zmanda: For MySQL databases, a reliable solution is provided by AWS partner Zmanda through Amanda Enterprise backup and recovery. This solution enables an enterprise to use AWS S3 as a backup target from on-premises backup infrastructure using a browser-based management console.

Network Domain

Solutions within this domain attempt to maximize access to, and efficient utilization of, network bandwidth in order to maximize application performance in the cloud environment.

• Riverbed Technology: The Steelhead WAN optimization solution uses the combination of virtualization, data deduplication, storage centralization, bandwidth optimization, application acceleration, and resource consolidation to address this challenge.
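Data deduplication, which this paper cites several times as a way to shrink what must cross the network to the DR site, can be illustrated with a small estimate of how many bytes are actually unique. This is a minimal sketch assuming fixed-size chunks and SHA-256 fingerprints; it does not represent how any particular vendor product works, and commercial appliances commonly use more sophisticated variable-size chunking.

```python
import hashlib
from typing import Tuple


def dedup_savings(data: bytes, chunk_size: int = 4096) -> Tuple[int, int]:
    """Return (total_bytes, unique_bytes) after fixed-size chunk deduplication."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be positive")
    seen = set()
    total = unique = 0
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        total += len(chunk)
        fingerprint = hashlib.sha256(chunk).digest()
        if fingerprint not in seen:      # only first-seen chunks need sending
            seen.add(fingerprint)
            unique += len(chunk)
    return total, unique


if __name__ == "__main__":
    # Synthetic payload: three identical chunks plus one different chunk.
    payload = b"A" * 4096 * 3 + b"B" * 4096
    total, unique = dedup_savings(payload)
    print(f"{unique} of {total} bytes need to be transmitted")
```

The ratio of unique to total bytes gives a first-order estimate of the bandwidth needed for replication after deduplication.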

Management Domain

Technologies within this domain attempt to automate and simplify orchestration of cloud-based resource management to minimize human error and maximize speed.

• RightScale enables an enterprise to deliver applications in public and private clouds that are resilient to scheduled maintenance, unpredictable hardware failures, and occasional disasters, with the ability to clone entire environments and stage in another data center with one click.

• Puppet Labs & Chef both offer IT automation software solutions that help system administrators manage infrastructure throughout its lifecycle, from provisioning and configuration to patch management and compliance. These solutions can easily automate repetitive tasks, quickly deploy critical applications, and proactively manage change, scaling from tens of servers to thousands, on-premises, in the cloud, or both.

• OpsWorks and CloudFormation automate the deployment for a VPC-based stack on AWS. OpsWorks is an application management solution with automation tools that enable modeling and control of applications and the supporting infrastructure. Both OpsWorks and CloudFormation integrate with Chef and Puppet.


EXAMPLE

Company X has a combination of Tier 1, Tier 2, and Tier 3 business applications. They can choose from the following options:

For Tier 1 with RTO and RPO of < 1 hour, the business would choose Warm DR and will have the following:

• EC2 instances for all services running at all times.

• In-house and cloud infrastructure load-balanced and configured for auto-failover, which is facilitated by AWS Route 53, Elastic IP addresses, and Elastic Load Balancing.

• Initial data synchronization using in-house backup software or file transfer protocol.

• Incremental data replicated/synchronized using storage gateway.

• Automation used for rapid failover and spin-up of environment using Puppet Labs software.

For Tier 2 with RTO and RPO of < 4 hours, the business would choose Pilot Light DR and will have the following:

• Critical core elements of system already configured.

• EC2 instances running for critical services.

• Pre-configured AMIs for Tier 2 apps that can be quickly provisioned upon failure.

• Cloud infrastructure load-balanced and configured for automatic failover, which is facilitated by AWS Route 53, Elastic IP addresses, and Elastic Load Balancing.

• Initial data synchronization using in-house backup software or file transfer protocol.

• Incremental data replicated/synchronized using storage gateway.

• Automation used for rapid failover and spin-up of environment using Puppet Labs software.

For Tier 3 with RTO and RPO of < 8 hours, the business would choose Cold DR and will have the following:

• All data replicated into S3 bucket.

• Initial data synchronization using in-house backup software or file transfer protocol via the web or AWS Import/Export feature.

• Pre-configured AMIs for Tier 1 and Tier 2 apps that can be quickly provisioned upon failure.

• Incremental data replicated/synchronized using storage gateway.

• EC2 instances are spun up from objects within the S3 buckets.

As enterprise needs for disaster recovery progress toward a need for complete business continuity, and while IT budgets for DR remain stagnant, enterprises can no longer avoid considering cost-effective, multi-tenant, cloud-based disaster recovery solutions like AWS.

Enterprises must, however, appreciate and navigate the challenges presented in enterprise-level cloud computing. Managing risks by partnering with Accenture to implement Smart DR with AWS gives enterprises the opportunity to significantly improve disaster recovery while taking advantage of the potentially significant cost savings. Accenture can help enterprises transition to the cloud with our best practices and expertise along with market-leading AWS partners.


FURTHER INFORMATION

For further information or comments on this article, contact:

Joseph Beah, Emerging Technology & Innovation

Alejandro Flores, Emerging Technology & Innovation

Keith Linnenbringer, Application Modernization and Optimization

Chris Scott, Emerging Technology & Innovation

Written in collaboration with AWS solution architects and business development leads.

Special thanks to: Tom Laszewski



Copyright © 2013 Accenture. All rights reserved.

Accenture, its logo, and High Performance Delivered are trademarks of Accenture.

ABOUT ACCENTURE

Accenture is a global management consulting, technology services and outsourcing company, with approximately 275,000 people serving clients in more than 120 countries. Combining unparalleled experience, comprehensive capabilities across all industries and business functions, and extensive research on the world’s most successful companies, Accenture collaborates with clients to help them become high-performance businesses and governments. The company generated net revenues of US$27.9 billion for the fiscal year ended August 31, 2013. Its home page is www.accenture.com.