Leveraging the Public Cloud for Disaster Recovery
![Page 1: Leveraging the Public Cloud for Disaster Recovery](https://reader034.fdocuments.us/reader034/viewer/2022051411/5453102aaf79591d308b5453/html5/thumbnails/1.jpg)
Leveraging the Public Cloud for Disaster Recovery
Lahav Savir, Architect & CEO, Emind Systems [email protected]
![Page 2: Leveraging the Public Cloud for Disaster Recovery](https://reader034.fdocuments.us/reader034/viewer/2022051411/5453102aaf79591d308b5453/html5/thumbnails/2.jpg)
About
Lahav Savir
• 15+ years’ experience in the online industry
• Architect and CEO @ Emind Systems
Emind Systems (est. 2006)
• Boutique system integrator
• ~100 AWS customers
• AWS solution provider
![Page 3: Leveraging the Public Cloud for Disaster Recovery](https://reader034.fdocuments.us/reader034/viewer/2022051411/5453102aaf79591d308b5453/html5/thumbnails/3.jpg)
Amazon (AWS) Certification
Amazon Solution Provider & Consulting Partner
https://aws.amazon.com/solution-providers/si/emind-systems-ltd
![Page 4: Leveraging the Public Cloud for Disaster Recovery](https://reader034.fdocuments.us/reader034/viewer/2022051411/5453102aaf79591d308b5453/html5/thumbnails/4.jpg)
Disaster Recovery in a Nutshell
• Business continuity
• Minimize downtime and data loss
• Recovery Time Objective (RTO)
• Recovery Point Objective (RPO)
• Price
![Page 5: Leveraging the Public Cloud for Disaster Recovery](https://reader034.fdocuments.us/reader034/viewer/2022051411/5453102aaf79591d308b5453/html5/thumbnails/5.jpg)
DR Approaches
Complete server mirroring
Data mirroring / replication
Configuration replication
![Page 6: Leveraging the Public Cloud for Disaster Recovery](https://reader034.fdocuments.us/reader034/viewer/2022051411/5453102aaf79591d308b5453/html5/thumbnails/6.jpg)
Emind’s Best Practice
[Diagram labels: Server Mirror; Configuration Mirror; Data Mirror]
![Page 7: Leveraging the Public Cloud for Disaster Recovery](https://reader034.fdocuments.us/reader034/viewer/2022051411/5453102aaf79591d308b5453/html5/thumbnails/7.jpg)
Why Amazon?
Flexible, Global Infrastructure
• N. Virginia
• Oregon
• N. California
• Ireland
• Singapore
• Tokyo
• Sydney
• São Paulo
• GovCloud
![Page 8: Leveraging the Public Cloud for Disaster Recovery](https://reader034.fdocuments.us/reader034/viewer/2022051411/5453102aaf79591d308b5453/html5/thumbnails/8.jpg)
Secure
• VPC - Virtual Private Cloud on AWS's infrastructure
• Specify private IP address range
• Bridge your onsite IT infrastructure and the VPC with a VPN connection or Direct Connect
• Extending your existing security and management policies to the cloud
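When the on-site network is bridged to a VPC over VPN or Direct Connect, the private range chosen for the VPC generally should not overlap the on-premises ranges, otherwise routing between the two sides becomes ambiguous. A minimal sketch of such a check with Python's standard `ipaddress` module follows; the CIDR blocks and the function name are hypothetical examples, not part of the talk.

```python
import ipaddress

def find_overlaps(vpc_cidr, onprem_cidrs):
    """Return the on-premises networks that overlap the proposed VPC range."""
    vpc = ipaddress.ip_network(vpc_cidr)
    if not vpc.is_private:
        raise ValueError(f"{vpc_cidr} is not a private address range")
    return [c for c in onprem_cidrs if vpc.overlaps(ipaddress.ip_network(c))]

# Hypothetical on-premises address plan
ONPREM = ["10.0.0.0/16", "192.168.10.0/24"]

print(find_overlaps("10.0.128.0/20", ONPREM))  # ['10.0.0.0/16']
print(find_overlaps("10.50.0.0/16", ONPREM))   # []
```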
![Page 9: Leveraging the Public Cloud for Disaster Recovery](https://reader034.fdocuments.us/reader034/viewer/2022051411/5453102aaf79591d308b5453/html5/thumbnails/9.jpg)
A different cost model
[Chart: infrastructure cost (vertical axis) over time (horizontal axis). Curves: 2nd Site Cost, AWS Cost, Demand. Time-axis events: Test, Test, Failover, Failback. Annotations: cost savings with AWS; ability to scale, with no arbitrary time limit to failback.]
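The comparison in this chart can be sketched as simple arithmetic: a second site is paid for in every period regardless of use, while a pay-per-use cost follows demand. All figures below are hypothetical placeholders chosen only to illustrate the shape of the comparison; they are not prices from the talk or from AWS.

```python
def cumulative_costs(demand, second_site_cost_per_period, aws_cost_per_unit):
    """Return (second_site_total, aws_total) over the periods in `demand`."""
    second_site_total = second_site_cost_per_period * len(demand)
    aws_total = sum(units * aws_cost_per_unit for units in demand)
    return second_site_total, aws_total

# Hypothetical demand per period: idle, two short tests, then failover and failback
demand = [1, 3, 1, 3, 1, 10, 10, 2]

second_site, aws = cumulative_costs(demand, second_site_cost_per_period=100,
                                    aws_cost_per_unit=10)
print(second_site, aws, second_site - aws)  # 800 310 490
```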
![Page 10: Leveraging the Public Cloud for Disaster Recovery](https://reader034.fdocuments.us/reader034/viewer/2022051411/5453102aaf79591d308b5453/html5/thumbnails/10.jpg)
Zooming in on the technical details
![Page 11: Leveraging the Public Cloud for Disaster Recovery](https://reader034.fdocuments.us/reader034/viewer/2022051411/5453102aaf79591d308b5453/html5/thumbnails/11.jpg)
Disaster Recovery Terms
• RTO: Recovery Time Objective
  – Acceptable time period within which normal operation (or degraded operation) needs to be restored after an event
• RPO: Recovery Point Objective
  – Acceptable data loss measured in time
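Both terms are durations, so an incident can be checked against them with plain timestamp arithmetic. The sketch below assumes a hypothetical incident timeline and hypothetical targets; the function names are illustrative only.

```python
from datetime import datetime, timedelta

def achieved_rpo(last_good_copy, disaster):
    """Data-loss window: time between the last recoverable copy and the event."""
    return disaster - last_good_copy

def achieved_rto(disaster, service_restored):
    """Downtime: time between the event and restored (or degraded) operation."""
    return service_restored - disaster

# Hypothetical incident timeline
last_backup = datetime(2024, 1, 10, 2, 0)
disaster = datetime(2024, 1, 10, 9, 30)
restored = datetime(2024, 1, 10, 12, 0)

# Hypothetical objectives
rpo_target = timedelta(hours=24)
rto_target = timedelta(hours=2)

print(achieved_rpo(last_backup, disaster) <= rpo_target)  # True: 7h30m of data lost
print(achieved_rto(disaster, restored) <= rto_target)     # False: 2h30m of downtime
```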
![Page 12: Leveraging the Public Cloud for Disaster Recovery](https://reader034.fdocuments.us/reader034/viewer/2022051411/5453102aaf79591d308b5453/html5/thumbnails/12.jpg)
Backup and Restore
[Diagram labels: On-premises Infrastructure (traditional server); Amazon Route 53; AWS Import/Export; S3 Bucket with Objects. Caption: data copied to S3.]
![Page 13: Leveraging the Public Cloud for Disaster Recovery](https://reader034.fdocuments.us/reader034/viewer/2022051411/5453102aaf79591d308b5453/html5/thumbnails/13.jpg)
Backup and Restore
[Diagram labels: AWS Region; Availability Zone; Amazon EC2 Instance; Data Volume; AMI; Amazon S3 Bucket. Captions: data copied from objects in S3; instance quickly provisioned from AMI; AMI pre-bundled with OS and applications.]
![Page 14: Leveraging the Public Cloud for Disaster Recovery](https://reader034.fdocuments.us/reader034/viewer/2022051411/5453102aaf79591d308b5453/html5/thumbnails/14.jpg)
Backup and Restore
• Advantages
  – Simple to get started
  – Extremely cost effective (mostly backup storage)
• Preparation Phase
  – Take backups of current systems
  – Store backups in S3
  – Describe procedure to restore from backup on AWS
    • Know which AMI to use, build your own as needed
    • Know how to restore system from backups
    • Know how to switch to new system
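A restore procedure is easier to describe when backups are stored under predictable names. The sketch below shows one possible convention: object keys with a fixed-width UTC timestamp, so that the lexicographically largest key is also the newest backup. The key layout, system name, and function names are assumptions made for illustration, and no S3 call is made here.

```python
from datetime import datetime, timezone

KEY_FORMAT = "backups/{system}/{stamp}.tar.gz"  # hypothetical key layout
STAMP_FORMAT = "%Y%m%dT%H%M%SZ"                 # fixed width, sorts chronologically

def backup_key(system, taken_at):
    """Build an object key for a backup taken at a timezone-aware datetime."""
    stamp = taken_at.astimezone(timezone.utc).strftime(STAMP_FORMAT)
    return KEY_FORMAT.format(system=system, stamp=stamp)

def latest_backup(keys, system):
    """Pick the newest backup key for one system, or None if there is none."""
    prefix = f"backups/{system}/"
    return max((k for k in keys if k.startswith(prefix)), default=None)

keys = [
    backup_key("db01", datetime(2024, 1, 9, 2, 0, tzinfo=timezone.utc)),
    backup_key("db01", datetime(2024, 1, 10, 2, 0, tzinfo=timezone.utc)),
    backup_key("web01", datetime(2024, 1, 11, 2, 0, tzinfo=timezone.utc)),
]
print(latest_backup(keys, "db01"))  # backups/db01/20240110T020000Z.tar.gz
```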
![Page 15: Leveraging the Public Cloud for Disaster Recovery](https://reader034.fdocuments.us/reader034/viewer/2022051411/5453102aaf79591d308b5453/html5/thumbnails/15.jpg)
Backup and Restore
• In Case of Disaster
  – Retrieve backups from S3
  – Bring up required infrastructure
    • EC2 instances with prepared AMIs, Load Balancing, etc.
  – Restore system from backup
  – Switch over to the new system
    • Adjust DNS records to point to AWS
• Objectives
  – RTO: as long as it takes to bring up infrastructure and restore system from backups
  – RPO: time since last backup
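The "adjust DNS records" step can be prepared ahead of time as data. The sketch below builds a dictionary in the shape of a Route 53 change batch (an UPSERT of an A record), to the best of my understanding of that API; the record name and IP address are documentation placeholders, and the function name is hypothetical. The code only constructs and prints the structure, it does not call AWS.

```python
import json

def dns_failover_change(record_name, dr_ip, ttl=60):
    """Build a change batch that repoints an A record at the DR site."""
    return {
        "Comment": "DR failover: point service at AWS",
        "Changes": [{
            "Action": "UPSERT",  # create the record, or overwrite it if present
            "ResourceRecordSet": {
                "Name": record_name,
                "Type": "A",
                "TTL": ttl,
                "ResourceRecords": [{"Value": dr_ip}],
            },
        }],
    }

# Placeholder name and address (reserved documentation ranges)
batch = dns_failover_change("www.example.com.", "203.0.113.10")
print(json.dumps(batch, indent=2))
```

With boto3, a structure like this would typically be passed as the `ChangeBatch` argument of the Route 53 client's `change_resource_record_sets` call together with the hosted zone ID; check the current API reference before relying on it. Resolvers cache the old answer until its TTL expires, so a short TTL set before any disaster shortens the switch-over.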
![Page 16: Leveraging the Public Cloud for Disaster Recovery](https://reader034.fdocuments.us/reader034/viewer/2022051411/5453102aaf79591d308b5453/html5/thumbnails/16.jpg)
Pilot Light
[Diagram labels: User or system; Amazon Route 53; two stacks, each with Web Server, Application Server, Database Server, and Data Volume; Data Mirroring/Replication between the data volumes; annotations "Not Running" and "Smaller Instance".]
![Page 17: Leveraging the Public Cloud for Disaster Recovery](https://reader034.fdocuments.us/reader034/viewer/2022051411/5453102aaf79591d308b5453/html5/thumbnails/17.jpg)
Pilot Light
[Diagram labels: User or system; Amazon Route 53; Web Server, Application Server, Database Server, and Data Volume on each side; Data Mirroring/Replication; annotations "Not Running" and "Smaller Instance".]
![Page 18: Leveraging the Public Cloud for Disaster Recovery](https://reader034.fdocuments.us/reader034/viewer/2022051411/5453102aaf79591d308b5453/html5/thumbnails/18.jpg)
Pilot Light
[Diagram labels: User or system; Amazon Route 53; Web Server, Application Server, Database Server, and Data Volume on each side; Data Mirroring/Replication; annotations "Start in minutes" and "Resize as desired".]
![Page 19: Leveraging the Public Cloud for Disaster Recovery](https://reader034.fdocuments.us/reader034/viewer/2022051411/5453102aaf79591d308b5453/html5/thumbnails/19.jpg)
Pilot Light
• Advantages
  – Very cost effective (fewer 24/7 resources)
• Preparation Phase
  – Enable replication of all critical data to AWS
  – Prepare all required resources for automatic start
    • AMIs, Network Settings, Load Balancing, etc.
![Page 20: Leveraging the Public Cloud for Disaster Recovery](https://reader034.fdocuments.us/reader034/viewer/2022051411/5453102aaf79591d308b5453/html5/thumbnails/20.jpg)
Pilot Light
• In Case of Disaster
  – Automatically bring up resources around the replicated core data set
  – Scale the system as needed to handle current production traffic
  – Switch over to the new system
    • Adjust DNS records to point to AWS
• Objectives
  – RTO: as long as it takes to detect need for DR and automatically scale up replacement system
  – RPO: depends on replication type
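Because the RTO here includes the time needed to detect that DR is required, the detection rule is worth making explicit. One common rule, shown as a sketch below, is to trigger only after several consecutive failed health checks so that a single blip does not start a failover. The threshold of three and the function name are assumptions for illustration, not values from the talk.

```python
def should_trigger_dr(check_results, failures_needed=3):
    """True once the most recent `failures_needed` health checks all failed.

    `check_results` is a chronological list of booleans (True = healthy).
    """
    if len(check_results) < failures_needed:
        return False
    return not any(check_results[-failures_needed:])

print(should_trigger_dr([True, True, False, False]))   # False: only two failures in a row
print(should_trigger_dr([True, False, False, False]))  # True: three failures in a row
```

The detection delay is roughly the check interval multiplied by the threshold, for example about 90 seconds with a 30-second interval and a threshold of three, and that delay counts toward the RTO.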
![Page 21: Leveraging the Public Cloud for Disaster Recovery](https://reader034.fdocuments.us/reader034/viewer/2022051411/5453102aaf79591d308b5453/html5/thumbnails/21.jpg)
Fully-Working Low Capacity Standby
[Diagram labels: User or system; Amazon Route 53; primary Web Server, Application Server, Database Server, and Data Volume; standby Web Server, App Server, DB Server, and Data Volume marked "Low Capacity"; Data Mirroring/Replication.]
![Page 22: Leveraging the Public Cloud for Disaster Recovery](https://reader034.fdocuments.us/reader034/viewer/2022051411/5453102aaf79591d308b5453/html5/thumbnails/22.jpg)
Fully-Working Low Capacity Standby
[Diagram labels: User or system; Amazon Route 53; primary Web Server, Application Server, Database Server, and Data Volume; standby Web Server, App Server, DB Server, and Data Volume marked "Low Capacity"; Data Mirroring/Replication.]
![Page 23: Leveraging the Public Cloud for Disaster Recovery](https://reader034.fdocuments.us/reader034/viewer/2022051411/5453102aaf79591d308b5453/html5/thumbnails/23.jpg)
Fully-Working Low Capacity Standby
[Diagram labels: User or system; Amazon Route 53; primary Web Server, Application Server, Database Server, and Data Volume; standby Web Server, App Server, DB Server, and Data Volume with additional Web, Application, and Database Servers marked "Grow Capacity"; Data Mirroring/Replication.]
![Page 24: Leveraging the Public Cloud for Disaster Recovery](https://reader034.fdocuments.us/reader034/viewer/2022051411/5453102aaf79591d308b5453/html5/thumbnails/24.jpg)
Fully-Working Low-Capacity Standby
[Diagram labels: User or system; Amazon Route 53; primary Web Server, Application Server, Database Server, and Data Volume; standby Web Server, App Server, DB Server, and Data Volume with additional Web, Application, and Database Servers marked "Grow Capacity"; Data Mirroring/Replication.]
![Page 25: Leveraging the Public Cloud for Disaster Recovery](https://reader034.fdocuments.us/reader034/viewer/2022051411/5453102aaf79591d308b5453/html5/thumbnails/25.jpg)
Fully-Working Low-Capacity Standby
• Advantages
  – Can take some production traffic at any time
  – Cost savings (IT footprint smaller than full DR)
• Preparation
  – Similar to Pilot Light
  – All necessary components running 24/7, but not scaled for production traffic
  – Best practice – continuous testing
    • “Trickle” a statistical subset of production traffic to DR site
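One way to trickle a subset of traffic is weighted DNS routing, where each record receives a share of answers proportional to its weight. The sketch below builds two weighted record sets in the shape used by Route 53 (to the best of my understanding) and computes the expected split. The 95/5 weights, names, and addresses are hypothetical placeholders, and nothing is sent to AWS.

```python
def weighted_record(name, ip, set_id, weight, ttl=60):
    """One weighted A record; Route 53 weights are integers from 0 to 255."""
    if not 0 <= weight <= 255:
        raise ValueError("weight must be between 0 and 255")
    return {
        "Name": name, "Type": "A", "TTL": ttl,
        "SetIdentifier": set_id, "Weight": weight,
        "ResourceRecords": [{"Value": ip}],
    }

def traffic_share(records):
    """Expected fraction of DNS answers per record: weight / sum of weights."""
    total = sum(r["Weight"] for r in records)
    return {r["SetIdentifier"]: r["Weight"] / total for r in records}

records = [
    weighted_record("www.example.com.", "198.51.100.10", "primary", 95),
    weighted_record("www.example.com.", "203.0.113.10", "dr-aws", 5),
]
print(traffic_share(records))  # {'primary': 0.95, 'dr-aws': 0.05}
```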
![Page 26: Leveraging the Public Cloud for Disaster Recovery](https://reader034.fdocuments.us/reader034/viewer/2022051411/5453102aaf79591d308b5453/html5/thumbnails/26.jpg)
Fully-Working Low-Capacity Standby
• In Case of Disaster
  – Immediately fail over most critical production load
    • Adjust DNS records to point to AWS
  – (Auto) Scale the system further to handle all production load
• Objectives
  – RTO: for critical load: as long as it takes to fail over; for all other load, as long as it takes to scale further
  – RPO: depends on replication type
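The two-stage RTO above corresponds to two capacity targets: what the standby must already run to absorb the critical load, and what it must grow to for the full load. A rough sizing sketch follows; the request rates, per-instance capacity, and 20% headroom are hypothetical figures, and the function name is illustrative.

```python
def instances_needed(load_rps, per_instance_rps, headroom_pct=20):
    """Instances required to serve `load_rps` with spare headroom.

    Uses integer ceiling division to avoid floating-point rounding.
    """
    required = load_rps * (100 + headroom_pct)
    capacity = per_instance_rps * 100
    return (required + capacity - 1) // capacity

# Hypothetical loads in requests per second
critical = instances_needed(300, per_instance_rps=100)
full = instances_needed(1000, per_instance_rps=100)
print(critical, full)  # 4 12
```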
![Page 27: Leveraging the Public Cloud for Disaster Recovery](https://reader034.fdocuments.us/reader034/viewer/2022051411/5453102aaf79591d308b5453/html5/thumbnails/27.jpg)
Multi-Site Hot Standby
[Diagram labels: User or system; Amazon Route 53; primary Web Server, Application Server, Database Server, and Data Volume; AWS side with multiple Web, Application, and Database Servers and a Data Volume marked "Full Capacity"; Data Mirroring/Replication.]
![Page 28: Leveraging the Public Cloud for Disaster Recovery](https://reader034.fdocuments.us/reader034/viewer/2022051411/5453102aaf79591d308b5453/html5/thumbnails/28.jpg)
Multi-Site Hot Standby
• Advantages
  – At any moment can take all production load
• Preparation
  – Similar to Low-Capacity Standby
  – Fully scaling in/out with production load
• In Case of Disaster
  – Immediately fail over all production load
    • Adjust DNS records to point to AWS
• Objectives
  – RTO: as long as it takes to fail over
  – RPO: depends on replication type
![Page 29: Leveraging the Public Cloud for Disaster Recovery](https://reader034.fdocuments.us/reader034/viewer/2022051411/5453102aaf79591d308b5453/html5/thumbnails/29.jpg)
Summary
• Plan
  – Analyze your existing applications and services
  – Find the right approach per case
• Adapt
  – Match your plan to RTO, RPO and Budget
• POC
  – Validate your plan
• Test
  – Periodic testing
• Monitor
  – Ensure continuous operation of all
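Matching a plan to RTO, RPO, and budget can be sketched as a filter over the four approaches covered in this talk: keep those that meet both objectives within budget, then take the cheapest. Every estimate below is a hypothetical placeholder that would need to be measured per application; only the approach names come from the talk.

```python
from datetime import timedelta
from typing import NamedTuple, Optional

class Option(NamedTuple):
    name: str
    rto: timedelta
    rpo: timedelta
    monthly_cost: int

def pick_approach(options, rto_target, rpo_target, budget) -> Optional[Option]:
    """Cheapest option that meets both objectives within budget, else None."""
    fits = [o for o in options
            if o.rto <= rto_target and o.rpo <= rpo_target
            and o.monthly_cost <= budget]
    return min(fits, key=lambda o: o.monthly_cost, default=None)

# Hypothetical estimates for one application
OPTIONS = [
    Option("Backup and Restore", timedelta(hours=24), timedelta(hours=24), 200),
    Option("Pilot Light", timedelta(hours=4), timedelta(minutes=15), 800),
    Option("Fully-Working Low-Capacity Standby", timedelta(minutes=30), timedelta(minutes=15), 2500),
    Option("Multi-Site Hot Standby", timedelta(minutes=5), timedelta(minutes=1), 9000),
]

choice = pick_approach(OPTIONS, timedelta(hours=1), timedelta(minutes=30), budget=5000)
print(choice.name if choice else "no approach fits")  # Fully-Working Low-Capacity Standby
```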
![Page 30: Leveraging the Public Cloud for Disaster Recovery](https://reader034.fdocuments.us/reader034/viewer/2022051411/5453102aaf79591d308b5453/html5/thumbnails/30.jpg)
• goCloud – Emind’s optimal road to the cloud
  – Secure cloud architecture
  – Scalable & high-availability design
  – Customized system deployment
  – Orchestrating cloud and software
  – Cloud operation team
  – Monitoring and alerting
  – 24x7 SLA