Journey Through The Cloud - Disaster Recovery
Transcript of Journey Through The Cloud - Disaster Recovery
Journey Through the Cloud
@IanMmmm
Ian Massingham, Technical Evangelist
Disaster Recovery
Journey Through the Cloud
Learn from the journeys taken by other AWS customers
Discover best practices that you can use to bootstrap your projects
Common use cases and adoption models for the AWS Cloud
Disaster Recovery
Explore and learn about AWS with a ‘non-production’ use case
Phase systems into ‘live’ DR use with reduced risk
Benefit from lower costs & only pay for what you use
Gain the ability to test DR procedures more frequently
Invoke DR whilst testing DR procedures if necessary
Agenda
Why AWS for disaster recovery?
AWS services that are relevant for DR use-cases
Common DR architectures
Customer case studies and examples
Resources to learn more
Using AWS for DR Provision
https://aws.amazon.com/solutions/case-studies/sunpower/
Business & Technical Drivers for DR in the Cloud
▶︎ Minimise costs
▶︎ Reduce on-premises infrastructure
▶︎ Consolidate sites
▶︎ Remove ageing technologies
DR & Business Continuity
DR forms part of a wider set of policies & controls
High availability: Keep your applications running 24x7
Backup: Make sure your data is safe
Disaster recovery: Get your applications and data back after a major disaster
IT’S NOT BINARY
DR & Business Continuity
Recovery Time Objective (RTO): How quickly do I need this service to be recovered? 1 minute? 15 minutes? 1 hour? 4 hours? 1 day?
Recovery Point Objective (RPO): How much data loss can be tolerated? Zero data loss? 15 minutes out of date?
Each application or service will have specific requirements
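A minimal sketch of how per-application RTO and RPO targets might be written down and checked against the result of a DR test. The `DrTarget` class, the two example applications and all of the numbers are illustrative assumptions, not figures from this talk.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class DrTarget:
    """Recovery objectives for one application (values are illustrative)."""
    name: str
    rto: timedelta  # how quickly the service must be recovered
    rpo: timedelta  # how much data loss, measured as time, can be tolerated

    def is_met(self, recovery_time: timedelta, data_loss: timedelta) -> bool:
        """True when a DR test stayed within both objectives."""
        return recovery_time <= self.rto and data_loss <= self.rpo


# Hypothetical targets: each application gets its own requirements.
web_app = DrTarget("web application", rto=timedelta(minutes=15), rpo=timedelta(0))
reporting = DrTarget("reporting system", rto=timedelta(days=1), rpo=timedelta(hours=4))

# A DR test that took 1 hour to recover and lost 30 minutes of data.
test_recovery, test_loss = timedelta(hours=1), timedelta(minutes=30)
print(web_app.is_met(test_recovery, test_loss))    # False
print(reporting.is_met(test_recovery, test_loss))  # True
```

The same test result passes for one system and fails for the other, which is why the objectives are set per application rather than once for the whole estate.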
DR & Business Continuity
Customer facing transactional web application
Internal collaboration system
Daily scheduled processes & systems
Backend reporting system & database
Applications can be placed on a spectrum of complexity…
Rebuild when required from offsite backup
Run hot-hot configuration with auto-failover
The Utility, On-Demand Data Centre
Primary Site: Routers, Firewalls, Network, Application Licenses, Operating Systems, Hypervisor, Servers, SAN fabric, Primary Storage, Backup, Archive
Secondary Site: Routers, Firewalls, Network, Application Licenses, Operating Systems, Hypervisor, Servers, SAN fabric, Primary Storage, Backup, Archive
The Utility, On-Demand Data Centre
Primary Site: Routers, Firewalls, Network, Application Licenses, Operating Systems, Hypervisor, Servers, SAN fabric, Primary Storage, Backup, Archive
AWS Region: Routers, Firewalls, Network, Application Licenses, Operating Systems, Hypervisor, Servers, SAN fabric, Snapshot Storage, Backup, Archive
Secondary site costs
AWS Global Footprint
11 regions, 28 availability zones, 51 edge locations
https://aws.amazon.com/about-aws/global-infrastructure/
AWS security approach
Size of AWS security team
Visibility into usage & resources
Increasing your Security Posture in the Cloud
https://aws.amazon.com/security
Broad Accreditations & Certifications
https://aws.amazon.com/compliance
Partner ecosystem Customer ecosystem Everyone benefits
Security Benefits from Community Network Effect
RELEVANT AWS SERVICES
Object Storage & Transfer Services
Amazon S3 AWS Import/Export AWS Storage Gateway
AWS Import/Export Disk
https://aws.amazon.com/importexport/disk/details/
Accelerates moving large amounts of data
Uses portable storage devices for transport
Often faster than internet transfer for large data sets
Supported regions: US East (N. Virginia), US West (Oregon), US West (Northern California), EU (Ireland), and Asia Pacific (Singapore)
AWS Import/Export Snowball
https://aws.amazon.com/importexport/
A single Snowball appliance can transport up to 50 terabytes of data
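A rough worked example of why shipping devices is often faster than internet transfer for large data sets. The 50 TB figure is the Snowball capacity quoted above; the link speeds and the `transfer_days` helper are assumptions chosen for illustration, and the calculation ignores protocol overhead.

```python
def transfer_days(terabytes: float, link_mbps: float, utilisation: float = 1.0) -> float:
    """Days needed to send `terabytes` (decimal TB) over a `link_mbps` link.

    `utilisation` is the fraction of the link actually available (assumed).
    """
    bits = terabytes * 1e12 * 8
    seconds = bits / (link_mbps * 1e6 * utilisation)
    return seconds / 86_400


for mbps in (100, 1_000):
    print(f"50 TB over {mbps} Mbit/s: {transfer_days(50, mbps):.1f} days")
# 50 TB over 100 Mbit/s: 46.3 days
# 50 TB over 1000 Mbit/s: 4.6 days
```

Even a fully dedicated 1 Gbit/s link needs several days for one appliance's worth of data, so the break-even point depends mostly on how much of the link can really be spared.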
Using AWS Storage Services for DR
Amazon S3 & Amazon Elastic Block Store
Simple Storage Service: Highly scalable object storage
1 byte to 5TB in size
99.999999999% durability
Elastic Block Store: High performance block storage device
Volumes from 1GB to 16TB in size
Snapshot/cloning functionalities
Networking & Connectivity Services
AWS Direct Connect Amazon Virtual Private Cloud (VPC) Amazon Route 53
Connecting to AWS
VPN Connection
[Diagram: Your premises/network (Your Resources, Customer Gateway) linked by a VPN Connection to the VPC VPN Gateway in Amazon VPC (AWS Resources)]
http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpn-connections.html
Connecting to AWS
Direct Connect
[Diagram: Your premises/network (Your Resources) linked by Direct Connect to Amazon VPC (AWS Resources)]
https://aws.amazon.com/directconnect/
Foundation Services
Amazon EC2 Amazon Relational Database Service (RDS)
Amazon Elastic Block Store (EBS)
COMMON ARCHITECTURES FOR DISASTER RECOVERY
Common Architectures for Disaster Recovery
4 Main Patterns
Backup & Restore
Pilot light
Warm standby in AWS
Multi-site solution AWS & on-premises
Common Architectures for Disaster Recovery
We’ll focus on two, starting with Backup & Restore
Store backup data in the AWS Cloud
Store AMIs for server operating system images
Recover servers during DR testing or invocation
Backup & Restore Pattern
Advantages from starting here…
Simple to get started
Easy starting point for exploring the AWS cloud
Low technical barrier to entry
Focus on incorporating cloud into your DR strategy, not on complex technical issues related to hot-hot systems
Cost effective
Very high levels of data durability at low price
Cost of storing snapshots in S3
Archiving possibilities beyond tape using Glacier
Backup & Restore Pattern
Getting started…
Take backups of configuration state & data
Store Backups in Amazon S3
Move to long term archive in Glacier
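The first two steps above can be sketched as follows: archive a directory of configuration and data, record a checksum, then upload the archive. The function names, the `backups/` key prefix and the timestamp format are hypothetical. `upload_backup` assumes the boto3 library and AWS credentials are available, which is why it is imported lazily.

```python
import hashlib
import tarfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional, Tuple


def make_backup(source_dir: Path, out_dir: Path,
                now: Optional[datetime] = None) -> Tuple[Path, str]:
    """Archive `source_dir` into a timestamped .tar.gz; return (path, sha256)."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    archive = out_dir / f"{source_dir.name}-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(source_dir, arcname=source_dir.name)
    # The digest lets a later restore verify the archive was not corrupted.
    digest = hashlib.sha256(archive.read_bytes()).hexdigest()
    return archive, digest


def upload_backup(archive: Path, bucket: str) -> None:
    """Upload the archive to S3 (needs boto3 and AWS credentials)."""
    import boto3  # imported here so make_backup works without boto3 installed

    boto3.client("s3").upload_file(str(archive), bucket, f"backups/{archive.name}")
```

Moving older backups on to Glacier is usually left to a bucket lifecycle rule rather than done by the backup script itself.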
Backup & Restore Pattern
Options…
Gateway Backup Appliance Direct Access to Amazon S3
AWS Storage Gateway
Amazon S3
Standard, Standard - Infrequent Access, Glacier
https://aws.amazon.com/s3/
Amazon S3 Standard - Infrequent Access
https://aws.amazon.com/s3/storage-classes/
Amazon Glacier
Durable: Designed for 99.999999999% durability of archives
Cost Effective: Write-once, read-never. Cost effective for long term storage. Pay for accessing data
https://aws.amazon.com/glacier/
[Figure sequence: S3 object lifecycle on a time axis, for Logs and Txns objects]
Logs accessible from S3 until the expiry time; objects expire and are deleted
Txns accessible from S3 until the transition time; object transition to Glacier invoked
Restoration of object requested for x hrs; restoration takes 3-5hrs
Object held in S3 RRS for x hrs
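The lifecycle pictured in these slides (transition to Glacier after some time, then expiry and deletion) can be expressed as an S3 lifecycle rule. This is a sketch: the prefix, the day counts and the helper names are assumptions. The rule dictionary follows the shape accepted by boto3's `put_bucket_lifecycle_configuration`, and `apply_rules` needs boto3 and AWS credentials to run.

```python
def lifecycle_rule(prefix: str, glacier_after_days: int, expire_after_days: int) -> dict:
    """Build one lifecycle rule: transition to Glacier, then expire."""
    if expire_after_days <= glacier_after_days:
        raise ValueError("expiry must come after the Glacier transition")
    return {
        "ID": f"archive-{prefix.strip('/')}",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [{"Days": glacier_after_days, "StorageClass": "GLACIER"}],
        "Expiration": {"Days": expire_after_days},
    }


def apply_rules(bucket: str, rules: list) -> None:
    """Attach the rules to a bucket (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the rule builder runs without boto3

    boto3.client("s3").put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration={"Rules": rules}
    )


# Hypothetical policy: archive logs after 30 days, delete them after a year.
rule = lifecycle_rule("logs/", glacier_after_days=30, expire_after_days=365)
print(rule["Transitions"])  # [{'Days': 30, 'StorageClass': 'GLACIER'}]
```

Once the rule is in place the transitions happen without any scheduled job on the customer side, which keeps the backup tooling itself simple.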
Storage Gateway
[Diagram sequence: Corporate Data Center and Elastic Data Center, linked by AWS Storage Gateway]
AWS Storage Gateway installed on-premises to synchronize local volumes
https://aws.amazon.com/storagegateway/
Local volumes created under Storage Gateway
Usable with on-premises servers via iSCSI interface
Primary on-premises volumes snapshotted, compressed and stored in Amazon S3
[✕ marks a failure in the diagram]
Snapshot pulled from S3 to restore local volume
Snapshot pulled from S3 to create cloud instance backed by Volume
Gateway stored volumes
Data stored locally
Asynchronous backup
EBS snapshots
iSCSI local interface
Up to 16TB volumes
Up to 12 volumes
Gateway cached volumes
Data stored in S3
Recently read data cached
Low latency
iSCSI local interface
Up to 32TB volumes
Up to 32 volumes
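A small sizing sketch based on the per-gateway limits quoted on these two slides. The `GatewayLimits` class and the 40 TB example dataset are assumptions for illustration, and the limits are the ones stated here rather than current service quotas.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class GatewayLimits:
    max_volume_tb: int  # largest single volume, in TB
    max_volumes: int    # most volumes on one gateway

    @property
    def max_total_tb(self) -> int:
        return self.max_volume_tb * self.max_volumes

    def volumes_needed(self, dataset_tb: float) -> int:
        """Smallest number of full-size volumes that holds `dataset_tb`."""
        count = max(1, math.ceil(dataset_tb / self.max_volume_tb))
        if count > self.max_volumes:
            raise ValueError("dataset does not fit on a single gateway")
        return count


STORED = GatewayLimits(max_volume_tb=16, max_volumes=12)
CACHED = GatewayLimits(max_volume_tb=32, max_volumes=32)

print(STORED.max_total_tb)        # 192
print(CACHED.max_total_tb)        # 1024
print(STORED.volumes_needed(40))  # 3
```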
AWS Storage Gateway
Gateway-Virtual Tape Library (VTL)
http://docs.aws.amazon.com/storagegateway/latest/userguide/Requirements.html#requirements-backup-sw-for-vtl
Storage appliances & backup management
RDS & Oracle RMAN
https://d0.awsstatic.com/whitepapers/strategies-for-migrating-oracle-database-to-aws.pdf
Common Architectures for Disaster Recovery
Next, let’s take a look at the Pilot Light pattern
Pilot light architecture
Build resources around replicated dataset
Keep ‘pilot light’ on by replicating core databases
Build AWS resources around dataset and leave in stopped state
Scale AWS resources in response to a DR event
Start up pool of resources in AWS when events dictate
Match required production capacity through auto-scaling policies
Cut over to the system in AWS
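The pilot light sequence can be sketched as a small in-memory model: start the stopped pool, scale to the capacity production needs, then cut over. `PilotLightSite` is a hypothetical class that only tracks state; it does not call any AWS API, and the instance counts are made up.

```python
from dataclasses import dataclass, field


@dataclass
class PilotLightSite:
    """In-memory model of a pilot light environment (not an AWS client)."""
    stopped_instances: int  # pre-built servers left in a stopped state
    running_instances: int = 0
    serving_traffic: bool = False
    log: list = field(default_factory=list)

    def start_pool(self) -> None:
        """Start the pre-built, stopped instances."""
        self.running_instances += self.stopped_instances
        self.stopped_instances = 0
        self.log.append("started pool")

    def scale_to(self, desired: int) -> None:
        """Grow the fleet to `desired` instances; never shrinks it."""
        self.running_instances = max(self.running_instances, desired)
        self.log.append(f"scaled to {self.running_instances}")

    def cut_over(self) -> None:
        """Send production traffic to the AWS environment."""
        if self.running_instances == 0:
            raise RuntimeError("nothing is running; start the pool first")
        self.serving_traffic = True
        self.log.append("cut over")


site = PilotLightSite(stopped_instances=2)
site.start_pool()
site.scale_to(5)
site.cut_over()
print(site.running_instances, site.serving_traffic)  # 5 True
```

Keeping the steps as separate methods mirrors how they would be rehearsed in a DR test, where each stage is checked before the next one starts.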
[Diagram: Pilot Light with stopped instances, then Pilot Light with running instances]
RESOURCES YOU CAN USE TO LEARN MORE
aws.amazon.com/disaster-recovery/
AWS Disaster Recovery White Paper
Amazon Web Services – Using AWS for Disaster Recovery October 2014
Using Amazon Web Services for Disaster Recovery October 2014
Glen Robinson, Attila Narin, and Chris Elleman
Contents
Introduction
Recovery Time Objective and Recovery Point Objective
Traditional DR Investment Practices
AWS Services and Features Essential for Disaster Recovery
Example Disaster Recovery Scenarios with AWS
  Backup and Restore
  Pilot Light for Quick Recovery into AWS
  Warm Standby Solution in AWS
  Multi-Site Solution Deployed on AWS and On-Site
  AWS Production to an AWS DR Solution Using Multiple AWS Regions
Replication of Data
Failing Back from a Disaster
Improving Your DR Plan
Software Licensing and DR
Conclusion
Further Reading
Document Revisions
Warm Standby Solution in AWS
The term warm standby is used to describe a DR scenario in which a scaled-down version of a fully functional environment is always running in the cloud. A warm standby solution extends the pilot light elements and preparation. It further decreases the recovery time because some services are always running. By identifying your business-critical systems, you can fully duplicate these systems on AWS and have them always on.
These servers can be running on a minimum-sized fleet of Amazon EC2 instances on the smallest sizes possible. This solution is not scaled to take a full-production load, but it is fully functional. It can be used for non-production work, such as testing, quality assurance, and internal use.
In a disaster, the system is scaled up quickly to handle the production load. In AWS, this can be done by adding more instances to the load balancer and by resizing the small capacity servers to run on larger Amazon EC2 instance types. As stated in the preceding section, horizontal scaling is preferred over vertical scaling.
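As a rough illustration of the horizontal scaling step, the sketch below estimates how many instances to add to a warm standby fleet so that it can take the production load. The function name, the 20% headroom default and the example numbers are assumptions, not figures from the whitepaper.

```python
import math


def instances_to_add(running: int, capacity_each: float,
                     production_load: float, headroom: float = 0.2) -> int:
    """Extra instances needed so the fleet covers the load plus headroom.

    `capacity_each` and `production_load` share a unit, for example
    requests per second.
    """
    required = math.ceil(production_load * (1 + headroom) / capacity_each)
    return max(0, required - running)


# Two small standby instances, each good for 250 req/s, facing 2000 req/s.
print(instances_to_add(running=2, capacity_each=250, production_load=2000))  # 8
```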
Preparation phase
The following figure shows the preparation phase for a warm standby solution, in which an on-site solution and an AWS solution run side-by-side.
Figure 6: The Preparation Phase of the Warm Standby Scenario.
Multi-Site Solution Deployed on AWS and On-Site
A multi-site solution runs in AWS as well as on your existing on-site infrastructure, in an active-active configuration. The data replication method that you employ will be determined by the recovery point that you choose. For more information about recovery point options, see the Recovery Time Objective and Recovery Point Objective section in this whitepaper.
In addition to recovery point options, there are various replication methods, such as synchronous and asynchronous methods. For more information, see the Replication of Data section in this whitepaper.
You can use a DNS service that supports weighted routing, such as Amazon Route 53, to route production traffic to different sites that deliver the same application or service. A proportion of traffic will go to your infrastructure in AWS, and the remainder will go to your on-site infrastructure.
In an on-site disaster situation, you can adjust the DNS weighting and send all traffic to the AWS servers. The capacity of the AWS service can be rapidly increased to handle the full production load. You can use Amazon EC2 Auto Scaling to automate this process. You might need some application logic to detect the failure of the primary database services and cut over to the parallel database services running in AWS.
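With weighted routing, each site receives traffic in proportion to its weight divided by the sum of all weights. The sketch below models that arithmetic and the weight change made during an on-site disaster. The site names, the weights and both helper functions are illustrative assumptions; nothing here calls the Route 53 API.

```python
def traffic_share(weights: dict) -> dict:
    """Fraction of traffic per site: weight / sum of all weights."""
    total = sum(weights.values())
    if total == 0:
        raise ValueError("at least one site needs a non-zero weight")
    return {site: weight / total for site, weight in weights.items()}


def fail_over(weights: dict, failed_site: str) -> dict:
    """Return new weights with the failed site set to zero."""
    return {site: (0 if site == failed_site else weight)
            for site, weight in weights.items()}


normal = {"on-site": 90, "aws": 10}
print(traffic_share(normal))                        # {'on-site': 0.9, 'aws': 0.1}
print(traffic_share(fail_over(normal, "on-site")))  # {'on-site': 0.0, 'aws': 1.0}
```

In practice the weight change only takes effect as cached DNS answers expire, so record TTLs influence how quickly traffic actually moves.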
The cost of this scenario is determined by how much production traffic is handled by AWS during normal operation. In the recovery phase, you pay only for what you use for the duration that the DR environment is required at full scale. You can further reduce cost by purchasing Amazon EC2 Reserved Instances for your “always on” AWS servers.
Preparation phase
The following figure shows how you can use the weighted routing policy of the Amazon Route 53 DNS to route a portion of your traffic to the AWS site. The application on AWS might access data sources in the on-site production system. Data is replicated or mirrored to the AWS infrastructure.
Figure 8: The Preparation Phase of the Multi-Site Scenario.
http://media.amazonwebservices.com/AWS_Disaster_Recovery.pdf
aws.amazon.com/vpc
aws.amazon.com/directconnect
aws.amazon.com/s3
aws.amazon.com/glacier
aws.amazon.com/storagegateway
AWS re:Invent 2015 | (STG304) Deploying a Disaster Recovery Site on AWS
https://www.youtube.com/watch?v=bXrGUlgbl-s&list=PLhr1KZpdzukdTMmq1gkXs7g6WIIXtL5r9&index=15
aws.amazon.com/architecture/
AWS Training & Certification
Certification: Validate your proven skills and expertise with the AWS platform
aws.amazon.com/certification
Self-Paced Labs: Try products, gain new skills, and get hands-on practice working with AWS technologies
aws.amazon.com/training/self-paced-labs
Training: Build technical expertise to design and operate scalable, efficient applications on AWS
aws.amazon.com/training
Follow us for more events & webinars
@AWScloud for Global AWS News & Announcements
@AWS_UKI for local AWS events & news
@IanMmmm
Ian Massingham, Technical Evangelist