ProfitBricks-white-paper-Disaster-Recovery-US

How to select a cloud disaster recovery method that meets your requirements.

White Paper


White Paper Disaster Recovery Page 2

Table of contents

Executive Summary

Introduction

Disaster Recovery Methodologies

Cold Site DR at ProfitBricks

Warm Site DR at ProfitBricks

Hot Site DR at ProfitBricks

Other Considerations

Executive Summary

The value of the data a business generates keeps increasing, and more and more companies are required to implement a disaster recovery plan. It’s important to understand your business’s Recovery Time Objective (RTO) and Recovery Point Objective (RPO), along with the risk and bottom-line cost of these values, before selecting one of the three most common technology solutions to meet them: Cold site, Warm site and Hot site. Each meets the RTO and RPO objectives differently, and each has a different cost and operational model. In this paper, you’ll learn some of the details and see, via architectural drawings, how a Hot site combined with public cloud infrastructure can help you achieve an RTO and RPO that would be the envy of any business continuity expert, though at a higher cost than a Cold site. ProfitBricks’ feature set includes a visual cloud management tool, billing by the minute, free incoming traffic and live vertical scaling, which combine to make one of the most compelling disaster recovery options available.

Introduction

The data a company gains or uses has become one of the single most important predictors of a business’s ability to merely survive or to prosper. Data loss during a disaster can mean an increased likelihood of business failure. Insurance companies, financial institutions and your business partners may all now be asking for a DR plan. We’ve all heard the statistic: a business that loses an important portion of its data is 70% more likely to fail.

Many organizations are turning to cloud computing and its Infrastructure as a Service (IaaS) delivery model to bring Disaster Recovery (DR) capabilities to their critical IT workloads without investing in a second physical site or incurring the capital expenditure of hardware infrastructure there. Most IaaS offerings support the various disaster recovery methodologies for failover and recovery. Many IaaS providers have multiple state-of-the-art data centers around the world and can become a strategic partner in ensuring business continuity for any organization.

First, it’s important to define some of the concepts and terminology commonly used when developing a plan to recover from a disaster.

• Recovery Time Objective (RTO) – This is the maximum tolerable length of time that a system can be unavailable during a disaster without causing irreparable damage to business operations, customer satisfaction, employee productivity, or revenue and profit margins. For example, if a disaster occurs at 1:00 PM and the RTO is 8 hours, the DR process should restore the services no later than 9:00 PM to meet this requirement.

• Recovery Point Objective (RPO) – This is the maximum targeted period of time over which data might be lost during a disaster. For example, if a disaster occurs at 1:00 PM and the RPO is 1 hour, the system should be able to recover all data up to 12:00 PM (noon), so that at most the final hour of data before the disaster is lost.
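The arithmetic behind the two examples above can be sketched in a few lines of Python. This is only an illustration: the function name `recovery_targets` and the calendar date are assumptions made for the example, not part of any ProfitBricks tooling.

```python
from datetime import datetime, timedelta


def recovery_targets(disaster_time, rto, rpo):
    """Return (restore_deadline, recovery_point) for a given disaster time.

    restore_deadline: latest time services may be restored and still meet the RTO.
    recovery_point:   time up to which all data must be recoverable to meet the RPO.
    """
    return disaster_time + rto, disaster_time - rpo


# Values taken from the examples above; the date itself is arbitrary.
disaster = datetime(2015, 1, 1, 13, 0)  # 1:00 PM
deadline, point = recovery_targets(
    disaster, rto=timedelta(hours=8), rpo=timedelta(hours=1)
)
print("Restore services by:", deadline.strftime("%H:%M"))  # 21:00, i.e. 9:00 PM
print("Recover data up to: ", point.strftime("%H:%M"))     # 12:00, i.e. noon
```

Any data written between the recovery point and the moment of the disaster is the loss the business has agreed to tolerate.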

One of the most important steps in developing a disaster recovery plan for IT infrastructure is analyzing the financial impact on the business when IT systems are unavailable. This impact may include, among other things, loss of credibility and business losses due to downtime. An organization would typically determine an acceptable RTO and RPO in order to find and plan solutions that are cost effective and meet the necessary levels of availability.

It is also important to distinguish between Backup and Disaster Recovery. Although many use the terms interchangeably, Backup is essential to the Disaster Recovery process but is often a separate initiative. Most organizations have a backup plan but not a full disaster recovery plan, which is not recommended. See our Backup and Restore documentation for further details.

This white paper outlines some of the most common methodologies used by customers of ProfitBricks to enable the DR processes and minimize the impact of a disaster. It is worth mentioning that a comprehensive Disaster Recovery Plan is a holistic strategy that includes people, processes and technology. This white paper emphasizes the technology aspects of a DR plan.

Disaster Recovery Methodologies

In general, there are three types of scenarios that are commonly used to support a DR strategy:

Requirement | COLD | WARM | HOT
Overall Cost | Low | Medium | High
ProfitBricks Cost Savings Features | On-Demand Facilities; On-Demand Servers; Free Incoming Traffic; No Tape, No Hardware | Same as Cold plus: Minute-based billing; Live Vertical Scaling | Same as Warm
User impact (RTO) | High | Medium | Low
ProfitBricks Features to reduce RTO | Automatic Tools: Ansible, API, Chef Knife Plugin, CLI, Libcloud, JClouds, SDKs | Same as Cold | Not necessary
Complexity | Low | Medium | High
Test Feasibility | Not Easy | Not Easy | Easy
Partner Solutions Needed | Backup & Restore (RPO); DNS | Same as Cold | DNS Failover; Global Load Balancing; Replication Technologies

• Cold Site – typically refers to having access to data center space, power and network connectivity without having the production server workloads deployed. In the event of a disaster, IT staff will have to spend a significant amount of time building the necessary infrastructure and deploying the server workloads from backups or other images. This scenario is typically the least expensive option, but it requires a good deal of configuration, which makes the RTO window difficult to achieve for certain environments.

• Warm Site – a middle ground among the DR options described in this document. This method offers the same benefits as a cold site (lower costs, fewer complicated connections) with the addition of some pre-installed servers and core infrastructure that can be quickly configured and/or restored in order to minimize configuration effort and speed the recovery process.

• Hot Site – essentially mirrors the production data center. In this type of configuration, the production and DR workloads run concurrently, with the primary site continuously synchronizing changes to the DR site. This allows the DR site to serve requests almost immediately in the event of a disaster. This level of redundancy is typically the most expensive of the options, but it greatly reduces risk and ensures a shorter RTO can be achieved.

The following figure is a high-level graphical view of the costs and benefits of the methods described above:

Figure 1 Cost / benefits of the Cold, Warm and Hot site methods

As with any other methodology, there is no one-size-fits-all approach, and these methods can certainly be combined to create a workload-specific DR capability for the various applications being hosted. For this reason, it is recommended to perform an application criticality categorization review to determine the needs of each application in the environment.
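A criticality review of this kind could be captured, in a very simplified form, as a small script that maps each application's RTO and RPO to a suggested method. The sketch below is illustrative only: the application names and the threshold values (`hot_below`, `warm_below`) are assumptions chosen for the example, not ProfitBricks recommendations.

```python
from dataclasses import dataclass


@dataclass
class App:
    name: str
    rto_hours: float  # maximum tolerable downtime
    rpo_hours: float  # maximum tolerable data-loss window


def suggest_method(app, hot_below=1.0, warm_below=8.0):
    """Suggest 'hot', 'warm' or 'cold' from the tighter of the app's RTO and RPO.

    The thresholds are hypothetical; each organization sets its own.
    """
    tightest = min(app.rto_hours, app.rpo_hours)
    if tightest < hot_below:
        return "hot"
    if tightest < warm_below:
        return "warm"
    return "cold"


# Hypothetical application inventory.
apps = [
    App("web-shop", rto_hours=0.25, rpo_hours=0.1),
    App("intranet-wiki", rto_hours=4, rpo_hours=12),
    App("archive", rto_hours=48, rpo_hours=24),
]
for app in apps:
    print(f"{app.name}: {suggest_method(app)}")
```

With these assumed thresholds the three sample applications land in the hot, warm and cold categories respectively, which shows how one environment can mix all three methods.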

Cold Site DR at ProfitBricks

One of the benefits of partnering with ProfitBricks for a Disaster Recovery strategy and implementation is that all of the data center space, power, cooling, physical security and an almost limitless supply of servers, storage and networks can be spun up in our state-of-the-art data centers the minute an administrator logs in to the ProfitBricks Data Center Designer.

A typical configuration for this level of recovery would include enough infrastructure to provide basic network connectivity and access to backups, which can be used much as in a backup-and-restore scenario. The difference is that recovery would occur at a different site than the primary data center. In this scenario, the target recovery site can be a ProfitBricks virtual data center that mirrors the existing data center that has failed.

Some of the items that should be considered when preparing the DR Site environment include:

• Understand how clients will connect to the DR site, which could be via a web browser, Terminal Services or VPN, among others. This will depend on the application(s) being hosted.

• Reserve the necessary public IP(s) that will be used to access the environment and configure the necessary Firewall/VPN rules to allow sufficient access to the DR Site.

• Maintain a copy of the backup sets at the DR site to speed up the recovery process. It is worth mentioning that ProfitBricks customers have the option to ship a hard drive to “seed” the backup data and then sync only the delta changes.

• Stage CPU, RAM and LAN connections for the servers that would be used for recovery within the ProfitBricks virtual data center. Leave them “unallocated” by clicking the STOP button to avoid unnecessary costs for computing resources.

In the event of a disaster, the backup server at the ProfitBricks virtual data center (VDC) can recover the systems from its local backup replica. Also, the domain name (DNS record) can be updated to point to the ProfitBricks VDC in order to reach the servers that will take over production operations.

Ideally, the Firewall Server can be maintained with the same rules as the source in order to minimize the amount of time needed to configure the appropriate access control lists (ACLs). It is worth mentioning that the IP space for the internal local area network can likely be maintained, so it’s important to document the networking details in advance.

Below is a high-level representation of the items referenced above, which can be pre-staged to enable cold failover recovery for a sample 3-tier web application:

Figure 2 Pre-staged environment to enable cold failover recovery for a sample 3-tier web application

The following figure shows the recovery phase for this scenario:

Figure 3 Recover phase of a cold failover recovery for a sample 3-tier web application

Warm Site DR at ProfitBricks

A warm or standby site is often a popular option because it allows a scaled-down version of a fully functional environment to be deployed and kept running at all times. This further decreases the recovery time, as servers are already running and may simply need to be updated with the latest data sets (i.e., delta vs. full).

ProfitBricks can further support this method by allowing the servers to be configured with any combination of CPU, RAM and Storage (enough to meet the minimum requirements) and later scaled to the appropriate size via ProfitBricks’ Live Vertical Scaling feature, which enables the allocation of additional resources on the fly.

The initial DR setup at ProfitBricks would look as depicted below for a sample 3-tier web application:

Figure 4 Pre-DR environment for a warm site at ProfitBricks for a sample 3-tier web application

This method minimizes the cost of running an identical amount of resources at the DR site while providing the flexibility to grow as needed. At ProfitBricks, it is also possible to scale up temporarily for testing the disaster recovery process, run user acceptance testing (UAT) and then scale back down to the minimum requirements.

In the case of failure of the primary data center, the warm/standby environment can be quickly scaled up to support the production load, DNS records can be updated to redirect traffic to ProfitBricks, and the latest delta changes can be applied to the production data.

The diagram below is a graphical representation of what the environment would look like during the recovery process:

Figure 5 Post recovery for a warm site DR environment at ProfitBricks for a sample 3-tier web application
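To make the scale-up step of a warm site recovery more concrete, the sketch below computes how much CPU and RAM each standby server would need to gain to reach its production size. All server names and sizes are hypothetical, and the script only plans the change; actually applying it would be done through the Data Center Designer or one of the automation tools, which are not modeled here.

```python
# Hypothetical server sizes as (cores, ram_gb) for a 3-tier web application.
standby = {"web": (1, 2), "app": (2, 4), "db": (2, 8)}
production = {"web": (4, 8), "app": (8, 16), "db": (8, 32)}


def scale_up_plan(current, target):
    """Return the extra (cores, ram_gb) each server needs to reach its target size."""
    plan = {}
    for name, (target_cores, target_ram) in target.items():
        cur_cores, cur_ram = current.get(name, (0, 0))
        plan[name] = (
            max(target_cores - cur_cores, 0),
            max(target_ram - cur_ram, 0),
        )
    return plan


for name, (cores, ram) in scale_up_plan(standby, production).items():
    print(f"{name}: +{cores} cores, +{ram} GB RAM")
```

Keeping such a plan alongside the DR documentation means the sizes to apply during a failover, or during a temporary DR test, do not have to be worked out under pressure.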

Hot Site DR at ProfitBricks

A hot site solution runs identical configurations at the primary site and in the ProfitBricks cloud, using either an active-active configuration or a highly automated active-passive configuration in which the DR site is only a few transactions behind the primary source. This is typically achieved by leveraging synchronous (active-active) or asynchronous (active-passive) replication technologies. This scenario is often combined with technologies that enable a “stretched cluster”, in which multiple redundant resources located in different geographic locations appear as a single highly available system. These technologies include DNS weighting, Global Load Balancing, Host-Based Replication and Database mirroring, to name a few.

The following diagram depicts some of the elements that need to be in place to deploy an active-active configuration in our sample 3-tier web application:

Figure 6 Pre-DR environment for a Hot site at ProfitBricks for a sample 3-tier web application

If disaster strikes, the Global Load Balancer can quickly cut traffic over to the infrastructure running at ProfitBricks once it detects that the primary data center has failed. At this point, the DNS service will direct 100% of the traffic to the DR environment.

The following diagram depicts the environment after the disaster:

Figure 7 Post-DR environment for a Hot site at ProfitBricks for a sample 3-tier web application
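The cutover decision described above can be illustrated with a small sketch of the kind of logic a global load balancer applies. This is not the behavior of any specific product: the consecutive-failure threshold and the 50/50 split during normal operation are assumptions made for the example.

```python
def failover_weights(health_checks, failures_to_trip=3, normal_split=(50, 50)):
    """Return (primary_percent, dr_percent) after a series of health checks.

    health_checks: iterable of booleans, True meaning the primary site responded.
    The primary is declared failed after `failures_to_trip` consecutive misses
    (an assumed policy); from then on 100% of traffic goes to the DR site.
    """
    consecutive_failures = 0
    for ok in health_checks:
        consecutive_failures = 0 if ok else consecutive_failures + 1
        if consecutive_failures >= failures_to_trip:
            return (0, 100)
    return normal_split


# A single missed check does not trigger failover; three in a row do.
print(failover_weights([True, False, True, True]))
print(failover_weights([True, False, False, False]))
```

Requiring several consecutive failures is a common way to avoid failing over on a brief network glitch, at the cost of a slightly longer detection time, which counts against the RTO.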

Other Considerations

Every IT environment varies, and while the examples used in this white paper were simple 3-tier web applications, your organization probably has a mixture of simpler and far more complex environments. Any of the preceding scenarios in this white paper can be deployed at ProfitBricks using separate regions to deploy both primary and DR sites within the ProfitBricks infrastructure. Some of the benefits of utilizing one IaaS vendor for both environments include:

• One vendor for all Infrastructure-as-a-Service needs. No need to negotiate contracts with third-party vendors.

• Leverage the same tools and APIs for both environments.

It is also worth considering the following factors for the replication of data between locations:

• Bandwidth – Network throughput and latency are key to delivering data consistently.

• Distance Between Sites – This not only affects the latency of the network connection; the sites should also be far enough apart to prevent a single disaster from affecting both physical sites.

• Data Transfer Rate – This directly correlates with the ability to meet RPO requirements and is dependent upon the available bandwidth, distance and capabilities of the replication technology.

Another important aspect of the disaster recovery plan is the people and processes that need to be in place to support its execution. Some of these items include:

• Testing frequency and testing methodologies of the DR plan.

• Communications needed during the disaster, including regular updates to key stakeholders.

• Ownership for the various processes and tasks that need to be executed.

• Process to “fail back” to the primary site after the disaster has been contained.

• Understanding of software licensing rights for the various workloads and their deployments in a DR model to ensure compliance.
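Returning to the replication factors listed earlier in this section, the link between bandwidth, data transfer rate and RPO can be checked with simple arithmetic. The sketch below is a rough estimate under stated assumptions: the change rate, link speeds and the `efficiency` factor (the share of nominal bandwidth actually usable for replication) are hypothetical values, and latency effects are ignored.

```python
def transfer_seconds(data_gb, bandwidth_mbps, efficiency=0.8):
    """Seconds needed to send data_gb over a link of bandwidth_mbps.

    efficiency is an assumed fraction of the nominal bandwidth that is
    usable for replication (protocol overhead, competing traffic).
    """
    bits = data_gb * 8 * 1000**3  # decimal gigabytes to bits
    usable_bits_per_second = bandwidth_mbps * 1000**2 * efficiency
    return bits / usable_bits_per_second


def meets_rpo(change_gb_per_hour, bandwidth_mbps, rpo_hours):
    """True if the data changed during one RPO window can be replicated within that window."""
    delta_gb = change_gb_per_hour * rpo_hours
    return transfer_seconds(delta_gb, bandwidth_mbps) <= rpo_hours * 3600


# Hypothetical workload: 20 GB of changed data per hour and a 1-hour RPO.
for mbps in (100, 10):
    ok = meets_rpo(20, bandwidth_mbps=mbps, rpo_hours=1)
    print(f"{mbps} Mbit/s link meets the RPO: {ok}")
```

If the check fails, the replication backlog grows faster than it can be drained, so the achievable RPO drifts further from the target the longer the system runs.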

We hope this white paper provided an overview of how customers of the ProfitBricks cloud are implementing their disaster recovery programs.

If you’d like to learn how to select a cloud computing IaaS provider, we suggest that you download our expert guide to selecting a cloud computing provider, complete with a checklist you could use as an RFP template.

This report was created by ProfitBricks.

About ProfitBricks

ProfitBricks is the IaaS provider that offers a painless cloud infrastructure for all IT users and managed service provider partners, with no learning curve. ProfitBricks boasts flexible cloud servers and networking, an integrated Data Center Designer tool for visual control over the cloud and a price/performance guarantee. ProfitBricks was awarded a spot on CRN’s Emerging Vendors List, won the Best Cloud Award by ASCII and was also the recipient of the Frost & Sullivan North American Cloud Innovation award. ProfitBricks has offices in Berlin, Germany; San Antonio, Texas; and Boston, Massachusetts.

Get more information

For further information and resources please visit our

• Website: www.profitbricks.com

• Blog: blog.profitbricks.com

• Twitter: https://twitter.com/ProfitBricksUSA

ProfitBricks Inc.
15900 La Cantera Pkwy, Ste. 19210
San Antonio, TX 78256

Phone: + 1 866 852-5229
Fax: + 1 888 620-3375
Email: [email protected]

Copyright

©2015 ProfitBricks, Inc. All rights reserved. ProfitBricks, the ProfitBricks logo and Data Center Designer are trademarks of ProfitBricks Inc. All other trademarks are the property of their respective owners. ProfitBricks reserves the right to make changes without further notice.