Protect your app from Outages

Post on 12-Jun-2015

description

How to protect your application from outages and failures of cloud infrastructure: planning a disaster recovery architecture and using Cloudify for cloud abstraction and monitoring.

Transcript of Protect your app from Outages

Protect your app from Outages
Ron Zavner, Applications Architect at GigaSpaces

February 2013

AGENDA

• AWS and outages
• Outage impact
• Disaster Recovery – it’s all about redundancy!
• Cloudify as a solution for redundancy
• Demo with Cloudify on EC2

® Copyright 2013 GigaSpaces Ltd. All Rights Reserved

AWS USAGE

Managing Big Data on the Cloud

• AWS – around 0.5M servers
• Facebook – less than 0.1M servers
• Google – around 1M servers

THE OUTAGE PROBLEM

OUTAGE – APRIL 21, 2011

OUTAGE - JUNE 29, 2012

OUTAGE - OCTOBER 22, 2012

OUTAGE - CHRISTMAS EVE 2012

IS THAT WHAT YOU EXPECT?

• 99% - 3.65 days of downtime per year
• 99.9% - 8.76 hours of downtime per year
• 99.99% - 53 minutes of downtime per year
• 99.999% - 5.26 minutes of downtime per year
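The downtime figures above follow from simple arithmetic over a 365-day year. A minimal Python sketch of that calculation is shown here; the function name and the non-leap-year assumption are illustrative and not part of the original slides.

```python
# Illustrative helper (not from the slides): converts an availability
# percentage into the downtime it allows over a 365-day year.

HOURS_PER_YEAR = 365 * 24  # 8760 hours; assumes a non-leap year


def allowed_downtime_hours(availability_percent: float) -> float:
    """Return the hours of downtime per year permitted by an availability level."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return HOURS_PER_YEAR * (100 - availability_percent) / 100


if __name__ == "__main__":
    # Print the allowance for each level listed on the slide.
    for level in (99.0, 99.9, 99.99, 99.999):
        hours = allowed_downtime_hours(level)
        print(f"{level}% -> {hours:.2f} hours ({hours * 60:.2f} minutes) per year")
```

Each extra "nine" cuts the yearly allowance by a factor of ten, which is why the slide moves from days to hours to minutes.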

OUTAGE IMPACT – DESIGN FOR FAILURES

An outage could cost…
• $89K per hour for Amadeus
• $225K per hour for PayPal!

DISASTER RECOVERY

MULTI CLOUD

PREPARE FOR DISASTER RECOVERY

• Assign a dedicated expert for DR architecture
• Define target recovery time and recovery point (RTO and RPO)
• Assume every tier can fail
• Use monitoring and alerts
• Document your operational processes
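The target recovery time and recovery point in the list above can be written down and checked against measurements from a DR drill. The Python sketch below is one possible way to do that. The class, function and field names are hypothetical, and using replication lag as the measure of potential data loss is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RecoveryTargets:
    """Hypothetical container for the two targets named on the slide."""
    rto_minutes: float  # recovery time objective: longest tolerable outage
    rpo_minutes: float  # recovery point objective: largest tolerable data-loss window


def find_violations(targets: RecoveryTargets,
                    measured_recovery_minutes: float,
                    replication_lag_minutes: float) -> List[str]:
    """Return a list of violated targets; an empty list means both are met."""
    violations = []
    if measured_recovery_minutes > targets.rto_minutes:
        violations.append(
            f"recovery took {measured_recovery_minutes} min, "
            f"target is {targets.rto_minutes} min")
    if replication_lag_minutes > targets.rpo_minutes:
        violations.append(
            f"replication lag is {replication_lag_minutes} min, "
            f"target is {targets.rpo_minutes} min")
    return violations


if __name__ == "__main__":
    targets = RecoveryTargets(rto_minutes=30, rpo_minutes=5)
    # Figures from an imaginary DR drill, for illustration only.
    for problem in find_violations(targets, measured_recovery_minutes=45,
                                   replication_lag_minutes=2):
        print("VIOLATION:", problem)
```

Running a check like this after every drill turns the targets into something that can raise an alert.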

CHAOS MONKEY
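Chaos Monkey is Netflix's tool for randomly terminating production instances so that teams find out whether their services tolerate failure. The Python sketch below only simulates that idea on an in-memory pool. It calls no cloud API, and every name in it is hypothetical.

```python
import random
from typing import Dict, Optional


def terminate_random_instance(pool: Dict[str, bool],
                              rng: random.Random) -> Optional[str]:
    """Mark one randomly chosen healthy instance as down and return its name.

    `pool` maps instance name -> healthy flag. Returns None if nothing is healthy.
    """
    healthy = sorted(name for name, up in pool.items() if up)
    if not healthy:
        return None
    victim = rng.choice(healthy)
    pool[victim] = False
    return victim


def service_available(pool: Dict[str, bool], minimum_healthy: int = 1) -> bool:
    """The simulated service stays up while enough instances remain healthy."""
    return sum(pool.values()) >= minimum_healthy


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so a run is repeatable
    pool = {"app-1": True, "app-2": True, "app-3": True}
    victim = terminate_random_instance(pool, rng)
    print(f"terminated {victim}; service available: {service_available(pool)}")
```

With redundant instances the simulated service survives a single termination. With one instance it does not, which is the argument for redundancy.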

It’s all about REDUNDANCY!

CLONE YOUR ENVIRONMENT

CLONE YOUR DATA

• RDS Read Replica
• More to come…

You must use an AUTOMATION layer

CLOUDIFY POSITIONING IN THE CLOUD STACK

[Diagram: the cloud stack plotted by Productivity versus Control. Labels: PaaS (CloudFoundry, Heroku, GAE, OpenShift); DevOps automation (Chef, Puppet); RightScale; Enstratus; IaaS, covering public clouds (AWS, Rackspace, ..) and private clouds (VMware, OpenStack, ..). Caption: "High productivity with full control".]

CLONE YOUR ENV - HOW DOES IT WORK?

EXTENSIVE PLATFORM SUPPORT

USE ANY CLOUD

GETTING COMPUTE RESOURCES IN A PORTABLE WAY

// Service recipe: request a machine by its logical template name
compute {
    template "SMALL_LINUX"
}

// Cloud driver for EC2: what SMALL_LINUX means on this cloud
SMALL_LINUX : template{
    imageId "us-east-1/ami-76f0061f"
    remoteDirectory "/home/ec2-user/gs-files"
    machineMemoryMB 1600
    hardwareId "m1.small"
    locationId "us-east-1"
    localDirectory "upload"
    keyFile "myKeyFile.pem"
    options ([
        "securityGroups" : ["default"] as String[],
        "keyPair" : "myKeyFile"
    ])
    overrides ([
        "jclouds.ec2.ami-query" : "",
        "jclouds.ec2.cc-ami-query" : ""
    ])
    privileged true
}

// Cloud driver for OpenStack: the same template name, different values
SMALL_LINUX : template{
    imageId "1234"
    machineMemoryMB 3200
    hardwareId "103"
    remoteDirectory "/root/gs-files"
    localDirectory "upload"
    keyFile "gigaPGHP.pem"
    options ([
        "openstack.securityGroup" : "default",
        "openstack.keyPair" : "gigaPGHP"
    ])
    privileged true
}
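Portability here comes from both clouds defining a template under the same name, SMALL_LINUX, so the recipe never mentions an image or hardware ID. The Python sketch below illustrates that lookup. It is not Cloudify's implementation. The values are copied from the slide, while the dictionary layout, the cloud keys and the function name are assumptions.

```python
from typing import Dict

# Per-cloud definitions of the same logical template name.
# Values are copied from the two templates on the slide; the dictionary
# layout and the cloud keys ("ec2", "openstack") are illustrative assumptions.
TEMPLATES: Dict[str, Dict[str, Dict[str, object]]] = {
    "ec2": {
        "SMALL_LINUX": {
            "imageId": "us-east-1/ami-76f0061f",
            "hardwareId": "m1.small",
            "machineMemoryMB": 1600,
        },
    },
    "openstack": {
        "SMALL_LINUX": {
            "imageId": "1234",
            "hardwareId": "103",
            "machineMemoryMB": 3200,
        },
    },
}


def resolve_template(cloud: str, template_name: str) -> Dict[str, object]:
    """Look up the cloud-specific settings behind a logical template name."""
    try:
        return TEMPLATES[cloud][template_name]
    except KeyError:
        raise KeyError(
            f"no template {template_name!r} defined for cloud {cloud!r}") from None


if __name__ == "__main__":
    # The caller asks for the same name on both clouds.
    for cloud in ("ec2", "openstack"):
        print(cloud, resolve_template(cloud, "SMALL_LINUX"))
```

Moving the application to another cloud then means adding one more entry under the same name, with no change to the recipe.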

DATA REPLICATION

• Cloudify Replicated MySQL Recipe
• Generic replication service using WAN Gateway

GENERIC REPLICATION SERVICE OVER WAN

[Diagram: three sites, Hong Kong, London and New York, connected over the WAN.]

• In-Memory Speed
• High Availability and Self-Healing
• Scalable and Efficient

Real Life Scenario

VERIFI (CURRENT) DEPLOYMENT ARCHITECTURE

[Diagram: a single availability region (US-West: Oregon). Traffic from the Internet reaches EC2 instances running mod_cluster, JBoss, PostgreSQL and Cassandra, with two data volumes attached. The stack is deployed with 4 recipes.]

TARGET ARCHITECTURE

[Diagram: two availability regions with replication between them. US-West (Oregon): Internet, EC2 instances running mod_cluster, JBoss, the Postgres master and Cassandra, plus data volumes. US-East (Virginia): EC2 instances running mod_cluster, JBoss, the Postgres slave and Cassandra, plus data volumes.]

Bootstrap two EC2 clouds in different regions and install the “verifi” application on each. The second cloud uses a slightly modified (extended) Postgres recipe that acts as a slave, and it runs no app servers. When the primary region fails, the second cloud spins up app server instances and turns its data instance into the master, then bootstraps another “slave” cloud in another region.

FAILOVER SCENARIO

[Diagram, before the failure: Cloud #1 in Region (US-West Oregon) runs the app servers and PostgreSQL; Cloud #2 in Region (US-East Virginia) runs PostgreSQL and sends a liveness poll to Cloud #1. After the failure: Cloud #1 is crossed out, Cloud #2 in Region (US-East Virginia) runs the app servers and PostgreSQL, and Cloud #3 in Region (US-West California) runs PostgreSQL, again with a liveness poll.]

0. Upon initial deployment, the primary deployment of the application is bootstrapped onto cloud #1. A slightly modified application recipe is bootstrapped as cloud #2, which polls cloud #1 for failure and acts as a PostgreSQL DB slave.
1. Region failure occurs.
2. Turn the Postgres slave into the master and start app server instances.*
3. Bootstrap another cloud in a different region, using the same application recipe used to bootstrap cloud #2 above.*
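The failover sequence above (liveness poll, promote the slave, start the app servers, bootstrap a new standby) can be sketched as a small control loop. The Python below is a minimal illustration and not the recipe used in the deployment. The callbacks stand in for real operations, and both the consecutive-missed-poll threshold and every name are assumptions.

```python
from typing import Callable


def run_standby(primary_is_alive: Callable[[], bool],
                promote_to_master: Callable[[], None],
                start_app_servers: Callable[[], None],
                bootstrap_new_standby: Callable[[], None],
                max_missed_polls: int = 3,
                max_polls: int = 100) -> bool:
    """Poll the primary; after `max_missed_polls` consecutive failures, fail over.

    Returns True if a failover was performed, False if polling ended without one.
    A real poller would sleep between polls; that is omitted to keep the sketch short.
    """
    missed = 0
    for _ in range(max_polls):
        if primary_is_alive():
            missed = 0  # any successful poll resets the failure count
            continue
        missed += 1
        if missed >= max_missed_polls:
            promote_to_master()      # step 2: the Postgres slave becomes the master
            start_app_servers()      # step 2: app servers start on this cloud
            bootstrap_new_standby()  # step 3: a new slave cloud in another region
            return True
    return False


if __name__ == "__main__":
    # Simulated primary: answers two polls, then stops responding.
    answers = iter([True, True, False, False, False])
    failed_over = run_standby(
        primary_is_alive=lambda: next(answers),
        promote_to_master=lambda: print("promoting slave to master"),
        start_app_servers=lambda: print("starting app servers"),
        bootstrap_new_standby=lambda: print("bootstrapping new standby cloud"),
    )
    print("failover performed:", failed_over)
```

Requiring several consecutive missed polls is a design choice that avoids failing over on a single dropped request. The right threshold depends on the target recovery time.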

DEMO ON EC2 - 5 MINUTES SETUP

/* Credentials - You must enter your
 * cloud provider account credentials */
user="ENTER_USER_HERE"
apiKey="ENTER_API_KEY_HERE"
keyFile="ENTER_KEY_FILE_HERE"
keyPair="ENTER_KEY_PAIR_HERE"

// Advanced usage
hardwareId="m1.small"
locationId="us-east-1"
linuxImageId="us-east-1/ami-1624987f"
ubuntuImageId="us-east-1/ami-82fa58eb"

SUMMARY

• AWS and outages
• Outage impact
• Disaster Recovery – it’s all about redundancy!
    • Cloning your environment – app stack
    • Cloning your DB – Replication
• Cloudify as a solution for Redundancy
    • Use recipes to work on any cloud
    • Fast and customized data replication
• Demo with Cloudify on EC2

QUESTIONS & ANSWERS

Thank You!
RonZ@gigaspaces.com