Availability Analysis for Deployment of In-Cloud Applications

Xiwei Xu, Qinghua Lu, Liming Zhu, Jim (Zhanwen) Li, Sherif Sakr, Hiroshi Wada, Ingo Weber

Software Systems Research Group, NICTA

ISARCS13, Vancouver

Slides at: http://www.slideshare.net/LimingZhu/

Motivation

• Uncertainties in the cloud are challenging for architecting critical applications and understanding availability
  – Shared resources, weak SLA guarantees and limited visibility
  – Rare but high-consequence events
  – Sporadic activities: upgrade, backup, recovery…
  – Subjective uncertainties: impact of configuration choices

• We want to explicitly model the above uncertainties in application availability analysis of cloud deployment
  – From a cloud consumer perspective
  – Focusing on mechanisms most relevant to critical applications: auto-scaling, over-provisioning, backup, recovery and maintenance

Contributions

• SRN (Stochastic Reward Net)-based availability models
• which allow you to specify:
  – Deployment architecture (application placements in VMs)
  – Node/aggregation-level SLAs from infrastructure providers
  – Auto-scaling policies and recovery strategies
  – Rare events: availability zone or region down
• which give you application availability levels of different options under different scenarios
• Model evaluation by analysing existing industry best practices in cloud application deployment
  – Quantifying the rule-of-thumb best practices
  – Comparing different (best) practices

Deployment Architecture Assumption

– Stateless VMs: auto-scaling groups
– Stateful VMs: hot standbys
– Backup at separate region for recovery

Availability Analysis Overview

• SRN-based models
  – Architecture model and recovery model in this paper
  – One SRN architecture model per availability zone
• Deployment decisions and patterns
  – Stateless/stateful application placement within VMs
  – Auto-scaling policies
  – Multi-zone configurations
• SLA from the cloud providers
  – Node level (Rackspace) or zone level (Amazon)
• Recovery strategy
  – Auto-regeneration of stateless VMs and different recovery mechanisms for stateful VMs
  – Different Recovery-Time/Point-Objective (RTO/RPO)
• Application-specific data
  – Stateless VM start-up time…
  – Stateful VM replication…

Stochastic Reward Net

• Stochastic Reward Net (SRN)
  – Stochastic Petri Net variant
  – Firing delays
  – Reward function
• Constructs
  – Places: VM states (Full, Running, Stopped, Failed)
  – Tokens: VMs
  – Transitions
    · Guard function
    · Transition rate: 1) frequency of events, 2) delay before the transition fires
• Reward function: if (#Running1>0) 1 else 0
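To make the reward function concrete, here is a minimal Python sketch. It is not the model from the talk. It assumes a single zone with `n` identical VMs that fail and are repaired independently at constant rates, which gives a simple birth-death chain over the number of running VMs. The names `steady_state`, `availability`, `fail_rate` and `repair_rate` are hypothetical. The expected steady-state value of the reward `#Running > 0` is then the availability.

```python
def steady_state(n, fail_rate, repair_rate):
    """Steady-state distribution over k = number of running VMs (0..n).

    Assumed birth-death chain: from k running VMs, a failure occurs at
    rate k * fail_rate and a repair at rate (n - k) * repair_rate.
    """
    weights = [1.0]  # unnormalised probability of k = 0
    for k in range(n):
        # Detailed balance between the states with k and k + 1 running VMs.
        ratio = (n - k) * repair_rate / ((k + 1) * fail_rate)
        weights.append(weights[-1] * ratio)
    total = sum(weights)
    return [w / total for w in weights]


def reward(running):
    """Reward function from the slide: 1 if at least one VM is running."""
    return 1 if running > 0 else 0


def availability(n, fail_rate, repair_rate):
    """Expected steady-state reward, i.e. the probability that #Running > 0."""
    probs = steady_state(n, fail_rate, repair_rate)
    return sum(p * reward(k) for k, p in enumerate(probs))
```

Under these assumptions the result agrees with the closed form 1 - (fail_rate / (fail_rate + repair_rate))^n. A full SRN adds guards, several places per VM and cross-zone interactions, which this sketch leaves out.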

SRN-based Availability Models

Availability Models: Auto-scaling

gScaleSelf1: if(#Running1<=#Running2 && #Stopped1>0) 1 else 0

gScaleOther1: if(#Running1>#Running2 && #Stopped2>0) 1 else 0
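The two guard expressions above can be evaluated against a marking (the token count in each place). The Python sketch below transcribes them literally. Representing the marking as a dictionary and the function names `g_scale_self1` and `g_scale_other1` are assumptions made for illustration. One plausible reading, not confirmed by the slides, is that the guards decide whether a scale-up triggered in zone 1 starts a stopped VM in zone 1 itself or in zone 2.

```python
def g_scale_self1(marking):
    """gScaleSelf1: if(#Running1<=#Running2 && #Stopped1>0) 1 else 0"""
    enabled = marking["Running1"] <= marking["Running2"] and marking["Stopped1"] > 0
    return 1 if enabled else 0


def g_scale_other1(marking):
    """gScaleOther1: if(#Running1>#Running2 && #Stopped2>0) 1 else 0"""
    enabled = marking["Running1"] > marking["Running2"] and marking["Stopped2"] > 0
    return 1 if enabled else 0


# Hypothetical marking: zone 1 has fewer running VMs and one stopped VM in reserve.
example = {"Running1": 1, "Running2": 2, "Stopped1": 1, "Stopped2": 0}
print(g_scale_self1(example), g_scale_other1(example))
```

The two guards cannot be enabled at the same time because their first conditions are mutually exclusive. Both are disabled when the relevant zone has no stopped VMs left.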

Availability Models: Stateful VM

Availability Models: Disaster Recovery

• Availability zone life cycle
  – Interact with the big architecture model
• Stateless VM recovery
  – Backup/AMI
• Stateful VM recovery
  – Backup
  – Replica
  – Hot standby

Case 1: Multi-zone Deployment

• Parameters
  – Amazon EC2 SLA of 99.95% availability
  – Zone fail rate: 0.00011, MTTR: 4.38 hours per year
  – Application-specific measurement of transitions

0.01% = 52.56 minutes of downtime per year

0.4% difference = 35 hours of downtime per year
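The downtime figures above follow from simple arithmetic on a year of 365 days. A minimal Python sketch of the conversion is given here. The function name `downtime_hours_per_year` is hypothetical, and leap years are ignored.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_hours_per_year(unavailability_percent):
    """Convert a percentage of unavailability into hours of downtime per year."""
    return unavailability_percent / 100 * HOURS_PER_YEAR


# 0.01% of a year, in minutes (about 52.56)
print(downtime_hours_per_year(0.01) * 60)
# 0.4% of a year, in hours (about 35.04)
print(downtime_hours_per_year(0.4))
# A 99.95% SLA leaves 0.05% unavailability, in hours (about 4.38)
print(downtime_hours_per_year(100 - 99.95))
```

The last line shows that the 4.38 hours per year quoted with the parameters corresponds to the 0.05% unavailability permitted by a 99.95% SLA.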

Case 2: Recovery across Availability Zone

• Industry rule of thumb: “Target auto-scale 30-60% until you have 50% headroom for load spikes. Lose an AZ leads to 90% utilisation.”
• Impact on overall availability?
• 30-60% vs. traditional 70-90%?
• Over-provisioning vs. auto-scaling?
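The 90% figure in the rule of thumb can be reproduced with a short calculation. The Python sketch below assumes that load is spread evenly across availability zones and that total load stays constant when a zone is lost. The choice of three zones is an assumption, since the slide does not state the zone count, and the function name `utilisation_after_zone_loss` is hypothetical.

```python
def utilisation_after_zone_loss(utilisation, zones, lost=1):
    """Utilisation of the surviving zones once `lost` zones are gone.

    Assumes identical zones, evenly balanced load, and unchanged total load.
    """
    if not 0 < lost < zones:
        raise ValueError("must lose at least one zone and keep at least one")
    return utilisation * zones / (zones - lost)


# With three zones at the upper 60% target, losing one zone gives about 0.9.
print(utilisation_after_zone_loss(0.60, zones=3))
# Starting from a traditional 80% target, the same loss gives about 1.2,
# that is, more load than the surviving zones can serve.
print(utilisation_after_zone_loss(0.80, zones=3))
```

Under these assumptions, a 30-60% target leaves enough headroom to absorb a zone failure, while a 70-90% target does not unless auto-scaling can add capacity quickly enough.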

Case 3: Disaster Recovery across Regions

• Trade-off between RPO and RTO
  – RPO: Recovery Point Objective
  – RTO: Recovery Time Objective

Yuruware — http://www.yuruware.com/

Conclusion and Future Work

• SRN-based availability models
  – Application-level availability
  – Highly configurable for different deployment architectures
  – Model different uncertainties and scenarios for critical systems
  – Quantify and compare choices and enable what-if analysis
  – Evaluated using industry best practices
• Future work
  – Better evaluation!
  – Integrated models on impact of upgrade, live migration, backup and subjective uncertainties (in IEEE Cloud 13)

Q. Lu, X. Xu, L. Zhu, L. Bass, et al., "Incorporating Uncertainty into in-Cloud Application Deployment Decisions for Availability," in IEEE Cloud 2013

Liming.Zhu@nicta.com.au
Slides available at http://www.slideshare.net/LimingZhu/
