This presentation is available for download from:...

35
Mission-Critical Enterprise Cloud Applications Eugene Ciurana Open-Source Evangelist CIME Software Labs http://ciurana.eu/contact This presentation is available for download from: http://ciurana.eu/TSSJSE2009

Transcript of This presentation is available for download from:...

Page 1: This presentation is available for download from: …0921ccnz33vh1pvcmg02.images.s3.amazonaws.com/cdn/... · 2009-10-05 · Case Study • Large consumer and services company •.Net

Mission-CriticalEnterprise Cloud

Applications

Eugene CiuranaOpen-Source Evangelist

CIME Software Labs

http://ciurana.eu/contact

This presentation is available for download from:http://ciurana.eu/TSSJSE2009

Page 2: This presentation is available for download from: …0921ccnz33vh1pvcmg02.images.s3.amazonaws.com/cdn/... · 2009-10-05 · Case Study • Large consumer and services company •.Net

About Eugene...

• 15+ years building mission-critical, high-availability systems

• 13+ years Java work

• Open source evangelist

• Official adoption of open source/Linux at Walmart worldwide

• State of the art tech for main line of business roll-outs

• Largest companies in the world• Retail• Finance• Oil• Background: robotics to on-line retail

Page 3: This presentation is available for download from: …0921ccnz33vh1pvcmg02.images.s3.amazonaws.com/cdn/... · 2009-10-05 · Case Study • Large consumer and services company •.Net

This presentation is about...

• How to design, implement, and roll-out cloud/enterprise hybrid applications

• Different cloud architectures

• PaaS• SaaS• Infrastructure as a service (IaaS)

• Technologies: ESB, clouds, mini-clouds, Java, App Engine, Chef!/Puppet, etc.

• How cloud technologies lower costs

• Understanding the advantages of:

• Cloud deployments• Hybrid deployments distributed in the cloud and enterprise

Page 4: This presentation is available for download from: …0921ccnz33vh1pvcmg02.images.s3.amazonaws.com/cdn/... · 2009-10-05 · Case Study • Large consumer and services company •.Net

What You’ll Learn

• Identify the applications best suited for cloud deployments

• Identify the advantages of Platform-as-a-Service or Software-as-a-Service resources

• Learn the caveats of cross-platform and cross-language integration

• Learn about high-performance alternatives to XML serialization for data exchange

• Learn how to structure event-based, stateless apps for maximum scalability

Page 5: This presentation is available for download from: …0921ccnz33vh1pvcmg02.images.s3.amazonaws.com/cdn/... · 2009-10-05 · Case Study • Large consumer and services company •.Net

What is the Cloud Anyway?

• Ask 10 different people, get 10 different answers

• In general, you may use 4 types of cloud offerings

• Platform as a Service• Software as a Service• Infrastructure as a Service• Pure infrastructure

• Some times you integrate pre-fabricated apps, some times platform, some times both

Page 6: This presentation is available for download from: …0921ccnz33vh1pvcmg02.images.s3.amazonaws.com/cdn/... · 2009-10-05 · Case Study • Large consumer and services company •.Net

Cloud Services Features

• Quick deployment of prepackaged components

• Uses commodity, virtualized hardware and network resources

• Amazon Elastic Cloud 2 (EC2) and Simple Storage Service (S3)

• Google App Engine (Python, Java)• Rackspace Cloud Services

• The overall model is “pay as you consume”

• Horizontal scalability is achieved by adding or removing resources as needed

• May host full applications or only services

Page 7: This presentation is available for download from: …0921ccnz33vh1pvcmg02.images.s3.amazonaws.com/cdn/... · 2009-10-05 · Case Study • Large consumer and services company •.Net

Cloud Services Features

• They could replace the data centre

• Basic administration moves to the application owner

• It may move away from the IT team - political fallout

• For the bean counters... it’s an operational expense!

• Tax advantages• Turn on or off as needed• In a tight economy, IT infrastructure ends up under the CFO -

give the guy options

• Assuming sensible SLAs, the ROI is better than for co-located or company-owned data centres

Page 8: This presentation is available for download from: …0921ccnz33vh1pvcmg02.images.s3.amazonaws.com/cdn/... · 2009-10-05 · Case Study • Large consumer and services company •.Net

• What do scalability and high-availability mean?

• Scalability: the property of a system to handle bigger amounts of work or to be easily expanded in response to increased demand

• Vertical• Horizontal

Dual Core

Single Processor

16 GB RAM

Virtual Node 0

Virtual Node 2

Virtual Node 1

Dual Core

Dual Processor

32 GB RAM

Virtual Node 0

Virtual Node 2

Virtual Node 1

Scalability and High Availability

Scales up

Virtual Node 3 Node Node Node

Load Balancer

Scales

out

Node Node Node

Load Balancer

Node

Page 9: This presentation is available for download from: …0921ccnz33vh1pvcmg02.images.s3.amazonaws.com/cdn/... · 2009-10-05 · Case Study • Large consumer and services company •.Net

Scalability and High Availability

• Availability: how the system provides useful resources over a set period of time

• Uptime != availability• Many factors affect it: network, storage, processor, etc.

Availability Downtime90% 36.5 days99% 4 days

99.9% 9 hours99.99% 53 minutes

99.999% 5.3 minutes99.9999% 32 seconds

•Amazon S3 Outage (summer 2008)•Major companies affected (LeapFrog, eBay, Apple, many others)•Amazon’s SLA? “Sorry - it won’t happen again.”

Can you affordthis?

Page 10: This presentation is available for download from: …0921ccnz33vh1pvcmg02.images.s3.amazonaws.com/cdn/... · 2009-10-05 · Case Study • Large consumer and services company •.Net

SLAs and Availability

• Service Level Agreements drive the architectural and technological decisions

• Cloud systems’ scalability characteristics are often pitched

• What is the availability?

• What is the impact on business?• How do you perform disaster recovery?

• What service level are the vendors offering?

• The higher the availability (3-nines, 4-nines, 5-nines), the higher the cost

• Co-lo, data centre deployments still make more sense for 4-nines availability and above

Page 11: This presentation is available for download from: …0921ccnz33vh1pvcmg02.images.s3.amazonaws.com/cdn/... · 2009-10-05 · Case Study • Large consumer and services company •.Net

Platform and Software as a Service

• Identify the type of implementation after establishing the availability requirements

• From a vendor?• In-house development?

• Software as a Service (SaaS)

• The vendor provides the full application and is responsible for scaling it and meeting SLAs

• Integration with the vendor via web services or EDI• Examples: GSI Commerce, Salesforce.com

• Platform as a Service (PaaS)

• The vendor provides the operating system and/or application stack

• The vendor provides a well-defined API for interacting with the system

• Examples: Amazon Web Services, Google App Engine, Nirvanix

Page 12: This presentation is available for download from: …0921ccnz33vh1pvcmg02.images.s3.amazonaws.com/cdn/... · 2009-10-05 · Case Study • Large consumer and services company •.Net

PaaS Systems

• Amazon EC2 and S3

• Virtual images with various Intel-based operating systems, app servers, databases, etc.

• Billed hourly + bandwidth• Java, PHP, Ruby, other technologies supported• “Data centre in the sky”• Lower cost than Rackspace or Nirvanix

• Weaker SLAs?

• Google App Engine

• The apps are written as callbacks from a virtualized service• Datastore ~= database; Memcache ~= caching; most

interaction via HTTP• Billed on usage• “You run on the same infrastructure as Google”• Low cost, still evolving, minimal SLAs

Page 13: This presentation is available for download from: …0921ccnz33vh1pvcmg02.images.s3.amazonaws.com/cdn/... · 2009-10-05 · Case Study • Large consumer and services company •.Net

A Hybrid Cloud Architecture

• Many mission-critical systems will live behind the corporate firewall

• The cloud is used for high-load applications and services

• The clouds applications work independently of the data center applications, and vice versa

• Loose coupling over web services• Assume that the data in the cloud is “throwaway” - it’s either

consolidated or sourced in the data centre• The cloud exists mostly for distribution and scalability

Page 14: This presentation is available for download from: …0921ccnz33vh1pvcmg02.images.s3.amazonaws.com/cdn/... · 2009-10-05 · Case Study • Large consumer and services company •.Net

A Hybrid Cloud Architecture

services.company.com

Node Node Node Node

Legacy Back-end

Enterprise

Database

Enterprise Service Bus

User

services.company.com

Firewall

Amazon S3http://media.company.com

PNG PNG PNG PNG

PNG PNG PNG PNG

Google App Enginehttp://www.company.com

Node Node Node

Node Node Node

Datastore Datastore

Page 15: This presentation is available for download from: …0921ccnz33vh1pvcmg02.images.s3.amazonaws.com/cdn/... · 2009-10-05 · Case Study • Large consumer and services company •.Net

A Hybrid Cloud Architecture

End-User System (Mac, Windows)

Connected

App

Web

Browser

USB

Internet

S3

Content

RepositoryThird-party

Partner Site

www.leapfrog.comconnected

productsLearningPath

Firewall

Mule ESB backboneHTTP, SOAP (CXF), REST, etc. routing, filtering, and dispatching; ActiveMQ JMS broker; dedicated services

Mule ESBConnected products SOAP, REST

web services

Mule ESBDevice log upload, processing,

servlet container

Content

Management

System

REST, JCR

Device

Logs

Crowd SSO

Customer

Data

Game

play

Data

Content

Authoring

User

Credentials

Servlets

App Logic

Page 16: This presentation is available for download from: …0921ccnz33vh1pvcmg02.images.s3.amazonaws.com/cdn/... · 2009-10-05 · Case Study • Large consumer and services company •.Net

Which Parts Go To The Cloud?

� � � � � � � � � � � � $ $ � � � � * � % � # � � � � � # & % � � � � � � � $ ! � % � � � � � � � " & � & � � � � � � ' � � % $

� % � # � � %

� � � � � � � � � � � #

� ! ! � � � � % � � �� � # ' � #

� � � � % � �� � # ' � � � $ � � # ( )

� ! ! � � � � % � � �� � # ' � #

� � � � % � �

� � � � � � � � � � � # � � � � � � � � � �

� & � � � � � � � & � � � � � � � & � � � � � � � & � � � � � �

� � � � � � � � � � � # � � � � � � � � � �

� & � � � � � �� � � � � � � � � �

� & � � � � � �� � � � � � � � � �

� � % � � � $ �

� � � � � � � � � � � # � � � & � � ) � � �

� & � � � � � �$ � # ' � � % � � � � � �

� & � � � � � �$ � # ' � � % � � � � � �

�$ � � # �

� � � � � � � � � � � # � � � � � � $ $ � � � � � # � � #

� � % � ' � � � � � % � ' � � �

�$ � � # �

Page 17: This presentation is available for download from: …0921ccnz33vh1pvcmg02.images.s3.amazonaws.com/cdn/... · 2009-10-05 · Case Study • Large consumer and services company •.Net

Which Parts Go To The Cloud?

• Depending on the cloud configuration, load balancing may not be necessary

• Amazon provides Elastic Load Balancers and Elastic IP• Google App Engine provides stateless calls only

• The client itself or Memcache keep state instead• Rackspace provides either “elastic” addresses or explicit load

balancers

• Enterprise Service Bus and other integration may exist in both the cloud and the data centre

• Message routing is OK is if latency is below the SLA requirements

• Strategic partner, application, and services integration may occur 100% on the cloud, with data consolidation behind the firewall

Page 18: This presentation is available for download from: …0921ccnz33vh1pvcmg02.images.s3.amazonaws.com/cdn/... · 2009-10-05 · Case Study • Large consumer and services company •.Net

Case Study

• Large consumer and services company

• .Net / Windows servers infrastructure

• ASP.Net• SQL Server

• Two-tier architecture

• The system “just grew” and became a business

• Server pages, services, media

• 6-month scalability project began in April

• Implementation complete 25.Sep of the same year

Page 19: This presentation is available for download from: …0921ccnz33vh1pvcmg02.images.s3.amazonaws.com/cdn/... · 2009-10-05 · Case Study • Large consumer and services company •.Net

Case Study - Objectives

• Stable architecture defined by July

• Low cost

• Build scalability wherever possible

• Optimal data transfer rate for all properties

• Web systems• Media properties

• Meet business requirements and SLAs

• Hard dates set by the business, no input from the technology team

• Milestones: 1.July, 10.Oct

Page 20: This presentation is available for download from: …0921ccnz33vh1pvcmg02.images.s3.amazonaws.com/cdn/... · 2009-10-05 · Case Study • Large consumer and services company •.Net

Case Study - Initial Configuration

• System grew “as needed” - no planning

• Combines end-user and internal applications in the same server

• It only scales vertically -- and not very well

• Applications and database are tightly coupled

• Application servers and services tightly coupled

• Single environment: from development to production without passing Go nor collecting $200

• Disaster recovery?

• 4 hours minimum• From (stale?) backups

Page 21: This presentation is available for download from: …0921ccnz33vh1pvcmg02.images.s3.amazonaws.com/cdn/... · 2009-10-05 · Case Study • Large consumer and services company •.Net

rsync

Linux

UDP

Services

Windows

Case Study - Initial Configuration

Zone Server

ClientClientClient

Client

Internet

Applications

.Net

Services

.Net

SQL

Server

UDP

Services

Windows

UDP

Services

Windows

UDP

Services

Windows

Master, IRC

Linux

� � � �� � � � �

Page 22: This presentation is available for download from: …0921ccnz33vh1pvcmg02.images.s3.amazonaws.com/cdn/... · 2009-10-05 · Case Study • Large consumer and services company •.Net

Case Study - Phase I Scalability

Page 23: This presentation is available for download from: …0921ccnz33vh1pvcmg02.images.s3.amazonaws.com/cdn/... · 2009-10-05 · Case Study • Large consumer and services company •.Net

Case Study - Phase I Scalability

• Introduces a CDN for asset delivery

• Amazon S3 (optional Cloud Front) for asset delivery• Reduces load on company servers and bandwidth costs

• Introduces database replication for production environments

• Establishes a continuous integration environment

• Set tools for hybrid UNIX/Windows environment• UNIX tools take precedence; Cygwin everywhere

• Improved build/release process

Page 24: This presentation is available for download from: …0921ccnz33vh1pvcmg02.images.s3.amazonaws.com/cdn/... · 2009-10-05 · Case Study • Large consumer and services company •.Net

Case Study - Phase I Scalability

Co

nti

nu

ou

s I

nte

gra

tio

n S

erv

er

Ve

rsio

n C

on

tro

l S

ys

tem

, A

uto

ma

tic

Bu

ild

s,

Un

it T

es

tsServices

services HTTP other

proxy

.Net Admin App Server

.Net App Server

.Net Server

Services

Load balancer

Databases DatabasesIRC

Production

Services

services HTTP other

proxy

.Net Admin App Server

.Net App Server

.Net Server

Services

Load balancer

DatabasesIRC

Staging

Services

services HTTP other

proxy

.Net Admin Server

.Net App Server

.Net Server

Databases

IRC

QA

Services

services HTTP other

proxy

.Net Admin Server

.Net App Server

.Net Server

Databases

IRC

Development

Developer 0 Developer 1 Developer 2 Developer 3

Deployment

DeploymentDeployment

CDNCDN

CDN

Deployment

CDNCDN

CDN

CDNCDN

CDN

CDN

Page 25: This presentation is available for download from: …0921ccnz33vh1pvcmg02.images.s3.amazonaws.com/cdn/... · 2009-10-05 · Case Study • Large consumer and services company •.Net

Case Study - Phase I Scalability

• Fail-over with traditional database replication techniques

Partition 0

Primary Cluster

Node 0 Node 1

DB 0

DB 0b

Partition 1

DB 1

DB 1b

ESB as app services provider

Page 26: This presentation is available for download from: …0921ccnz33vh1pvcmg02.images.s3.amazonaws.com/cdn/... · 2009-10-05 · Case Study • Large consumer and services company •.Net

Case Study - Phase II Cloud Deployment

Page 27: This presentation is available for download from: …0921ccnz33vh1pvcmg02.images.s3.amazonaws.com/cdn/... · 2009-10-05 · Case Study • Large consumer and services company •.Net

Case Study - Phase II Cloud Deployment

• Web applications move to a uniform technology (.Net)

• The database and stored procedures are normalized and optimized

• Applications use common resources via Mule ESB and services

• No more direct calls from apps to databases• Business logic is implemented as stateless POJOs

• Software stack is “best of breed”

• Windows, other servers driven by business requirements• Open source software wherever possible

Page 28: This presentation is available for download from: …0921ccnz33vh1pvcmg02.images.s3.amazonaws.com/cdn/... · 2009-10-05 · Case Study • Large consumer and services company •.Net

Case Study - Phase II Cloud Deployment

• Web and other RPC services must coexist

• Different partners use different protocols• Mule transformers take care of all the interfaces so that

development may scale / continue independently of what happens in other layers

• Bandwidth can be expen$ive!

• Data exchange protocols

• Clients: custom, XML, JSON• Images: HTTP, S3• Cloud-to-HQ: custom, XML/SOAP, protocol buffers• HQ data centre: XML/SOAP, protocol buffers

• Replication strategy: data centre

• The cloud isn’t trustworthy yet

Page 29: This presentation is available for download from: …0921ccnz33vh1pvcmg02.images.s3.amazonaws.com/cdn/... · 2009-10-05 · Case Study • Large consumer and services company •.Net

Case Study - Phase II Cloud Deployment

• Deployment involves using an Amazon Machine Image (AMI)

• Use existing ones from Amazon or third parties• Configure your own to meet your software requirements

• AMIs need a post-configuration step in a load-balanced environment

• Unique file processing (/etc/hosts, /etc/hostname, app configuration, etc.)

• Necessary intermediate step between launching the AMI and having a useful machine

• Elastic Load Balancer and Elastic IP limitations

• ELB only works for dynamic IP addresses• ELIPs are limited to 5 per account

Page 30: This presentation is available for download from: …0921ccnz33vh1pvcmg02.images.s3.amazonaws.com/cdn/... · 2009-10-05 · Case Study • Large consumer and services company •.Net

Case Study - Phase II Cloud Deployment

legacyservices01

Web services

EvolverSP

Dimenxian

EvolverMP

DimensionMP

legacyservices02

Web services

EvolverSP

Dimenxian

EvolverMP

DimensionMP

balance

services01

Web services

Proxy

Routing

Queuing

services02

Web services

Proxy

Routing

Queuing

master01

IRC

game server list

174.129.238.159

master02

IRC

game server list

services

syncserver01

rsync

repository

balancesyncserver

database

master

database

slave

balance

web01

balance

web02

los01

balance

los02

losgamesvr01

balance

losgamesvr02

ep01

FRONT END

balance

ep02

FRONT END

syncserver02

rsync

repository

174.129.238.110

174.129.238.126

174.129.238.129

174.129.238.145

174.129.238.184

174.129.238.189

replicate

174.129.239.18174.129.239.11

game server

174.129.239.42

game server

174.129.239.45

game server

174.129.239.57

game server

174.129.239.68

balance

World

ep01

SERVICES

balance

ep02

SERVICES

174.129.239.41

downloads

S3174.129.238.188

app-x server

174.129.239.81

QA

DNS

174.129.15.199

legacyservicesdatabase

web

IMUService

IMUService

IMUService

Page 31: This presentation is available for download from: …0921ccnz33vh1pvcmg02.images.s3.amazonaws.com/cdn/... · 2009-10-05 · Case Study • Large consumer and services company •.Net

Case Study - Phase II Cloud Deployment

Setup Monthly Bandwidth Total / year

Production $1,300 $5,5580 $2,000 $70,260

Staging $1,300 $5,5580 $ 0 $68,260

QA $ 600 $1,450 $ 0 $18,000

Dev $ 600 $1,450 $ 0 $18,000

Total $174,520

Traditional Hosted Environments

Page 32: This presentation is available for download from: …0921ccnz33vh1pvcmg02.images.s3.amazonaws.com/cdn/... · 2009-10-05 · Case Study • Large consumer and services company •.Net

Case Study - Phase II Cloud Deployment

Setup Monthly Bandwidth Total / year

Production $ 0 $4,320 $1,200 $66,240

Staging $ 0 $4,320 $ 0 $51,840

QA $ 0 $1,080 $ 0 $12,960

Dev $ 0 $ 0 $ 0 $ 0

Total $131,040

Amazon EC2 Configuration

$43,000 cheaper

Page 33: This presentation is available for download from: …0921ccnz33vh1pvcmg02.images.s3.amazonaws.com/cdn/... · 2009-10-05 · Case Study • Large consumer and services company •.Net

Case Study - Phase III Final Deployment

Page 34: This presentation is available for download from: …0921ccnz33vh1pvcmg02.images.s3.amazonaws.com/cdn/... · 2009-10-05 · Case Study • Large consumer and services company •.Net

Case Study - Phase III Final Deployment

services01

Web services

Proxy

Routing

Queuing

master01

IRC

game server list

174.129.238.159

master02

IRC

game server listservices

syncserver01

rsync

repository

balancesyncserver

database

master

database

slave

balance

web01

balance

web02

los01

balance

los02

losgamesvr01

balance

losgamesvr02

ep01

FRONT END

balance

ep02

FRONT END

syncserver02

rsync

repository

174.129.238.110

174.129.238.126

174.129.238.129

174.129.238.145

174.129.238.184

174.129.238.189

replicate

174.129.239.18

game server

174.129.239.42

game server

174.129.239.45

game server

174.129.239.57

game server

174.129.239.68

balance

World

downloads

S3174.129.238.188

app-x server

174.129.239.81

QA

DNS

174.129.15.199

database

web

services01

Web services

Proxy

Routing

Queuing

services01

Web services

Proxy

Routing

Queuing

services01

Web services

Proxy

Routing

Queuing

Page 35: This presentation is available for download from: …0921ccnz33vh1pvcmg02.images.s3.amazonaws.com/cdn/... · 2009-10-05 · Case Study • Large consumer and services company •.Net

Thanks for Coming!

Questions?

Wanna know more about real life cloud, scalable systems?

http://ciurana.eu/scalablesystemsSubscribe to the newsletter!

Eugene CiuranaOpen source [email protected]+41 44 586 8462