Azug - successfully breeding rabits

Post on 31-Oct-2014


Yves Goeleven, Capgemini
@YvesGoeleven
http://cloudshaper.wordpress.com

Azug

Successfully breeding rabits

Agenda

• Introduction
• Rabits?
• Cloud Power
• The Weakest Link
• Understanding Capacity
• Self Everything

Rabits?

• Rapid bits
  – Small public apps like websites & phone apps
  – They want to live outside, in the wild
  – They need to get there fast
  – Once they are there, they’ll need some space to multiply, scale
  – And they move on quickly

Examples

• Apps
  – Personal & work related
• Branding websites
  – Product launches, special campaigns
• Predictable big events
  – Olympics, elections
• Unpredictable events
  – Disasters, terrorist attacks
  – Celebrity deaths

Business context

• The world has changed over the past decade
  – Consumerization of IT
  – Technology in everyday life
  – Globalization
• New and large scale business opportunities appear
  – 2.1 billion internet users
  – 4.6 billion phones
  – 1.4 billion households with TVs
• Elasticity needed as demand varies wildly

Key Success Factors

• Global
• Short time to market
• Performant
• Highly scalable
• Highly available
• Relatively cheap
• Easy

How to prevent road kill?

Good thing rabits have cloud power


Fast Time To Market - Services

• Scalable compute and storage
• Virtual Network
• Marketplace

• Relational storage in the cloud
• Reporting and synchronization
• Automated database management

• Connect through network boundaries
• Easy authorization to applications
• Caching & Workflow & …

Global - Spread your rabits all over the world

6 datacenters across 3 continents. Simply select your data center of choice and deploy your application in minutes.

• North America Region: N. Central – U.S., S. Central – U.S.
• Europe Region: N. Europe, W. Europe
• Asia Pacific Region: E. Asia, S.E. Asia

Performance - Impact of global

[Figure: latency labels of 50ms and 100ms]

Traffic manager to the rescue


The weakest link

• Overall scalability and availability
  – Limited by the weakest component
• If the backend can only handle 30 users
  – It doesn’t matter that the front-end could handle 1,000,000

The weakest link

• Typically the weakest link is one of the following:
  – Integration points
  – Data stores
  – Long processes

Remember

• Everything has limits!
• Including Azure resources
  – Storage account: 5000 requests/sec
  – Storage container: 500 requests/sec
  – Bandwidth depending on instance size
  – Etc...
• Luckily you can get multiple of these
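
One way to use "multiple of these" is to spread keys over several storage accounts. The following is only a sketch: the account names are hypothetical, the 5000 requests/sec figure is the one quoted above, and it assumes load is spread evenly across keys.

```python
import hashlib

# Hypothetical account names; real names would come from configuration.
STORAGE_ACCOUNTS = ["rabitstore0", "rabitstore1", "rabitstore2", "rabitstore3"]


def pick_account(key, accounts=STORAGE_ACCOUNTS):
    """Map a key to one account with a stable hash, so the same key
    always lands on the same account."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(accounts)
    return accounts[index]


def accounts_needed(peak_requests_per_sec, limit_per_account=5000):
    """Ceiling division: how many accounts keep each one under its limit,
    assuming requests are spread evenly."""
    return -(-peak_requests_per_sec // limit_per_account)
```

With these assumptions, a peak of 12000 requests/sec would need three accounts. A hot key still hits a single account, so this only helps when traffic is spread over many keys.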

But what if you can’t?

• Keep them out of the critical path
  – Cache view model data or output
  – Queue commands

Cache

• Windows Azure AppFabric Cache
  – Distributed cache
• Reduces queries
  – To less scalable components
• Store data close to the app
  – Otherwise the whole point is moot

Queued command processing

• Avoid being swarmed by incoming commands
  – Use a queue to throttle
• Handle commands at a controlled speed
  – That of the least scalable component
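
The throttling idea can be sketched with an in-process queue. This is an illustration under assumptions, not the Azure Queue storage API: `run_throttled_worker`, `handle` and `max_per_second` are hypothetical names, and Python's standard `queue.Queue` stands in for the real queue.

```python
import queue
import threading
import time


def run_throttled_worker(commands, handle, max_per_second, stop):
    """Drain the queue no faster than max_per_second, however fast
    commands arrive on the other end."""
    interval = 1.0 / max_per_second
    while not stop.is_set():
        try:
            command = commands.get(timeout=0.1)
        except queue.Empty:
            continue  # nothing queued, check the stop flag again
        started = time.monotonic()
        handle(command)
        commands.task_done()
        # Sleep off whatever is left of this command's time slot.
        remaining = interval - (time.monotonic() - started)
        if remaining > 0:
            time.sleep(remaining)
```

The front-end only calls `commands.put(...)` and returns immediately. The worker thread is started with `threading.Thread(target=run_throttled_worker, args=(commands, handle, rate, stop))`, where the rate is set to what the least scalable component can sustain.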

Recommended Architecture: CQRS

[Diagram: CQRS flow across a Web role and a Worker role. Labels: Queries, Commands, Storage, Handler, Input Validation, Validation Rules, Queries, Cache, View Model Updater, Publish, Persistence]
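
A minimal sketch of the command/query split, assuming in-memory stand-ins for storage and publishing. All class and event names here (`RegisterRabit`, `WriteSide`, `ReadSide`, `RabitRegistered`) are hypothetical, and the view model is updated synchronously for brevity, where a real deployment would more likely do it asynchronously in a worker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegisterRabit:
    """A command: a request to change state (hypothetical example)."""
    name: str


class WriteSide:
    """Validates commands, persists them, then publishes an event."""

    def __init__(self, publish):
        self._store = []  # stand-in for persistent storage
        self._publish = publish

    def handle(self, command):
        if not command.name.strip():
            raise ValueError("name must not be empty")
        self._store.append(command.name)
        self._publish({"type": "RabitRegistered", "name": command.name})


class ReadSide:
    """Keeps a precomputed view model, so queries never touch the
    write-side store."""

    def __init__(self):
        self._view = {"count": 0, "names": []}

    def on_event(self, event):
        if event["type"] == "RabitRegistered":
            self._view["names"].append(event["name"])
            self._view["count"] += 1

    def query_overview(self):
        # Return a copy so callers cannot mutate the view model.
        return {"count": self._view["count"], "names": list(self._view["names"])}
```

Wiring the two sides is one line: `WriteSide(publish=read_side.on_event)`. The point of the split is that the query path can be cached and scaled independently of the command path.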

Recommended Architecture: EDA

[Diagram: event generators append to an event stream over time; event consumers read from the stream]
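
As a rough illustration of the generator / stream / consumer split, the sketch below models the stream as an append-only list in which every consumer tracks its own read position. `EventStream` and its methods are hypothetical names, not part of any Azure or NServiceBus API.

```python
class EventStream:
    """Append-only, time-ordered list of events. Each consumer keeps its
    own offset, so consumers read independently of one another."""

    def __init__(self):
        self._events = []
        self._offsets = {}  # consumer name -> index of next unread event

    def append(self, event):
        """Called by event generators."""
        self._events.append(event)

    def read_new(self, consumer_name):
        """Called by event consumers: return events not yet seen by this consumer."""
        start = self._offsets.get(consumer_name, 0)
        new_events = self._events[start:]
        self._offsets[consumer_name] = len(self._events)
        return new_events
```

Because generators never wait for consumers, a slow consumer only falls behind in the stream instead of slowing down the component that produced the event.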

Side effects of these architectures

• Caches need to be updated regularly
  – Time based
  – Event based
• User interface must be adapted
  – Task orientation required
  – ISO 9241-151 requires this anyway
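
The two cache update styles can be combined in one small class. This is a sketch under assumptions: an in-process dictionary stands in for the distributed cache, and `RefreshingCache`, `load`, `ttl_seconds` and `invalidate` are hypothetical names.

```python
import time


class RefreshingCache:
    """Cache that refreshes entries in two ways: time based (entries
    expire after a TTL) and event based (invalidate on a change event)."""

    def __init__(self, load, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load      # slow lookup against the less scalable component
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}     # key -> (expires_at, value)

    def get(self, key):
        entry = self._entries.get(key)
        now = self._clock()
        if entry is not None and entry[0] > now:
            return entry[1]    # still fresh: no backend query
        value = self._load(key)
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        """Event-based update: call this when the underlying data changes."""
        self._entries.pop(key, None)
```

Time-based expiry bounds how stale a value can get without any coordination, while event-based invalidation keeps values fresh but requires the write side to publish changes.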

What if things break?

• Make sure you have a backup instance!
• Fabric controller
  – At least 2 instances in separate fault domains
• Traffic manager
  – Spread over multiple datacenters
• Azure storage
  – Automatically replicated across datacenters
• SQL Azure
  – Replicate using data sync

Multiple instances

• Don’t rely on machine dependencies
  – Avoid reliance on memory (except as cache)
  – Session state is evil
  – WCF default WSDL addressing behavior
  – Ensure encryption algorithms use service certificates
  – ...

Technology can help

• Windows Azure Tech
  – Queue storage
  – AppFabric Service Bus
• Framework Tech
  – NServiceBus
  – SignalR


Keeping it cheap

• Understanding capacity
  – Pay for what you can ‘potentially’ use, aka the capacity
  – Instances are baskets of capacity: CPU, Memory, …
  – Ensure everything is efficiently used before scaling out

Compute Instance Size   CPU           Memory    Instance Storage   I/O Performance       Cost per hour
Extra Small             1.0 GHz       768 MB    20 GB              Low (5 Mbps)          $0.05
Small                   1.6 GHz       1.75 GB   225 GB             Moderate (100 Mbps)   $0.12
Medium                  2 x 1.6 GHz   3.5 GB    490 GB             High (200 Mbps)       $0.24
Large                   4 x 1.6 GHz   7 GB      1,000 GB           High (400 Mbps)       $0.48
Extra Large             8 x 1.6 GHz   14 GB     2,040 GB           High (800 Mbps)       $0.96

Example

• 1 XS web role instance (1 GHz, 768 MB, 5 Mbps)
  – Dynamic home page but with relatively static content
• Limited to 50 concurrent users, yet only
  – 10% CPU used
  – 80% Memory used (by OS)
  – Plenty of free disk space
  – Limited by bandwidth IO
• Scaling out to 2 instances
  – Moves the tipping point
  – But wastes 90% CPU and 20% memory
  – Twice
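
The arithmetic behind this example can be written out as follows. The CPU, memory and user figures are taken from the slide; treating bandwidth as 100% used is an assumption inferred from "limited by bandwidth IO", and the function names are hypothetical.

```python
# Percent of each resource in use on one XS instance at 50 concurrent users.
# Bandwidth at 100% is an assumption: the slide only says it is the limit.
XS_UTILISATION = {"cpu": 10, "memory": 80, "bandwidth": 100}
USERS_PER_INSTANCE = 50


def find_bottleneck(utilisation):
    """Return the resource closest to its limit."""
    return max(utilisation, key=utilisation.get)


def idle_capacity(utilisation):
    """Percentage of each resource that is paid for but unused."""
    return {name: 100 - used for name, used in utilisation.items()}


def scale_out(instances):
    """Users served and idle share per instance when scaling out
    without removing the bottleneck."""
    return {
        "users": instances * USERS_PER_INSTANCE,
        "idle_per_instance": idle_capacity(XS_UTILISATION),
    }
```

Under these figures, two instances serve 100 users, but each one still leaves 90% of its CPU and 20% of its memory idle, which is the waste the slide points at.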


Demo: Hammering the rabit

Offload static content

• Better to remove the bottleneck
  – In this case IO
• Offload static content to
  – Blob storage, CDN
• Leaves more power to handle dynamic workload
  – Increases number of users served
  – Better utilization of CPU & Memory
  – Relative to bandwidth

CDN = Content Delivery Network

• Content cache near internet edges (24 datacenters), static content close to user

• Great response times, > 200% performance improvement in my test

Cache, cache, cache

• The internet has multiple levels of cache
  – Browser & proxy cache
  – Kernel & output cache
  – Memory
  – Windows Azure AppFabric Cache
• Ensures low latency
  – Memory is faster than IO
  – Less time waiting for IO
  – Means more resources to handle requests


Demo: Hammering the rabit again

Balance your workloads

• Visual Studio projects push you into a 1 logical role = 1 physical role instance mindset
  – Website = web role, background process = worker role
  – Becomes expensive and wastes a lot of capacity
• Combine different types of workload in the same web role instance
  – Website (bandwidth heavy)
  – Background process (CPU heavy)
• Immediate 50% cost reduction!

[Diagram: one Web Role Instance hosting both the Website and the Background Process]
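
The 50% figure follows from simple arithmetic if one assumes a single web role and a single worker role of the same size and instance count. That setup is an assumption made for illustration, as are the function names; the $0.12 price is the Small instance from the pricing table.

```python
def hourly_cost_cents(role_count, instances_per_role, price_cents_per_hour):
    """Total hourly cost of a deployment, in cents to avoid float rounding."""
    return role_count * instances_per_role * price_cents_per_hour


def saving_percent(before_cents, after_cents):
    """Relative cost reduction as a whole percentage."""
    return (before_cents - after_cents) * 100 // before_cents


# Assumed setup: Small instances at 12 cents/hour, 2 instances per role
# so that each role spans two fault domains.
separate = hourly_cost_cents(role_count=2, instances_per_role=2, price_cents_per_hour=12)
combined = hourly_cost_cents(role_count=1, instances_per_role=2, price_cents_per_hour=12)
```

Here `separate` is 48 cents/hour and `combined` is 24 cents/hour, so `saving_percent(separate, combined)` gives 50. The saving only holds while the combined instances have enough spare capacity for both workloads.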

Monitor your roles

• Ideally all roles operate at 80% overall capacity utilisation
  – Leaves room for sudden peaks
  – Still efficient use of the capacity you rented
• Monitoring your roles is key
  – Add performance counters for CPU, Memory, …
  – Store measurements in Windows Azure Storage
• On-premises monitoring software
  – Polls storage for metrics
  – E.g. Cerebrata Diagnostics Manager

The holy grail

• Smart auto scaling & dynamic workload allocation

[Diagram: three roles, each with CPU, Memory, Disk and Bandwidth gauges. Caption: Scale out at 80%]
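
A scale-out rule around the 80% mark could look like the sketch below. It is an assumption-laden illustration: `instances_needed` is a hypothetical function, utilisation is expected as whole percentages averaged over the current instances, and load is assumed to redistribute evenly when the instance count changes.

```python
def instances_needed(current_instances, utilisation_percent, target_percent=80):
    """Return the instance count that brings the busiest resource back to
    at most target_percent, assuming load spreads evenly over instances."""
    busiest = max(utilisation_percent.values())
    total_load = busiest * current_instances
    # Ceiling division on integers, with a floor of one instance.
    return max(1, -(-total_load // target_percent))
```

For example, two instances whose busiest resource runs at 90% would be scaled out to three, while two instances at exactly 80% stay at two. The same function also suggests scaling in when utilisation drops, which goes slightly beyond what the slide states.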


Issues of scale

• Rabits join millions of people all over the world
• Some traditional tasks suddenly become very hard
• How do you handle these at that scale?
  – End user training
  – Helpdesk & support
  – User acceptance tests
  – …

Self everything

• Self service
  – Sign up, pay, use, maintain…
• Self marketing
  – Use the power of social networks
• Self supporting
  – Easy to use, inductive UI
  – Build a community for support
• Self educating, testing
  – Offer early betas to the public
  – Provide means for feedback


Questions?