Handling Flash Crowds from your Garage

Peter Ward, May 1, 2009



Motivation
• Web Development
• Cloud Computing


Purpose
• To inform and explain
– ways to deploy robust web services;
– the differences between these ways;
– how they work in real life.


Key Information
• The Problem
• The Solution
• Scaling Technologies
• Flash Crowd Experiences


The Problem
• Innovation by small companies and individuals.
• Specific need + small website = Web Service
• Gain popularity through Slashdot, Digg and Reddit.
• But large crowds crash servers, and the service loses users.
• More powerful servers can handle more users, but they cost more.
• The problem is often not processing speed, but disk retrieval speed and bandwidth.


A Distributed Solution
• “Utility Computing”: Resources are utilities, so you only need to pay for what you use (e.g. processing time, bandwidth, memory and disk space).
• No powerful servers - just “virtual computers”.
• Low cost for low demand, high cost for high demand - scalable.
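
The pay-per-use idea can be illustrated with a small cost comparison. This is only a sketch: the hourly price, the fixed monthly price and the per-machine capacity are hypothetical numbers chosen for illustration, not figures from the talk or from any provider.

```python
import math

# Hypothetical prices and capacity, for illustration only.
VM_PRICE_PER_HOUR = 0.10        # cost of one rented virtual machine per hour
FIXED_SERVER_PER_MONTH = 400.0  # cost of one large dedicated server per month
REQUESTS_PER_VM_HOUR = 100_000  # requests one virtual machine can serve per hour


def vms_needed(requests_in_hour: int) -> int:
    """Number of virtual machines needed for one hour of traffic (at least one)."""
    return max(1, math.ceil(requests_in_hour / REQUESTS_PER_VM_HOUR))


def utility_cost(hourly_requests: list[int]) -> float:
    """Total pay-per-use cost: each hour is billed only for the machines it needed."""
    return sum(vms_needed(r) * VM_PRICE_PER_HOUR for r in hourly_requests)


# A quiet month (720 hours) with a six-hour flash crowd in the middle.
quiet = [5_000] * 714
crowd = [900_000] * 6
month = quiet[:357] + crowd + quiet[357:]

print(f"utility cost: ${utility_cost(month):.2f}")
print(f"fixed cost:   ${FIXED_SERVER_PER_MONTH:.2f}")
```

Under these assumed numbers, the flash crowd adds only a few machine-hours to the bill, whereas a fixed server sized for the peak is paid for all month.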


Storage Delivery Networks
• Use Case: Sites storing or providing large amounts of static content such as photos or videos.
• Examples: Amazon’s S3, the Nirvanix platform.
• Scope: Static HTTP
• Potential for Failure: Low - the SDN is responsible for handling the load.
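
One way a site might hand its static content to an SDN is to have its pages reference static files by their SDN address, so that the origin server only serves the remaining requests. The sketch below assumes a hypothetical base URL and a hypothetical list of file extensions; neither comes from the talk.

```python
from urllib.parse import urljoin

# Hypothetical SDN base URL; a real deployment would use the address
# issued by its storage provider.
SDN_BASE_URL = "https://static.example.com/assets/"

# Hypothetical set of extensions treated as static content.
STATIC_EXTENSIONS = (".jpg", ".png", ".gif", ".css", ".js", ".mp4")


def asset_url(path: str) -> str:
    """Return the URL a page should reference for the given path.

    Static files are pointed at the SDN; everything else stays on the origin.
    """
    if path.lower().endswith(STATIC_EXTENSIONS):
        return urljoin(SDN_BASE_URL, path.lstrip("/"))
    return path
```

For example, `asset_url("/photos/cat.jpg")` points at the SDN, while `asset_url("/login")` is returned unchanged.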


Distributed Computing
• Use Case: Dynamic content.
• For dynamic content, we need to run server-side applications.
• Virtual machines
– owned by the company (large companies)
– “Compute Clouds” (small companies)
• Developed on a single machine
• Deployed to any number of virtual machines.
• Examples: Amazon’s EC2, FlexiScale.


HTTP Redirection
• Single front-end machine redirects using HTTP to one of many back-end machines.
• Clients only need to contact this machine once.
• If a large number of users are accessing the front-end server, it could become overloaded and prevent new users from connecting.
• This can be prevented using DNS Load Balancing.
• Use Case: Large number of internal servers running web servers.
• Examples: Hotmail.
• Scope: HTTP
• Potential for Failure: High - front-end failure, but can be reduced by combining approaches.
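
A front-end redirector of this kind can be sketched with Python's standard `http.server` module. The back-end addresses, the round-robin choice and the port are assumptions made for illustration; the talk does not say how the front end picks a back-end.

```python
import itertools
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical back-end addresses, for illustration only.
BACKENDS = [
    "http://backend1.example.com",
    "http://backend2.example.com",
    "http://backend3.example.com",
]
_next_backend = itertools.cycle(BACKENDS)


def redirect_target(path: str) -> str:
    """Pick the next back-end in round-robin order and keep the requested path."""
    return next(_next_backend) + path


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # 302 tells the client to repeat the request against the chosen back-end.
        self.send_response(302)
        self.send_header("Location", redirect_target(self.path))
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Run the front-end redirector until interrupted."""
    HTTPServer(("", port), RedirectHandler).serve_forever()
```

Calling `serve()` starts the front end. Each client contacts it once, receives a `Location` header, and then talks to the back-end directly, which is why the front end itself remains the single point of failure.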


IP Load Balancing
• Large number of back-end servers appear as a single network address.
• L4 load balancing runs at the transport level (IP address and port), choosing a server for each new connection.
• L7 load balancing runs at protocol level (e.g. HTTP), so it inspects the headers of the request and chooses a server based on their contents.
• Use Case: Provides balancing for any number of back-end servers.
• Examples: Linux Virtual Server, Microsoft Internet Security and Acceleration Server, FlexiScale
• Scope: Any network server.
• Potential for Failure: Medium - the front-end balancer is most susceptible to failure. Could be combined with other approaches.
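
The difference between the two levels can be sketched as two selection functions. The server pools, the round-robin rotation and the routing rule (image requests go to a separate pool) are hypothetical choices for illustration, not the behaviour of any of the products listed.

```python
import itertools

# Hypothetical server pools, for illustration only.
ALL_SERVERS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
IMAGE_SERVERS = ["10.0.0.2", "10.0.0.3"]
APP_SERVERS = ["10.0.0.1"]

_l4_cycle = itertools.cycle(ALL_SERVERS)
_image_cycle = itertools.cycle(IMAGE_SERVERS)
_app_cycle = itertools.cycle(APP_SERVERS)


def l4_choose() -> str:
    """L4-style choice: rotate through the servers without looking at the request."""
    return next(_l4_cycle)


def l7_choose(request_line: str, headers: dict[str, str]) -> str:
    """L7-style choice: inspect the HTTP request before picking a pool."""
    _method, path, _version = request_line.split(" ", 2)
    accept = headers.get("Accept", "")
    if path.startswith("/images/") or accept.startswith("image/"):
        return next(_image_cycle)
    return next(_app_cycle)
```

The L4 function never sees the request, so it works for any network protocol; the L7 function has to parse HTTP, which costs more per request but lets similar requests go to servers suited to them.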


DNS Load Balancing
• DNS servers provide clients with a list of IP addresses to choose from.
• To add/remove servers, the company just has to update the list of IP addresses.
• Unfortunately, not all clients choose a random server from the list of addresses, leading to uneven balancing.
• Use Case: Provides balancing for any number of back-end servers.
• Examples: Google, Hotmail, Yahoo and many more.
• Scope: Any network server.
• Potential for Failure: Medium - DNS servers are robust, but updates are typically slow.
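
The uneven balancing described above can be shown with a toy simulation in which some clients always use the first address in the list. The address list, the share of such clients and the seed are hypothetical; this is a sketch of the effect, not a measurement.

```python
import random
from collections import Counter

# Hypothetical address list, as a DNS server might return for one name.
ADDRESSES = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]


def random_client(addresses: list[str], rng: random.Random) -> str:
    """A well-behaved client: picks any address from the list at random."""
    return rng.choice(addresses)


def first_entry_client(addresses: list[str], rng: random.Random) -> str:
    """A client that always uses the first address it was given."""
    return addresses[0]


def simulate(clients: int, share_first_entry: float, seed: int = 0) -> Counter:
    """Count requests per server when some share of clients always take the first entry."""
    rng = random.Random(seed)
    load = Counter({address: 0 for address in ADDRESSES})
    for _ in range(clients):
        pick = first_entry_client if rng.random() < share_first_entry else random_client
        load[pick(ADDRESSES, rng)] += 1
    return load
```

Calling `simulate(10_000, 0.5)` shows the first server receiving far more than a third of the requests, while `simulate(10_000, 0.0)` spreads them roughly evenly.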


MapCruncher
• A new web authoring tool that makes it easy for non-experts to convert their own maps into AJAX-style interactive maps.
• No server-side behaviour - all processing done with JavaScript.
• 25GB data on relatively powerful server.
• After release, the service could only handle 100 requests per second - slow disk retrieval speed.
• Data moved to Amazon’s S3 service.


Asirra
• Asirra is a CAPTCHA system which asks users to identify photos as either cats or dogs.
• Written in Python.
• Optimised for speed.
• Virtual Machines - Amazon’s EC2.
• DNS load balancing.
• In the first 24 hours after release...
– 75,000 real requests
– 30,000 from a denial-of-service attack


InkblotPassword.com
• A website that helps users generate and remember high-entropy passwords, using Rorschach-like images as a memory cue.
• Written in Python
• No optimisation - servers are cheap.
• Nothing stored on the local disk.
• DNS Load Balancing (but not automated)
• Handling a Slashdot crowd cost less than $150.


Summary
• Small companies have a range of options for deploying robust services with small budgets.
• For applications focussed mainly on static content, a Storage Delivery Network (SDN) such as Amazon’s S3 is a good choice as it requires virtually no setup or understanding of the load distribution used internally in the SDN.
• Compute Clouds are useful for deploying low-cost servers onto virtual machines, and being able to add or remove servers as needed.
• Technologies such as HTTP Redirection, L4/L7 Load Balancing and DNS Load Balancing are useful when combined with virtual machines, as they allow the load of a single domain to be spread across multiple servers, for a relatively small cost.


Evaluation
• The article is a well-presented analysis of the low-cost options for providing scalable web services. Whilst the writing style is very accurate and informational, the structure of the article is somewhat disjointed - there are too many irrelevant sections, and technologies have been put into sections for purely cosmetic reasons.
• The case studies reported by the article are good examples of how this technology can be employed, and are useful in keeping the reader’s interest in the article.
• The sources and data of the article are of the expected high quality for a professional paper.