Scalability rules for web sites

Scalability Rules for Web Sites Minh Tran – 12/2011

description

This presentation describes some interesting rules for scaling web sites.

Transcript of Scalability rules for web sites

Page 1: Scalability rules for web sites

Scalability Rules for Web Sites

Minh Tran – 12/2011

Page 2: Scalability rules for web sites

Distribute Your Work

Page 3: Scalability rules for web sites

Distribute Your Work

• Design to Clone Things (X Axis)
• Design to Split Different Things (Y Axis)
• Design to Split Similar Things (Z Axis)

Page 4: Scalability rules for web sites

Design to Clone Things (X axis)

• When:
  – Databases with a very high read-to-write ratio (5:1 or greater; the higher the better).
  – Any system where transaction growth exceeds data growth.
• How to use:
  – Simply clone services and implement a load balancer.
  – For databases, ensure the accessing code understands the difference between a read and a write.
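As a rough illustration of code that "understands the difference between a read and a write", the sketch below sends writes to one primary node and rotates reads across cloned replicas. The class name, node names, and the rule that only `SELECT`/`SHOW` count as reads are assumptions made for this example, not part of the original slides.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the primary and spread reads across cloned replicas."""

    # Statements treated as reads (an assumption); everything else is a write.
    READ_PREFIXES = ("select", "show")

    def __init__(self, primary, replicas):
        self.primary = primary
        # A round-robin iterator plays the role of a very simple load balancer.
        self._replicas = itertools.cycle(replicas or [primary])

    def is_read(self, sql):
        return sql.lstrip().lower().startswith(self.READ_PREFIXES)

    def route(self, sql):
        """Return the name of the node that should run this statement."""
        return next(self._replicas) if self.is_read(sql) else self.primary


router = ReadWriteRouter("primary", ["replica-1", "replica-2"])
targets = [
    router.route("SELECT * FROM flights WHERE dest = 'SGN'"),
    router.route("SELECT * FROM flights WHERE dest = 'HAN'"),
    router.route("INSERT INTO bookings (flight_id) VALUES (42)"),
    router.route("SELECT * FROM flights WHERE dest = 'DAD'"),
]
print(targets)  # ['replica-1', 'replica-2', 'primary', 'replica-1']
```

In a real deployment the rotation would usually live in a load balancer or database proxy rather than in application code; the point here is only the read/write decision.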

Page 5: Scalability rules for web sites

Example

• A reservation system has:
  – 400 searches (reads) for every 1 booking (write)
• How to scale?
  – Scale reads by creating read-only copies (replicas) of the database.
  – One way is to put a caching tier in front of the database (a highly recommended first step).
  – Most RDBMSs support replication “out of the box”.
  – Master: the primary transactional database (writes)
  – Slave: a read-only copy of the master
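The caching tier mentioned above can be pictured with a small cache-aside sketch: look in the cache first, and go to the database only on a miss. The `CachedSearch` class, the route key, and the lambda standing in for a database query are all hypothetical; a plain dictionary stands in for a real caching tier.

```python
class CachedSearch:
    """Cache-aside reads in front of a database lookup function."""

    def __init__(self, query_db):
        self.query_db = query_db  # hypothetical function that queries a replica
        self.cache = {}           # stand-in for a real caching tier
        self.db_hits = 0          # how many times the database was actually used

    def search(self, key):
        if key not in self.cache:  # miss: query the database once and remember
            self.db_hits += 1
            self.cache[key] = self.query_db(key)
        return self.cache[key]

    def invalidate(self, key):
        """Call after a write (a booking) so stale results are not served."""
        self.cache.pop(key, None)


searches = CachedSearch(lambda route: f"flights for {route}")
for _ in range(400):  # 400 searches for the same route
    searches.search("SGN-HAN")
print(searches.db_hits)  # 1
```

With the 400:1 ratio from the example, repeated searches are served from the cache and only the first one reaches the database, until a booking invalidates the entry.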

Page 6: Scalability rules for web sites

Design to Split Different Things (Y Axis)

• When:
  – Very large data sets where relations between data are not necessary.
  – Large, complex systems where scaling engineering resources requires specialization.
• How to use:
  – Split up actions by using verbs, or resources by using nouns, or use a mix.
  – Split both the services and the data along the lines defined by the verb/noun approach.

Page 7: Scalability rules for web sites

Example – Split up by verbs

Ecommerce system

Login

Search

Browse

View

Add-to-cart

Signup

Purchase / Buy
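One way to picture the verb split is a routing table that maps each action to the service that owns it. The service names and the way verbs are grouped below are assumptions for illustration only; the slides list the verbs but not how they are grouped.

```python
# Hypothetical service pools, one per group of verbs.
SERVICE_FOR_VERB = {
    "login": "account-service",
    "signup": "account-service",
    "search": "search-service",
    "browse": "catalog-service",
    "view": "catalog-service",
    "add-to-cart": "cart-service",
    "purchase": "checkout-service",
}


def service_for(path):
    """Map a request path such as '/search?q=tv' to the service owning its verb."""
    # The verb is taken to be the first path segment, ignoring any query string.
    verb = path.lstrip("/").split("?", 1)[0].split("/", 1)[0].lower()
    try:
        return SERVICE_FOR_VERB[verb]
    except KeyError:
        raise ValueError(f"no service owns verb {verb!r}") from None


print(service_for("/search?q=tv"))      # search-service
print(service_for("/add-to-cart/123"))  # cart-service
```

Each service would also own its data, so that, for example, search traffic can be scaled without touching checkout.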

Page 8: Scalability rules for web sites

Example – Split up by nouns

Ecommerce system

Product

SKU

Catalog

Inventory

User Information

Page 9: Scalability rules for web sites

Design to Split Similar Things (Z Axis)

• When:
  – Very large, similar data sets, such as large and rapidly growing customer bases.
• How to use:
  – Identify something about the customer:
    • Customer ID
    • Last name
    • Geography
    • Device
  – Split or partition both data and services based on that attribute.
• Often referred to as sharding or horizontal partitioning.
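A minimal sketch of splitting on a customer attribute, assuming the customer ID is the chosen key and that there are four shards. The shard names and the hash-modulo scheme are illustrative choices, not something the slides prescribe.

```python
import hashlib

SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]  # hypothetical shard names


def shard_for(customer_id, shards=SHARDS):
    """Pick a shard from a stable hash of the customer ID."""
    # hashlib gives the same result in every process, unlike the built-in
    # hash(), which is randomized per process for strings.
    digest = hashlib.sha256(str(customer_id).encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


# The same customer always lands on the same shard.
print(shard_for(1001) == shard_for(1001))  # True
```

A plain modulo like this remaps most customers when the number of shards changes, so real systems often add a lookup table or consistent hashing on top.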

Page 10: Scalability rules for web sites

Use the Right Tool

• Use Databases Appropriately
• Actively Use Log Files

Page 11: Scalability rules for web sites

Use Databases Appropriately

RDBMS vs. File System

Example
• RDBMS: Oracle, MySQL…
• File System: GFS, MogileFS, Ceph

Storage Structure
• RDBMS: transactional integrity (ACID); relational structure within tables
• File System: no transactional integrity; no relationships

Advantages
• RDBMS: minimizes data redundancy; improves transaction processing
• File System: handles very large amounts of files and data

Limitation
• RDBMS: scalability (because of ACID); sharding or partitioning (because of the relational structure)
• File System: conflicting reads and writes over time

Page 12: Scalability rules for web sites

Use Databases Appropriately (cont): NoSQL

Example
• Key-value stores: Memcached, Tokyo Tyrant, Voldemort
• Extensible record stores: Google Bigtable, Cassandra
• Document stores: CouchDB, Amazon’s SimpleDB, Yahoo’s PNUTS, MongoDB, …

Storage Structure
• Key-value stores: single key-value index for data; stored in memory
• Extensible record stores: row and column data model
• Document stores: multi-indexed object model; documents can be aggregated into collections of documents

Advantages
• Key-value stores: significant scaling and performance
• Extensible record stores: rows are sharded on primary keys (automatic); columns are broken into groups (user defined)
• Document stores: can be queried based on many different attributes

Limitation
• Kind of data that can be stored
• Synchronous replication
• Asynchronous replication
• ACID

Page 13: Scalability rules for web sites

Actively Use Log Files

• When:
  – Put a process in place that monitors log files and forces people to take action on the issues identified.
• How to use:
  – Use any number of monitoring tools, from custom scripts to Splunk, to watch your application logs for errors.
  – Export these errors and assign resources to identify and solve each issue.
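The "custom script" end of that range can be as small as the sketch below, which counts ERROR lines by message so the most frequent issues can be assigned first. The log line layout (`<date> <time> <LEVEL> <message>`) and the sample lines are assumptions; real application logs will need their own pattern.

```python
import re
from collections import Counter

# Assumed line layout: "<date> <time> <LEVEL> <message>".
LINE = re.compile(r"^\S+ \S+ (?P<level>[A-Z]+) (?P<message>.*)$")


def summarize_errors(lines):
    """Count ERROR messages, most frequent first."""
    counts = Counter()
    for line in lines:
        match = LINE.match(line.strip())
        if match and match.group("level") == "ERROR":
            counts[match.group("message")] += 1
    return counts.most_common()


sample = [
    "2011-12-01 10:00:01 INFO request served",
    "2011-12-01 10:00:02 ERROR db connection timeout",
    "2011-12-01 10:00:03 ERROR payment gateway 502",
    "2011-12-01 10:00:04 ERROR db connection timeout",
]
report = summarize_errors(sample)
print(report)  # [('db connection timeout', 2), ('payment gateway 502', 1)]
```

The resulting list is the kind of export that can be turned into tickets with an owner for each issue.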

Page 14: Scalability rules for web sites
Page 15: Scalability rules for web sites

Use Caching Aggressively

• Leverage Content Delivery Networks
• Use Expires Headers
• Leverage Page Caches
• Utilize Application Caches
• Make Use of Object Caches
• Put Object Caches on Their Own “Tier”

Page 16: Scalability rules for web sites

Leverage Content Delivery Networks

• When:
  – Ensure it is cost justified and then choose which content is most suitable.
• How to use:
  – Most CDNs leverage DNS (Domain Name Services or Domain Name Servers) to serve content on your site’s behalf.

Page 17: Scalability rules for web sites
Page 18: Scalability rules for web sites

Use Expires Headers

• When:
  – All object types need to be considered.
• How to use:
  – Headers can be set on Web servers or through application code.

HTTP Status Code: HTTP/1.1 200 OK
Date: Thu, 21 Oct 2010 20:03:38 GMT
Server: Apache/2.2.9 (Fedora)
X-Powered-By: PHP/5.2.6
Expires: Mon, 26 Jul 2011 05:00:00 GMT
Last-Modified: Thu, 21 Oct 2010 20:03:38 GMT
Cache-Control: no-cache
Vary: Accept-Encoding, User-Agent
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8
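For the "through application code" route, the sketch below builds an `Expires` header (plus a matching `Cache-Control: max-age`) for a given lifetime. The function name and the one-hour lifetime are assumptions for illustration. Unlike the sample response above, which sends `Cache-Control: no-cache`, these headers let browsers and proxies reuse the object until it expires.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def cache_headers(max_age_seconds, now=None):
    """Return Expires and Cache-Control headers for a cacheable response."""
    now = now or datetime.now(timezone.utc)
    expires = now + timedelta(seconds=max_age_seconds)
    return {
        # usegmt=True renders the HTTP date form, e.g. "... 21:03:38 GMT".
        "Expires": format_datetime(expires, usegmt=True),
        "Cache-Control": f"max-age={max_age_seconds}",
    }


served_at = datetime(2010, 10, 21, 20, 3, 38, tzinfo=timezone.utc)
headers = cache_headers(3600, now=served_at)
print(headers["Expires"])  # Thu, 21 Oct 2010 21:03:38 GMT
```

The returned dictionary can be merged into whatever response object the web framework in use provides.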

Page 19: Scalability rules for web sites

Leverage Page Caches

• When:
  – Always
• How to use:
  – Choose a caching system and deploy.

Page 20: Scalability rules for web sites

Use Caching Aggressively

• Leverage Content Delivery Networks
• Use Expires Headers
• Leverage Page Caches
• Make Use of Object Caches
• Put Object Caches on Their Own “Tier”

Page 21: Scalability rules for web sites

Make Use of Object Caches

• When:
  – Any time you have repetitive queries or computations.
• How to use:
  – Select any one of the many open source or vendor-supported solutions.
  – Implement the calls in your application code.
• Some popular caches: Memcached, Ehcache, Apache OJB, NCache
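"Implementing the calls in your application code" might look like the decorator sketched below, which remembers the result of a repetitive computation for a fixed time. An in-process dictionary stands in for Memcached or Ehcache here, and the decorator name, the 60-second lifetime, and `product_summary` are all hypothetical.

```python
import functools
import time


def cached(ttl_seconds, clock=time.monotonic):
    """Cache a function's results per argument tuple for ttl_seconds."""
    def decorator(func):
        store = {}  # in-process stand-in for an external object cache

        @functools.wraps(func)
        def wrapper(*args):
            hit = store.get(args)
            if hit is not None and clock() - hit[0] < ttl_seconds:
                return hit[1]                # fresh entry: skip the computation
            value = func(*args)
            store[args] = (clock(), value)   # remember when it was stored
            return value
        return wrapper
    return decorator


calls = []


@cached(ttl_seconds=60)
def product_summary(product_id):
    calls.append(product_id)  # records each time the real work runs
    return {"id": product_id, "name": f"Product {product_id}"}


product_summary(7)
product_summary(7)
print(len(calls))  # 1
```

With an external cache, the dictionary lookups would become get and set calls against the cache server, which is also what makes it possible to move the cache onto its own tier.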

Page 22: Scalability rules for web sites

Put Object Caches on Their Own “Tier”

Page 23: Scalability rules for web sites

Learn Aggressively

• Take every opportunity to learn.
• Be constantly learning from your mistakes as well as successes.
• Watch your customers or use A/B testing to determine what works.
• Use postmortems to learn from incidents and problems in production.

Page 24: Scalability rules for web sites

Reference

• Scalability Rules: 50 Principles for Scaling Web Sites

• The Art of Scalability: Scalable Web Architecture, Processes, and Organizations for the Modern Enterprise

• http://www.codefutures.com/database-sharding/
• http://highscalability.com/unorthodox-approach-database-design-coming-shard

Page 25: Scalability rules for web sites