How To Scale v2

description

Slightly updated scaling presentation to include information on EC2

Transcript of How To Scale v2

Page 1: How To Scale v2

How to scale (with Ruby on Rails)

George Palmer

3dogsbark.com

Page 2: How To Scale v2

George Palmer

26th May 2007

Overview

• Starting out

• Scaling the database

• Scaling the web server

• User clusters

• Caching

• Elastic architectures

• Links and Questions

Page 3: How To Scale v2

How you start out

• Shared hosting

• One web server and DB on same machine

• Application designed for one machine

• Volume of traffic will depend on host

[Diagram: web server and DB on a single shared hosting machine]

Page 4: How To Scale v2

Two servers

• Possibly still shared hosting

• Web server and DB on different machines

• Minimal changes to code

• Volume of traffic will depend on whether you have made it to dedicated machines

[Diagram: web server and DB on separate machines]

Page 5: How To Scale v2

Scaling the database (1)

• DB setup more suited to read intensive applications (MySQL replication)

• Should be on dedicated hosts

• Minimal changes to code

[Diagram: web server, MasterDB and three slaves]
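
A minimal sketch of how an application might use a replication setup like this, assuming writes must go to the master while reads can be spread over the slaves. The `ReplicatedRouter` class and the host names are hypothetical and not part of the talk, and a real setup also has to allow for replication lag on the slaves.

```ruby
# Hypothetical router: writes go to the master, reads rotate across the slaves.
class ReplicatedRouter
  WRITE_VERBS = %w[INSERT UPDATE DELETE REPLACE].freeze

  def initialize(master:, slaves:)
    @master = master
    @slaves = slaves
    @next_slave = 0
  end

  # Returns the name of the host that should run the statement.
  def host_for(sql)
    verb = sql.strip.split(/\s+/, 2).first.to_s.upcase
    return @master if WRITE_VERBS.include?(verb) || @slaves.empty?

    # Simple round robin over the read-only slaves.
    host = @slaves[@next_slave % @slaves.size]
    @next_slave += 1
    host
  end
end

router = ReplicatedRouter.new(master: "master-db", slaves: %w[slave-1 slave-2 slave-3])
puts router.host_for("SELECT * FROM users")
puts router.host_for("UPDATE users SET name = 'Bob' WHERE id = 1")
```

Adding another slave only changes the list passed to the router, which is one way to read the "minimal changes to code" point.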

Page 6: How To Scale v2

Scaling the database (2)

• DB setup more suited to equal read/write applications (MySQL cluster)

• Should be on dedicated hosts

• Minimal changes to code

[Diagram: web server in front of a MySQL Cluster containing two MasterDBs]

Page 7: How To Scale v2

Scaling the web server

• Web server comprises “worker threads” that process work as it comes in

[Diagram: web server containing four worker threads, connected to the DB farm]
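
The worker thread idea can be sketched with a fixed set of threads pulling jobs off a shared queue. This is only an illustration under that assumption: the `WorkerPool` class is hypothetical and is not how any particular web server is implemented.

```ruby
# Hypothetical pool: a fixed number of threads process jobs as they come in.
class WorkerPool
  def initialize(size)
    @jobs = Queue.new
    @workers = Array.new(size) do
      Thread.new do
        # Each worker blocks on the queue until a job (or :stop) arrives.
        while (job = @jobs.pop) != :stop
          job.call
        end
      end
    end
  end

  # Queue a block of work for the next free worker.
  def submit(&job)
    @jobs << job
  end

  # Send one :stop marker per worker, then wait for them to finish.
  def shutdown
    @workers.size.times { @jobs << :stop }
    @workers.each(&:join)
  end
end

results = Queue.new
pool = WorkerPool.new(4)
10.times { |i| pool.submit { results << i * i } }
pool.shutdown
puts results.size
```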

Page 8: How To Scale v2

Load balancing

• App server depends on the platform:

– Rails (Mongrel, FastCGI)

– PHP

– J2EE

• Some changes to code will be required

[Diagram: load balancer in front of three app servers, connected to the DB farm]
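
A minimal sketch of what the load balancer does, assuming a plain round robin policy that skips servers marked as down. The `RoundRobinBalancer` class and server names are hypothetical; real balancers offer other policies too.

```ruby
# Hypothetical balancer: hands out app servers in turn, skipping any marked down.
class RoundRobinBalancer
  def initialize(backends)
    @backends = backends
    @down = {}
    @index = 0
    @lock = Mutex.new
  end

  def mark_down(backend)
    @lock.synchronize { @down[backend] = true }
  end

  def mark_up(backend)
    @lock.synchronize { @down.delete(backend) }
  end

  # Returns the next healthy backend, or nil if every backend is down.
  def next_backend
    @lock.synchronize do
      @backends.size.times do
        candidate = @backends[@index % @backends.size]
        @index += 1
        return candidate unless @down[candidate]
      end
      nil
    end
  end
end

balancer = RoundRobinBalancer.new(%w[app-1 app-2 app-3])
4.times { puts balancer.next_backend }
```

Because consecutive requests from one user can land on different app servers, state such as sessions usually has to move out of a single server's memory, which is one likely source of the code changes mentioned above.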

Page 9: How To Scale v2

The story so far…

[Diagram: load balancer in front of three app servers; MasterDB with three slaves]

• App servers continue to scale but the database side is somewhat limited…

Page 10: How To Scale v2

User Clusters

• For each user registered on the service, add an entry to a master database detailing where their user data is stored

– UserID

– DB cluster

– Basic authorisation details such as username, password, any NLS settings

Page 11: How To Scale v2

User Clusters (2)

User clusters are themselves one of the two database setups outlined earlier

[Diagram: the app server sends SELECT * FROM users WHERE username='Bob' AND … to the MasterDB, which returns user_id=91732, db_cluster=2; the app server then uses User Cluster 2 rather than User Cluster 1]
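
The two-step lookup can be sketched as follows, assuming the master database only maps a username to a user id and a cluster number. The hashes are in-memory stand-ins for real databases, and the profile data is invented for illustration; only the user id 91732 and cluster 2 come from the slide.

```ruby
# In-memory stand-in for the master DB: username => where the user's data lives.
MASTER_DB = {
  "Bob" => { user_id: 91732, db_cluster: 2 }
}.freeze

# In-memory stand-ins for the user clusters, keyed by cluster number.
USER_CLUSTERS = {
  1 => {},
  2 => { 91732 => { name: "Bob", bio: "Example profile (hypothetical data)" } }
}.freeze

# Step 1: ask the master DB which cluster holds the user.
# Step 2: fetch the user's data from that cluster only.
def find_user_data(username)
  entry = MASTER_DB[username]
  return nil unless entry

  cluster = USER_CLUSTERS.fetch(entry[:db_cluster])
  cluster[entry[:user_id]]
end

puts find_user_data("Bob").inspect
```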

Page 12: How To Scale v2

User Clusters (3)

• ID management becomes an issue

– Best to use the master DB id as user_id in the user cluster, or UUIDs

– If you let the cluster allocate ids, make sure to use offset and increment (not plain auto_increment)

• Other DBs such as session must reference a user by id and DB cluster

• Serious code changes may be required

• Will want to have the ability to move users between clusters
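
A small sketch of the offset and increment idea, assuming each cluster gets its own offset and all clusters share one increment equal to the number of clusters. The `ClusterIdAllocator` class is hypothetical; MySQL offers the `auto_increment_offset` and `auto_increment_increment` settings for the same purpose.

```ruby
# Hypothetical allocator: cluster k of n hands out k, k + n, k + 2n, ...
# so ids allocated by different clusters never collide.
class ClusterIdAllocator
  def initialize(offset:, increment:)
    unless (1..increment).cover?(offset)
      raise ArgumentError, "offset must be between 1 and increment"
    end

    @offset = offset
    @increment = increment
    @count = 0
  end

  def next_id
    id = @offset + @count * @increment
    @count += 1
    id
  end
end

cluster1 = ClusterIdAllocator.new(offset: 1, increment: 2)
cluster2 = ClusterIdAllocator.new(offset: 2, increment: 2)
puts Array.new(3) { cluster1.next_id }.inspect
puts Array.new(3) { cluster2.next_id }.inspect
```

With plain auto_increment both clusters would start at 1 and hand out the same ids, which breaks moving users between clusters.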

Page 13: How To Scale v2

Architecture so far

• As the number of app servers grows it’s a good idea to add a database connection manager (eg SQLRelay)

• Extract out session, search, translation databases onto own machines

• Add background processor for long running tasks (so don’t block app servers)

• Use MySQL cluster (or equivalent) for any critical database

– In replication setup can make a slave a backup master
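
The idea behind a connection manager can be sketched as a small pool of already-open connections shared by many requests, assuming the expensive part is opening a connection. `SimpleConnectionPool` is a hypothetical in-process illustration and does not show SQLRelay's actual interface.

```ruby
# Hypothetical pool: opens a fixed number of connections once and lends them out.
class SimpleConnectionPool
  def initialize(size, &factory)
    @available = Queue.new
    size.times { @available << factory.call }
  end

  # Borrow a connection, yield it, and always hand it back afterwards.
  def with_connection
    conn = @available.pop # blocks until a connection is free
    yield conn
  ensure
    @available << conn if conn
  end
end

opened = 0
pool = SimpleConnectionPool.new(2) { opened += 1; "connection-#{opened}" }
threads = Array.new(6) do
  Thread.new { pool.with_connection { |conn| conn.length } }
end
threads.each(&:join)
puts opened
```

Six requests share the two connections opened up front, so no request pays the connection establishment cost.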

Page 14: How To Scale v2

Non-cached architecture

[Diagram: load balancer in front of App Server 1, App Server 2 … App Server 50, with static files and BackgroundRB alongside; a DB connection manager sits between the app servers and the databases: MasterDB, SessionDB, SearchDB, NLSDB, and User Cluster 1 and User Cluster 2, each a master with three slaves]

Page 15: How To Scale v2

Issues

• Load balancer and database connection manager are single points of failure

– Easily solved

• 2PC (two-phase commit) needed for some operations. For example a user wants to be removed from search database

– 2PC not supported in Rails

• Rails doesn’t support database switching for a given model

– Can do explicitly on each request but expensive due to connection establishment overhead

– Can get round if using connection manager but a proper solution is required (a few gems starting to emerge on this)
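
To illustrate what 2PC involves, here is a minimal in-memory sketch in which a coordinator first asks each database to prepare and only commits if all of them agree. The `Participant` class and `two_phase_commit` method are hypothetical; a real implementation also needs durable logging and recovery, which this leaves out.

```ruby
# Hypothetical participant, standing in for one database in the transaction.
class Participant
  attr_reader :name, :state

  def initialize(name, can_commit: true)
    @name = name
    @can_commit = can_commit
    @state = :idle
  end

  # Phase 1: vote on whether the change can be applied.
  def prepare
    @state = @can_commit ? :prepared : :aborted
    @can_commit
  end

  # Phase 2: apply or undo, as decided by the coordinator.
  def commit
    @state = :committed
  end

  def rollback
    @state = :rolled_back
  end
end

# Coordinator: commit everywhere only if every participant votes yes.
# Voting stops at the first "no"; everyone is then told to roll back.
def two_phase_commit(participants)
  if participants.all?(&:prepare)
    participants.each(&:commit)
    true
  else
    participants.each(&:rollback)
    false
  end
end

dbs = [Participant.new("user cluster"), Participant.new("search DB", can_commit: false)]
puts two_phase_commit(dbs)
puts dbs.map(&:state).inspect
```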

Page 16: How To Scale v2

Making the most of your assets

• In a lot of web applications a huge % of the hits are read only. Hence the need for caching:

– Squid: a reverse-proxy (or webserver accelerator)

– Memcached: distributed memory caching solution

– Language specific caching, eg Rails fragment caching

Page 17: How To Scale v2

Squid

• Lookup of pages is in memory, storing of files is on disk

• Can also act as a load balancer

• Pages can be expired by sending DELETE request to proxy

• Can program any load balancer to pick up pages cached by your app servers (if you know the rules under which it operates)

[Diagram: Squid serves pages that are in cache from storage and passes requests that are not in cache to App Server 1 or App Server 2]
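
The in cache / not in cache flow can be sketched as a small read-through cache in front of the app servers. The `ReverseProxyCache` class is a hypothetical illustration: it keeps pages in one hash, whereas Squid keeps the lookup in memory and the files on disk, and it does not use Squid's configuration or protocol.

```ruby
# Hypothetical reverse proxy cache: serve from cache when possible,
# otherwise ask the backend (an app server) and remember the answer.
class ReverseProxyCache
  attr_reader :backend_hits

  def initialize(&backend)
    @backend = backend
    @pages = {}
    @backend_hits = 0
  end

  def get(path)
    return @pages[path] if @pages.key?(path) # in cache

    @backend_hits += 1                        # not in cache
    @pages[path] = @backend.call(path)
  end

  # Expire a page so the next request goes back to the app server.
  def delete(path)
    @pages.delete(path)
  end
end

proxy = ReverseProxyCache.new { |path| "rendered #{path}" }
3.times { proxy.get("/home") }
puts proxy.backend_hits
proxy.delete("/home")
proxy.get("/home")
puts proxy.backend_hits
```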

Page 18: How To Scale v2

Memcached

• Location of data is independent of physical machine

• A really nice simple API

– SET

– GET

– DELETE

• In Rails only a few LOC will make a model cached

• Also useful for tracking cross machine information – eg dodgy user behaviour

[Diagram: app servers and memcached instances share physical machines; data not in memcached is fetched from the DB farm]
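
A sketch of how a model lookup might use the SET / GET / DELETE operations, assuming a read-through pattern where the database is only consulted on a cache miss. `FakeMemcached` is an in-memory stand-in rather than a real memcached client, and `CachedUserStore` is a hypothetical name.

```ruby
# In-memory stand-in for a memcached client (not a real client library).
class FakeMemcached
  def initialize
    @data = {}
  end

  def set(key, value)
    @data[key] = value
  end

  def get(key)
    @data[key]
  end

  def delete(key)
    @data.delete(key)
  end
end

# Hypothetical store: check the cache first, fall back to the DB on a miss.
class CachedUserStore
  attr_reader :db_reads

  def initialize(cache, db)
    @cache = cache
    @db = db
    @db_reads = 0
  end

  def find(id)
    key = "user:#{id}"
    cached = @cache.get(key)
    return cached if cached

    @db_reads += 1
    record = @db[id]
    @cache.set(key, record) if record
    record
  end

  # Write to the DB, then delete the cached copy so it is reloaded next time.
  def update(id, attrs)
    @db[id] = (@db[id] || {}).merge(attrs)
    @cache.delete("user:#{id}")
  end
end

store = CachedUserStore.new(FakeMemcached.new, { 1 => { name: "Bob" } })
2.times { store.find(1) }
puts store.db_reads
```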

Page 19: How To Scale v2

Cached architecture

• Introduce squid or nginx

• Introduce memcached

– Can go on every machine that has spare memory

– Best suited to application servers which have high CPU usage but low memory requirements

• Introduce language specific caching

Page 20: How To Scale v2

Cached architecture

[Diagram: as the non-cached architecture, with storage added for cached pages and memcached (MC) instances added alongside the app servers: load balancer, App Server 1, App Server 2 … App Server 50, BackgroundRB, DB connection manager, MasterDB, SessionDB, SearchDB, NLSDB, and User Cluster 1 and User Cluster 2, each a master with three slaves]

Page 21: How To Scale v2

Cached architecture

• Wikipedia quote a cache hit rate of 78% for squid and 7% for memcached

– So only 15% of hits actually get to the DB!!

• Performance is a whole new ball game but we recently gained 15-20% by optimising our Rails configuration

– But don’t get carried away: at some point the cost of the time you spend exceeds the money saved

• It’s very easy to scale this architecture down to one machine
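
The 15% figure follows from simple subtraction, assuming both hit rates are quoted as a share of all hits. The `db_share` helper and the 1,000 hits per second figure are hypothetical, used only to show the arithmetic.

```ruby
# Share of hits that fall through both caches and reach the database,
# assuming both rates are fractions of total hits.
def db_share(squid_hit_rate, memcached_hit_rate)
  1.0 - squid_hit_rate - memcached_hit_rate
end

share = db_share(0.78, 0.07)
hits_per_second = 1_000 # hypothetical load
puts format("%.0f%% of hits reach the DB", share * 100)
puts format("about %.0f of %d hits per second", share * hits_per_second, hits_per_second)
```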

Page 22: How To Scale v2

Elastic architectures

• Based upon Amazon EC2

– Allows you to create server images and launch instances on demand

– Very cheap as you only pay for what you use

• Currently no way to mount Amazon S3

– Strictly speaking there are a few projects ongoing…

• Still in Beta

– We’ve had network performance issues

• An American VC was quoted as saying “Are you using EC2 for scaling? If not, you better have a good reason”

Page 23: How To Scale v2

Elastic architectures

[Diagram: inside the EC2 cloud, a monitor watches the load balancer and App Servers 1 to 3, each with memcached (MC); under high load the app server image produces App Server 4 with its own MC]

• WeoCeo now offer a similar service
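
The monitor's role can be sketched as a check that launches another app server from the image when load is high. Everything here is hypothetical: the `ScalingMonitor` class, the load threshold and the launcher block stand in for real monitoring and for EC2's instance launching, and no EC2 API is called.

```ruby
# Hypothetical monitor: when average load passes a threshold,
# ask the launcher for one more app server built from the image.
class ScalingMonitor
  attr_reader :servers

  def initialize(initial_servers:, high_load:, &launcher)
    @servers = initial_servers.dup
    @high_load = high_load
    @launcher = launcher
  end

  # average_load is the mean load per server, from 0.0 to 1.0.
  # Returns true if a new server was launched.
  def check(average_load)
    return false unless average_load > @high_load

    @servers << @launcher.call(@servers.size + 1)
    true
  end
end

monitor = ScalingMonitor.new(
  initial_servers: ["App Server 1", "App Server 2", "App Server 3"],
  high_load: 0.8
) { |number| "App Server #{number}" }

monitor.check(0.5)
monitor.check(0.95)
puts monitor.servers.inspect
```

A fuller version would also retire servers when load drops, since paying only for what you use is the point of the elastic setup.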

Page 24: How To Scale v2

How far can it go?

• For a truly global application with millions of users, in order of ease:

– Have a cache on each continent

– Make user clusters based on user location: distribute the clusters physically around the world

– Introduce app servers on each continent

– If you must replicate your site globally then use transaction replication software, eg GoldenGate

Page 25: How To Scale v2

Useful Links

• http://www.squid-cache.org/

• http://nginx.net/

• http://www.danga.com/memcached/

• http://sqlrelay.sourceforge.net/

• http://railsexpress.de/blog/

Page 26: How To Scale v2

Questions?