How to scale your web app

How to scale (with Ruby on Rails) George Palmer [email protected] 3dogsbark.com


Scaling web applications, as presented at Barcamp London 2 by George Palmer

Transcript of How to scale your web app

Page 1: How to scale your web app

How to scale (with Ruby on Rails)

George Palmer [email protected]

3dogsbark.com

Page 2: How to scale your web app

George Palmer

17th February 2007

Overview

• One server
• Two servers
• Scaling the database
• Scaling the web server
• User clusters
• Final architecture
• Caching
• Cached architecture
• Links
• Questions

Page 3: How to scale your web app

How you start out

• Shared hosting
• One web server and DB on the same machine
• Application designed for one machine
• Volume of traffic will depend on host

[Diagram: web server and DB on the same shared-hosting machine]

Page 4: How to scale your web app

Two servers

• Possibly still shared hosting
• Web server and DB on different machines
• Minimal changes to code
• Volume of traffic will depend on whether you have made it to dedicated machines

[Diagram: web server and DB on separate machines]

Page 5: How to scale your web app

Scaling the database (1)

• DB setup more suited to read-intensive applications (MySQL replication)
• Should be on dedicated hosts
• Minimal changes to code

[Diagram: web server, master DB, and three slaves]
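A minimal sketch of how application code might use a replicated setup, assuming SELECT statements can go to any slave and everything else must go to the master. The class name ReplicatedRouter and the host names are hypothetical and not from the talk; a real application would hand back database connections rather than host names.

```ruby
# Hypothetical router: writes go to the master, reads rotate over the slaves.
class ReplicatedRouter
  def initialize(master:, slaves:)
    @master = master
    @slaves = slaves
    @next_slave = 0
  end

  # Returns the name of the server that should run the given SQL statement.
  def server_for(sql)
    verb = sql.strip.split(/\s+/, 2).first.to_s.upcase
    # Only plain SELECTs are treated as reads; anything else is sent to the master.
    return @master unless verb == "SELECT" && !@slaves.empty?

    slave = @slaves[@next_slave % @slaves.size]
    @next_slave += 1
    slave
  end
end

router = ReplicatedRouter.new(master: "master-db", slaves: %w[slave-1 slave-2 slave-3])
router.server_for("UPDATE users SET name = 'Bob' WHERE id = 1") # goes to the master
router.server_for("SELECT * FROM users WHERE id = 1")           # goes to a slave
```

MySQL replication is asynchronous by default, so a read sent to a slave immediately after a write may return stale data; code that must read its own writes would need to read from the master.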

Page 6: How to scale your web app

Scaling the database (2)

• DB setup more suited to equal read/write applications (MySQL Cluster)
• Should be on dedicated hosts
• Minimal changes to code

[Diagram: web server and two master DBs in a MySQL Cluster]

Page 7: How to scale your web app

Scaling the web server

• A web server comprises “worker threads” that process work as it comes in

[Diagram: web server containing four worker threads, in front of the DB farm]

Page 8: How to scale your web app

Load balancing

• Choice of app server depends on the platform:
– Rails (Mongrel, FastCGI)
– PHP
– J2EE
• Some changes to code will be required

[Diagram: load balancer in front of three app servers, backed by the DB farm]
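As a rough illustration of what the load balancer does, here is a sketch of round-robin selection that skips app servers marked as down. This is an assumption about one common policy, not the behaviour of any particular product; the class RoundRobinBalancer and the server names are hypothetical.

```ruby
# Hypothetical round-robin balancer that skips app servers marked as down.
class RoundRobinBalancer
  def initialize(servers)
    @servers = servers
    @down = {}
    @cursor = 0
  end

  def mark_down(server)
    @down[server] = true
  end

  def mark_up(server)
    @down.delete(server)
  end

  # Returns the next healthy server, or nil when every server is down.
  def next_server
    @servers.size.times do
      candidate = @servers[@cursor % @servers.size]
      @cursor += 1
      return candidate unless @down[candidate]
    end
    nil
  end
end

balancer = RoundRobinBalancer.new(%w[app-1 app-2 app-3])
balancer.mark_down("app-2")
balancer.next_server # app-1, then app-3 on the following call
```

Because consecutive requests from one user can land on different app servers, anything kept in a single server's memory (sessions, for example) has to move somewhere shared, which is one likely source of the code changes mentioned above.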

Page 9: How to scale your web app

The story so far…

[Diagram: load balancer, three app servers, and a master DB with three slaves]

• App servers continue to scale but the database side is somewhat limited…

Page 10: How to scale your web app

User Clusters

• For each user registered on the service, add an entry to a master database detailing where their user data is stored:
– UserID
– DB Cluster
– Basic authorisation details such as username, password, any NLS settings

Page 11: How to scale your web app

User Clusters (2)

[Diagram: app server, master DB, User Cluster 1, and User Cluster 2]

User clusters are themselves one of the two database setups outlined earlier

SELECT * FROM users WHERE username='Bob' AND …

user_id=91732, db_cluster=2
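The two-step lookup can be sketched as follows, using in-memory stand-ins for the master DB and the clusters. The names UserDirectory, ClusterRouter, and the cluster host names are hypothetical; only the user_id and db_cluster fields come from the slides.

```ruby
# Hypothetical in-memory stand-in for the master DB's user directory.
UserRecord = Struct.new(:user_id, :db_cluster)

class UserDirectory
  def initialize
    @by_username = {}
  end

  def register(username, user_id, db_cluster)
    @by_username[username] = UserRecord.new(user_id, db_cluster)
  end

  def lookup(username)
    @by_username.fetch(username) { raise KeyError, "unknown user: #{username}" }
  end
end

class ClusterRouter
  # clusters maps a cluster number to wherever that cluster lives,
  # e.g. { 1 => "user-cluster-1", 2 => "user-cluster-2" }
  def initialize(directory, clusters)
    @directory = directory
    @clusters = clusters
  end

  # Step 1: ask the master DB where the user lives.
  # Step 2: return the user's id together with the cluster holding their data.
  def locate(username)
    record = @directory.lookup(username)
    [record.user_id, @clusters.fetch(record.db_cluster)]
  end
end

directory = UserDirectory.new
directory.register("Bob", 91732, 2)
router = ClusterRouter.new(directory, { 1 => "user-cluster-1", 2 => "user-cluster-2" })
router.locate("Bob") # => [91732, "user-cluster-2"]
```

Every later query for Bob's data would then be sent to the returned cluster rather than to the master DB.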

Page 12: How to scale your web app

User Clusters (3)

• ID management becomes an issue
– Best to use master DB id as user_id in user cluster
– If you let the cluster allocate IDs, make sure you use offset and increment (not auto_increment)
• Other DBs such as session must reference a user by id and DB cluster
• Serious code changes may be required
• Will want to have the ability to move users between clusters
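The offset-and-increment idea can be sketched like this: each cluster starts at its own offset and steps by a shared increment, so no two clusters ever hand out the same id. The class ClusterIdAllocator is hypothetical; MySQL exposes the same idea through its auto_increment_offset and auto_increment_increment settings.

```ruby
# Hypothetical allocator: cluster k of n uses offset k and increment n.
class ClusterIdAllocator
  def initialize(offset:, increment:)
    unless (1..increment).cover?(offset)
      raise ArgumentError, "offset must be between 1 and increment"
    end

    @next_id = offset
    @increment = increment
  end

  # Returns the next id for this cluster and advances by the shared increment.
  def next_id
    id = @next_id
    @next_id += @increment
    id
  end
end

cluster_1 = ClusterIdAllocator.new(offset: 1, increment: 2)
cluster_2 = ClusterIdAllocator.new(offset: 2, increment: 2)
3.times.map { cluster_1.next_id } # => [1, 3, 5]
3.times.map { cluster_2.next_id } # => [2, 4, 6]
```

One design consequence: the increment caps the number of clusters that can allocate ids, so it is worth choosing a value larger than the number of clusters in use today.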

Page 13: How to scale your web app

The final architecture

• As the number of app servers grows, it’s a good idea to add a database connection manager (e.g. SQLRelay)
• Extract session, search, and translation databases onto their own machines
• Use MySQL Cluster (or equivalent) for any critical database
– In a replication setup you can make a slave a backup master
• Add an NFS/SAN for static files
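To illustrate why a connection manager helps, here is a minimal in-process connection pool: a fixed set of connections is opened once and shared, so each request avoids the cost of opening its own. This is only a sketch of the pooling idea, not how SQLRelay is implemented; the ConnectionPool class and the string "connections" are hypothetical.

```ruby
require "thread"

# Hypothetical fixed-size pool: connections are created once and then reused.
class ConnectionPool
  def initialize(size:, &factory)
    @available = Queue.new
    size.times { @available << factory.call }
  end

  # Checks out a connection, yields it, and always returns it to the pool.
  def with_connection
    conn = @available.pop # blocks until a connection is free
    yield conn
  ensure
    @available << conn if conn
  end
end

opened = 0
pool = ConnectionPool.new(size: 2) do
  opened += 1
  "connection-#{opened}" # stand-in for a real database connection
end

5.times { pool.with_connection { |conn| conn } }
opened # => 2, however many requests were served
```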

Page 14: How to scale your web app

The final architecture (2)

[Diagram: load balancer in front of App Servers 1 to 50; a DB connection manager between the app servers and the master DB, session DB, search DB, NLS DB, and User Clusters 1 and 2 (each a master with three slaves); NFS/SAN for static files]

Page 15: How to scale your web app

Issues

• Load balancer and database connection manager are single points of failure
– Easily solved
• 2PC (two-phase commit) is needed for some operations, for example when a user wants to be removed from the search database
– 2PC is not supported in Rails
• Rails doesn’t support database switching for a given model
– Can do it explicitly on each request, but that is expensive due to connection establishment overhead
– Can get round it if using a connection manager, but a proper solution is required (I may write a gem to do this)
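For readers unfamiliar with 2PC, the following sketch shows the shape of the protocol: every database is first asked to prepare, and the change is committed only if all of them agree. The Participant class and the two_phase_commit helper are hypothetical in-memory stand-ins; a real implementation also has to handle coordinator crashes and timeouts, which are ignored here.

```ruby
# Hypothetical participant, e.g. the user cluster or the search database.
class Participant
  attr_reader :name, :state

  def initialize(name, can_prepare: true)
    @name = name
    @can_prepare = can_prepare
    @state = :idle
  end

  # Phase 1: vote on whether the change can be applied.
  def prepare
    @state = @can_prepare ? :prepared : :failed
    @can_prepare
  end

  # Phase 2, success path.
  def commit
    @state = :committed
  end

  # Phase 2, abort path.
  def rollback
    @state = :rolled_back
  end
end

# Commits on every participant only if all of them prepare successfully;
# otherwise tells every participant to roll back. Returns true on commit.
def two_phase_commit(participants)
  if participants.all?(&:prepare)
    participants.each(&:commit)
    true
  else
    participants.each(&:rollback)
    false
  end
end

user_cluster = Participant.new("user cluster")
search_db = Participant.new("search DB", can_prepare: false)
two_phase_commit([user_cluster, search_db]) # => false, nothing is committed
```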

Page 16: How to scale your web app

Making the most of your assets

• In a lot of web applications a huge % of the hits are read only. Hence the need for caching:
– Squid: a reverse proxy (or web server accelerator)
– Memcached: a distributed memory caching solution

Page 17: How to scale your web app

Squid


• Lookup of pages is in memory, storing of files is on disk
• Can also act as a load balancer
• Pages can be expired by sending a PURGE request to the proxy

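[Diagram: Squid answers directly when a page is in cache; when it is not in cache the request goes on to App Server 1 or App Server 2, with the NFS/SAN behind them]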

Page 18: How to scale your web app

Memcached

• Location of data is irrespective of physical machine
• A really nice simple API
– SET
– GET
– DELETE
• In Rails only a few LOC will make a model cached
• Also useful for tracking cross-machine information, e.g. dodgy user behaviour

[Diagram: app servers on separate physical machines, each with a memcached instance; data not in memcached is fetched from the DB farm]
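A minimal sketch of the read path described above, assuming a client that picks a node by hashing the key and falls back to the database on a miss. FakeMemcachedNode is an in-memory stand-in exposing only the SET, GET, and DELETE operations from the slide; CacheClient and its fetch helper are hypothetical and not the API of any real memcached client library.

```ruby
require "zlib"

# Stand-in for one memcached instance: just SET, GET, and DELETE.
class FakeMemcachedNode
  def initialize
    @store = {}
  end

  def set(key, value)
    @store[key] = value
  end

  def get(key)
    @store[key]
  end

  def delete(key)
    @store.delete(key)
  end
end

# Picks a node by hashing the key, so callers never care which machine holds the data.
class CacheClient
  def initialize(nodes)
    @nodes = nodes
  end

  def node_for(key)
    @nodes[Zlib.crc32(key) % @nodes.size]
  end

  def get(key)
    node_for(key).get(key)
  end

  def set(key, value)
    node_for(key).set(key, value)
  end

  def delete(key)
    node_for(key).delete(key)
  end

  # Returns the cached value, or runs the block (the DB query) and caches its result.
  def fetch(key)
    cached = get(key)
    return cached unless cached.nil?

    value = yield
    set(key, value)
    value
  end
end

cache = CacheClient.new([FakeMemcachedNode.new, FakeMemcachedNode.new])
cache.fetch("user:91732") { "Bob" } # miss: the block stands in for a DB query
cache.fetch("user:91732") { "Bob" } # hit: the block is not run
cache.delete("user:91732")          # expire the entry when the user changes
```

Simple modulo hashing remaps most keys whenever a node is added or removed; it is used here only to keep the sketch short.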

Page 19: How to scale your web app

Cached Architecture

• Introduce Squid
– Acts as load balancer (note there are higher performing load balancers)
• Introduce memcached
– Can go on every machine that has spare memory
– Best suited to application servers, which have high CPU usage but low memory requirements

Page 20: How to scale your web app

Cached architecture

[Diagram: the final architecture with Squid in front of App Servers 1 to 50 and three memcached (MC) instances added; the DB connection manager, master, session, search, and NLS DBs, User Clusters 1 and 2 (each a master with three slaves), and NFS/SAN remain]

Page 21: How to scale your web app

Cached architecture

• Wikipedia quote a cache hit rate of 78% for Squid and 7% for memcached
– So only 15% of hits actually get to the DB!!
• Performance is a whole new ball game, but we recently gained 15-20% by optimising our Rails configuration
– But don’t get carried away: at some point the time you spend exceeds the money saved
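The 15% figure follows from simple subtraction, assuming both hit rates are quoted as a share of all requests. The helper below is hypothetical and only makes that arithmetic explicit; the request volume is an invented example.

```ruby
# Requests left over for the DB once both caches have taken their share.
# Both percentages are assumed to be shares of the total request count.
def requests_reaching_db(total_requests, squid_hit_pct:, memcached_hit_pct:)
  miss_pct = 100 - squid_hit_pct - memcached_hit_pct
  total_requests * miss_pct / 100
end

requests_reaching_db(1_000_000, squid_hit_pct: 78, memcached_hit_pct: 7) # => 150000
```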

Page 22: How to scale your web app

Cached architecture – 1 machine

[Diagram: the same cached architecture on a single physical machine: Squid, App Servers 1 to 5, DB connection manager, master, session, search, and NLS DBs, one user cluster (a master with three slaves), NFS/SAN, and memcached]

Page 23: How to scale your web app

How far can it go?

• For a truly global application with millions of users, in order of ease:
– Have a cache on each continent
– Make user clusters based on user location (distribute the clusters physically around the world)
– Introduce app servers on each continent
– If you must replicate your site globally then use transaction replication software, e.g. GoldenGate

Page 24: How to scale your web app

Useful Links

• http://www.squid-cache.org/

• http://www.danga.com/memcached/

• http://sqlrelay.sourceforge.net/

• http://railsexpress.de/blog/

Page 25: How to scale your web app

Questions?