How to scale your web app

Post on 08-Sep-2014


Scaling web applications, as presented at Barcamp London 2 by George Palmer

Transcript of How to scale your web app

How to scale (with Ruby on Rails)

George Palmer

george.palmer@gmail.com

3dogsbark.com

George Palmer

17th February 2007

Overview

• One server
• Two servers
• Scaling the database
• Scaling the web server
• User clusters
• Final architecture
• Caching
• Cached architecture
• Links
• Questions

How you start out

• Shared hosting
• One web server and DB on the same machine
• Application designed for one machine
• Volume of traffic will depend on host

[Diagram: Web Server and DB on a single shared hosting machine]

Two servers

• Possibly still shared hosting
• Web server and DB on different machines
• Minimal changes to code
• Volume of traffic will depend on whether you have made it to dedicated machines

[Diagram: Web Server and DB on separate machines]

Scaling the database (1)

• DB setup more suited to read-intensive applications (MySQL replication)
• Should be on dedicated hosts
• Minimal changes to code

[Diagram: Web Server, Master DB, three Slaves]
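A minimal sketch of what the code change for a replication setup could look like, assuming the application chooses a host per statement: writes go to the master and reads rotate across the slaves. The class name `ReplicatedRouter` and the host names are hypothetical, and a real application would pass the statement to a MySQL client instead of only returning a host name.

```ruby
# Hypothetical router: sends writes to the master and spreads reads
# across the slaves in round-robin order. Host names are placeholders.
class ReplicatedRouter
  WRITE_VERBS = %w[INSERT UPDATE DELETE REPLACE].freeze

  def initialize(master:, slaves:)
    raise ArgumentError, "need at least one slave" if slaves.empty?

    @master = master
    @slaves = slaves
    @next_slave = 0
  end

  # Returns the host that should run the given SQL statement.
  def host_for(sql)
    verb = sql.strip.split(/\s+/, 2).first.to_s.upcase
    return @master if WRITE_VERBS.include?(verb)

    host = @slaves[@next_slave]
    @next_slave = (@next_slave + 1) % @slaves.size
    host
  end
end

router = ReplicatedRouter.new(master: "db-master",
                              slaves: %w[db-slave1 db-slave2 db-slave3])
puts router.host_for("SELECT * FROM users")                          # db-slave1
puts router.host_for("UPDATE users SET name = 'Bob' WHERE id = 1")   # db-master
puts router.host_for("SELECT * FROM posts")                          # db-slave2
```

One caveat with this kind of split: MySQL replication is asynchronous by default, so a read sent to a slave straight after a write may briefly return stale data.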

Scaling the database (2)

• DB setup more suited to equal read/write applications (MySQL cluster)
• Should be on dedicated hosts
• Minimal changes to code

[Diagram: Web Server, MySQL Cluster containing two Master DBs]

Scaling the web server

• The web server comprises “worker threads” that process work as it comes in

[Diagram: Web Server containing four worker threads, DB Farm]

Load balancing

• The app server depends on your platform:
  – Rails (Mongrel, FastCGI)
  – PHP
  – J2EE
• Some changes to code will be required

[Diagram: Load balancer, three App Servers, DB Farm]
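As a rough illustration of what the load balancer does, here is a round-robin sketch that hands each request to the next app server and skips any server marked as down. `RoundRobinBalancer` and the server names are hypothetical; a real deployment would use a dedicated load balancer rather than application code like this.

```ruby
# Hypothetical round-robin balancer over a fixed list of app servers.
class RoundRobinBalancer
  def initialize(servers)
    raise ArgumentError, "no app servers given" if servers.empty?

    @servers = servers
    @down = {}
    @index = 0
  end

  def mark_down(server)
    @down[server] = true
  end

  def mark_up(server)
    @down.delete(server)
  end

  # Returns the next healthy server, or nil if every server is down.
  def next_server
    @servers.size.times do
      candidate = @servers[@index]
      @index = (@index + 1) % @servers.size
      return candidate unless @down[candidate]
    end
    nil
  end
end

lb = RoundRobinBalancer.new(%w[app1 app2 app3])
lb.mark_down("app2")
4.times { puts lb.next_server }   # app1, app3, app1, app3
```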

The story so far…

[Diagram: Load balancer, three App Servers, Master DB, three Slaves]

• App servers continue to scale but the database side is somewhat limited…

User Clusters

• For each user registered on the service, add an entry to a master database detailing where their user data is stored:
  – UserID
  – DB Cluster
  – Basic authorisation details such as username, password, any NLS settings

User Clusters (2)

[Diagram: App Server, Master DB, User Cluster 1, User Cluster 2]

User clusters are themselves one of the two database setups outlined earlier.

Query shown in the diagram: SELECT * FROM users WHERE username='Bob' AND …

Result shown in the diagram: user_id=91732, db_cluster=2
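The two-step lookup can be sketched as follows: first ask the master database where a user lives, then read the user's data from that cluster. Everything here is an in-memory stand-in; `ClusterDirectory`, `ClusterRouter` and the stored record are hypothetical, and only the values for Bob (user_id 91732, db_cluster 2) come from the slide.

```ruby
# Hypothetical in-memory stand-ins for the master DB and the user clusters.
UserLocation = Struct.new(:user_id, :db_cluster)

# Plays the role of the master database: username => where the data lives.
class ClusterDirectory
  def initialize
    @by_username = {}
  end

  def register(username, user_id, db_cluster)
    @by_username[username] = UserLocation.new(user_id, db_cluster)
  end

  def locate(username)
    @by_username.fetch(username) { raise KeyError, "unknown user: #{username}" }
  end
end

# Resolves a username to its cluster, then reads from that cluster.
class ClusterRouter
  def initialize(directory, clusters)
    @directory = directory
    @clusters = clusters # cluster number => Hash of user_id => user data
  end

  def user_data(username)
    location = @directory.locate(username)
    cluster = @clusters.fetch(location.db_cluster)
    cluster.fetch(location.user_id)
  end
end

directory = ClusterDirectory.new
directory.register("Bob", 91732, 2)

clusters = { 1 => {}, 2 => { 91732 => { name: "Bob" } } }
router = ClusterRouter.new(directory, clusters)
puts router.user_data("Bob")[:name]   # Bob
```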

User Clusters (3)

• ID management becomes an issue
  – Best to use the master DB id as the user_id in the user cluster
  – If you let the cluster allocate ids, make sure you use offset and increment (not auto_increment)
• Other DBs, such as session, must reference a user by id and DB cluster
• Serious code changes may be required
• You will want the ability to move users between clusters
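To show why offset and increment avoid id collisions, here is a small sketch in which each cluster starts at its own offset and steps by the number of clusters. `ClusterIdAllocator` is a hypothetical illustration in application code; MySQL offers the same idea at the server level through its `auto_increment_offset` and `auto_increment_increment` settings.

```ruby
# Hypothetical allocator: cluster N of M hands out N, N+M, N+2M, ...
class ClusterIdAllocator
  def initialize(offset:, increment:)
    unless (1..increment).cover?(offset)
      raise ArgumentError, "offset must be between 1 and increment"
    end

    @next_id = offset
    @increment = increment
  end

  def next_id
    id = @next_id
    @next_id += @increment
    id
  end
end

# Two clusters, so the increment is 2 and the offsets are 1 and 2.
cluster1 = ClusterIdAllocator.new(offset: 1, increment: 2)
cluster2 = ClusterIdAllocator.new(offset: 2, increment: 2)

puts 3.times.map { cluster1.next_id }.join(", ")   # 1, 3, 5
puts 3.times.map { cluster2.next_id }.join(", ")   # 2, 4, 6
```

With this scheme a user can later be moved between clusters without the id clashing with one already allocated there, provided the increment was chosen large enough for the number of clusters.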

The final architecture

• As the number of app servers grows it’s a good idea to add a database connection manager (eg SQLRelay)
• Extract the session, search and translation databases onto their own machines
• Use MySQL cluster (or equivalent) for any critical database
  – In a replication setup you can make a slave a backup master
• Add an NFS/SAN for static files

The final architecture (2)

[Diagram: Load balancer; App Servers 1, 2 … 50; DB Connection Manager; two Master DB boxes; Session DB; Search DB; NLS DB; User Clusters 1 and 2, each a Master with three Slaves; NFS/SAN]

Issues

• The load balancer and database connection manager are single points of failure
  – Easily solved
• 2PC (two-phase commit) is needed for some operations, for example when a user wants to be removed from the search database
  – 2PC is not supported in Rails
• Rails doesn’t support database switching for a given model
  – Can be done explicitly on each request, but that is expensive due to connection establishment overhead
  – Can be worked around with a connection manager, but a proper solution is required (I may write a gem to do this)
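The connection overhead point can be illustrated with a small sketch that opens a connection to a cluster once and reuses it on later requests. This is not how Rails or SQLRelay work internally; `ConnectionCache` is a hypothetical helper, and the block passed to it stands in for whatever actually opens a database connection.

```ruby
# Hypothetical per-cluster connection cache: each cluster's connection is
# opened on first use and reused afterwards.
class ConnectionCache
  attr_reader :opened

  # The block receives a cluster number and returns a new connection.
  def initialize(&connector)
    raise ArgumentError, "connector block required" unless connector

    @connector = connector
    @connections = {}
    @opened = 0
  end

  def for_cluster(cluster)
    @connections[cluster] ||= begin
      @opened += 1
      @connector.call(cluster)
    end
  end
end

connections = ConnectionCache.new { |cluster| "connection to user cluster #{cluster}" }
puts connections.for_cluster(2)
puts connections.for_cluster(2)   # reuses the connection opened above
puts connections.opened           # 1
```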

Making the most of your assets

• In a lot of web applications a huge percentage of the hits are read-only. Hence the need for caching:
  – Squid
    • A reverse proxy (or web server accelerator)
  – Memcached
    • Distributed memory caching solution

Squid

• Lookup of pages is in memory, storing of files is on disk
• Can also act as a load balancer
• Pages can be expired by sending a DELETE request to the proxy

[Diagram: Squid in front of App Server 1, App Server 2 and NFS/SAN, with paths labelled "In cache" and "Not in cache"]

Memcached

• Location of data is irrespective of physical machine
• A really nice, simple API
  – SET
  – GET
  – DELETE
• In Rails only a few lines of code will make a model cached
• Also useful for tracking cross-machine information, eg dodgy user behaviour

[Diagram: App Servers and Memcached instances on physical machines, with the DB Farm serving data that is not in memcached]
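A sketch of how the GET, SET and DELETE operations combine to cache a model: read from the cache first, fall back to the database on a miss, and delete the cached copy when the record changes. `FakeMemcached` is a Hash-backed stand-in written for this example, not a real memcached client, and `CachedUserFinder` is a hypothetical name.

```ruby
# Hash-backed stand-in exposing the three operations named on the slide.
class FakeMemcached
  def initialize
    @store = {}
  end

  def get(key)
    @store[key]
  end

  def set(key, value)
    @store[key] = value
  end

  def delete(key)
    @store.delete(key)
  end
end

# Reads go through the cache; updates write to the DB and expire the key.
class CachedUserFinder
  attr_reader :db_reads

  def initialize(cache, db)
    @cache = cache
    @db = db # Hash of id => record, standing in for the DB farm
    @db_reads = 0
  end

  def find(id)
    key = "user:#{id}"
    cached = @cache.get(key)
    return cached unless cached.nil?

    @db_reads += 1
    record = @db[id]
    @cache.set(key, record) unless record.nil?
    record
  end

  def update(id, attrs)
    @db[id] = @db.fetch(id).merge(attrs)
    @cache.delete("user:#{id}")
  end
end

finder = CachedUserFinder.new(FakeMemcached.new, { 1 => { name: "Bob" } })
finder.find(1)
finder.find(1)
puts finder.db_reads   # 1, the second find was served from the cache
```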

Cached Architecture

• Introduce Squid
  – Acts as load balancer (note there are higher-performing load balancers)
• Introduce memcached
  – Can go on every machine that has spare memory
    • Best suited to application servers, which have high CPU usage but low memory requirements
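When memcached runs on several machines, the client decides which instance holds a given key. A simple way to sketch that is to hash the key and take the result modulo the number of nodes. `NodePicker` and the node names are hypothetical; this is plain modulo hashing, whereas real clients often use consistent hashing so that adding a node moves fewer keys.

```ruby
require "zlib"

# Hypothetical key-to-node mapping using CRC32 modulo the node count.
class NodePicker
  def initialize(nodes)
    raise ArgumentError, "no memcached nodes given" if nodes.empty?

    @nodes = nodes
  end

  # The same key always maps to the same node while the node list is unchanged.
  def node_for(key)
    @nodes[Zlib.crc32(key) % @nodes.size]
  end
end

picker = NodePicker.new(%w[app1:11211 app2:11211 app3:11211])
%w[user:1 user:2 session:abc].each do |key|
  puts "#{key} -> #{picker.node_for(key)}"
end
```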

Cached architecture

[Diagram: Squid; App Servers 1, 2 … 50; DB Connection Manager; two Master DB boxes; Session DB; Search DB; NLS DB; User Clusters 1 and 2, each a Master with three Slaves; NFS/SAN; three memcached instances (MC = memcached)]

Cached architecture

• Wikipedia quotes a cache hit rate of 78% for Squid and 7% for memcached
  – So only 15% of hits actually get to the DB!
• Performance is a whole new ball game, but we recently gained 15-20% by optimising our Rails configuration
  – But don’t get carried away: at some point the time you spend costs more than the money saved
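The 15% figure is simply what remains after both caches have taken their share, treating each quoted rate as a percentage of all hits. A quick check, with a request count that is purely illustrative and not from the talk:

```ruby
# Share of hits that reach the database after both cache layers.
def db_share_percent(squid_hit_percent, memcached_hit_percent)
  100 - squid_hit_percent - memcached_hit_percent
end

share = db_share_percent(78, 7)
requests = 1_000_000 # illustrative request count, not from the talk

puts "#{share}% of hits reach the DB"                                   # 15%
puts "#{requests * share / 100} of #{requests} requests reach the DB"   # 150000
```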

Cached architecture – 1 machine

[Diagram: one Physical Machine hosting Squid; App Servers 1, 2 … 5; DB Connection Manager; two Master DB boxes; Session DB; Search DB; NLS DB; User Cluster 1 (a Master with three Slaves); NFS/SAN; Memcached]

How far can it go?

• For a truly global application with millions of users, in order of ease:
  – Have a cache on each continent
  – Make user clusters based on user location
    • Distribute the clusters physically around the world
  – Introduce app servers on each continent
  – If you must replicate your site globally then use transaction replication software, eg GoldenGate

Useful Links

• http://www.squid-cache.org/

• http://www.danga.com/memcached/

• http://sqlrelay.sourceforge.net/

• http://railsexpress.de/blog/

Questions?