Planning LAMP infrastructure

Presented 2010-05-19 by David Strauss. Designing, Scoping, and Configuring Scalable LAMP Infrastructure. Wed 2010-06-09

Presented at PHP Tek-X (2010)

Transcript of Planning LAMP infrastructure

Page 1: Planning LAMP infrastructure

Presented 2010-05-19 by David Strauss

Designing, Scoping, and Configuring Scalable LAMP Infrastructure

Wed 2010-06-09

Page 8: Planning LAMP infrastructure

About me

‣ Founded Four Kitchens in 2006 while at UT Austin
‣ In 2008, launched Pressflow, which now powers the largest Drupal sites
‣ Worked with some of the largest sites in the world: Lifetime Digital, Mansueto Ventures, Wikipedia, The Internet Archive, and The Economist
‣ Engineered the LAMP stack, deployment tools, and management tools for Yale University, multiple NBC-Universal properties, and Drupal.org
‣ Engineered development workflows for Examiner.com
‣ Contributor to Drupal, Bazaar, Ubuntu, BCFG2, Varnish, and other open-source projects

Page 14: Planning LAMP infrastructure

Some assumptions

‣ You have more than one web server
‣ You have root access
‣ You deploy to Linux (though PHP on Windows is more sane than ever)
‣ Database and web servers occupy separate boxes
‣ Your application behaves more or less like Drupal, WordPress, or MediaWiki

Page 15: Planning LAMP infrastructure

Understanding Load Distribution

Page 16: Planning LAMP infrastructure

Predicting peak traffic

Traffic over the day can be highly irregular. To plan for peak loads, design as if all traffic were as heavy as the peak hour of load in a typical month, and then plan for some growth.
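As a rough illustration of the peak-hour rule, the sketch below converts a peak-hour hit count into a request rate to design for. The function name, the sample hit count, and the 50% growth allowance are assumptions made for this example, not figures from the talk.

```python
def design_rate(peak_hour_hits: int, growth: float = 0.5) -> float:
    """Return the requests/second to design for.

    peak_hour_hits: hits in the busiest hour of a typical month.
    growth: fractional headroom for growth (0.5 = 50%, an assumed value).
    """
    if peak_hour_hits < 0 or growth < 0:
        raise ValueError("inputs must be non-negative")
    # Treat all traffic as if it arrived at the peak-hour rate.
    peak_rate = peak_hour_hits / 3600
    return peak_rate * (1 + growth)


if __name__ == "__main__":
    # Hypothetical site with 1.8 million hits in its busiest hour.
    print(f"{design_rate(1_800_000):.0f} req/s")
```

The resulting rate is the input for the per-layer calculations later in the deck.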

Page 34: Planning LAMP infrastructure

Analyzing hit distribution

‣ All hits: 100%
  ‣ Static Content: 30%
  ‣ Dynamic Pages: 70%
    ‣ Authenticated: 20%
    ‣ Anonymous: 50%
      ‣ Human: 40%
      ‣ Web Crawler: 10%
        ‣ No Special Treatment: 3%
        ‣ “Pay Wall” Bypass: 7%
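The hit breakdown can be written down as a small tree and checked for consistency. This is only a sketch: the nesting is inferred from the order in which the slide builds and from the fact that the percentages add up at each level, and the names `HITS`, `check`, and `leaves` are invented for the example.

```python
# Shares of total hits as (label, percent, children). The nesting is an
# inference from the slide's build order, not something the slide states.
HITS = ("All hits", 100, [
    ("Static content", 30, []),
    ("Dynamic pages", 70, [
        ("Authenticated", 20, []),
        ("Anonymous", 50, [
            ("Human", 40, []),
            ("Web crawler", 10, [
                ("No special treatment", 3, []),
                ("Pay wall bypass", 7, []),
            ]),
        ]),
    ]),
])


def check(node):
    """Raise ValueError if a node's children do not sum to its own share."""
    label, share, children = node
    if children:
        total = sum(child[1] for child in children)
        if total != share:
            raise ValueError(f"{label}: children sum to {total}, expected {share}")
        for child in children:
            check(child)


def leaves(node):
    """Yield (label, share) for every category that has no sub-categories."""
    label, share, children = node
    if not children:
        yield label, share
    for child in children:
        yield from leaves(child)


if __name__ == "__main__":
    check(HITS)
    for label, share in leaves(HITS):
        print(f"{share:3d}%  {label}")
```

Each leaf is a class of traffic that can be matched to a delivery method.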

Page 35: Planning LAMP infrastructure

Throughput vs. Delivery Methods

                           Green (Static)   Yellow (Dynamic, Cacheable)   Red (Dynamic)
Content Delivery Network   ●●●●●●●●●●       ✖                             ✖
Reverse Proxy Cache        ●●●●●●●●         ●●●●●●●                       ✖
PHP + APC + memcached      ●●●●             ●●●                           ●●●
PHP + APC                  ●●●●             ●●                            ●●
PHP (No APC)               ●●●●             ●                             ●

More dots = more throughput (scale: 10 req/s to 5000 req/s)
1: Delivered by Apache without PHP
2: Some actually can do this.

Page 36: Planning LAMP infrastructure

Objective

Deliver hits using the fastest, most scalable method available
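One way to read the objective is as a lookup: for each class of content, pick the delivery method with the most dots among those able to serve it. The sketch below assumes the dot counts from the throughput table; the `DOTS` dictionary, the class keys, and `fastest_method` are names invented for this example.

```python
# Relative throughput ("dots") per delivery method and content class.
# None means the method cannot serve that class of content.
DOTS = {
    "Content Delivery Network": {"static": 10, "cacheable": None, "dynamic": None},
    "Reverse Proxy Cache": {"static": 8, "cacheable": 7, "dynamic": None},
    "PHP + APC + memcached": {"static": 4, "cacheable": 3, "dynamic": 3},
    "PHP + APC": {"static": 4, "cacheable": 2, "dynamic": 2},
    "PHP (No APC)": {"static": 4, "cacheable": 1, "dynamic": 1},
}


def fastest_method(content_class: str) -> str:
    """Return the method with the highest relative throughput for a class."""
    candidates = {
        method: dots[content_class]
        for method, dots in DOTS.items()
        if dots[content_class] is not None
    }
    return max(candidates, key=candidates.get)


if __name__ == "__main__":
    for content_class in ("static", "cacheable", "dynamic"):
        print(content_class, "->", fastest_method(content_class))
```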

Page 50: Planning LAMP infrastructure

Layering: Less Traffic at Each Step

[Diagram: Traffic is served by the CDN or, via DNS Round Robin, reaches Your Datacenter, where the layers are, in order: Load Balancer, Reverse Proxy Cache, Application Server, Database]

Page 55: Planning LAMP infrastructure

Offload from the master database

Your master database is the single greatest limitation on scalability.

[Diagram: the Application Server is connected to the Master Database and to three offloading services: Search, Memory Cache, and Slave Database]
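A minimal sketch of the offloading idea: reads try the memory cache first, then a replica, and only then the master, while writes always go to the master. The class and its dictionaries are in-memory stand-ins invented for this example (for memcached, a slave database, and the master); no real client library is used.

```python
class OffloadingStore:
    """Route reads away from the master; send writes to the master.

    master, replica, and cache are plain dicts standing in for real services.
    """

    def __init__(self, master: dict, replica: dict, cache: dict):
        self.master = master
        self.replica = replica
        self.cache = cache
        self.master_reads = 0  # counts how often the master had to serve a read

    def read(self, key):
        if key in self.cache:        # 1. cache hit: no database work at all
            return self.cache[key]
        if key in self.replica:      # 2. replica read: the master stays idle
            value = self.replica[key]
        else:                        # 3. last resort: read from the master
            self.master_reads += 1
            value = self.master[key]
        self.cache[key] = value      # populate the cache for later reads
        return value

    def write(self, key, value):
        self.master[key] = value     # writes always go to the master
        self.cache.pop(key, None)    # drop the now-stale cache entry


if __name__ == "__main__":
    store = OffloadingStore(master={"a": 1}, replica={"a": 1}, cache={})
    store.read("a")
    store.write("b", 2)
    store.read("b")
    store.read("b")
    print("master reads:", store.master_reads)
```

Real replicas lag behind the master, so a read that must see a just-written value usually has to go to the master; the sketch ignores that case for any key the replica already holds.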

Page 59: Planning LAMP infrastructure

Tools to use

‣ Apache Solr or Sphinx for search
  ‣ Solr can be fronted with Varnish or another proxy cache if queries are repetitive.
‣ Varnish, nginx, Squid, or Traffic Server for reverse proxy caching
‣ Any third-party service for CDN

Page 67: Planning LAMP infrastructure

Do the math

‣ All non-CDN traffic travels through your load balancers and reverse proxy caches. Even traffic passed through to application servers must run through the initial layers.

[Diagram: Internal Traffic flows through Load Balancer, then Reverse Proxy Cache, then Application Server]

What hit rate is each layer getting? How many servers share the load?
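The two questions on this slide can be turned into arithmetic. The sketch below assumes that every internal request passes through a load balancer and a reverse proxy cache, and that only cache misses reach the application servers; the function name, parameters, and sample numbers are illustrative assumptions.

```python
def layer_load(internal_rate, cache_hit_rate, balancers, caches, app_servers):
    """Return the per-server request rate (req/s) at each layer.

    internal_rate: non-CDN requests/second arriving at the datacenter.
    cache_hit_rate: fraction (0..1) answered by the reverse proxy caches.
    balancers, caches, app_servers: number of servers sharing each layer.
    """
    if not 0 <= cache_hit_rate <= 1:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    # Only cache misses are passed through to the application servers.
    passed_to_app = internal_rate * (1 - cache_hit_rate)
    return {
        "load_balancer": internal_rate / balancers,
        "reverse_proxy_cache": internal_rate / caches,
        "application_server": passed_to_app / app_servers,
    }


if __name__ == "__main__":
    # Hypothetical: 2000 req/s internal traffic, 75% cache hit rate.
    for layer, rate in layer_load(2000, 0.75, 2, 2, 5).items():
        print(f"{layer}: {rate:.0f} req/s per server")
```

Comparing each per-server rate with what one server of that type can sustain shows which layer needs more machines.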

Page 74: Planning LAMP infrastructure

Get a management/monitoring box

[Diagram: a Management box connected to the Application Server, Reverse Proxy Cache, Database, and Load Balancer]

(maybe even two and have them specialize or be redundant)

Page 75: Planning LAMP infrastructure

Planning + Scoping

Page 80: Planning LAMP infrastructure

Infrastructure goals

‣ Redundancy: tolerate failure
‣ Scalability: engage more users
‣ Performance: ensure each user’s experience is fast
‣ Manageability: stay sane in the process

Page 86: Planning LAMP infrastructure

Redundancy

‣ When one server fails, the website should be able to recover without taking too long.
‣ This requires at least N+1, putting a floor on system requirements even for small sites.
‣ How long can your site be down?
‣ Automatic versus manual failover
‣ Warning: over-automation can reduce uptime

Page 92: Planning LAMP infrastructure

Performance

‣ Find the “sweet spot” for hardware. This is the best price/performance point.
‣ Avoid overspending on any type of component
‣ Yet, avoid creating bottlenecks
‣ Swapping memory to disk is very dangerous
‣ Don’t skimp on RAM

Page 93: Planning LAMP infrastructure

Relative importance

                      Processors/Cores   Memory   Disk Speed
Reverse Proxy Cache   ●●                 ●●●      ●●
Web Server            ●●●●●              ●●       ●
Database Server       ●●●                ●●●●     ●●●●
Monitoring            ●                  ●        ●

Page 98: Planning LAMP infrastructure

All of your servers

‣ 64-bit: no excuse to use anything less in 2010
‣ RHEL/CentOS and Ubuntu have the broadest adoption for large-scale LAMP
‣ But pick one, and stick with it for development, staging, and production
‣ Some disk redundancy: rebuilding a server is time-consuming unless you’re very automated

Page 105: Planning LAMP infrastructure

Reverse proxy caches

‣ Varnish and nginx have modern architecture and broad adoption
  ‣ Sites often front Varnish with nginx for gzip and/or SSL
‣ Squid and Traffic Server are clunky but reliable alternatives

CPU: Save Your Money
+ Memory: 1 GB base system + 3 GB for caching
+ Disk: Slow + Small + Redundant
= 5000 req/s

Page 114: Planning LAMP infrastructure

Web servers

‣ Apache 2.2 + mod_php + memcached
‣ FastCGI is a bad idea
  ‣ Memory improvements are redundant with Varnish
  ‣ Higher latency + less efficient with APC opcode
‣ Check the memory your app takes per process
‣ Tune MaxClients to around 25 × cores

CPU: Max out cores (but prefer fast cores to density)
+ Memory: 1 GB base system + 1 GB memcached + 25 × cores × per-process app memory
+ Disk: Slow + Small + Redundant
= 100 req/s
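The web server memory formula can be worked through in a few lines. This sketch applies the slide's rules (MaxClients of about 25 × cores, plus 1 GB for the base system and 1 GB for memcached); the function name and the sample values of 8 cores and 64 MB per process are assumptions for illustration.

```python
def web_server_sizing(cores: int, per_process_mb: float) -> dict:
    """Estimate MaxClients and RAM for one Apache + mod_php web server.

    cores: number of CPU cores in the box.
    per_process_mb: measured memory of one Apache/PHP process, in MB.
    """
    max_clients = 25 * cores                # rule of thumb from the slide
    base_system_mb = 1024                   # 1 GB base system
    memcached_mb = 1024                     # 1 GB memcached
    app_mb = max_clients * per_process_mb   # 25 x cores x per-process memory
    total_mb = base_system_mb + memcached_mb + app_mb
    return {"max_clients": max_clients, "memory_gb": total_mb / 1024}


if __name__ == "__main__":
    # Hypothetical 8-core server whose app uses 64 MB per process.
    print(web_server_sizing(cores=8, per_process_mb=64))
```

If the estimate exceeds the RAM you can afford, lowering MaxClients is safer than letting the server swap.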

Page 123: Planning LAMP infrastructure

Database servers

‣ Insist on MySQL 5.1+ and InnoDB
‣ Consider Percona builds and (eventually) MariaDB
‣ Every Apache process generally needs at least one connection available, and leave some headroom
‣ Tune the InnoDB buffer pool to at least half of RAM

CPU: No more than 8-12 cores
+ Memory: As much as you can afford (even RAM not used by MySQL caches disk content)
+ Disk: Fast + Large + Redundant
= 3000 queries/s
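The connection and buffer pool guidance can be sketched as a small calculation. The result keys borrow the names of the MySQL settings `max_connections` and `innodb_buffer_pool_size`; the function itself, the headroom of 50 connections, and the sample cluster are assumptions made for this example.

```python
def mysql_settings(ram_gb, web_servers, max_clients_per_server, headroom=50):
    """Suggest minimum values for two MySQL settings.

    ram_gb: RAM in the database server, in GB.
    web_servers: number of Apache web servers connecting to this database.
    max_clients_per_server: Apache MaxClients on each web server.
    headroom: extra connections for admin and monitoring (assumed value).
    """
    # Every Apache process may need one connection at the same time.
    connections = web_servers * max_clients_per_server + headroom
    return {
        "max_connections": connections,
        # "At least half of RAM": this is a floor, not a recommended maximum.
        "innodb_buffer_pool_size_gb": ram_gb / 2,
    }


if __name__ == "__main__":
    # Hypothetical: 32 GB database box behind 4 web servers at MaxClients 200.
    print(mysql_settings(ram_gb=32, web_servers=4, max_clients_per_server=200))
```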

Page 133: Planning LAMP infrastructure

Management server

‣ Nagios: service outage monitoring
‣ Cacti: trend monitoring
‣ Hudson: builds, deployment, and automation
‣ Yum/Apt repo: cluster package distribution
‣ Puppet/BCFG2/Chef: configuration management

CPU: Save Your Money
+ Memory: Save Your Money
+ Disk: Slow + Large + Redundant
= good enough

Page 138: Planning LAMP infrastructure

Assembling the numbers
‣ Start with an architecture providing redundancy.
  ‣ Two servers, each running the whole stack
‣ Increase the number of proxy caches based on anonymous and search engine traffic.
‣ Increase the number of web servers based on authenticated traffic.
‣ Databases are harder to predict, but large sites should run them on at least two separate boxes with replication.
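The sizing steps above can be sketched as a small calculation. Every number here (peak requests per second, per-server capacity) is a hypothetical placeholder rather than a figure from the talk, and the function names are made up for illustration; measure your own traffic before trusting any result.

```python
import math

def servers_needed(peak_rps, rps_per_server, minimum=2):
    """Servers required for one tier, never fewer than the redundancy floor."""
    return max(minimum, math.ceil(peak_rps / rps_per_server))

def size_cluster(anonymous_rps, authenticated_rps,
                 cache_rps_per_server=2000, web_rps_per_server=50):
    # Assumption: anonymous and search engine traffic is absorbed by proxy caches,
    # while authenticated traffic passes through to the web servers.
    return {
        "proxy_caches": servers_needed(anonymous_rps, cache_rps_per_server),
        "web_servers": servers_needed(authenticated_rps, web_rps_per_server),
        "databases": 2,  # at least two separate boxes with replication
    }

plan = size_cluster(anonymous_rps=5000, authenticated_rps=400)
print(plan)
```

The `minimum=2` floor reflects the starting point of two redundant servers; the per-tier counts only grow from there.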

Page 139: Planning LAMP infrastructure

Extreme measures for performance and scalability

Page 143: Planning LAMP infrastructure

When caching and search offloading aren’t enough
‣ Some sites have intense custom page needs
  ‣ High proportion of authenticated users
  ‣ Lots of targeted content for anonymous users
‣ Too much data to process in real time on an RDBMS
‣ Data is so volatile that the cost of maintaining standard caches outweighs the overhead of regeneration

Page 150: Planning LAMP infrastructure

Non-relational/NoSQL tools
‣ Most web applications can run well on less-than-ACID persistence engines
‣ In some cases, like MongoDB, easier to use than SQL in addition to being higher performance
  ‣ Interested? You’ve already missed the tutorial.
‣ In other cases, like Cassandra, considerably harder to use than SQL but massively scalable
‣ Current Erlang-based systems are neat but slow
‣ Many require a special PHP extension, at least for ideal performance

Page 154: Planning LAMP infrastructure

Offline processing
‣ Gearman
  ‣ Primarily asynchronous job manager
‣ Hadoop
  ‣ MapReduce framework
‣ Traditional message queues
  ‣ ActiveMQ + Stomp is easy from PHP
  ‣ Allows you to build your own job manager
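The idea shared by all of these tools is that the request path only enqueues work and a separate worker does it later. The sketch below shows that pattern with Python's standard library queue as a stand-in; it is not the Gearman, Hadoop, or ActiveMQ API, and the job names are invented for illustration.

```python
import queue
import threading

jobs = queue.Queue()
results = {}

def worker():
    """Pull jobs off the queue until the shutdown sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:              # sentinel: stop the worker
            jobs.task_done()
            break
        job_id, payload = job
        results[job_id] = payload.upper()   # stand-in for slow offline work
        jobs.task_done()

def submit(job_id, payload):
    """Called from the request path: enqueue and return immediately."""
    jobs.put((job_id, payload))

thread = threading.Thread(target=worker, daemon=True)
thread.start()

submit("thumb-1", "resize image")
submit("mail-1", "send welcome email")
jobs.put(None)
jobs.join()      # only for this demo; a web request would not wait
thread.join()
print(results)
```

In a real deployment the queue lives in a separate service so that workers can run on other machines and survive web server restarts.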

Page 162: Planning LAMP infrastructure

Edge-side includes
‣ Blocks of HTML are integrated into the page at the edge layer.
‣ Non-primary page content often occupies >50% of PHP execution time.
‣ Decouples block and page cache lifetimes

Page as generated by the application:

<html><body>
<esi:include src="http://drupal.org/block/views/3" />
</body></html>

Block returned for the included URL:

<div>
My block HTML.
</div>

Page assembled by the ESI processor (Varnish, Akamai, other):

<html><body>
<div>
My block HTML.
</div>
</body></html>
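The assembly step can be illustrated with a toy processor. This is a minimal sketch, not how Varnish or Akamai implement ESI: it handles only self-closing include tags, uses the `src` attribute that the ESI 1.0 specification defines for includes, and takes fragments from an in-memory dictionary instead of fetching them over HTTP. The function and variable names are hypothetical.

```python
import re

# Matches self-closing include tags such as <esi:include src="..." />
ESI_INCLUDE = re.compile(r'<esi:include\s+src="([^"]+)"\s*/>')

def assemble(page, fetch_fragment):
    """Replace each include tag with the fragment fetched for its URL."""
    return ESI_INCLUDE.sub(lambda match: fetch_fragment(match.group(1)), page)

# Stand-in for the block cache; a real processor would fetch and cache each URL.
fragments = {"http://drupal.org/block/views/3": "<div>My block HTML.</div>"}
page = ('<html><body>'
        '<esi:include src="http://drupal.org/block/views/3" />'
        '</body></html>')

assembled = assemble(page, fragments.__getitem__)
print(assembled)
```

Because the page and the fragment are looked up separately, each can carry its own cache lifetime, which is the decoupling the slide describes.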

Page 167: Planning LAMP infrastructure

HipHop PHP
‣ Compiles PHP to a C++-based binary
‣ Integrated HTTP server
‣ Supports a subset of PHP and extensions
‣ Requires an organizational commitment to building, testing, and deploying on HipHop
‣ Scott MacVicar has a presentation on HipHop later today at 16:00.

Page 168: Planning LAMP infrastructure

Cluster Problems

Page 174: Planning LAMP infrastructure

Server failure
‣ Load balancers can remove broken or overloaded application reverse proxy caches.
‣ Reverse proxy caches like Varnish can automatically use only functional application servers.
‣ Memcached clients automatically handle failure.
‣ Virtual service IP management tools like heartbeat2 can manage which MySQL servers receive connections to automate failover.
‣ Conclusion: Each layer intelligently monitors and uses the servers beneath it.
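The conclusion, that each layer monitors and uses only the working servers beneath it, can be sketched as a pool that probes its backends before choosing one. This is an illustrative model only; the class, the probe callable, and the server names are assumptions and do not reflect how any particular load balancer or Varnish is configured.

```python
import itertools

class BackendPool:
    """Round-robin over backends, skipping any that fail their health probe."""

    def __init__(self, backends, probe):
        self._backends = list(backends)
        self._probe = probe                  # callable: backend -> bool
        self._counter = itertools.count()

    def healthy(self):
        return [b for b in self._backends if self._probe(b)]

    def pick(self):
        candidates = self.healthy()
        if not candidates:
            raise RuntimeError("no healthy backends")
        return candidates[next(self._counter) % len(candidates)]

down = {"web2"}                              # pretend web2 has failed
pool = BackendPool(["web1", "web2", "web3"], probe=lambda b: b not in down)
picks = [pool.pick() for _ in range(4)]
print(picks)
```

Stacking one such pool per layer (load balancer over caches, caches over application servers) gives the layered failure handling described above.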

Page 179: Planning LAMP infrastructure

Cluster coherency
‣ Systems that run properly on single boxes may lose coherency when run on a networked cluster.
‣ Some caches, like APC’s object cache, have no ability to handle network-level coherency. (APC’s opcode cache is safe to use on clusters, though.)
‣ memcached, if misconfigured, can hash keys inconsistently across the cluster, resulting in different servers using different memcached instances for the same keys.
‣ Session coherency issues can be helped with load balancer affinity or storage in memcached.
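The memcached problem is easy to reproduce in miniature. The sketch below assumes naive modulo hashing over the configured server list, which is a simplification of what real client libraries do, and the host names are made up. Two web servers list the same instances in a different order and end up disagreeing about where keys live.

```python
import zlib

def pick_server(key, servers):
    """Naive modulo hashing: the result depends on the order of the server list."""
    return servers[zlib.crc32(key.encode()) % len(servers)]

# Same three instances, listed in a different order on each web server.
web1_config = ["cache-a:11211", "cache-b:11211", "cache-c:11211"]
web2_config = ["cache-c:11211", "cache-b:11211", "cache-a:11211"]

keys = [f"node:{n}" for n in range(100)]
mismatched = [k for k in keys
              if pick_server(k, web1_config) != pick_server(k, web2_config)]
print(f"{len(mismatched)} of {len(keys)} keys map to different instances")
```

Distributing one identical client configuration to every web server, for example through the configuration management tools covered later, avoids this class of bug.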

Page 187: Planning LAMP infrastructure

Cache regeneration races
‣ Downside to network cache coherency: synchronized expiration
‣ Requires a locking framework (like ZooKeeper)

Timeline: Old Cached Item → Expiration → { All servers regenerating the item } → New Cached Item
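The race and its fix can be shown in a single process. In this sketch a `threading.Lock` stands in for a cluster-wide lock such as one provided by ZooKeeper, and threads stand in for web servers; the class and function names are hypothetical. Only the first caller regenerates the item, and the others wait and reuse its result.

```python
import threading
import time

class LockedCache:
    """Single-item cache where only one caller regenerates an expired value."""

    def __init__(self, regenerate):
        self._regenerate = regenerate
        self._lock = threading.Lock()    # stand-in for a cluster-wide lock
        self._value = None
        self._expires_at = 0.0
        self.regenerations = 0

    def _fresh(self):
        return self._value is not None and time.monotonic() < self._expires_at

    def get(self, ttl=60.0):
        if self._fresh():
            return self._value
        with self._lock:
            # Re-check after acquiring the lock: another caller may have
            # regenerated the item while this one was waiting.
            if not self._fresh():
                self._value = self._regenerate()
                self._expires_at = time.monotonic() + ttl
                self.regenerations += 1
            return self._value

def slow_regenerate():
    time.sleep(0.05)                     # stand-in for an expensive page build
    return "page"

results = []
cache = LockedCache(slow_regenerate)
threads = [threading.Thread(target=lambda: results.append(cache.get()))
           for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(cache.regenerations)
```

Without the lock and the second freshness check, all eight callers would run `slow_regenerate` at once, which is the stampede shown on the timeline.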

Page 192: Planning LAMP infrastructure

Broken replication
‣ MySQL slave servers get out of sync and fall further behind
‣ No (sane) method of automated recovery
‣ Only solvable with good monitoring and recovery procedures
‣ Can automate blacklisting of DB slaves from use, but requires cluster management tools
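Automated blacklisting can be as simple as filtering slaves on their replication status. The sketch below assumes the output of MySQL's `SHOW SLAVE STATUS` has already been collected into one dictionary per slave; the column names are the ones MySQL reports, while the 30-second threshold, the function name, and the host names are assumptions for illustration.

```python
def usable_slaves(statuses, max_lag_seconds=30):
    """Return the slaves that are replicating and not too far behind."""
    usable = []
    for name, status in statuses.items():
        lag = status.get("Seconds_Behind_Master")   # None when replication is broken
        running = (status.get("Slave_IO_Running") == "Yes"
                   and status.get("Slave_SQL_Running") == "Yes")
        if running and lag is not None and lag <= max_lag_seconds:
            usable.append(name)
    return sorted(usable)

statuses = {
    "db2": {"Slave_IO_Running": "Yes", "Slave_SQL_Running": "Yes",
            "Seconds_Behind_Master": 2},
    "db3": {"Slave_IO_Running": "Yes", "Slave_SQL_Running": "Yes",
            "Seconds_Behind_Master": 900},      # lagging badly
    "db4": {"Slave_IO_Running": "Yes", "Slave_SQL_Running": "No",
            "Seconds_Behind_Master": None},     # SQL thread stopped
}
print(usable_slaves(statuses))
```

This only takes broken slaves out of rotation; repairing them still needs the monitoring and recovery procedures mentioned above.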

Page 193: Planning LAMP infrastructure

All content in this presentation, except where noted otherwise, is Creative Commons Attribution-ShareAlike 3.0 licensed and copyright 2009 Four Kitchen Studios, LLC.

Page 194: Planning LAMP infrastructure

DrupalCamp Stockholm Presentation Ended Here

Page 195: Planning LAMP infrastructure

Managing the Cluster

Page 196: Planning LAMP infrastructure

The problem

[Diagram: five application servers sharing the same software and configuration]

Objectives:
‣ Fast, atomic deployment and rollback
‣ Minimize single points of failure and contention
‣ Restart services
‣ Integrate with version control systems

Page 197: Planning LAMP infrastructure

Manual updates and deployment

[Diagram: five application servers, each updated by a separate human]

Why not: slow deployment, non-atomic/difficult rollbacks

Page 198: Planning LAMP infrastructure

Shared storage

[Diagram: five application servers sharing a single NFS mount]

Why not: single point of contention and failure

Page 199: Planning LAMP infrastructure

rsync

[Diagram: five application servers synchronized with rsync]

Why not: non-atomic, does not manage services

Page 200: Planning LAMP infrastructure

Capistrano

[Diagram: five application servers deployed with Capistrano]

Capistrano provides near-atomic deployment, service restarts, automated rollback, test automation, and version control integration (tagged releases).
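Near-atomic deployment and rollback usually rest on one trick: each release gets its own directory, and a `current` symlink is switched to the new release in a single rename. Capistrano lays out releases this way, but the sketch below is a stand-alone Python illustration on a POSIX filesystem, not Capistrano's code; the `activate` helper and the release names are hypothetical.

```python
import os
import tempfile

def activate(root, release):
    """Point root/current at releases/<release> with a single atomic rename."""
    target = os.path.join(root, "releases", release)
    tmp_link = os.path.join(root, "current.tmp")
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(target, tmp_link)
    # rename() replaces the old symlink in one step, so requests see either
    # the old release or the new one, never a half-updated tree.
    os.replace(tmp_link, os.path.join(root, "current"))

root = tempfile.mkdtemp()
for release, version in (("20100601", "1"), ("20100609", "2")):
    path = os.path.join(root, "releases", release)
    os.makedirs(path)
    with open(os.path.join(path, "VERSION"), "w") as handle:
        handle.write(version)

def live_version():
    with open(os.path.join(root, "current", "VERSION")) as handle:
        return handle.read()

activate(root, "20100601")
first = live_version()
activate(root, "20100609")    # deploy the new release
second = live_version()
activate(root, "20100601")    # roll back by re-pointing the symlink
third = live_version()
print(first, second, third)
```

This is why rollback is cheap with this layout: the previous release is still on disk, so reverting is another symlink switch rather than a second copy operation. Restarting services across the cluster is a separate step that the sketch leaves out.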

Page 201: Planning LAMP infrastructure

Multistage deployment

[Diagram: Development Integration, Staging, and the five application servers, each deployed with Capistrano]

Deployments can be staged.

cap staging deploy
cap production deploy

Page 202: Planning LAMP infrastructure

But your application isn’t the only thing to manage.

Page 203: Planning LAMP infrastructure

Beneath the application

[Diagram: reverse proxy cache, five application servers, and database, all under cluster-level configuration]

cfengine and bcfg2 are popular cluster-level system configuration tools.

Cluster management applies to package management, updates, and software configuration.

Page 204: Planning LAMP infrastructure

System configuration management
‣ Deploys and updates packages, cluster-wide or selectively.
‣ Manages arbitrary text configuration files
‣ Analyzes inconsistent configurations (and converges them)
‣ Manages device classes (app. servers, database servers, etc.)
‣ Allows confident configuration testing on a staging server.
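"Analyze and converge" boils down to comparing a declared state with what a machine actually has and emitting the actions that close the gap. The sketch below models only package versions and is not how cfengine, bcfg2, Puppet, or Chef work internally; the function name, action names, and package versions are invented for illustration, and removing every undeclared package is a deliberate simplification.

```python
def plan_convergence(desired, actual):
    """Compare desired and actual package versions; return the actions needed."""
    actions = []
    for package, version in sorted(desired.items()):
        if package not in actual:
            actions.append(("install", package, version))
        elif actual[package] != version:
            actions.append(("set_version", package, version))
    # Simplification: anything installed but not declared is removed.
    for package in sorted(set(actual) - set(desired)):
        actions.append(("remove", package, actual[package]))
    return actions

# Hypothetical declared state for the "application server" device class.
desired = {"httpd": "2.2.3", "php": "5.2.10"}
actual = {"httpd": "2.2.3", "php": "5.1.6", "telnet-server": "0.17"}

for action in plan_convergence(desired, actual):
    print(action)
```

Running the same plan against a staging server first is what makes configuration changes testable before they reach production.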

Page 205: Planning LAMP infrastructure

All on the management box

[Diagram: the management box hosts Development, Integration, Staging, Deployment Tools, and Monitoring]

Page 206: Planning LAMP infrastructure

Monitoring

Page 207: Planning LAMP infrastructure

Types of monitoring
‣ Failure
  ‣ Analyzing Downtime
  ‣ Viewing Failover
  ‣ Troubleshooting
  ‣ Notification
‣ Capacity/Load
  ‣ Analyzing Trends
  ‣ Predicting Load
  ‣ Checking Results of Configuration and Software Changes

Page 208: Planning LAMP infrastructure

Everyone needs both.

Page 209: Planning LAMP infrastructure

What to use
‣ Failure/Uptime
  ‣ Nagios
  ‣ Hyperic
‣ Capacity/Load
  ‣ Cacti
  ‣ Munin

Page 210: Planning LAMP infrastructure

Nagios
‣ Highly recommended.
‣ Used by Four Kitchens and Tag1 Consulting for client work, Drupal.org, Wikipedia, etc.
‣ Easy to install on CentOS 5 using EPEL packages.
‣ Easy to install nrpe agents to monitor diverse services.
‣ Can notify administrators on failure.
‣ We use this on Drupal.org
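Nagios checks follow a simple plugin convention: print one status line and exit with 0 (OK), 1 (WARNING), 2 (CRITICAL), or 3 (UNKNOWN). The sketch below maps a load average onto that convention. The thresholds and the `check_load` function are assumptions for illustration and are unrelated to the stock `check_load` plugin; a real plugin would finish with `sys.exit(code)`.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

def check_load(load_average, warn=4.0, crit=8.0):
    """Map a load average onto a Nagios-style (exit code, message) pair."""
    if load_average is None:
        return UNKNOWN, "LOAD UNKNOWN - no reading"
    if load_average >= crit:
        return CRITICAL, f"LOAD CRITICAL - {load_average:.2f}"
    if load_average >= warn:
        return WARNING, f"LOAD WARNING - {load_average:.2f}"
    return OK, f"LOAD OK - {load_average:.2f}"

code, message = check_load(5.5)
print(code, message)
```

The same shape works for any service an nrpe agent can reach: measure something locally, compare it to thresholds, and report through the exit code.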

Page 211: Planning LAMP infrastructure

Cacti
‣ Highly annoying to set up.
‣ One instance generally collects all statistics. (No “agents” on the systems being monitored.)
‣ Provides flexible graphs that can be customized on demand.

Page 212: Planning LAMP infrastructure

Munin
‣ Fairly easy to set up.

‣ One master instance generally collects all statistics. (Unlike Cacti, it polls a munin-node agent on each system being monitored.)

‣ Provides static graphs that cannot be customized.

Page 213: Planning LAMP infrastructure

Pressflow
Make Drupal sites scale by upgrading core with a compatible, powerful replacement.

Page 214: Planning LAMP infrastructure

Common large-site issues
‣ Drupal core requires patching to effectively support the advanced scalability techniques discussed here.
‣ Patches often conflict and have to be reapplied with each Drupal upgrade.
‣ The original patches are often unmaintained.
‣ Sites stagnate, running old, insecure versions of Drupal core because updating is too difficult.

Page 215: Planning LAMP infrastructure

What is Pressflow?
‣ Pressflow is a derivative of Drupal core that integrates the most popular performance and scalability enhancements.
‣ Pressflow is completely compatible with existing Drupal 5 and 6 modules, both standard and custom.
‣ Pressflow installs as a drop-in replacement for standard Drupal.
‣ Pressflow is free as long as the matching version of Drupal is also supported by the community.

Page 216: Planning LAMP infrastructure

What are the enhancements?
‣ Reverse proxy support
‣ Database replication support
‣ Lower database and session management load
‣ More efficient queries
‣ Testing and optimization by Four Kitchens with standard high-performance software and hardware configuration
‣ Industry-leading scalability support by Four Kitchens and Tag1 Consulting

Page 217: Planning LAMP infrastructure

Four Kitchens + Tag1
‣ Provide the development, support, scalability, and performance services behind Pressflow
‣ Comprise most members of the Drupal.org infrastructure team
‣ Have the most experience scaling Drupal sites of all sizes and all types

Page 218: Planning LAMP infrastructure

Ready to scale?
‣ Learn more about Pressflow:
  ‣ Pick up pamphlets in the lobby
  ‣ Request Pressflow releases at fourkitchens.com
‣ Get the help you need to make it happen:
  ‣ Talk to me (David) or Todd here at DrupalCamp
  ‣ Email [email protected]