Usenix LISA 2012 - Choosing a Proxy

Post on 08-May-2015


My talk on choosing an HTTP proxy cache at Usenix LISA 2012.

Transcript of Usenix LISA 2012 - Choosing a Proxy

Choosing a Proxy - Don’t roll the D20!

Leif Hedstrom, Cisco WebEx

Who am I?

• Unix developer since 1985
• Yeah, I’m really that old, I learned Unix on BSD 2.9
• Long time SunOS/Solaris/Linux user
• Mozilla committer (but not active now)
• VP of Apache Traffic Server PMC
• ASF member
• Overall hacker, geek and technology addict

zwoop@apache.org
@zwoop
+lhedstrom

So which proxy cache should you choose?

Plenty of Proxy Servers

PerlBal

And plenty of “reliable” sources…

Answer: the one that solves your problem!

http://mihaelasharkova.files.wordpress.com/2011/05/5steploop2.jpg

But first…

• While you are still awake, and the coffee is fresh:

My crash course in HTTP proxy and caching!

Forward Proxy

Reverse Proxy

Intercepting Proxy

Why Cache is King

• The fastest content to serve is the data the user already has locally on their computer/browser
  – This is near zero cost and zero latency!
• The speed of light is still a limiting factor
  – Reduce the latency -> faster page loads
• Serving out of cache is computationally cheap
  – At least compared to e.g. PHP or any other higher level page generation system
  – It’s easy to scale caches horizontally
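To put a rough number on the speed-of-light bullet, here is a back-of-the-envelope sketch. It assumes signals travel at about 200,000 km/s in optical fiber (roughly two thirds of c); the function name and the example distances are made up for illustration, and real round trips are slower because of routing, queuing and server time.

```python
# Lower bound on round-trip time from signal propagation alone.
# Assumption: ~200,000 km/s in optical fiber (about 2/3 of c in vacuum).
FIBER_KM_PER_S = 200_000.0


def min_rtt_ms(distance_km: float) -> float:
    """Smallest possible round-trip time in milliseconds over distance_km."""
    return 2.0 * distance_km / FIBER_KM_PER_S * 1000.0


if __name__ == "__main__":
    # Illustrative distances only, not measurements.
    for label, km in (("same metro", 50), ("cross-continent", 4000),
                      ("intercontinental", 10000)):
        print(f"{label:>17}: at least {min_rtt_ms(km):.1f} ms per round trip")
```

A cache that sits close to the user shrinks the distance term, which is the only part of this bound you can actually change.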

Choosing an intermediary

Plenty of Proxy Servers

PerlBal

Plenty of Free Proxy Servers

PerlBal


Plenty of Free Caching Proxy Servers

Choosing an intermediary

The problem

• You basically cannot buy a computer today with fewer than 2 CPUs or cores
• Things will only get “worse”!
  – Well, really, it’s getting better
• Typical server deployments today have at least 8 – 16 cores
  – How many of those can you actually use??
  – And are you using them efficiently??

• NUMA turns out to be kind of a bitch…

Solution 1: Multi-threading

Problems with multi-threading

• It’s a wee bit difficult to get it right!

http://www.flickr.com/photos/stuartpilbrow/3345896050/

Problems with multi-threading

Solution 2: Event Processing

Problems with Event Processing

• It hates blocking APIs and calls!
  – Hating it back doesn’t help :/
• Still somewhat complicated
• It doesn’t scale on SMP by itself
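The blocking-call problem is easy to see in a toy event loop. This is a minimal sketch using Python's asyncio, not how any of the proxies here are implemented. The names `blocking_lookup`, `heartbeat` and `largest_stall` are hypothetical, and the 0.2 s sleep stands in for any blocking API (a synchronous DNS lookup, a disk read).

```python
import asyncio
import time


def blocking_lookup(delay: float) -> str:
    """Stand-in for a blocking API call; it holds the calling thread."""
    time.sleep(delay)
    return "done"


async def heartbeat(ticks: list, interval: float, count: int) -> None:
    """Record a timestamp every `interval` seconds, like other pending events."""
    for _ in range(count):
        await asyncio.sleep(interval)
        ticks.append(time.monotonic())


async def largest_stall(offload: bool) -> float:
    """Return the longest gap (seconds) between heartbeat ticks."""
    ticks = [time.monotonic()]
    task = asyncio.create_task(heartbeat(ticks, 0.01, 10))
    await asyncio.sleep(0)  # let the heartbeat task start
    if offload:
        # Hand the blocking call to a thread pool; the loop keeps running.
        loop = asyncio.get_running_loop()
        await loop.run_in_executor(None, blocking_lookup, 0.2)
    else:
        # Call it directly; every other event waits until it returns.
        blocking_lookup(0.2)
    await task
    return max(b - a for a, b in zip(ticks, ticks[1:]))


if __name__ == "__main__":
    print("blocking in the loop:", round(asyncio.run(largest_stall(False)), 3))
    print("offloaded to threads:", round(asyncio.run(largest_stall(True)), 3))
```

Called directly, the blocking function freezes every other event for its full duration; handed to a thread pool, the loop keeps ticking. Neither variant uses more than one core for the loop itself, which is the "doesn't scale on SMP by itself" point.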

Where are we at?

            Apache TS        Nginx     Squid     Varnish
Processes   1                1 - <n>   1 - <n>   1
Threads     Based on cores   1         1         Lots
Evented     Yes              Yes       Yes       Yes *)

*) Can use blocking calls, with (large) thread pool

Proxy Cache test setup

• AWS Large instances, 2 CPUs
• All on RFC 1918 network (“internal” net)
• 8GB RAM
• Access logging enabled to disk (except on Varnish)
• Software versions
  – Linux v3.2.0
  – Traffic Server v3.3.1
  – Nginx v1.3.9
  – Squid v3.2.5
  – Varnish v3.0.3
• Minimal configuration changes
• Cache a real (Drupal) site

ATS configuration

• etc/trafficserver/remap.config:

    map / http://10.118.154.58

• etc/trafficserver/records.config:

    CONFIG proxy.config.http.server_ports STRING 80

Nginx configuration try 1, basically defaults (broken, don’t use)

    worker_processes 2;
    access_log logs/access.log main;

    proxy_cache_path /mnt/nginx_cache levels=1:2 keys_zone=my-cache:8m
                     max_size=16384m inactive=600m;
    proxy_temp_path /mnt/nginx_temp;

    server {
        listen 80;

        location / {
            proxy_pass http://10.83.145.47/;
            proxy_cache my-cache;
        }
    }

Nginx configuration try 2 (works but really slow, 10x slower)

    worker_processes 2;
    access_log logs/access.log main;

    proxy_cache_path /mnt/nginx_cache levels=1:2 keys_zone=my-cache:8m
                     max_size=16384m inactive=600m;
    proxy_temp_path /mnt/nginx_temp;

    gzip on;
    server {
        listen 80;

        location / {
            proxy_pass http://10.83.145.47/;
            proxy_cache my-cache;
            # Always fetch uncompressed from the origin; nginx gzips on the way out
            proxy_set_header Accept-Encoding "";
        }
    }

Nginx configuration try 3 (works and reasonably fast, but WTF!)

    worker_processes 2;
    access_log logs/access.log main;

    proxy_cache_path /mnt/nginx_cache levels=1:2 keys_zone=my-cache:8m
                     max_size=16384m inactive=600m;
    proxy_temp_path /mnt/nginx_temp;

    server {
        listen 80;
        # Normalize Accept-Encoding to either "gzip" or ""
        set $ae "";
        if ($http_accept_encoding ~* gzip) {
            set $ae "gzip";
        }

        location / {
            proxy_pass http://10.83.145.47/;
            proxy_cache my-cache;
            proxy_set_header If-None-Match "";
            proxy_set_header If-Modified-Since "";
            proxy_set_header Accept-Encoding $ae;
            proxy_cache_key $uri$is_args$args$ae;
        }

        # Needs the third-party ngx_cache_purge module; zone and key must
        # match keys_zone and proxy_cache_key above
        location ~ /purge_it(/.*) {
            proxy_cache_purge my-cache $1$is_args$args$ae;
        }
    }

Thanks to Chris Ueland at NetDNA for the snippet
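Why try 3 works: the cache key carries a normalized Accept-Encoding, so gzip-capable and non-gzip clients land on different cache entries. Here is a small Python sketch of the same key logic, purely to illustrate what `$uri$is_args$args$ae` evaluates to. The function names are hypothetical, and like the nginx snippet it does a naive case-insensitive match for "gzip" (so it would also match `gzip;q=0`).

```python
import re


def normalized_encoding(accept_encoding: str) -> str:
    """Mirror of the `set $ae` logic: collapse the header to "gzip" or ""."""
    return "gzip" if re.search("gzip", accept_encoding or "", re.IGNORECASE) else ""


def cache_key(uri: str, args: str, accept_encoding: str) -> str:
    """Mirror of `proxy_cache_key $uri$is_args$args$ae`."""
    is_args = "?" if args else ""  # nginx sets $is_args to "?" only when args exist
    return f"{uri}{is_args}{args}{normalized_encoding(accept_encoding)}"


if __name__ == "__main__":
    for header in ("gzip, deflate", "GZIP", "identity", ""):
        print(repr(header), "->", cache_key("/node/1", "page=2", header))
```

Normalizing first matters: keying on the raw header would create one cache entry per distinct Accept-Encoding string that browsers send.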

Squid configuration

    http_port 80 accel
    http_access allow all
    cache_mem 4096 MB
    workers 2
    memory_cache_shared on
    cache_dir ufs /mnt/squid 100 16 256
    cache_peer 10.83.145.47 parent 80 0 no-query originserver

Varnish configuration

    backend default {
        .host = "10.83.145.47";
        .port = "80";
    }

Performance AWS 8KB HTML (gzip)

[Chart: throughput (QPS, axis 0 to 10,000) and latency (time to first response, ms) for ATS 3.3.1, Nginx 1.3.9 hack, Squid 3.2.5, Varnish 3.0.3, and Varnish 3.0.3 with varnishlog -w. Latency data labels as extracted: 7.40, 7.92, 12.16, 9.20, 22.81.]

Performance AWS 8KB HTML (gzip)

[Chart: throughput (QPS, axis 0 to 10,000) and CPU used (dual core) for ATS 3.3.1, Nginx 1.3.9 hack, Squid 3.2.5, Varnish 3.0.3, and Varnish 3.0.3 with varnishlog -w. CPU usage data labels as extracted: 63%, 60%, 82%, 73%, 83%.]

Performance AWS 500 bytes JPG

[Chart: throughput (QPS, axis 0 to 16,000) and latency (time to first response, ms) for ATS 3.3.1, Nginx 1.3.9 hack, Squid 3.2.5, Varnish 3.0.3, and Varnish 3.0.3 with varnishlog -w. Latency data labels as extracted: 4.95, 5.93, 9.10, 7.27, 16.41.]

Performance AWS 500 bytes JPG

[Chart: throughput (QPS, axis 0 to 16,000) and CPU used (dual core) for ATS 3.3.1, Nginx 1.3.9 hack, Squid 3.2.5, Varnish 3.0.3, and Varnish 3.0.3 with varnishlog -w. CPU usage data labels as extracted: 78%, 77%, 84%, 77%, 76%.]

Choosing an intermediary

RFC 2616 is not optional!

• Neither is the new BIS revision!
• Understanding HTTP and how it relates to Proxy and Caching is important
  – Or you will get it wrong! I promise.
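One example of the kind of rule you have to get right: Drupal sends `Cache-Control: public, max-age=900` together with an `Expires` date back in 1978. RFC 2616 says max-age (and s-maxage, for shared caches) wins over Expires, so the page is cacheable for 900 seconds. A minimal Python sketch of that one rule follows. The function name is hypothetical, and it deliberately ignores everything else a real cache must check (no-store, private, Age, heuristic freshness).

```python
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def freshness_lifetime(headers: Mapping[str, str]) -> Optional[int]:
    """Seconds a shared cache may treat a response as fresh, None if unstated."""
    h = {name.lower(): value for name, value in headers.items()}

    # Parse Cache-Control into {directive: value}.
    directives = {}
    for part in h.get("cache-control", "").split(","):
        name, _, value = part.strip().partition("=")
        if name:
            directives[name.lower()] = value.strip('"')

    # s-maxage, then max-age, take precedence over Expires.
    for name in ("s-maxage", "max-age"):
        if name in directives:
            try:
                return max(0, int(directives[name]))
            except ValueError:
                return 0  # malformed value: treat as already stale

    if "expires" in h and "date" in h:
        try:
            delta = parsedate_to_datetime(h["expires"]) - parsedate_to_datetime(h["date"])
        except (TypeError, ValueError):
            return 0  # unparsable date: treat as already expired
        return max(0, int(delta.total_seconds()))
    return None


if __name__ == "__main__":
    drupal = {
        "Date": "Wed, 12 Dec 2012 18:00:48 GMT",
        "Cache-Control": "public, max-age=900",
        "Expires": "Sun, 19 Nov 1978 05:00:00 GMT",
    }
    print(freshness_lifetime(drupal))
```

A proxy that looks only at Expires would never cache this site at all.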

How things can go wrong: Vary!

    $ curl -D - -o /dev/null -s --compressed http://10.118.73.168/
    HTTP/1.1 200 OK
    Server: nginx/1.3.9
    Date: Wed, 12 Dec 2012 18:00:48 GMT
    Content-Type: text/html; charset=utf-8
    Content-Length: 8051
    Connection: keep-alive
    X-Powered-By: PHP/5.4.9
    X-Drupal-Cache: HIT
    Etag: "1355334762-0-gzip"
    Content-Language: en
    X-Generator: Drupal 7 (http://drupal.org)
    Cache-Control: public, max-age=900
    Last-Modified: Wed, 12 Dec 2012 17:52:42 +0000
    Expires: Sun, 19 Nov 1978 05:00:00 GMT
    Vary: Cookie,Accept-Encoding
    Content-Encoding: gzip

How things can go wrong: Vary!

    $ curl -D - -o /dev/null -s http://10.118.73.168/
    HTTP/1.1 200 OK
    Server: nginx/1.3.9
    Date: Wed, 12 Dec 2012 18:00:57 GMT
    Content-Type: text/html; charset=utf-8
    Content-Length: 8051
    Connection: keep-alive
    X-Powered-By: PHP/5.4.9
    X-Drupal-Cache: HIT
    Etag: "1355334762-0-gzip"
    Content-Language: en
    X-Generator: Drupal 7 (http://drupal.org)
    Cache-Control: public, max-age=900
    Last-Modified: Wed, 12 Dec 2012 17:52:42 +0000
    Expires: Sun, 19 Nov 1978 05:00:00 GMT
    Vary: Cookie,Accept-Encoding
    Content-Encoding: gzip

EPIC FAIL!

Note: no gzip support
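What a cache is supposed to do with `Vary: Cookie,Accept-Encoding`: a stored response may only be reused when the new request has the same values for the listed request headers. A minimal in-memory Python sketch of that selection rule follows. The class and function names are hypothetical, it compares header values as plain strings, and it is not the code of any of the proxies discussed here.

```python
from typing import Dict, Optional, Tuple

Headers = Dict[str, str]
VariantKey = Tuple[str, Tuple[Tuple[str, str], ...]]


def variant_key(url: str, vary: str, request_headers: Headers) -> VariantKey:
    """Key = URL plus the request's values for each header named in Vary."""
    request = {name.lower(): value.strip() for name, value in request_headers.items()}
    names = sorted({n.strip().lower() for n in vary.split(",") if n.strip()})
    return url, tuple((n, request.get(n, "")) for n in names)


class VaryAwareCache:
    def __init__(self) -> None:
        self._vary_by_url: Dict[str, str] = {}      # last Vary seen per URL
        self._variants: Dict[VariantKey, bytes] = {}

    def store(self, url: str, request_headers: Headers,
              response_headers: Headers, body: bytes) -> None:
        vary = response_headers.get("Vary", "")
        self._vary_by_url[url] = vary
        self._variants[variant_key(url, vary, request_headers)] = body

    def lookup(self, url: str, request_headers: Headers) -> Optional[bytes]:
        vary = self._vary_by_url.get(url)
        if vary is None or vary.strip() == "*":
            return None  # nothing stored, or "Vary: *" which never matches
        return self._variants.get(variant_key(url, vary, request_headers))


if __name__ == "__main__":
    cache = VaryAwareCache()
    cache.store("/", {"Accept-Encoding": "gzip"},
                {"Vary": "Cookie,Accept-Encoding", "Content-Encoding": "gzip"},
                b"<gzipped bytes>")
    print(cache.lookup("/", {"Accept-Encoding": "gzip"}))  # hit
    print(cache.lookup("/", {}))                           # miss, go to origin
```

With this rule the second curl request (no Accept-Encoding) is a cache miss and goes to the origin instead of getting the gzipped copy.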

What type of proxy do you need?

• Of our candidates, only two fully support all proxy modes!

CoAdvisor HTTP protocol quality tests for reverse proxies

[Chart: stacked bars of Failures, Violations and Success (axis 0 to 600) for ATS 3.3.1, Nginx 1.3.9, Squid 3.2.5 and Varnish 3.0.3. Data labels as extracted: 49%, 81%, 51%, 68%.]

CoAdvisor HTTP protocol quality tests for reverse proxies

[Chart: stacked bars of Failures, Violations and Success (axis 0 to 600) for ATS 3.3.1, Nginx 1.3.9, Squid 3.2.5 and Varnish 3.0.3. Data labels as extracted: 25%, 6%, 27%, 15%.]

Choosing an intermediary

My subjective opinions

ATS – The good

• Good HTTP/1.1 support, including SSL
• Tunes itself very well to the system / hardware at hand
• Excellent cache features and performance
  – Raw disk cache is fast and resilient
• Extensible plugin APIs, quite a few plugins
• Used and developed by some of the largest Web companies in the world

ATS – The bad

• Load balancing is incredibly lame
• Seen as difficult to set up (I obviously disagree)
• Developer community is still too small
• Code is complicated
  – By necessity? Maybe …

ATS – The ugly

• Too many configuration files!
• There’s still legacy code that has to be replaced or removed
• Not a whole lot of commercial support
  – But there’s hope (e.g. OmniTI recently announced packaged support)

Nginx – The good

• Easy to understand the code base, and software architecture
  – Lots of plugins available, including SPDY
• Excellent Web and Application server
  – E.g. Nginx + fpm (fcgi) + PHP is the awesome, according to a very reputable source
• Commercial support available from the people who wrote and know it best. Huge!

Nginx – The bad

• Adding extensions implies rebuilding the binary

• By far the most configuration required “out of the box” to even do anything remotely useful

• It does not make good attempts to tune itself to the system

• No good support for conditional requests

Nginx – The ugly

• The cache is a joke! Really
• The protocol support as an HTTP proxy is rather poor. It fares the worst in the tests, and can be outright wrong if you are not very careful
• From docs: “nginx does not handle "Vary" headers when caching.” Seriously?

Squid – The Good

• Has by far the most HTTP features of the bunch. I mean, by far, nothing comes even close

• It is also the most HTTP-conformant proxy today. It has the best scores in the CoAdvisor tests, by a wide margin

• The features are mature, and used pretty much everywhere

• Works pretty well out of the box

Squid – The Bad

• Old code base
• Cache is not particularly efficient
• Has traditionally been prone to instability
• Complex configurations
  – At least IMO, I hate it

Squid – The Ugly

• SMP is quite an afterthought
  – Duct tape

• Why spend so many years rewriting from v2.x to v3.x without actually addressing some of the real problems? Feels like a boat has been missed…

• Not very extensible
  – Typically you write external “helper” processes, similar to fcgi. This is not particularly flexible, nor powerful (you cannot do everything you’d want as a helper, so you might have to rewrite the Squid core)

Varnish – The Good

• VCL
• And did I mention VCL? Pure genius!
• Very clever logging mechanism
• ESI is cool, even with its limited subset
  – Not unique to Varnish though
• Support from several good commercial entities

Varnish – The Bad

• Letting the kernel do the hard work might seem like a good idea on paper, but perhaps not so great in the real world. But let’s not go into a BSD vs Linux kernel war …
• Persistent caching seems like an afterthought at best
• No good support for conditional requests
• What impact does “real” logging have on performance?

Varnish – The Ugly

• There are a lot of threads in this puppy!
• No SSL. And presumably, there never will be?
  – So what happens with SPDY / HTTP2?
• Protocol support is weak, without a massive amount of VCL.
• And, you probably will need a PhD in VCL!
  – There’s a lot of VCL hacking to do to get it to behave well

Summary

• Please understand your problem
  – Don’t listen to @zwoop on twitter…

• Performance in itself is rarely a key differentiator; latency, features and correctness are

• But most important, use a proxy, preferably a good one, if you run a serious web server

Performance AWS 8KB HTML (gzip)

If it ain’t broken, don’t fix it
But by all means, make it less sucky!

However, when all you have is a hammer…

http://www.flickr.com/photos/aai/6936657289/