Link-State 2012 - Fred Hatfull - The Dirty Work: Scaling Out Websites With Your Own Two Hands
Transcript of Link-State 2012 - Fred Hatfull - The Dirty Work: Scaling Out Websites With Your Own Two Hands
-
7/31/2019 Link-State 2012 - Fred Hatfull - The Dirty Work: Scaling Out Websites With Your Own Two Hands
Author: Fred Hatfull
The Dirty Work: Scaling Out Websites With Your Own Two Hands
Date: CWRU, October 27, 2012
-
Who Am I?
CWRU Alumnus
Software Developer at Yelp
Infrastructure Engineer
Availability
Performance
Productivity
CWRU, 27 October 2012 - Fred Hatfull
-
Overview
What's Yelp?
Scaling the Backend
Accelerating Content Delivery
Monitoring Performance
-
What's Yelp?
Help consumers find great local businesses
Help business owners find more customers
Numbers current as of 2012Q2.
-
What's Yelp?
We're hiring! yelp.com/careers
-
What's Yelp?
Five Sites
www.yelp.com
biz.yelp.com
api.yelp.com
m.yelp.com
admin.yelp.com
www - consumer-facing website
biz - business owners' website for managing ads, biz page, etc.
api - public and private APIs (mobile apps, too)
m - mobile site (web browsers on mobile devices)
admin - administrative tools
-
What's Yelp?
Numerous Open-Source Projects:
mrjob
firefly
testify
tron
many more: github.com/Yelp
mrjob - Python Map/Reduce framework
firefly - time-series statistics graphing
testify - a more Pythonic test framework
tron - distributed cron
-
In the Beginning...
Like most websites, it all started with a single server...
You have probably set up a website like this...
-
In the Beginning
Apache, Python, MySQL, Linux.
-
In the Beginning
[Diagram: 16.32.64.128]
No load balancer, no internal DNS, no web framework. Just us, mod_python, and MySQL.
-
Up and Running
[Diagram: 16.32.64.128, web1, web2, db1]
* Names changed to protect the innocent
Traffic starts picking up. One box doesn't cut it any more... time to scale out horizontally. Adding webs is the low-hanging fruit.
-
Up and Running
[Diagram: 16.32.64.128, web1 through web6, db1]
* Names changed to protect the innocent
OK, we begin to hit the limits of horizontal scaling. The webapp can always benefit from having more machines (+HAProxy).
-
Up and Running
[Diagram: 16.32.64.128, web1 through web6, db1 (!)]
* Names changed to protect the innocent
But the database is now under very heavy load!
-
Scaling the Database
Options
Find a faster data store?
Use separate databases?
Sharding?
Replication?
There are a few classic options for scaling up MySQL. We could switch datastores... maybe MySQL is just slow? How about Oracle? MSSQL? etc. We could introduce an entirely new database machine with its own MySQL instance... basically just a clone of the current one. Neither DB knows anything about the other. We could shard the database by having multiple machines where each is responsible for a certain set of keys. Or we could just replicate the current database to accommodate more traffic, and hope writes don't get overwhelming.
-
Scaling the Database
Options
Find a faster data store?
Use separate databases?
Sharding?
Replication?
OK. It's 2004... NoSQL isn't around, really, and our data is pretty relational anyway. MySQL is looking like the fastest store that meets our requirements.
-
Scaling the Database
Options
Find a faster data store?
Use separate databases?
Sharding?
Replication?
We could just set up an entirely new database machine and run the new database in parallel. We'd have to make two read queries instead of one now and figure out where to send writes, and keeping schemas in sync is a nightmare. Clearly not scalable.
-
Scaling the Database
Options
Find a faster data store?
Use separate databases?
Sharding?
Replication
We are actually doing a form of sharding here, but it's not quite the conventional master-master sharding that usually comes to mind. Master-master would work, but it's a huge pain to get right and requires a lot of effort to make sure keys go to and are retrieved from the shards where they belong. Our read-heavy traffic patterns make us an ideal candidate for replication to massively increase read capacity while reducing engineering overhead.
-
Database Replication
[Diagram: webs -> db1 (reads/writes); webs -> readdb1 (reads); db1 -> readdb1 (replicated writes)]
Simple two-database replication scheme. All write traffic hits the master, `db1`. Read traffic is split between the master and a read-only database. When writes come through the master database, the master database informs the slave database that the write happened so that the slave replicates the action taken on the master.
-
Database Replication
[Diagram: webs, db1, readdb1, db2]
It's good practice to keep another master early in the replication stream that can be promoted to write master if the write master fails.
-
Database Replication
[Diagram: webs, db1, readdb1, db2]
Replication can't always happen immediately, depending on the load on the slave DB and the master, how far apart they are, and network congestion.
-
Life used to be so easy... :(
-
Database Replication
Strong Consistency
vs.
Eventual Consistency
A shift in thinking. Life is easy in strongly-consistent systems. Although scaling can be challenging, you get a guarantee that data is always up-to-date. Eventual consistency is a big change and has lots of nasty corner cases ready to bite.
-
Database Replication
Replication has lots of cons:
Expensive/poorly formed queries have multiplicative effect
Replication delay can lead to inconsistent views
Figuring out when to hit master vs. slave
etc...
While horizontally-scalable read capacity is a big win, there are lots of new things to think about.
-
Replication: Gross Queries
[Diagram: webs, master, slaves; queries of 2ms, 1ms, 4500ms]
A normal replication stream. Lots of nice, small queries floating by. Master has no problem.
-
Replication: Gross Queries
[Diagram: webs, master, slaves; queries of 2ms, 1ms, 4500ms, 2ms, 2ms]
Small writes get replicated fine, but the big, nasty table scan suddenly locks a whole bunch of rows and waits seconds for MySQL to figure out which rows should come back. This delays all the writes in the replication stream and causes an increase in replication delay, exacerbating the inconsistency.
-
Replication: Gross Queries
[Diagram: webs, master, slaves; queries of 1ms, 4500ms, 4500ms, 4500ms]
In the case of a big INSERT/UPDATE/DELETE, that query also needs to replicate to the slaves, causing big delays on all of the slaves.
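The way one slow statement holds up everything behind it can be sketched with a toy queue model. This is only an illustration: it assumes the slave applies replicated statements one at a time, in arrival order, and the function name and arrival times are made up for the example.

```python
def queueing_delays(events):
    """Return how long each replicated statement waits before the slave starts it.

    events: list of (arrival_ms, apply_cost_ms) tuples in arrival order.
    Assumes the slave applies statements serially, in order (sketch assumption).
    """
    free_at = 0  # time at which the slave finishes its current statement
    delays = []
    for arrival_ms, cost_ms in events:
        start = max(arrival_ms, free_at)
        delays.append(start - arrival_ms)
        free_at = start + cost_ms
    return delays


# Two small writes, one 4500 ms statement, then two more small writes.
stream = [(0, 2), (10, 1), (20, 4500), (30, 2), (40, 2)]
print(queueing_delays(stream))  # [0, 0, 0, 4490, 4482]
```

The two 2 ms writes that arrive after the big statement each wait about 4.5 seconds, which is the replication delay the slides describe.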
-
Replication: Consistency
[Diagram: web, db master, db replica]
Here's an example of where replication can introduce inconsistency. Our user wants to know about the restaurant Happy Dog.
-
Replication: Consistency
[Diagram: web, db master, db replica]
As part of the request, the web server handling the request asks a DB replica for information about Happy Dog.
-
Replication: Consistency
[Diagram: web, db master, db replica]
The replica comes back with the requested information...
-
Replication: Consistency
[Diagram: web, db master, db replica]
...and it's returned to the user. Fine. Everything here is as it used to be.
-
Replication: Consistency
[Diagram: web, db master, db replica]
Now our user wants to write a review about Happy Dog.
-
Replication: Consistency
[Diagram: web, db master, db replica]
Since we have to write data, our web connects to the master database and issues the writes for the new review.
-
Replication: Consistency
[Diagram: web, db master, db replica]
That's the end of our user's web request. After she POSTs the review she gets redirected back to Happy Dog's page. Like before, her web hits a replica instead of the master because it only needs to do reads. However, her request gets through the web and to the replica before her review makes it to the replica in the replication stream...
-
Replication: Consistency
[Diagram: web, db master, db replica; user asks "where's my review??"]
As a result, our user sees the stale information, even though she just contributed content! Now the user thinks her content has disappeared.
-
Replication: Consistency
Writes:
always master
Reads:
can use either master or slave
majority can use slave
sometimes you want to hit the master for consistency
Here's how DB access is split up based on what kind of activity you are doing. Writes (INSERTs/UPDATEs/DELETEs) always hit the master, since nothing else will accept writes. Reads (SELECTs) can use either the master or a slave, and usually only need a slave. It's up to the application to figure out if it needs to hit the master, and that can be tricky.
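A minimal sketch of that split, assuming the application picks a database per statement. The function name, the `need_consistent_read` flag, and the keyword check are illustrative assumptions, not something shown in the talk; `readdb1` is borrowed from the earlier diagram.

```python
import random

WRITE_VERBS = {"INSERT", "UPDATE", "DELETE"}


def choose_db(sql, replicas, need_consistent_read=False, rng=random):
    """Pick a database name for one statement.

    Writes always go to the master. Reads go to a replica unless the caller
    asks for a consistent read, in which case they also go to the master.
    """
    parts = sql.split(None, 1)
    if not parts:
        raise ValueError("empty statement")
    verb = parts[0].upper()
    if verb in WRITE_VERBS or need_consistent_read or not replicas:
        return "master"
    return rng.choice(replicas)


print(choose_db("INSERT INTO review VALUES (1)", ["readdb1"]))  # master
print(choose_db("SELECT * FROM business", ["readdb1"]))         # readdb1
```

The tricky part the slide mentions is deciding when to pass `need_consistent_read=True`; the router itself cannot know that.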
-
Replication: Consistency
When Does Consistency Matter?
Consistency only matters when it's expected
example: users writing reviews
If the user doesn't know information is out of date... is it really out of date?
The dirty secret is: consistency doesn't always matter.
-
Replication: Consistency
Asking for the master:
after writes, hit the master until replication catches up
webs/load-balancers can remember user state
but expensive, brittle
instead, teach clients to ask for the master
dirty session cookie
Always hit the master for writes. After writes, hang on to a cookie for X seconds. While the user has the cookie, hit the master.
-
Replication: Dirty Session
After writes, issue dirty session cookie
cookie contains a timestamp in the future
webs check for cookie
if time is before timestamp in cookie:
    redirect to master db
else:
    remove cookie, continue via replica
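The dirty session steps above can be sketched in a few lines of Python. This is an illustration under assumptions: the cookie name, the 30-second window (the talk only says "X seconds"), and the use of a plain dict for cookies are all hypothetical.

```python
import time

DIRTY_SECONDS = 30           # hypothetical window; the talk only says "X seconds"
COOKIE_NAME = "dirty_until"  # hypothetical cookie name


def mark_dirty(cookies, now=None):
    """Call after a write: store a timestamp in the future in the cookie."""
    now = time.time() if now is None else now
    cookies[COOKIE_NAME] = str(now + DIRTY_SECONDS)


def pick_db(cookies, now=None):
    """Return "master" while the dirty cookie is live, otherwise "replica"."""
    now = time.time() if now is None else now
    raw = cookies.get(COOKIE_NAME)
    if raw is None:
        return "replica"
    try:
        dirty_until = float(raw)
    except ValueError:
        dirty_until = 0.0  # unparseable cookie: treat as expired
    if now < dirty_until:
        return "master"
    del cookies[COOKIE_NAME]  # expired: remove cookie, continue via replica
    return "replica"


cookies = {}
mark_dirty(cookies, now=1000.0)
print(pick_db(cookies, now=1010.0))  # master
print(pick_db(cookies, now=1031.0))  # replica
```

Because the state lives in the client's cookie, the webs and load balancers do not have to remember which users wrote recently.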
-
Caches
-
Caches
Fastest thing since sliced bread
Often seen as a drop-in performance enhancement
Can be hard to get right
Present hidden availability implications
Take advantage of precomputed/pre-retrieved results in-memory. While caches seem like a drop-in speed upgrade, they can be surprisingly hard to get right.
-
Caches: Types
[Diagram: load balancer, webs, dbs]
Several different types of caches in several different places...
-
Caches: Types
[Diagram: load balancer, webs, dbs; HTTP caches (varnish etc)]
HTTP Caches - cache full HTTP responses. Great for static sites or dynamic sites with content that changes infrequently. Ex: varnish, squid
-
Caches: Types
[Diagram: load balancer, webs, dbs; HTTP caches (varnish etc), in-memory caches]
Memoization of results from ... things. Functions, DB queries, etc. Typically per-node (not shared between webs, for example)
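As a small illustration of per-node memoization, Python's standard `functools.lru_cache` keeps results in the memory of the current process only. The `business_summary` function and its return value are hypothetical stand-ins for an expensive function or DB query.

```python
from functools import lru_cache

calls = 0  # counts how often the "expensive" body actually runs


@lru_cache(maxsize=1024)
def business_summary(business_id):
    """Stand-in for an expensive function or DB query (hypothetical)."""
    global calls
    calls += 1
    return "summary for business %d" % business_id


business_summary(42)
business_summary(42)  # served from this process's cache
print(calls)  # 1
```

Each web process would hold its own copy of this cache, which is why such caches are per-node and not shared between webs.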
-
Caches: Types
[Diagram: load balancer, webs, dbs; HTTP caches (varnish etc), in-memory caches, memcache]
Memcache! Frequently used to store computed results for faster lookup and load reduction. Used to cache anything from raw DB rows to larger queries (joins) to gzip'd blobs to serialized data structures.
-
Caches: Advantages
Primary cache in most places: memcache
Takes advantage of fast in-memory key/value lookups
Good for expensive operations
Complex DB queries - 100s of ms
Network roundtrip to memcache - 2-3ms
Misses are cheap - only network roundtrip cost
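The usual way to use memcache for expensive operations is a look-aside pattern: try the cache, and on a miss compute the value and store it. The sketch below uses a small in-process `FakeMemcache` class as a stand-in so it runs without a server; that class and the `cached` helper are assumptions for illustration, not a real client library.

```python
class FakeMemcache:
    """In-process stand-in for a memcache client (get/set only)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


def cached(cache, key, compute):
    """Return the cached value for key, computing and storing it on a miss."""
    value = cache.get(key)
    if value is None:        # miss: costs only the round trip
        value = compute()    # the expensive part, e.g. a complex DB query
        cache.set(key, value)
    return value


cache = FakeMemcache()
print(cached(cache, "biz:42", lambda: "expensive result"))  # computed
print(cached(cache, "biz:42", lambda: "never called"))      # from cache
```

One limitation of this simple version is that a computed value of `None` is never cached, since `None` is also how a miss is reported.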
-
Caches: Pitfalls
Cache libraries can make it easy to cache weird things:
Database models
memcache connections (!)
??? - anything else that can be serialized (via pickle etc)
Causes problems when object definitions change
Cannot enumerate cache contents easily to fix polluted caches
Especially in dynamic languages, it can be easy to say "oh, do some serialization or whatever and then cache it!" However, many times you'll have things like SQLAlchemy models (which have connections to your database!), the connection you are using to access memcache, and more. If any of those object definitions change (or you change/remove code that pickle/json expects to have to deserialize), you may end up with a polluted cache which contains entries that you can't decode. Memcache also doesn't allow you to enumerate cache entries, so programmatically invalidating certain subsets of keys is hard if not impossible.
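One common way to soften these pitfalls (not something the talk prescribes) is to cache plain data instead of live objects, put a version number in every key, and treat anything undecodable as a miss. The names and the version number below are hypothetical.

```python
import json

SCHEMA_VERSION = 3  # hypothetical; bump when the cached shape changes


def cache_key(kind, ident, version=SCHEMA_VERSION):
    """Build a key such as 'v3:business:42'. Bumping the version orphans old entries."""
    return "v%d:%s:%s" % (version, kind, ident)


def encode(record):
    """Cache plain data (dicts, lists, strings), not live objects."""
    return json.dumps(record, sort_keys=True)


def decode(blob):
    """Return the decoded record, or None so the caller treats bad data as a miss."""
    try:
        return json.loads(blob)
    except (TypeError, ValueError):
        return None


print(cache_key("business", 42))                # v3:business:42
print(decode(encode({"name": "Happy Dog"})))    # {'name': 'Happy Dog'}
print(decode("not json at all"))                # None
```

Since the cache contents cannot be enumerated, changing the version prefix is a way to stop reading old entries without having to find and delete them.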
-
Caches: Pitfalls
Makes exceeding failover capacity really easy
If memcache cluster goes down what happens?
How do you handle additional web and DB load?
Solution:
Build in additional capacity
Be able to isolate and turn off expensive features
Have an emergency maintenance mode
Memcache helps to reduce load, but it's another point of failure. Memcache outages can cause increased load proportional to what it offloaded for you, which can easily cause cascading failures if not handled correctly.
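A back-of-the-envelope way to see how much load a cache outage can add: if a fraction h of requests are normally answered from cache, the backend sees only (1 - h) of them, so losing the cache multiplies backend load by 1 / (1 - h). The hit rates below are made-up examples, not Yelp's numbers.

```python
def db_load_multiplier(hit_rate):
    """How many times more requests reach the backend if the cache disappears.

    With hit rate h, the backend normally sees a fraction (1 - h) of requests;
    with the cache down it sees all of them, so load grows by 1 / (1 - h).
    """
    if not 0 <= hit_rate < 1:
        raise ValueError("hit_rate must be in [0, 1)")
    return 1.0 / (1.0 - hit_rate)


for h in (0.5, 0.9):
    print(h, round(db_load_multiplier(h), 2))
```

At a 90% hit rate the databases would suddenly see roughly ten times their usual traffic, which is why the slide suggests spare capacity and the ability to turn off expensive features.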
-
Datacenters
-
Datacenters
Geographic distribution helps you mitigate the speed of light
Replication problems expand to more systems:
memcache
code deployments
offline batch processing
database slaves see non-trivial replication delay
It's like database replication for your whole system. Out-of-sync caches can be problematic, and database replication becomes a super-non-trivial problem.
-
Datacenters
Solutions:
Each datacenter gets write master
Still only One True Master, other write masters replicate
Read/reporting slaves replicate from local write master
Replicate cache inserts, invalidations
Take advantage of existing MySQL replication stream
Provide utilities for monitoring replication delay for services using the replication stream
-
Front-End: Principles
Reduce HTTP Round-Trips
Reduce download sizes
Don't do things browsers don't like
Lots of front-end performance tips/tricks/hacks. Most of them are based on these guidelines.
-
CDNs
Content Delivery Networks
Maintain copies of your assets
Probably serve your assets faster than you do
Examples:
Akamai
CloudFront (Amazon Web Services)
Cotendo
Like a big, giant, mega-cache.
-
CDNs: Why?
Huge networks of globally distributed edge nodes
e.g. Akamai at > 100,000 [1]
Easy to set up and drop in
Transparent layer, just change hostnames to CDN
Much lower bandwidth and equipment costs
Asset gets uploaded to CDN once (ish)
[1] http://www.datacenterknowledge.com/archives/2009/05/14/whos-got-the-most-web-servers/
-
Subdomain Sharding
RFC 2616 (HTTP 1.1):
A single-user client SHOULD NOT maintain more than 2 connections with any server or proxy.
http://www.ietf.org/rfc/rfc2616.txt
-
Subdomain Sharding
Distribute asset traffic across sharded subdomains
Before: media.yelp.com -> 16.32.64.128
After: media[1-4].yelp.com -> media.yelp.com -> 16.32.64.128
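When generating asset URLs, each asset needs to be assigned to one of the sharded hostnames. A minimal sketch, assuming a stable checksum of the path picks the shard; the choice of CRC32 and the function names are assumptions for illustration, and only the `media[1-4].yelp.com` names come from the slide.

```python
import zlib

SHARDS = 4  # media1 .. media4, as on the slide


def asset_host(path, shards=SHARDS):
    """Map an asset path to one of the sharded hostnames.

    The mapping is deterministic, so a given asset always uses the same
    hostname and stays cacheable under one URL.
    """
    index = zlib.crc32(path.encode("utf-8")) % shards + 1
    return "media%d.yelp.com" % index


def asset_url(path):
    """Build the full URL for an asset path that starts with '/'."""
    return "http://%s%s" % (asset_host(path), path)


print(asset_url("/assets/img/logo.png"))
```

Picking the shard at random on every page view would spread load just as well, but the same image would then appear under several URLs and be downloaded more than once.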
-
Cache Me If You Can
Reduce round trips by caching assets
Use HTTP Cache-Control headers
Use very long times e.g. 10 years
Version assets via URL path or query params
There are alternatives... e.g. ETag and If-Modified-Since. These require HTTP round-trips to validate, though, so even though you don't end up needing to re-download the asset you still end up with more TCP connections.
-
Cache Me If You Can
GET /assets/js/1/32dce72546/main.js HTTP/1.1
200 OK
Cache-Control: max-age=315360000
Content-Encoding: gzip, deflate
Content-Length: 8437
...
-
Cache Me If You Can
GET /assets/js/1/32dce72546/main.js HTTP/1.1
200 OK
Cache-Control: max-age=315360000
Content-Encoding: gzip, deflate
Content-Length: 8437
...
1/32dce72546: Global version + hash of asset
max-age=315360000: 10 years
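A sketch of how such a versioned path and header could be produced at build or deploy time. The use of SHA-1, the 10-character truncation, and the function names are assumptions chosen to match the shape of the URL on the slide; the talk does not say which hash is used.

```python
import hashlib

GLOBAL_VERSION = 1
TEN_YEARS = 10 * 365 * 24 * 60 * 60  # 315360000 seconds, as in the header above


def asset_path(name, content, version=GLOBAL_VERSION):
    """Build a path like /assets/js/1/<hash>/main.js from the file's bytes."""
    digest = hashlib.sha1(content).hexdigest()[:10]
    return "/assets/js/%d/%s/%s" % (version, digest, name)


def cache_headers():
    """Headers that let browsers and CDNs keep the asset for ten years."""
    return {"Cache-Control": "max-age=%d" % TEN_YEARS}


print(asset_path("main.js", b"console.log('hi');"))
print(cache_headers())
```

Because the hash changes whenever the file's bytes change, a new deploy produces a new URL, so the very long max-age never serves a stale script.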
-
GET / HTTP/1.1
Host: www.yelp.com
Connection: keep-alive
Cache-Control: max-age=0
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_5) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.52 Safari/537.11
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3Cookie:yuv=moFlBQmJAAfti607vjGInzN7FFk8_DSjAfHQ_lW4YaGwZEJqQqZmcEyNyeNTQam7Rqm2q6EOieOhxmRTZiNuMNmm_G7pet1m;__qca=P0-2101128917-1317837280654;__gads=ID=91ec4bf2b3418833:T=1321911452:S=ALNI_MbtB7iIIKDvemgnlX95Ywi4BsWEPg;bse=63c34cdcaef7b20a01cc89cc34ccfff5; fd=0; searchPrefs=%7B%22seen_pop%22%3Atrue%2C%22seen_crop_pop%22%3Atrue%2C%22prevent_scroll%22%3Afalse%2C%22maptastic_mode%22%3Atrue%2C%22mapsize%22%3A%22large%22%2C%22rpp%22%3A40%7D; fbm_97534753161=base_domain=.yelp.com; s=YGc7FduEf1Wv2m5hE1sMWU5pMolrEG8x;hl=en_US; recentlocations=New+York%2C+NY%2C+USA%3B%3B706+Mission+St%2C+San+Francisco%2C+CA%2C+USA%3B%3BLower+Pac+Heights%2C+San+Francisco%2C+CA%2C+USA%3B%3BFillmore%2C+MO%2C+USA%3B%3B1251+Waller+St%2C+San+Francisco%2C+CA%2C+USA%3B%3BAnn+Arbor%2C+MI%2C+USA%3B%3BHaight-Ashbury%2C+San+Francisco%2C+CA%2C
+USA%3B%3BSOMA%2C+San+Francisco%2C+CA%2C+USA%3B%3BPittsburgh%2C+PA%2C+USA%3B%3BUnion+Square%2C+San+Francisco%2C+CA%2C+USA%3B%3BLondon%2C+UK%3B%3B706+Mission%2C+Kingsburg%2C+CA%2C+USA%3B%3BMiami%2C+FL%2C+USA%3B%3B510+Central+Ave%2C+Hot+Springs%2C+AR%2C+USA; location=%7B%22unformatted%22%3A+%22San+Francisco%2C+CA%22%2C+%22city%22%3A+%22San+Francisco%22%2C+%22state%22%3A+%22CA%22%2C+%22country%22%3A+%22US%22%7D; __utma=165223479.655521012.1316285892.1350862721.1351317047.69;__utmb=165223479.3.10.1351317047; __utmc=165223479; __utmz=165223479.1338263014.44.13.utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=(not%20provided);fbsr_97534753161=Le92MKZfPPnUrIyfRghoVKhIdhfzAt4wW1jpJHo3fIk.eyJhbGdvcml0aG0iOiJITUFDLVNIQTI1NiIsImNvZGUiOiIzNWM5OTViNzIxYTgyYjQzMzVmZjNlNDAuMS0xNDczNjMwMTIyfDEzNTEzMTczNDl8Vm1mLU5DLXh1RkswSHRmdF9wQWRDR0NRTGNZIiwiaXNzdWVkX2F0IjoxMzUxMzE3MDQ5LCJ1c2VyX2lkIjoiMTQ3MzYzMDEyMiJ9
Cookieless Domains
This is a big request...
-
Cookieless Domains
2128 bytes = 2k!
Could be as big as the asset itself!
-
Cookieless Domains
Only assign cookies to domains which need them
Put static assets/other cookie-less content elsewhere
*.yelp.com vs. *.yelp-cdn.com
Previous request becomes 399 bytes of overhead.
-
Assorted Front-End Tips
Use gzip
Always include styles in the
Always put scripts at the end of
Avoid inline styles
Avoid manipulating the DOM
Try not to trigger repaints
Load non-critical content via AJAX (if applicable)
Lots more... the web abounds with tips. These are some of the ones we use.
-
7/31/2019 Link-State 2012 - Fred Hatfull - The Dirty Work: Scaling Out Websites With Your Own Two Hands
68/87
PAGE:
[Cat]
Mission-critical
Needs to be simple, easy-to-understand, durable
Strategies vary widely based on application requirements
Drop-in products only get you so far
Exposes:
what is broken when
what works and how well it works
Monitoring
Nagios - Alerts
Firefly - Performance and availability analysis
Tertiary tools
Smokeping
Pharos
Ganglia
etc...
Monitoring: Our Strategy
Nagios
Off-the-shelf solution
Flexible custom reporting
Well-understood for monitoring systems
e.g. load, memory/disk usage, hardware failures, etc
Needs well-known states (CRITICAL/OK)
Monitoring: Alerts
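The need for well-known states can be sketched as a small check. Nagios plugins report state through their exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) plus one line of output. The thresholds and the function names `disk_state` and `check_disk` below are assumptions for illustration, not a check taken from the talk.

```python
import shutil

# Exit codes defined by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def disk_state(used_fraction, warn=0.80, crit=0.90):
    """Map a disk usage fraction (0.0 to 1.0) onto a Nagios state."""
    if used_fraction >= crit:
        return CRITICAL
    if used_fraction >= warn:
        return WARNING
    return OK


def check_disk(path="/"):
    """Return (exit_code, status_line) for the filesystem holding `path`."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        # The check itself failed, so the state is unknown rather than OK.
        return UNKNOWN, f"DISK UNKNOWN - {exc}"
    used = usage.used / usage.total
    state = disk_state(used)
    return state, f"DISK {LABELS[state]} - {used:.0%} used on {path}"
```

A plugin script would print the status line and pass the code to `sys.exit`. Metrics that do not reduce cleanly to a threshold, such as "is the homepage getting slower?", are a poor fit for this model.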
Firefly
Graphical front-end to time-series data
Extensible data ingestion API
Open-source: github.com/Yelp/firefly
Statmonster
Code-name for data collection and preprocessing
Upstream of Firefly
Turns log lines into stats, analyzes, and stores
Monitoring: Performance
Brief Aside: Logging
Incredibly important for application development
Often only source of information when everything blows up
Useful for pulling data out of the webapp on the fly
Huge volumes of log data require special infrastructure
Monitoring: Logging
Distributed log aggregation system
Composed of leaf and aggregator nodes
Leaves collect log lines
Aggregators aggregate incoming lines based on channel
Each line associated with a channel
e.g. ("yelp_timings", "Homepage render took 32.9ms")
Eventually Consistent
Monitoring: Scribe
Note: eventual consistency != strong consistency. Lines may be delayed/out-of-order
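The channel-based aggregation described above can be sketched as follows. This is not Scribe's API; `route_to_aggregators`, the aggregator names, and the channel assignment are hypothetical and only illustrate how (channel, message) lines end up grouped per aggregator and per channel.

```python
from collections import defaultdict


def route_to_aggregators(lines, assignment):
    """Group (channel, message) lines by the aggregator that owns each channel."""
    per_aggregator = defaultdict(lambda: defaultdict(list))
    for channel, message in lines:
        aggregator = assignment[channel]
        per_aggregator[aggregator][channel].append(message)
    return {agg: dict(channels) for agg, channels in per_aggregator.items()}


# Hypothetical ownership: one aggregator per set of channels.
assignment = {"channel1": "agg-a", "channel2": "agg-b", "channel3": "agg-b"}

# Lines as a leaf might receive them from several web servers, interleaved.
lines = [
    ("channel3", "m1"),
    ("channel1", "m2"),
    ("channel2", "m3"),
    ("channel2", "m4"),
    ("channel1", "m5"),
]
routed = route_to_aggregators(lines, assignment)
print(routed)
```

The sketch preserves arrival order within a channel, which the real system does not guarantee: consumers should tolerate delayed and out-of-order lines.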
Monitoring: Scribe
[Diagram: webs send (channel, message) pairs such as (channel1, message), (channel2, message), and (channel3, message) to leaves, which forward them to aggregators. One aggregator handles [channel1], another handles [channel2, channel3]. Local consumers are also shown.]
Monitoring: Performance
[Diagram: scribe -> log digestion / stat generation -> windowing / additional statistics -> RRD data chunks. Log lines (JSON etc.) enter digestion; stats such as (performance.home, 1.2) come out.]
Monitoring: Performance
{"time_start": 10, "time_dispatch": 12, "time_end": 44, "checkpoints": {
    "user_details": 22, "review_collection": 37, "template_render": 43
}}

def digest(e):
    # Turn one timing log line (a dict) into stats.
    # emit() is assumed to be supplied by the surrounding pipeline.
    time_start = e["time_start"]
    checkpoints = e["checkpoints"]
    total_time = e["time_end"] - time_start  # 44 - 10 = 34
    # template_render is nested under checkpoints, not at the top level.
    compute_time = checkpoints["template_render"] - time_start  # 43 - 10 = 33
    reviews_time = checkpoints["review_collection"] - time_start  # 37 - 10 = 27
    emit(["performance", "total_time"], total_time)
    emit(["performance", "compute_time"], compute_time)
    emit(["checkpoint_times", "reviews"], reviews_time)

(["performance", "total_time"], 34, 10)
(["performance", "compute_time"], 33, 10)
(["checkpoint_times", "reviews"], 27, 10)
The trailing 10 in the emitted stats is the timestamp.
Monitoring: Performance
([performance, total_time], 34, 10)
[performance, total_time] (10s buffer):
(8,32) (8,40) (10,23) (9,36) (11,29) (9,33) (10,35) (11,28) (12,39)
Monitoring: Performance
([performance, total_time], 34, 10)
[performance, total_time] (10s buffer):
(8,32) (8,40) (10,23) (9,36) (11,29) (9,33) (10,35) (11,28) (12,39) (10,34)
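One way to read the 10s buffer is as a holding area: points for a stat key are collected for a fixed period, so late or out-of-order lines can still arrive, and are then released together for statistics. The `StatBuffer` class below is a hypothetical sketch under that assumption, not Statmonster's implementation; the caller passes `now` explicitly so the behavior is easy to follow.

```python
from collections import defaultdict


class StatBuffer:
    """Hold (timestamp, value) points per stat key until the hold period ends."""

    def __init__(self, hold_seconds=10):
        self.hold_seconds = hold_seconds
        self._points = defaultdict(list)
        self._opened_at = {}  # key -> time the first buffered point arrived

    def add(self, key, value, timestamp, now):
        """Buffer one emitted stat, e.g. (["performance", "total_time"], 34, 10)."""
        key = tuple(key)
        self._opened_at.setdefault(key, now)
        self._points[key].append((timestamp, value))

    def flush_ready(self, now):
        """Return and remove the buffers that have been held long enough."""
        ready = {}
        for key, opened in list(self._opened_at.items()):
            if now - opened >= self.hold_seconds:
                ready[key] = self._points.pop(key)
                del self._opened_at[key]
        return ready


buf = StatBuffer(hold_seconds=10)
buf.add(["performance", "total_time"], 32, 8, now=0)
buf.add(["performance", "total_time"], 34, 10, now=4)
early = buf.flush_ready(now=9)   # nothing has been held for 10s yet
ready = buf.flush_ready(now=10)  # the buffer opened at now=0 is released
```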
Monitoring: Performance
(8,32)
(8,40)
(10,23)
(9,36)
(11,29)
(9,33)
(10,35)
(11,28)
(12,39)
(10,34)
stats: 50th, 75th, 95th, 99th, mean, count, etc...
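The statistics step can be sketched over the ten buffered (timestamp, value) points shown above. The nearest-rank percentile definition and the names `percentile` and `summarize` are assumptions; the talk does not say which percentile method is used.

```python
def percentile(sorted_values, p):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    # Integer form of ceil(p * n / 100), avoiding floating-point rounding.
    rank = max(1, -(-p * len(sorted_values) // 100))
    return sorted_values[rank - 1]


def summarize(points):
    """Reduce (timestamp, value) points to summary statistics."""
    values = sorted(value for _timestamp, value in points)
    return {
        "count": len(values),
        "mean": sum(values) / len(values),
        "p50": percentile(values, 50),
        "p75": percentile(values, 75),
        "p95": percentile(values, 95),
        "p99": percentile(values, 99),
    }


points = [(8, 32), (8, 40), (10, 23), (9, 36), (11, 29),
          (9, 33), (10, 35), (11, 28), (12, 39), (10, 34)]
summary = summarize(points)
print(summary)
```

With only ten points the 95th and 99th percentiles both land on the largest value, which is one reason high percentiles are only meaningful over windows with many samples.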
Questions?
Full-Time
Interns
Front-End and Back-End Engineering
http://www.yelp.com/careers
We're Hiring!
Thanks!