Fast Data at Massive Scale Lessons Learned at Facebook Bobby Johnson.


Page 2:

Fast Data at Massive Scale

Lessons Learned at Facebook

Bobby Johnson

Page 3:

Me

• Director of Engineering
  – Scaling and Performance
  – Site Security
  – Site Reliability
  – Distributed Systems
  – Development Tools
  – Customer Service Tools

• Took Facebook from 7M users to 120M.

Page 5:

Architecture

Load Balancer (assigns a web server)

Web Server (PHP assembles data)

Memcache (fast)

Database (slow, persistent)

Other services: Search, Feed, etc. (ignore for now)

Page 6:

- 1/2 the time is in PHP

- 1/4 is in memcache

- 1/8 is in database

Page 7:

One year ago, almost half the time was spent in memcache

Page 8:

Network Incast

PHP Client → Switch → memcache, memcache, memcache, memcache

Many small get requests (client to servers)

Page 9:

Network Incast

PHP Client ← Switch ← memcache, memcache, memcache, memcache

Many big data packets (servers back to client)

Page 10:

Clustering

PHP Client

memcache

10 objects

1 round trip for 10 objects

Page 11:

Clustering

PHP Client

memcache (5 objects), memcache (5 objects)

- 2 round trips total

- 1 round trip per server

- longest request is 5 objects

Page 12:

Clustering

PHP Client

memcache (3 objects), memcache (4 objects), memcache (3 objects)

- 3 round trips total

- 1 round trip per server

- longest request is 4 objects

Page 13:

Clustering

• If objects are small, round trips dominate, so you want objects clustered

• If objects are large, transfer time dominates, so you want objects distributed

• In a web application you will almost always be dealing with small objects
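The tradeoff on the clustering slides can be checked with a rough cost model. The sketch below is an illustration, not a measurement from the talk: the function name `fetch_cost_us`, the 200 µs overhead per round trip, and the 100 bytes/µs transfer rate are all assumed values. It also assumes that every server contacted costs one round trip of fixed overhead, while the servers transfer their data in parallel.

```php
<?php
/**
 * Hypothetical cost model for fetching $objects objects spread evenly over
 * $servers memcache servers. All default figures are illustrative assumptions.
 */
function fetch_cost_us(
    int $objects,
    int $servers,
    int $objectBytes,
    float $roundTripUs = 200.0,   // assumed fixed overhead per round trip
    float $bytesPerUs = 100.0     // assumed transfer rate
): float {
    // One round trip per server contacted ("N round trips total" on the slides).
    $roundTripCost = $servers * $roundTripUs;

    // Servers respond in parallel, so transfer time follows the longest request.
    $longestRequest = (int) ceil($objects / $servers);
    $transferCost = $longestRequest * $objectBytes / $bytesPerUs;

    return $roundTripCost + $transferCost;
}

foreach ([100, 1000000] as $bytes) {
    foreach ([1, 2, 3] as $servers) {
        printf(
            "%7d-byte objects on %d server(s): %.0f us\n",
            $bytes,
            $servers,
            fetch_cost_us(10, $servers, $bytes)
        );
    }
}
```

Under these assumed numbers, ten 100-byte objects cost 210 µs from one server and 604 µs from three, so clustering wins. Ten 1 MB objects cost 100200 µs from one server and 40600 µs from three, so distribution wins.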

Page 14:

Caching

- Basic tools are parallelism and clustering

- Clustering is a latency/throughput tradeoff

- Application code must be aware of how objects are clustered

- Networking is a burst problem

- Dropped packets kill you

- TCP quick ack
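One way application code can be "aware" of clustering is in how it routes keys to servers. The sketch below is a hedged illustration only: the `user:<id>:<field>` key format, the function names, and the crc32-based routing are assumptions made for this example, not Facebook's scheme or memcache's own hashing. It groups keys per server so that each server receives a single multi-key get, and compares routing by the full key with routing by the owning user.

```php
<?php
/** Pick a server index for a routing string (illustrative hash only). */
function server_index(string $route, int $serverCount): int
{
    return crc32($route) % $serverCount;
}

/** Route by the full key: one user's objects may scatter across servers. */
function route_by_key(string $key): string
{
    return $key;
}

/** Route by the "user:<id>" prefix: one user's objects land on one server. */
function route_by_owner(string $key): string
{
    $parts = explode(':', $key);
    return $parts[0] . ':' . ($parts[1] ?? '');
}

/** Group keys so that each server receives one multi-key get. */
function group_keys_by_server(array $keys, int $serverCount, callable $route): array
{
    $groups = [];
    foreach ($keys as $key) {
        $groups[server_index($route($key), $serverCount)][] = $key;
    }
    return $groups;
}

$keys = ['user:42:name', 'user:42:photo', 'user:42:status', 'user:42:friends'];
$scattered = group_keys_by_server($keys, 4, 'route_by_key');
$clustered = group_keys_by_server($keys, 4, 'route_by_owner');

printf(
    "routing by key: %d round trip(s); routing by owner: %d round trip(s)\n",
    count($scattered),
    count($clustered)
);
```

Routing by owner always yields one round trip for a single user's keys. Routing by full key yields between one and four, depending on how the hash happens to spread them.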

Page 15:

PHP CPU

Page 16:

Application Improvements

Page 17:

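Know what your libraries do

$results = get_search_results( $needle );
foreach ( $results as &$result ) {   // by reference, so the flag is stored in $results
    // note: one is_pending_friend() call per search result
    if ( is_pending_friend( $result['id'] ) ) {
        // we'll change the links based on this
        $result['pending'] = true;
    }
}
unset( $result );   // break the reference left over from the loop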

Page 18:

Know what your libraries do

function is_pending_friend( $id ) {
    // this is short-lived, so don't cache
    expensive_db_query( $id … )
}
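Taken together, these two slides describe a search loop that runs one uncached database query per result. One possible fix, not shown in the talk, is to collect the ids first and ask the database once. The sketch below assumes a hypothetical `FakeFriendDb` class with a `pendingAmong()` method and made-up sample data. It stands in for the real database only to count queries.

```php
<?php
// Hypothetical stand-in for the database. The class, method, and sample data
// are made up for illustration; it exists only to count queries.
final class FakeFriendDb
{
    public $queryCount = 0;
    private $pendingIds = [7 => true, 9 => true];

    /** One lookup for many ids, e.g. a single "WHERE id IN (...)" query. */
    public function pendingAmong(array $ids): array
    {
        $this->queryCount++;
        $pending = [];
        foreach ($ids as $id) {
            if (isset($this->pendingIds[$id])) {
                $pending[$id] = true;
            }
        }
        return $pending;   // map of id => true for pending friends
    }
}

$db = new FakeFriendDb();
$results = [['id' => 5], ['id' => 7], ['id' => 9]];   // stand-in for get_search_results()

// Collect every id first, then ask once instead of once per result.
$pending = $db->pendingAmong(array_column($results, 'id'));
foreach ($results as &$result) {
    $result['pending'] = isset($pending[$result['id']]);
}
unset($result);   // break the reference left over from the loop

printf("%d results, %d query\n", count($results), $db->queryCount);
```

The number of queries stays at one no matter how many search results come back, whereas the per-result version grows linearly with the result count.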

Page 19:

Databases

- Tend to be slower than lighter weight alternatives, so avoid using them

- If you do use them, partition them right from the start

- If a query is _really_ slow, like a few seconds or a few minutes, you probably have a bug where you’re scanning a table

- The db should have a command to tell you what index it’s using for a query, and how many rows it’s examining
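As an illustration of the last two points, the sketch below asks a database how it plans to run a query before and after an index exists. It uses an in-memory SQLite database through PDO purely so that it runs standalone; this is an assumption of the example, and it requires the `pdo_sqlite` extension. The table and index names are made up. SQLite's `EXPLAIN QUERY PLAN` names the index but gives no row counts. In MySQL, the corresponding command is `EXPLAIN`, whose output includes a `key` column for the chosen index and a `rows` estimate.

```php
<?php
// In-memory SQLite database, used only so the sketch is self-contained.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec(
    'CREATE TABLE friend_requests (
        id INTEGER PRIMARY KEY,
        from_user INTEGER,
        to_user INTEGER
    )'
);

/** Return the planner's description of how it would execute $sql. */
function query_plan(PDO $db, string $sql): string
{
    $rows = $db->query('EXPLAIN QUERY PLAN ' . $sql)->fetchAll(PDO::FETCH_ASSOC);
    return implode('; ', array_column($rows, 'detail'));
}

$sql = 'SELECT * FROM friend_requests WHERE to_user = 42';

// No index on to_user yet, so the planner has to scan the whole table.
$before = query_plan($db, $sql);

// With an index on the filtered column, the planner can search instead.
$db->exec('CREATE INDEX idx_requests_to_user ON friend_requests (to_user)');
$after = query_plan($db, $sql);

echo "before: $before\n";
echo "after:  $after\n";
```

The exact wording of the plan text varies between SQLite versions, but the first plan should mention a scan and the second should name the index.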

Page 20:

General Lessons

• Your best tool is parallelism

• Look at your data

• Build tools to look at your data

• Don’t make assumptions about what components are doing

• Algorithmic and system improvements are almost always better than micro-optimization
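"Build tools to look at your data" can start very small. The sketch below is one hypothetical shape for such a tool: a `TimeBreakdown` helper (the name and API are assumptions, not something from the talk) that accumulates time per component and reports each component's share of the total. The sample durations are made up and chosen only to mirror the proportions quoted earlier in the talk.

```php
<?php
/** Hypothetical helper: accumulates time per component and reports shares. */
final class TimeBreakdown
{
    private $totals = [];

    /** Record $seconds spent in $component. */
    public function add(string $component, float $seconds): void
    {
        $this->totals[$component] = ($this->totals[$component] ?? 0.0) + $seconds;
    }

    /** Run $fn, charge its wall-clock time to $component, return its result. */
    public function measure(string $component, callable $fn)
    {
        $start = microtime(true);
        try {
            return $fn();
        } finally {
            $this->add($component, microtime(true) - $start);
        }
    }

    /** Fraction of all recorded time per component, largest first. */
    public function shares(): array
    {
        $sum = array_sum($this->totals);
        if ($sum <= 0.0) {
            return [];
        }
        $shares = array_map(function ($t) use ($sum) {
            return $t / $sum;
        }, $this->totals);
        arsort($shares);   // sort descending, keeping component names as keys
        return $shares;
    }
}

// Made-up durations (in seconds) for one page load.
$timings = new TimeBreakdown();
$timings->add('php', 0.040);
$timings->add('memcache', 0.020);
$timings->add('database', 0.010);
$timings->add('other', 0.010);

$shares = $timings->shares();
foreach ($shares as $component => $share) {
    printf("%-9s %5.1f%%\n", $component, 100 * $share);
}
```

In real code, `measure()` would wrap each call into a component, for example `$timings->measure('memcache', function () { /* fetch */ })`, so the breakdown reflects observed time rather than hand-entered numbers.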