Scaling the Netflix API - OSCON

Transcript
Page 1: Scaling the Netflix API - OSCON

Scaling the Netflix API

Daniel Jacobson
@daniel_jacobson

http://www.linkedin.com/in/danieljacobson
http://www.slideshare.net/danieljacobson

Page 2: Scaling the Netflix API - OSCON

Please read the notes associated with each slide for the full context of the presentation

Page 3: Scaling the Netflix API - OSCON

What do I mean by “scale”?

Page 4: Scaling the Netflix API - OSCON

But There Are Many Ways to Scale!

Organization

Systems

Devices

Development

Testing

Page 5: Scaling the Netflix API - OSCON

But first, some background…

Page 6: Scaling the Netflix API - OSCON

Global Streaming Video for TV Shows and Movies

Page 7: Scaling the Netflix API - OSCON

More than 36 Million Subscribers

More than 40 Countries

Page 8: Scaling the Netflix API - OSCON

Netflix Accounts for 33% of Peak Internet Traffic in North America

Netflix subscribers are watching more than 1 billion hours a month

Page 9: Scaling the Netflix API - OSCON
Page 10: Scaling the Netflix API - OSCON

Netflix REST API: One-Size-Fits-All (OSFA) Solution

Page 11: Scaling the Netflix API - OSCON

Image courtesy of Jay Mac 3 on Flickr

Page 12: Scaling the Netflix API - OSCON

Netflix API Requests by Audience
At Launch in 2008

External Developers

Page 13: Scaling the Netflix API - OSCON
Page 14: Scaling the Netflix API - OSCON
Page 15: Scaling the Netflix API - OSCON

Image courtesy of Jay Mac 3 on Flickr

Page 16: Scaling the Netflix API - OSCON

Netflix API Requests by Audience
From 2011

External Developers

Page 17: Scaling the Netflix API - OSCON

Scaling…

Organization

Systems

Devices

Development

Testing

Page 18: Scaling the Netflix API - OSCON

Distributed Architecture

Page 19: Scaling the Netflix API - OSCON
Page 20: Scaling the Netflix API - OSCON

1000+ Device Types

Page 21: Scaling the Netflix API - OSCON

Personalization Engine
User Info
Movie Metadata
Movie Ratings
Similar Movies
Reviews
A/B Test Engine

Dozens of Dependencies

Page 22: Scaling the Netflix API - OSCON

API

Personalization Engine
User Info
Movie Metadata
Movie Ratings
Similar Movies
Reviews
A/B Test Engine

Page 23: Scaling the Netflix API - OSCON

API

Personalization Engine
User Info
Movie Metadata
Movie Ratings
Similar Movies
Reviews
A/B Test Engine

Page 24: Scaling the Netflix API - OSCON

http://www.slideshare.net/reed2001/culture-1798664

Page 25: Scaling the Netflix API - OSCON

Scaling…

Organization

Systems

Devices

Development

Testing

Page 26: Scaling the Netflix API - OSCON

System Resiliency

Page 27: Scaling the Netflix API - OSCON

Distributed Architecture

Page 28: Scaling the Netflix API - OSCON

Dependency Relationships

Page 29: Scaling the Netflix API - OSCON

2,000,000,000 Requests Per Day to the Netflix API

Page 30: Scaling the Netflix API - OSCON

30 Distinct, Direct Dependent Services for the Netflix API

Page 31: Scaling the Netflix API - OSCON

14,000,000,000 Netflix API Calls Per Day to those Dependent Services

Page 32: Scaling the Netflix API - OSCON

0 Dependent Services with 100% SLA

Page 33: Scaling the Netflix API - OSCON

(99.99%)^30 ≈ 99.7%

0.3% of 2B = 6M failures per day

2+ Hours of Downtime Per Month

Page 34: Scaling the Netflix API - OSCON

(99.99%)^30 ≈ 99.7%

0.3% of 2B = 6M failures per day

2+ Hours of Downtime Per Month

Page 35: Scaling the Netflix API - OSCON

(99.9%)^30 ≈ 97%

3% of 2B = 60M failures per day

20+ Hours of Downtime Per Month
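
The arithmetic on these slides can be reproduced with a short sketch. It assumes the 30 dependencies fail independently of one another and that a month has 30 days (720 hours); the function names are illustrative only and do not come from the talk.

```python
def compound_availability(per_service: float, dependencies: int) -> float:
    """Availability of a request that needs every dependency to succeed,
    assuming the dependencies fail independently of one another."""
    return per_service ** dependencies


def failures_per_day(availability: float, requests_per_day: float) -> float:
    """Expected number of failed requests per day."""
    return (1.0 - availability) * requests_per_day


def downtime_hours_per_month(availability: float, hours_in_month: float = 720.0) -> float:
    """Equivalent downtime, assuming a 30-day month (720 hours)."""
    return (1.0 - availability) * hours_in_month


if __name__ == "__main__":
    for sla in (0.9999, 0.999):
        overall = compound_availability(sla, 30)
        print(f"{sla:.2%} per service -> {overall:.1%} overall, "
              f"{failures_per_day(overall, 2e9) / 1e6:.0f}M failures/day, "
              f"{downtime_hours_per_month(overall):.1f} h downtime/month")
```

Dropping one "nine" per service multiplies the expected failures by roughly ten, which is the point the two slides make side by side.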

Page 36: Scaling the Netflix API - OSCON

API

Personalization Engine
User Info
Movie Metadata
Movie Ratings
Similar Movies
Reviews
A/B Test Engine

Page 37: Scaling the Netflix API - OSCON

API

Personalization Engine
User Info
Movie Metadata
Movie Ratings
Similar Movies
Reviews
A/B Test Engine

Page 38: Scaling the Netflix API - OSCON

API

Personalization Engine
User Info
Movie Metadata
Movie Ratings
Similar Movies
Reviews
A/B Test Engine

Page 39: Scaling the Netflix API - OSCON

API

Personalization Engine
User Info
Movie Metadata
Movie Ratings
Similar Movies
Reviews
A/B Test Engine

Page 40: Scaling the Netflix API - OSCON

API

Personalization Engine
User Info
Movie Metadata
Movie Ratings
Similar Movies
Reviews
A/B Test Engine

Page 41: Scaling the Netflix API - OSCON
Page 42: Scaling the Netflix API - OSCON

Circuit Breaker Dashboard

Page 43: Scaling the Netflix API - OSCON
Page 44: Scaling the Netflix API - OSCON

Call Volume and Health / Last 10 Seconds

Page 45: Scaling the Netflix API - OSCON

Call Volume / Last 2 Minutes

Page 46: Scaling the Netflix API - OSCON

Successful Requests

Page 47: Scaling the Netflix API - OSCON

Successful, But Slower Than Expected

Page 48: Scaling the Netflix API - OSCON

Short-Circuited Requests, Delivering Fallbacks

Page 49: Scaling the Netflix API - OSCON

Timeouts, Delivering Fallbacks

Page 50: Scaling the Netflix API - OSCON

Thread Pool & Task Queue Full, Delivering Fallbacks

Page 51: Scaling the Netflix API - OSCON

Exceptions, Delivering Fallbacks
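
A minimal sketch of the pattern these dashboard slides describe: when a dependency keeps failing, the circuit opens and later calls are short-circuited straight to a fallback. This is not the API of any real library; the class name, failure threshold, and cooldown are assumptions made for illustration, and timeouts and thread-pool rejection are left out for brevity.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    """Illustrative circuit breaker (hypothetical; not a real library API)."""

    def __init__(self, failure_threshold: int = 3, cooldown_seconds: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock
        self.consecutive_failures = 0
        self.opened_at: Optional[float] = None

    def is_open(self) -> bool:
        if self.opened_at is None:
            return False
        # Once the cooldown has passed, trial calls are allowed through again.
        return self.clock() - self.opened_at < self.cooldown_seconds

    def call(self, primary: Callable[[], T], fallback: Callable[[], T]) -> T:
        if self.is_open():
            return fallback()  # short-circuited request, delivering fallback
        try:
            result = primary()
        except Exception:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.failure_threshold:
                self.opened_at = self.clock()  # trip (or re-trip) the circuit
            return fallback()  # exception, delivering fallback
        self.consecutive_failures = 0
        self.opened_at = None  # a success closes the circuit
        return result
```

The caller always gets an answer: either the real result or the fallback, which is what keeps one failing dependency from taking the whole API down with it.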

Page 52: Scaling the Netflix API - OSCON

Error Rate

(# + # + # + #) / (# + # + # + # + #) = Error Rate
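
As a worked example of the error-rate formula, the sketch below assumes that the four counts in the numerator are the four fallback-delivering outcomes from the preceding slides (short-circuited, timeout, thread pool rejection, exception) and that the denominator adds successful requests. That mapping is inferred from the slide order rather than stated on the slide, and the counter names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CallCounts:
    """Hypothetical per-dependency counters for one reporting window."""
    success: int = 0
    short_circuited: int = 0
    timeout: int = 0
    rejected: int = 0      # thread pool and task queue full
    exception: int = 0

    def error_rate(self) -> float:
        """Share of calls in the window that ended in a fallback."""
        errors = self.short_circuited + self.timeout + self.rejected + self.exception
        total = self.success + errors
        return errors / total if total else 0.0
```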

Page 53: Scaling the Netflix API - OSCON

Status of Fallback Circuit

Page 54: Scaling the Netflix API - OSCON

Requests per Second, Over Last 10 Seconds

Page 55: Scaling the Netflix API - OSCON

SLA Information

Page 56: Scaling the Netflix API - OSCON

API

Personalization Engine
User Info
Movie Metadata
Movie Ratings
Similar Movies
Reviews
A/B Test Engine

Page 57: Scaling the Netflix API - OSCON

API

Personalization Engine
User Info
Movie Metadata
Movie Ratings
Similar Movies
Reviews
A/B Test Engine

Page 58: Scaling the Netflix API - OSCON

API

Personalization Engine
User Info
Movie Metadata
Movie Ratings
Similar Movies
Reviews
A/B Test Engine

Page 59: Scaling the Netflix API - OSCON

API

Personalization Engine
User Info
Movie Metadata
Movie Ratings
Similar Movies
Reviews
A/B Test Engine

Fallback

Page 60: Scaling the Netflix API - OSCON

API

Personalization Engine
User Info
Movie Metadata
Movie Ratings
Similar Movies
Reviews
A/B Test Engine

Fallback

Page 61: Scaling the Netflix API - OSCON

System Infrastructure

Page 62: Scaling the Netflix API - OSCON

AWS Cloud

Page 63: Scaling the Netflix API - OSCON
Page 64: Scaling the Netflix API - OSCON
Page 65: Scaling the Netflix API - OSCON

Autoscaling

Page 66: Scaling the Netflix API - OSCON

Autoscaling

Page 67: Scaling the Netflix API - OSCON

More than 36 Million Subscribers

More than 40 Countries

Page 68: Scaling the Netflix API - OSCON

Zuul

Gatekeeper for the Netflix Streaming Application

Page 69: Scaling the Netflix API - OSCON

Zuul

• Multi-Region Resiliency

• Insights

• Stress Testing

• Canary Testing

• Dynamic Routing

• Load Shedding

• Security

• Static Response Handling

• Authentication
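
To make a few of these roles concrete, here is a minimal sketch of a gateway that runs each request through a chain of filters. It is not Zuul's actual API; `Request`, the filter functions, and the thresholds are all hypothetical, and only load shedding, static response handling, and dynamic routing are shown.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Request:
    path: str
    route: str = "production"  # which backend cluster will serve the request


# A filter either returns a response (stopping the chain) or None (continue).
Filter = Callable[[Request], Optional[str]]


def load_shedding(current_load: Callable[[], float], limit: float) -> Filter:
    """Reject requests outright while the measured load is above the limit."""
    def apply(request: Request) -> Optional[str]:
        return "503 shed" if current_load() > limit else None
    return apply


def static_response(paths: Dict[str, str]) -> Filter:
    """Answer selected paths at the edge without touching a backend."""
    def apply(request: Request) -> Optional[str]:
        return paths.get(request.path)
    return apply


def dynamic_routing(rules: Dict[str, str]) -> Filter:
    """Rewrite the target cluster for matching paths, then continue."""
    def apply(request: Request) -> Optional[str]:
        request.route = rules.get(request.path, request.route)
        return None
    return apply


def handle(request: Request, filters: List[Filter]) -> str:
    for current in filters:
        response = current(request)
        if response is not None:
            return response
    return f"200 from {request.route}"
```

Because the routing rules are plain data in this sketch, they could be changed at runtime without redeploying the gateway, which is the property the "dynamic" in the bullet list points to.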

Page 70: Scaling the Netflix API - OSCON

Isthmus

Page 71: Scaling the Netflix API - OSCON

Forced Failure

Page 72: Scaling the Netflix API - OSCON
Page 73: Scaling the Netflix API - OSCON

Scaling…

Organization

Systems

Devices

Development

Testing

Page 74: Scaling the Netflix API - OSCON
Page 75: Scaling the Netflix API - OSCON
Page 76: Scaling the Netflix API - OSCON

Screen Real Estate

Page 77: Scaling the Netflix API - OSCON

Controller

Page 78: Scaling the Netflix API - OSCON

Technical Capabilities

Page 79: Scaling the Netflix API - OSCON

One-Size-Fits-All API

Request (label repeated for each of the many device requests in the diagram)

Page 80: Scaling the Netflix API - OSCON

Scaling…

Organization

Systems

Devices

Development

Testing

Page 81: Scaling the Netflix API - OSCON

Courtesy of South Florida Classical Review

Page 82: Scaling the Netflix API - OSCON
Page 83: Scaling the Netflix API - OSCON

Resource-Based API

vs.

Experience-Based API

Page 84: Scaling the Netflix API - OSCON

Resource-Based Requests

• /users/<id>/ratings/title

• /users/<id>/queues

• /users/<id>/queues/instant

• /users/<id>/recommendations

• /catalog/titles/movie

• /catalog/titles/series

• /catalog/people

Page 85: Scaling the Netflix API - OSCON

REST API

RECOMMENDATIONS
MOVIE DATA
SIMILAR MOVIES
AUTH
MEMBER DATA
A/B TESTS
START-UP
RATINGS

Network Border

Page 86: Scaling the Netflix API - OSCON

RECOMMENDATIONS
MOVIE DATA
SIMILAR MOVIES
AUTH
MEMBER DATA
A/B TESTS
START-UP
RATINGS

OSFA API

Network Border

SERVER CODE

CLIENT CODE

Page 87: Scaling the Netflix API - OSCON

RECOMMENDATIONS
MOVIE DATA
SIMILAR MOVIES
AUTH
MEMBER DATA
A/B TESTS
START-UP
RATINGS

OSFA API

Network Border

DATA GATHERING, FORMATTING, AND DELIVERY

USER INTERFACE RENDERING

Page 88: Scaling the Netflix API - OSCON
Page 89: Scaling the Netflix API - OSCON
Page 90: Scaling the Netflix API - OSCON

Experience-Based Requests

• /ps3/homescreen
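
A minimal sketch of what an experience-based endpoint such as `/ps3/homescreen` might do on the server: gather data from several services concurrently and return one payload shaped for that device, instead of the device issuing one request per resource. The service functions and payload fields below are placeholders, not the real Netflix services.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


# Placeholder service calls; real ones would be remote calls to dependencies.
def get_recommendations(user_id: str) -> List[str]:
    return ["title-1", "title-2"]


def get_queue(user_id: str) -> List[str]:
    return ["title-3"]


def get_ratings(user_id: str, titles: List[str]) -> Dict[str, int]:
    return {title: 4 for title in titles}


def ps3_homescreen(user_id: str) -> dict:
    """Hypothetical handler for /ps3/homescreen: one request, one response."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Independent lookups run concurrently on the server side.
        recs_future = pool.submit(get_recommendations, user_id)
        queue_future = pool.submit(get_queue, user_id)
        recs, queue = recs_future.result(), queue_future.result()
    # Ratings depend on the titles gathered above, so they are fetched last.
    ratings = get_ratings(user_id, recs + queue)
    return {
        "rows": [
            {"name": "Recommendations", "titles": recs},
            {"name": "Instant Queue", "titles": queue},
        ],
        "ratings": ratings,
    }
```

With the resource-based API the same screen would take at least three round trips across the network border; here the chattiness stays inside the server.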

Page 91: Scaling the Netflix API - OSCON

JAVA API

Network Border

RECOMMENDATIONS
MOVIE DATA
SIMILAR MOVIES
AUTH
MEMBER DATA
A/B TESTS
START-UP
RATINGS

Groovy Layer

Page 92: Scaling the Netflix API - OSCON
Page 93: Scaling the Netflix API - OSCON

RECOMMENDATIONS
MOVIE DATA
SIMILAR MOVIES
AUTH
MEMBER DATA
A/B TESTS
START-UP
RATINGS

JAVA API

SERVER CODE

CLIENT CODE

CLIENT ADAPTER CODE (WRITTEN BY CLIENT TEAMS, DYNAMICALLY UPLOADED TO SERVER)

Network Border
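
The idea of client teams uploading adapter code to the server can be sketched as a registry that maps an endpoint to a function supplied at runtime. The slides name a Groovy layer for this; the Python below only illustrates the shape, and `AdapterRegistry`, `JavaApi`, and the `/tv/home` endpoint are hypothetical.

```python
from typing import Callable, Dict


class JavaApi:
    """Stand-in for the server-side API that adapters call into."""

    def recommendations(self, user_id: str) -> list:
        return ["title-1", "title-2"]


Adapter = Callable[[JavaApi, str], dict]


class AdapterRegistry:
    """Holds one adapter per endpoint; an adapter can be replaced at runtime."""

    def __init__(self, api: JavaApi) -> None:
        self._api = api
        self._adapters: Dict[str, Adapter] = {}

    def upload(self, endpoint: str, adapter: Adapter) -> None:
        self._adapters[endpoint] = adapter

    def handle(self, endpoint: str, user_id: str) -> dict:
        if endpoint not in self._adapters:
            return {"status": 404}
        body = self._adapters[endpoint](self._api, user_id)
        return {"status": 200, "body": body}


def tv_home(api: JavaApi, user_id: str) -> dict:
    # Written by a (hypothetical) TV client team: formats data for its own UI.
    return {"row": api.recommendations(user_id)[:1]}
```

In this arrangement the server team owns `JavaApi` and the client team owns `tv_home`, which mirrors the split between data gathering and data formatting shown on the slides.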

Page 94: Scaling the Netflix API - OSCON

RECOMMENDATIONS
MOVIE DATA
SIMILAR MOVIES
AUTH
MEMBER DATA
A/B TESTS
START-UP
RATINGS

JAVA API

DATA GATHERING

DATA FORMATTING AND DELIVERY

USER INTERFACE RENDERING

Network Border

Page 95: Scaling the Netflix API - OSCON
Page 96: Scaling the Netflix API - OSCON

Scaling…

Organization

Systems

Devices

Development

Testing

Page 97: Scaling the Netflix API - OSCON

Dependency Relationships

Page 98: Scaling the Netflix API - OSCON
Page 99: Scaling the Netflix API - OSCON

Testing Philosophy:

Act Fast, React Fast

Page 100: Scaling the Netflix API - OSCON

That Doesn’t Mean We Don’t Test

• Unit tests

• Functional tests

• Regression scripts

• Continuous integration

• Capacity planning

• Load / Performance tests

Page 101: Scaling the Netflix API - OSCON

Cloud-Based Deployment Techniques

Page 102: Scaling the Netflix API - OSCON

Current Code

In Production

API Requests from the Internet

Page 103: Scaling the Netflix API - OSCON

Canary Analysis Automation

Page 104: Scaling the Netflix API - OSCON

Single Canary Instance
To Test New Code with Production Traffic

(around 1% or less of traffic)

Current Code

In Production

API Requests from the Internet

Error!
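
A minimal sketch of the canary idea: send a small share of production traffic to one instance running new code, then compare its error rate with the baseline. The 1% share comes from the slide; the tolerance, minimum sample size, and function names are assumptions, not Netflix's actual canary analysis.

```python
import random
from typing import Optional


def pick_target(canary_share: float = 0.01,
                rng: Optional[random.Random] = None) -> str:
    """Route roughly `canary_share` of requests to the canary instance."""
    draw = rng.random() if rng is not None else random.random()
    return "canary" if draw < canary_share else "baseline"


def canary_verdict(baseline_errors: int, baseline_total: int,
                   canary_errors: int, canary_total: int,
                   tolerance: float = 0.005, min_requests: int = 1000) -> str:
    """Compare error rates; `tolerance` is the allowed absolute increase."""
    if canary_total < min_requests or baseline_total == 0:
        return "inconclusive"  # not enough traffic to judge the canary yet
    baseline_rate = baseline_errors / baseline_total
    canary_rate = canary_errors / canary_total
    return "fail" if canary_rate > baseline_rate + tolerance else "pass"
```

A "fail" verdict would mean the canary is pulled and the current code keeps serving all traffic, so only the small canary share ever saw the faulty build.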

Page 105: Scaling the Netflix API - OSCON

Current Code

In Production

API Requests from the Internet

Page 106: Scaling the Netflix API - OSCON

Current Code

In Production

API Requests from the Internet

Perfect!

Page 107: Scaling the Netflix API - OSCON

Current Code

In Production

API Requests from the Internet

Perfect!

Page 108: Scaling the Netflix API - OSCON

Stress Test with Zuul

Page 109: Scaling the Netflix API - OSCON

Current Code

In Production

API Requests from the Internet

New Code

Getting Prepared for Production

Page 110: Scaling the Netflix API - OSCON

Current Code

In Production

API Requests from the Internet

New Code

Getting Prepared for Production

Page 111: Scaling the Netflix API - OSCON

Error!

Current Code

In Production

API Requests from the Internet

New Code

Getting Prepared for Production

Page 112: Scaling the Netflix API - OSCON

Current Code

In Production

API Requests from the Internet

New Code

Getting Prepared for Production

Page 113: Scaling the Netflix API - OSCON

Current Code

In Production

API Requests from the Internet

Perfect!

Page 114: Scaling the Netflix API - OSCON

Stress Test with Zuul

Page 115: Scaling the Netflix API - OSCON

Current Code

In Production

API Requests from the Internet

New Code

Getting Prepared for Production

Page 116: Scaling the Netflix API - OSCON

Current Code

In Production

API Requests from the Internet

New Code

Getting Prepared for Production

Page 117: Scaling the Netflix API - OSCON

API Requests from the Internet

New Code

Getting Prepared for Production

Page 118: Scaling the Netflix API - OSCON

https://www.github.com/Netflix

Page 119: Scaling the Netflix API - OSCON

Scaling the Netflix API

Daniel Jacobson
@daniel_jacobson

http://www.linkedin.com/in/danieljacobson
http://www.slideshare.net/danieljacobson

Help Wanted!