Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

Page 1: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

Asynchronous Architectures for Implementing Scalable Cloud Services

Designing for Graceful Degradation

EVAN COOKE

CO-FOUNDER & CTO, twilio

CLOUD COMMUNICATIONS

Page 3: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

Cloud services power the apps that are the backbone of modern society: how we work, play, and communicate.

Page 4: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

Cloud Workloads Can Be Unpredictable

Page 5: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

[Figure: SMS API usage over time, showing a 6x spike in 5 mins]

Page 6: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

[Figure: request latency and load plotted over time, ending in FAIL]

Danger! Load higher than instantaneous throughput

Page 7: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

Don’t Fail Requests

Page 8: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

[Diagram: Incoming Requests -> Load Balancer -> App Servers, each with AAA and Throttling layers; workers (W) grouped into a Worker Pool]

Page 9: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

Worker Pools, e.g., Apache/Nginx

[Figure: failed requests over time, with levels marked 10%, 70%, and 100%+]

Page 10: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

Problem Summary

• Cloud services often use worker pools to handle incoming requests

• When load goes beyond the size of the worker pool, requests fail
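
The failure mode in the summary above can be sketched with a toy fixed-size pool. This is a minimal illustration, not how Apache or Nginx are implemented; `WorkerPool`, `simulate`, and the pool sizes are hypothetical names and values chosen for the example.

```python
import threading


class WorkerPool:
    """Toy fixed-size pool: a request either gets a free worker or fails."""

    def __init__(self, size):
        self._free = threading.Semaphore(size)

    def try_acquire(self):
        # Non-blocking: returns False immediately when every worker is busy.
        return self._free.acquire(blocking=False)

    def release(self):
        self._free.release()


def simulate(pool_size, concurrent_requests):
    """Return (served, failed) when all requests arrive at the same instant."""
    pool = WorkerPool(pool_size)
    served = failed = 0
    for _ in range(concurrent_requests):
        if pool.try_acquire():
            served += 1   # the worker stays busy for the rest of the burst
        else:
            failed += 1   # no free worker: the request is rejected
    return served, failed


if __name__ == "__main__":
    print(simulate(pool_size=4, concurrent_requests=3))    # under capacity
    print(simulate(pool_size=4, concurrent_requests=10))   # over capacity
```

With a pool of 4, a burst of 10 simultaneous requests leaves 6 with no worker, which is the behaviour the rest of the talk tries to avoid.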

Page 11: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

What next?

A few observations based on work implementing and scaling the Twilio API over the past 4 years...

• Twilio Voice/SMS Cloud APIs

• 100,000 Twilio Developers

• 100+ employees

Page 12: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

Observation 1

For many APIs, taking more time to service a request is better than failing that request

Implication: in many cases, it is better to service a request with some delay rather than failing it

Page 13: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

Observation 2

Matching the amount of available resources precisely to the size of incoming request worker pools is challenging

Implication: under load, it may be possible to delay or drop only those requests that truly impact resources

Page 14: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

What are we going to do?

Suggestion: if request concurrency were very cheap, we could implement delay and finer-grained resource controls much more easily...

Page 15: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

Event-driven programming and the Reactor Pattern

Page 16: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

Event-driven programming and the Reactor Pattern

    req = 'GET /';
    req.append('\r\n\r\n');
    socket.write(req);
    resp = socket.read();
    print(resp);

Relative time per statement: 1, 1, 10000x, 10000000x, 10

[Figure: the worker's timeline, with time on the horizontal axis]

Page 17: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

Event-driven programming and the Reactor Pattern

Relative time per statement: 1, 1, 10000x, 10000000x, 10

Huge IO latency blocks worker

Page 18: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

Event-driven programming and the Reactor Pattern

    req = 'GET /';
    req.append('\r\n\r\n');
    socket.write(req, fn() {
        socket.read(fn(resp) {
            print(resp);
        });
    });

Make IO operations async and “callback” when done

Page 19: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

Event-driven programming and the Reactor Pattern

    req = 'GET /';
    req.append('\r\n\r\n');
    socket.write(req, fn() {
        socket.read(fn(resp) {
            print(resp);
        });
    });
    reactor.run_forever();

Central dispatch to coordinate event callbacks

Page 20: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

Event-driven programming and the Reactor Pattern

Relative time per statement with async IO: 1, 1, 10, 10, 10

Result: we don’t block the worker
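
To make the central dispatch idea concrete, here is a minimal sketch of a reactor built on Python's standard `selectors` module. It is not the code of any framework named in this talk; the `Reactor` class, its method names, and `demo_reactor` are hypothetical, and a local socket pair stands in for a real network connection.

```python
import selectors
import socket


class Reactor:
    """Tiny reactor: waits for readiness events and dispatches callbacks."""

    def __init__(self):
        self._selector = selectors.DefaultSelector()
        self._running = False

    def on_readable(self, sock, callback):
        # Register interest in "sock is readable"; callback(sock) runs later.
        self._selector.register(sock, selectors.EVENT_READ, callback)

    def remove(self, sock):
        self._selector.unregister(sock)

    def stop(self):
        self._running = False

    def close(self):
        self._selector.close()

    def run(self):
        # Central dispatch loop: the only place that waits is select().
        self._running = True
        while self._running:
            for key, _events in self._selector.select(timeout=1.0):
                key.data(key.fileobj)


def demo_reactor():
    """Send a request over a local socket pair and read it via a callback."""
    reactor = Reactor()
    client, server = socket.socketpair()
    client.setblocking(False)
    server.setblocking(False)
    received = []

    def handle_request(sock):
        received.append(sock.recv(1024))
        reactor.remove(sock)
        reactor.stop()

    reactor.on_readable(server, handle_request)
    client.sendall(b"GET /\r\n\r\n")
    reactor.run()
    reactor.close()
    client.close()
    server.close()
    return received


if __name__ == "__main__":
    print(demo_reactor())
```

The handler never waits on the socket itself; it only runs once the selector reports that data is ready, so a single thread can interleave many such handlers.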

Page 21: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

(Some) Reactor Pattern Frameworks

js/node.js

python/twisted

python/gevent

c/libevent

c/libev

ruby/eventmachine

java/nio/netty

Page 22: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

The Callback Mess

Python Twisted

    req = 'GET /'
    req += '\r\n\r\n'

    def r(resp):
        print(resp)

    def w(_):
        # addCallback passes the previous result to the callback
        socket.read().addCallback(r)

    socket.write(req).addCallback(w)

Page 23: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

The Callback Mess

Python Twisted

    req = 'GET /'
    req += '\r\n\r\n'

    # inside a generator function decorated with @defer.inlineCallbacks
    yield socket.write(req)
    resp = yield socket.read()
    print(resp)

Use deferred generators and inline callbacks

Page 24: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

The Callback Mess

Easy sequential programming with mostly implicit async IO
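
How a `yield` can stand in for a callback may be easier to see in a stripped-down sketch. The `Deferred` class and `inline_callbacks` decorator below are simplified stand-ins written for this example, not Twisted's implementations, and they omit error handling entirely.

```python
class Deferred:
    """Toy stand-in for a deferred result (not Twisted's class)."""

    def __init__(self):
        self._callbacks = []
        self._fired = False
        self._result = None

    def add_callback(self, fn):
        if self._fired:
            fn(self._result)
        else:
            self._callbacks.append(fn)

    def fire(self, result):
        self._fired = True
        self._result = result
        for fn in self._callbacks:
            fn(result)


def inline_callbacks(generator_fn):
    """Drive a generator: each yielded Deferred resumes it when fired."""
    def start(*args):
        gen = generator_fn(*args)

        def step(value):
            try:
                deferred = gen.send(value)
            except StopIteration:
                return
            deferred.add_callback(step)

        step(None)
    return start


def demo_inline():
    pending_read = Deferred()
    log = []

    @inline_callbacks
    def fetch():
        log.append("request sent")
        resp = yield pending_read     # suspends here without blocking
        log.append("got " + resp)

    fetch()
    log.append("worker is free")      # runs while fetch() is suspended
    pending_read.fire("200 OK")       # the "IO" completes; fetch() resumes
    return log


if __name__ == "__main__":
    print(demo_inline())
```

The body of `fetch` reads sequentially, yet control returns to the caller at the `yield`, which is the property the slide describes as mostly implicit async IO.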

Page 25: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

Enter gevent

“gevent is a coroutine-based Python networking library that uses greenlet to provide a high-level synchronous API on top of the libevent event loop.”

    socket.write()
    resp = socket.read()
    print(resp)

Natively Async

Page 26: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

Enter gevent

    from gevent.server import StreamServer

    def echo(socket, address):
        print('New connection from %s:%s' % address)
        socket.sendall(b'Welcome to the echo server!\r\n')
        # makefile() gives a file-like object so we can use readline()
        fileobj = socket.makefile(mode='rwb')
        line = fileobj.readline()
        fileobj.write(line)
        fileobj.flush()
        print("echoed %r" % line)

    if __name__ == '__main__':
        server = StreamServer(('0.0.0.0', 6000), echo)
        server.serve_forever()

Simple Echo Server

Easy sequential model

Fully async

Page 27: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

Async Services with Ginkgo

Ginkgo is a simple framework for composing async gevent services with common configuration, logging, daemonizing, etc.

https://github.com/progrium/ginkgo

Let’s look at a simple example that implements a TCP and HTTP server...

Page 28: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

Async Services with Ginkgo

    import gevent
    from gevent.pywsgi import WSGIServer
    from gevent.server import StreamServer

    from ginkgo.core import Service

    def handle_http(env, start_response):
        start_response('200 OK', [('Content-Type', 'text/html')])
        print('new http request!')
        # bytes literals work on both Python 2 and Python 3
        return [b"hello world"]

    def handle_tcp(socket, address):
        print('new tcp connection!')
        while True:
            socket.send(b'hello\n')
            gevent.sleep(1)

    app = Service()
    app.add_service(StreamServer(('127.0.0.1', 1234), handle_tcp))
    app.add_service(WSGIServer(('127.0.0.1', 8080), handle_http))
    app.serve_forever()

Page 29: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

Async Services with Ginkgo

Import WSGI/TCP Servers: WSGIServer from gevent.pywsgi and StreamServer from gevent.server

Page 30: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

Async Services with Ginkgo

HTTP Handler: handle_http(env, start_response)

Page 31: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

Async Services with Ginkgo

TCP Handler: handle_tcp(socket, address)

Page 32: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

Async Services with Ginkgo

Service Composition: app = Service(), one app.add_service(...) call per server, then app.serve_forever()

Page 33: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

[Diagram: Incoming Requests -> Load Balancer -> a row of Async Servers]

Using our async reactor-based approach, let’s redesign our serving infrastructure

Page 34: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

[Diagram: Incoming Requests -> Load Balancer -> Async Servers, each with an AAA layer]

Step 1: define an authentication and authorization layer that will identify the user and the resource being requested
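
A minimal sketch of what such a layer might compute, assuming a static token table and paths shaped like `/accounts/<account>/<resource_type>/...`. The `TOKENS` table, `AuthError`, `identify`, and the path layout are all hypothetical and chosen only for illustration; they are not the Twilio API.

```python
import hmac

# Hypothetical credential store: account id -> API token (assumption).
TOKENS = {"AC123": "s3cret", "AC456": "hunter2"}


class AuthError(Exception):
    """Raised when a request cannot be authenticated or authorized."""


def identify(account, token, path):
    """Authenticate the caller and return (account, resource_type).

    Assumes paths shaped like /accounts/<account>/<resource_type>/...,
    an illustrative layout only.
    """
    expected = TOKENS.get(account)
    # compare_digest avoids leaking token contents through timing.
    if expected is None or not hmac.compare_digest(expected, token):
        raise AuthError("unknown account or bad token")

    parts = [p for p in path.split("/") if p]
    if len(parts) < 3 or parts[0] != "accounts":
        raise AuthError("unrecognised resource path")
    if parts[1] != account:
        # Authorization: callers may only touch their own account.
        raise AuthError("account not authorized for this resource")
    return account, parts[2]


if __name__ == "__main__":
    print(identify("AC123", "s3cret", "/accounts/AC123/sms/messages"))
```

The returned `(account, resource_type)` pair is the kind of key a later throttling layer could use to decide which requests to delay or deny.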

Page 35: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

[Diagram: Incoming Requests -> Load Balancer -> Async Servers, each with AAA and Throttling layers, alongside a Concurrency Manager]

Step 2: add a throttling layer and concurrency manager

Page 36: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

Concurrency Admission Control

• Goal: limit concurrency by delaying or selectively failing requests

• Common metrics
  - By Account
  - By Resource Type
  - By Availability of Dependent Resources

• What we’ve found useful
  - By (Account, Resource Type)
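
One way to picture admission control keyed by (Account, Resource Type) is the sketch below. It uses the standard library's `asyncio` rather than gevent so that it runs without extra dependencies; `AdmissionController`, `handle`, `demo_admission`, the limits, and the timings are all hypothetical choices for the example, not the talk's implementation.

```python
import asyncio
from collections import defaultdict


class AdmissionController:
    """Limits in-flight requests per (account, resource_type) key."""

    def __init__(self, limit_per_key, max_delay):
        self._limit = limit_per_key
        self._max_delay = max_delay   # seconds a request may wait for a slot
        self._slots = defaultdict(lambda: asyncio.Semaphore(self._limit))

    async def admit(self, account, resource_type):
        """Wait for a slot (delay); return False if none frees up (deny)."""
        slot = self._slots[(account, resource_type)]
        try:
            await asyncio.wait_for(slot.acquire(), timeout=self._max_delay)
            return True
        except asyncio.TimeoutError:
            return False

    def release(self, account, resource_type):
        self._slots[(account, resource_type)].release()


async def handle(controller, account, resource_type, work_seconds):
    if not await controller.admit(account, resource_type):
        return "denied"
    try:
        await asyncio.sleep(work_seconds)   # stand-in for the real work
        return "ok"
    finally:
        controller.release(account, resource_type)


async def demo_admission():
    controller = AdmissionController(limit_per_key=1, max_delay=0.2)
    return await asyncio.gather(
        handle(controller, "AC1", "sms", 0.01),    # admitted immediately
        handle(controller, "AC1", "sms", 0.01),    # delayed, then served
        handle(controller, "AC1", "voice", 0.01),  # different key, no wait
        handle(controller, "AC2", "sms", 1.0),     # holds AC2's only sms slot
        handle(controller, "AC2", "sms", 0.01),    # waits 0.2 s, then denied
    )


if __name__ == "__main__":
    print(asyncio.run(demo_admission()))
```

Because waiting requests are cheap coroutines rather than blocked workers, a short delay costs little, and only the request that would wait too long on a saturated key is denied.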

Page 37: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

Delay - delay responses without failing requests

[Figure: latency plotted against load]

Page 38: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

Deny - deny requests based on resource usage

[Figure: latency plotted against load, with curves labelled “Latency /x” (marked Fail) and “Latency /*”]

Page 39: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

[Diagram: Incoming Requests -> Load Balancer -> App Servers, each with AAA and Throttling layers, alongside a Concurrency Manager; Dependent Services sit behind their own Throttling layers]

Step 3: allow backend resources to throttle requests
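
A rough sketch of Step 3, assuming each dependent service can report whether it has spare capacity. `DependentService`, `BackendAwareThrottle`, and the capacities are hypothetical; a real system would obtain this signal over the network rather than from an in-process counter.

```python
class DependentService:
    """Toy backend that tracks in-flight work against a fixed capacity."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.in_flight = 0

    def accepting(self):
        # The backend itself decides whether it can take more work.
        return self.in_flight < self.capacity


class BackendAwareThrottle:
    """Consults the relevant dependent service before admitting a request."""

    def __init__(self, services):
        self._services = {s.name: s for s in services}

    def admit(self, resource_type):
        service = self._services[resource_type]
        if not service.accepting():
            return False          # throttle at the edge, spare the backend
        service.in_flight += 1
        return True

    def done(self, resource_type):
        self._services[resource_type].in_flight -= 1


if __name__ == "__main__":
    throttle = BackendAwareThrottle([
        DependentService("sms", capacity=2),
        DependentService("voice", capacity=1),
    ])
    print([throttle.admit("sms") for _ in range(3)])  # third sms is refused
    print(throttle.admit("voice"))                    # voice is unaffected
    throttle.done("sms")
    print(throttle.admit("sms"))                      # capacity freed again
```

Only requests that depend on the saturated backend are turned away; requests for other resource types continue to be admitted.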

Page 40: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

Summary

Async frameworks like gevent allow you to easily decouple a request from access to constrained resources

[Figure: request latency over time, with a region marked Service-wide Failure]

Page 41: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

Don’t Fail Requests

Decrease Performance

Page 43: Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012

CONTENTS CONFIDENTIAL & COPYRIGHT © TWILIO INC. 2012

Evan Cooke, @emcooke

twilio