Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012
Asynchronous Architectures for Implementing Scalable Cloud Services
Designing for Graceful Degradation

EVAN COOKE
CO-FOUNDER & CTO, twilio
CLOUD COMMUNICATIONS
Cloud services power the apps that are the backbone of modern society. How
we work, play, and communicate.
Cloud Workloads Can Be Unpredictable

[Chart: SMS API usage, showing a 6x spike in 5 mins]
[Chart: request latency rising with load over time, ending in FAIL]

Danger! Load higher than instantaneous throughput.
Don’t Fail Requests

[Diagram: a load balancer distributes incoming requests across app servers (worker pools, e.g., Apache/Nginx), each fronted by AAA and throttling. As load reaches 10%, 70%, and then 100%+ of worker-pool capacity, failed requests accumulate over time.]
Problem Summary
• Cloud services often use worker pools to handle incoming requests
• When load goes beyond the size of the worker pool, requests fail
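The failure mode can be sketched with a toy worker pool; the pool size, burst size, and the sleep that stands in for slow blocking work are illustrative, not Twilio's numbers:

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Toy model of a fixed worker pool: a request that arrives while all
# workers are busy fails instead of being serviced.
POOL_SIZE = 4
workers = threading.BoundedSemaphore(POOL_SIZE)

def handle(request_id):
    if not workers.acquire(blocking=False):
        return 'FAILED'        # pool exhausted -> request fails
    try:
        time.sleep(0.3)        # slow, blocking request handling
        return 'OK'
    finally:
        workers.release()

# A 2x burst: 8 concurrent requests against 4 workers.
with ThreadPoolExecutor(max_workers=8) as ex:
    results = list(ex.map(handle, range(8)))
print(results.count('OK'), results.count('FAILED'))
```

With the burst at twice the pool size, half the requests find no free worker and fail immediately, even though the service could have handled them a moment later.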
What next?
A few observations based on work implementing and scaling the Twilio API over the past 4 years...
• Twilio Voice/SMS Cloud APIs
• 100,000 Twilio Developers
• 100+ employees
Observation 1
For many APIs, taking more time to service a request is better than failing that request
Implication: in many cases, it is better to service a request with some delay rather than failing it
Observation 2
Matching the amount of available resources precisely to the size of incoming request worker pools is challenging
Implication: under load, it may be possible to delay or drop only those requests that truly impact resources
What are we going to do?
Suggestion: if request concurrency were very cheap, we could implement delay and finer-grained resource controls much more easily...
Event-driven programming and the Reactor Pattern

    req = 'GET /';
    req.append('\r\n\r\n');
    socket.write(req);
    resp = socket.read();
    print(resp);

[Timeline: the two string operations cost ~1 cycle each, the write ~10,000x that, the read ~10,000,000x, the print ~10 — the worker sits idle throughout the IO]

Huge IO latency blocks the worker.
Make IO operations async and “callback” when done:

    req = 'GET /';
    req.append('\r\n\r\n');
    socket.write(req, fn() {
      socket.read(fn(resp) {
        print(resp);
      });
    });

    // Central dispatch to coordinate event callbacks
    reactor.run_forever();

[Timeline: each step now costs only ~1-10 cycles on the worker; the waiting happens in the reactor]

Result: we don’t block the worker.
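The pattern can be sketched in a few lines with Python's stdlib selectors module; the Reactor class and its method names here are illustrative, not any particular framework's API:

```python
import selectors
import socket

class Reactor:
    """Minimal central dispatcher: callbacks fire when sockets are ready."""
    def __init__(self):
        self.sel = selectors.DefaultSelector()

    def register(self, sock, events, callback):
        self.sel.register(sock, events, callback)

    def unregister(self, sock):
        self.sel.unregister(sock)

    def run_once(self, timeout=1.0):
        # Wait for ready sockets and dispatch their callbacks.
        for key, _events in self.sel.select(timeout):
            key.data(key.fileobj)

# Demo with a connected socket pair: the worker never blocks on read;
# the callback runs only once data is actually available.
a, b = socket.socketpair()
reactor = Reactor()
received = []

def on_readable(sock):
    received.append(sock.recv(1024))
    reactor.unregister(sock)

reactor.register(b, selectors.EVENT_READ, on_readable)
a.sendall(b'GET /\r\n\r\n')
reactor.run_once()
a.close()
b.close()
```

Between `register` and `run_once` the worker is free to do other work; production reactors run the dispatch loop forever and multiplex thousands of sockets the same way.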
(Some) Reactor Pattern Frameworks
• js/node.js
• python/twisted
• python/gevent
• c/libevent
• c/libev
• ruby/eventmachine
• java/nio/netty
The Callback Mess

Python Twisted:

    req = 'GET /'
    req += '\r\n\r\n'

    def r(resp):
        print resp

    def w():
        socket.read().addCallback(r)

    socket.write(req).addCallback(w)
Use deferred generators and inline callbacks:

    req = 'GET /'
    req += '\r\n\r\n'

    yield socket.write(req)
    resp = yield socket.read()
    print resp

Easy sequential programming with mostly implicit async IO.
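The deferred-generator trick can be mimicked with plain Python generators. This toy driver is an illustration of the mechanism, not Twisted's API; the operation names and the fake IO-result table are invented for the example:

```python
def run(gen, io_results):
    """Drive a coroutine-style generator: each yielded operation name
    is looked up in a fake async-IO result table and sent back in."""
    try:
        op = next(gen)
        while True:
            op = gen.send(io_results[op])  # resume with the IO result
    except StopIteration as stop:
        return stop.value

def fetch():
    # Reads like sequential code, but each yield hands control back
    # to the driver until the "IO" completes.
    yield 'write'            # async write; result unused
    resp = yield 'read'      # async read; result bound to resp
    return resp

result = run(fetch(), {'write': None, 'read': 'HTTP/1.0 200 OK'})
print(result)  # HTTP/1.0 200 OK
```

In a real reactor the driver would resume the generator from an IO callback instead of a lookup table, which is exactly what Twisted's inline callbacks do.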
Enter gevent

“gevent is a coroutine-based Python networking library that uses greenlet to provide a high-level synchronous API on top of the libevent event loop.”

Natively async:

    socket.write(req)
    resp = socket.read()
    print resp
Simple Echo Server:

    from gevent.server import StreamServer

    def echo(socket, address):
        print('New connection from %s:%s' % address)
        socket.sendall('Welcome to the echo server!\r\n')
        fileobj = socket.makefile()
        line = fileobj.readline()
        fileobj.write(line)
        fileobj.flush()
        print("echoed %r" % line)

    if __name__ == '__main__':
        server = StreamServer(('0.0.0.0', 6000), echo)
        server.serve_forever()

Easy sequential model, fully async.
Async Services with Ginkgo

Ginkgo is a simple framework for composing async gevent services with common configuration, logging, daemonizing, etc.
https://github.com/progrium/ginkgo

Let’s look at a simple example that implements a TCP and HTTP server...
Async Services with Ginkgo:

    import gevent
    # Import WSGI/TCP servers
    from gevent.pywsgi import WSGIServer
    from gevent.server import StreamServer
    from ginkgo.core import Service

    # HTTP handler
    def handle_http(env, start_response):
        start_response('200 OK', [('Content-Type', 'text/html')])
        print 'new http request!'
        return ["hello world"]

    # TCP handler
    def handle_tcp(socket, address):
        print 'new tcp connection!'
        while True:
            socket.send('hello\n')
            gevent.sleep(1)

    # Service composition
    app = Service()
    app.add_service(StreamServer(('127.0.0.1', 1234), handle_tcp))
    app.add_service(WSGIServer(('127.0.0.1', 8080), handle_http))
    app.serve_forever()
Using our async reactor-based approach, let’s redesign our serving infrastructure.

[Diagram: a load balancer distributes incoming requests across async servers]

Step 1: define an authentication and authorization (AAA) layer that will identify the user and the resource being requested.

[Diagram: load balancer → AAA layer → async servers]
Step 2: add a throttling layer and concurrency manager.

[Diagram: load balancer → AAA → throttling layer → async servers, with a concurrency manager coordinating the throttling layers]
Concurrency Admission Control
• Goal: limit concurrency by delaying or selectively failing requests
• Common metrics:
  - By Account
  - By Resource Type
  - By Availability of Dependent Resources
• What we’ve found useful: by (Account, Resource Type)
Delay - delay responses without failing requests
[Chart: latency rises with load, but no requests fail]

Deny - deny requests based on resource usage
[Chart: latency stays flat with load; excess requests fail]
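A delay-then-deny admission controller keyed by (account, resource type), the metric the talk found useful, can be sketched as follows; the cap, the delay budget, and the function names are illustrative assumptions:

```python
from collections import defaultdict
from threading import BoundedSemaphore

MAX_CONCURRENT = 2   # per-(account, resource) concurrency cap
MAX_DELAY = 0.05     # seconds to delay before denying

# One semaphore per (account, resource type) pair.
_slots = defaultdict(lambda: BoundedSemaphore(MAX_CONCURRENT))

def admit(account, resource):
    """Claim a slot for this request. Delay briefly under contention;
    deny only if the cap is still exceeded after the delay."""
    return _slots[(account, resource)].acquire(timeout=MAX_DELAY)

def release(account, resource):
    """Free the slot when the request finishes."""
    _slots[(account, resource)].release()
```

With a cap of 2, a third concurrent request for the same (account, resource) pair is delayed up to 50 ms and then denied, while a request for a different resource type under the same account is admitted immediately.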
Step 3: allow backend resources to throttle requests.

[Diagram: load balancer → AAA → throttling → app servers, with the concurrency manager also receiving throttling signals from dependent backend services]
Summary

Async frameworks like gevent allow you to easily decouple a request from access to constrained resources.

[Chart: request latency over time - decrease performance, don’t fail requests; avoid service-wide failure]
CONTENTS CONFIDENTIAL & COPYRIGHT © TWILIO INC. 2012
Evan Cooke (@emcooke)
twilio