PyCon 2011 Scaling Disqus

Post on 08-Sep-2014


Disqus talks about how they scale their Python web application to over 500 million visitors a month. Video is available here: http://pycon.blip.tv/file/4880330/

Transcript of PyCon 2011 Scaling Disqus

DISQUS

Jason Yan @jasonyan

David Cramer @zeeg

Python at 500 million visitors

Got feedback? Use hashtag #sckrw

Sunday, March 13, 2011

Agenda

• What is DISQUS?

• An Overview of the Infrastructure

• Iterative Development and Deployment

• Why We Love Python


We are a comment system with an emphasis on connecting communities

http://disqus.com/about/

dis·cuss • dĭ-skŭs'

What is DISQUS?


Embeddable Comments


A Brief History


Startup-ish

• Founded just about 4 years ago

• 16 employees, 8 engineers

• Traffic increasing 15-20% a month

• Flat organizational structure, every engineer is a product manager

• Fast turnaround, new feature launches every week (sometimes daily)


Traffic

[Chart: Number of Visitors, March 2008 through March 2011; vertical axis from 0M to 500M]


DjangoCon 2010

• 17,000 requests/second peak

• 450,000 websites

• 15 million profiles

• 75 million comments

• 250 million visitors


Six Months Later

• 25,000 requests/second peak

• 700,000 websites

• 30 million profiles

• 170 million comments

• 500 million visitors


Six Months Later

• September 2010: 250 million uniques

• March 2011: 500 million uniques

• Handling over 2x the traffic
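
As a quick check on these figures: doubling from 250 million to 500 million uniques in six months implies a compound growth rate of roughly 12% a month. The sketch below works that out; the function name is ours, not something from the talk.

```python
def implied_monthly_growth(start, end, months):
    """Compound monthly growth rate that takes `start` to `end` in `months`."""
    return (end / start) ** (1.0 / months) - 1.0


# Figures from the slides: 250M uniques (Sept 2010) to 500M (March 2011).
rate = implied_monthly_growth(250e6, 500e6, 6)
print(f"{rate:.1%} per month")  # roughly 12% per month
```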


Six Months Later

• September 2010: ~100 servers

• March 2011: ~100 servers

• Scale diagonally


Scaling Diagonally

• We still rent hardware, so there is no “commodity hardware”

• Cheaper to upgrade

• Everything is redundant

• Partition data where you need to, scale partitions vertically

• Upgrade hardware (more RAM, more drives, more cores)

• Python apps tend to be CPU bound
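
Partitioning can be sketched as a stable mapping from a key to one of a fixed set of partitions, each of which can then be upgraded on its own. This is a minimal illustration, not Disqus's actual scheme: the partition names and the choice of keying on a forum identifier are assumptions made for the example.

```python
import zlib

# Hypothetical partition names; each one would be a separate database host
# that can be upgraded (more RAM, drives, cores) independently.
PARTITIONS = ["db-part-0", "db-part-1", "db-part-2", "db-part-3"]


def partition_for(key, partitions=PARTITIONS):
    """Map a key to a partition with a hash that is stable across processes."""
    # zlib.crc32 is deterministic, unlike the built-in hash() for strings,
    # which is randomized per process in Python 3.
    checksum = zlib.crc32(str(key).encode("utf-8"))
    return partitions[checksum % len(partitions)]


print(partition_for("forum:1234"))
```

A plain modulo like this moves most keys when the number of partitions changes, which is one reason to grow each partition vertically instead of adding partitions often.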


Infrastructure

• 35% Web Servers (Apache + mod_wsgi)

• 15% Utility Servers (Python scripts, background workers)

• 20% Databases (PostgreSQL, Redis, Membase)

• 20% Load Balancing / High Availability (HAProxy + Heartbeat)

• 10% Caching servers (Memcached, Varnish)

• Half of our servers run Python


Python Web Servers

• Use what you’re comfortable with

• Apache + mod_wsgi vs nginx + uWSGI

• Bottleneck is in the application

[Charts: req/sec (min, avg, max) and memory for mod_wsgi vs uWSGI]


Background Workers

• Lots of tasks don’t need to be done in the web application process:

• Crawling URLs

• Updating avatars

• Email notifications

• Analytics

• Counters


Background Workers (cont’d)

• Most jobs are I/O bound

• Slow external calls

• Twitter is slow

• Facebook is slow

• Could parallelize with multiple processes, but...


Background Workers (cont’d)

• Waste of memory

• Use non-blocking I/O

• Celery 2.2 adds support for gevent/eventlet
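
The idea of non-blocking I/O for I/O-bound jobs can be sketched with Python's standard-library asyncio. This is a substitute chosen so the example runs without third-party packages; it is not the gevent/eventlet setup named on the slide. The job names and delays are made up, and the slow external call is simulated with a sleep.

```python
import asyncio
import time


async def fetch_profile(service, delay):
    """Stand-in for a slow external call (e.g. a social network API)."""
    await asyncio.sleep(delay)  # yields to other tasks while "waiting on I/O"
    return f"{service}: done"


async def run_jobs():
    # All three calls wait concurrently inside a single process.
    return await asyncio.gather(
        fetch_profile("twitter", 0.2),
        fetch_profile("facebook", 0.2),
        fetch_profile("avatar-host", 0.2),
    )


start = time.monotonic()
results = asyncio.run(run_jobs())
elapsed = time.monotonic() - start
print(results, f"{elapsed:.2f}s")  # total is close to one delay, not three
```

One process handles all the waiting calls, which avoids the memory cost of running a separate worker process per concurrent job.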


Monitoring

• Application side: Graphite

• Real-time(ish) graphing

• Django front-end, Python backend

• Etsy’s StatsD proxy to Graphite

• UDP (fire and forget)

• Batches updates
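
To make "fire and forget" concrete, here is a minimal sketch of sending a counter to StatsD over UDP using only the standard library. The metric name and helper functions are hypothetical; the `name:value|c` line is the StatsD counter format, and StatsD aggregates what it receives before flushing to Graphite.

```python
import socket


def format_counter(name, value=1):
    """Build a StatsD counter line, e.g. b'comments.new:1|c'."""
    return f"{name}:{value}|c".encode("ascii")


def send_metric(payload, host="127.0.0.1", port=8125):
    """Send one datagram and return immediately; no reply is expected."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(payload, (host, port))
    finally:
        sock.close()


# Hypothetical metric name; 8125 is the port StatsD listens on by default.
send_metric(format_counter("comments.new"))
```

Because UDP needs no connection or acknowledgement, a slow or missing metrics server does not hold up the request that emitted the metric.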


Monitoring

• Track application metrics

• Errors, exceptions

• New comments, users, sites, etc.

• Anything


Monitoring

• Check out Etsy’s posts:

• Measure Anything, Measure Everything http://codeascraft.etsy.com/2011/02/15/measure-anything-measure-everything/

• Tracking Every Release http://codeascraft.etsy.com/2010/12/08/track-every-release/


What about the code?


Powered By Django


Which means...

• Largest Django-powered web application

• We fork, and even sometimes monkey patch to make it scale to our needs

• Fortunately, we don’t have to do too much (Yay, Django!)

• Unfortunately, we can’t use all of Django’s internal components (and when we do, we use them in atypical ways)


Iterative Development

Release Early, Release Often


Iterating Quickly

• Abstracting our application environment

• Fewer dependencies locally

• Rely on CI for dependency coverage

• Heavy use of open source packages

• No NIH syndrome

• Deploy frequently, 3-7 times a day

• Lots of branches, but master is “stable”

• Realtime reporting on exceptions, metrics

• Our test suite is the main blocker (slow)


Dealing with Deploys


Gargoyle

Being users of our product, we actively use early versions of features before public release

Deploy features to portions of a user base at a time to ensure smooth, measurable releases
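
Releasing a feature to a portion of the user base can be sketched as stable percentage bucketing. This is an illustration of the general technique only and does not use Gargoyle's API; the function and feature names are hypothetical.

```python
import hashlib


def in_rollout(feature, user_id, percent):
    """Return True if this user falls inside the first `percent` of buckets."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in the range 0..99
    return bucket < percent


# Hypothetical feature name; the same user always gets the same answer.
enabled = sum(in_rollout("new-editor", uid, 10) for uid in range(10_000))
print(f"{enabled} of 10000 users see the feature at 10%")
```

Hashing the feature name together with the user ID keeps each user's experience consistent between requests while giving different features different subsets of users.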


The Deployment Problem

• Make some changes locally

• Run a subset of the test suite

• Push your commits

• CI server begins running tests

• ....


Waiting on the test suite...


Rinse and Repeat

• 30 minutes later tests fail, start over

• Finally, deploy to a subset of servers

• Open Sentry (our exception logger)

• Monitor Graphite

• Deploy to 35 servers (~8 minutes)

• Full rollback in < 30 seconds
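
The deploy flow above (a subset of servers first, watch the exception log and graphs, then the rest, with rollback available) might be sketched as follows. This is not Disqus's tooling: every name here is hypothetical, and the deploy, health check, and rollback steps are passed in as plain callables so the sketch runs on its own.

```python
def staged_deploy(servers, deploy, healthy, rollback, canary_count=2):
    """Deploy to a few servers first; continue only if they look healthy."""
    canaries, rest = servers[:canary_count], servers[canary_count:]
    for server in canaries:
        deploy(server)
    if not healthy(canaries):
        # Undo the partial release instead of pushing it any further.
        for server in canaries:
            rollback(server)
        return False
    for server in rest:
        deploy(server)
    return True


# Stand-in callables so the sketch runs without real infrastructure.
deployed = []
ok = staged_deploy(
    servers=[f"web{i}" for i in range(1, 6)],
    deploy=deployed.append,
    healthy=lambda hosts: True,  # e.g. no error spike in the exception log
    rollback=deployed.remove,
)
print(ok, deployed)
```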


Wait, Sentry?


Testing


Testing Code

• Test suite usually takes around 25 minutes

• “Stuck” with Hudson (or Jenkins)

• Most tightly integrated plugins are geared towards Java developers

• Which framework do we use?

• unittest(2), nose, doctests, LETTUCE?

• We use unittest and nose

• Need to report code coverage, speed of tests, pylint (or pyflakes)


We Love Python


Love-ish

• Many of us started with PHP or Rails

• Clean syntax, clear standards

• All languages need PEP8.py and PyFlakes

• Interpreted, fast... enough

• Very easy to learn

• We all started by learning Django first, then Python


Haters Gonna Hate

If you could choose one thing in Python to hate on...


Better package management


What can we do?

• Too many forks, too many frameworks

• We need fewer clones, and more combined effort

• Improving existing Python solutions

• More Python solutions for existing products


Python Rocks!


DISQUS

Questions?

psst, we’re hiring

jobs@disqus.com


References

• Sentry (our exception tracking tool): http://github.com/dcramer/django-sentry

• Gargoyle (feature switches): https://github.com/disqus/gargoyle

• Django DB Utils (collection of db helpers for Django): https://github.com/disqus/django-db-utils

• Jenkins CI: http://jenkins-ci.org/

code.disqus.com
