Joyent circa 2006 (Scale with Rails)

Scale with Rails Real-life capacity and deployment planning for small to large scale Rails applications


This is a talk given by Jason Hoffman at a workshop given by Joyent called "Scale With Rails" in 2006. There's quite a bit of prescience in this presentation, including the first documented use of ZFS in production ("Fsck you if you think ZFS isn't production") and of OS-based virtualization (zones) in the cloud (which, it must be said, was not called "cloud" in 2006).

Transcript of Joyent circa 2006 (Scale with Rails)

Page 1: Joyent circa 2006 (Scale with Rails)

Scale with Rails

Real-life capacity and deployment planning for small to large scale Rails applications

Page 2: Joyent circa 2006 (Scale with Rails)

The Guys {Thanks}

David Young, Dean Allen, Jason Hoffman, Matt Imbach, Michael Koziarski, Scott Barron, Chris Morris, Luke Crawford, Justin French, Johan Sörensen, Ben Myles, Marten Veldthuis, Bryan Bell, Ryan Schwartz, Jan Isley, Florian Munz, Filip Hajný, Daniel Crowell, Christopher Horrell, Terrell Russell, Peter Watridge, Josh Roebuck

Page 3: Joyent circa 2006 (Scale with Rails)

Today

• Please humor me => a 5 minute story with too many slides about how I came to stand here in front of you

• Some things about the Ruby on Rails site

• An accounting of what’s buzzing in my ears all the time

• Some examples with numbers

• How to think about “scaling”

• To talk about where we’ve taken our infrastructure

Page 4: Joyent circa 2006 (Scale with Rails)

Who?

Page 5: Joyent circa 2006 (Scale with Rails)

Internet beginnings for me

Page 6: Joyent circa 2006 (Scale with Rails)

I liked collaboration

Page 7: Joyent circa 2006 (Scale with Rails)

It started with Textpattern

The best CMS/blog thing written in PHP ever

Page 8: Joyent circa 2006 (Scale with Rails)

We used basecamp

Ooooooh, it’s in Ruby? Reeeeeeeeeally.

Page 9: Joyent circa 2006 (Scale with Rails)

So we thought let’s involve some friends

• Dean Allen (Textpattern)

• Matt Mullenweg (Wordpress)

• Brad Choate (doing MT hosting + plugins)

• David Heinemeier Hansson (Instiki and “Rails”)

• Rickard of PunBB

• Allan => Textmate

Page 10: Joyent circa 2006 (Scale with Rails)

I was a committer once

Page 11: Joyent circa 2006 (Scale with Rails)

What did they get me into?

5/2003 9/2005 5/2004

“it” started

Page 12: Joyent circa 2006 (Scale with Rails)
Page 13: Joyent circa 2006 (Scale with Rails)
Page 14: Joyent circa 2006 (Scale with Rails)
Page 15: Joyent circa 2006 (Scale with Rails)
Page 16: Joyent circa 2006 (Scale with Rails)

Books are important

Not going to pay in my house. Send those books out.

Page 17: Joyent circa 2006 (Scale with Rails)

Rails

Page 18: Joyent circa 2006 (Scale with Rails)

rubyonrails.com

~3500-4500 GBs/month (~10-15 Mbps)
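As a rough cross-check of how a sustained bit rate maps to a monthly transfer volume, here is a minimal sketch. The 30-day month and decimal units (1 GB = 10^9 bytes) are assumptions made for illustration; they are not stated on the slide.

```python
# Assumption: a 30-day month and decimal units (1 GB = 1e9 bytes).
SECONDS_PER_MONTH = 30 * 24 * 3600


def monthly_transfer_gb(mbps: float) -> float:
    """Data moved in one month at a sustained rate, in decimal gigabytes."""
    bytes_per_second = mbps * 1_000_000 / 8
    return bytes_per_second * SECONDS_PER_MONTH / 1e9


for rate in (10, 15):
    print(f"{rate} Mbps sustained ~ {monthly_transfer_gb(rate):,.0f} GB/month")
```

Under those assumptions, 10 Mbps comes to about 3,240 GB a month and 15 Mbps to about 4,860 GB.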

Page 19: Joyent circa 2006 (Scale with Rails)

But sometimes it’s gotten “bad”

• Ruby on Rails

• Turbogears

• Textmate

• All have videos (“screencasts”)

• They’ve been in the same paragraph on Slashdot

• Then we’ve pushed 150-200 Mbps

• Open Source is free?

Page 20: Joyent circa 2006 (Scale with Rails)

Some of Joyent’s

We combined last year to be the “perfect” company.

Page 21: Joyent circa 2006 (Scale with Rails)

OK, So what?

• We found ourselves needing an infrastructure for “diverse” things

• But fundamentally we’re talking about web stuff, mail stuff, database stuff, storing and moving around files stuff

Page 22: Joyent circa 2006 (Scale with Rails)

And why are we here?

• One indication: because I get, on average, 2 emails/day asking the same questions

• That’s just “me”

• Support system and “sales” gets them as well

Page 23: Joyent circa 2006 (Scale with Rails)

Examples from last week

• I have yet to find any examples of websites that have heavy traffic and stream media that run from a Ruby on Rails platform, can you suggest any sites that will demonstrate that the ruby platform is stable and reliable enough to use on a commercial level?

• We are concerned about the long-term viability of Ruby on Rails as a development language/environment.

• How easily can a ruby site be converted to another language? (If for any reason we were forced to abandon ruby at some point in the future or I can’t find someone to work with our code?).

• My company has some concerns on whether or not Ruby on Rails is the right platform to deploy on if we have a very large scale app.

I could go on for a while.

Page 24: Joyent circa 2006 (Scale with Rails)

So what is?

• A “large scale” application?

• Do many of us really have one? I mean a BIG ONE?

• What is an “intensive” application?

• What is an “enterprise” application?

Page 25: Joyent circa 2006 (Scale with Rails)

But we’ve been doing “it” with

• Perl

• PHP

• Python

• Java

• Does the language really really matter to the systems folks?

Page 26: Joyent circa 2006 (Scale with Rails)

Ah-ha!

• Systems folks, you say?

• What are these “system administrators” that you speak of?

• Does the network really matter?

• You mean there’s “networking people”?

• You mean it takes more than just having my designer(s) learn The Rails?

Page 27: Joyent circa 2006 (Scale with Rails)

If I had a nickel for every time ...

• “We’re going to need to scale this up to Flickr-sized proportions!”

• “This could go very very fast!”

• “The market is HUGE”

Page 28: Joyent circa 2006 (Scale with Rails)

Reality Check

• “OK”, I say

• “Not a problem”, I say

• “I can do that”, I say

• “For what you’re asking, I’ll need $325,000 tomorrow to start, it’ll take $18,500 a month to run and you can expect that to go up along with your growth.”

• “It’s a good rule of thumb to try and keep it at about 10% of your revenue once you’re going, because people end up being the most expensive”
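One way to read the 10% rule of thumb against the quoted running cost is sketched below. Treating "it" as the monthly infrastructure bill is an interpretation, and the function name is hypothetical.

```python
def implied_monthly_revenue(monthly_infra_cost: float,
                            share_of_revenue: float = 0.10) -> float:
    """Revenue needed for the infrastructure bill to stay at the given share.

    Assumption: the 10% figure applies to the monthly running cost only,
    not to the up-front spend.
    """
    return monthly_infra_cost / share_of_revenue


needed = implied_monthly_revenue(18_500)
print(f"$18,500/month at 10% of revenue implies ~${needed:,.0f}/month in revenue")
```

On that reading, the $18,500 a month quoted above would call for roughly $185,000 a month in revenue.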

Page 29: Joyent circa 2006 (Scale with Rails)

Oh. But we don’t have that kind of money. And the app is free to start.

Page 30: Joyent circa 2006 (Scale with Rails)

OK.

Page 31: Joyent circa 2006 (Scale with Rails)

Wait a minute!

• Why are you going on and on about this “don’t worry about scaling til you need to scale” stuff?

• I can read this same thing on what’s their face’s weblog.

• It’s not what I’m asking about!

• I need to know if I can “scale” with ruby!

• We’re not a start-up, why are you talking about this from a “start-up’s” perspective?!

• I need a ...

Page 32: Joyent circa 2006 (Scale with Rails)

A What?

• A framework for thinking about the entire system?

• You mean it’s more than just the language?

• It’s more than about a readily available IDE?

• You mean one’s “development” framework is just one small part?

• My systems guys can scale anything.

Page 33: Joyent circa 2006 (Scale with Rails)

Yes It Is More

Page 34: Joyent circa 2006 (Scale with Rails)

And oh yes ... you should still worry about scaling

Page 35: Joyent circa 2006 (Scale with Rails)

Because what does “scaling” actually mean?

• For a little software application company?

• Can you do the Start-up => Mid-sized, “small caps” company transition without going out of business?

• In a big company?

• Can you do what you gotta do without having your program cut?

• Will your app run on a phone? Scale down.

Page 36: Joyent circa 2006 (Scale with Rails)

Let’s talk some details

Page 37: Joyent circa 2006 (Scale with Rails)

First, let’s talk about limiting factors

Page 38: Joyent circa 2006 (Scale with Rails)

The slowest part?

• The fast ethernet or gigabit network port (assuming there’s more than one drive)

• Transactions of something/second

Page 39: Joyent circa 2006 (Scale with Rails)

• 1 Gbps = 125 MB/sec (100 Mbps = 12.5 MB/sec)

• The question is can your OS and your CPUs push it?

• And let’s say you can, just how much is a Gbps in some kind of other thing?

Page 40: Joyent circa 2006 (Scale with Rails)

We used to use FreeBSD

Moral of the story? Even restricted to a single processor with a single core, Solaris Nevada Build 31 can push 60% of a gigabit connection. FreeBSD can’t.

Page 41: Joyent circa 2006 (Scale with Rails)

OK, Solaris Good. What’s a 100 Mbps in normal web traffic?

• Textdrive.com => 122 KB and has 20 objects (~125KB for an uncached page view).

• 125KB page => 100-1000 unique visitors per second

• 20 objects per page, that is 2000 requests per second that could pump out of that system.

• Maximum.
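The arithmetic on this slide can be sketched as follows. The function name and the decimal unit convention (1 KB = 1000 bytes) are assumptions for illustration; the page size and object count are the Textdrive.com figures above.

```python
def link_capacity(link_mbps: float, page_kb: float = 125,
                  objects_per_page: int = 20) -> tuple[float, float]:
    """Upper bound on (page views/sec, requests/sec) for a saturated link.

    Assumes decimal units and ignores protocol overhead, so this is a
    ceiling, not a prediction.
    """
    bytes_per_sec = link_mbps * 1_000_000 / 8
    pages_per_sec = bytes_per_sec / (page_kb * 1000)
    return pages_per_sec, pages_per_sec * objects_per_page


for mbps in (100, 1000):
    pages, requests = link_capacity(mbps)
    print(f"{mbps} Mbps: {pages:.0f} page views/sec, {requests:.0f} requests/sec")
```

At 100 Mbps this gives 100 page views and 2000 requests per second; at a full gigabit it gives 1000 and 20,000, which is where the 100-1000 range comes from.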

Page 42: Joyent circa 2006 (Scale with Rails)

• What is the ability to do 2000 requests/second then?

• ((2000 requests/sec) / (20 requests per page)) * 0.125MB per page = 12.5 MB/sec (100Mbps constant).

• 86400 seconds/day on 100Mbps => 8,640,000 uniques in a day with 172,800,000 hits.
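A minimal sketch of the daily totals, assuming the link stays saturated for all 24 hours and every page view is a unique visitor (both are simplifications; the variable names are illustrative):

```python
SECONDS_PER_DAY = 86_400

requests_per_sec = 2000   # ceiling for a 100 Mbps link with 125 KB pages
objects_per_page = 20     # requests needed to render one page
page_mb = 0.125           # 125 KB page, in decimal megabytes

pages_per_sec = requests_per_sec // objects_per_page
throughput_mb_per_sec = pages_per_sec * page_mb

uniques_per_day = pages_per_sec * SECONDS_PER_DAY
hits_per_day = requests_per_sec * SECONDS_PER_DAY

print(f"{pages_per_sec} pages/sec -> {throughput_mb_per_sec} MB/sec")
print(f"{uniques_per_day:,} uniques/day, {hits_per_day:,} hits/day")
```

Note that the page rate comes from dividing the request rate by the objects per page, which yields 100 pages per second and 12.5 MB/sec.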

Page 43: Joyent circa 2006 (Scale with Rails)

How much does just a 100 Mbps commit cost?

• $8000/month

• Give or take a couple of grand depending
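To put the commit price in per-unit terms, here is a rough sketch. It assumes a 30-day month and a link that is saturated the whole time at 100 page views per second, so the per-page figure is a best case; the function names are hypothetical.

```python
def cost_per_mbps(monthly_cost: float, committed_mbps: float) -> float:
    """Monthly price of one committed megabit per second."""
    return monthly_cost / committed_mbps


def cost_per_million_pages(monthly_cost: float, pages_per_sec: float,
                           days: int = 30) -> float:
    """Bandwidth cost per million page views at constant full utilization."""
    pages_per_month = pages_per_sec * 86_400 * days
    return monthly_cost / (pages_per_month / 1_000_000)


print(f"${cost_per_mbps(8000, 100):.2f} per Mbps per month")
print(f"${cost_per_million_pages(8000, 100):.2f} per million page views")
```

Real traffic is bursty, so the actual cost per page view on a committed line would be higher than this floor.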

Page 44: Joyent circa 2006 (Scale with Rails)

Can you do 2000 requests/second?

• Sure, an Apache, Lighttpd or Litespeed can do 1000-15,000 static or proxy requests/second

Page 45: Joyent circa 2006 (Scale with Rails)

How do you do 2000 requests/second?

• Alistapart.com bursts to that with Zeldman’s “Web 3.0” for over an hour

• 10 Lighttpds => 200 proxy requests/second each

• 1 Lighttpd => 40 requests/sec x 5 Rails-FCGI each

• We cached everything but writing to MySQL for comments and that kept it on a 3.2 GHz P4 with 4GB of RAM
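The fan-out on this slide is simple multiplication, sketched below. The function name is hypothetical, and the per-process rate is the slide's figure rather than a general property of Rails or FCGI.

```python
import math


def frontends_needed(target_rps: int, fcgi_per_frontend: int = 5,
                     rps_per_fcgi: int = 40) -> int:
    """Lighttpd front ends needed if each proxies to a fixed FCGI pool.

    Assumes every FCGI process sustains the same request rate, which is
    an idealization.
    """
    per_frontend = fcgi_per_frontend * rps_per_fcgi
    return math.ceil(target_rps / per_frontend)


print(frontends_needed(2000))  # 5 x 40 = 200 req/sec each, so 10 front ends
```

The same function gives a quick answer for other targets, for example `frontends_needed(2100)` rounds up to 11.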

Page 46: Joyent circa 2006 (Scale with Rails)

The Shared Hosting

• Is this odd, heterogeneous, largely Ruby “application”

• Users received 8,641,866 emails, they sent 5,860,769 emails and had 14,984,680 pieces of spam blocked. Mail through the system averages 300 emails/minute with bursts up to 60,000 emails/minute.

• Websites cumulatively did ~400,000,000 page views.

Page 47: Joyent circa 2006 (Scale with Rails)
Page 48: Joyent circa 2006 (Scale with Rails)

What is it running now?

• There’s 22 TBs of strongspace

• Fiber-attached EMC storage

• Migrated to Solaris “11”

• One big ZFS pool with LUNs

• Migrated down to one “hot” server: Dual 3.0 GHz Xeon with 8GB of RAM

• Apache, Lighttpd, static FCGI

Page 49: Joyent circa 2006 (Scale with Rails)

Fsck You {if you think ZFS isn’t production}

Page 50: Joyent circa 2006 (Scale with Rails)

Is it a web app or what?

~20 Mbps constant

Page 51: Joyent circa 2006 (Scale with Rails)

“Logical” servers for the hosted connector

1) Jumpstart/Boot & Administrative servers (x2 per setup)

2) DHCP/LDAP for server identification/authentication and control (x2)

3) DNS: DNSCache for outside resolution (DNSCache as resolver) and a private DNS system

4) DNS MySQL (x2, master/slave, innodb tables)

5) SPAM filtering servers (running DSPAM -> files to NFS store and tracking to postgresql) (x4)

6) SPAM database setup (running postgresql) (x2)

7) SPAM NFS store (x2 heads clustered)

8) SMTP gateway out (x2)

9) SMTP gateway in (x4 -> delivery to Maildir over NFS)

10) Mail NFS store (x2 heads cluster) *main user file store

11) IMAP proxy servers (x2)

12) IMAP servers (x6)

13) User LDAP servers (x2 with hitting postgresql DB backend)

14) User LDAP-postgresql db (x2)

15) User postgresql DB servers (x4)

16) User Web/Application servers (running with kernel SSL accelerators) (x6)

17) User File Storage (NFS dual cluster heads) (x2)

18) Joyent Organization Provisioning server

Page 52: Joyent circa 2006 (Scale with Rails)

So how do you scale that “web” application?

Page 53: Joyent circa 2006 (Scale with Rails)

You know

Page 54: Joyent circa 2006 (Scale with Rails)

That beautiful Connector application where only 2 of the 18 “things” are “rails”.

Page 55: Joyent circa 2006 (Scale with Rails)

So most scale is how you scale those “common” services

Page 56: Joyent circa 2006 (Scale with Rails)

And why just 3.0 GHz?

Page 57: Joyent circa 2006 (Scale with Rails)

CPUs are largely idle & RAM RAM RAM

08:23:01 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft   %idle    intr/s
08:23:01 PM  all    2.64    0.00    0.36    2.81    0.02    0.14   94.02   1239.49
08:23:01 PM    0    6.86    0.00    1.18    4.72    0.10    0.69   86.45     68.94
08:23:01 PM    1    2.63    0.00    0.34    1.95    0.01    0.11   95.01    192.97
08:23:01 PM    2    2.78    0.00    0.43    7.41    0.02    0.11   89.29    197.21
08:23:01 PM    3    2.16    0.00    0.29    4.12    0.01    0.07   93.39    178.35
08:23:01 PM    4    1.67    0.00    0.17    1.09    0.01    0.04   97.04    117.29
08:23:01 PM    5    1.67    0.00    0.17    1.11    0.01    0.04   97.05    161.00
08:23:01 PM    6    1.67    0.00    0.16    1.04    0.01    0.04   97.13    161.86
08:23:01 PM    7    1.68    0.00    0.16    1.08    0.01    0.04   97.07    161.86

• “Busy” mysql database server

• ~240GB of data with ~150,000 users

• Swapping just a little little bit, so ...
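To make the "largely idle" point concrete, here is a small sketch that parses the per-CPU rows of the sample above and summarizes the idle percentages. The column positions are taken from this one sample; other mpstat versions may lay the columns out differently, so treat the indexes as assumptions.

```python
SAMPLE = """\
08:23:01 PM CPU %user %nice %sys %iowait %irq %soft %idle intr/s
08:23:01 PM all 2.64 0.00 0.36 2.81 0.02 0.14 94.02 1239.49
08:23:01 PM 0 6.86 0.00 1.18 4.72 0.10 0.69 86.45 68.94
08:23:01 PM 1 2.63 0.00 0.34 1.95 0.01 0.11 95.01 192.97
08:23:01 PM 2 2.78 0.00 0.43 7.41 0.02 0.11 89.29 197.21
08:23:01 PM 3 2.16 0.00 0.29 4.12 0.01 0.07 93.39 178.35
08:23:01 PM 4 1.67 0.00 0.17 1.09 0.01 0.04 97.04 117.29
08:23:01 PM 5 1.67 0.00 0.17 1.11 0.01 0.04 97.05 161.00
08:23:01 PM 6 1.67 0.00 0.16 1.04 0.01 0.04 97.13 161.86
08:23:01 PM 7 1.68 0.00 0.16 1.08 0.01 0.04 97.07 161.86
"""


def per_cpu_idle(text: str) -> dict[int, float]:
    """Map CPU number to %idle, skipping the header and the 'all' row."""
    idle = {}
    for line in text.splitlines():
        fields = line.split()
        # Field 2 is the CPU id and field 9 is %idle in this sample's layout.
        if len(fields) != 11 or not fields[2].isdigit():
            continue
        idle[int(fields[2])] = float(fields[9])
    return idle


idle = per_cpu_idle(SAMPLE)
busiest = min(idle, key=idle.get)
mean_idle = sum(idle.values()) / len(idle)
print(f"busiest CPU: {busiest} ({idle[busiest]}% idle), mean idle: {mean_idle:.2f}%")
```

Even the busiest core in the sample is more than 86% idle, which is the argument for spending on RAM rather than clock speed.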

Page 58: Joyent circa 2006 (Scale with Rails)

So what kind of CPUs have we moved into?

Page 59: Joyent circa 2006 (Scale with Rails)

MB/CMP0/P0    0   1000 MHz  SUNW,UltraSPARC-T1
MB/CMP0/P1    1   1000 MHz  SUNW,UltraSPARC-T1
MB/CMP0/P2    2   1000 MHz  SUNW,UltraSPARC-T1
MB/CMP0/P3    3   1000 MHz  SUNW,UltraSPARC-T1
MB/CMP0/P4    4   1000 MHz  SUNW,UltraSPARC-T1
MB/CMP0/P5    5   1000 MHz  SUNW,UltraSPARC-T1
MB/CMP0/P6    6   1000 MHz  SUNW,UltraSPARC-T1
MB/CMP0/P7    7   1000 MHz  SUNW,UltraSPARC-T1
MB/CMP0/P8    8   1000 MHz  SUNW,UltraSPARC-T1
MB/CMP0/P9    9   1000 MHz  SUNW,UltraSPARC-T1
MB/CMP0/P10  10   1000 MHz  SUNW,UltraSPARC-T1
MB/CMP0/P11  11   1000 MHz  SUNW,UltraSPARC-T1
MB/CMP0/P12  12   1000 MHz  SUNW,UltraSPARC-T1
MB/CMP0/P13  13   1000 MHz  SUNW,UltraSPARC-T1
MB/CMP0/P14  14   1000 MHz  SUNW,UltraSPARC-T1
MB/CMP0/P15  15   1000 MHz  SUNW,UltraSPARC-T1
MB/CMP0/P16  16   1000 MHz  SUNW,UltraSPARC-T1
MB/CMP0/P17  17   1000 MHz  SUNW,UltraSPARC-T1
MB/CMP0/P18  18   1000 MHz  SUNW,UltraSPARC-T1
MB/CMP0/P19  19   1000 MHz  SUNW,UltraSPARC-T1
MB/CMP0/P20  20   1000 MHz  SUNW,UltraSPARC-T1
MB/CMP0/P21  21   1000 MHz  SUNW,UltraSPARC-T1
MB/CMP0/P22  22   1000 MHz  SUNW,UltraSPARC-T1
MB/CMP0/P23  23   1000 MHz  SUNW,UltraSPARC-T1
MB/CMP0/P24  24   1000 MHz  SUNW,UltraSPARC-T1
MB/CMP0/P25  25   1000 MHz  SUNW,UltraSPARC-T1
MB/CMP0/P26  26   1000 MHz  SUNW,UltraSPARC-T1
MB/CMP0/P27  27   1000 MHz  SUNW,UltraSPARC-T1
MB/CMP0/P28  28   1000 MHz  SUNW,UltraSPARC-T1
MB/CMP0/P29  29   1000 MHz  SUNW,UltraSPARC-T1
MB/CMP0/P30  30   1000 MHz  SUNW,UltraSPARC-T1
MB/CMP0/P31  31   1000 MHz  SUNW,UltraSPARC-T1

Sun T1000’s Ultrasparc T1

32 logical 1.0 GHz

Combined with many software instances

Zones

Best match between actual CPU and RAM usage

Page 60: Joyent circa 2006 (Scale with Rails)

It’s not really about high frequencies

If you buy Intel, buy weak ones, spend the money on RAM

Throughput!

Page 61: Joyent circa 2006 (Scale with Rails)

Not simple

Page 62: Joyent circa 2006 (Scale with Rails)

Back to simple

Console server

One kind of switch

One kind of server

One kind of storage

One kind of interconnect

One kind of power plug

One kind of power strip

Page 63: Joyent circa 2006 (Scale with Rails)

Well. We do have some load balancers still.

Page 64: Joyent circa 2006 (Scale with Rails)

There’s actually lots of different kinds of plugs

Page 65: Joyent circa 2006 (Scale with Rails)

What the “default” 5-15R gets you

Umm ... ok so there’s a real disconnect between server power and datacenter “design”

Page 66: Joyent circa 2006 (Scale with Rails)

I/O

Storage

Public

Remote Interconnects

Jumbo Frames

Own Switches

Page 67: Joyent circa 2006 (Scale with Rails)

4 “old” San Diego Racks

Per Rack:

39 processors
39 logical CPUs
52 GB RAM
5280 watts

Total:

156 logical CPUs
208 GB RAM
21,120 watts

Page 68: Joyent circa 2006 (Scale with Rails)

Yes that is what they look like

Page 69: Joyent circa 2006 (Scale with Rails)

Become a new node

1152 logical 1 GHz CPUs
576 GB RAM
6,480 watts

31 amps @ 208V

2 x 20 amp @ 208V L6-20R
2 x 24 plug 20 amp/208V

----------------------------------

12 enclosures
36 controllers
180 drives
90 TBs raw storage
77 TBs clustered storage

29.4 amps @ 208V

2 x 20 amp @ 208V L6-20R
2 x 24 plug 20 amp/208V

Per Rack:

39 processors
39 logical CPUs
52 GB RAM
5280 watts

Total:

156 logical CPUs
208 GB RAM
21,120 watts
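The density comparison between the four old racks and the new node can be sketched as below. The class and function names are hypothetical, and the current draw uses the simplest model (watts divided by volts, unity power factor assumed), so it is a rough check rather than an electrical design figure.

```python
from dataclasses import dataclass


@dataclass
class Footprint:
    name: str
    logical_cpus: int
    ram_gb: int
    watts: int

    @property
    def watts_per_cpu(self) -> float:
        return self.watts / self.logical_cpus

    @property
    def ram_gb_per_cpu(self) -> float:
        return self.ram_gb / self.logical_cpus


def amps(watts: float, volts: float = 208.0) -> float:
    """Current draw, assuming unity power factor (a simplification)."""
    return watts / volts


old = Footprint("4 old racks", logical_cpus=156, ram_gb=208, watts=21_120)
new = Footprint("new node", logical_cpus=1152, ram_gb=576, watts=6_480)

for f in (old, new):
    print(f"{f.name}: {f.watts_per_cpu:.1f} W per logical CPU, "
          f"{f.ram_gb_per_cpu:.2f} GB RAM per logical CPU")
print(f"new node draw: {amps(new.watts):.1f} A at 208V")
```

By these numbers the new node draws about 31 A at 208V, in line with the figure on the slide, and uses roughly 5.6 W per logical CPU against about 135 W in the old racks.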

Page 70: Joyent circa 2006 (Scale with Rails)

Server + Storage

[Diagram labels: three RAID5 arrays of 7 TBs each; RAIDZ ZFS Pool; iSCSI]

Page 71: Joyent circa 2006 (Scale with Rails)

This is going to be open for you to use

Page 72: Joyent circa 2006 (Scale with Rails)

DIY: How do you start?

20 amp 208V AC power

L6-20R

Street price: ~$50,000 co-lo’ed

Page 73: Joyent circa 2006 (Scale with Rails)

Why DIY? Or at least a visit?

Outsourcing! You can whois yourself?

Page 74: Joyent circa 2006 (Scale with Rails)

Some final things

• Keep it simple even if you’re starting out big

• Keep it simple even if you’re starting out small

• Our “stacks” are currently:

• Solaris, Mongrel, Apache 2.2 event, kssl

• Solaris, FCGI, Apache 2.2 event, kssl

• Solaris, FCGI, Lighttpd, kssl

• Solaris, FCGI, Lighttpd, Big-IP

• Zones, zones, zones

Page 75: Joyent circa 2006 (Scale with Rails)