Memory Issues in Ruby on Rails Applications

Post on 31-Oct-2014

Boston Ruby Meetup presentation by Joe Ferris, CTO of thoughtbot, and Simeon Simeonov, CTO of Swoop, on ways to optimize the memory footprint of data intensive Ruby on Rails applications.

Transcript of Memory Issues in Ruby on Rails Applications

Memory Issues in Rails Applications

I am @simeons

recruit amazing people

solve hard problems

ship

make users happy

repeat

Problems of Success (good problems)

Too many users

Too much traffic

Too much data

Memory Issues in Rails Applications

Common Problem of Success

Display Advertising Makes the Web Suck

User-focused optimization

Tens of millions of users

1000+% better than average

200+% better than Google

Swoop Fixes That

Mobile SDKs: iOS & Android

Web SDK: RequireJS & jQuery

Components: AngularJS

NLP, etc.: Python

Targeting: High-Perf Java

Analytics: Ruby 2.0

Internal Apps: Ruby 2.0 / Rails 3

Pub Portal: Ruby 2.0 / Rails 3

Ad Portal: Ruby 2.0 / Rails 4

Before: 1 hr @ 4 GB

When problems grow faster than the rate at which you can throw HW at them, you actually have to solve them

After: 5 min @ 230 MB

Resolving Memory Issues in Rails Applications Using Streams

[Chart, repeated across several slide builds: "CSV". X axis: Rows, 0 to 100,000. Y axis: Memory (MB), 0 to 500. Annotations added in successive builds: "You are here", "This sucks", "Start thinking here".]

Memory Leaks

class AddDomainsStep
  def call(hashes)
    hashes.map do |hash|
      transform_and_return(hash)
    end
  end
end

class AddDomainsStep
  def initialize
    @domain_config = DomainConfig.instance
  end

  def call(hashes)
    hashes.each do |hash|
      hash['domain'] =
        @domain_config.
          domain_for(hash['domain_id'])
    end
  end
end

require 'singleton'

class DomainConfig
  include Singleton

  def initialize
    @domains = {}
  end

  # Caches every looked-up name for the life of the process.
  def domain_for(id)
    @domains[id] ||= Domain.name_for(id) || ''
  end
end

@domains[id] ||= Domain.name_for(id) || ''

Memory Leak

• Memory that will never be released by the garbage collector.

• Memory usage grows the longer the process runs.
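
A minimal sketch of how the cache above leaks, assuming a stand-in lookup in place of `Domain.name_for` (the class name `LeakyDomainConfig` and the generated domain names are hypothetical, not from the talk). Every distinct id adds an entry that the singleton keeps reachable, so a garbage collection run cannot reclaim it.

```ruby
require 'singleton'

# Hypothetical stand-in for the DomainConfig singleton shown on the slides.
class LeakyDomainConfig
  include Singleton

  def initialize
    @domains = {}
  end

  def domain_for(id)
    # Stand-in for Domain.name_for(id); the real lookup is not shown in the talk.
    @domains[id] ||= "domain-#{id}.example"
  end

  def cached_entries
    @domains.size
  end
end

config = LeakyDomainConfig.instance
1_000.times { |id| config.domain_for(id) }

# Forcing a GC run does not shrink the cache: the singleton still
# references every entry, so none of them are garbage.
GC.start
puts config.cached_entries
```

In a long-running process that sees an unbounded set of ids, the hash only ever grows.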

Avoid Global State

• Global variables

• Class variables

• Singletons

• Per-process instance state
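
One possible way to follow this advice, sketched under assumptions: the step owns its cache, so the cache becomes garbage as soon as the step instance does. `ScopedAddDomainsStep` and the `lookup` callable are hypothetical names; `lookup` stands in for `Domain.name_for`.

```ruby
# Hypothetical variant of AddDomainsStep that keeps its cache per instance
# instead of reaching for a process-wide singleton.
class ScopedAddDomainsStep
  def initialize(lookup)
    @lookup  = lookup # callable standing in for Domain.name_for
    @domains = {}     # lives only as long as this step instance
  end

  def call(hashes)
    hashes.each do |hash|
      id = hash['domain_id']
      @domains[id] ||= @lookup.call(id) || ''
      hash['domain'] = @domains[id]
    end
  end
end

lookup = ->(id) { "domain-#{id}.example" }
step   = ScopedAddDomainsStep.new(lookup)
rows   = step.call([{ 'domain_id' => 1 }, { 'domain_id' => 1 }])
```

Creating one step per report run bounds the cache to the ids seen in that run.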

Memory Churn

hashes.map do |hash|
  hash['domain'].downcase.strip
end

vs

hashes.each do |hash|
  hash['domain'].downcase!
  hash['domain'].strip!
end

Memory Churn

• Allocating and deallocating tons of objects slows down processing

• Mutation limits allocations, but makes it easier to introduce bugs
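
A small sketch of the difference, using object identity rather than timing (the sample string is made up). The non-bang methods return new `String` objects on every call, while the bang methods change the receiver in place.

```ruby
original = '  Example.COM  '.dup # dup guarantees a mutable string

# Non-mutating: downcase and strip each return a new String,
# leaving the original untouched.
copy = original.downcase.strip
copy_is_new_object = !copy.equal?(original)

# Mutating: the bang methods modify the receiver itself,
# so no new result object is handed back.
mutated = original.dup
id_before = mutated.object_id
mutated.downcase!
mutated.strip!
same_object = mutated.object_id == id_before
```

Note that `downcase!` and `strip!` return `nil` when they change nothing, so their return values should not be chained.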

hashes.each do |hash|
  hash['domain'].downcase!
  hash['domain'].strip!
end

Spot the Bug!

# In shared state:
@domains[id] ||= Domain.name_for(id) || ''

# Much later:
hash['domain'].downcase!
hash['domain'].strip!
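
One reading of the bug, sketched with hypothetical data: the hash and the shared cache hold the same `String` object, so mutating it in place silently rewrites the cached value for every later caller. The lambdas `name_for` and `fetch` are stand-ins, not code from the talk.

```ruby
cache = {}

# Stand-in for Domain.name_for(id); returns made-up names.
name_for = ->(id) { "  Example-#{id}.COM " }

# Same caching pattern as the shared-state line above.
fetch = ->(id) { cache[id] ||= name_for.call(id) || '' }

hash = { 'domain' => fetch.call(1) }
hash['domain'].downcase!
hash['domain'].strip!

# The cached entry is the very same object, so it changed too.
cache_was_mutated = cache[1] == 'example-1.com'

# One way to avoid this: hand out a copy before mutating.
safe = { 'domain' => fetch.call(2).dup }
safe['domain'].downcase!
safe['domain'].strip!
cache_left_intact = cache[2] == '  Example-2.COM '
```

Copying gives back some of the allocations that mutation saved, which is the trade-off named in the bullets above.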

Good News!

• Allocating and freeing objects is fairly fast in Ruby

• Keeping your stack frame light will limit the effects of memory churn

Memory Bloat

def to_csv
  csv = [CSV.generate_line(headers)]

  rows.each do |row|
    values = headers.map do |header|
      row[header] || defaults[header]
    end

    csv << CSV.generate_line(values)
  end

  csv.join('')
end

Memory Bloat

• Memory usage grows with data set

• Loading too much data at once
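
A hedged sketch of a streaming alternative to `to_csv`: instead of collecting every line in an array and joining at the end, each line is yielded as soon as it is generated, so only one line needs to be held at a time. The method name `each_csv_line` and the sample data are assumptions; `headers`, `rows` and `defaults` mirror the names used in `to_csv`.

```ruby
require 'csv'
require 'stringio'

# Hypothetical report data.
headers  = %w[id domain]
defaults = { 'domain' => 'unknown' }
rows     = [{ 'id' => 1, 'domain' => 'example.com' }, { 'id' => 2 }]

# Yields one CSV line at a time instead of building the whole document.
def each_csv_line(headers, rows, defaults)
  return enum_for(:each_csv_line, headers, rows, defaults) unless block_given?

  yield CSV.generate_line(headers)
  rows.each do |row|
    values = headers.map { |header| row[header] || defaults[header] }
    yield CSV.generate_line(values)
  end
end

# Any IO-like sink works; StringIO stands in for a file or HTTP response.
out = StringIO.new
each_csv_line(headers, rows, defaults) { |line| out << line }
```

The saving only materializes if `rows` is itself produced incrementally and the sink writes lines out as they arrive.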

Laziness

rename_report_fields(
  squash(
    add_domains(
      add_properties(
        unwind_variations(rows)
      )
    )
  )
)

def duplicate(number, count)
  if count > 0
    [number] + duplicate(number, count - 1)
  else
    []
  end
end

def sum(list)
  list.inject(0) do |result, number|
    result + number
  end
end

sum(duplicate(5, 10)) # => 50

import Prelude hiding (sum)

duplicate :: Int -> Int -> [Int]
duplicate number count
  | count <= 0 = []
  | otherwise  = number : duplicate number (count - 1)

sum :: [Int] -> Int
sum []              = 0
sum (x : remaining) = x + sum remaining

> sum $ duplicate 5 10
50

Be Proactive About Being Lazy
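
Ruby evaluates collection operations eagerly unless asked otherwise. As an illustration that is not taken from the talk, the standard library's `Enumerator::Lazy` shows the difference by counting how many elements actually get transformed:

```ruby
calls = 0

# Eager: map transforms all 1,000 elements before first(3) runs.
eager = (1..1_000).map { |n| calls += 1; n * 2 }.first(3)
eager_calls = calls

calls = 0

# Lazy: elements flow through map one at a time,
# so only the three that are needed get transformed.
lazy = (1..1_000).lazy.map { |n| calls += 1; n * 2 }.first(3)
lazy_calls = calls
```

Both versions return the same three values; only the amount of intermediate work and memory differs.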

Enumerable

class AddDomainsStep
  def initialize(source)
    @source = source
  end

  def each
    @source.each do |hash|
      hash['domain'] = DomainConfig.
        instance.
        domain_for(hash['domain_id'])
      yield hash
    end
  end
end

RenameReportFieldsStep.new(
  SquashStep.new(
    AddDomainsStep.new(
      AddPropertiesStep.new(
        UnwindVariationsStep.new(rows)
      )
    )
  )
)
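
A runnable sketch of the same nesting pattern, using two made-up steps (`UpcaseStep` and `TagStep` are hypothetical, as is the in-memory source). Each step wraps a source that responds to `each` and yields one transformed row at a time; the log shows that a row travels through the whole pipeline before the next one is read.

```ruby
# Hypothetical step: upcases the 'domain' field of each row.
class UpcaseStep
  include Enumerable

  def initialize(source)
    @source = source
  end

  def each
    return enum_for(:each) unless block_given?

    @source.each { |hash| yield hash.merge('domain' => hash['domain'].upcase) }
  end
end

# Hypothetical step: marks each row as processed.
class TagStep
  include Enumerable

  def initialize(source)
    @source = source
  end

  def each
    return enum_for(:each) unless block_given?

    @source.each { |hash| yield hash.merge('tagged' => true) }
  end
end

log = []

# Stand-in for a streamed data source such as a database cursor.
source = Enumerator.new do |yielder|
  3.times do |i|
    log << "read #{i}"
    yielder << { 'domain' => "d#{i}" }
  end
end

pipeline = TagStep.new(UpcaseStep.new(source))
pipeline.each { |row| log << "wrote #{row['domain']}" }
```

Because reads and writes interleave, peak memory depends on the size of one row rather than on the size of the data set.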

Buffering