Migrating from Backgroundrb to Resque

2010-10-12 to: Resque from: Backgroundrb @kbrock

description

We upgraded from Backgroundrb to Resque. The pagers have stopped buzzing, and we are very pleased with the migration. Getting the last 5% of the Resque setup right was a little tricky. This presentation shares some of the implementation details (code and config files) to help others make their Resque setup rock solid.

Transcript of Migrating from Backgroundrb to Resque

Page 2: Migrating from Backgroundrb to Resque

Summary

- Background queues let us defer logic outside the browser request and response.

- Backgroundrb was crashing often for us. We moved to Resque and it hasn't crashed since.

- Backgroundrb is easier to run out of the box.

- Adding just a little code makes Resque just as easy, without sacrificing any of the added flexibility.

Page 3: Migrating from Backgroundrb to Resque

Why did we upgrade?

- bdrb paged the boss 4 times during my first weekend

- memory leaks caused crashes

- monit can't restart workers in backgroundrb

- move to an active project (a la heroku, github, redis)

Page 4: Migrating from Backgroundrb to Resque

What does each bring to the table?

                              bdrb    resque
adhoc (out of request)        yes     yes
delay (run/remind)            yes     resque-scheduler
schedule (cron)               yes     resque-scheduler
mail (invisible/out of req)   code    resque_mailer
status reporting              code    resque-meta, web

- backgroundrb does most of what we need out of the box
- resque has plugins to make up the difference

Page 5: Migrating from Backgroundrb to Resque

Bdrb Components

[Diagram. Components: rails (enqueue), main queue, queue manager, scheduler, workers, mailer (work), bdrb yml. Legend: monitored, we started, data.]

- simple w/ 1 queue (add started_at for delayed jobs)
- scheduler is a special worker - managed by 1 process (is a runner/worker)

Page 6: Migrating from Backgroundrb to Resque

Resque Components

[Diagram. Components: rails (enqueue), rake, main queue, delayed queue, scheduler, schedule, workers, mailer (work), resque-web. Legend: monitored, we started, data. Numbered callouts 1 to 6.]

- many moving parts
- simplified in that all workers are the same
- scheduler simply adds entries in the queue (instead of MetaWorker/running jobs)
- web ui is a nice touch

Page 7: Migrating from Backgroundrb to Resque

1. Ad-hoc Enqueuing

                     bdrb    resque
args                 hash    ruby, checked
enqueue AR objects   yes
mail (invisible)     yes     yes

- AR objects crept up in the action_mailer deliver calls
- Looks like bdrb wins here, but not enqueuing AR objects is best practice
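The "don't enqueue AR objects" practice can be sketched as follows. This is a minimal illustration, not code from the talk: `User`, `WelcomeEmailJob`, `encode_job`, and `run_job` are hypothetical names, and the `User` class is a plain-Ruby stand-in for an ActiveRecord model. It relies on the fact that Resque stores each job as JSON with a class name and an args array, so only simple values survive the trip.

```ruby
require 'json'

# Hypothetical stand-in for an ActiveRecord model (not from the talk).
class User
  attr_reader :id, :email
  STORE = {}

  def initialize(id, email)
    @id = id
    @email = email
    STORE[id] = self
  end

  # Mimics a lookup by primary key.
  def self.find(id)
    STORE.fetch(id)
  end
end

# Hypothetical job: it receives only the record id and reloads the record.
class WelcomeEmailJob
  def self.perform(user_id)
    user = User.find(user_id)
    "welcome mail for #{user.email}"
  end
end

# The payload is plain JSON: a class name plus simple arguments.
def encode_job(klass, *args)
  JSON.generate('class' => klass.name, 'args' => args)
end

# What a worker does in essence: decode, look up the class, call perform.
def run_job(payload)
  job = JSON.parse(payload)
  Object.const_get(job['class']).perform(*job['args'])
end

user = User.new(42, 'someone@example.com')
payload = encode_job(WelcomeEmailJob, user.id)
result = run_job(payload)
```

Passing the id keeps the payload small and means the worker always sees the current state of the record rather than a stale serialized copy.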

Page 8: Migrating from Backgroundrb to Resque

Ad-hoc/Delayed (bdrb)

    class JobWorker < BackgrounDRb::MetaWorker
      set_worker_name :job_worker

      # the work itself
      def purge_job_logs()
        JobLog.purge_expired!
        persistent_job.finish!
      end

      # enqueue to run as soon as the worker is free
      def self.perform_later(*args)
        MiddleMan.worker(:job_worker).enq_purge_job_logs(
          :job_key => new_job_key, :arg => args)
      end

      # enqueue to run at a given time (passed as the first argument)
      def self.perform_at(*args)
        time = args.shift
        MiddleMan.worker(:job_worker).enq_purge_job_logs(
          :job_key => new_job_key, :arg => args, :scheduled_at => time)
      end

      def self.new_job_key()
        "purge_job_logs_#{ActiveSupport::SecureRandom.hex(8)}"
      end
    end

- don't need to do a command pattern (our code didn't)
- scheduled_at = beauty of SQL
- parent class
- enqueue knows queue name (code not loaded)

Page 9: Migrating from Backgroundrb to Resque

Ad-hoc/Delayed (resque)

    class PurgeJobLogs
      @queue = :job_worker

      # Resque workers call self.perform on the job class
      def self.perform(*args)
        JobLog.purge_expired!
      end

      def self.perform_later(*args)
        Resque.enqueue(self, *args)
      end

      # the first argument is the time to run at
      def self.perform_at(*args)
        time = args.shift
        Resque.enqueue_at(time, self, *args)
      end
    end

- Enqueue needs the worker class to know the name of the queue (even if called directly into Resque)
- interface only (perform_{at,later}) -> abstracted out to parent?
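One way the perform_{at,later} interface could be pulled out of each job class is a shared module. This is only a sketch under assumptions: `Enqueueable` and `FakeQueue` are hypothetical names, and `FakeQueue` is an in-memory stand-in so the example runs without Redis. In a real app the backend would be Resque, with `enqueue_at` coming from resque-scheduler.

```ruby
# Hypothetical in-memory stand-in for Resque, so the sketch runs without Redis.
module FakeQueue
  @jobs = []

  class << self
    attr_reader :jobs

    # Like an enqueue call, the queue name is read from the job class itself.
    def enqueue(klass, *args)
      @jobs << { :queue => klass.instance_variable_get(:@queue),
                 :class => klass.name, :args => args }
    end

    def enqueue_at(time, klass, *args)
      @jobs << { :queue => klass.instance_variable_get(:@queue),
                 :class => klass.name, :args => args, :at => time }
    end
  end
end

# Shared interface: extend a job class with this module to get
# perform_later and perform_at without repeating them per class.
module Enqueueable
  def backend
    FakeQueue # assumption: swap in the real queue library here
  end

  def perform_later(*args)
    backend.enqueue(self, *args)
  end

  def perform_at(time, *args)
    backend.enqueue_at(time, self, *args)
  end
end

class PurgeJobLogs
  extend Enqueueable
  @queue = :job_worker

  def self.perform(*args)
    # the talk's version calls JobLog.purge_expired! here
  end
end

PurgeJobLogs.perform_later
PurgeJobLogs.perform_at(Time.at(0), 30)
```

Using `extend` rather than inheritance leaves job classes free to have any parent, which matters when several plugins also want to add class-level behavior.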

Page 10: Migrating from Backgroundrb to Resque

2. Scheduled Enqueuing

                     bdrb      resque
sched any method     yes x2    command
scheduler            yes       yes+
adhoc jobs                     yes

- Need to define schedule in 2 places: yml and ruby. We ran into a case where this caused a problem
- web ui for easy adhoc kicking off of resque commands (very useful in staging)

Page 11: Migrating from Backgroundrb to Resque

Scheduled (bdrb)

    :backgroundrb:
      :ip: 127.0.0.1
      :port: 11006
      :environment: development

    :schedules:
      :scheduled_worker:
        :purge_job_logs:
          :trigger_args: 0 */5 * * * *

Evidence of framework - scheduled_worker defined here, need meta worker (so it can be run)

Page 12: Migrating from Backgroundrb to Resque

Scheduled (bdrb)

    class ScheduledWorker < BackgrounDRb::MetaWorker
      extend BdrbUtils::CronExtensions
      set_worker_name :scheduled_worker

      threaded_cron_job(:purge_job_logs) { JobLog.purge_expired! }
    end

scheduler = MetaWorker. Defined 2 times - so it calls your code, so can call "any static method"

Page 13: Migrating from Backgroundrb to Resque

Scheduled (resque)

    ---
    clear_logs:
      cron: "*/10 * * * *"
      class: PurgeJobLogs
      queue: job_worker
      description: Remove old logs

- queue_name (so scheduler does not need to load worker into memory to enqueue)
- cron is standard format (remove 'seconds') - commands
- scheduler in separate process (can run when workers are stopped / changed) - minimal env
- scheduler injects into queue (vs runs jobs) - so can adhoc inject via web
- no ruby code for this
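The point that the scheduler only injects entries into a queue, without loading the worker class, can be illustrated with a small sketch. This is not resque-scheduler's implementation: `inject` is a hypothetical helper and the queues are an in-memory hash standing in for Redis lists. Only the class name and queue name from the schedule file are needed to build the payload.

```ruby
require 'yaml'
require 'json'

# Same shape as the schedule file on this slide.
schedule = YAML.load(<<~SCHEDULE)
  ---
  clear_logs:
    cron: "*/10 * * * *"
    class: PurgeJobLogs
    queue: job_worker
    description: Remove old logs
SCHEDULE

# Hypothetical in-memory queues standing in for Redis lists.
queues = Hash.new { |hash, name| hash[name] = [] }

# Push one schedule entry onto its queue by name.
# The job class is referenced only as a string; it is never loaded here.
def inject(schedule, queues, name)
  entry = schedule.fetch(name)
  queues[entry['queue']] << JSON.generate('class' => entry['class'], 'args' => [])
end

inject(schedule, queues, 'clear_logs')
```

Because injection is just a push onto a queue, the same operation can be triggered by cron timing or by hand, which is what makes ad hoc kick-off from a web UI cheap.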

Page 14: Migrating from Backgroundrb to Resque

3. Processes/Worker management

                     bdrb    resque
knows queues         yes     us, command, web
pids                 yes     us+
mem leak resistant           yes
workers/queue        1       <1 - ∞
pause workers                yes

- Discover previous queues (not all) via 'resque list' / web
- bdrb: creates 1 worker/queue (creates pid file + 1 pid for backgroundrb) - monit can't restart
- we manage processes: 1+ workers/queue - 1+ queues/worker
- pause/restart workers

Page 15: Migrating from Backgroundrb to Resque

worker list (resque)

    primary:
      queues: background,mail
    secondary:
      queues: mail,background

- can have multiple workers running the same queues
- can have multiple queues in 1 worker
- worker pool can be generalized, response focused, schedule focused, or changed at runtime
- inverted priority list - prevents starvation
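Why an inverted priority list prevents starvation can be seen in a toy simulation. This is a sketch, not Resque code: `reserve` is a hypothetical helper and the queues are plain arrays. It assumes only that a worker checks its queues in the order listed and takes the first job it finds.

```ruby
# Hypothetical in-memory queues; in Resque these would be Redis lists.
queues = {
  'background' => %w[report_1 report_2 report_3],
  'mail'       => %w[mail_1 mail_2]
}

# A worker takes the next job from the first non-empty queue in its own list.
def reserve(queues, priority)
  priority.each do |name|
    job = queues[name].shift
    return job if job
  end
  nil
end

primary   = %w[background mail] # prefers background work
secondary = %w[mail background] # prefers mail

# Alternate between the two workers for three rounds.
picked = []
3.times do
  picked << reserve(queues, primary)
  picked << reserve(queues, secondary)
end
```

With both workers listing `background` first, mail would wait until every report finished. With the lists inverted, each queue always has one worker that looks at it first, and either worker still helps the other once its preferred queue is empty.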

Page 16: Migrating from Backgroundrb to Resque

4. Running Workers

    namespace :resque do
      desc 'start all background resque daemons'
      task :start_daemons do
        mrake_start "resque_scheduler resque:scheduler"
        workers_config.each do |worker, config|
          mrake_start "resque_#{worker} resque:work QUEUE=#{config['queues']}"
        end
      end

      desc 'stop all background resque daemons'
      task :stop_daemons do
        sh "./script/monit_rake stop resque_scheduler"
        workers_config.each do |worker, config|
          sh "./script/monit_rake stop resque_#{worker} -s QUIT"
        end
      end

      # worker name => config (queues), read from the worker list yml
      def self.workers_config
        YAML.load(File.open(ENV['WORKER_YML'] || 'config/resque_workers.yml'))
      end

      # start a rake task in the background through the monit_rake wrapper
      def self.mrake_start(task)
        sh "nohup ./script/monit_rake start #{task} RAILS_ENV=#{ENV['RAILS_ENV']} >> log/daemons.log &"
      end
    end

Page 17: Migrating from Backgroundrb to Resque

Deploying (cap)

    namespace :resque do
      desc "Stop the resque daemon"
      task :stop, :roles => :resque do
        run "cd #{current_path} && RAILS_ENV=#{rails_env} WORKER_YML=#{resque_workers_yml} rake resque:stop_daemons; true"
      end

      desc "Start the resque daemon"
      task :start, :roles => :resque do
        run "cd #{current_path} && RAILS_ENV=#{rails_env} WORKER_YML=#{resque_workers_yml} rake resque:start_daemons"
      end
    end

Page 18: Migrating from Backgroundrb to Resque

5. Monitoring Workers (monit.erb)

    check process resque_scheduler
      with pidfile <%= @rails_root %>/tmp/pids/resque_scheduler.pid
      group resque
      alert [email protected]
      start program = "/bin/sh -c 'cd <%= @rails_root %>; RAILS_ENV=production ./script/monit_rake start resque_scheduler resque:scheduler'"
      stop program = "/bin/sh -c 'cd <%= @rails_root %>; RAILS_ENV=production ./script/monit_rake stop resque_scheduler'"

    <% YAML.load(File.open(Rails.root.join('config/production/resque/resque_workers.yml'))).each_pair do |worker, config| %>
    check process resque_<%=worker%>
      with pidfile <%= @rails_root %>/tmp/pids/resque_<%=worker%>.pid
      group resque
      alert [email protected]
      start program = "/bin/sh -c 'cd <%= @rails_root %>; RAILS_ENV=production ./script/monit_rake start resque_<%=worker%> resque:work QUEUE=<%=config['queues']%>'"
      stop program = "/bin/sh -c 'cd <%= @rails_root %>; RAILS_ENV=production ./script/monit_rake stop resque_<%=worker%>'"
    <% end %>

use template to generate monit file
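Rendering such a template takes only the standard library. The following is a minimal sketch, not the deployment code from the talk: the `rails_root` path is a made-up example, the worker list is inlined instead of read from a file, and the alert line is left out.

```ruby
require 'erb'
require 'yaml'

# Worker list in the same shape as the worker yml shown earlier.
workers = YAML.load(<<~WORKERS)
  primary:
    queues: background,mail
  secondary:
    queues: mail,background
WORKERS

# Hypothetical application path, for illustration only.
rails_root = '/var/www/app/current'

# One monit stanza per worker, following the template on this slide.
template = ERB.new(<<~'MONIT')
  <% workers.each do |worker, config| %>
  check process resque_<%= worker %>
    with pidfile <%= rails_root %>/tmp/pids/resque_<%= worker %>.pid
    group resque
    start program = "/bin/sh -c 'cd <%= rails_root %>; RAILS_ENV=production ./script/monit_rake start resque_<%= worker %> resque:work QUEUE=<%= config['queues'] %>'"
    stop program = "/bin/sh -c 'cd <%= rails_root %>; RAILS_ENV=production ./script/monit_rake stop resque_<%= worker %>'"
  <% end %>
MONIT

# The template reads the local variables above through the binding.
monit = template.result(binding)
puts monit
```

Generating the file from the same yml that the rake tasks read keeps the worker list in one place, so adding a worker does not require touching the monit config by hand.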

Page 19: Migrating from Backgroundrb to Resque

Monitoring Rake Processes

    #!/bin/sh
    # wrapper to daemonize rake tasks: see also http://mmonit.com/wiki/Monit/FAQ#pidfile

    usage() {
      echo "usage: ${0} [start|stop] name target [arguments]"
      echo "\tname is used to create or read the log and pid file names"
      echo "\tfor start: target and arguments are passed to rake"
      echo "\tfor stop: target and arguments are passed to kill (e.g.: -n 3)"
      exit 1
    }
    [ $# -lt 2 ] && usage

    cmd=$1
    name=$2
    shift ; shift

    pid_file=./tmp/pids/${name}.pid
    log_file=./log/${name}.log

# ...

Page 20: Migrating from Backgroundrb to Resque

Monitoring Processes

    case $cmd in
      start)
        if [ ${#} -eq 0 ] ; then
          echo -e "\nERROR: missing target\n"
          usage
        fi
        pid=`cat ${pid_file} 2> /dev/null`
        if [ -n "${pid}" ] ; then
          ps ${pid}
          if [ $? -eq 0 ] ; then
            echo "ensure process ${name} (pid: ${pid_file}) is not running"
            exit 1
          fi
        fi
        echo $$ > ${pid_file}
        exec 2>&1 rake $* 1>> ${log_file}
        ;;
      stop)
        pid=`cat ${pid_file} 2> /dev/null`
        [ -n "${pid}" ] && kill $* ${pid}
        rm -f ${pid_file}
        ;;
      *)
        usage
        ;;
    esac

Page 21: Migrating from Backgroundrb to Resque

Monitoring Web

Page 22: Migrating from Backgroundrb to Resque

6. Running Web

    namespace :resque do
      task :setup => :environment

      desc 'kick off resque-web'
      task :web => :environment do
        $stdout.sync=true
        $stderr.sync=true
        puts `env RAILS_ENV=#{RAILS_ENV} resque-web #{RAILS_ROOT}/config/initializers/resque.rb`
      end
    end

Page 23: Migrating from Backgroundrb to Resque

initializer

    # this runs in sinatra and rails - so don't use Rails.env
    rails_env = ENV['RAILS_ENV'] || 'development'
    rails_root = ENV['RAILS_ROOT'] || File.join(File.dirname(__FILE__), '../..')

    redis_config = YAML.load_file(rails_root + '/config/redis.yml')
    Resque.redis = redis_config[rails_env]

    require 'resque_scheduler'
    require 'resque/plugins/meta'
    require 'resque_mailer'

    Resque.schedule = YAML.load_file(rails_root + '/config/resque_schedule.yml')
    Resque::Mailer.excluded_environments = [:test, :cucumber]

Page 24: Migrating from Backgroundrb to Resque

5. Monitoring Work

                 bdrb                resque
ad-hoc queries   SQL                 redis query
did it run?      custom              resque-meta
did it fail?     hoptoad             yes
rerun                                yes
have id          yes                 resque-meta
queue health     sample controller   yes

Did the job run? resque assumes all worked and only tells you about failures. That was not good enough for us.
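Recording that a job actually ran can be sketched with a perform hook. This is not the resque-meta plugin the talk used: `RecordsRuns` and `STATUS` are hypothetical names, and the store is an in-memory hash where a real setup would persist the data. It relies on Resque's convention that class methods whose names start with `around_perform` are treated as hooks and are expected to yield to run the job.

```ruby
# Hypothetical in-memory status store; a real setup would persist this.
STATUS = {}

# Extend a job class with this module to record when each run
# started and finished.
module RecordsRuns
  def around_perform_record_run(*args)
    key = "#{name}:#{args.inspect}"
    STATUS[key] = { :started_at => Time.now }
    yield # runs the job itself
    STATUS[key][:finished_at] = Time.now
  end
end

class PurgeJobLogs
  extend RecordsRuns
  @queue = :job_worker

  def self.perform(*args)
    :purged # the talk's version calls JobLog.purge_expired! here
  end
end

# Simulate what a worker does: call the hook, which wraps perform.
result = nil
PurgeJobLogs.around_perform_record_run(30) do
  result = PurgeJobLogs.perform(30)
end
```

A run that raises leaves `:started_at` set without `:finished_at`, so a status query can tell "never ran", "ran", and "started but did not finish" apart.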

Page 25: Migrating from Backgroundrb to Resque

Pausing Workers

signal       what happens                    when to use
quit         wait for child & exit           graceful shutdown
term / int   immediately kill child & exit   shutdown now
usr1         immediately kill child          stale child
usr2         don't start any new jobs
cont         start to process new jobs
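The usr2 / cont pair can be tried out with a toy process that traps both signals. This sketch is not Resque's worker code: the `state` hash and `wait_until` helper are hypothetical, and the process signals itself only to keep the example self-contained. It assumes a Unix-like system where both signals are available.

```ruby
# Toy worker state; not Resque's actual implementation.
state = { :paused => false }

# usr2: stop picking up new jobs. cont: start picking them up again.
Signal.trap('USR2') { state[:paused] = true }
Signal.trap('CONT') { state[:paused] = false }

# Poll briefly until the block is true, since signals arrive asynchronously.
def wait_until(timeout = 2.0)
  deadline = Time.now + timeout
  sleep 0.01 until yield || Time.now > deadline
end

# In production the signal would be sent to the worker's pid,
# for example with the kill command and the pid from the pid file.
Process.kill('USR2', Process.pid)
wait_until { state[:paused] }
paused_after_usr2 = state[:paused]

Process.kill('CONT', Process.pid)
wait_until { !state[:paused] }
paused_after_cont = state[:paused]
```

A worker loop would check the paused flag before reserving the next job, so a job already in progress finishes normally while no new ones start.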

Page 26: Migrating from Backgroundrb to Resque

Testing Worker

                   bdrb       resque
testing queue      mid-easy   resque_unit
testing command               yes
all workers same              yes
interface only                yes

Page 27: Migrating from Backgroundrb to Resque

Mail

Resque::Mailer.excluded_environments = [:test, :cucumber]

Page 28: Migrating from Backgroundrb to Resque

Extending with Hooks

resque hooks

around_enqueue    no
after_enqueue     yes
before_perform    yes
around_perform    yes / no
after_perform     yes

- all plugins want to extend enqueue - not compatible
- need to be able to alter arguments (e.g.: add id for meta plugins)

Page 29: Migrating from Backgroundrb to Resque

Conclusion

- Boss got no pages in the first month of implementation

- no memory leaks, great uptime (don't need monit...)

- Fast

- generalized workers increase throughput (nightly vs 1 hour)

- minimal custom code

- still some intimidation

- Eating flavor of the month