High gear PHP with Gearman (phpDay 2010)
14/05/2010 - phpDay Italia 2010 - Felix De Vliegher
Whoami?
• Consultant and software engineer at Ibuildings NL
• Long time PHP developer
• PHPBenelux
• Interest in PHP QA, high performance & scalability
• Belgium (beer)!
What is Gearman?
And why should I care about it?
GEARMAN!
A grain of truth
The name is an anagram for “Manager,” since it dispatches jobs to be done, but does not do anything useful itself.
- From Gearman website
What is Gearman?
It’s an application framework to distribute work
Distribute work?
• Actual work is farmed out
• Handled by a number of nodes
• RPC
• Distributed parallel processing
• Shared nothing
Traditional architecture
[Diagram: a single Application containing Model, View and Controller]
New architecture
[Diagram: a Client application (Gearman Client API), the Gearman Job server (gearmand), and three Worker applications (each with the Gearman Worker API)]
Terminology
• Client: Create jobs to be run and send them to a job server
• Worker: Register with a job server and grab jobs to run
• Job Server: Coordinates assignment from clients to workers, handles restarts
Application areas
• Image resizing / generating
• Log analysis and aggregation
• Asynchronous queues
• Map/Reduce
• URL processing
• Cache warm-up
Multiple client & worker APIs
Advantages
• Speed up work
• Parallel and asynchronous work
• Doesn’t block your Apache processes
• Scales well
• Architecture-based workload distributing
• Legacy code
Once upon a time
Gearman history
• 2005: http://brad.livejournal.com/2106943.html
• Originally a Perl implementation
• Created by Danga Interactive
• Guys behind memcached and MogileFS
• 2008: Rewrite in C by Brian Aker
• PHP Extension by James Luedke
• Gearman powers some of the largest sites around:
• Digg: 45+ servers, 400K jobs/day
• Yahoo: 60+ servers, 6M jobs/day
• Netlog.com
• Xing.com
Installing Gearman
• Job Server: gearmand
• Get it from https://launchpad.net/gearmand/
• extract, configure, make, make install:
dev:/usr/local/src# wget http://launchpad.net/gearmand/trunk/0.10/+download/gearmand-0.10.tar.gz
dev:/usr/local/src# tar -xzvf gearmand-0.10.tar.gz
dev:/usr/local/src# cd gearmand-0.10/
dev:/usr/local/src/gearmand-0.10# ./configure --prefix=/usr/local/
dev:/usr/local/src/gearmand-0.10# make
dev:/usr/local/src/gearmand-0.10# make install
dev:/usr/local/src/gearmand-0.10# /usr/local/sbin/gearmand --help
Gearmand usage
/usr/local/sbin/gearmand -d -u <user> -L 127.0.0.1 -p 7003
-d Start as daemon in background
-u <user> Run as the specified user
-L <host> Only listen on the specified host or IP
-p <port> Listen on the specified port
-t <threads> Number of threads to use
-v(vv) Verbose (useful for debugging)
Commandline gearman
• Client mode
  • ls | gearman -f processFiles
  • gearman -f processFiles < file
  • gearman -f processFiles "foo data"
• Worker mode
  • gearman -w -f lineCount -- wc -l
  • gearman -w -c 100 -f doStuff ./script.sh
• Example:
dev:~/gearman# gearman -w -f foo -- grep GearmanClient &
dev:~/gearman# cat demo.php | gearman -f foo
$client= new GearmanClient();
PHP Interface
PHP: 2 options
• Pecl extension:
$ pecl install channel://pecl.php.net/gearman-0.6.0
$ php -i | grep "gearman support"
gearman support => enabled
• Net_Gearman PEAR Library:
$ pear install Net_Gearman
• Net_Gearman_Job
• Net_Gearman_Worker
• Net_Gearman_Task
• Net_Gearman_Set
• Net_Gearman_Client
Simplest example (worker and client)
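A minimal sketch of such a worker/client pair with the PECL extension. It assumes a job server on 127.0.0.1:7003 (the address used in the gearmand usage example); the function name `reverse` and the helper names are hypothetical. `doNormal()` is the method name in newer releases of the extension; older releases expose the same call as `do()`.

```php
<?php
// Worker side: the callback receives the job and returns the result.
// $job is a GearmanJob at runtime; it is left untyped so the function
// can also be exercised with a stub object.
function reverseJob($job): string
{
    return strrev($job->workload());
}

// Hypothetical helper: register the callback and process jobs until
// the worker reports an error.
function runReverseWorker(string $host = '127.0.0.1', int $port = 7003): void
{
    $worker = new GearmanWorker();
    $worker->addServer($host, $port);
    $worker->addFunction('reverse', 'reverseJob');
    while ($worker->work()) {
        if ($worker->returnCode() !== GEARMAN_SUCCESS) {
            break;
        }
    }
}

// Client side: submit one synchronous job and return the worker's result.
function reverseViaGearman(string $text, string $host = '127.0.0.1', int $port = 7003): string
{
    $client = new GearmanClient();
    $client->addServer($host, $port);
    return $client->doNormal('reverse', $text);
}
```

A worker script would call `runReverseWorker()`; a client script would call `echo reverseViaGearman('Hello phpDay!');`.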
Client API
Setting up the client:
Client API
Job priorities and synchronous vs asynchronous:
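The PECL client exposes one method per combination of priority and blocking mode. Foreground (synchronous) calls block until a worker returns the result; background (asynchronous) calls return a job handle straight away. The sketch below only illustrates that mapping; `submitMethod()` and `submitJob()` are hypothetical helpers, not part of the extension.

```php
<?php
// Map a priority and a blocking mode to the GearmanClient method name.
function submitMethod(string $priority, bool $background): string
{
    $methods = [
        'normal' => ['doNormal', 'doBackground'],
        'high'   => ['doHigh', 'doHighBackground'],
        'low'    => ['doLow', 'doLowBackground'],
    ];
    if (!isset($methods[$priority])) {
        throw new InvalidArgumentException("Unknown priority: $priority");
    }
    return $methods[$priority][$background ? 1 : 0];
}

// Hypothetical helper: submit a job with the chosen priority and mode.
// Returns the result (foreground) or the job handle (background).
function submitJob(
    GearmanClient $client,
    string $function,
    string $workload,
    string $priority = 'normal',
    bool $background = false
): string {
    $method = submitMethod($priority, $background);
    return $client->$method($function, $workload);
}
```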
GearmanClient::jobStatus()
array(4) {
  [0]=> bool(true)    Job is known?
  [1]=> bool(true)    Job still running?
  [2]=> int(2)        Numerator
  [3]=> int(5)        Denominator
}
Notifying the client
Client receiving the status notifications:
dev:~/gearman# php -q client.php
Running: true, numerator: 0, denominator: 2
Running: true, numerator: 1, denominator: 2
Running: false, numerator: 0, denominator: 0
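A client that produces output of this shape could look roughly like the sketch below. It assumes the job is submitted in the background and then polled with `jobStatus()`; `formatStatus()` and `watchJob()` are hypothetical helper names, and the one-second polling interval is arbitrary.

```php
<?php
// Format the four-element array returned by GearmanClient::jobStatus().
function formatStatus(array $status): string
{
    [, $running, $numerator, $denominator] = $status;
    return sprintf(
        'Running: %s, numerator: %d, denominator: %d',
        $running ? 'true' : 'false',
        $numerator,
        $denominator
    );
}

// Hypothetical helper: submit a background job and poll until the job
// server no longer knows the handle, i.e. the job has finished.
function watchJob(GearmanClient $client, string $function, string $workload): void
{
    $handle = $client->doBackground($function, $workload);
    do {
        sleep(1);
        $status = $client->jobStatus($handle);
        echo formatStatus($status), PHP_EOL;
    } while ($status[0]); // element 0: is the job still known?
}
```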
Worker API
Possible to add multiple servers:
Worker API
Registering functions using a callback:
Pass application data to the functions:
# php -q client.php
Sending job
Count: 1: HELLO!
Count: 2: WORLD!
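A sketch of how application data can be passed to a worker function: the third argument of `GearmanWorker::addFunction()` is handed to the callback as its second parameter. The counter object, the function name `shout` and the helper `runShoutWorker()` are assumptions made for illustration. An object is used as context so that updates persist between jobs.

```php
<?php
// Callback with application data: $context is whatever was passed as
// the third argument to addFunction().
function shout($job, $context): string
{
    $context->count++;
    return $context->count . ': ' . strtoupper($job->workload());
}

// Hypothetical helper wiring the callback and its context to a worker.
function runShoutWorker(string $host = '127.0.0.1', int $port = 7003): void
{
    $context = new stdClass();
    $context->count = 0;

    $worker = new GearmanWorker();
    $worker->addServer($host, $port);
    $worker->addFunction('shout', 'shout', $context);
    while ($worker->work()) {
        // keep processing jobs until work() reports a failure
    }
}
```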
Worker API
Put it all together:
Notifying the client
Status notifications:
GearmanJob::sendStatus(int $numerator, int $denominator)
GearmanJob::sendWarning(string $warning)
GearmanJob::sendComplete(string $result)
GearmanJob::sendFail(void)
GearmanJob::sendException(string $exception)
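As an illustration of `sendStatus()`, the worker callback below reports progress after every processed line. The word-counting task and the name `countWords` are made up for this sketch; only `workload()` and `sendStatus()` are called on the job.

```php
<?php
// Worker callback that reports progress after each processed line.
// $job is a GearmanJob at runtime; it is left untyped so the function
// can also be exercised with a stub object.
function countWords($job): string
{
    $lines = explode("\n", $job->workload());
    $total = count($lines);
    $words = 0;
    foreach ($lines as $i => $line) {
        $words += str_word_count($line);
        $job->sendStatus($i + 1, $total); // numerator, denominator
    }
    return (string) $words;
}
```

It would be registered like any other worker function, for example with `$worker->addFunction('countWords', 'countWords');`.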
Jobs vs Tasks
Callbacks
Provide feedback at different moments in the process:
GearmanClient::setDataCallback
GearmanClient::setCompleteCallback
GearmanClient::setCreatedCallback
GearmanClient::setExceptionCallback
GearmanClient::setFailCallback
GearmanClient::setStatusCallback
GearmanClient::setWorkloadCallback
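A sketch of one of these callbacks in use, assuming several tasks are queued with `addTask()` and run in parallel with `runTasks()`. The `ResultCollector` class and `runInParallel()` helper are hypothetical; the complete callback receives the finished task and reads its result with `data()`.

```php
<?php
// Collects results as tasks complete. $task is a GearmanTask at runtime.
class ResultCollector
{
    public array $results = [];

    public function onComplete($task)
    {
        $this->results[$task->unique()] = $task->data();
    }
}

// Hypothetical helper: queue several tasks, run them in parallel and
// return the results keyed by each task's unique ID.
function runInParallel(GearmanClient $client, string $function, array $workloads): array
{
    $collector = new ResultCollector();
    $client->setCompleteCallback([$collector, 'onComplete']);
    foreach ($workloads as $key => $workload) {
        // Third argument is an optional context, fourth the unique ID.
        $client->addTask($function, $workload, null, (string) $key);
    }
    $client->runTasks();
    return $collector->results;
}
```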
What about persistence?
By default, jobs are stored in memory (fast)
gearmand --queue-type (-q)
• libdrizzle
• postgresql
• libmemcached
• libsqlite3
Only for background jobs
Sqlite3 example
gearmand --queue-type=libsqlite3 --libsqlite3-db=/tmp/jobs.sqlite --libsqlite3-table=gearman_jobs
Table structure:
sqlite> CREATE TABLE gearman_jobs (
   ...> unique_key TEXT PRIMARY KEY,
   ...> function_name TEXT,
   ...> priority INTEGER,
   ...> data BLOB
   ...> );
How to do storage?
Distributed
Need to share storage (most of the time)
Some options:
• NFS
• MogileFS
• DRBD
So, can I kick out my crons?
Not quite :)
Scheduled execution (cron)
*/15 * * * * /usr/local/bin/gearman -f cronTask < /tmp/input.txt
Delayed execution (at)
at functionality is considered for inclusion
See http://groups.google.com/group/gearman/browse_thread/thread/b9891649fb08d16b#
HTTP protocol
Start gearmand with -r http
Use X-Gearman-* headers to modify job
POST /reverse HTTP/1.1
Content-Length: 13

Hello phpDay!

HTTP/1.0 200 OK
X-Gearman-Job-Handle: H:gman:4
Content-Length: 13
Server: Gearman/0.9

!yaDphp olleH
Application areas
Image resizing: Client
Image resizing: worker
Map / Reduce
[Diagram: a Client submits to a Gearman Job server, which hands the job to a Map/Reduce Worker; that worker acts as several Clients towards a Gearman Job server, which distributes the work to multiple Workers]
Apache logging
Monitoring
GearUp: Monitoring service for gearman
‣ No code yet, but looks promising
‣ http://launchpad.net/gearup
Telnet monitoring:
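The same information that a telnet session shows can be read from PHP. The sketch below sends the `status` command of gearmand's text-based admin protocol over a plain TCP connection and parses the reply, which has one tab-separated line per function (name, jobs in queue, jobs running, available workers) and ends with a line containing a single dot. The helper names and the host and port defaults are assumptions.

```php
<?php
// Parse the reply to the "status" admin command into an array keyed
// by function name.
function parseStatus(string $reply): array
{
    $functions = [];
    foreach (preg_split('/\r?\n/', trim($reply)) as $line) {
        if ($line === '.' || $line === '') {
            continue;
        }
        [$name, $total, $running, $workers] = explode("\t", $line);
        $functions[$name] = [
            'total'   => (int) $total,
            'running' => (int) $running,
            'workers' => (int) $workers,
        ];
    }
    return $functions;
}

// Hypothetical helper: send "status" to the job server and parse the reply.
function fetchStatus(string $host = '127.0.0.1', int $port = 7003): array
{
    $socket = fsockopen($host, $port, $errno, $errstr, 5);
    if ($socket === false) {
        throw new RuntimeException("Cannot connect: $errstr");
    }
    fwrite($socket, "status\n");
    $reply = '';
    while (($line = fgets($socket)) !== false) {
        $reply .= $line;
        if (rtrim($line) === '.') {
            break; // end-of-reply marker
        }
    }
    fclose($socket);
    return parseStatus($reply);
}
```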
Monitoring
Supervisord: can manage workers
Combine with memory_get_usage() to restart workers
http://supervisord.org/
Alternative: http://github.com/brianlmoon/GearmanManager
[program:gearman-foobar-worker]
command=/usr/local/bin/php -q /home/foo/workers/foobar.php
process_name=%(program_name)s_%(process_num)02d
autostart=true
autorestart=true
numprocs=100
redirect_stderr=true
stdout_logfile=/var/log/gearman-foobar-worker.log
stdout_logfile_maxbytes=5MB
stdout_logfile_backups=10
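One way to combine supervisord with a memory check is to let the worker exit once it has grown past a budget and rely on autorestart=true to start a fresh process. The sketch below assumes a 64 MB budget; the helper names and the limit are hypothetical.

```php
<?php
// Decide whether the worker has grown past its memory budget.
function overMemoryLimit(int $usedBytes, int $limitBytes): bool
{
    return $usedBytes > $limitBytes;
}

// Hypothetical worker loop: after each job, exit cleanly if the budget
// is exceeded so that supervisord starts a fresh process.
function runUntilBloated(GearmanWorker $worker, int $limitBytes = 64 * 1024 * 1024): void
{
    while ($worker->work()) {
        if (overMemoryLimit(memory_get_usage(true), $limitBytes)) {
            exit(0);
        }
    }
}
```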
Alternatives
Most similar: Beanstalkd
• PHP Client: pheanstalk
• Web interface in Django
• http://kr.github.com/beanstalkd/
Zend Server job queue
• Has job scheduling and dependencies built in
• Not free
• http://www.zend.com/en/products/server/zend-server-job-queue
Questions?
Feedback: http://joind.in/1470
Ibuildings challenge!
http://www.ibuildings.com/challenge
“The Test Driven Challenge”
Win an iPad! (when they’re available)
We’re also hiring!
Links & sources
Credits:
- Gear man: http://agearman.com/
- Distributed computing: http://www.theleadblog.com/2009/06/lead-distribution-creating-right-sales.html
- Old gears: http://decorate.pebblez.com
- Army of elephpants: http://www.flickr.com/people/dragonbe/
Gearman online:
- http://www.danga.com/gearman/
- http://gearman.org/
- http://pecl.php.net/package/gearman
- IRC: #gearman on irc.freenode.net
- ML: http://groups.google.com/group/gearman