Using Monitoring & Configuration Management to restart services.
Greg Retkowski, Operations Engineer, OnLive
Combining Monitoring and CM to restart services.
These are the slides from a talk I gave to the Large Scale Production Engineering group in November 2011. The talk covers how to tie Nagios and Puppet together so that Puppet can rectify problems detected by Nagios.
What's in it for me?
● It'll free you up from firefighting
● It'll react faster than a human can
● It's a hedge against technical debt
Required Tools
● Nagios
● Puppet
High Level Diagram
Our Puppet Apache Class
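The manifest shown on this slide is not preserved in the transcript. A minimal sketch of such a class, assuming a Red Hat style package and service both named `httpd` (an assumption, not taken from the talk), could look like the following. The part that matters for this talk is `ensure => running`: any Puppet run starts the service again if it is found stopped.

```puppet
# Hypothetical sketch, not the class from the original slide.
# The package and service name 'httpd' are assumptions.
class apache {
  package { 'httpd':
    ensure => installed,
  }

  service { 'httpd':
    ensure  => running,   # a Puppet run starts httpd if it is down
    enable  => true,      # also start it at boot
    require => Package['httpd'],
  }
}
```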
Nagios service.cfg
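The service definition from this slide is not preserved in the transcript. As a sketch, a Nagios service gains an event handler through the `event_handler` directive, which names a command to run whenever the service changes state. The host name, template, check command, and the handler name `handle_puppetrun` below are all assumptions chosen for illustration.

```
# Hypothetical sketch, not the definition from the original slide.
define service {
    use                  generic-service      ; assumed template
    host_name            web01.example.com    ; assumed host
    service_description  HTTP
    check_command        check_http
    event_handler        handle_puppetrun     ; assumed command name
}
```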
Nagios commands.cfg
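The command definition from this slide is not preserved in the transcript. A sketch of what an event handler command might look like follows. The macros `$SERVICESTATE$`, `$SERVICESTATETYPE$`, `$SERVICEATTEMPT$`, and `$HOSTNAME$` are standard Nagios macros; the command name and the script location under `$USER1$/eventhandlers/` are assumptions.

```
# Hypothetical sketch, not the definition from the original slide.
define command {
    command_name  handle_puppetrun
    command_line  $USER1$/eventhandlers/handle_puppetrun.sh $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ $HOSTNAME$
}
```

Passing the state and state type lets the script decide for itself when a Puppet run is warranted, since Nagios calls the handler on every state change, including recoveries.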
Puppet auth.conf
● Create an empty namespaceauth.conf
● Add this to your auth.conf:
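The stanza itself is not preserved in the transcript. As a sketch, Puppet releases of this era allow a remote host to trigger an agent run by granting it the `save` method on the `/run` path. The host name below is an assumption; substitute the certificate name of the machine that invokes puppetrun. Since auth.conf stanzas are matched in order, the stanza likely needs to sit above any catch-all `path /` entry.

```
# Hypothetical sketch, not the stanza from the original slide.
# nagios.example.com is an assumed certname for the Nagios server.
path /run
method save
allow nagios.example.com
```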
Puppet puppet.conf
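The settings from this slide are not preserved in the transcript. A sketch of the relevant setting, assuming Puppet 2.6 or later where the agent section is named `[agent]` (older releases use `[puppetd]`), is to make the agent listen for incoming run requests:

```ini
# Hypothetical sketch, not the file from the original slide.
[agent]
    listen = true
```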
Testing the Puppet agent
● puppetd --listen --verbose --no-daemonize --no-client --fqdn `hostname`
Invoking puppetrun
● puppetrun -a --host FQDN
The handle_puppetrun.sh script
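The script from this slide is not preserved in the transcript. The following is a minimal sketch of how such an event handler might work, assuming Nagios passes the service state, state type, attempt number, and host name as four arguments. The function name `handle_event`, the `PUPPETRUN` override variable, and the choice to act only on HARD CRITICAL states are assumptions; the puppetrun invocation mirrors the one on the "Invoking puppetrun" slide.

```shell
#!/bin/sh
# Hypothetical sketch of an event handler, not the script from the original slide.
# Arguments, as passed by the Nagios command definition:
#   $1 = $SERVICESTATE$      (OK, WARNING, UNKNOWN, CRITICAL)
#   $2 = $SERVICESTATETYPE$  (SOFT or HARD)
#   $3 = $SERVICEATTEMPT$    (accepted but unused in this sketch)
#   $4 = FQDN of the host whose Puppet agent should run

# PUPPETRUN can be overridden, for example to substitute a stub when testing.
PUPPETRUN="${PUPPETRUN:-puppetrun}"

handle_event() {
    state="$1"
    state_type="$2"
    host="$4"

    case "$state" in
        CRITICAL)
            # Act only once the problem is confirmed (HARD state), so a
            # single failed check does not trigger a Puppet run.
            if [ "$state_type" = "HARD" ]; then
                "$PUPPETRUN" -a --host "$host"
            fi
            ;;
        *)
            # OK, WARNING, UNKNOWN: nothing to do.
            ;;
    esac
    return 0
}

# Run only when all four arguments are supplied.
if [ "$#" -eq 4 ]; then
    handle_event "$@"
fi
```

Waiting for a HARD state trades a slower reaction for fewer spurious Puppet runs; a handler that reacts to a late SOFT attempt instead would restart the service before Nagios sends notifications.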
Bringing it all together
Resources
http://www.rage.net/lspe