Using Monitoring & Configuration Management to restart services.

Posted on 16-May-2015

These are the slides from a talk I gave to the Large Scale Production Engineering group in November 2011. The talk covers how to tie Nagios and Puppet together so that Puppet can rectify problems detected by Nagios.

Greg Retkowski, Operations Engineer, OnLive

Combining Monitoring and CM to restart services.

What's in it for me?

● It'll free you up from firefighting

● It'll react faster than a human can

● It's a hedge against technical debt

Required Tools

● Nagios

● Puppet

High Level Diagram

Our Puppet Apache Class

Nagios service.cfg

Nagios commands.cfg

Puppet auth.conf

● Create an empty namespaceauth.conf

● Add this to your auth.conf:
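
The two steps above can be sketched in shell roughly as follows. This is not the snippet from the slide: the `path /run` stanza is an assumption based on how Puppet's REST authorization for remotely triggered runs is commonly configured, and the function names and the host name in the usage note are hypothetical.

```shell
#!/bin/sh
# Sketch only. Function names and the example host are hypothetical,
# and the stanza is an assumed auth.conf rule, not the one from the talk.

# Step 1: make sure an empty namespaceauth.conf exists in the Puppet
# config directory passed as $1.
ensure_namespaceauth() {
    confdir="$1"
    # touch creates the file if it is missing and leaves an existing one alone
    touch "$confdir/namespaceauth.conf"
}

# Step 2: print an auth.conf stanza that lets one host (passed as $1)
# trigger a run on this agent.
print_run_stanza() {
    allowed_host="$1"
    printf 'path /run\nmethod save\nallow %s\n' "$allowed_host"
}
```

A possible use is `ensure_namespaceauth /etc/puppet` followed by `print_run_stanza nagios.example.com`, pasting the output into auth.conf by hand. Rules in auth.conf are matched in order, so the stanza probably needs to sit above any catch-all `path /` rule.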

Puppet puppet.conf

Testing the puppet agent

● puppetd --listen --verbose --no-daemonize --no-client --fqdn `hostname`

Invoking puppetrun

● puppetrun -a --host FQDN

The handle_puppetrun.sh script
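
As an illustration of what an event handler along these lines might look like, here is a minimal sketch. It is not the script from the talk. It assumes Nagios passes the service state, state type, attempt number, and host name as four positional arguments (for example via the `$SERVICESTATE$`, `$SERVICESTATETYPE$`, `$SERVICEATTEMPT$`, and `$HOSTADDRESS$` macros), and the `PUPPETRUN` override variable is a hypothetical addition for dry runs.

```shell
#!/bin/sh
# Hypothetical sketch of a Nagios event handler; not the original script.
# Assumed arguments: state, state type, attempt number, host FQDN.

# PUPPETRUN can be overridden for dry runs, e.g. PUPPETRUN=echo
PUPPETRUN="${PUPPETRUN:-puppetrun}"

handle_puppetrun() {
    state="$1"
    state_type="$2"
    attempt="$3"
    host="$4"

    case "$state" in
        CRITICAL)
            # Trigger a run once the problem is confirmed (HARD state) or
            # has persisted for three check attempts, so that a single
            # failed check does not kick off a Puppet run.
            if [ "$state_type" = "HARD" ] || [ "$attempt" -ge 3 ]; then
                $PUPPETRUN --host "$host"
            fi
            ;;
        *)
            # OK, WARNING, UNKNOWN: nothing to do
            ;;
    esac
    return 0
}

# Run the handler only when the script is called with arguments.
if [ "$#" -gt 0 ]; then
    handle_puppetrun "$@"
fi
```

Running it as `PUPPETRUN=echo sh handle_puppetrun.sh CRITICAL HARD 3 web01.example.com` prints the arguments that would be passed to puppetrun instead of contacting the agent, which is a cheap way to check the wiring from commands.cfg before letting it restart anything.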

Bringing it all together

Resources

http://www.rage.net/lspe