Using Monitoring & Configuration Management to restart services.

Greg Retkowski, Operations Engineer, OnLive. Combining Monitoring and CM to restart services.

description

These are the slides from a talk I gave to the Large Scale Production Engineering group in November 2011. The talk covers how to tie Nagios and Puppet together so that Puppet can rectify problems detected by Nagios.

Transcript of Using Monitoring & Configuration Management to restart services.

Page 1: Using Monitoring & Configuration Management to restart services.

Greg Retkowski, Operations Engineer, OnLive

Combining Monitoring and CM to restart services.

Page 2:

What's in it for me?

● It'll free you up from firefighting

● It'll react faster than a human can

● It's a hedge against technical debt

Page 3:

Required Tools

● NAGIOS

● Puppet

Page 4:

High Level Diagram

Page 5:

Our Puppet Apache Class

Page 6:

Nagios service.cfg

Page 7:

Nagios commands.cfg

Page 8:

Puppet auth.conf

● Create an empty namespaceauth.conf

● Add this to your auth.conf:

Page 9:

Puppet puppet.conf

Page 10:

Testing the puppet agent

● puppetd --listen --verbose --no-daemonize --no-client --fqdn `hostname`

Page 11:

Invoking puppetrun

● puppetrun -a --host FQDN

Page 12:

The handle_puppetrun.sh script
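
The script itself did not survive in this transcript, so what follows is only a rough sketch of what a Nagios event handler of this kind might look like, not the author's actual code. It assumes the handler is called with three arguments supplied by Nagios macros ($SERVICESTATE$, $SERVICESTATETYPE$, and a host name that matches the agent's FQDN), and that it should trigger Puppet only on a HARD CRITICAL state. The function name, the argument order, and the PUPPETRUN override variable are all hypothetical; the puppetrun invocation mirrors the one shown on the "Invoking puppetrun" slide.

```shell
#!/bin/sh
# Hypothetical sketch of handle_puppetrun.sh (not the original script).
# Assumed arguments, as a Nagios command definition might pass them:
#   $1 = service state      (OK, WARNING, UNKNOWN, CRITICAL)
#   $2 = service state type (SOFT or HARD)
#   $3 = FQDN of the host whose Puppet agent should be triggered

# PUPPETRUN is an assumed override hook, e.g. PUPPETRUN=echo for a dry run.
PUPPETRUN="${PUPPETRUN:-puppetrun}"

handle_puppetrun() {
    state="$1"
    state_type="$2"
    host="$3"

    case "$state" in
        CRITICAL)
            # Act only on HARD states, so a failure that clears on a
            # Nagios retry does not trigger a Puppet run.
            if [ "$state_type" = "HARD" ]; then
                "$PUPPETRUN" -a --host "$host"
                return $?
            fi
            ;;
    esac

    # Any other state: nothing to do.
    return 0
}

# Run the handler only when the script is given arguments,
# which is how Nagios would invoke it.
if [ "$#" -ge 3 ]; then
    handle_puppetrun "$@"
fi
```

Running it by hand as `PUPPETRUN=echo ./handle_puppetrun.sh CRITICAL HARD web01.example.com` (a made-up host name) prints the command that would be executed instead of contacting an agent, which is a cheap way to check the state logic before wiring it into Nagios.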

Page 13:

Bringing it all together

Page 20:

Resources

http://www.rage.net/lspe