Post on 13-Sep-2018
WMLUG July 2015
Nagios, PNP4Nagios, and NConf
by Patrick TenHoopen
What is Nagios?
Nagios is an IT infrastructure monitoring and alerting tool. The free Nagios DIY Core provides the central monitoring engine and the basic web interface.
Current Version: 4.0.8 (2014-08-12)
Download: https://assets.nagios.com/downloads/nagioscore/releases/nagios-4.0.8.tar.gz
Nagios Demo
Demo
Installation Prerequisites
● gcc
● apache2
● perl
● php
● rrdtool
● php5-gd
● php5-zlib
● php5-socket
Installation
Follow Quick-Start Guides
https://assets.nagios.com/downloads/nagioscore/docs/nagioscore/4/en/quickstart.html
After installation, if a firewall is running on the Nagios server, don't forget to configure it to allow HTTP access.
Installation, cont.
tar xf nagios-4.0.8.tar.gz
cd nagios-4.0.8
./configure --with-command-group=nagcmd
make all
make install
make install-init
make install-config
make install-commandmode
make install-webconf
htpasswd2 -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
Nagios Plugins
Download: http://nagios-plugins.org/download/nagios-plugins-2.0.3.tar.gz
tar xf nagios-plugins-2.0.3.tar.gz
cd nagios-plugins-2.0.3
./configure --with-nagios-user=nagios --with-nagios-group=nagios
make
make install
Configuration
Nagios comes with a default configuration for monitoring the localhost that Nagios is installed on (localhost.cfg) plus some other examples.
The configuration files are stored in /usr/local/nagios/etc/objects/ and are plain text files written in Nagios's own object definition format.
Detailed description of configuration files and options:
https://assets.nagios.com/downloads/nagioscore/docs/nagioscore/4/en/objectdefinitions.html
Default Configuration Files
● commands.cfg - Check commands that are used in service definitions
● contacts.cfg - Who to contact if an alert is generated
● hosts.cfg - Hosts to monitor
● localhost.cfg - Basic config for Nagios host
● printer.cfg - Sample config for printers
● services.cfg - Things on hosts to monitor
● switch.cfg - Sample config for switches
● templates.cfg - Definition templates used by hosts, services, etc.
● timeperiods.cfg - Notification times/hours of alerting
● windows.cfg - Sample config for a Windows machine
Configuration File Organization
You don't need to split the definitions into separate files; a single large configuration file works too.
The cfg_file line(s) in /usr/local/nagios/etc/nagios.cfg control which files are used.
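To see which object files a given nagios.cfg actually loads, the cfg_file lines can be pulled out with a one-liner. This is a minimal sketch; the function name list_cfg_files is made up for illustration, and the default path assumes the install location used in these slides.

```shell
#!/bin/sh
# List the object files that a Nagios main config pulls in via cfg_file=.
# Commented-out entries are skipped because they do not start with "cfg_file=".
list_cfg_files() {
    main_cfg=${1:-/usr/local/nagios/etc/nagios.cfg}
    # Keep everything after the first "=" so paths containing "=" survive.
    grep '^cfg_file=' "$main_cfg" | cut -d= -f2-
}

# Example: list_cfg_files /usr/local/nagios/etc/nagios.cfg
```

Running it before and after reorganizing the files is a quick way to confirm nothing was dropped.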
Note: If you want to import existing Nagios conf files into NConf (discussed later), it will work better if they are separated out by function/type.
Templates
Templates provide default values for the settings in configuration definitions. This keeps the actual definitions smaller and easier to update: if you modify a template, every definition that uses it picks up the change.
Generic Linux Host Template
# Linux host definition template - This is NOT a real host, just a template!
define host{
name linux-server ; The name of this host template
use generic-host ; Inherits other values from generic-host template
check_period 24x7 ; By default, Linux hosts are checked round the clock
check_interval 5 ; Actively check the host every 5 minutes
retry_interval 1 ; Schedule host check retries at 1 minute intervals
max_check_attempts 10 ; Check each Linux host 10 times (max)
check_command check-host-alive ; Default command to check Linux hosts
notification_period workhours ; Only notify during the day
; Note that the notification_period variable is being
; overridden from the value that is inherited from the
; generic-host template!
notification_interval 120 ; Resend notifications every 2 hours
notification_options d,u,r ; Only send notifications for specific host states
contact_groups admins ; Notifications get sent to the admins by default
register 0 ; DONT REGISTER THIS DEFINITION
}
Generic Service Template
# Generic service definition template - This is NOT a real service, just a template!
define service{
name generic-service ; The 'name' of this service template
active_checks_enabled 1 ; Active service checks are enabled
passive_checks_enabled 1 ; Passive service checks are enabled/accepted
parallelize_check 1 ; Active service checks should be parallelized
; (disabling this can lead to major performance problems)
obsess_over_service 1 ; We should obsess over this service (if necessary)
check_freshness 0 ; Default is to NOT check service 'freshness'
notifications_enabled 1 ; Service notifications are enabled
event_handler_enabled 1 ; Service event handler is enabled
flap_detection_enabled 1 ; Flap detection is enabled
process_perf_data 1 ; Process performance data
retain_status_information 1 ; Retain status information across program restarts
retain_nonstatus_information 1 ; Retain non-status information across program restarts
is_volatile 0 ; The service is not volatile
check_period 24x7 ; The service can be checked at any time of the day
max_check_attempts 3 ; Re-check the service up to 3 times in order to determine its final (hard) state
normal_check_interval 10 ; Check the service every 10 minutes under normal conditions
retry_check_interval 2 ; Re-check the service every two minutes until a hard state can be determined
contact_groups admins ; Notifications get sent out to everyone in the 'admins' group
notification_options w,u,c,r ; Send notifications about warning, unknown, critical, and recovery events
notification_interval 60 ; Re-notify about service problems every hour
notification_period 24x7 ; Notifications can be sent out at any time
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
}
Commands
Nagios comes with several commands for checking services, and more are installed with the Nagios plugins. The plugin programs are located in the /usr/local/nagios/libexec/ directory.
Some examples:
check_disk, check_http, check_log, check_nt, check_ping
Community Check Commands
You can download command definitions created by the Nagios community by perusing the plugin exchange at:
http://exchange.nagios.org/directory/Plugins
Custom Commands
You can also create custom check commands using scripts or custom programs.
The script/program just needs to return one of the exit statuses that Nagios expects:
UNKNOWN = 3, CRITICAL = 2, WARNING = 1, OK = 0
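As a rough illustration of those exit statuses, here is a small shell check that counts files in a directory. The function name check_file_count, the message wording, and the threshold semantics (greater than or equal) are assumptions made for this sketch, not part of any shipped plugin.

```shell
#!/bin/sh
# Hypothetical custom check: count regular files in a directory and map the
# count to the exit statuses Nagios expects (OK=0, WARNING=1, CRITICAL=2, UNKNOWN=3).
check_file_count() {
    dir=$1; warn=$2; crit=$3

    # Missing arguments or a missing directory cannot be evaluated: UNKNOWN.
    if [ -z "$dir" ] || [ -z "$warn" ] || [ -z "$crit" ] || [ ! -d "$dir" ]; then
        echo "FILES UNKNOWN - usage: check_file_count DIR WARN CRIT"
        return 3
    fi

    # Count regular files directly inside the directory (no recursion).
    count=$(find "$dir" -maxdepth 1 -type f | wc -l | tr -d ' ')

    # Text after "|" is performance data that graphing add-ons can pick up.
    if [ "$count" -ge "$crit" ]; then
        echo "FILES CRITICAL - $count files in $dir | files=$count;$warn;$crit"
        return 2
    elif [ "$count" -ge "$warn" ]; then
        echo "FILES WARNING - $count files in $dir | files=$count;$warn;$crit"
        return 1
    fi
    echo "FILES OK - $count files in $dir | files=$count;$warn;$crit"
    return 0
}

# As a standalone plugin script, finish with:
#   check_file_count "$@"; exit $?
```

Nagios reads the first line of output as the status text and uses only the exit status to decide the state.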
Example Check Definition
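$USER1$, $HOSTADDRESS$, $ARG1$, and $ARG2$ are Nagios macros. When the check runs, they are replaced with actual values: $USER1$ with the plugin directory, $HOSTADDRESS$ with the host's address, and $ARG1$ and $ARG2$ with the arguments passed in from the service definition. -w is the warning threshold and -c is the critical threshold.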
# 'check_ping' command definition
define command{
command_name check_ping
command_line $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
}
Example Host Definition
define host{
use linux-server ; Name of host templates to use
; This host definition will
; inherit all variables that are
; defined in (or inherited by)
; the linux-server host template
; definition.
host_name localhost
alias localhost
address 127.0.0.1
}
Example Service
Note that the command parameters are delimited by an "!". The parameters are used in the check definition ($ARG1$, $ARG2$, etc).
# Define a service to "ping" the local machine
define service{
use local-service ; Name of service
; template to use
host_name localhost
service_description PING
check_command check_ping!100.0,20%!500.0,60%
}
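To make the "!" splitting concrete, the sketch below mimics the substitution in plain shell. Nagios does this internally; expand_check_command is a made-up helper for illustration only, and it handles just two arguments.

```shell
#!/bin/sh
# Illustration only: mimic how "check_ping!100.0,20%!500.0,60%" fills in
# $ARG1$ and $ARG2$ in a command_line. This is not Nagios code.
expand_check_command() {
    check_command=$1      # e.g. check_ping!100.0,20%!500.0,60%
    command_line=$2       # the command_line from the command definition
    host_address=$3       # value for $HOSTADDRESS$
    plugin_dir=$4         # value for $USER1$ (the plugin directory)

    # Split on "!": field 1 is the command name, fields 2 and 3 the arguments.
    arg1=$(printf '%s\n' "$check_command" | cut -s -d'!' -f2)
    arg2=$(printf '%s\n' "$check_command" | cut -s -d'!' -f3)

    # Replace each macro with its value ("|" is the sed delimiter because
    # the plugin directory contains slashes).
    printf '%s\n' "$command_line" | sed \
        -e 's|\$USER1\$|'"$plugin_dir"'|g' \
        -e 's|\$HOSTADDRESS\$|'"$host_address"'|g' \
        -e 's|\$ARG1\$|'"$arg1"'|g' \
        -e 's|\$ARG2\$|'"$arg2"'|g'
}
```

Passing the check_command from the service and the command_line from the check_ping definition prints the command line with -w 100.0,20% and -c 500.0,60% filled in.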
Host Groups
By using host groups, you can easily set up checks for a set of hosts with one service definition. You can create a new config file named hostgroups.cfg.
define hostgroup{
hostgroup_name linux-servers ; Name of the hostgroup
alias Linux Servers ; Long name of the group
members localhost,linuxbox1,linuxbox2 ; Comma separated list of
; hosts that belong to this group
}
define service{
use local-service ; Name of service template to use
hostgroup_name linux-servers
service_description PING-LINUX-HOSTS
check_command check_ping!100.0,20%!500.0,60%
}
Parent/Child Relationships
By defining what other hosts a host depends on, Nagios can distinguish between down and unreachable states for the host.
For example, if Nagios is monitoring a host connected through a switch and the switch is down, Nagios cannot ping the host. In that case Nagios only alerts that the switch is offline and doesn't also alert that the host behind it is down.
Parents Setting
When defining a host, use the "parents" setting to establish the parent/child relationship.
define host{
host_name Nagios ; Nagios host has no parent
}
define host{
host_name Switch1
parents Nagios
}
define host{
host_name OtherHost
parents Switch1
}
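The decision can be modeled roughly as follows, using the three hosts above. This is a simplified sketch, not Nagios's actual logic: FAILED_HOSTS stands in for real check results, and the helper names are invented for the example.

```shell
#!/bin/sh
# Simplified model of the down/unreachable decision for the three hosts above.
# FAILED_HOSTS lists the hosts whose check currently fails (assumed input).
FAILED_HOSTS="Switch1 OtherHost"

# Mirror of the "parents" settings in the host definitions.
parents_of() {
    case $1 in
        Switch1)   echo "Nagios" ;;
        OtherHost) echo "Switch1" ;;
        *)         echo "" ;;        # the Nagios host has no parent
    esac
}

# Succeeds when the host's check passes, i.e. it is not in FAILED_HOSTS.
check_ok() {
    case " $FAILED_HOSTS " in
        *" $1 "*) return 1 ;;
        *)        return 0 ;;
    esac
}

# The body runs in a subshell ( ... ) so recursion cannot clobber variables.
host_state() (
    host=$1
    if check_ok "$host"; then echo UP; exit 0; fi
    parents=$(parents_of "$host")
    if [ -z "$parents" ]; then echo DOWN; exit 0; fi
    for p in $parents; do
        # One reachable parent is enough to blame the host itself.
        if [ "$(host_state "$p")" = UP ]; then echo DOWN; exit 0; fi
    done
    # Every parent is down or unreachable, so this host cannot be judged.
    echo UNREACHABLE
)

# Example: for h in Nagios Switch1 OtherHost; do echo "$h: $(host_state "$h")"; done
```

With the switch failing, the model reports Switch1 as DOWN and OtherHost as UNREACHABLE, which is the distinction the parents setting makes possible.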
Parent/Child Relationship Picture
Pictorial representation:
https://assets.nagios.com/downloads/nagioscore/docs/nagioscore/4/en/networkreachability.html
NSClient++
With the NSClient++ add-on, you can easily set up checks on Windows servers.
http://exchange.nagios.org/directory/Addons/Monitoring-Agents/NSClient%2B%2B/details
NRPE
The NRPE (Nagios Remote Plugin Executor) add-on runs checks on a remote Linux host. NSClient++ can also act as an NRPE listener on a Windows server.
http://exchange.nagios.org/directory/Addons/Monitoring-Agents/NRPE--2D-Nagios-Remote-Plugin-Executor/details
File Count Example
Using NSClient++ and a community-created command (Check Filecount), you can monitor the number of files in a directory on a Windows computer.
File Count Example – Service Definition
From services.cfg:
# Service definition
define service {
use generic-service
host_name WINSERVER
service_description Temp File Count
check_command check_temp_files
}
File Count Example – Command Definition
From commands.cfg:
# 'check_temp_files' command definition
define command{
command_name check_temp_files
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c check_temp_files
}
File Count Example – NSClient++ Definition
From the NSClient++ nsclient.ini file on the Windows server being monitored:
[/settings/external scripts/scripts]
check_temp_files=c:\windows\system32\cscript.exe //NoLogo //T:30 C:\nrpe\directory_file_count\directory_file_count.wsf c:\\windows\\temp 50 100
Pre-Flight Check
Configuration changes don't take effect until Nagios is restarted.
You should run a Nagios pre-flight check after making configuration changes and before restarting Nagios to make sure it doesn't find anything wrong with the configuration.
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
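One way to make the pre-flight check hard to forget is to wrap it together with the restart. The sketch below assumes the paths used in these slides and a systemd-managed nagios service; verify_then_restart and the three overridable variables are hypothetical names, and the restart typically needs root.

```shell
#!/bin/sh
# Restart Nagios only if the pre-flight check passes.
# The variables can be overridden to try the sketch without a live install.
NAGIOS_BIN=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
NAGIOS_CFG=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}
RESTART_CMD=${RESTART_CMD:-systemctl restart nagios}

verify_then_restart() {
    # Capture the check output so it is only shown when something is wrong.
    if output=$("$NAGIOS_BIN" -v "$NAGIOS_CFG" 2>&1); then
        echo "Configuration OK, restarting Nagios"
        $RESTART_CMD
    else
        printf '%s\n' "$output" >&2
        echo "Configuration check failed, Nagios was not restarted" >&2
        return 1
    fi
}

# Example: verify_then_restart
```

The design choice is simply to rely on the exit status of the -v run, so a broken configuration never reaches the running daemon.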
Starting Nagios
You'll need to restart the web server and start Nagios:
systemctl restart apache2
systemctl start nagios
Logging Into Nagios
Go to the Nagios web page and log in:
http://nagioshost/nagios/
PNP4Nagios
PNP4Nagios is an add-on to Nagios which analyzes performance data provided by Nagios plugins, stores it in Round Robin Database (RRD) files, and graphs it.
https://docs.pnp4nagios.org/
Prerequisites
● Perl 5.x or higher, without additional modules
● RRDtool 1.x or higher, better with 1.2
● Nagios 2.x or higher
Installation
https://docs.pnp4nagios.org/pnp-0.6/install
tar xf pnp4nagios-0.6.25.tar.gz
cd pnp4nagios-0.6.25
./configure
make all
make fullinstall
Configuration – Choose Mode
Bulk Mode with NPCD seems to be the only mode that works with Nagios Core 4.
https://docs.pnp4nagios.org/pnp-0.6/modes#bulk_mode_with_npcd
Also, this is the best way of processing because Nagios will not be blocked. The NPCD daemon (Nagios Performance C Daemon) will monitor the spool directory for new files and process them.
Configuration - Enable Processing
Enable processing of performance data in /usr/local/nagios/etc/nagios.cfg
process_performance_data=1
# service performance data
service_perfdata_file=/usr/local/pnp4nagios/var/service-perfdata
service_perfdata_file_template=DATATYPE::SERVICEPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tSERVICEDESC::$SERVICEDESC$\tSERVICEPERFDATA::$SERVICEPERFDATA$\tSERVICECHECKCOMMAND::$SERVICECHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$\tSERVICESTATE::$SERVICESTATE$\tSERVICESTATETYPE::$SERVICESTATETYPE$
service_perfdata_file_mode=a
service_perfdata_file_processing_interval=15
service_perfdata_file_processing_command=process-service-perfdata-file
# host performance data
host_perfdata_file=/usr/local/pnp4nagios/var/host-perfdata
host_perfdata_file_template=DATATYPE::HOSTPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tHOSTPERFDATA::$HOSTPERFDATA$\tHOSTCHECKCOMMAND::$HOSTCHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$
host_perfdata_file_mode=a
host_perfdata_file_processing_interval=15
host_perfdata_file_processing_command=process-host-perfdata-file
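The templates above produce one line per check, with tab-separated fields of the form KEY::value. To inspect such a line by hand, a small awk filter can split it back into pairs. This is a sketch only; parse_perfdata_line is a made-up helper and is not part of PNP4Nagios.

```shell
#!/bin/sh
# Split lines written with the perfdata file templates into KEY=value pairs.
# Fields are separated by tabs; key and value are separated by "::".
parse_perfdata_line() {
    awk -F '\t' '{
        for (i = 1; i <= NF; i++) {
            sep = index($i, "::")          # position of the first "::"
            if (sep > 0)
                print substr($i, 1, sep - 1) "=" substr($i, sep + 2)
        }
    }'
}

# Example: parse_perfdata_line < /usr/local/pnp4nagios/var/service-perfdata
```

Only the first "::" in each field is treated as the separator, so values that themselves contain "=" or ";" (as performance data does) pass through unchanged.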
Configuration – Add Commands
Add new commands to /usr/local/nagios/etc/objects/commands.cfg
# 'process-host-perfdata-file' command definition
define command{
command_name process-host-perfdata-file
command_line /bin/mv /usr/local/pnp4nagios/var/host-perfdata /usr/local/pnp4nagios/var/spool/host-perfdata.$TIMET$
}
# 'process-service-perfdata-file' command definition
define command{
command_name process-service-perfdata-file
command_line /bin/mv /usr/local/pnp4nagios/var/service-perfdata /usr/local/pnp4nagios/var/spool/service-perfdata.$TIMET$
}
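Both commands do the same thing: move the accumulated perfdata file into the spool directory under a timestamped name, where NPCD picks it up. A shell equivalent, handy for testing the spool by hand, might look like this. spool_perfdata is a hypothetical helper, and date +%s stands in for the $TIMET$ macro.

```shell
#!/bin/sh
# Move a perfdata file into the spool directory under a timestamped name,
# mirroring what the two process-*-perfdata-file commands do.
spool_perfdata() {
    src=$1          # e.g. /usr/local/pnp4nagios/var/service-perfdata
    spool_dir=$2    # e.g. /usr/local/pnp4nagios/var/spool

    # Nothing was collected since the last run: not an error.
    [ -f "$src" ] || return 0

    # date +%s (seconds since the epoch) plays the role of $TIMET$ here.
    mv "$src" "$spool_dir/$(basename "$src").$(date +%s)"
}

# Example:
#   spool_perfdata /usr/local/pnp4nagios/var/host-perfdata /usr/local/pnp4nagios/var/spool
```

Because the file is moved rather than copied, Nagios starts a fresh perfdata file on its next write and never waits on graph processing.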
Configuration - Templates
Add new templates to /usr/local/nagios/etc/objects/templates.cfg
define host {
name host-pnp
action_url /pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=_HOST_
register 0
}
define service {
name srv-pnp
action_url /pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
register 0
}
Verify Configuration
Download the verify_pnp_config Perl script from:
https://docs.pnp4nagios.org/pnp-0.6/verify_pnp_config
Run it, specifying the mode, location of Nagios config file, and the pnp config file:
perl verify_pnp_config --mode=bulk+npcd --config=/usr/local/nagios/etc/nagios.cfg --pnpcfg=/usr/local/pnp4nagios/etc
Starting PNP4Nagios
Start PNP4Nagios as a daemon:
/usr/local/pnp4nagios/bin/npcd -d -f /usr/local/pnp4nagios/etc/npcd.cfg
You'll also need to restart the web server and Nagios:
systemctl restart apache2
systemctl restart nagios
Using PNP4Nagios
Click the graph icon next to hosts and services in the Nagios page to see the graphs of performance data.
PNP4Nagios Demo
Demo
NConf
NConf is a PHP-based web-tool for configuring Nagios. It has features like templates, service to hostgroup assignment, and dependencies.
http://www.nconf.org
Prerequisites
● PHP 5.x or higher
● php-mysql
● php-ldap (only if using LDAP auth)
● MySQL 5.0.2 or higher (with InnoDB!)
● Perl 5.6 or higher
● perl-DBI
● perl-DBD-MySQL
Installation - Prep
Note your webserver document root, user and group:
Apache document root: /srv/www/htdocs/
User: wwwrun
Group: www
Installation Guide
http://www.nconf.org/dokuwiki/doku.php?id=nconf:help:documentation:start:installation
Installation - Extract
Copy the tar file to the document root and expand it.
tar xf nconf-1.3.0-0.tgz
Grant permissions to webserver user:
chown wwwrun:www ./config
chown wwwrun:www ./output
chown wwwrun:www ./static_cfg
chown wwwrun:www ./temp
Installation – Create Database
Start the MySQL prompt then create the database:
mysql -u root -p
If the MySQL install is new, follow these instructions to set the root password:
https://dev.mysql.com/doc/refman/5.0/en/default-privileges.html
mysql> CREATE DATABASE NConf;
mysql> GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, ALTER, DROP ON `NConf`.* TO 'nconfuser'@'localhost' IDENTIFIED BY 'nconfp';
mysql> quit
GUI Installation Method
GUI (easy): http://localhost/nconf/INSTALL.php
1. Enter MySQL information.
2. Enter NConf and Nagios paths.
3. Set up authentication (use defaults).
4. Remove INSTALL, INSTALL.php, UPDATE, and UPDATE.php from NConf directory.
Manual Installation Method
1. cd into the extracted NConf folder.
2. Create database structure: mysql -u nconfuser -D NConf -p < INSTALL/create_database.sql
3. Configure NConf.
   1. Copy the contents of ./config.orig to ./config.
   2. Edit ./config/mysql.php, and set values for DBHOST, DBNAME, DBUSER, and DBPASS.
   3. Edit ./config/nconf.php, and set values for NCONFDIR and NAGIOS_BIN.
4. Remove INSTALL, INSTALL.php, UPDATE, and UPDATE.php from NConf directory.
NConf Nagios Configuration Check
In order to enable NConf to check the Nagios configuration files, make sure your webserver user has access to your Nagios binary, or copy the binary to the '/srv/www/htdocs/nconf/bin/' folder and make the webserver user the owner.
NConf Demo
Demo