When the normal way of running checks does not work, passive checks might do the job. I have a script that runs every night and backs up my MySQL database. If this script fails, I would like op5 Monitor or Nagios to send a notification. An active check will not work in this case, or is very cumbersome to get working. A more elegant solution is to let the backup script send its result to op5 Monitor or Nagios. This is where passive checks are handy. With a passive check, the monitoring server trusts that some external program will send in the result. It is possible to set check_freshness so that op5 Monitor or Nagios reacts if nothing has been sent in, typically by setting the status to UNKNOWN or CRITICAL.
In my case the backup script runs on a different host than the op5 Monitor or Nagios server, so I also need a way of sending the passive check data over the network. The recommended way is to use nsca. Read the theory at http://nagios.sourceforge.net/docs/3_0/addons.html#nsca
On my op5 Monitor system the nsca daemon that receives the nsca data was already installed, so I only had to start it:
/etc/init.d/nsca start
These are the steps I took to install it on the client:
1. Download nsca from here.
2. Untar and compile nsca
3. Create an nsca config file, e.g. send_nsca.cfg:
encryption_method=0
With this setting the data is transmitted unencrypted over the network, which might not be what you want. Make sure that the corresponding nsca config file on the Nagios or op5 Monitor host uses the matching method (decryption_method in nsca.cfg).
4. Create a passive check for testing.
# service 'Passive check test'
define service{
    use                     default-service
    host_name               dull
    service_description     Passive check test
    check_command           check_dummy!3 "No Data from passive check"
    max_check_attempts      1
    active_checks_enabled   0
    check_freshness         1
    freshness_threshold     300
    flap_detection_options  n
    contact_groups          it-slav_mail,call_it-slav,it-slav_msn
    stalking_options        n
}
Explanation:
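Active checks are disabled, so the check_command is only run by the freshness check: if no passive check result has been received within 5 minutes (300 seconds), check_dummy is run and sets the status to UNKNOWN (return code 3) with the text "No Data from passive check".
5. Test
-First test, wait 5 minutes and your service "Passive check test" should be in status UNKNOWN
-Second test, create a file passive_check_data_critical with the fields host name, service description, return code and plugin output (the separator is TAB):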
dull Passive check test 2 CRITICAL:test critical
Run the command:
send_nsca -H nagios_host -c send_nsca.cfg < passive_check_data_critical
and the status should change to CRITICAL.
-Third test, create a file passive_check_data_ok (the separator is TAB):
dull Passive check test 0 OK: test ok
Run the command:
send_nsca -H nagios_host -c send_nsca.cfg < passive_check_data_ok
and the status should change to OK.
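The test files are not strictly needed, since send_nsca reads the line from standard input. The following is a minimal sketch of building the line on the fly. The helper names passive_line and send_passive are hypothetical and not part of nsca; nagios_host and send_nsca.cfg are the same placeholders used in the commands above.

```shell
#!/bin/sh
# Hypothetical helpers: build one NSCA service check line and send it.

# Print: host <TAB> service description <TAB> return code <TAB> plugin output
passive_line() {
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4"
}

# Pipe the line to send_nsca, using the same options as in the tests above.
send_passive() {
    passive_line "$@" | send_nsca -H nagios_host -c send_nsca.cfg
}

# Example call (not executed here):
#   send_passive dull "Passive check test" 0 "OK: test ok"
```

Using printf with \t makes the TAB separator explicit, which avoids the common mistake of an editor silently turning tabs into spaces in the data file.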
Now you can set the status of a Nagios or op5 Monitor service with commands that can be used in scripts. In a later article I will describe how I use this in my MySQL backup script.
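As a rough idea of what such a script could look like (this is not the actual backup script, only a sketch under assumptions): run a command, map its exit status to OK or CRITICAL, and send the result. The function name report_result and the variables HOST, SERVICE and SEND_CMD are made up for illustration; the host and service names are the ones from the test service above.

```shell
#!/bin/sh
# Sketch only: run a command and report its outcome as a passive check.
# HOST, SERVICE and SEND_CMD are assumptions for illustration.
HOST="dull"
SERVICE="Passive check test"
SEND_CMD="${SEND_CMD:-send_nsca -H nagios_host -c send_nsca.cfg}"

report_result() {
    # Exit status 0 maps to OK (0), anything else to CRITICAL (2).
    if "$@" >/dev/null 2>&1; then
        code=0; text="OK: command succeeded"
    else
        code=2; text="CRITICAL: command failed"
    fi
    # SEND_CMD is left unquoted on purpose so its options are split into words.
    printf '%s\t%s\t%s\t%s\n' "$HOST" "$SERVICE" "$code" "$text" | $SEND_CMD
}
```

A nightly job would then call report_result followed by the backup command. If the job never runs at all, nothing is sent, and the freshness check takes over after freshness_threshold seconds.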
Links:
- NSCA
- Nagios passive check theory
- op5 Monitor
- An article about monitoring automysqlbackup with passive checks
Troubleshooting hint:
If it does not work, a good first step is to take a look at nagios.log.
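A small sketch for searching the log follows. It assumes that passive results are being logged (the log_passive_checks option in nagios.cfg) and that they show up as "PASSIVE SERVICE CHECK:" lines. The default log path and the function name passive_log_entries are assumptions; adjust the path to your installation.

```shell
#!/bin/sh
# Sketch: show what Nagios logged about one passive service check.
# The default log path below is an assumption; adjust it to your installation.
NAGIOS_LOG="${NAGIOS_LOG:-/usr/local/nagios/var/nagios.log}"

# Print the log lines for passive results received for host $1, service $2.
passive_log_entries() {
    grep -F "PASSIVE SERVICE CHECK: $1;$2;" "$NAGIOS_LOG"
}

# Example call (not executed here):
#   passive_log_entries dull "Passive check test"
```

If nothing shows up after running send_nsca, the result most likely never reached Nagios, so the next places to check are the nsca daemon and the encryption settings on both sides.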