Custom Checks with check_mk_agent

1. Install xinetd and check_mk_agent

Install xinetd first. The check_mk_agent package drops a configuration snippet into /etc/xinetd.d/, which makes xinetd listen on port 6556 for the agent.

2. Write the custom check (Service to check)

The agent-side check (the plugin that runs on the monitored host) can be written in any language, as long as it can print to STDOUT. The output format is quite simple:

<<<check_name>>>
data1 data2 ... dataN
data1 data2 ... dataN
...

check_name is the section identifier. It can be anything, but it has to be enclosed in <<< and >>>. data1 to dataN are the values you want to check: process counts, performance counters, anything you like. On the check_mk server each line is split at whitespace.
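
To make the splitting concrete, here is a minimal Python sketch of what happens to a section once it reaches the monitoring server. The function name parse_agent_output is hypothetical and not part of check_mk; it only mimics the behaviour described above.

```python
def parse_agent_output(text):
    """Group agent output by section and split every data line at whitespace.

    Hypothetical helper for illustration only; check_mk does this internally.
    """
    sections = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("<<<") and line.endswith(">>>"):
            # A new section starts; remember its identifier.
            current = line[3:-3]
            sections.setdefault(current, [])
        elif current is not None and line:
            # Data line: split at whitespace, every value stays a string.
            sections[current].append(line.split())
    return sections


sample = "<<<my_check>>>\nfoo 1 2\nbar 3\n"
print(parse_agent_output(sample)["my_check"])
# prints [['foo', '1', '2'], ['bar', '3']]
```

Note that every value arrives as a string, so numbers have to be converted before they are compared.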

Simple example: check whether bacula-director and bacula-storage-daemon are running (in Perl):

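#!/usr/bin/perl -w

use strict;

# ps prints a header line followed by one command line per process.
my @instances = `ps o cmd ax`;
my %count;
foreach my $i (@instances){
        # Count storage daemon processes.
        if($i =~ m#bacula-sd#){
                $count{"storage_daemon"}++;
        }
        # Count director processes.
        if($i =~ m#bacula-dir#){
                $count{"director"}++;
        }
}

# Section header first, then one "name count" line per daemon found.
print "<<<bacula_system>>>\n";
foreach my $k (keys %count){
        print "$k $count{$k}\n";
}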

Output:

<<<bacula_system>>> 
storage_daemon 1 
director 1

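One storage daemon process and one director process are running. Perfect! Drop this script into the agent's plugin directory (usually /usr/lib/check_mk_agent/plugins; the MK_LIBDIR setting in the agent script shows the exact location) and make it executable.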
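
Since the plugin can be written in any language, here is a rough Python equivalent of the Perl script, as a sketch rather than a drop-in replacement. The names PATTERNS, count_instances, format_section and list_command_lines are made up for this example. Unlike the Perl version it also prints a daemon with a count of 0, and it treats a failing ps call as "no processes found", which is an assumption you may want to handle differently.

```python
#!/usr/bin/env python3
import subprocess

# Substring to look for in the command line -> item name in the section.
PATTERNS = {"bacula-sd": "storage_daemon", "bacula-dir": "director"}


def count_instances(command_lines):
    """Count how many command lines match each daemon pattern."""
    counts = {item: 0 for item in PATTERNS.values()}
    for cmd in command_lines:
        for pattern, item in PATTERNS.items():
            if pattern in cmd:
                counts[item] += 1
    return counts


def format_section(counts):
    """Render the counts in the agent output format."""
    lines = ["<<<bacula_system>>>"]
    for item in sorted(counts):
        lines.append("%s %d" % (item, counts[item]))
    return "\n".join(lines)


def list_command_lines():
    """Return the command lines of all processes, or [] if ps fails."""
    try:
        output = subprocess.check_output(
            ["ps", "o", "cmd", "ax"],
            universal_newlines=True,
            stderr=subprocess.DEVNULL,
        )
    except (OSError, subprocess.CalledProcessError):
        # Assumption for this sketch: no ps output means nothing is counted.
        return []
    return output.splitlines()


if __name__ == "__main__":
    print(format_section(count_instances(list_command_lines())))
```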

3. The check_mk server part

Now we have to create the server-side counterpart. This has to be written in Python, because check_mk uses introspection extensively. In the home directory of your check_mk site user, create a file in local/share/check_mk/checks whose name is exactly the section identifier. In the example given: local/share/check_mk/checks/bacula_system

The first thing we need is an entry in the dictionary check_info with at least these keys:

  • check_function: the function that actually performs the check
  • inventory_function: the inventory function (called by WATO, the Web Administration)
  • service_description: name of the service, can be a format string

Example:

#!/usr/bin/python 
 
def inventory_bacula_instances(info): 
    inventory = [] 
    inventory.append( ("storage_daemon", None) ) 
    inventory.append( ("director", None) ) 
    return inventory 
 
def check_bacula_instances(item, params, info):
    # A daemon that is not running has no line in the agent output,
    # so the count defaults to 0.
    count = 0
    for each in info:
        if each[0] == item:
            count = int(each[1])  # values arrive as strings

    perfdata = [ (item, count) ]
    if count == 0:
        retstring = "No instance found for %s" % item
        return (2, retstring, perfdata)

    retstring = "%d instances found for %s" % (count, item)
    return (0, retstring, perfdata)
 
check_info["bacula_system"] = { 
    "check_function":       check_bacula_instances, 
    "inventory_function":   inventory_bacula_instances, 
    "service_description":  'Bacula: %s', 
    "has_perfdata":         True, 
}

3.1 The inventory function

The inventory function returns a list of tuples with the items to check, in our case the items storage_daemon and director reported by our Perl script. Each tuple has the form (item, params). The tuples are fed to the check_function, in our case check_bacula_instances(item, params, info).
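
The example inventory function hard-codes its two items. As an alternative sketch, the items could be derived from the agent data instead. This assumes that info arrives as a list of whitespace-split lines, as the check function in the example treats it; the name inventory_from_info is hypothetical.

```python
def inventory_from_info(info):
    """Return one (item, params) tuple per distinct first column in info."""
    seen = []
    for line in info:
        # Skip empty lines and items that were already found.
        if line and line[0] not in seen:
            seen.append(line[0])
    # No check parameters are needed here, so params is None.
    return [(item, None) for item in seen]


sample_info = [["storage_daemon", "1"], ["director", "1"]]
print(inventory_from_info(sample_info))
# prints [('storage_daemon', None), ('director', None)]
```

The trade-off: a daemon that is not running while the inventory is taken produces no line in the Perl script's output and would therefore not be discovered. The hard-coded list avoids that.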

4. The check_function

The check_function does the actual work. It returns a tuple (state, text, perfdata). The possible values for state are:

  • 0: OK
  • 1: Warning
  • 2: Critical
  • 3: Unknown

The parameters:

  • item: The name of the item to check, i.e. the first element of the tuple returned by the inventory_function. In our case “storage_daemon” or “director”
  • params: The second element of the tuple returned by the inventory_function, intended for check parameters such as thresholds. In our case None
  • info: a list with one entry per line of the agent section, each line split at whitespace into a list of strings. In the example given something like this:
[['storage_daemon', '1'], ['director', '1']]

So the check_function has to search info for the line whose first element equals item; the second element of that line is the count of processes, as a string.
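
To illustrate how the three parameters could work together, here is a hedged sketch of a check function that uses params for thresholds. The layout of params as a (warn, crit) tuple of minimum process counts is an assumption made for this example, as is the function name check_with_thresholds. The inventory function would then have to return tuples such as ("director", (2, 1)) instead of ("director", None).

```python
def check_with_thresholds(item, params, info):
    """Return (state, text, perfdata) for item.

    params is assumed to be (warn, crit): minimum process counts below
    which the state becomes Warning or Critical.
    """
    warn, crit = params
    count = 0
    for line in info:
        if line[0] == item:
            count = int(line[1])  # agent values arrive as strings

    perfdata = [(item, count)]
    if count < crit:
        text = "%d instances of %s (fewer than %d)" % (count, item, crit)
        return (2, text, perfdata)
    if count < warn:
        text = "%d instances of %s (fewer than %d)" % (count, item, warn)
        return (1, text, perfdata)
    return (0, "%d instances of %s" % (count, item), perfdata)


info = [["storage_daemon", "1"], ["director", "1"]]
print(check_with_thresholds("director", (2, 1), info))
```

With the thresholds (2, 1), a single director process yields state 1 (Warning), because one process is enough to avoid Critical but fewer than the two required for OK.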

5. Configure the check

  1. Log in to WATO, click WATO-Configuration->Hosts->New Host.
  2. Enter hostname
  3. Click Save & Test, check that the agent is available
  4. Click Save & Exit
  5. Click Save & go to Services
  6. Save and activate the new configuration
  7. Have FUN!

6. Debugging

Sprinkle the Python script with “print <somethingorother>” and call

$ cmk <hostname>

from the command line.
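
The check logic can also be exercised without cmk. The following is only a sketch: it assumes that a check file needs nothing from check_mk except the global dictionary check_info, which is true for the example in this article but not necessarily for other checks. The helper names load_check and run_check and the embedded demo check are made up for illustration.

```python
# Stand-in for the content of a check file such as
# local/share/check_mk/checks/bacula_system.
CHECK_SOURCE = '''
def inventory_demo(info):
    return [(line[0], None) for line in info]

def check_demo(item, params, info):
    for line in info:
        if line[0] == item:
            text = "%s instances found for %s" % (line[1], item)
            return (0, text, [(item, int(line[1]))])
    return (2, "No instance found for %s" % item, [(item, 0)])

check_info["demo"] = {
    "check_function": check_demo,
    "inventory_function": inventory_demo,
    "service_description": "Demo: %s",
    "has_perfdata": True,
}
'''


def load_check(source):
    """Execute check source with a stand-in check_info and return that dict."""
    namespace = {"check_info": {}}
    exec(source, namespace)
    return namespace["check_info"]


def run_check(check, info):
    """Run the inventory, then the check function for every item found."""
    results = {}
    for item, params in check["inventory_function"](info):
        results[item] = check["check_function"](item, params, info)
    return results


checks = load_check(CHECK_SOURCE)
sample_info = [["storage_daemon", "1"], ["director", "1"]]
for item, result in run_check(checks["demo"], sample_info).items():
    print(checks["demo"]["service_description"] % item, result)
```

To try this with a real check file, pass its content to load_check, for example load_check(open(path).read()), and build sample_info by hand from the agent output.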

That's it, I'm done 🙂