Saturday, 4 July 2020

Writing a Bash Check_MK Check to monitor RHEL 6 LVS Hosts

The check below monitors the number of hosts balanced by Linux Virtual Server (LVS) on RHEL 6. The logic is as follows:
  • Find the active server (clustat -l):
    • Correct number of hosts: OK
    • Too many hosts: Unknown
    • Lost hosts: Critical
  • Find the passive server and report that it is passive: OK
  • If the server is neither active nor passive, then the LVS server has died: Critical
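
The decision list above can be sketched as a small function that is kept separate from ipvsadm, so it can be exercised without a live LVS. This is only an illustration: classify_lvs and its three arguments are hypothetical names, and the "3 lines means headers only" rule is taken from the script in this post.

```shell
#!/bin/bash
# Hypothetical helper (not part of check_lvs.sh):
#   classify_lvs LINES BALANCED EXPECTED
# LINES    = number of lines printed by "ipvsadm -l"
# BALANCED = number of balanced hosts found
# EXPECTED = number of hosts the check was told to expect
# Prints a state name and returns the matching Nagios exit code.
classify_lvs() {
    local lines=$1 balanced=$2 expected=$3
    if [ "$lines" -lt 3 ]; then
        echo "CRITICAL"; return 2   # fewer lines than the headers: LVS is down
    elif [ "$lines" -eq 3 ]; then
        echo "OK"; return 0         # headers only: passive member
    elif [ "$balanced" -eq "$expected" ]; then
        echo "OK"; return 0         # active, correct number of hosts
    elif [ "$balanced" -lt "$expected" ]; then
        echo "CRITICAL"; return 2   # active, but hosts were lost
    else
        echo "UNKNOWN"; return 3    # active, more hosts than expected
    fi
}
```

Calling classify_lvs 6 3 4, for example, prints CRITICAL and returns 2, because three hosts were found where four were expected.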

$ cat check_lvs.sh
#!/bin/bash
# check_lvs.sh
# Configure the expected number of hosts in mrpe.cfg with the "-h" flag.

# Total number of lines printed by ipvsadm; 3 lines means headers only.
STATUS=$(ipvsadm -l | wc -l)
# Number of lines containing "->". The ipvsadm column header typically
# contains "->" as well, so pick the "-h" value to match this count.
BALANCEDHOSTS=$(ipvsadm -l | grep -c -- "->")

# Read the expected host count from the "-h" flag.
while getopts h: option
do
    case "${option}" in
        h) CHECKEDHOSTS=${OPTARG};;
    esac
done

# A passive cluster member: has no hosts balanced by it
if [ "$STATUS" -eq 3 ]
    then
        echo "OK: This LVS is the passive server in the cluster."
        exit 0

# OK: An active cluster member with the correct number of hosts
elif [ "$STATUS" -gt 3 ] && [ "$BALANCEDHOSTS" -eq "$CHECKEDHOSTS" ]
    then
        echo "OK: This LVS is the active server in the cluster, and has the correct number of hosts."
        exit 0

# CRITICAL: An active cluster member with fewer than the right number of hosts
elif [ "$STATUS" -gt 3 ] && [ "$BALANCEDHOSTS" -lt "$CHECKEDHOSTS" ]
    then
        echo "CRITICAL: This LVS has lost hosts."
        exit 2

# CRITICAL: LVS has died
elif [ "$STATUS" -lt 3 ]
    then
        echo "CRITICAL: This LVS is down."
        exit 2

# UNKNOWN: An active cluster member with more than the right number of hosts
else
        echo "UNKNOWN - Nagios expected $CHECKEDHOSTS hosts, but found $BALANCEDHOSTS hosts."
        exit 3
fi


The script reads its input with bash's getopts builtin, which places the value of each option in OPTARG. The "-h" flag tells the check how many hosts the LVS is expected to be balancing.
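
As a minimal sketch of that option handling, the function below parses "-h" with getopts and also rejects a missing or non-numeric value, which the script above does not do. The name parse_expected_hosts is hypothetical, and returning 3 (UNKNOWN) on bad input is an assumption about how a misconfigured check should report, not something taken from the original script.

```shell
#!/bin/bash
# Hypothetical helper (not part of check_lvs.sh):
#   parse_expected_hosts "$@"
# Prints the value passed to -h. Returns 3 (UNKNOWN) if -h is missing,
# has no argument, or is not a non-negative integer.
parse_expected_hosts() {
    local OPTIND=1          # restart option parsing on every call
    local option expected=""
    # The leading ":" makes getopts report errors through $option
    # instead of printing its own messages.
    while getopts ":h:" option; do
        case "$option" in
            h) expected=$OPTARG ;;
            *) return 3 ;;  # unknown flag, or -h without a value
        esac
    done
    case "$expected" in
        ''|*[!0-9]*) return 3 ;;   # empty or contains a non-digit
    esac
    echo "$expected"
}
```

In the check itself this could be used as CHECKEDHOSTS=$(parse_expected_hosts "$@") followed by an UNKNOWN exit on failure. The matching mrpe.cfg entry would then be a single line of the form "service name, command, arguments", for instance LVS_Hosts /usr/local/bin/check_lvs.sh -h 4, where both the service name and the path are placeholders.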








