The FreeBSD Diary

Providing practical examples since 1998


NRPE: Unable to read output - The followup
15 October 2010

Last week, I wrote about a problem where NRPE reported "NRPE: Unable to read output". Harold Paulson encountered the same issue yesterday, and I was able to help him debug it today. We found a solution: use a full path to sudo.

We both had the same situation. Checking from the nagios server:

$ /usr/local/libexec/nagios/check_nrpe2 -H kraken -c check_smartmon_ada8
NRPE: Unable to read output

But when running locally on the Nagios client (NOTE: I amended the shell for nagios to /bin/sh for this test):

# su -m nagios -c 'sudo /usr/local/libexec/nagios/check_smartmon -d /dev/ada8'
OK: device is functional and stable (temperature: 35)|TEMP=35;55;60;

If you don't make that temporary shell adjustment, you'll get this instead:

# su -m nagios -c 'sudo /usr/local/libexec/nagios/check_smartmon -d /dev/ada8'
UNKNOWN: no read permission given

We couldn't figure this out. Then, while looking at /usr/local/etc/nrpe.cfg and searching for sudo, I found:

# *** THIS EXAMPLE MAY POSE A POTENTIAL SECURITY RISK, SO USE WITH CAUTION! ***
# Usage scenario:
# Execute restricted commmands using sudo.  For this to work, you need to add
# the nagios user to your /etc/sudoers.  An example entry for alllowing
# execution of the plugins from might be:
#
# nagios          ALL=(ALL) NOPASSWD: /usr/local/libexec/nagios/
#
# This lets the nagios user run all commands in that directory (and only them)
# without asking for a password.  If you do this, make sure you don't give
# random users write access to that directory or its contents!

# command_prefix=/usr/local/bin/sudo

My clue was the full path. I figured we were onto something. In the meantime, Harold had restarted nrpe and discovered it resolved his particular problem. I then recalled that the same restart had fixed something for me too. I decided to restart my system to reproduce the problem Harold had just fixed. After the reboot, Nagios was indeed reporting the same error messages. It was about that time that I recalled this entry in /etc/crontab, which attempted, but failed, to solve the problem:

# for some reason, nrpe2 doesn't start right on boot
@reboot                root    /bin/sleep 600 && /usr/local/etc/rc.d/nrpe2 restart

I recall now that this never did resolve the issue; I always had to restart nrpe by hand. We were about to confirm our suspicions. First, I mounted procfs so I could use the -e option on ps to view the environment variables of the nrpe process:

# mount -t procfs proc /proc

Then I looked at the existing nrpe process which was producing the error:

$ sudo ps -auxwe -p 1104
USER     PID %CPU %MEM   VSZ   RSS  TT  STAT STARTED      TIME COMMAND
nagios  1104  0.0  0.1 11060  3300  ??  Ss    1:16PM   0:00.02 HOME=/ PATH=/sbin:/bin:/usr/sbin:/usr/bin RC_PID=24 PWD=/ /usr/local/sbin/nrpe2 -d -c /usr/local/etc/nrpe.cfg
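
The PATH is buried in that long line. As a minimal sketch, a small filter can pull it out. The extract_path helper below is a hypothetical name, not part of ps or nrpe, and it assumes the PATH value contains no spaces. The sample line is the one shown above.

```shell
#!/bin/sh
# Hypothetical helper: read "ps -e" style output on stdin and print the
# value of the first PATH= token. Assumes PATH itself contains no spaces.
extract_path() {
    tr ' ' '\n' | sed -n 's/^PATH=//p' | head -n 1
}

# Sample line copied from the ps output of the boot-time nrpe process.
sample='nagios  1104  0.0  0.1 11060  3300  ??  Ss    1:16PM   0:00.02 HOME=/ PATH=/sbin:/bin:/usr/sbin:/usr/bin RC_PID=24 PWD=/ /usr/local/sbin/nrpe2 -d -c /usr/local/etc/nrpe.cfg'

printf '%s\n' "$sample" | extract_path
# prints /sbin:/bin:/usr/sbin:/usr/bin
```

On a live system the same filter could be fed from the ps command used above, for example: sudo ps -auxwe -p 1104 | extract_path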

Then I restarted nrpe and issued the same command for the new process:

$ sudo ps -auxwe -p 3619
sudo: cannot get working directory
USER     PID %CPU %MEM   VSZ   RSS  TT  STAT STARTED      TIME COMMAND
nagios  3619  0.0  0.1 11060  3372  ??  Ss    1:25PM   0:00.00 SUDO_GID=1001 USER=root MAIL=/var/mail/root HOME=/root SUDO_UID=1001 LOGNAME=root USERNAME=root TERM=xterm PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/games:/usr/local/sbin:/usr/local/bin:/home/dan/bin RC_PID=3600 SUDO_COMMAND=/usr/local/etc/rc.d/nrpe2 restart SHELL=/bin/sh SUDO_USER=dan PWD=/proc/1104 /usr/local/sbin/nrpe2 -d -c /usr/local/etc/nrpe.cfg
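
As you can see, /usr/local/bin is not in the PATH in the first output, but it is in the second, because the restarted process inherited the PATH of my login session through sudo. Since sudo lives in /usr/local/bin, the nrpe started at boot cannot find it. This explains why the errors go away after nrpe is restarted by hand.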
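
One way to check this without rebooting is to look up sudo using only the boot-time PATH. This is a rough sketch, not something taken from nrpe itself: found_in_path is a hypothetical helper, and the PATH value is copied from the first ps output. The result depends on where sudo is installed on your system.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a command can be found using only the
# given PATH, with every other environment variable cleared.
# $1 = PATH value to test, $2 = command name
# Note: shell builtins (echo, test, ...) resolve regardless of PATH.
found_in_path() {
    env -i PATH="$1" /bin/sh -c 'command -v "$1" >/dev/null 2>&1' sh "$2"
}

# PATH of the nrpe process started at boot (from the first ps output).
boot_path="/sbin:/bin:/usr/sbin:/usr/bin"

if found_in_path "$boot_path" sudo; then
    echo "sudo is reachable from the boot-time PATH"
else
    echo "sudo is NOT reachable from the boot-time PATH"
fi
```

On a FreeBSD box where sudo was installed from ports into /usr/local/bin, I would expect the second message.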

I then amended nrpe.cfg to contain a full path to sudo:

command[check_smartmon_ada8]=/usr/local/bin/sudo /usr/local/libexec/nagios/check_smartmon -d /dev/ada8

After a reboot of the system, no more nrpe errors. :)

Why? When processes are started from init or cron, they run in a very sparse environment. This is by design. Thus, full paths are often needed, and they are a good choice in any case.
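
To catch other commands with the same weakness, something like the following could list the nrpe.cfg command definitions that do not start with an absolute path. It is only a sketch: find_relative_commands is a hypothetical name, and it assumes each definition sits on one line in the command[name]=... form shown above.

```shell
#!/bin/sh
# Hypothetical helper: print every command[...] definition in an
# nrpe.cfg-style file whose executable is not given as an absolute path.
# $1 = path to the config file
find_relative_commands() {
    awk '/^command\[/ {
        i = index($0, "]=")            # end of the "command[name]" part
        if (i == 0) next               # not a definition, skip it
        cmd = substr($0, i + 2)        # everything after "]="
        sub(/^[ \t]+/, "", cmd)        # drop leading whitespace
        if (substr(cmd, 1, 1) != "/") print
    }' "$1"
}

# Only run against the real file if it exists on this machine.
if [ -r /usr/local/etc/nrpe.cfg ]; then
    find_relative_commands /usr/local/etc/nrpe.cfg
fi
```

Any line it prints is a candidate for the same fix: replace the bare command name with its full path.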
