[BlueOnyx:15017] Re: Vacation blocked by cced down

Michael Stauber mstauber at blueonyx.it
Wed Mar 26 20:06:48 -05 2014


Hi Eric,

> Here's a nagios plugin for checking cced for those that run nagios:
> 
> #!/usr/bin/perl -w -I/usr/sausalito/perl
> 
> use strict;
> use lib qw( /usr/sausalito/perl );
> use CCE;
> 
> 
> my $cce = new CCE;
> $cce->connectuds();
> my ($oid) = $cce->find("System");
> $cce->bye('SUCCESS');
> 
> if (defined $oid && $oid == 1) {
>   print "OK\n";
>   exit(0);
> } else {
>   print "OID == " . (defined $oid ? $oid : 'undef') . "\n";
>   exit(2);
> }

Thank you. Was thinking about something like that as well.

Yeah, that'll do fine to detect if CCEd is stopped or crashed.

But if CCEd is hanging and this checker is run via a cronjob, it'll just
hang on the connectuds() call as well.
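
A minimal sketch of how such a checker could protect itself against that,
assuming Perl's alarm() is able to interrupt the blocked call (it may not
break every kind of hang). The helper name run_with_timeout and the
10 second limit are made up for illustration; the CCE calls in the comment
are the ones from the script quoted above.

```perl
use strict;
use warnings;

# Run $code, giving up after $timeout seconds.
# Returns (1, $result) on success, (0, $error) on timeout or any other die.
sub run_with_timeout {
    my ($timeout, $code) = @_;
    my $result;
    my $ok = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm($timeout);
        $result = $code->();
        alarm(0);
        1;
    };
    alarm(0);    # make sure no alarm is left pending
    return (1, $result) if $ok;
    my $err = $@ || "unknown error\n";
    chomp $err;
    return (0, $err);
}

# Hypothetical use in the plugin (needs the CCE module on a BlueOnyx box):
#   my ($ok, $oid) = run_with_timeout(10, sub {
#       my $cce = new CCE;
#       $cce->connectuds();
#       my ($found) = $cce->find("System");
#       $cce->bye('SUCCESS');
#       return $found;
#   });
#   unless ($ok) { print "CRITICAL: cced check failed: $oid\n"; exit(2); }
```

On a timeout the second return value carries the error text, so the
plugin can report it and exit with 2 instead of hanging.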

I was thinking about a two-stage approach. The first stage checks for 'cced'
or 'pperld' processes that sit around stuck in D state (uninterruptible
sleep). If none are present, the second stage runs a Perl script similar to
yours to check whether CCEd actually responds to queries as expected within
a reasonable time.
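
The first stage could look roughly like the sketch below. It assumes a
Linux procps 'ps' that accepts "-e -o stat= -o comm=", and the function
name find_stuck is made up for illustration. The function takes the ps
output as text, so it can be tried out without a hung CCEd at hand.

```perl
use strict;
use warnings;

# Given ps output with one "STAT COMMAND" pair per line, return the names
# of watched processes whose state starts with D.
sub find_stuck {
    my ($ps_output, @names) = @_;
    my %watch = map { $_ => 1 } @names;
    my @stuck;
    for my $line (split /\n/, $ps_output) {
        my ($stat, $comm) = $line =~ /^\s*(\S+)\s+(\S+)/ or next;
        push @stuck, $comm if $watch{$comm} && $stat =~ /^D/;
    }
    return @stuck;
}

# Stage 1 on a live system (ps syntax is an assumption, see above):
#   my $ps    = `ps -e -o stat= -o comm=`;
#   my @stuck = find_stuck($ps, 'cced', 'pperld');
#   if (@stuck) { print "CRITICAL: stuck in D state: @stuck\n"; exit(2); }
# Stage 2: only if nothing is stuck, run the actual CCE query with a
# time limit around it.
```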

Another approach I discussed with Greg will be used in the new GUI.
Instead of showing the red text on white background that CCEd is down
(when it is down), we initiate a restart of it instead, refresh the page
and try again. Only if all of that fails will it show the usual "Doh!" page.

-- 
With best regards

Michael Stauber


