RE: Setting up service dependencies: msg#00408
network.nagios.user
Agreed, creating these dependencies is slow and painful (made less
painful with the help of a sample block of code that I add after each service
that I want to set up a dependency for, then run through each host's config file
to replace this and that). I've barely shown you much of a
single config file for a given host. Make no mistake, I'm monitoring
some 1100+ services, and a large chunk of those (i.e., the majority) are
NRPE-related. I'm *still* working on building out the dependencies, and
expect to be adding quite a few more NRPE services (say, 5-10 NRPE services per
host for another ~60 hosts) before too long. (I could possibly write a
Perl script to add the missing dependencies, but I like your XML suggestion
for easing the pain.)
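A minimal sketch of the kind of generator script mentioned above, written in Python rather than Perl. It assumes that every NRPE-based service on a host should depend on that same host's "NRPE check" service, as in the itdmll01 example quoted further down. The function names and the input dictionary are hypothetical, and the generated text should be reviewed before it goes anywhere near a live config.

```python
def dependency_block(host, dependent_service, master_service="NRPE check",
                     criteria="w,u,c"):
    """Return one servicedependency definition as text."""
    fields = [
        ("dependent_host_name", host),
        ("dependent_service_description", dependent_service),
        ("host_name", host),
        ("service_description", master_service),
        ("execution_failure_criteria", criteria),
        ("notification_failure_criteria", criteria),
    ]
    body = "\n".join(f"    {key:<32}{value}" for key, value in fields)
    return "define servicedependency{\n" + body + "\n}"


def generate(services_by_host, master_service="NRPE check"):
    """Build one dependency per (host, service), always local to that host."""
    blocks = []
    for host, services in services_by_host.items():
        for service in services:
            if service == master_service:
                continue  # the master service does not depend on itself
            blocks.append(dependency_block(host, service, master_service))
    return "\n\n".join(blocks)


if __name__ == "__main__":
    # Hypothetical inventory: host name -> service descriptions on that host.
    print(generate({"itdmll01": ["NRPE check", "Total Users"]}))
```

Because each block names the same host on both sides, the output stays at one dependency per dependent service, with no cross-host pairs.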
My only other wish-list item would be the ability to create a system
template encompassing an integrated host-and-multiple-services definition,
which would make subsequent definitions relatively brief. :) Perhaps
someday, but I'm keen to see whether there are other areas of Nagios that
would reap greater rewards in a shorter time with less effort.
Jeff, I'm not sure if this should be moved over to the nagios-devel
mailing list (which I should think about subscribing to), but if you would care
to constructively hash out an approach/design which could be integrated without
too much grief, then check into it. :)
jc
yeah, I'm willing to deal with the fact that the host may not be really
down if I'm checking HTTP.
Your example is similar to what I tried. But almost all my services use
hostgroup_name rather than host_name to keep things manageable; otherwise I
would have over 500 service objects to manage. In a servicedependency,
substituting hostgroup_name for host_name doesn't seem to work the way it does
in all other object definitions. I would have to create thousands of service
dependencies if I did everything per host rather than per hostgroup.
Oh well, but here's a vote for XML config files which would make
automatic generation of these files easier.
Jeff
It's not A Bad Thing per se; given the example of a host behind a
firewall with a hole at port 80 poked through it, it would certainly have
merit. But in the case where your webserver fails but, say, sshd is
still up, can the host be said to be down?
Just to say that I gave you a servicedependency example, here it
is:
define service{
        host_name                       itdmll01
        use                             icmp
        service_description             NRPE check
        contact_groups                  linux-admins
        check_command                   check_nrpe!check_nrpe_status
        }

define service{
        host_name                       itdmll01
        use                             icmp
        service_description             Total Users
        contact_groups                  linux-admins
        check_command                   check_nrpe!check_users
        }

define servicedependency{
        dependent_host_name             itdmll01
        dependent_service_description   Total Users
        host_name                       itdmll01
        service_description             NRPE check
        execution_failure_criteria      w,u,c
        notification_failure_criteria   w,u,c
        }
Explanation: "NRPE check" is the basic 'is NRPE up'
check. "Total Users" actually kicks off a plugin on the remote
host. "Total Users" is dependent on "NRPE check". If "Total
Users" fails, Nagios checks whether "NRPE check" is up or down. If
the latter is down, that's all I get alerted on. If it's still up,
then I get alerted on the former.
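The behaviour described above can be sketched as a small Python model. This is a simplified illustration of the notification side only, not Nagios code: the function names and the state strings are assumptions, and the criteria string "w,u,c" is taken from the servicedependency definition in this message.

```python
# Map readable state names to the single-letter codes used in the criteria.
STATE_CODES = {"ok": "o", "warning": "w", "unknown": "u", "critical": "c"}


def notification_suppressed(master_state, notification_failure_criteria="w,u,c"):
    """True if the master service's state matches the failure criteria."""
    criteria = {code.strip() for code in notification_failure_criteria.split(",")}
    return STATE_CODES[master_state] in criteria


def alerts(master_state, dependent_state):
    """Return the service descriptions that would produce an alert."""
    result = []
    if master_state != "ok":
        result.append("NRPE check")
    # The dependent service only alerts when its master is healthy.
    if dependent_state != "ok" and not notification_suppressed(master_state):
        result.append("Total Users")
    return result


if __name__ == "__main__":
    print(alerts("critical", "critical"))  # only the master service alerts
    print(alerts("ok", "critical"))        # the dependent service alerts
```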
Does this help?
jc
I think I found a workaround for this. For some servers, like
webservers, most of the services are some form of HTTP request, so I
changed the check_command on the host to do a simple HTTP request. If that
fails, then the host is considered 'down' rather than just using the
standard ping check. If this is 'bad' for some reason, please let me
know.
Second question: is it possible to pass arbitrary ssh
parameters to check_ssh, like "-1" for ssh v1? Doesn't appear so, but
maybe there's a trick?
Jeff
I've looked at the examples that come with nagios and they don't
seem to address the problem of using hostgroups versus hosts. I'm not
using NRPE, but if you have examples using hostgroups, I'd love to see
them.
Jeff
That's different. You'll want to set up a
service dependency, of course.
Er... do you absolutely need to check for static
HTML? I mean, if the PHP fails because the webserver is
down.... Oh, unless you just want to be informed that there's a
PHP problem but that the server itself is still up. Okay, I can
see that.
I've set up various NRPE dependencies at our
site, so if you need a trivial example, I can post one to the
list. Let me know.
jc
That's good to know. I still have this issue, though, because I
have some tests that aren't going to work if simpler tests fail. For
example, a simple test to see if a webserver can serve a static HTML
page can fail, and if that's the case then checking to see if the
webserver will return a PHP page is obviously not going to work.
Jeff
I'm not sure why you're taking this
approach. Out of the box, Nagios will behave as you wish it
to.
If a service check fails, then a host check
is made. If the host check fails, it's flagged as down and,
depending on your particular configuration, you'll receive the
notification for the host being down, not for the N services
you're monitoring on that host. If the host check passes,
then you'll get an alert on the service.
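The decision described above can be written out as a short Python sketch. It is a simplified model of the described behaviour, not how Nagios is implemented; the function name, message strings, and the host name web01 are hypothetical, and real notifications also depend on the particular configuration.

```python
def notifications(host, host_is_up, service_ok):
    """Decide what to notify about after service checks on one host.

    service_ok maps each service description to True (passing) or False.
    """
    failed = [name for name, ok in service_ok.items() if not ok]
    if not failed:
        return []  # nothing failed, so no host check and no alert
    if not host_is_up:
        # One host notification replaces the N service notifications.
        return [f"HOST DOWN: {host}"]
    return [f"SERVICE PROBLEM: {host}/{name}" for name in failed]


if __name__ == "__main__":
    checks = {"HTTP": False, "PHP page": False}
    print(notifications("web01", False, checks))
    print(notifications("web01", True, checks))
```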
jc
Hello all,
I'm trying to set up service
dependencies in Nagios. For example, I don't want alerts about
an HTTP service being down on a host if the host isn't
pingable.
So far, the only way I've found to do
this is to set it up for each host individually. That would be
difficult with the hundreds of hosts that my nagios is currently
monitoring.
I tried creating a servicedependency
object with a hostgroup_name and a dependent_hostgroup_name.
This seems to make each host in hostgroup_name dependent on
EVERY host in dependent_hostgroup_name. For example, if my
hostgroup has 7 hosts in it, it seems it's making each of those
7 hosts dependent on each other, making 7x7=49 dependencies.
(Strangely, Nagios reports 98 dependencies. Why 2x?)
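To make the 7x7 arithmetic concrete, here is a small Python sketch comparing the pairing that seems to be happening with the pairing that is wanted. The host names are hypothetical, and the sketch only counts pairs; it does not try to explain the doubled total that Nagios reports.

```python
from itertools import product

# Hypothetical hostgroup with 7 members.
hosts = [f"web{n:02d}" for n in range(1, 8)]


def cross_group_pairs(hosts):
    """Every host paired with every host: what the hostgroup form seems to do."""
    return list(product(hosts, repeat=2))


def same_host_pairs(hosts):
    """Each host paired only with itself: the per-host dependency wanted."""
    return [(host, host) for host in hosts]


if __name__ == "__main__":
    print(len(cross_group_pairs(hosts)))  # 49
    print(len(same_host_pairs(hosts)))    # 7
```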
Any suggestions on how to make service
dependencies local to each host, rather than to a group of hosts,
without lots and lots of dependency objects?
Jeff