Dear All,
monit is generally awesome (and documentation is really great btw) - but...
I am having a problem with monit that occurs in both v4.8.1 and v4.9,
and the changelog (http://www.tildeslash.com/monit/CHANGES.txt)
appears to specifically state that the bug was fixed in v4.8:
("Removed a feature introduced in 4.7 which tested that a check-file,
check-directory or check-fifo actually refered to an existing object
of that type. Monit should not require these file objects to exist at
startup.")
Specifically I have a service that is mode manual (and in a separate
group) and hence the pid file for the service does not exist when
monit starts up (the group in question is not (and should not be)
started by default when the daemon starts up). But although my
monitrc file passes the validation check, whenever I try to start
monit using it with the check file statement below included it exits
with the error:
Error: the path '/var/run/bfrt-monit/pid-files/testMat.pid' used in
the TIMESTAMP statement does not exist 'shutdown'
Although the cause seems pretty obvious, to be sure I tried replacing
the pid file in the check file statement with an arbitrary text file
that does exist when monit is started. Monit then starts up just fine
(just as it does if the check file statement is commented out).
Given the wording of the changelog for v4.8 - and what would seem to
be logical behaviour (why should monit check for files that are mode
manual until the group has been started?) this would appear to be a
bug.
Am I missing something here or do people agree this is not desired
behaviour (if so what is the mechanism for filing a bug report)?
-Alex
check file statement from monitrc file:
------------------------------------------------------------
check file testMat_pidfile with path /var/run/bfrt-monit/pid-files/testMat.pid
  if changed timestamp 3 times within 12 cycles then exec "/usr/local/bfrt-monit/scripts/nodeControl warn"
  mode manual
  group testGroup
------------------------------------------------------------
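Until the startup check itself is changed, one possible workaround is to make sure the path exists before monit parses the control file. The following is only a sketch under that assumption, not a fix for the behaviour described above. `ensure_file` is a hypothetical helper name, and an empty pid file may or may not be acceptable to the service that normally writes it.

```shell
#!/bin/sh
# Workaround sketch: create the monitored path (empty) if it is missing,
# so that monit finds it at startup. ensure_file is a hypothetical helper.
ensure_file() {
    path="$1"
    # Create the parent directory if needed.
    mkdir -p "$(dirname "$path")" || return 1
    # Create an empty file only when nothing exists at the path yet;
    # an existing file is left untouched.
    [ -e "$path" ] || : > "$path"
}

# Possible use before starting monit (path taken from the check above):
#   ensure_file /var/run/bfrt-monit/pid-files/testMat.pid && monit
```

Note that creating the file changes its timestamp once, which counts towards the "changed timestamp" test only if it happens while the file is being monitored.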
--
To unsubscribe:
http://lists.nongnu.org/mailman/listinfo/monit-general
Thread at a glance:
Previous Message by Date:
Re: Monit refuse to start/stop a java process
Hi,
Sorry for disturbing you. I have found my problem. It was caused by the configuration of my Java program, which needs some parameters from my personal environment that are not available when running as root.
2007/9/19, pierrick grasland <pierrick.grasland@xxxxxxxxx>:
Hi,
I'm testing monit for a project, and I'm trying to monitor a Java process under Debian etch (monit 4.8.1). I'm currently using this configuration for my service:

check process easip with pidfile /var/run/easip.pid
  group easip_gr
  start program = "/home/pierrick/easip/easip.sh start"
  stop program = "/home/pierrick/easip/easip.sh stop"
  # if 3 restarts within 5 cycles then timeout # deactivated for test
  depends on easip_file

check file easip_file with path /home/pierrick/easip/easip.sh
  group easip_gr
  if failed permission 755 then unmonitor

with this script, in order to have a pidfile:

#!/bin/sh
easipStart() {
  echo $$ > /var/run/easip.pid
  exec 2>&1 java -jar /home/pierrick/easip/start.jar
}
easipStop() {
  kill -15 `cat /var/run/easip.pid`
  echo "" >/var/run/easip.pid
}
case $1 in
  start)
    easipStart
    ;;
  stop)
    easipStop
    ;;
  restart)
    easipStop
    sleep 3
    easipStart
    ;;
  *)
    echo "usage: easip.sh {start|stop|restart}"
    ;;
esac

With this configuration, I can check the status of my process, i.e. running or not, with the web interface.
Process 'easip'
  status                  running
  monitoring status       monitored
  pid                     23418
  parent pid              19170
  uptime                  1m
  childrens               0
  memory kilobytes        76484
  memory kilobytes total  76484
  memory percent          3.6%
  memory percent total    3.6%
  cpu percent             0.0%
  cpu percent total       0.0%
  data collected          Wed Sep 19 15:41:38 2007

In order to verify my script (needed for the pidfile), I monitor the file:

File 'easip_file'
  status                  accessible
  monitoring status       monitored
  permission              755
  uid                     0
  gid                     1000
  timestamp               Wed Sep 19 15:10:48 2007
  size                    389 B
  data collected          Wed Sep 19 15:41:38 2007

So, as you can see, I have full access to my script and monit can check the pid. But when I try to stop / start my process with the web interface, monit does not carry out these operations:
Sep 19 15:47:08 localhost monit[23566]: 'easip' stop: /home/pierrick/easip/easip.sh
Sep 19 15:47:29 localhost monit[23566]: 'easip' failed to stop
Sep 19 15:50:29 localhost monit[23566]: 'easip' process is not running
Sep 19 15:50:29 localhost monit[23566]: 'easip' trying to restart
Sep 19 15:50:29 localhost monit[23566]: monit: pidfile `/var/run/easip.pid' does not contain a valid pidnumber
Sep 19 15:50:29 localhost monit[23566]: monit: pidfile `/var/run/easip.pid' does not contain a valid pidnumber
Sep 19 15:50:29 localhost monit[23566]: 'easip' start: /home/pierrick/easip/easip.sh
Sep 19 15:50:29 localhost monit[23566]: monit: pidfile `/var/run/easip.pid' does not contain a valid pidnumber
Sep 19 15:50:29 localhost last message repeated 2 times
Sep 19 15:50:29 localhost monit[23566]: 'easip' failed to start
Sep 19 15:50:30 localhost monit[23566]: monit: pidfile `/var/run/easip.pid' does not contain a valid pidnumber

My script works fine and I don't have any clue about what to do next in order to make monit do its job.
Thank you,
--
Grasland Pierrick
--
Grasland Pierrick
ENSSAT - LSI 3
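Separately from the environment problem reported as the cause, the "does not contain a valid pidnumber" lines are consistent with the stop function leaving an empty line in the pid file. The following is a sketch of a stop function that removes the file instead, assuming that removing it is acceptable. The `PIDFILE` variable is introduced here for illustration and defaults to the path from the script above.

```shell
#!/bin/sh
# Sketch only: PIDFILE is a hypothetical variable, defaulting to the
# pid file path used in the original script.
PIDFILE="${PIDFILE:-/var/run/easip.pid}"

easipStop() {
    if [ -f "$PIDFILE" ]; then
        pid=$(cat "$PIDFILE")
        case "$pid" in
            ''|*[!0-9]*) ;;                      # empty or non-numeric: nothing to signal
            *) kill -15 "$pid" 2>/dev/null ;;    # ask the process to terminate
        esac
    fi
    # Remove the pid file rather than writing an empty line into it.
    rm -f "$PIDFILE"
}
```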
Next Message by Date:
monit+heartbeat and drbd
After reading the 'MONIT WITH HEARTBEAT' section of the Monit documentation, I made
a few small changes to the '/etc/init.d/monit-node1' example script:
#!/bin/bash
#
# script for starting/stopping all services for a given cluster node
#
# usage: monit-hb groupname {start|stop}
group="$1"
prog="/usr/local/bin/monit -g $group"
start()
{
    echo -n "Starting Monit services for group '$group':"
    $prog start all
    RETVAL=$?
    echo
}
stop()
{
    echo -n "Stopping Monit services for group '$group':"
    $prog stop all
    RETVAL=$?
    echo
}
# the action is the second argument; the first is the group name
case "$2" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    *)
        echo "Usage: $0 groupname {start|stop}"
        RETVAL=1
esac
exit $RETVAL
Now we can use this script on both nodes without any changes. I called
it 'monit-hb'.
To use this form of the script, its call in heartbeat's
'/etc/ha.d/haresources' must be changed as follows:
node1 IPaddr::172.16.100.1 monit-hb::node1
node2 IPaddr::172.16.100.2 monit-hb::node2
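As far as I know, heartbeat runs a resource script with the haresources parameters first and the action last, so for the lines above the script would be called as 'monit-hb node1 start'. A minimal dry-run sketch of that dispatch follows. `monit_hb_cmd` is a hypothetical helper that only prints the monit command it would run.

```shell
#!/bin/sh
# Dry-run sketch: print the monit command for a given group and action.
# monit_hb_cmd is a hypothetical name; the monit path is the one used above.
monit_hb_cmd() {
    group="$1"     # first argument: group name from haresources
    action="$2"    # second argument: action appended by heartbeat
    case "$action" in
        start|stop)
            echo "/usr/local/bin/monit -g $group $action all"
            ;;
        *)
            echo "Usage: monit-hb groupname {start|stop}" >&2
            return 1
            ;;
    esac
}

monit_hb_cmd node1 start
```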
----------
I do not understand why the 'INIT SUPPORT' section says that we must
use the '-I' parameter:
mo:2345:respawn:/usr/local/bin/monit -Ic /etc/monitrc
but the 'MONIT WITH HEARTBEAT' section starts 'monit' in daemon
mode:
mo:2345:respawn:/usr/local/bin/monit -d 10 -c /etc/monitrc -g local
Another question: does the '-d 10' option mean that the 'monit' daemon restarts
after 10 seconds, or is it the daemon poll interval (like the 'set daemon' option)?
----------
I also want to use 'drbd' with 'monit+heartbeat'. I found some information
about this configuration in your mailing list archive, but without the content of
the 'ha-fs' script mentioned there. I am sending you my variant, and I think
it would be worth adding information about 'monit+heartbeat+drbd' to your
documentation.
My 'monitrc' variant:
check device fs with path /dev/drbd0
  start program = "/etc/ha.d/resource.d/ha-fs start"
  stop program = "/etc/ha.d/resource.d/ha-fs stop"
  if failed permission 660 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor
  if space usage > 80% then alert
  if space usage > 99% then stop
  mode manual
  group node1
check process postgresql with pidfile /var/run/postgresql.pid
  start program = "/etc/init.d/postgresql start"
  stop program = "/etc/init.d/postgresql stop"
  depends fs
  alert foo@bar
  mode manual
  group node1
check process app-server with pidfile /var/run/app-server.pid
  start program = "/etc/init.d/app-server start"
  stop program = "/etc/init.d/app-server stop"
  depends postgresql
  alert foo@bar
  mode manual
  group node1
#
# node2 services
#
check process asterisk with pidfile /var/apache/logs/httpd.pid
  start program = "/etc/init.d/apache start"
  stop program = "/etc/init.d/apache stop"
  alert foo@bar
  mode manual
  group node2
and the content of '/etc/ha.d/resource.d/ha-fs' with my values:
#!/bin/bash
#
# script for starting/stopping the services underlying the 'fs' device
#
scrd="/etc/ha.d/resource.d"
start()
{
    echo -n "Starting 'fs' device services:"
    $scrd/drbddisk pgdisk start && \
    $scrd/Filesystem /dev/drbd0 /mnt/cluster reiserfs start
    RETVAL=$?
    echo
}
stop()
{
    echo -n "Stopping 'fs' device services:"
    $scrd/Filesystem /dev/drbd0 /mnt/cluster reiserfs stop
    $scrd/drbddisk pgdisk stop
    RETVAL=$?
    echo
}
# dispatch on the action argument, not on the argument count
case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    *)
        echo "Usage: $0 {start|stop}"
        RETVAL=1
esac
exit $RETVAL
I have not tested this configuration yet but will do so soon.
I would be very grateful for your opinion on my proposal and
any fixes.
Thank you very much.
Aleksandr Shubnik
Previous Message by Thread:
custom check program
Hello list!
I have one small question about using 'monit'.
As I understand from the documentation, I currently cannot use
my own program/script in a 'monit' check procedure.
Am I right?
Thank you very much.
Aleksandr Shubnik
Next Message by Thread:
Re: check file timestamp bug?
Hi,
I confirmed this behavior on monit 4.9 on Linux 2.6.x
and monit 4.10.beta1
Using the original block from your email:
# monit -v
monit: Debug: Adding credentials for user 'ogg'.
/etc/monitrc:247: Error: the executable does not exist
'/usr/local/bfrt-monit/scripts/nodeControl'
/etc/monitrc:247: Error: the path '/var/run/bfrt-monit/pid-files/testMat.pid'
used in the TIMESTAMP statement does not exist 'warn'
Using this:
check file testMat_pidfile with path /var/run/monitbug.pid
if changed timestamp 3 times within 12 cycles then exec "/usr/bin/yes"
mode manual
group testGroup
# monit -V
This is monit version 4.10-beta1
Copyright (C) 2000-2007 by the monit project group. All Rights Reserved.
# which yes
/usr/bin/yes
# monit -v
monit: Debug: Adding credentials for user 'ogg'.
/etc/monitrc:247: Error: the path '/var/run/monitbug.pid' used in the
TIMESTAMP statement does not exist '/usr/bin/yes'
So the bug appears to still be there.
One thing that monit -v doesn't seem to print is the version info...
Alex Stewart wrote:
Dear All,
Am I missing something here or do people agree this is not desired
behaviour (if so what is the mechanism for filing a bug report)?
-Alex
check file statement from monitrc file:
------------------------------------------------------------
check file testMat_pidfile with path /var/run/bfrt-monit/pid-files/testMat.pid
if changed timestamp 3 times within 12 cycles then exec
"/usr/local/bfrt-monit/scripts/nodeControl warn"
mode manual
group testGroup
------------------------------------------------------------