bottledaemon stop/start doesn't work if killed elsewhere

On 2018-11-18, Dan Sommers wrote:

> On 11/18/18 1:21 PM, MRAB wrote:
> > On 2018-11-18 17:50, Adam Funk wrote:
> >> Hi,
> >>
> >> I'm using bottledaemon to run a little REST service on a Pi that takes
> >> input from other machines on the LAN and stores stuff in a database.
> >> I have a cron job to call 'stop' and 'start' on it daily, just in case
> >> of problems.
> >>
> >> Occasionally the oom-killer runs overnight and kills the process using
> >> bottledaemon; when this happens (unlike properly stopping the daemon),
> >> the pidfile and its lockfile are left on the filesystem, so the 'stop'
> >> does nothing and the 'start' gets refused because the old pidfile and
> >> lockfile are present.  At the moment, I eventually notice something
> >> wrong with the output data, ssh into the Pi, and rm the two files then
> >> call 'start' on the daemon again.
> >>
> >> Is there a recommended or good way to handle this situation
> >> automatically?
> >>
> > Could you write a watchdog daemon that checks whether bottledaemon is
> > running, and deletes those files if it isn't (or hasn't been for a
> > while)?
>
> What if the oom-killer kills the watchdog?
>
> Whatever runs in response to the start command has to be smarter:  if
> the pid and lock files exist, then check whether they refer to a
> currently running bottledaemon.  If so, then all is well, and refuse to
> start a redundant daemon.  If not, then remove the pid and lock files
> and start the daemon.

I've reported this as an issue on github.  It seems to me that the
'stop' subcommand should delete the pidfile and lockfile if the pid is
no longer running.
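
In the meantime, something like the following could run from the cron
job just before 'start'.  It is only a minimal sketch of the check Dan
describes, not bottledaemon's own code: the function names are made up
for illustration, the pidfile and lockfile paths are whatever your
daemon is configured to use, and it assumes a POSIX system where
sending signal 0 tests whether a pid exists.

```python
import errno
import os


def pid_is_running(pid):
    """Return True if a process with this pid currently exists."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0: existence check only, nothing is delivered
    except OSError as exc:
        # EPERM means the process exists but belongs to another user;
        # anything else (normally ESRCH) means there is no such process.
        return exc.errno == errno.EPERM
    return True


def remove_stale_files(pidfile, lockfile):
    """Delete pidfile and lockfile if the recorded pid is not running.

    Returns True if the files were stale and have been removed, False
    if the daemon appears to be alive or there is no pidfile at all.
    """
    try:
        with open(pidfile) as f:
            pid = int(f.read().strip())
    except FileNotFoundError:
        return False
    except ValueError:
        pid = -1  # unreadable contents: treat the pidfile as stale

    if pid_is_running(pid):
        return False

    for path in (pidfile, lockfile):
        try:
            os.remove(path)
        except FileNotFoundError:
            pass
    return True
```

One caveat: this only checks that *some* process has that pid, so if
the pid has been reused by an unrelated process since the oom-killer
ran, the files will be left alone and 'start' will still be refused.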

It's like a pair of eyes. You're looking at the umlaut, and it's
looking at you.                             ---David St. Hubbins