bottledaemon stop/start doesn't work if killed elsewhere
On 11/18/18 1:21 PM, MRAB wrote:
> On 2018-11-18 17:50, Adam Funk wrote:
>> I'm using bottledaemon to run a little REST service on a Pi that takes
>> input from other machines on the LAN and stores stuff in a database.
>> I have a cron job to call 'stop' and 'start' on it daily, just in case
>> of problems.
>> Occasionally the oom-killer runs overnight and kills the process using
>> bottledaemon; when this happens (unlike properly stopping the daemon),
>> the pidfile and its lockfile are left on the filesystem, so the 'stop'
>> does nothing and the 'start' gets refused because the old pidfile and
>> lockfile are present. At the moment, I eventually notice something
>> wrong with the output data, ssh into the Pi, and rm the two files then
>> call 'start' on the daemon again.
>> Is there a recommended or good way to handle this situation?
> Could you write a watchdog daemon that checks whether bottledaemon is
> running, and deletes those files if it isn't (or hasn't been for a
What if the oom-killer kills the watchdog?
Whatever runs in response to the start command has to be smarter: if
the pid and lock files exist, then check whether they refer to a
currently running bottledaemon. If so, then all is well, and refuse to
start a redundant daemon. If not, then remove the pid and lock files
and start the daemon.
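
A minimal sketch of that check in Python, assuming a POSIX system (as on
the Pi) and a pidfile that holds just the pid as text. The names
pid_is_running and clear_stale_files are made up for illustration; they
are not part of bottledaemon, and the pidfile and lockfile paths are
whatever your setup actually uses.

```python
import os


def pid_is_running(pid):
    """Return True if a process with this pid currently exists."""
    if pid <= 0:
        return False
    try:
        # Signal 0 delivers nothing; it only checks existence and permission.
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def clear_stale_files(pidfile, lockfile):
    """Remove pidfile and lockfile if they do not refer to a live process.

    Returns True if it is safe to start the daemon, False if the pid in
    the pidfile belongs to a running process.
    """
    try:
        with open(pidfile) as f:
            pid = int(f.read().strip())
    except FileNotFoundError:
        pid = None
    except ValueError:
        pid = None  # empty or garbled pidfile, treated as stale

    if pid is not None and pid_is_running(pid):
        return False

    for path in (pidfile, lockfile):
        try:
            os.remove(path)
        except FileNotFoundError:
            pass
    return True
```

Call clear_stale_files() from the cron job or a wrapper script just
before 'start', and only start the daemon when it returns True. Note
that this only checks that some process has that pid; since pids get
reused, a stricter version on Linux could also read /proc/<pid>/cmdline
to confirm the process really is the daemon.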