Brooklyn highAvailabilityMode: default as AUTO?
I'd like to change the default value of highAvailabilityMode from
DISABLED to AUTO.
Currently, if you start two Brooklyn servers pointing at the same
persisted state (file-system directory or object store's bucket), then
they are independent (because HA is 'disabled' by default). However,
they both write to that same persisted state, which will lead to
surprising behaviour, particularly when a Brooklyn server is next restarted.
Changing to 'AUTO' would (almost entirely) have the same behaviour as we
have currently for a single Brooklyn server. In the case of two servers
pointing at the same persisted state, the second would come up as
'standby', and would be automatically promoted to 'master' if the first
stops or fails.
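
To make the proposed behaviour concrete, here is a minimal sketch in Python of the role decision described above. It is not Brooklyn's actual code: `NodeRecord`, `choose_role` and the field names are hypothetical, and the only value taken from this email is the 30-second default heartbeat timeout.

```python
from dataclasses import dataclass

# Default heartbeat timeout mentioned in this email; everything else is illustrative.
HEARTBEAT_TIMEOUT_SECS = 30.0


@dataclass
class NodeRecord:
    """Hypothetical per-server record kept in the shared persisted state."""
    node_id: str
    role: str              # "master", "standby" or "terminated"
    last_heartbeat: float  # timestamp in seconds


def choose_role(records, now, timeout=HEARTBEAT_TIMEOUT_SECS):
    """Return the role a server starting in AUTO mode would take.

    If some other server is recorded as master and its heartbeat is still
    fresh, the new server becomes standby; otherwise it becomes master.
    """
    for record in records:
        if record.role == "master" and now - record.last_heartbeat < timeout:
            return "standby"
    return "master"
```

With an empty persisted state `choose_role([], now)` gives "master", which is why a single server would behave much as it does today.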
I say "almost entirely":
1. If you run Brooklyn and then kill it (e.g. `kill -9` or turn off the
VM), when you start Brooklyn again it will wait to confirm the previous
server is really dead. It waits for 30 seconds after the server's last
heartbeat, by default.
2. The HA status shows all previous runs of the Brooklyn server (it gets
a new node-id each time it restarts). This list will get longer and
longer if you keep restarting Brooklyn, pointing at the same persisted
state, until you clear out terminated instances from the list (via the
UI or the REST API).
3. The logging at startup will be quite different (e.g. "Brooklyn
initialisation (part two) complete" now means that the server has
finished becoming the 'standby'). If anyone has tools/scripts that
search/parse these logs, then they may be affected.
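
As a rough illustration of point (1), the delay after an unclean shutdown can be thought of as the time left until the old server's last heartbeat is 30 seconds old. The function below is a sketch under that assumption only; `seconds_until_takeover` is a hypothetical name, not something in Brooklyn.

```python
def seconds_until_takeover(last_heartbeat, now, timeout=30.0):
    """Seconds a restarted server would still wait before treating the
    previous server as dead, given that server's last heartbeat time.

    Both timestamps are in seconds; 30.0 is the default quoted in point (1).
    """
    return max(0.0, last_heartbeat + timeout - now)
```

So if Brooklyn is killed and restarted 10 seconds after its last heartbeat, the new server would wait roughly another 20 seconds; if it is restarted more than 30 seconds later, there is no extra wait.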
Note the current behaviour contradicts the docs, which say:
"Brooklyn will automatically run in HA mode if multiple Brooklyn
instances are started pointing at the same persistence store."
P.S. Another option would be to try to fail fast when
highAvailabilityMode is disabled but there is another Brooklyn using the
same persisted state. However, distinguishing that from (1) above is tricky.