Re: Brooklyn highAvailabilityMode: default as AUTO?


Thanks all,

Here's the PR to change the default behaviour: https://github.com/apache/brooklyn-server/pull/965

I'll add more info to the docs shortly.

Aled


On 22/05/2018 21:12, Geoff Macartney wrote:
+1 sounds like a change that's needed to match the intended behaviour
anyway, as described in the docs.  We should update the docs as part of
this to include your explanation above, Aled, of the details of the
behaviour.

regards
Geoff

On Tue, 22 May 2018 at 15:27 Thomas Bouron <thomas.bouron@xxxxxxxxxxxxxxxxx>
wrote:

+1, sounds sensible to me

Best.

On Tue, 22 May 2018 at 14:51 Duncan Grant <duncan.grant@xxxxxxxxxxxx>
wrote:

Aled,

+1 sounds like a sensible plan

Duncan

On Tue, 22 May 2018 at 13:59 Aled Sage <aled.sage@xxxxxxxxx> wrote:

Hi all,

I'd like to change the default value of highAvailabilityMode from
DISABLED to AUTO.

Currently, if you start two Brooklyn servers pointing at the same
persisted state (file-system directory or object store's bucket), then
they are independent (because HA is 'disabled' by default). However,
they both write to that same persisted state, which will lead to
surprising behaviour, particularly when a Brooklyn server is next
restarted.

Changing to 'AUTO' would (almost entirely) have the same behaviour as we
have currently for a single Brooklyn server. In the case of two servers
pointing at the same persisted state, the second would come up as
'standby', and would be automatically promoted to 'master' if the first
stops or fails.
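
To illustrate the 'AUTO' behaviour described above, here is a minimal Python sketch. It is not Brooklyn's actual implementation: `Node`, `choose_role` and `promote_if_needed` are hypothetical names invented for this example, and the persisted state is modelled as a plain in-memory list.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    node_id: str
    role: str  # "master", "standby" or "terminated" (simplified, assumed states)


def choose_role(existing: List[Node]) -> str:
    """Role taken by a newly started node in this simplified AUTO model."""
    has_master = any(n.role == "master" for n in existing)
    return "standby" if has_master else "master"


def promote_if_needed(nodes: List[Node]) -> None:
    """Promote the first standby when no master remains."""
    if any(n.role == "master" for n in nodes):
        return
    for n in nodes:
        if n.role == "standby":
            n.role = "master"
            return


# Two servers start against the same (modelled) persisted state.
nodes: List[Node] = []
nodes.append(Node("node-1", choose_role(nodes)))
nodes.append(Node("node-2", choose_role(nodes)))
roles_after_start = [n.role for n in nodes]

# The first server stops or fails; the standby takes over.
nodes[0].role = "terminated"
promote_if_needed(nodes)
roles_after_failover = [n.role for n in nodes]
```

With a single server the sketch always yields 'master', which matches the "same behaviour as we have currently" case.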

I say "almost entirely":
1. If you run Brooklyn and then kill it (e.g. `kill -9` or turn off the
VM), when you start Brooklyn again it will wait to confirm the previous
server is really dead. It waits for 30 seconds after the server's last
heartbeat, by default.
2. The HA status shows all previous runs of the Brooklyn server (it gets
a new node-id each time it restarts). This list will get longer and
longer if you keep restarting Brooklyn, pointing at the same persisted
state, until you clear out terminated instances from the list (via the
UI or the REST API).
3. The logging at startup will be quite different (e.g. "Brooklyn
initialisation (part two) complete" now means that the server has
finished becoming the 'standby'). If anyone has tools/scripts that
search/parse these logs, then they may be affected.
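
As a rough sketch of point 1, the wait after an unclean shutdown can be thought of as the time remaining until the previous server's last heartbeat is 30 seconds old. The function name and the use of plain second timestamps are assumptions for illustration only; the 30-second default is the figure given above.

```python
HEARTBEAT_TIMEOUT_SECONDS = 30.0  # default quoted in point 1


def seconds_to_wait(last_heartbeat: float, now: float,
                    timeout: float = HEARTBEAT_TIMEOUT_SECONDS) -> float:
    """Time left before the previous server is presumed dead.

    Both timestamps are seconds on the same clock (an assumption of this
    sketch). Returns 0.0 once the timeout has already elapsed.
    """
    return max(0.0, last_heartbeat + timeout - now)


# Restarted 10 seconds after the killed server's last heartbeat.
wait_soon_after_kill = seconds_to_wait(last_heartbeat=100.0, now=110.0)

# Restarted 40 seconds after the last heartbeat: no wait needed.
wait_long_after_kill = seconds_to_wait(last_heartbeat=100.0, now=140.0)
```

So a restart shortly after a `kill -9` waits for most of the timeout, while a restart well after the last heartbeat proceeds immediately.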

---

Note the current behaviour contradicts the docs [1], which say:
"Brooklyn will automatically run in HA mode if multiple Brooklyn
instances are started pointing at the same persistence store."

Thoughts?

Aled

p.s. another option would be to try to fail-fast when
highAvailabilityMode is disabled but there is another Brooklyn using the
same persisted state. However, distinguishing that from (1) above is
tricky.

[1] https://github.com/apache/brooklyn-docs/blob/master/guide/ops/high-availability/index.md


--

Thomas Bouron • Senior Software Engineer @ Cloudsoft Corporation •
https://cloudsoft.io/
Github: https://github.com/tbouron
Twitter: https://twitter.com/eltibouron