Re: Brooklyn highAvailabilityMode: default as AUTO?


+1, sounds sensible to me

Best.

On Tue, 22 May 2018 at 14:51 Duncan Grant <duncan.grant@xxxxxxxxxxxx> wrote:

> Aled,
>
> +1 sounds like a sensible plan
>
> Duncan
>
> On Tue, 22 May 2018 at 13:59 Aled Sage <aled.sage@xxxxxxxxx> wrote:
>
> > Hi all,
> >
> > I'd like to change the default value of highAvailabilityMode from
> > DISABLED to AUTO.
> >
> > Currently, if you start two Brooklyn servers pointing at the same
> > persisted state (file-system directory or object store's bucket), then
> > they are independent (because HA is 'disabled' by default). However,
> > they both write to that same persisted state, which will lead to
> > surprising behaviour, particularly when a Brooklyn server is next
> > restarted.
> >
> > Changing to 'AUTO' would (almost entirely) keep the behaviour we
> > currently have for a single Brooklyn server. In the case of two servers
> > pointing at the same persisted state, the second would come up as
> > 'standby', and would be automatically promoted to 'master' if the first
> > stops or fails.
> >
> > I say "almost entirely":
> > 1. If you run Brooklyn and then kill it (e.g. `kill -9` or turn off the
> > VM), when you start Brooklyn again it will wait to confirm the previous
> > server is really dead. It waits for 30 seconds after the server's last
> > heartbeat, by default.
> > 2. The HA status shows all previous runs of the Brooklyn server (it gets
> > a new node-id each time it restarts). This list will get longer and
> > longer if you keep restarting Brooklyn, pointing at the same persisted
> > state, until you clear out terminated instances from the list (via the
> > UI or the REST API).
> > 3. The logging at startup will be quite different (e.g. "Brooklyn
> > initialisation (part two) complete" now means that the server has
> > finished becoming the 'standby'). If anyone has tools/scripts that
> > search/parse these logs, then they may be affected.
> >
> > ---
> >
> > Note the current behaviour contradicts the docs [1], which say:
> > "Brooklyn will automatically run in HA mode if multiple Brooklyn
> > instances are started pointing at the same persistence store."
> >
> > Thoughts?
> >
> > Aled
> >
> > p.s. another option would be to try to fail-fast when
> > highAvailabilityMode is disabled but there is another Brooklyn using the
> > same persisted state. However, distinguishing that from (1) above is
> > tricky.
> >
> > [1] https://github.com/apache/brooklyn-docs/blob/master/guide/ops/high-availability/index.md
> >
> >
>
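
For anyone following the thread, here is a minimal sketch of the startup decision described in the proposal above. It is a toy model, not Brooklyn's implementation: the `NodeRecord` class, the `join_or_recheck` function, the role strings and the in-memory `records` dict standing in for the persisted state are all hypothetical names made up for illustration. Only the 30 second default and the master/standby behaviour come from Aled's mail.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Default wait after the last heartbeat, as quoted in the proposal.
HEARTBEAT_TIMEOUT_SECS = 30.0


@dataclass
class NodeRecord:
    """Hypothetical record a server writes into the shared persisted state."""
    node_id: str
    role: str              # "master", "standby" or "failed" (illustrative only)
    last_heartbeat: float  # seconds on a shared clock


def live_master(records: Dict[str, NodeRecord], now: float,
                exclude: str) -> Optional[NodeRecord]:
    """Return another node that is master and has a recent heartbeat, if any."""
    for rec in records.values():
        if rec.node_id == exclude or rec.role != "master":
            continue
        if now - rec.last_heartbeat < HEARTBEAT_TIMEOUT_SECS:
            return rec
    return None


def join_or_recheck(records: Dict[str, NodeRecord], node_id: str,
                    now: float, mode: str = "AUTO") -> str:
    """Register (or re-evaluate) node_id and return the role it ends up with."""
    if mode == "AUTO" and live_master(records, now, exclude=node_id) is not None:
        role = "standby"
    else:
        # DISABLED: the server ignores other nodes and acts independently.
        # AUTO: no live master was found, so this node takes over.
        role = "master"
        if mode == "AUTO":
            for rec in records.values():
                if rec.role == "master" and rec.node_id != node_id:
                    rec.role = "failed"  # stale master stays in the list
    records[node_id] = NodeRecord(node_id, role, now)
    return role


if __name__ == "__main__":
    state: Dict[str, NodeRecord] = {}
    print(join_or_recheck(state, "node-a", now=0.0))   # first server
    print(join_or_recheck(state, "node-b", now=5.0))   # second server
    print(join_or_recheck(state, "node-b", now=31.0))  # node-a went silent
    print(sorted(state))                               # old records remain
```

In this model a server restarted straight after a `kill -9` gets a new node id, sees the old master's recent heartbeat, and stays 'standby' until the timeout has passed (point 1). The old record remains in the list afterwards (point 2). With mode "DISABLED" both servers act as master against the same state, which is the situation the proposal wants to avoid.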
-- 

Thomas Bouron • Senior Software Engineer @ Cloudsoft Corporation •
https://cloudsoft.io/
Github: https://github.com/tbouron
Twitter: https://twitter.com/eltibouron