
[Openstack] HA Compute & Instance Evacuation

Take this with a grain of salt because we're using the original version
before the project moved under the Big Tent and I'm not sure how much it's
evolved since then. I assume the basic functions are the same though.

You're correct; Corosync and Pacemaker are used to determine if a compute
node goes down. The masakari-host-monitor process runs on each compute node,
checks the cluster status, and sends a notification to masakari-controller
when a node goes down. The controller process keeps a list of reserved hosts
in its database and calls nova host-evacuate to move the instances to one of
the reserved hosts.
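
To make that flow concrete, here's a rough Python sketch of it. This is a toy
model, not Masakari's actual code: the Controller class, host_monitor
function, host names, and instance names are all made up for illustration,
and the evacuation step just moves names between dictionary entries instead
of calling Nova.

```python
from dataclasses import dataclass, field


@dataclass
class Controller:
    """Toy stand-in for masakari-controller (hypothetical, not the real code)."""
    reserved_hosts: list   # spare hosts kept free for failover
    instances: dict        # compute host name -> list of instance names
    log: list = field(default_factory=list)

    def notify_host_down(self, failed_host):
        """Handle a 'host down' notification from a host monitor."""
        if failed_host not in self.instances:
            return None    # unknown host, or already evacuated
        if not self.reserved_hosts:
            raise RuntimeError("no reserved host available")
        target = self.reserved_hosts.pop(0)
        # The real controller would ask Nova to evacuate here (the
        # equivalent of `nova host-evacuate`); this sketch only moves names.
        self.instances[target] = self.instances.pop(failed_host)
        self.log.append((failed_host, target))
        return target


def host_monitor(cluster_status, controller):
    """Toy stand-in for masakari-host-monitor: report every offline node."""
    for host, state in cluster_status.items():
        if state == "offline":
            controller.notify_host_down(host)


controller = Controller(
    reserved_hosts=["spare1"],
    instances={"compute1": ["vm-a", "vm-b"], "compute2": ["vm-c"]},
)
# Pretend the cluster layer reported compute1 as offline.
host_monitor({"compute1": "offline", "compute2": "online"}, controller)
print(controller.instances)
```

Note that the reserved host is taken off the list once it's used, so a second
failure needs another spare. Whether the real controller behaves exactly that
way is an assumption on my part.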

In our environment I also configured STONITH and I'd highly recommend it.
With STONITH, Pacemaker sends a shutdown command to the out-of-band
management card of the unreachable node to make sure that it can't come
back and cause a conflict.

There are two other components, masakari-process-monitor and
masakari-instance-monitor. These also run on your compute nodes. The former
watches the nova-compute service and the latter monitors running instances
and restarts them if necessary.
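
As a rough illustration of the watch-and-restart idea, here's a minimal
Python sketch. It is not taken from masakari-monitors: check_and_restart, the
retry count, and the fake process table are all hypothetical, and I'm
assuming a monitor would try a restart before giving up.

```python
def check_and_restart(name, is_running, restart, max_attempts=3):
    """Return 'ok', 'restarted', or 'failed' for one monitored item."""
    if is_running(name):
        return "ok"
    for _ in range(max_attempts):
        restart(name)
        if is_running(name):
            return "restarted"
    # A real monitor would presumably escalate at this point.
    return "failed"


# Simulated process table so the sketch runs without touching real services.
state = {"nova-compute": False}


def fake_is_running(name):
    return state[name]


def fake_restart(name):
    state[name] = True


result = check_and_restart("nova-compute", fake_is_running, fake_restart)
print(result)
```

The check and restart functions are passed in so the same loop could be
pointed at a service or at an instance.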

Looking here it seems they've split Masakari into three different repos:

masakari - The controller service and API
masakari-monitors - Compute node monitoring services
python-masakariclient - The CLI tools