I think that this affects all hypervisors as CloudStack's deployment strategies are generally sub-optimal to say the least.
From what our devs have told me, a large part of the problem is that capacity/usage and suitability due to tags are calculated independently by multiple parts of the code; there is no central method that gives a consistent answer.
In Trillian we take a micro-management approach and have a custom module which will return the least used cluster, the least used host or the least used host in a given cluster. With that info we place VMs on specific hosts - keeping virtualised hypervisors in the same cluster (least used) so that processor types match, and all other VMs on the least used hosts.
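To make the idea concrete, here is a rough sketch of that kind of "least used" selection. This is not Trillian's actual module: the Host record, its field names and both function names are hypothetical, and "least used" is assumed here to mean the lowest fraction of memory in use.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Host:
    # Hypothetical record; field names are illustrative, not CloudStack API fields.
    name: str
    cluster: str
    mem_used: int
    mem_total: int  # assumed to be greater than zero

    @property
    def usage(self) -> float:
        # Fraction of memory in use (assumed definition of "used").
        return self.mem_used / self.mem_total


def least_used_host(hosts: Iterable[Host], cluster: Optional[str] = None) -> Host:
    """Return the host with the lowest usage, optionally within one cluster."""
    candidates = [h for h in hosts if cluster is None or h.cluster == cluster]
    if not candidates:
        raise ValueError("no hosts match")
    return min(candidates, key=lambda h: h.usage)


def least_used_cluster(hosts: Iterable[Host]) -> str:
    """Return the cluster whose aggregate usage (used / total) is lowest."""
    used, total = {}, {}
    for h in hosts:
        used[h.cluster] = used.get(h.cluster, 0) + h.mem_used
        total[h.cluster] = total.get(h.cluster, 0) + h.mem_total
    if not used:
        raise ValueError("no hosts given")
    return min(used, key=lambda c: used[c] / total[c])
```

With something like this, virtualised hypervisors would go to least_used_host(hosts, cluster=least_used_cluster(hosts)) and every other VM to least_used_host(hosts). The point is that one function answers the question for every caller, rather than each code path computing its own view of usage.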
For cross-cluster migrations (VMs and/or storage) I think that most times people want to move from cluster A to the least used (cluster/storage) in cluster B - making them choose which host/pool is actually unhelpful.
#scopecreep - sorry Pierre-Luc
From: Will Stevens <wstevens@xxxxxxxxxxxx>
Sent: 06 September 2018 19:45
To: dev@xxxxxxxxxxxxxxxxxxxxx; Marc-Andre Jutras <mjutras@xxxxxxxxxxxx>
Subject: Re: [DISCUSS] deployment planner improvement
If I remember correctly, we see similar issues on VMware. Marcus, have you seen similar behavior on VMware? I think I remember us having to manually vMotion a lot of VMs very often...
On Thu, Sep 6, 2018 at 2:34 PM Pierre-Luc Dion <pdion@xxxxxxxxxxxx> wrote:
I'm working with a University in Montreal and we are looking at
working together to improve the deployment planner. Mainly for post
What we observed with CloudStack, in our case with XenServer, is that
over time a cluster becomes unbalanced in terms of workload: VM HA
moves VMs all over the cluster, which causes hotspots inside the cluster.
Also, when performing maintenance, XenMotion of VMs spreads them across
the cluster but does not consider host usage, and at the end of a
maintenance it requires manual operations to repopulate VMs on the last
host updated. OS preference is not taken into account except for VM.CREATE.
I'd like to work on improving VM dispersion during and after outages
and maintenance, and when cluster resources are added or removed.
Do you have any other requirements? We will document a feature spec
in the wiki, which I believe is still a requirement.
Does using KVM have similar issues over time?
I don't think it would make sense for CloudStack to automatically make
decisions on moving VMs; for now it could create a report of recommended
actions and provide the steps to perform them. TBD.