[nova][scheduler] - Stack VMs based on RAM
On Wed, 2019-04-17 at 15:30 -0700, melanie witt wrote:
> On Thu, 18 Apr 2019 01:13:42 +0300, Georgios Dimitrakakis
> <giorgis at acmac.uoc.gr> wrote:
> > Shouldn't that be the correct behavior and place the new VM on the host with the smaller weight? Isn't that what the
> > negative value for 'ram_weight_multiplier' does?
> No, it's the opposite, higher weights win. That's why you have to use a
> negative value for ram_weight_multiplier if you want hosts with _less_
> RAM to win over hosts with more RAM (stacking).
> > Please let me know how I can provide you with more debug info.
> One thing I noticed from your log is on the second request, 'cpu1' has
> io_ops: 0 whereas 'cpu2' has io_ops: 1 and the IoOpsWeigher will
> prefer hosts with fewer io_ops by default. Note that's only one piece
> of the ending weight -- the weighing process will take a sum of all of
> the weights each weigher returns. So the weight returned from RAMWeigher
> is added to the weight returned from IoOpsWeigher, which is added to the
> weight returned from CPUWeigher, and so on.
> So, as Matt said, we're a bit in the dark now as far as what each
> weigher is returning and we don't currently have debug logging per
> weigher the way we do for filters. That would be an enhancement we could
> make to aid in debugging issues like this one. You could hack something
> up locally to log/print the returned weight in each weight class under
> the nova/scheduler/weights/ directory, if you want to dig into that.
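A rough sketch of the kind of local hack described above: a decorator that logs whatever a weigher's `_weigh_object` method returns. It assumes the weigher classes expose a `_weigh_object(self, host_state, weight_properties)` method. `FakeHostState` and `FakeRamWeigher` are hypothetical stand-ins so the snippet runs without nova installed; they are not nova code.

```python
import functools
import logging

LOG = logging.getLogger(__name__)


def log_weight(method):
    """Wrap a _weigh_object-style method and log the value it returns."""
    @functools.wraps(method)
    def wrapper(self, host_state, weight_properties):
        weight = method(self, host_state, weight_properties)
        # Fall back to the object itself if it has no 'host' attribute.
        host = getattr(host_state, "host", host_state)
        LOG.debug("%s returned %s for host %s",
                  type(self).__name__, weight, host)
        return weight
    return wrapper


class FakeHostState(object):
    """Hypothetical stand-in for a scheduler host state object."""
    def __init__(self, host, free_ram_mb):
        self.host = host
        self.free_ram_mb = free_ram_mb


class FakeRamWeigher(object):
    """Hypothetical stand-in for a weigher class, for illustration only."""
    @log_weight
    def _weigh_object(self, host_state, weight_properties):
        return host_state.free_ram_mb


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    FakeRamWeigher()._weigh_object(FakeHostState("cpu1", 32153), None)
```

In a real deployment the decorator would be applied by hand to the method in each module under nova/scheduler/weights/, and the scheduler would need debug logging enabled for the output to appear.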
> Another thing I noticed is that there are probably some new weighers
> available by default that did not exist in the previous version of nova
> that you were using in the past. By default, the config option for weighers:
> [filter_scheduler]weight_classes = ["nova.scheduler.weights.all_weighers"]
> will pick up all weigher classes in the nova/scheduler/weights/ code
> directory. You might take a look at these and see if any are ones you
> should exclude in your environment. For example, the CPUWeigher (new
> in Rocky) will spread VMs based on available CPU by default.
most of the weighers spread by default, so the cpu weigher may be a factor, but
the disk weigher tends to heavily impact the choice.
we do not normalise any of the values returned by the different weighers.
the disk weigher is basically host_state.free_disk_mb * disk_weight_multiplier,
although host_state.free_disk_mb is actually disk_available_least.
as a result the disk weigher will weigh cpu1 19456 higher than cpu2, while
the delta between cpu1 and cpu2 based on the ram weigher is only 2048.
if you want the ram weigher to take precedence over the disk weigher you will
need to scale the disk weigher down to be in a similar value range.
i would suggest setting disk_weight_multiplier=0.001
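
To make the arithmetic above concrete, here is a small sketch using the ram and disk values from the "Filtered" log line quoted further down. It takes the simplified model in this reply at face value (each weigher contributes raw value times multiplier, and the contributions are summed); it does not check what nova's weight handler actually does with the values. The ram multiplier of -1.0 is an assumption, since the thread only says it is negative, and the function names are made up for illustration.

```python
# Free RAM and disk (in MB) for each host, copied from the quoted debug log.
HOSTS = {
    "cpu1": {"free_ram_mb": 32153, "free_disk_mb": 1906688},
    "cpu2": {"free_ram_mb": 30105, "free_disk_mb": 1887232},
}


def deltas(hosts):
    """Return (ram_delta, disk_delta) as cpu1 minus cpu2."""
    ram = hosts["cpu1"]["free_ram_mb"] - hosts["cpu2"]["free_ram_mb"]
    disk = hosts["cpu1"]["free_disk_mb"] - hosts["cpu2"]["free_disk_mb"]
    return ram, disk


def combined_delta(hosts, ram_multiplier, disk_multiplier):
    """Weight of cpu1 minus weight of cpu2 under the simplified model.

    A positive result means cpu1 (the empty host) wins, i.e. spreading;
    a negative result means cpu2 (the busier host) wins, i.e. stacking.
    """
    ram_delta, disk_delta = deltas(hosts)
    return ram_multiplier * ram_delta + disk_multiplier * disk_delta


if __name__ == "__main__":
    print(deltas(HOSTS))
    # Assumed ram multiplier of -1.0 with an unscaled disk multiplier of 1.0.
    print(combined_delta(HOSTS, -1.0, 1.0))
    # Same ram multiplier with the disk weigher scaled down as suggested.
    print(combined_delta(HOSTS, -1.0, 0.001))
```

Under these assumptions the disk term outweighs the negative ram term when the disk multiplier is 1.0, and the sign flips once the disk multiplier is scaled down to 0.001.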
> That weigher might be contributing to the VM spreading you're seeing. You
> might try playing with the '[filter_scheduler]weight_classes' config
> option to select only the weighers you want or alternatively you could
> set the weighers multipliers the way you prefer.
> https://docs.openstack.org/nova/rocky/user/filter-scheduler.html#weights
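
As an illustration of the two options above, here is a hypothetical nova.conf fragment that restricts the weighers to RAM and disk and sets their multipliers. It is parsed with Python's configparser only to show the resulting values. The class paths and the -1.0 ram multiplier are assumptions to check against your own nova version; 0.001 is the disk value suggested earlier in this thread.

```python
import configparser

# Hypothetical nova.conf fragment, for illustration only.
NOVA_CONF_FRAGMENT = """
[filter_scheduler]
weight_classes = nova.scheduler.weights.ram.RAMWeigher,nova.scheduler.weights.disk.DiskWeigher
ram_weight_multiplier = -1.0
disk_weight_multiplier = 0.001
"""


def load_weigher_settings(text):
    """Return (weigher class list, ram multiplier, disk multiplier)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["filter_scheduler"]
    classes = [c.strip() for c in section["weight_classes"].split(",")]
    return (classes,
            section.getfloat("ram_weight_multiplier"),
            section.getfloat("disk_weight_multiplier"))


if __name__ == "__main__":
    classes, ram_mult, disk_mult = load_weigher_settings(NOVA_CONF_FRAGMENT)
    print(classes)
    print(ram_mult, disk_mult)
```

Listing the classes explicitly keeps newer weighers from being picked up automatically, while keeping all_weighers and tuning only the multipliers leaves every weigher active with adjusted influence.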
> > > > On 4/17/2019 3:50 PM, Georgios Dimitrakakis wrote:
> > > > And here is the new log where spawning of 2 VMs can be seen with a few seconds of difference:
> > > > https://pastebin.com/Xy2FL2KL
> > > > Initially both hosts are of weight 1.0 then the one with one VM already running has negative weight but the new
> > > > VM is placed on the other host.
> > > > Really-really strange why this is happening...
> > >
> > > < 2019-04-17 23:26:18.770 157355 DEBUG nova.scheduler.filter_scheduler [req-14c666e4-3ff4-4d88-947e-377b3d37bff9
> > > 6a4c2e32919e4a6fa5c5d956beb68eef 9f22e9bfa7974e14871d58bbb62242b2 - default default] Filtered [(cpu2, cpu2) ram:
> > > 30105MB disk: 1887232MB io_ops: 1 instances: 1, (cpu1, cpu1) ram: 32153MB disk: 1906688MB io_ops: 0 instances: 0]
> > > _get_sorted_hosts /usr/lib/python2.7/site-packages/nova/scheduler/filter_scheduler.py:435
> > >
> > > < 2019-04-17 23:26:18.771 157355 DEBUG nova.scheduler.filter_scheduler [req-14c666e4-3ff4-4d88-947e-377b3d37bff9
> > > 6a4c2e32919e4a6fa5c5d956beb68eef 9f22e9bfa7974e14871d58bbb62242b2 - default default] Weighed [WeighedHost [host:
> > > (cpu1, cpu1) ram: 32153MB disk: 1906688MB io_ops: 0 instances: 0, weight: 1.0], WeighedHost [host: (cpu2, cpu2)
> > > ram: 30105MB disk: 1887232MB io_ops: 1 instances: 1, weight: -0.00900862553213]] _get_sorted_hosts
> > > /usr/lib/python2.7/site-packages/nova/scheduler/filter_scheduler.py:454
> > >
> > > cpu1 is definitely getting weighed higher but I'm not sure why. We likely need some debug logging on the result of
> > > each weigher like we have for each filter to figure out what's going on with the weighers.
> > >
> > > --
> > >
> > > Thanks,
> > >
> > > Matt