[nova][scheduler] - Stack VMs based on RAM


On 4/17/2019 2:19 PM, melanie witt wrote:
> The relevant scheduler log is this one:
> 
> 2019-04-17 19:53:07.303 98874 DEBUG nova.scheduler.filter_scheduler 
> [req-02fb5504-cbdb-4219-9509-d2be9da7bb0e 
> 6a4c2e32919e4a6fa5c5d956beb68eef 9f22e9bfa7974e14871d58bbb62242b2 - 
> default default] Weighed [(cpu1, cpu1) ram: 32153MB disk: 1906688MB 
> io_ops: 0 instances: 0, (cpu2, cpu2) ram: 30105MB disk: 1886208MB 
> io_ops: 0 instances: 1] _get_sorted_hosts 
> /usr/lib/python2.7/site-packages/nova/scheduler/filter_scheduler.py:455
> 
> and here we see that host 'cpu1' is being weighed ahead of host 'cpu2', 
> which is the problem. I don't understand this considering the docs say 
> that setting the ram_weight_multiplier to a negative value should result 
> in the host with the lesser RAM being weighed higher/first. According to 
> your log, the opposite is happening -- 'cpu1' with 32153MB RAM is being 
> weighed higher than 'cpu2' with 30105MB RAM.
> 
> Either your ram_weight_multiplier setting is not being picked up or 
> there's a bug causing weight to be applied with reverse logic?
> 
> Can you look at the scheduler debug log when the service first started 
> up and verify what value of ram_weight_multiplier the service is using?

I agree with Melanie's assessment. Looking at the RAMWeigher code in 
Rocky, we see that it weighs hosts by the free_ram_mb value in the 
HostState object:

https://github.com/openstack/nova/blob/stable/rocky/nova/scheduler/weights/ram.py#L38

Looking at the filtered hosts that were logged:

Filtered [(cpu2, cpu2) ram: 30105MB disk: 1886208MB io_ops: 0 instances: 
1, (cpu1, cpu1) ram: 32153MB disk: 1906688MB io_ops: 0 instances: 0]

The ram value that is logged is free_ram_mb:

https://github.com/openstack/nova/blob/stable/rocky/nova/scheduler/host_manager.py#L333
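
To make the expected ordering concrete, here is a rough sketch of how I 
understand the weighing to work: the raw free_ram_mb values are min-max 
normalized, multiplied by ram_weight_multiplier, and the hosts are then 
sorted by descending weight. This is a simplified stand-alone model, not 
the nova code; the normalize and sort_hosts_by_ram helpers are made up 
for illustration, and only the two free RAM values come from the log.

```python
def normalize(values):
    """Min-max scale values into [0.0, 1.0]; all-equal input maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def sort_hosts_by_ram(free_ram_mb, ram_weight_multiplier):
    """Return host names ordered best-first under a RAM-only weigher.

    free_ram_mb is a dict of host name -> free RAM in MB (hypothetical
    input shape, chosen for this sketch).
    """
    names = list(free_ram_mb)
    normalized = normalize([free_ram_mb[name] for name in names])
    weights = {
        name: ram_weight_multiplier * value
        for name, value in zip(names, normalized)
    }
    # Highest weight is scheduled first.
    return sorted(names, key=lambda name: weights[name], reverse=True)


# Free RAM values taken from the Filtered/Weighed log lines.
hosts = {"cpu1": 32153, "cpu2": 30105}

print(sort_hosts_by_ram(hosts, 1.0))   # spread: most free RAM first
print(sort_hosts_by_ram(hosts, -1.0))  # stack: least free RAM first
```

Under this model a negative multiplier puts cpu2 first, which is the 
opposite of what the Weighed log line shows. That is consistent with the 
multiplier either not being picked up or another weigher dominating the 
total weight.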

It also looks like your Rocky scheduler code doesn't have this logging 
regression fix, so you might want to patch it in before collecting new 
debug logs:

https://review.openstack.org/#/c/641355/

With that fix applied, the debug log could tell us what the resulting 
weight is for each host.

-- 

Thanks,

Matt