[nova][scheduler] - Stack VMs based on RAM


 Hello again Melanie!

 This is exactly what I am thinking: something is not working
 correctly!

 To answer your question: there is one node acting as controller, where
 the scheduler is running, and the nova.conf file I pasted is from
 there.

 I have also noticed that I have "ram_weight_multiplier" set twice (once
 in [cells] and once in [filter_scheduler]), so I removed the one in
 [cells] because I thought it might cause a problem, but the results
 are still the same.
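
 A quick way to confirm which sections define the option is to parse
 the file. This is only a sketch: it assumes Python's standard
 configparser can read the file (nova itself uses oslo.config), and
 SAMPLE_CONF, its values, and sections_defining are hypothetical names
 made up for illustration.

```python
import configparser

# Hypothetical excerpt with made-up values; not the real nova.conf.
SAMPLE_CONF = """
[cells]
ram_weight_multiplier = 10.0

[filter_scheduler]
ram_weight_multiplier = -1.0
"""


def sections_defining(conf_text, option):
    """Return {section: raw value} for every section that sets `option`."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    return {
        section: parser.get(section, option)
        for section in parser.sections()
        if parser.has_option(section, option)
    }


found = sections_defining(SAMPLE_CONF, "ram_weight_multiplier")
for section, value in found.items():
    print(f"[{section}] ram_weight_multiplier = {value}")
```

 To check the real file, read its text and pass it in place of
 SAMPLE_CONF.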

 The log for the scheduler has this entry:

 2019-04-17 22:04:50.045 131723 DEBUG oslo_service.service 
 [req-7e548ecb-f3ed-4a4d-835f-b3a996e32534 - - - - -] 
 filter_scheduler.ram_weight_multiplier = -1.0 log_opt_values 
 /usr/lib/python2.7/site-packages/oslo_config/cfg.py:3032

 so it seems to be picked up correctly but without any influence.

 What also worries me in the scheduler log that I sent you earlier is
 an entry like this:

 2019-04-17 19:53:07.298 98874 DEBUG nova.filters 
 [req-02fb5504-cbdb-4219-9509-d2be9da7bb0e 
 6a4c2e32919e4a6fa5c5d956beb68eef 9f22e9bfa7974e14871d58bbb62242b2 - 
 default default] Filter RamFilter returned 2 host(s) 
 get_filtered_objects 
 /usr/lib/python2.7/site-packages/nova/filters.py:104

 Shouldn't the RamFilter return 1 host, the one with less RAM? Why
 does it return 2 hosts?
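
 For reference, a filter only removes hosts that cannot fit the
 request; it does not choose among the hosts that can. The sketch below
 is a simplified model, not nova's RamFilter code: it ignores the RAM
 allocation ratio, ram_filter is a hypothetical helper, the 2048 MB
 request is an assumed flavor size, and the free-RAM figures are the
 ones from the scheduler log quoted further down.

```python
def ram_filter(free_ram_mb, requested_mb):
    """Keep every host with enough free RAM (simplified: no allocation ratio)."""
    return [name for name, free in free_ram_mb.items() if free >= requested_mb]


# Free RAM per host, taken from the quoted scheduler log.
hosts = {"cpu1": 32153, "cpu2": 30105}

# Assumed request size; both hosts have room, so both pass the filter.
passed = ram_filter(hosts, 2048)
print(f"Filter returned {len(passed)} host(s): {passed}")
```

 In this model, two hosts with enough RAM both pass the filter, and
 the ordering between them is left to the weighers.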

 If you have any other ideas or would like me to do some more checking I 
 am all ears!

 Thank you,

 G.


>> Thank you both Melanie and Matt for trying to assist me.
>> I have double checked the nova.conf on the controller and here is
>> what I have (ignoring commented-out lines and obfuscating sensitive
>> data):
>>   https://pastebin.com/hW1PE4U7
>> As you can see I have everything at default values, as discussed
>> before with Melanie, except the filters and the weight that I have
>> applied, which should lead to VM stacking instead of spreading.
>> My scenario has two compute hosts (let's call them "cpu1" and
>> "cpu2"). When an instance is already placed on "cpu2" I expect the
>> next instance to be placed there as well. Instead it is placed on
>> "cpu1", as you can see from the scheduler log that you can find here:
>>   https://pastebin.com/sCzB9L2e
>> Do you see something strange that I fail to recognize?
>
> Thanks for providing the helpful data. It appears you have set your
> nova.conf correctly (this is where your scheduler is running, yes?).
> I notice you have duplicated the ram_weight_multiplier setting but
> that shouldn't hurt anything.
>
> The relevant scheduler log is this one:
>
> 2019-04-17 19:53:07.303 98874 DEBUG nova.scheduler.filter_scheduler
> [req-02fb5504-cbdb-4219-9509-d2be9da7bb0e
> 6a4c2e32919e4a6fa5c5d956beb68eef 9f22e9bfa7974e14871d58bbb62242b2 -
> default default] Weighed [(cpu1, cpu1) ram: 32153MB disk: 1906688MB
> io_ops: 0 instances: 0, (cpu2, cpu2) ram: 30105MB disk: 1886208MB
> io_ops: 0 instances: 1] _get_sorted_hosts
> /usr/lib/python2.7/site-packages/nova/scheduler/filter_scheduler.py:455
>
> and here we see that host 'cpu1' is being weighed ahead of host
> 'cpu2', which is the problem. I don't understand this considering the
> docs say that setting the ram_weight_multiplier to a negative value
> should result in the host with the lesser RAM being weighed
> higher/first. According to your log, the opposite is happening --
> 'cpu1' with 32153MB RAM is being weighed higher than 'cpu2' with
> 30105MB RAM.
>
> Either your ram_weight_multiplier setting is not being picked up or
> there's a bug causing weight to be applied with reverse logic?
>
> Can you look at the scheduler debug log when the service first
> started up and verify what value of ram_weight_multiplier the service
> is using?
>
> -melanie
>
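
 The expectation described above can be sketched as follows. This is a
 simplified model that considers the RAM weigher alone and assumes
 weights are min-max normalized before the multiplier is applied; it
 ignores every other weigher, and normalize and weigh_by_ram are
 hypothetical helpers, not nova functions.

```python
def normalize(values):
    """Min-max normalize to [0, 1]; all-equal inputs map to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def weigh_by_ram(free_ram_mb, multiplier):
    """Order host names best-first using only a RAM weight."""
    names = list(free_ram_mb)
    normalized = normalize([free_ram_mb[n] for n in names])
    weights = {n: multiplier * w for n, w in zip(names, normalized)}
    return sorted(names, key=lambda n: weights[n], reverse=True)


# Free RAM per host, taken from the "Weighed [...]" log entry above.
hosts = {"cpu1": 32153, "cpu2": 30105}

stacking = weigh_by_ram(hosts, -1.0)   # negative multiplier
spreading = weigh_by_ram(hosts, 1.0)   # positive multiplier
print("multiplier -1.0:", stacking)
print("multiplier  1.0:", spreading)
```

 Under these assumptions a multiplier of -1.0 puts 'cpu2' first, which
 is the opposite of the order in the log.
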
>>> On 4/16/2019 7:03 PM, melanie witt wrote:
>>>> To debug further, you should set debug to True in the nova.conf on
>>>> your scheduler host and look for which filter is removing the
>>>> desired host for the second VM. You can find where to start by
>>>> looking for a message like, "Starting with N host(s)". If you have
>>>> two hosts with enough RAM, you should see "Starting with 2 host(s)"
>>>> and then look for the log message where it says "Filter returned 1
>>>> host(s)" and that will be the filter that is removing the desired
>>>> host. Once you know which filter is removing it, you can debug
>>>> further.
>>>
>>> If the other host isn't getting filtered out, it could be the
>>> weighers that aren't prioritizing the host you expect, but debug
>>> logs should dump the weighed hosts as well which might give a clue.
>>
>>