[nova][scheduler] - Stack VMs based on RAM


On Fri, 19 Apr 2019 22:47:23 +0300, Georgios Dimitrakakis 
<giorgis at acmac.uoc.gr> wrote:
>   Hello again and apologies for my absence!
> 
>   First of all let me express my gratitude for your valuable feedback.
>   I understand what all of you are saying and I will try what Sean
>   suggested by setting the values for CPUs and Disk to something beyond
>   the default.
> 
>   Meanwhile, what I've thought of is to change the "weight_classes"
>   parameter to use only the "RAM" weight instead of all, since initially
>   this is what I would like it to be based on and then move on from there.
>   Unfortunately I haven't found a way to properly set it and no matter
>   what I've tried I always ended up with an error in nova-scheduler.log
>   saying "ERROR nova ValueError: Empty module name"
> 
>   Any ideas on how to set it based on the available weights? It seems
>   that it needs a list format but how?

Did you try:

weight_classes = ['nova.scheduler.weights.ram.RAMWeigher']

That's what should work, based on the code I see here:

https://github.com/openstack/nova/blob/stable/rocky/nova/loadables.py#L98

-melanie
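
A minimal sketch of how a dotted class path such as the one above is commonly resolved, and one way "Empty module name" can arise. This is an illustration only: import_class and try_import are hypothetical helpers written for this example, not nova's code, and a standard-library class stands in for the weigher so it runs without nova installed. The assumption is that an entry with no module part (a bare class name, or a value that was split or quoted unexpectedly) leaves nothing to import.

```python
import importlib


def import_class(dotted_path):
    """Resolve 'package.module.ClassName' to the class object.

    Illustrative helper, not nova's own code: split on the last dot,
    import the module part, then fetch the attribute.
    """
    module_name, _sep, class_name = dotted_path.rpartition(".")
    # With no dot in the path, module_name is "" and the import fails
    # with "ValueError: Empty module name".
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


def try_import(dotted_path):
    """Return the class, or a short error description if resolution fails."""
    try:
        return import_class(dotted_path)
    except (ValueError, ImportError, AttributeError) as exc:
        return "%s: %s" % (type(exc).__name__, exc)


# A full dotted path resolves to the class.
print(try_import("collections.OrderedDict"))
# A bare name has no module part and reports the error instead.
print(try_import("OrderedDict"))
```

If the value that reaches the loader has lost its module part, the second case is what you would see, so it may be worth checking how the option value is actually parsed from nova.conf.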

>> On Wed, 2019-04-17 at 18:24 -0700, melanie witt wrote:
>>> On Thu, 18 Apr 2019 01:38:17 +0100, Sean Mooney <smooney at redhat.com>
>>> wrote:
>>>> On Wed, 2019-04-17 at 15:30 -0700, melanie witt wrote:
>>>>> On Thu, 18 Apr 2019 01:13:42 +0300, Georgios Dimitrakakis
>>>>> <giorgis at acmac.uoc.gr> wrote:
>>>>>> Shouldn't that be the correct behavior and place the new VM on
>>>>>> the host with the smaller weight? Isn't that what the
>>>>>> negative value for "ram_weight_multiplier" does?
>>>>>
>>>>> No, it's the opposite, higher weights win. That's why you have to use a
>>>>> negative value for ram_weight_multiplier if you want hosts with _less_
>>>>> RAM to win over hosts with more RAM (stacking).
>>>>>
>>>>>> Please let me know how I can provide you with more debug info.
>>>>>
>>>>> One thing I noticed from your log is on the second request, 'cpu1' has
>>>>> io_ops: 0 whereas 'cpu2' has io_ops: 1 and the IoOpsWeigher [1] will
>>>>> prefer hosts with fewer io_ops by default. Note that's only one piece
>>>>> of the ending weight -- the weighing process will take a sum of all of
>>>>> the weights each weigher returns. So the weight returned from RAMWeigher
>>>>> is added to the weight returned from IoOpsWeigher, which is added to the
>>>>> weight returned from CPUWeigher, and so on.
>>>>>
>>>>> So, as Matt said, we're a bit in the dark now as far as what each
>>>>> weigher is returning and we don't currently have debug logging per
>>>>> weigher the way we do for filters. That would be an enhancement we could
>>>>> make to aid in debugging issues like this one. You could hack something
>>>>> up locally to log/print the returned weight in each weight class under
>>>>> the nova/scheduler/weights/ directory, if you want to dig into that.
>>>>>
>>>>> Another thing I noticed is that there are probably some new weighers
>>>>> available by default that did not exist in the previous version of nova
>>>>> that you were using in the past. By default, the config option for weighers:
>>>>>
>>>>> [filter_scheduler]weight_classes = ["nova.scheduler.weights.all_weighers"]
>>>>>
>>>>> will pick up all weigher classes in the nova/scheduler/weights/ code
>>>>> directory. You might take a look at these and see if any are ones you
>>>>> should exclude in your environment. For example, the CPUWeigher [1] (new
>>>>> in Rocky) will spread VMs based on available CPU by default.
>>>>
>>>> most of the weighers spread by default so the cpu weigher may be a factor but
>>>> the disk weigher tends to also heavily impact the choice.
>>>>
>>>> we do not normalise any of the values returned by the different weighers
>>>> the disk weigher is basically host_state.free_disk_mb * disk_weight_multiplier
>>>
>>> Hm, I thought we do based on this code:
>>>
>>>
>>> https://github.com/openstack/nova/blob/stable/rocky/nova/weights.py#L135
>>>
>>> which normalizes the weight to a value between 0.0 and 1.0.
>> yes we do but that is within the same resource type.
>>
>> we do not normalise between resource types.
>> on a typical host you will have between 2GB and 8GB of ram per cpu
>> core
>> and you will have between 10G and 100G of local disk typically.
>>
>> so we dont look at the capacity of each resource and renormalise
>> between them.
>> if you want to achieve that you have to carefully tweak the weights to do
>> that manually.
>>
>> that said i have not sat down and worked out the math to do that in
>> about 3-4 years
>> but i think i used to run my dev cluster with the
>> disk_weight_multiplier around 0.04
>> and i think i used to set the ram_weight_multiplier to 2.0
>>
>> anyway that is just my personal experience and i was trying to tweak
>> the spreading behavior
>> to spread based on ram then cpus then disk rather than pack based on
>> ram but spread based on the rest.
>> so i dont know if this is relevant or correct for this usecase.
>>
>>>
>>> If the multiplier is large though, that could make the considered value
>>> greater than 1.0 (as is the case with the default
>>> build_failure_weight_multiplier):
>>>
>>>
>>> https://github.com/openstack/nova/blob/stable/rocky/nova/conf/scheduler.py#L501
>>>
>>> -melanie
>>>
>>>> although host_state.free_disk_mb is actually disk_available_least.
>>>>
>>>> as a result the disk weigher will weigh cpu1 19456 higher than cpu2
>>>> the delta between cpu1 and cpu2 based on the ram weigher is only 2048
>>>>
>>>> if you want the ram weigher to take precedence over the disk weigher you will
>>>> need to scale the disk weigher down to be in a similar value range
>>>>
>>>> i would suggest setting disk_weight_multiplier=0.001
>>>>
>>>>> This weigher might be contributing to the VM spreading you're seeing. You
>>>>> might try playing with the '[filter_scheduler]weight_classes' config
>>>>> option to select only the weighers you want or alternatively you could
>>>>> set the weigher multipliers the way you prefer.
>>>>>
>>>>> -melanie
>>>>>
>>>>> [1] https://docs.openstack.org/nova/rocky/user/filter-scheduler.html#weights
>>>>>
>>>>>>>> On 4/17/2019 3:50 PM, Georgios Dimitrakakis wrote:
>>>>>>>> And here is the new log where spawning of 2 VMs can be seen with a
>>>>>>>> few seconds of difference:
>>>>>>>> https://pastebin.com/Xy2FL2KL
>>>>>>>> Initially both hosts are of weight 1.0 then the one with one VM
>>>>>>>> already running has negative weight but the new VM is placed on the
>>>>>>>> other host.
>>>>>>>> Really-really strange why this is happening...
>>>>>>>
>>>>>>> < 2019-04-17 23:26:18.770 157355 DEBUG nova.scheduler.filter_scheduler [req-14c666e4-3ff4-4d88-947e-377b3d37bff9 6a4c2e32919e4a6fa5c5d956beb68eef 9f22e9bfa7974e14871d58bbb62242b2 - default default] Filtered [(cpu2, cpu2) ram: 30105MB disk: 1887232MB io_ops: 1 instances: 1, (cpu1, cpu1) ram: 32153MB disk: 1906688MB io_ops: 0 instances: 0] _get_sorted_hosts /usr/lib/python2.7/site-packages/nova/scheduler/filter_scheduler.py:435
>>>>>>>
>>>>>>> < 2019-04-17 23:26:18.771 157355 DEBUG nova.scheduler.filter_scheduler [req-14c666e4-3ff4-4d88-947e-377b3d37bff9 6a4c2e32919e4a6fa5c5d956beb68eef 9f22e9bfa7974e14871d58bbb62242b2 - default default] Weighed [WeighedHost [host: (cpu1, cpu1) ram: 32153MB disk: 1906688MB io_ops: 0 instances: 0, weight: 1.0], WeighedHost [host: (cpu2, cpu2) ram: 30105MB disk: 1887232MB io_ops: 1 instances: 1, weight: -0.00900862553213]] _get_sorted_hosts /usr/lib/python2.7/site-packages/nova/scheduler/filter_scheduler.py:454
>>>>>>>
>>>>>>> cpu1 is definitely getting weighed higher but I'm not sure why. We
>>>>>>> likely need some debug logging on the result of each weigher like we
>>>>>>> have for each filter to figure out what's going on with the weighers.
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Matt
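
As a rough illustration of the weighing process discussed in this thread, the sketch below follows the reading that each weigher's raw values are normalized to the range 0.0 to 1.0 across the candidate hosts, multiplied by that weigher's multiplier, and then summed, with the highest total winning. It is a simplification under that assumption and not nova's implementation: normalize and weigh_hosts are hypothetical helpers, only RAM and disk are modelled, and the raw values are the "ram" and "disk" fields from the log above.

```python
def normalize(values):
    """Min-max scale a list of raw values into the range 0.0 to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All hosts are equal for this resource, so it contributes nothing.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def weigh_hosts(hosts, multipliers):
    """Return {host_name: total_weight}; the highest total wins.

    hosts:       {host_name: {resource: raw_value}}
    multipliers: {resource: multiplier}
    """
    names = sorted(hosts)
    totals = {name: 0.0 for name in names}
    for resource, multiplier in multipliers.items():
        scaled = normalize([hosts[name][resource] for name in names])
        for name, value in zip(names, scaled):
            totals[name] += multiplier * value
    return totals


# "ram" and "disk" values (MB) for the two hosts, as shown in the log.
hosts = {
    "cpu1": {"ram": 32153, "disk": 1906688},
    "cpu2": {"ram": 30105, "disk": 1887232},
}

# Negative RAM multiplier (stack on RAM) with a disk multiplier of 1.0:
# the disk term cancels the RAM term and the two hosts tie.
tied = weigh_hosts(hosts, {"ram": -1.0, "disk": 1.0})

# Scaling the disk multiplier down lets the RAM preference decide.
stacked = weigh_hosts(hosts, {"ram": -1.0, "disk": 0.001})
best = max(stacked, key=stacked.get)
print(tied, stacked, best)
```

In this simplified model a negative RAM multiplier alone is not enough to stack when another weigher with a comparable multiplier still prefers the emptier host, which is consistent with the suggestion to either scale the other multipliers down or restrict weight_classes to the RAM weigher.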