


[Openstack] CPU Affinity And Dynamic Load Balancing



Hi,

as part of building a small but completely new OpenStack cloud on top of
Queens, I am investigating the possibilities of providing SLAs for VMs.
Unfortunately, I'm still rather new to OpenStack.

I have four compute nodes with a few dozen threads each, and a few VMs
on top of them. Some of my VMs occasionally need to be able to nail
entire cores for optimal single-threaded performance, but are usually
nearly idle. For performance and reliability reasons, I would not want
all high-priority VMs to be scheduled on one node only, while having the
less performance-critical ones share the other nodes. So the question is
how to (e.g.) partition nodes to have at least two compartments, and/or
dynamically (re-) allocate VMs to lesser used nodes/compartments.
When the load goes down again, I would like to undo squeezing the other
VMs into the remaining capacity and have them spread out evenly
again. For me, the critical factor is having some cores exclusively
used by certain VMs, if and only for as long as they require it (yes, more
hardware might be on the horizon later, but there's no desire to
squander the money).

I've read about the FilterScheduler, which looks almost good, but I
would like to apply this placement mechanism to running VMs. I am also
interested in just pausing VMs for seconds in turn, if I can detect that
a high-priority VM needs more capacity. But I don't see off the top of
my head how to partition compute nodes. Can it be done with the NUMA
configuration, and then schedule based on the individual "pieces", i.e.
effectively not scheduling across four nodes, but, say, 16 NUMA slices
in a hierarchical setting (I haven't seen the machines yet, so I don't
really know what they are)?

If I could sort of put a VM through the placement algorithm so that I
get the placement decision, then that's sort of fine: In that case, I'd
put my other VMs through the placement algorithm and "manually"
(scripted) migrate them.
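
To illustrate what I mean, here is a rough sketch in plain Python of the
"run every VM through a placement decision, then migrate the ones that
ended up elsewhere" idea. It does not call Nova or any other OpenStack
API. The function names (place, plan_migrations), the slice names and
the core counts are all made up for illustration, and the
filter-then-weigh logic is only my simplified reading of how a filter
scheduler works, not the real FilterScheduler.

```python
def place(free, cores_needed, prefer=None):
    """Pick a slice for one VM: filter by capacity, then weigh.

    free: dict mapping slice name -> currently unallocated cores.
    prefer: slice the VM already runs on; used only to break ties,
            so that equally good choices do not cause a migration.
    Returns the chosen slice name, or None if nothing fits.
    """
    # Filter step: keep only slices with enough free cores.
    candidates = {name: f for name, f in free.items() if f >= cores_needed}
    if not candidates:
        return None
    # Weigh step: most free cores first, then the current slice, then name.
    return min(candidates, key=lambda n: (-candidates[n], n != prefer, n))


def plan_migrations(capacity, current, demand):
    """Return a list of (vm, source, target) moves.

    capacity: slice name -> total cores (hypothetical numbers).
    current:  vm name -> slice it runs on right now.
    demand:   vm name -> cores the VM needs exclusively right now.
    """
    free = dict(capacity)
    moves = []
    # Place the most demanding VMs first so they get their cores.
    for vm in sorted(demand, key=lambda v: (-demand[v], v)):
        target = place(free, demand[vm], prefer=current[vm])
        if target is None:
            continue  # nothing fits; leave the VM where it is
        free[target] -= demand[vm]
        if target != current[vm]:
            moves.append((vm, current[vm], target))
    return moves


if __name__ == "__main__":
    capacity = {"node1": 4, "node2": 4}
    current = {"hi": "node1", "lo1": "node1", "lo2": "node1"}
    # The high-priority VM suddenly needs all four cores of its node.
    demand = {"hi": 4, "lo1": 1, "lo2": 1}
    for vm, src, dst in plan_migrations(capacity, current, demand):
        print(f"migrate {vm}: {src} -> {dst}")
```

The resulting list of moves is what I would then feed into a script
that performs the live migrations. Re-running the same function once
the demand drops would be my way of spreading the VMs out again.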

What is the common wisdom on these problems, please?


Thanks,
Toni