[placement] update 19-32


HTML: https://anticdent.org/placement-update-19-32.html

Here's placement update 19-32. There will be no update 33; I'm going
to take next week off. If there are Placement-related issues that
need immediate attention please speak with any of Eric Fried
(efried), Balazs Gibizer (gibi), or Tetsuro Nakamura (tetsuro).

# Most Important

Same as last week: The main things on the Placement radar are
implementing Consumer Types and cleanups, performance analysis, and
documentation related to nested resource providers.

A thing we should place on the "important" list is bringing the osc
placement plugin up to date. We also need to discuss what we
would like the plugin to be. Is it required that it have ways to
perform all the functionality of the API, or is it about providing
ways to do what humans need to do with the placement API? Is there a
difference?

We decided that consumer types is medium priority: The nova-side use
of the functionality is not going to happen in Train, but it would
be nice to have the placement-side ready when U opens. The primary
person working on it, tssurya, is spread pretty thin so it might not
happen unless someone else has the cycles to give it some attention.

On the documentation front, we realized during some performance work
[last](https://review.opendev.org/675606)
[week](https://review.opendev.org/#/c/676204/4/placement/tests/functional/gabbits/same-subtree-deep.yaml@29)
that it is easy to have an incorrect grasp of how `same_subtree` works
when there are more than two groups involved. It is critical that we
create good "how to use" documentation for this and other advanced
placement features. Not only can it be easy to get wrong, it can be
a challenge to see that you've got it wrong (the failure mode is "more
results, only some of which you actually wanted").
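
To make the shape of such a request concrete, here is a minimal sketch
that builds the query string for a `GET /allocation_candidates` call
with three granular groups and a `same_subtree` constraint. The helper
name, the group suffixes and the amounts are made up for illustration,
and the exact suffix syntax should be checked against the API
reference; `same_subtree` needs microversion 1.36 or later. It only
shows how the request is formed and says nothing authoritative about
which candidates come back.

```python
from urllib.parse import urlencode


def build_candidates_query(groups, same_subtree, group_policy="none"):
    """Build a query string for GET /allocation_candidates.

    `groups` maps a hypothetical group suffix (such as "_COMPUTE") to a
    dict of resource class -> amount. This helper is illustrative only.
    """
    params = []
    for suffix, resources in groups.items():
        value = ",".join(
            f"{rc}:{amount}" for rc, amount in sorted(resources.items())
        )
        params.append((f"resources{suffix}", value))
    params.append(("group_policy", group_policy))
    # Every suffix listed here must also name a group in the request.
    params.append(("same_subtree", ",".join(same_subtree)))
    # Keep ':' and ',' readable rather than percent-encoding them.
    return urlencode(params, safe=":,")


query = build_candidates_query(
    {
        "_COMPUTE": {"VCPU": 2, "MEMORY_MB": 512},
        "_ACCEL1": {"FPGA": 1},
        "_ACCEL2": {"FPGA": 1},
    },
    same_subtree=["_COMPUTE", "_ACCEL1", "_ACCEL2"],
)
print(query)
```

With three groups in play, it is worth checking every returned
candidate against what you meant to ask for, not just checking that
the expected ones are present.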

# What's Changed

* Yet more [performance
   fixes](https://review.opendev.org/#/q/topic:optimize-_build_provider_summaries)
   are in the process of merging. Most of these are related to
   getting `_merge_candidates` and `_build_provider_summaries` to
   have less impact. The fixes are generally associated with avoiding
   duplicate work by generating dicts of reusable objects earlier in
   the request. This is possible because of the relatively new
   `RequestWideSearchContext`. In a request that returns many
   provider summaries `_build_provider_summaries` continues to have a
   significant impact because it has to create many objects but
   overall everything is much less heavyweight. More on performance
   in Themes, below.

* The combination of all these performance fixes and microversions
   makes it reasonable for anyone running placement in
   a resource constrained environment (or simply wanting things to be
   faster) to consider running Train placement with _any_ release of
   OpenStack. Obviously you should test it first, but it is worth
   investigating. More information on how to achieve this can be
   found in the [upgrade to stein
   docs](https://docs.openstack.org/placement/latest/upgrade/to-stein.html).
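
The "generate dicts of reusable objects earlier in the request" idea
from the first item above can be sketched roughly like this. The class
below is a hypothetical stand-in, not placement's
`RequestWideSearchContext`, and the row and summary shapes are
invented for the example.

```python
class RequestContextSketch:
    """Hypothetical per-request context that builds lookups once."""

    def __init__(self, provider_rows):
        # Build every summary object a single time, early in the request.
        self.summaries_by_id = {
            row["id"]: {"uuid": row["uuid"], "resources": {}}
            for row in provider_rows
        }

    def summary_for(self, provider_id):
        # Later stages reuse the stored object instead of rebuilding it.
        return self.summaries_by_id[provider_id]


rows = [{"id": 1, "uuid": "rp-1"}, {"id": 2, "uuid": "rp-2"}]
ctx = RequestContextSketch(rows)
first = ctx.summary_for(1)
again = ctx.summary_for(1)
print(first is again)
```

The point is only that repeated lookups within one request hand back
the same object; the cost of creating the objects in the first place
remains.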

# Stories/Bugs

(Numbers in () are the change since the last pupdate.)

There are 23 (1) stories in [the placement
group](https://storyboard.openstack.org/#!/project_group/placement).
0 (0) are [untagged](https://storyboard.openstack.org/#!/worklist/580).
4 (1) are [bugs](https://storyboard.openstack.org/#!/worklist/574). 4 (0)
are [cleanups](https://storyboard.openstack.org/#!/worklist/575). 11
(0) are [rfes](https://storyboard.openstack.org/#!/worklist/594).
4 (0) are [docs](https://storyboard.openstack.org/#!/worklist/637).

If you're interested in helping out with placement, those stories
are good places to look.

* Placement related nova [bugs not yet in progress](https://goo.gl/TgiPXb)
   on launchpad: 18 (1).

* Placement related nova [in progress bugs](https://goo.gl/vzGGDQ) on
   launchpad: 4 (-1).

# osc-placement

osc-placement is currently behind by 12 microversions.

* <https://review.opendev.org/666542>
   Add support for multiple member_of. There's been some useful
   discussion about how to achieve this, and a consensus has emerged
   on how to get the best results.

* <https://review.opendev.org/640898>
   Adds a new '--amend' option which can update resource provider
   inventory without requiring the user to pass a full replacement
   for inventory. This has been broken up into three patches to help
   with review.
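
The idea behind amending is a read-modify-write: the API's
`PUT /resource_providers/{uuid}/inventories` replaces the whole set of
inventories, so a client that wants to change one field has to fetch
the current set, apply the change, and send everything back. A minimal
sketch of the merge step, assuming a simplified inventory dict and a
made-up function name (this is not the osc-placement code):

```python
import copy


def amend_inventories(current, changes):
    """Return a full inventory payload with only `changes` applied."""
    amended = copy.deepcopy(current)  # leave the caller's data untouched
    for resource_class, fields in changes.items():
        # Update the named fields, keep everything else as it was.
        amended.setdefault(resource_class, {}).update(fields)
    return amended


current = {
    "VCPU": {"total": 8, "allocation_ratio": 16.0},
    "MEMORY_MB": {"total": 16384, "reserved": 512},
}
amended = amend_inventories(current, {"VCPU": {"total": 16}})
print(amended)
```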

# Main Themes

## Consumer Types

Adding a type to consumers will allow them to be grouped for various
purposes, including quota accounting.

* <https://review.opendev.org/#/q/topic:bp/support-consumer-types>
   A WIP, as microversion 1.37, has started.

As mentioned above, this is currently paused while other things take
priority. If you have time that you could spend on this please
respond here expressing that interest.
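
As a rough picture of what "grouped for quota accounting" could mean,
the sketch below sums usage per consumer type. The type names, the
allocation records and the function are all invented for the example;
the actual 1.37 representation is still a WIP and may look different.

```python
from collections import defaultdict


def usage_by_consumer_type(allocations):
    """Sum resource usage per (hypothetical) consumer type."""
    totals = defaultdict(lambda: defaultdict(int))
    for alloc in allocations:
        for resource_class, amount in alloc["resources"].items():
            totals[alloc["consumer_type"]][resource_class] += amount
    # Convert to plain dicts so the result is easy to print and compare.
    return {ctype: dict(usage) for ctype, usage in totals.items()}


allocations = [
    {"consumer_type": "INSTANCE", "resources": {"VCPU": 2, "MEMORY_MB": 2048}},
    {"consumer_type": "INSTANCE", "resources": {"VCPU": 4, "MEMORY_MB": 4096}},
    {"consumer_type": "MIGRATION", "resources": {"VCPU": 2}},
]
usage = usage_by_consumer_type(allocations)
print(usage)
```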

## Cleanup

Cleanup is an overarching theme related to improving documentation,
performance and the maintainability of the code. The changes we are
making this cycle are fairly complex both to use and to write, so
it is good that we're going to have plenty of time to
clean and clarify all these things.

As said above, there's lots of performance work in progress. We'll
need to make a similar effort with regard to docs. For example, all
of the coders involved in the creation and review of the
`same_subtree` functionality struggle to explain, clearly and
simply, how it will work in a variety of situations. We need to
enumerate the situations and the outcomes, in documentation.

One outcome of this work will be something like a _Deployment
Considerations_ document to help people choose how to tweak their
placement deployment to match their needs. The simple answer is to use
more web servers and more database servers, but that's often very
wasteful.

On the performance front, there is one major area of impact which
has not received much attention yet. When requesting allocation
candidates (or resource providers) that will return many results,
the cost of JSON serialization is just under one quarter of the
processing time. This is to be expected when the response body is
`2379k` big, and 154000 lines long (when pretty printed) for 7000
provider summaries and 2000 allocation requests.

But there are ways to fix it. One is to ask more focused questions
(so fewer results are expected). Another is to `limit=N` the results
(but this can lead to issues with migrations).

Another is to [use a different JSON
serializer](https://review.opendev.org/674661). Should we do that?
It makes a _big_ difference with large result sets (which will be
common in big and sparse clouds).
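
If you want a feel for the serialization cost on your own hardware,
something like the following sketch can help. It builds a synthetic
payload with 7000 provider summaries and 2000 allocation requests and
times a serializer over it. The payload shape is only an approximation
of a real response, and the `dumps` argument is a hypothetical hook
for swapping in another serializer; the default is the standard
library's `json.dumps`.

```python
import json
import time
import uuid


def make_payload(n_summaries, n_requests):
    """Build a synthetic, response-like dict (shape is approximate)."""
    providers = [str(uuid.uuid4()) for _ in range(n_summaries)]
    summaries = {
        rp: {"resources": {"VCPU": {"capacity": 64, "used": 0}}, "traits": []}
        for rp in providers
    }
    requests = [
        {"allocations": {providers[i % n_summaries]: {"resources": {"VCPU": 1}}}}
        for i in range(n_requests)
    ]
    return {"provider_summaries": summaries, "allocation_requests": requests}


def time_dumps(payload, dumps=json.dumps):
    """Return (seconds taken, length of output) for one serialization."""
    start = time.perf_counter()
    body = dumps(payload)
    return time.perf_counter() - start, len(body)


payload = make_payload(7000, 2000)
elapsed, size = time_dumps(payload)
print(f"{size} characters serialized in {elapsed:.4f}s")
```

Passing a different callable as `dumps` gives a like-for-like
comparison on the same payload, which is more useful than comparing
numbers from different runs.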

# Other Placement

Miscellaneous changes can be found in [the usual
place](https://review.opendev.org/#/q/project:openstack/placement+status:open).

There are two [os-traits
changes](https://review.opendev.org/#/q/project:openstack/os-traits+status:open)
being discussed. And zero [os-resource-classes
changes](https://review.opendev.org/#/q/project:openstack/os-resource-classes+status:open).

# Other Service Users

New discoveries are added to the end. Merged stuff is removed.
Anything that has had no activity in 4 weeks has been removed.

* <https://review.openstack.org/#/q/topic:bug/1819923>
   Nova: nova-manage: heal port allocations

* <https://review.opendev.org/659233>
   Cyborg: Placement report

* <https://review.opendev.org/662229>
   helm: add placement chart

* <https://review.opendev.org/634551>
   libvirt: report pmem namespaces resources by provider tree

* <https://review.opendev.org/660852>
   Nova: Remove PlacementAPIConnectFailure handling from AggregateAPI

* <https://review.opendev.org/670112>
   Nova: WIP: Add a placement audit command

* <https://review.opendev.org/671312>
   blazar: Fix placement operations in multi-region deployments

* <https://review.opendev.org/671793>
   Nova: libvirt: Start reporting PCPU inventory to placement
   A part of <https://review.opendev.org/#/q/topic:bp/cpu-resources>

* <https://review.opendev.org/#/q/topic:bp/support-move-ops-with-qos-ports>
   Nova: support move ops with qos ports

* <https://review.opendev.org/666202>
   Blazar: Create placement client for each request

* <https://review.opendev.org/667952>
   nova: Support filtering of hosts by forbidden aggregates

* <https://review.opendev.org/669079>
   blazar: Send global_request_id for tracing calls

* <https://review.opendev.org/670696>
   tempest: Add placement API methods for testing routed provider nets

* <https://review.opendev.org/672678>
   openstack-helm: Build placement in OSH-images

* <https://review.opendev.org/674129>
   Correct global_request_id sent to Placement

* <https://review.opendev.org/#/q/topic:bp/cross-cell-resize>
   Nova: cross cell resize

* <https://review.opendev.org/674524>
   Nova: Scheduler translate properties to traits

* <https://review.opendev.org/623558>
   Nova: single pass instance info fetch in host manager

* <https://review.opendev.org/674708>
   Zun: [WIP] Claim container allocation in placement

# End

Have a good next week.

-- 
Chris Dent                       ٩◔̯◔۶           https://anticdent.org/
freenode: cdent