[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[placement][ptg] code refactoring constraints and goals

On 4/8/19 10:43 AM, Chris Dent wrote:
>     * How (and should) we further split and shrink
> objects/resource_provider.py

To be specific here:

objects/resource_provider.py is currently ~2KLOC. That's not *horrible*
(see below), and if it really didn't lend itself to being further
subdivided, I would say we shouldn't force it just for the sake of
making that number smaller.

However, IMO, this file logically deals with several things:

* Resource provider inventory/capacity/usage
* Resource provider membership in aggregates
* Traits on resource providers
* Creating, updating, and deleting resource providers
* Finding resource providers in various ways

If you look at the structure of the file (in its current order - see
below) these groupings are pretty clear.

Note the presence of "resource provider" in each bullet above. We
already have separate modules for dealing with e.g. CRUD of traits in
their own right. Though the mapping isn't 1:1 with the above list.

So we have the following options, I think:

(1) Do nothing. Leave objects/resource_provider.py as is.
(2) Move X-related stuff into the base X.py module, so like move things
related to "Traits on resource providers" into objects/trait.py. This
level of shuffle would probably entail things like creating a new module
for aggregate.py
(3) Split the above into separate modules while preserving their
resource_provider-ness. From the perspective of code consuming the
modules, this would look like:

from placement.objects import resource_provider as rp_crud
from placement.objects.resource_provider import trait as rp_trait
from placement.objects.resource_provider import search as rp_search
from placement.objects import trait
# Do CRUD-y stuff with resource providers
rp = rp_crud.ResourceProvider(...)
# Do trait stuff
t = trait.Trait(...)
# Do trait-y stuff with resource providers
rp_trait.add_traits_to_provider(rp, [t])
# Do search-y stuff with resource providers

To me, this is nice and intuitive.

The issue is the module hierarchy. As written above, it would entail
putting the CRUD-y stuff into objects/resource_provider/__init__.py,
which some folks don't like.

I suppose we could have objects/resource_provider/crud.py, but that
doesn't match the pattern for other things like objects/trait.py


> * Is it worth reordering methods to some kind of arbitrary ordering?

My preference on this: keep related methods/classes/etc. near each other
in the file.

* Intuitive.
* Increases the probability that, when I'm looking at a method that
calls another method, both methods will fit on my screen at the same
time. (Workaround: multiple editor windows to the same file.)

* Sometimes there's no sensible grouping, which makes this hard to
define and therefore hard to follow (and hard to enforce, if that's a
thing we want to do).
* Ability to know exactly where to look to find something. (Workaround:
Use a real IDE, where you can do a project-wide search for a symbol,
hotkey from a symbol's use to its definition, etc. Also you can choose
to view a module's skeleton in its real order *or* alphabetically.)

> * If a method or module is "long" (by some definition) does that
>   make it bad?

Not to the point where we should do something like implement sonar.
Having had experience with that, it just ain't worth the hassle.
However, there is such a thing as too long. Certain nova libvirt modules
come to mind. For those of us who do choose to use real IDEs, these can
take a long time to process (in-band syntax checking, symbol correlation
highlighting, git blame, etc.). I'm going to go out on a limb and say we
should strive really hard never to have a module in the 5-digit LOCs.
And it would be really nice to stay under 2KLOC, but only if it doesn't
feel forced.

> * Where is the balance between backport hurting churn and keeping
>   the mix fresh?

I'm not sure "fresh" is the heaviest hitter on the plus side. Making the
code organized and crisp and usable (from a developer standpoint) and
maintainable is the main thing to me.

On the minus side, in addition to backport pain, there's the difficulty
tracking down the change that did a thing. Run git blame today and it
looks like Chris wrote virtually the whole thing in a small number of
massive monolithic chunks. You have to use bisect or similar to track
something down. To me, this is the biggest pain point.

So I think we should strive to get to a point where the code structure
is "good enough" and then stop doing huge refactors. That way, as we
move forward, changes will accumulate on that base and things will look
more "normal" history-wise.

> * Should there be a more cleanly defined boundary between the
>   methods that marshal data (the top-level methods in objects/*.py)
>   and those that talk to the persistence layer (also in
>   objects/*.py)?
>     * Do we need to define (better) or document the layers/boundaries
>       in the system?

Sure, if we can find a fairly clean split. I don't have this enough in
my head to volunteer to do it, but I'll review it if someone else does :)

> * How can we encourage a similar refactor/revisit-often attitude to
>   the docs?

Fostering the culture that way, as we're doing. Bringing it up
periodically. Having a "doc sweep day" where members of the/a (sub)team
volunteer to each read a chunk of doc/code thoroughly?