[goals][upgrade-checkers] Retrospective


On 4/24/2019 8:21 AM, Mark Goddard wrote:
> I put together a patch for kolla-ansible with support for upgrade checks 
> for some projects: https://review.opendev.org/644528. It's on the 
> backburner at the moment but I plan to return to it during the Train 
> cycle. Perhaps you could clarify a few things about expected usage.

Cool. I'd probably try to pick one service (nova?) to start with before 
trying to bite off all of these in a single change (that review is kind 
of daunting).

Also, as part of the community wide goal I wrote up reference docs in 
the nova tree [1] which might answer your questions with links for more 
details.

> 
> 1. Should the tool be run using the new code? I would assume so.

Depends on what you mean by "new code". When nova introduced this in 
Ocata it was meant to be run in a venv or container after running the 
schema and data migrations from Newton to Ocata, but before restarting 
the services with the Ocata code. That's how grenade uses it. But the 
checks should also be idempotent, so they can be run as a 
post-install/upgrade verify step, which is how OSA uses it (and is 
described in the nova install docs [2]).
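
To make that ordering concrete, here is a rough Python sketch of the 
"migrate, check, then restart" sequence. The nova-manage and nova-status 
commands are the real CLIs, but the exact step list is simplified (online 
data migrations are left out), and upgrade_nova, run_command and the 
restart hook are hypothetical names made up for illustration, not what 
grenade or OSA actually run.

```python
import subprocess

# Return codes documented for "nova-status upgrade check".
SUCCESS, WARNING, FAILURE = 0, 1, 2


def run_command(argv):
    """Run a command and return its exit code (does not raise on failure)."""
    return subprocess.run(argv, check=False).returncode


def upgrade_nova(run=run_command, restart=None):
    """Sync the schema, run the upgrade check, then restart services.

    Returns the exit code of the upgrade check. The restart hook is
    called only if the check reported success or a warning.
    """
    # Schema migrations first (assumed to run from a venv or container).
    for argv in (["nova-manage", "api_db", "sync"],
                 ["nova-manage", "db", "sync"]):
        if run(argv) != 0:
            raise RuntimeError("schema migration failed: " + " ".join(argv))

    # Run the check before anything is restarted.
    code = run(["nova-status", "upgrade", "check"])
    if code in (SUCCESS, WARNING) and restart is not None:
        restart()  # hypothetical hook supplied by the deployment tool
    return code
```

Because the checks are meant to be idempotent, the same nova-status call 
could be repeated after the restart as a verify step.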

> 2. How would you expect this to be run with multiple projects? I was 
> thinking of adding a new command that performs upgrade checks for all 
> projects that would be read-only, then also performing the check again 
> as part of the upgrade procedure.

Hmm, good question. This probably depends on each deployment tool and 
how they roll through services to do the upgrade. Obviously you'd want 
to run each project's checks as part of upgrading that service, but I 
guess you're looking for some kind of "should we even start this whole 
damn upgrade if we can detect early that there are going to be issues?". 
If the early run is read-only though - and I'm assuming by read-only you 
mean they won't cause a failure - how are you going to expose that there 
is a problem without failing? Would you make that configurable? 
Otherwise the checks themselves are supposed to be read-only and not 
change your data (they aren't the same thing as an online data migration 
routine for example).
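
One way the "check everything up front" idea could look, with failing 
made configurable, is sketched below. It assumes each project ships a 
"<project>-status upgrade check" command, as the community goal 
intended; preflight, run_check and fail_on_problem are hypothetical names 
and this is not taken from the kolla-ansible patch.

```python
import subprocess


def run_check(project):
    """Run "<project>-status upgrade check" and return its exit code."""
    argv = [f"{project}-status", "upgrade", "check"]
    return subprocess.run(argv, check=False).returncode


def preflight(projects, run=run_check, fail_on_problem=False):
    """Run every project's upgrade check and collect the exit codes.

    With fail_on_problem=False the run only reports. With True it raises,
    but only after all projects have been checked, so the report is
    still complete.
    """
    results = {project: run(project) for project in projects}
    problems = {p: code for p, code in results.items() if code != 0}
    for project, code in sorted(problems.items()):
        print(f"{project}: upgrade check returned {code}")
    if problems and fail_on_problem:
        raise RuntimeError("upgrade checks reported problems: "
                           + ", ".join(sorted(problems)))
    return results
```

The per-service check during the actual upgrade would still run 
separately; this only covers the early "should we even start" pass.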

> 3. For the warnings, would you recommend a -Werror style argument that 
> optionally flags up warnings as errors? Reporting non-fatal errors is 
> quite difficult in Ansible.

OSA fails on any return codes that aren't 0 (success) or 1 (warning). 
It's hard to say when a warning should be considered an error, really. 
When writing these checks I think of a warning as a case where you might 
be OK but we don't really know for sure, so it can aid in debugging 
upgrade-related issues after the fact but might not necessarily mean you 
shouldn't upgrade. mnaser has brought up the idea in the past of making 
the output more machine readable so tooling could pick and choose which 
things it considers to be a failure (assuming the return code was 1). 
That's an interesting idea but one I haven't put a lot of thought into. 
It might be as simple as outputting a unique code per check per project, 
sort of like the error code concept in the API guidelines [3] which the 
placement project is using [4].
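
A -Werror style switch on top of the return codes could be as small as 
the sketch below. The codes 0 (success), 1 (warning) and 2 (failure) 
match what nova-status documents; is_fatal and warnings_as_errors are 
hypothetical names. With the default it behaves like the OSA rule 
described above, where anything other than 0 or 1 fails.

```python
SUCCESS, WARNING, FAILURE = 0, 1, 2


def is_fatal(return_code, warnings_as_errors=False):
    """Decide whether an upgrade check exit code should stop the upgrade."""
    if return_code == SUCCESS:
        return False
    if return_code == WARNING:
        # Only fatal when the operator asked for -Werror style behavior.
        return warnings_as_errors
    # Failures and unexpected codes are always fatal.
    return True
```

In an Ansible task the default policy would presumably map to something 
like failed_when with "rc not in [0, 1]" on the registered result.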

[1] https://docs.openstack.org/nova/latest/reference/upgrade-checks.html
[2] https://docs.openstack.org/nova/latest/install/verify.html
[3] https://specs.openstack.org/openstack/api-wg/guidelines/errors.html
[4] https://opendev.org/openstack/placement/src/branch/master/placement/errors.py

-- 

Thanks,

Matt