[qa][openstackclient] Debugging devstack slowness
---- On Thu, 01 Aug 2019 17:58:18 +0900 Ian Wienand <iwienand at redhat.com> wrote ----
> On Fri, Jul 26, 2019 at 04:53:28PM -0700, Clark Boylan wrote:
> > Given my change shows this can be so much quicker is there any
> > interest in modifying devstack to be faster here? And if so what do
> > we think an appropriate approach would be?
> My first concern was if anyone considered openstack-client setting
> these things up as actually part of the testing. I'd say not,
> comments in  suggest similar views.
> My second concern is that we do keep sufficient track of complexity v
> speed; obviously doing things in a sequential manner via a script is
> pretty simple to follow and as we start putting things into scripts we
> make it harder to debug when a monoscript dies and you have to start
> pulling apart where it was. With just a little json fiddling we can
> currently pull good stats from logstash () so I think as we go it
> would be good to make sure we account for the time using appropriate
> wrappers, etc.
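Agreed on keeping the timing accounted for. For the record, a standalone sketch of what such a wrapper could look like in bash (devstack has similar time_start/time_stop bookkeeping in functions-common if I remember correctly; the names and layout below are just illustrative):

```shell
#!/usr/bin/env bash
# Illustrative timing wrappers: accumulate wall-clock seconds per named
# phase so the totals can be reported (and later shipped to logstash).
declare -A _TIME_TOTAL _TIME_START

time_start() {
    # Record when the named phase began, using bash's SECONDS counter.
    _TIME_START[$1]=$SECONDS
}

time_stop() {
    # Add the elapsed time for this phase to its running total.
    local name=$1
    local elapsed=$(( SECONDS - ${_TIME_START[$name]} ))
    _TIME_TOTAL[$name]=$(( ${_TIME_TOTAL[$name]:-0} + elapsed ))
}

time_report() {
    # Print one "phase: Ns" line per tracked phase.
    local name
    for name in "${!_TIME_TOTAL[@]}"; do
        echo "$name: ${_TIME_TOTAL[$name]}s"
    done
}
```

So even if the OSC calls get batched or scripted, each batch can still be wrapped in a start/stop pair and show up in the stats.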
I agree with this concern about maintainability and debugging with scripts.
Nowadays, very few people have good knowledge of the devstack code, and
debugging job-side failures is much harder for most developers.
IMO maintainability and ease of debugging should be the first priority.
If we wanted to replace the OSC calls with something faster, the Tempest
service clients come to my mind. They make very direct calls to the API,
but a token is requested for each API call, so that approach would need a
PoC on speed.
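Whatever client we use, the per-call token request is the part worth avoiding. A rough shell sketch of the idea (URLs, credentials, and helper names here are all placeholders, not real devstack code): authenticate against Keystone once, cache the token, and reuse it for the subsequent direct REST calls.

```shell
#!/usr/bin/env bash
# Illustrative sketch: fetch one Keystone v3 token and reuse it, instead
# of re-authenticating for every API call. ensure_token/api_get are
# made-up helper names; OS_* variables are placeholders.
ensure_token() {
    # Reuse the cached token if we already have one.
    [[ -n "${OS_TOKEN:-}" ]] && return 0
    local body
    body='{"auth": {"identity": {"methods": ["password"],
      "password": {"user": {"name": "'"$OS_USERNAME"'",
        "domain": {"id": "default"}, "password": "'"$OS_PASSWORD"'"}}},
      "scope": {"project": {"name": "'"$OS_PROJECT_NAME"'",
        "domain": {"id": "default"}}}}}'
    # Keystone returns the token in the X-Subject-Token response header.
    OS_TOKEN=$(curl -si -X POST "$OS_AUTH_URL/v3/auth/tokens" \
        -H "Content-Type: application/json" -d "$body" |
        awk 'tolower($1) == "x-subject-token:" {gsub(/\r/, ""); print $2}')
}

api_get() {
    # Every call reuses the cached token; only the first one authenticates.
    ensure_token
    curl -s -H "X-Auth-Token: $OS_TOKEN" "$1"
}
```

That is roughly what batching inside one client process buys you for free, so a PoC comparing the two would be useful.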
> Then the third concern is not to break anything for plugins --
> devstack has a very very loose API which basically relies on plugin
> authors using a combination of good taste and copying other code to
> decide what's internal or not.
> Which made me start thinking I wonder if we look at this closely, even
> without replacing things we might make inroads?
> For example: it seems like SERVICE_DOMAIN_NAME is never not
> default, so the get_or_create_domain call is always just overhead (the
> result is never used).
> Then it seems that in the gate, basically all of the "get_or_create"
> calls will really just be "create" calls? Because we're always
> starting fresh. So we could cut out about half of the calls there
> pre-checking if we know we're under zuul (proof-of-concept ).
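The proof-of-concept idea above could look roughly like this (simplified, made-up function body; the real devstack helpers take more arguments):

```shell
#!/usr/bin/env bash
# Illustrative sketch of skipping the "get" half of get_or_create when we
# know we are on a fresh Zuul node where nothing pre-exists.
get_or_create_project() {
    local name=$1
    if [[ -z "${ZUUL_JOB_NAME:-}" ]]; then
        # Interactive/dev use: the project may already exist, so check first.
        local id
        id=$(openstack project show -f value -c id "$name" 2>/dev/null) &&
            { echo "$id"; return; }
    fi
    # Fresh gate node: nothing pre-exists, create unconditionally.
    openstack project create -f value -c id "$name"
}
```

That halves the client invocations on gate nodes without changing the interface plugins see.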
> Then we have blocks like:
> get_or_add_user_project_role $member_role $demo_user $demo_project
> get_or_add_user_project_role $admin_role $admin_user $demo_project
> get_or_add_user_project_role $another_role $demo_user $demo_project
> get_or_add_user_project_role $member_role $demo_user $invis_project
> If we wrapped that in something like
> which sets a variable that means instead of calling directly, those
> functions write their arguments to a tmp file. Then at the end call,
> end_osc_session does
> $ osc "$(< tmpfile)"
> and uses the inbuilt batching? If that had half the calls by skipping
> the "get_or" bit, and used common authentication from batching, would
> that help?
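I think it would. To make the session idea concrete, here is a rough sketch of the wrapper shape being described (begin_osc_session/end_osc_session and the queued helper are made-up names for this sketch; osc can read subcommands from stdin, which is the batching relied on here):

```shell
#!/usr/bin/env bash
# Illustrative batching session: helpers append subcommands to a temp
# file instead of invoking the client once per call, then one openstack
# process (one authentication) executes the whole batch from stdin.
begin_osc_session() {
    OSC_BATCH_FILE=$(mktemp)
}

add_user_project_role() {
    # $1 = role, $2 = user, $3 = project. Queue instead of executing now.
    echo "role add --user $2 --project $3 $1" >> "$OSC_BATCH_FILE"
}

end_osc_session() {
    # One client process, one token, many commands read from stdin.
    openstack < "$OSC_BATCH_FILE"
    rm -f "$OSC_BATCH_FILE"
}
```

The four role assignments quoted above would then cost one authentication instead of four.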
> And then I don't know if all the projects and groups are required for
> every devstack run? Maybe someone skilled in the art could do a bit
> of an audit and we could cut more of that out too?
Yeah, auditing and cutting out such unused or unnecessary calls is a good idea.
For example, in most places devstack needs just the resource id, name, or a
few fields of the created resource, so a GET call that returns the complete
resource fields might not be needed; for async calls we can make an exception
and fetch the resource (e.g. 'addresses' on a server).
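For the "only the id is needed" case, osc's output formatter already supports that directly via -f value -c, so no extra GET or output parsing is required. A tiny sketch (get_field_of is a made-up helper name):

```shell
#!/usr/bin/env bash
# Illustrative helper: ask the client for a single column of a single
# field rather than the full resource table.
get_field_of() {
    # $1 = field name, remaining args = the osc subcommand,
    # e.g.: get_field_of id project show demo
    local field=$1
    shift
    openstack "$@" -f value -c "$field"
}
```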
> So I guess my point is that maybe we could tweak what we have a bit to
> make some immediate wins, before anyone has to rewrite too much?
>  https://review.opendev.org/673018
>  https://ethercalc.openstack.org/rzuhevxz7793
>  https://review.opendev.org/673941
>  https://review.opendev.org/673936