[qa][openstackclient] Debugging devstack slowness
On Mon, Jul 29, 2019, at 6:41 AM, Alex Schultz wrote:
> On Fri, Jul 26, 2019 at 5:57 PM Clark Boylan <cboylan at sapwetik.org> wrote:
> > Today I have been digging into devstack runtime costs to help Donny Davis understand why tempest jobs sometimes time out on the FortNebula cloud. One thing I discovered was that the keystone user, group, project, role, and domain setup can take many minutes (in the examples here almost 5).
> > I've rewritten create_keystone_accounts as a python tool, cutting the runtime for that subset of setup from ~100s to ~9s. I imagine that if we applied this to the other create_X_accounts functions we would see similar results.
> > I think this is so much faster because we avoid repeated costs in openstack client, including: python process startup, pkg_resources disk scanning to find entrypoints, and converting names to IDs via the API every time osc is run. Given that my change shows this can be so much quicker, is there any interest in modifying devstack to be faster here? And if so, what do we think an appropriate approach would be?
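To make the process-startup cost concrete, here is a small stdlib-only illustration (not devstack or osc code): spawning a fresh Python interpreter for each operation, the way each `openstack` invocation does, versus performing the same trivial work inside one already-running process.

```python
# Illustration only: each short-lived interpreter pays full startup cost,
# analogous to running the `openstack` CLI once per resource.
import subprocess
import sys
import time

N = 5

start = time.perf_counter()
for _ in range(N):
    # Fresh interpreter per "command": startup dominates.
    subprocess.run([sys.executable, "-c", "pass"], check=True)
t_many_processes = time.perf_counter() - start

start = time.perf_counter()
for _ in range(N):
    pass  # The same no-op "work" inside one long-lived process.
t_one_process = time.perf_counter() - start

print(f"{N} interpreters: {t_many_processes:.3f}s, "
      f"one process: {t_one_process:.6f}s")
```

The gap only widens in the real case, where each CLI run also scans entrypoints and re-authenticates.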
> In tripleo, we've also run into the same thing for other actions.
> While you can do bulk openstack client actions, that isn't much help
> if you need to create a resource and fetch its ID for a subsequent
> action. We ported our post-installation items to python and noticed a
> dramatic improvement as well. It might be beneficial to add some
> caching to openstackclient so that the startup cost isn't so large
> every time?
Reading more of what devstack does, I've realized that there is quite a bit of logic tied up in devstack's use of OSC. In particular, if you select one option you get this endpoint, and if you select another option you get that endpoint; or if this service and that service are both enabled then they need this common role, and so on. I think the best way to tackle this would be to have devstack write a manifest file, then have a tool (maybe in osc or sdk?) that can read the manifest and execute the API updates in order, storing intermediate results so that they can be referred to without doing further API lookups.
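As a rough sketch of the manifest idea: devstack would emit an ordered list of API operations, and an executor would run them one by one, storing each result under a name so later steps can reference IDs without re-querying the API. The manifest format, the `$name.field` reference syntax, and the `FakeKeystone` stand-in below are all hypothetical, not an actual osc/sdk interface.

```python
# Hypothetical manifest executor. FakeKeystone stands in for a real SDK
# client and simply returns predictable IDs so the flow is visible.
import itertools

class FakeKeystone:
    """Stand-in for a Keystone client; hands out sequential IDs."""
    def __init__(self):
        self._ids = (f"id-{n}" for n in itertools.count(1))
        self.calls = []

    def create(self, kind, **attrs):
        self.calls.append((kind, attrs))
        return {"id": next(self._ids), **attrs}

def run_manifest(client, manifest):
    results = {}
    for step in manifest:
        # Resolve "$name.field" references against earlier results,
        # avoiding a name-to-ID lookup against the API.
        attrs = {}
        for key, value in step.get("attrs", {}).items():
            if isinstance(value, str) and value.startswith("$"):
                name, field = value[1:].split(".")
                value = results[name][field]
            attrs[key] = value
        results[step["name"]] = client.create(step["kind"], **attrs)
    return results

manifest = [
    {"name": "demo_project", "kind": "project",
     "attrs": {"name": "demo"}},
    {"name": "demo_user", "kind": "user",
     "attrs": {"name": "demo", "project_id": "$demo_project.id"}},
]

client = FakeKeystone()
results = run_manifest(client, manifest)
print(results["demo_user"]["project_id"])  # the project's ID, cached in-process
```

The conditional endpoint/role logic mentioned above would live in devstack, which decides *what* goes into the manifest; the executor stays a dumb, fast interpreter of it.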
Sounds like such a thing would be useful outside of devstack as well. I brought this up briefly with Monty and he said he would explore it a bit on the SDK side of things. Does this seem like a reasonable approach? Anyone else have better ideas?
The big key here seems to be reusing authentication tokens and remembering resource ID data so that we can avoid unnecessary (and costly) lookups every time we want to modify a resource or associate resources.
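The two savings above can be sketched with a stubbed identity service: authenticate once and hold onto the token, and memoize name-to-ID lookups so repeated references to the same resource never hit the API again. Class and method names here are illustrative, not a real keystoneauth or osc interface.

```python
# Stub identity API that counts how often it is actually called.
class StubIdentityAPI:
    def __init__(self):
        self.auth_calls = 0
        self.lookup_calls = 0

    def issue_token(self):
        self.auth_calls += 1
        return "token-abc"

    def find_id(self, kind, name):
        self.lookup_calls += 1
        return f"{kind}-{name}-id"

class CachingClient:
    """Reuse one auth token and cache resolved resource IDs."""
    def __init__(self, api):
        self._api = api
        self._token = None
        self._id_cache = {}

    @property
    def token(self):
        if self._token is None:  # authenticate once, then reuse
            self._token = self._api.issue_token()
        return self._token

    def resolve(self, kind, name):
        key = (kind, name)
        if key not in self._id_cache:
            self._id_cache[key] = self._api.find_id(kind, name)
        return self._id_cache[key]

api = StubIdentityAPI()
client = CachingClient(api)
for _ in range(10):
    client.token
    client.resolve("project", "demo")
print(api.auth_calls, api.lookup_calls)  # 1 1
```

Run per-process (as osc is today) the counters would be 10 and 10; in one long-lived process they collapse to a single authentication and a single lookup.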
>  https://review.opendev.org/#/c/521146/
>  https://review.opendev.org/#/c/614540/
> >  https://opendev.org/openstack/devstack/src/commit/6aeaceb0c4ef078d028fb6605cac2a37444097d8/stack.sh#L1146-L1161
> >  http://logs.openstack.org/05/672805/4/check/tempest-full/14f3211/job-output.txt.gz#_2019-07-26_12_31_04_488228
> >  http://logs.openstack.org/05/672805/4/check/tempest-full/14f3211/job-output.txt.gz#_2019-07-26_12_35_53_445059
> >  https://review.opendev.org/#/c/673108/
> >  http://logs.openstack.org/08/673108/6/check/devstack-xenial/a4107d0/job-output.txt.gz#_2019-07-26_23_18_37_211013
> > Note the jobs compared above all ran on rax-dfw.
> > Clark