[watcher] nova cdm builder performance optimizations - summary


Thanks a lot for the summary, this could be very helpful. We will test
these changes :)

On Wed, Jul 10, 2019 at 2:29 AM Matt Riedemann <mriedemos at gmail.com> wrote:

> I wanted to summarize a series of changes which have improved the
> performance of the NovaClusterDataModel builder for audits across single
> and multiple cells (in the CERN case) by 20-30%.
>
> There were initially three changes involved (in order):
>
> 1. https://review.opendev.org/#/c/659688/ - Optimize
> NovaClusterDataModelCollector.add_instance_node
>
> Reports on that patch alone said it fixed a regression introduced in
> Stein with scoped audits:
>
> "I checked this patch on the my test environment on the stable/stein
> branch. I have more than 1000 virtual servers (some real, some dummy).
> Previously, in the stable/rocky branch, the time to build a cluster was
> about 15-20 minutes, in the Stein branch there was a regression and the
> time increased to 90 minutes. After this patch, the build time is only 2
> minutes."
>
> That change was backported to stable/stein.
>
> 2. https://review.opendev.org/#/c/661121/ - Optimize hypervisor API
> calls (which requires https://review.opendev.org/#/c/659886/)
>
> As noted, that change requires a patch to python-novaclient if you are
> looking to backport it. We can't backport it upstream because doing so
> would require bumping the minimum required version of python-novaclient
> on a stable branch, which is against stable branch policy (minimum
> versions of library dependencies are more or less frozen on stable
> branches).
>
> That change also requires configuring watcher with:
>
> [nova_client]
> # 2.53 or greater; train now requires at least 2.56
> api_version = 2.53
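
A minimal sketch of how the "2.53 or greater" requirement above could be checked before deploying. It assumes the file can be read with Python's standard configparser (Watcher itself loads its configuration through oslo.config, which is not used here), and the helper names parse_microversion and nova_api_version_ok are hypothetical. Microversions are compared as (major, minor) integer pairs, because a float comparison would wrongly rank 2.100 below 2.53.

```python
import configparser

# Minimum microversion named in the message above (assumed constant name).
MIN_MICROVERSION = (2, 53)


def parse_microversion(value):
    """Turn a 'major.minor' string such as '2.53' into a tuple of ints.

    Special values such as 'latest' are not handled in this sketch.
    """
    major, minor = value.strip().split(".")
    return int(major), int(minor)


def nova_api_version_ok(conf_text, minimum=MIN_MICROVERSION):
    """Return True if [nova_client] api_version is at least `minimum`."""
    # Inline comment prefixes are enabled so a trailing '# ...' is dropped
    # by configparser; other parsers may treat it as part of the value.
    parser = configparser.ConfigParser(inline_comment_prefixes=("#", ";"))
    parser.read_string(conf_text)
    configured = parser.get("nova_client", "api_version")
    return parse_microversion(configured) >= minimum


SAMPLE = """
[nova_client]
api_version = 2.53
"""

if __name__ == "__main__":
    print(nova_api_version_ok(SAMPLE))
```

The tuple comparison is the point of the sketch: (2, 100) >= (2, 53) holds, whereas comparing the same values as floats would not.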
>
> 3. https://review.opendev.org/#/c/662089/ - Optimize
> NovaHelper.get_compute_node_by_hostname
>
> This optimizes code used to build/update the nova CDM during
> notification processing and also fixes a bug in how the compute
> service is looked up.
>
> After those three changes were merged, Corne Lukken (Dantali0n) started
> doing scale and performance testing with and without the changes in a
> CERN 5-cell test cluster. Corne identified a regression, Canwei Li
> determined its root cause, and chenker fixed it:
>
> 4. https://review.opendev.org/#/c/668100/ - Reduce the query time of the
> instances when call get_instance_list()
>
> With that fix applied, Corne reported an overall improvement of 20-30%
> when building the nova CDM during an audit in various scenarios. The
> actual performance numbers will be sent later as part of a thesis Corne
> is working on.
>
> I want to thank Dantali0n, licanwei and chenker for all of their help
> with this series of improvements.
>
> --
>
> Thanks,
>
> Matt
>
>