[osprofiler] Distributed tracing in OpenStack



On 4/11/19 9:42 PM, Ilya Shakhat wrote:
> Hi,
> 
> Distributed tracing is one of the must-have features when one wants to 
> track the full path of a request going through different services and 
> APIs. This makes it similar to a shared request-id, but with a nice 
> visualization at the end [1]. In OpenStack, tracing can be achieved via 
> the osprofiler library. The library was introduced 5 years ago, when 
> there was no standard approach to tracing, which is why it stands apart 
> from what has since become mainstream. There is still no single 
> standard, but the major players are the OpenTracing and OpenCensus 
> communities. OpenTracing is represented by Uber's Jaeger, which is the 
> default tracer in the k8s world.
> 
> Issues and limitations to be fixed:
> 1. Compatibility. While the osprofiler library supports many different 
> storage drivers, it has only one way of transferring trace context over 
> the wire. Ideally the library should be compatible with other 
> third-party tracers and allow traces to start in front of OpenStack APIs 
> (e.g. in user apps) and continue after them (e.g. in storage systems or 
> network management tools). [2]
> 2. Operation mode. With osprofiler, tracing is initiated by a user 
> request, while in industrial solutions tracing can be managed centrally 
> via dynamic sampling policies.
> 3. In-process trace propagation. Depending on the execution model 
> (threaded, async), the ways of storing the current trace context differ. 
> OSProfiler supports the thread-local model, which was recently broken by 
> the new async implementation in openstacksdk [3].

FWIW - that issue should now be fixed again in the SDK for all cases 
other than parallel uploading of Large Object segments to swift. The 
parallelism support now relies on the calling context's parallelism. The 
large-object segment uploader is something we should handle explicitly 
to make sure we're not losing those interactions.

That said - if we move forward with this plan - let's make sure it works 
in openstacksdk, and that we're testing it so that we don't break it.
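
To illustrate the in-process propagation problem from point 3, here is a 
minimal sketch. It is not osprofiler or openstacksdk code; the names 
`_local`, `_trace_var`, `read_local`, `read_ctxvar` and `demo` are made up 
for the example. It assumes work is handed to a thread pool: a value kept 
in thread-local storage is not visible in the worker, while a 
`contextvars` snapshot taken by the caller can be carried over explicitly.

```python
import contextvars
import threading
from concurrent.futures import ThreadPoolExecutor

# Two hypothetical places to keep the current trace id.
_local = threading.local()
_trace_var = contextvars.ContextVar("trace_id", default=None)


def read_local():
    # Runs in the worker thread, which has its own thread-local storage.
    return getattr(_local, "trace_id", None)


def read_ctxvar():
    return _trace_var.get()


def demo():
    # The caller sets the trace id in both stores.
    _local.trace_id = "abc123"
    _trace_var.set("abc123")

    with ThreadPoolExecutor(max_workers=1) as pool:
        # Thread-local state stays behind in the calling thread.
        lost = pool.submit(read_local).result()

        # A snapshot of the caller's context is run inside the worker.
        snapshot = contextvars.copy_context()
        kept = pool.submit(snapshot.run, read_ctxvar).result()

    return lost, kept


if __name__ == "__main__":
    print(demo())
```

A regression test in openstacksdk could follow the same shape: set a trace 
context, go through the parallel code path, and check that the context is 
still there on the other side.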

> With OpenTracing it is possible to 
> select the appropriate model along with the tracer configuration.
> 
> What's the plan:
> Switching to OpenTracing could be a good option to gain compatibility 
> with 3rd-party solutions. The actual change should go into the 
> osprofiler library, but it indirectly affects all OpenStack projects 
> (should it be a global team goal then?). I'm going to make a PoC of the 
> proposed change, so reviews would be highly appreciated.
> 
> Comments, suggestions?

Generally supportive. I have specific implementation feedback - but I'll 
leave that on the patches.

> Thanks,
> Ilya
> 
> [1] e.g. 
> http://logs.openstack.org/15/650915/4/check/tempest-smoke-py3-osprofiler-redis/7c6c14e/osprofiler-traces/trace-3e5cc660-8815-4079-86b9-778af8469d79.html.gz
> [2] https://bugs.launchpad.net/osprofiler/+bug/1798565
> [3] https://bugs.launchpad.net/osprofiler/+bug/1818493