


[oslo][oslo-messaging][nova] Stein nova-api AMQP issue running under uWSGI


On Mon, Apr 22, 2019 at 12:25 PM Ben Nemec <openstack at nemebean.com> wrote:
>
>
>
> On 4/22/19 12:53 PM, Alex Schultz wrote:
> > On Mon, Apr 22, 2019 at 11:28 AM Ben Nemec <openstack at nemebean.com> wrote:
> >>
> >>
> >>
> >> On 4/20/19 1:38 AM, Michele Baldessari wrote:
> >>> On Fri, Apr 19, 2019 at 03:20:44PM -0700, iain.macdonnell at oracle.com wrote:
> >>>>
> >>>> Today I discovered that this problem appears to be caused by eventlet
> >>>> monkey-patching. I've created a bug for it:
> >>>>
> >>>> https://bugs.launchpad.net/nova/+bug/1825584
> >>>
> >>> Hi,
> >>>
> >>> just for completeness we see this very same issue also with
> >>> mistral (actually it was the first service where we noticed the missed
> >>> heartbeats). iirc Alex Schultz mentioned seeing it in ironic as well,
> >>> although I have not personally observed it there yet.
> >>
> >> Is Mistral also mixing eventlet monkeypatching and WSGI?
> >>
> >
> > Looks like there is monkey patching, however we noticed it with the
> > engine/executor. So it's likely not just wsgi.  I think I also saw it
> > in the ironic-conductor, though I'd have to try it out again.  I'll
> > spin up an undercloud today and see if I can get a more complete list
> > of affected services. It was pretty easy to reproduce.
>
> Okay, I asked because if there's no WSGI/Eventlet combination then this
> may be different from the Nova issue that prompted this thread. It
> sounds like that was being caused by a bad interaction between WSGI and
> some Eventlet timers. If there's no WSGI involved then I wouldn't expect
> that to happen.
>
> I guess we'll see what further investigation turns up, but based on the
> preliminary information there may be two bugs here.
>

I haven't been able to reproduce the ironic issue yet. On the
undercloud, the services that exhibit the problem are the mistral
executor and nova-api.

mistral/executor.log:2019-04-22 22:40:58.321 7 ERROR
oslo.messaging._drivers.impl_rabbit [-]
[b7b4bc40-767c-4de1-b77b-6a5822f6beed] AMQP server on
undercloud-0.ctlplane.localdomain:5672 is unreachable: [Errno 104]
Connection reset by peer. Trying again in 1 seconds.:
ConnectionResetError: [Errno 104] Connection reset by peer


nova/nova-api.log:2019-04-22 22:38:11.530 19 ERROR
oslo.messaging._drivers.impl_rabbit
[req-d7767aed-e32d-43db-96a8-c0509bfb1cfe
9ac89090d2d24949b9a1e01b1afb14cc 7becac88cbae4b3b962ecccaf536effe -
default default] [c0f3fe7f-db89-42c6-95bd-f367a4fbf680] AMQP server on
undercloud-0.ctlplane.localdomain:5672 is unreachable: Server
unexpectedly closed connection. Trying again in 1 seconds.: OSError:
Server unexpectedly closed connection

The errors being thrown are different, so perhaps these are two
different problems.
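
For anyone comparing logs from other services, here is a rough sketch of
how the two signatures above could be tallied per log file. It assumes
grep-style "file:record" output with each record on a single line (the
excerpts above are wrapped by the mail client). The regex and the name
summarize_amqp_errors are illustrative only and are not part of
oslo.messaging.

```python
import re
from collections import Counter

# Hypothetical pattern, written against the two excerpts in this mail only.
# It captures the log file prefix added by grep and the exception class
# name that follows "Trying again in N seconds.:".
PATTERN = re.compile(
    r"^(?P<logfile>[^:]+):.*AMQP server on (?P<server>\S+) is unreachable: "
    r".*Trying again in \d+ seconds\.: (?P<exc>\w+):"
)


def summarize_amqp_errors(lines):
    """Count 'AMQP server ... is unreachable' errors per (log file, exception)."""
    counts = Counter()
    for line in lines:
        match = PATTERN.match(line)
        if match:
            counts[(match.group("logfile"), match.group("exc"))] += 1
    return counts


# The two records from this mail, re-joined onto one line each.
SAMPLE = [
    "mistral/executor.log:2019-04-22 22:40:58.321 7 ERROR "
    "oslo.messaging._drivers.impl_rabbit [-] "
    "[b7b4bc40-767c-4de1-b77b-6a5822f6beed] AMQP server on "
    "undercloud-0.ctlplane.localdomain:5672 is unreachable: [Errno 104] "
    "Connection reset by peer. Trying again in 1 seconds.: "
    "ConnectionResetError: [Errno 104] Connection reset by peer",
    "nova/nova-api.log:2019-04-22 22:38:11.530 19 ERROR "
    "oslo.messaging._drivers.impl_rabbit "
    "[req-d7767aed-e32d-43db-96a8-c0509bfb1cfe "
    "9ac89090d2d24949b9a1e01b1afb14cc 7becac88cbae4b3b962ecccaf536effe - "
    "default default] [c0f3fe7f-db89-42c6-95bd-f367a4fbf680] AMQP server on "
    "undercloud-0.ctlplane.localdomain:5672 is unreachable: Server "
    "unexpectedly closed connection. Trying again in 1 seconds.: OSError: "
    "Server unexpectedly closed connection",
]

if __name__ == "__main__":
    for (logfile, exc), count in sorted(summarize_amqp_errors(SAMPLE).items()):
        print(f"{logfile}: {exc} x{count}")
```

One caveat when reading the result: in Python 3, ConnectionResetError is
a subclass of OSError, so differing exception names alone may not prove
there are two separate root causes.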

Thanks,
-Alex