[kolla][nova][cinder] Got Gateway-Timeout error on VM evacuation if it has volume attached.


Thanks for the advice.

Actually, I tested evacuation again two days ago, and this time it
succeeded. All VMs, including those with volumes attached, were evacuated
with no errors.
Horizon still responds slowly when one node is shut down, but it is
faster than before.
Even so, I think we still need a longer timeout.

I also think I should increase rpc_response_timeout rather than
long_rpc_timeout in nova.
Please correct me if I am wrong.

Many thanks,
Eddie.

Matt Riedemann <mriedemos at gmail.com> wrote on Thursday, July 25, 2019 at 8:11 PM:

> On 7/25/2019 3:14 AM, Gorka Eguileor wrote:
> > Attachment delete is a synchronous operation, so all the different
> > connection timeouts may affect the operation: Nova to HAProxy, HAProxy
> > to Cinder-API, Cinder-API to Cinder-Volume via RabbitMQ, Cinder-Volume
> > to Storage backend.
> >
> > I would recommend looking at the specific attachment_delete request
> > that failed in Cinder logs and see how long it took to complete, and
> > then check how long it took for the 504 error to happen.  With that info
> > you can get an idea of how much higher your timeout must be.
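
The comparison suggested above can be sketched in a few lines of Python. This is only a rough illustration: the timestamp layout, the three timestamps, and the helper names (seconds_between, suggest_timeout) are all hypothetical, and the 1.5x headroom is an arbitrary assumption rather than a recommended value.

```python
import math
from datetime import datetime

# Assumed timestamp layout; adjust to whatever your logs actually use.
TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def seconds_between(start: str, end: str) -> float:
    """Elapsed seconds between two timestamp strings in TS_FORMAT."""
    delta = datetime.strptime(end, TS_FORMAT) - datetime.strptime(start, TS_FORMAT)
    return delta.total_seconds()


def suggest_timeout(operation_s: float, headroom: float = 1.5, round_to: int = 10) -> int:
    """Smallest multiple of round_to that is >= operation_s * headroom."""
    return int(math.ceil(operation_s * headroom / round_to) * round_to)


# Hypothetical values, NOT taken from real logs.
request_received = "2019-07-25 03:14:00.000"    # attachment_delete request arrives
gateway_504_seen = "2019-07-25 03:15:00.000"    # caller receives the 504
operation_finished = "2019-07-25 03:15:25.500"  # operation actually completes

time_to_504 = seconds_between(request_received, gateway_504_seen)
time_to_complete = seconds_between(request_received, operation_finished)
shortfall = time_to_complete - time_to_504
suggested = suggest_timeout(time_to_complete)

print(f"504 after {time_to_504:.1f}s, completed after {time_to_complete:.1f}s")
print(f"timeout was short by {shortfall:.1f}s; try about {suggested}s")
```

With these made-up numbers the operation outlives the 504 by 25.5 seconds, so the timeout on whichever hop fired would need to exceed 85.5 seconds with some margin.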
> >
> > It could also happen that the Cinder-API raises a timeout error when
> > calling the Cinder-Volume.  In this case you should check the
> > cinder-volume service to see how long it took to complete, as the
> > operation continues.
> >
> > Internally the Cinder-API to Cinder-Volume timeout is usually around 60
> > seconds (rpc_response_timeout).
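
To make the chain of timeouts described above easier to reason about, here is a crude Python model of it. Only the 60-second rpc_response_timeout figure comes from this thread; every other timeout value is a hypothetical placeholder, and the Hop class and both helper functions are invented for illustration. It also assumes all timers start at the same moment, which real deployments only approximate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    """One connection in the request path and its configured timeout."""
    name: str
    timeout_s: float


def first_to_expire(hops):
    """The hop with the smallest timeout, i.e. the one that fires first."""
    return min(hops, key=lambda hop: hop.timeout_s)


def hops_exceeded(hops, duration_s):
    """Names of hops whose timeout is shorter than the operation duration."""
    return [hop.name for hop in hops if duration_s > hop.timeout_s]


chain = [
    Hop("Nova -> HAProxy", 120.0),                  # hypothetical value
    Hop("HAProxy -> Cinder-API", 90.0),             # hypothetical value
    Hop("Cinder-API -> Cinder-Volume (RPC)", 60.0), # ~60 s per the thread
    Hop("Cinder-Volume -> storage backend", 300.0), # hypothetical value
]

weakest = first_to_expire(chain)
print(f"first timeout to fire: {weakest.name} at {weakest.timeout_s:.0f}s")

# An operation that takes 85.5 s (also a made-up figure) trips only the RPC hop here.
print(hops_exceeded(chain, 85.5))
```

The point of the model is that raising a single timeout helps only if it is the smallest one in the path; otherwise another hop fires first.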
>
> Yeah this is a known intermittent issue in our CI jobs as well, for
> example:
>
> http://status.openstack.org/elastic-recheck/#1763712
>
> As I mentioned in the bug report for that issue:
>
> https://bugs.launchpad.net/cinder/+bug/1763712
>
> It might be worth using the long_rpc_timeout approach for this, assuming
> the HTTP response doesn't time out. Nova uses long_rpc_timeout for known
> long RPC calls:
>
>
> https://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.long_rpc_timeout
>
> Cinder should probably do the same for initialize connection style RPC
> calls. I've seen other gate failures where cinder-backup to
> cinder-volume rpc calls to initialize a connection have timed out as
> well, e.g.:
>
> https://bugs.launchpad.net/cinder/+bug/1739482
>
> --
>
> Thanks,
>
> Matt
>
>