Slow instance launch times due to RabbitMQ


I am curious how your system is set up.
Are you using nova with local storage?
Are you using ceph?

How long does it take to launch an instance when you are seeing this
message?




On Wed, Aug 7, 2019 at 11:12 AM Herve Beraud <hberaud at redhat.com> wrote:

>
>
> On Tue, Aug 6, 2019 at 5:14 PM, Ben Nemec <openstack at nemebean.com> wrote:
>
>> Another thing to check if you're having seemingly inexplicable messaging
>> issues is that there isn't a notification queue filling up somewhere. If
>> notifications are enabled somewhere but nothing is consuming them the
>> size of the queue will eventually grind rabbit to a halt.
>>
>> I used to check queue sizes through the rabbit web ui, so I have to
>> admit I'm not sure how to do it through the cli.
>>
>
> You can use the following command to monitor your queues and watch their
> size and growth:
>
> ```
> watch -c "rabbitmqctl list_queues name messages_unacknowledged"
> ```
>
> Or something like this, which lists more columns and keeps only the reply
> queues:
>
> ```
> rabbitmqctl list_queues messages consumers name message_bytes \
>   messages_unacknowledged messages_ready head_message_timestamp \
>   consumer_utilisation memory state | grep reply
> ```
>
>
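
To make a growing queue easier to spot than by eyeballing the full listing, the output can be filtered by message count. This is only a rough sketch: `flag_large_queues` is a hypothetical helper name, the default threshold of 1000 is an arbitrary assumption, and it expects the tab-separated `name` and `messages` columns that `rabbitmqctl list_queues name messages` prints. Banner and header lines are skipped because their second field is not a number.

```shell
#!/bin/sh
# Print queues whose message count is at or above a threshold.
# Reads "name<TAB>messages" lines on stdin, as produced by:
#   rabbitmqctl list_queues name messages
# flag_large_queues is a hypothetical helper, not part of RabbitMQ.
flag_large_queues() {
    threshold="${1:-1000}"   # assumed default; tune for your deployment
    awk -F '\t' -v limit="$threshold" '
        # Keep only rows whose second field is a plain integer >= limit.
        $2 ~ /^[0-9]+$/ && $2 + 0 >= limit + 0 { printf "%s\t%s\n", $1, $2 }
    '
}

# Example usage on a controller node (requires rabbitmqctl):
#   rabbitmqctl list_queues name messages | flag_large_queues 500
```

If notifications are enabled but nothing consumes them, the notification queues should be the ones that keep showing up here with an ever larger count.
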
>>
>> On 7/31/19 10:48 AM, Gabriele Santomaggio wrote:
>> > Hi,
>> > Are you using SSL connections?
>> >
>> > Could it be this issue?
>> > https://bugs.launchpad.net/ubuntu/+source/oslo.messaging/+bug/1800957
>> >
>> >
>> > ------------------------------------------------------------------------
>> > *From:* Laurent Dumont <laurentfdumont at gmail.com>
>> > *Sent:* Wednesday, July 31, 2019 4:20 PM
>> > *To:* Grant Morley
>> > *Cc:* openstack-operators at lists.openstack.org
>> > *Subject:* Re: Slow instance launch times due to RabbitMQ
>> > That is a bit strange, list_queues should return something. A couple of
>> > ideas:
>> >
>> >   * Are the Rabbit connection failure logs on the compute pointing to a
>> >     specific controller?
>> >   * Are there any logs within Rabbit on the controller that would point
>> >     to a transient issue?
>> >   * cluster_status is a snapshot of the cluster at the time you ran the
>> >     command. If the alarms have cleared, you won't see anything.
>> >   * If you have the RabbitMQ management plugin activated, I would
>> >     recommend a quick look to see the historical metrics and overall
>> >     status.
>> >
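
For the first point in the list above, one way to see whether the failures cluster on a single controller is to count the "is unreachable" errors per AMQP endpoint. The following is a minimal sketch, assuming each error sits on one line in the service log (as oslo.messaging normally writes it, unlike the wrapped copy in this thread); `count_unreachable_by_server` is a hypothetical helper name and the log path in the usage comment is an assumption about a typical layout.

```shell
#!/bin/sh
# Count "AMQP server on HOST:PORT is unreachable" errors per endpoint.
# Reads log lines on stdin and prints "HOST:PORT<TAB>count", busiest first.
# count_unreachable_by_server is a hypothetical helper name.
count_unreachable_by_server() {
    # Extract the HOST:PORT token from matching lines only.
    sed -n 's/.*AMQP server on \([^ ]*\) is unreachable.*/\1/p' \
        | sort | uniq -c | sort -rn \
        | awk '{ print $2 "\t" $1 }'
}

# Example usage on a compute node (log path is an assumption):
#   count_unreachable_by_server < /var/log/nova/nova-compute.log
```

If one endpoint dominates the counts, that controller's RabbitMQ logs are the first place to look; an even spread points more towards the clients or the network.
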
>> >
>> > On Wed, Jul 31, 2019 at 9:35 AM Grant Morley <grant at civo.com> wrote:
>> >
>> >     Hi guys,
>> >
>> >     We are using Ubuntu 16 and OpenStack-Ansible to do our setup.
>> >
>> >     rabbitmqctl list_queues
>> >     Listing queues
>> >
>> >     (There don't appear to be any queues.)
>> >
>> >     rabbitmqctl cluster_status
>> >
>> >     Cluster status of node
>> >     'rabbit@management-1-rabbit-mq-container-b4d7791f'
>> >     [{nodes,[{disc,['rabbit@management-1-rabbit-mq-container-b4d7791f',
>> >                      'rabbit@management-2-rabbit-mq-container-b455e77d',
>> >                      'rabbit@management-3-rabbit-mq-container-1d6ae377']}]},
>> >       {running_nodes,['rabbit@management-3-rabbit-mq-container-1d6ae377',
>> >                       'rabbit@management-2-rabbit-mq-container-b455e77d',
>> >                       'rabbit@management-1-rabbit-mq-container-b4d7791f']},
>> >       {cluster_name,<<"openstack">>},
>> >       {partitions,[]},
>> >       {alarms,[{'rabbit@management-3-rabbit-mq-container-1d6ae377',[]},
>> >                {'rabbit@management-2-rabbit-mq-container-b455e77d',[]},
>> >                {'rabbit@management-1-rabbit-mq-container-b4d7791f',[]}]}]
>> >
>> >     Regards,
>> >
>> >     On 31/07/2019 11:49, Laurent Dumont wrote:
>> >>     Could you forward the output of the following commands on a
>> >>     controller node? :
>> >>
>> >>     rabbitmqctl cluster_status
>> >>     rabbitmqctl list_queues
>> >>
>> >>     You won't necessarily see a high load on a Rabbit cluster that is
>> >>     in a bad state.
>> >>
>> >>     On Wed, Jul 31, 2019 at 5:19 AM Grant Morley <grant at civo.com> wrote:
>> >>
>> >>         Hi all,
>> >>
>> >>         We are randomly seeing slow instance launch / deletion times
>> >>         and it appears to be because of RabbitMQ. We are seeing a lot
>> >>         of these messages in the logs for Nova and Neutron:
>> >>
>> >>         ERROR oslo.messaging._drivers.impl_rabbit [-]
>> >>         [f4ab3ca0-b837-4962-95ef-dfd7d60686b6] AMQP server on
>> >>         10.6.2.212:5671 is unreachable: Too many heartbeats missed.
>> >>         Trying again in 1 seconds. Client port: 37098:
>> >>         ConnectionForced: Too many heartbeats missed
>> >>
>> >>         The RabbitMQ cluster isn't under high load and I am not seeing
>> >>         any packets drop over the network when I do some tracing.
>> >>
>> >>         We are only running 15 compute nodes currently and have >1000
>> >>         instances so it isn't a large deployment.
>> >>
>> >>         Are there any good configuration tweaks for RabbitMQ running
>> >>         on OpenStack Queens?
>> >>
>> >>         Many Thanks,
>> >>
>> >>         --
>> >>
>> >>         Grant Morley
>> >>         Cloud Lead, Civo Ltd
>> >>         www.civo.com <https://www.civo.com/>| Signup for an account!
>> >>         <https://www.civo.com/signup>
>> >>
>> >     --
>> >
>> >     Grant Morley
>> >     Cloud Lead, Civo Ltd
>> >     www.civo.com <https://www.civo.com/>| Signup for an account!
>> >     <https://www.civo.com/signup>
>> >
>>
>>
>
> --
> Hervé Beraud
> Senior Software Engineer
> Red Hat - Openstack Oslo
> irc: hberaud
>
>