osdir.com


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

camel-rabbitmq automated recovery


Hi All,

 

Hoping someone with more experience with RabbitMQ and Camel might be able to tell me what’s going on with the below scenario as I’m not sure I understand it right.

 

We have now 3 servers in a test environment with RabbitMQ installed (v3.6.12, Elang 20.1), we’re using camel-rabbitmq component (2.21.3) in our Camel application to talk to the rabbit servers. Two of the servers (vm1 and vm2) are in a cluster and have mirrored queues, the third server (vm3) is not in the cluster but has federated queues and sits in a different data centre. What we’re trying to figure out now is how to do disaster recovery flip-over and it’s not working out too well so far.

 

Here is an example of one of the consumer endpoints we have:

 

rabbitmq://vm1:5671/someExchange?connectionFactory=rabbitConnectionFactory&autoDelete=false&queue=someQueue&exchangeType=topic&autoAck=false&bridgeEndpoint=true&concurrentConsumers=10&threadPoolSize=10&channelPoolMaxSize=10&prefetchCount=1&prefetchEnabled=true&exchangePattern=InOnly&automaticRecoveryEnabled=true&networkRecoveryInterval=15000&addresses=vm1:5671,vm2:5671,vm3:5671

 

Here is the connection factory (sslEnabled property is set to true):

 

    public ConnectionFactory rabbitConnectionFactory() throws KeyManagementException, NoSuchAlgorithmException {

        ConnectionFactory cf = new ConnectionFactory();

        cf.setHost(rabbitHost);

        cf.setPort(rabbitPort);

        cf.setUsername(rabbitUser);

        cf.setPassword(rabbitPassword);

        cf.setVirtualHost(rabbitVirtualhost);

        if (sslEnabled)

                cf.useSslProtocol();

        return cf;

    }

 

With only vm1 or vm2 going down it seems to work out just fine, the broker seems to take care of the mirroring and the client side (Camel) rarely notices a thing. Issues start when both of the cluster nodes (vm1 and vm2) go down, for some reason the third address in the list isn’t being tried for the recovery attempts. Without a Camel restart I’ve never seen the consumers flipping over to vm3.

 

In the logs I’m getting the following, which is most likely as expected:

 

2018-11-12 10:55:10,161 [Camel (camel-1) thread #13 - RabbitMQConsumer] INFO  o.a.c.c.r.RabbitConsumer  - Received shutdown signal on the rabbitMQ channel

2018-11-12 10:55:10,161 [Camel (camel-1) thread #13 - RabbitMQConsumer] INFO  o.a.c.c.r.RabbitConsumer  - Attempting to open a new rabbitMQ channel

2018-11-12 10:55:10,161 [AMQP Connection 10.1.1.1:5671] ERROR c.r.c.i.ForgivingExceptionHandler  - An unexpected connection driver error occured

javax.net.ssl.SSLException: Connection has been shutdown: javax.net.ssl.SSLException: java.net.SocketException: Connection reset by peer: socket write error

                at sun.security.ssl.SSLSocketImpl.checkEOF(Unknown Source)

                at sun.security.ssl.AppInputStream.read(Unknown Source)

                at java.io.BufferedInputStream.fill(Unknown Source)

                at java.io.BufferedInputStream.read(Unknown Source)

                at java.io.DataInputStream.readUnsignedByte(Unknown Source)

                at com.rabbitmq.client.impl.Frame.readFrom(Frame.java:91)

                at com.rabbitmq.client.impl.SocketFrameHandler.readFrame(SocketFrameHandler.java:164)

                at com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:596)

                at java.lang.Thread.run(Unknown Source)

Caused by: javax.net.ssl.SSLException: java.net.SocketException: Connection reset by peer: socket write error

                at sun.security.ssl.Alerts.getSSLException(Unknown Source)

                at sun.security.ssl.SSLSocketImpl.fatal(Unknown Source)

                at sun.security.ssl.SSLSocketImpl.fatal(Unknown Source)

                at sun.security.ssl.SSLSocketImpl.handleException(Unknown Source)

                at sun.security.ssl.SSLSocketImpl.handleException(Unknown Source)

                at sun.security.ssl.AppOutputStream.write(Unknown Source)

                at java.io.BufferedOutputStream.flushBuffer(Unknown Source)

                at java.io.BufferedOutputStream.flush(Unknown Source)

                at java.io.DataOutputStream.flush(Unknown Source)

                at com.rabbitmq.client.impl.SocketFrameHandler.flush(SocketFrameHandler.java:177)

                at com.rabbitmq.client.impl.AMQConnection.flush(AMQConnection.java:571)

                at com.rabbitmq.client.impl.AMQCommand.transmit(AMQCommand.java:134)

                at com.rabbitmq.client.impl.AMQChannel.quiescingTransmit(AMQChannel.java:447)

                at com.rabbitmq.client.impl.AMQChannel.quiescingTransmit(AMQChannel.java:429)

                at com.rabbitmq.client.impl.AMQConnection.handleConnectionClose(AMQConnection.java:846)

                at com.rabbitmq.client.impl.AMQConnection.processControlCommand(AMQConnection.java:799)

                at com.rabbitmq.client.impl.AMQConnection$1.processAsync(AMQConnection.java:242)

                at com.rabbitmq.client.impl.AMQChannel.handleCompleteInboundCommand(AMQChannel.java:178)

                at com.rabbitmq.client.impl.AMQChannel.handleFrame(AMQChannel.java:111)

                at com.rabbitmq.client.impl.AMQConnection.readFrame(AMQConnection.java:650)

                at com.rabbitmq.client.impl.AMQConnection.access$300(AMQConnection.java:48)

                at com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:597)

                ... 1 common frames omitted

Caused by: java.net.SocketException: Connection reset by peer: socket write error

                at java.net.SocketOutputStream.socketWrite0(Native Method)

                at java.net.SocketOutputStream.socketWrite(Unknown Source)

                at java.net.SocketOutputStream.write(Unknown Source)

                at sun.security.ssl.OutputRecord.writeBuffer(Unknown Source)

                at sun.security.ssl.OutputRecord.write(Unknown Source)

                at sun.security.ssl.SSLSocketImpl.writeRecordInternal(Unknown Source)

                at sun.security.ssl.SSLSocketImpl.writeRecord(Unknown Source)

                ... 18 common frames omitted

 

Then it starts the recovery attempts:

 

2018-11-12 13:47:57,846 [Camel (camel-1) thread #103 - RabbitMQConsumer] WARN  o.a.c.c.r.RabbitConsumer  - Unable to obtain a RabbitMQ channel. Will try again. Caused by: Waiting for channel to re-open.. Stacktrace logged at DEBUG logging level.

2018-11-12 13:47:57,861 [Camel (camel-1) thread #19 - RabbitMQConsumer] WARN  o.a.c.c.r.RabbitConsumer  - Unable to obtain a RabbitMQ channel. Will try again. Caused by: Waiting for channel to re-open.. Stacktrace logged at DEBUG logging level.

2018-11-12 13:47:57,861 [Camel (camel-1) thread #28 - RabbitMQConsumer] WARN  o.a.c.c.r.RabbitConsumer  - Unable to obtain a RabbitMQ channel. Will try again. Caused by: Waiting for channel to re-open.. Stacktrace logged at DEBUG logging level.

 

It prints that for each consumer/channel I guess, so there is a lot of those lines, but for some reason the last available address in the addresses list (vm3) is not being attempted to connect to (needs a restart to do that, so defeats the purpose of automated  recovery).

Also, when the primary two nodes are brought back up the recovery is still not recovering, I waited for about 30min and it still didn’t recover, just kept printing the above warnings.

When I tried to stop our app server (Tomcat) that hosts the Camel application the above warnings started to be printed in their thousands and I got about 500MB of logs inside a minute, all with the same warning message as above (guess the retry timeout got ignored on shutdown while still trying to recover the connections).

 

I’d highly appreciate some insights into the above as I’m a bit lost on how this is actually expected to work, I’ve been reading both RabbitMQ and Camel docs but I’m none the wiser of how to do the DR flip-over or if that’s even possible to do. I really hope it’s just some trivial config thing we’ve been missing so far.

 

 

Thanks,

Valdis

Vhi Group DAC (Vhi) is a holding company for insurance and healthcare services, which include Vhi Healthcare DAC, Vhi Insurance DAC, Vhi Health Services DAC and Vhi Investments DAC. Vhi Healthcare DAC trading as Vhi Healthcare and Vhi Insurance DAC trading as Vhi Insurance are regulated by the Central Bank of Ireland. Vhi Healthcare is tied to Vhi Insurance DAC for health insurance in Ireland which is underwritten by Vhi Insurance DAC. Vhi Healthcare is tied to Zurich Life Assurance plc for Vhi Life Term Insurance and Vhi Mortgage Protection which is underwritten by Zurich Life Assurance plc. Vhi Healthcare is tied to Collinson Insurance Services Limited for MultiTrip Travel Insurance, Backpacker Travel Insurance and Vhi Dental Insurance which are underwritten by Great Lakes Insurance SE, UK branch and for Vhi Canada Cover and Vhi International Health Insurance which are underwritten by Astrenska Insurance Limited. For more information about the Vhi Group please go to: https://www.vhi.ie/about-vhi.


Tá Vhi Group DAC (Vhi) ina chuideachta sealbhaíochta le haghaidh seirbhísí árachais agus seirbhísí cúram sláinte, lena n-áirítear Vhi Healthcare DAC, Vhi Insurance DAC, Vhi Health Services DAC agus Vhi Investments DAC. Déanann Banc Ceannais na hÉireann rialáil ar Vhi Healthcare DAC, ag trádáil dó mar Vhi Healthcare, agus ar Vhi Insurance DAC, ag trádáil dó mar Vhi Insurance. Tá Vhi Healthcare ceangailte le Vhi Insurance DAC le haghaidh árachas sláinte in Éirinn, rud atá frithgheallta ag Vhi Insurance DAC. Tá Vhi Healthcare ceangailte le Zurich Life Assurance plc le haghaidh Árachais Saoil de chuid Vhi agus Árachas Cosanta Morgáiste de chuid Vhi atá frithgheallta ag Zurich Life Assurance plc. Tá Vhi Healthcare ceangailte le Collinson Insurance Services Limited le haghaidh Árachas Taistil Ilturais agus Turasóirí Mála Droma agus Árachas Fiaclóireachta de chuid Vhi atá frithgheallta ag Great Lakes Insurance SE, UK branch agus le haghaidh Clúdach Cheanada de chuid Vhi agus Árachas Sláinte Idirnáisiúnta de chuid Vhi atá frithgheallta ag Astrenska Insurance Limited. Chun tuilleadh faisnéise a fháil faoi Ghrúpa Vhi, tabhair cuairt ar: https://www.vhi.ie/about-vhi.

This e-mail and any files transmitted with it contain information which may be confidential and which may also be privileged and is intended solely for the use of the individual or entity to whom it is addressed. Unless you are the intended recipient you may not copy or use it, or disclose it to anyone else. Any opinions expressed are that of the individual and not necessarily that of the Vhi Group. If you have received this e-mail in error please notify the sender by return.