
Re: Compaction process stuck


That looks a bit to me like it isn't stuck but is just a long-running compaction. Can you include the output of `nodetool compactionstats` and `nodetool cfstats`, along with the schema for the table that's being compacted (redact names if necessary)?

You can stop the compaction with `nodetool stop COMPACTION` or by restarting the node.
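Before stopping anything, it is worth confirming whether the completed-bytes counter is actually moving. A minimal sketch of reading `nodetool compactionstats` output to compute progress; the column layout below is the typical 2.1-era format and may need index adjustments for other versions:

```python
# Sketch: estimate progress of running compactions from `nodetool
# compactionstats` output. Column order assumed from typical Cassandra
# 2.1-era output -- verify against your version before relying on it.

def parse_compactionstats(text):
    """Return a list of (keyspace, table, completed, total, pct) tuples."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows look like:
        #   Compaction  my_ks  my_table  123456  789012  bytes  15.65%
        if len(parts) == 7 and parts[0] == "Compaction":
            completed, total = int(parts[3]), int(parts[4])
            pct = 100.0 * completed / total if total else 0.0
            rows.append((parts[1], parts[2], completed, total, pct))
    return rows

sample = """pending tasks: 1
   compaction type   keyspace   table     completed    total       unit   progress
        Compaction   my_ks      my_table  4045000000   5056250000  bytes  80.00%
"""
for ks, tbl, done, total, pct in parse_compactionstats(sample):
    print(f"{ks}.{tbl}: {done}/{total} bytes ({pct:.1f}%)")
```

Capturing this a few minutes apart shows whether `completed` is advancing at all, which distinguishes a slow compaction from a genuinely stuck one.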

Chris

On Jul 5, 2018, at 12:08 AM, atul atri <atulatri2004@xxxxxxxxx> wrote:

Hi,

We noticed that the compaction process is also hanging on a node in the backup ring. Please find attached thread dumps for both servers. Recently, we have made a few changes to the cluster topology.

a. Added a new server in the backup data-center and decommissioned an old server. The backup ring only has 2 servers.
b. Added a new node in the primary data-center. Now it has 4 nodes.

Is there a way we can stop this compaction? We have added a new node to this cluster and are waiting to run cleanup on the node on which compaction is hanging. I am afraid that cleanup will not start until the compaction job finishes.
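For the "cleanup after compaction" sequencing, one option is to poll the pending-task count and only kick off cleanup once it drains. A hedged sketch; `get_pending` is a hypothetical stand-in for however the count is read (e.g. parsing `nodetool compactionstats` or the PendingTasks JMX metric):

```python
import time

# Sketch: block until pending compactions drain, then it is safe to run
# `nodetool cleanup`. `get_pending` is a hypothetical callable standing in
# for your real pending-task lookup (compactionstats parsing, JMX, etc.).

def wait_for_compactions(get_pending, poll_seconds=60, timeout=None):
    """Return True once pending compactions reach zero, False on timeout."""
    waited = 0
    while True:
        if get_pending() == 0:
            return True
        if timeout is not None and waited >= timeout:
            return False
        time.sleep(poll_seconds)
        waited += poll_seconds

# Example with a fake counter that drains after two polls:
pending = [2, 1, 0]
ok = wait_for_compactions(lambda: pending.pop(0), poll_seconds=0)
print("ready for nodetool cleanup" if ok else "timed out waiting")
```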

Attachments:
1. cass-logg02.prod2.thread_dump.out: Thread dump from the old node in the primary datacenter
2. cass-logg03.prod1.thread_dump.out: Thread dump from the new node in the backup datacenter. This node was added recently.

Your help is much appreciated.

Thanks & Regards,
Atul Atri.


On 4 July 2018 at 21:15, atul atri <atulatri2004@xxxxxxxxx> wrote:
Hi Chris,
Thanks for reply.

Unfortunately, our servers do not have jstack installed.
I tried the "kill -3 <PID>" option, but that is also not generating a thread dump.

Is there any other way I can generate a thread dump?
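One thing worth noting: `kill -3` (SIGQUIT) does not create a file; the JVM writes the thread dump to its own stdout, which Cassandra's startup scripts typically redirect into the system log or console output, so the dump may already be there. Once a dump is in hand, a small filter can pull out just the compaction threads Chris asked about. A sketch, assuming the usual HotSpot dump format where each thread stanza begins with a quoted name line, and the `CompactionExecutor` thread naming Cassandra typically uses:

```python
# Sketch: extract CompactionExecutor thread stanzas from a JVM thread dump.
# Assumes the common HotSpot format (each thread starts with a quoted name
# line); thread names are typical for Cassandra but not verified against
# every version.

def compaction_threads(dump_text):
    """Return the stanzas whose thread name mentions CompactionExecutor."""
    stanzas, current = [], []
    for line in dump_text.splitlines():
        if line.startswith('"'):          # a new thread stanza begins
            if current and "CompactionExecutor" in current[0]:
                stanzas.append("\n".join(current))
            current = [line]
        elif current:
            current.append(line)
    if current and "CompactionExecutor" in current[0]:
        stanzas.append("\n".join(current))
    return stanzas

sample = '''"CompactionExecutor:1" daemon prio=5 tid=0x1 RUNNABLE
   at org.apache.cassandra.db.compaction.CompactionTask.run(...)

"ReadStage:1" daemon prio=5 tid=0x2 WAITING
   at sun.misc.Unsafe.park(...)
'''
for stanza in compaction_threads(sample):
    print(stanza.splitlines()[0])
```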

Thanks & Regards,
Atul Atri.

On 4 July 2018 at 20:32, Chris Lohfink <clohfink@xxxxxxxxx> wrote:
Can you take a thread dump (jstack) and share the state of the compaction threads? Also check for “Exception” in the logs.

Chris

Sent from my iPhone

On Jul 4, 2018, at 8:37 AM, atul atri <atulatri2004@xxxxxxxxx> wrote:

Hi,

On one of our servers, the compaction process is hanging. It's stuck at 80%, and had been stuck for the last 3 days. Today we did a cluster restart (one host at a time), and again it is stuck at the same 80%. CPU usage is at 100% and there seems to be no IO issue. We are seeing the following kind of WARNING in system.log:

BatchStatement.java (line 226) Batch of prepared statements for [****, *****] is of size 7557, exceeding specified threshold of 5120 by 2437.
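For context, the numbers in that warning are consistent: a sketch of the arithmetic, where the 5120-byte threshold would come from `batch_size_warn_threshold_in_kb` in cassandra.yaml (5 KB default on 2.1-era versions; the setting name is from our reading of the docs and worth double-checking against your version):

```python
# Sketch: reproduce the arithmetic behind the BatchStatement warning.
# The threshold is assumed to come from batch_size_warn_threshold_in_kb
# in cassandra.yaml (5 KB, i.e. 5120 bytes, on 2.1-era defaults).

WARN_THRESHOLD_KB = 5

def batch_overage(batch_size_bytes, threshold_kb=WARN_THRESHOLD_KB):
    """Return how many bytes a batch exceeds the warn threshold by (0 if under)."""
    threshold_bytes = threshold_kb * 1024
    return max(0, batch_size_bytes - threshold_bytes)

# The warning in the log: size 7557, threshold 5120, over by 2437.
print(batch_overage(7557))  # -> 2437
```

This warning flags oversized batches from clients; it is unrelated to compaction itself, which matches the observation that no actual errors appear in the log.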


Other than this, there seem to be no errors. I have tried to stop the compaction process, but it does not stop. The Cassandra version is 2.1.

Can someone please guide us in solving this issue?

Thanks & Regards,
Atul Atri.


<cass-logg02.prod2.thread_dump.out><cass-logg03.prod1.thread_dump.out>
---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@xxxxxxxxxxxxxxxxxxxx
For additional commands, e-mail: user-help@xxxxxxxxxxxxxxxxxxxx