Three nodes in my cluster are at 100% CPU usage, and most of it is consumed by org.apache.cassandra.util.coalesceInternal and SepWorker.run.
The most active threads are the MessagingService-Incoming threads.
The other nodes are normal. The cluster has 30 nodes and uses a rack-aware strategy, with 10 racks of 3 nodes each. The three problematic nodes all belong to the same rack. Under normal write load, system.log on these nodes reports a large number of dropped hint messages (cross node). There are also a lot of ParNew GC pauses of about 700-1000 ms, and the commit log disk, which is isolated, is about 80-90% utilized. On startup of these 3 nodes there are a lot of "updating topology" log messages (thousands of them pending).
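The dropped-message symptom can be cross-checked against the per-node counters printed by `nodetool tpstats`, which is easier to compare across 30 nodes than reading system.log. Below is a minimal sketch, assuming the output contains a "Message type / Dropped" table as it does in 3.11; the function name `parse_dropped` and the `SAMPLE` text are made up for illustration and are not measurements from this cluster.

```python
from typing import Dict


def parse_dropped(tpstats_output: str) -> Dict[str, int]:
    """Extract the 'Message type / Dropped' table from `nodetool tpstats` text.

    Assumes each row after the 'Message type' header is '<NAME> <count>'.
    """
    dropped: Dict[str, int] = {}
    in_section = False
    for line in tpstats_output.splitlines():
        parts = line.split()
        # The dropped-message table starts at the 'Message type' header.
        if parts[:2] == ["Message", "type"]:
            in_section = True
            continue
        if not in_section:
            continue  # still inside the thread-pool table above the header
        if len(parts) == 2 and parts[1].isdigit():
            dropped[parts[0]] = int(parts[1])
    return dropped


# Illustrative text only (hypothetical values, not taken from a real node).
SAMPLE = """\
Pool Name                  Active   Pending   Completed   Blocked  All time blocked
MutationStage                   0         0        1000         0                 0

Message type           Dropped
READ                         0
HINT                      1234
MUTATION                    56
"""

if __name__ == "__main__":
    for message_type, count in parse_dropped(SAMPLE).items():
        print(message_type, count)
```

Feeding it the real output of `nodetool tpstats` from each node and comparing the HINT and MUTATION counts between the affected rack and a healthy rack would show whether the drops are confined to these three nodes.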
Using iperf, I have verified that the network is OK.
Judging by the NTPs and mutations on each node, the load is balanced among the nodes.
I am using Apache Cassandra 3.11.2.
I cannot figure out the root cause of the problem, although there are some obvious symptoms.