I'm investigating LWT performance with C* 3.11.3.
It looks like performance is bounded by messaging latency when many requests are issued concurrently.
According to the source code, a node uses only one thread for incoming messages and one thread for outbound "small" messages to another node.
I guess these threads are frequently interrupted because many other threads are running when many requests are issued.
In particular, I think this hurts LWT performance when many LWT requests, which need a lot of inter-node messaging, are issued.
I measured that latency. With 96 concurrent LWT writes, it took 2.5 ms on average from enqueuing a message at a node to receiving that message at the **same** node.
Is this normal? That seems like too much latency, considering the message was sent to the same node.
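
To make the measurement concrete, here is a minimal sketch of the kind of enqueue-to-dequeue timing described above. It is not Cassandra's messaging code: the class `QueueLatencySketch`, the method `measureAverageLatencyNanos`, and the use of a `LinkedBlockingQueue` drained by a single consumer thread are assumptions made for illustration only.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for "many request threads, one messaging thread".
public class QueueLatencySketch {

    // Returns the average time in nanoseconds between a message being
    // enqueued by any producer and being dequeued by the single consumer.
    public static double measureAverageLatencyNanos(int producers, int messagesPerProducer) {
        final BlockingQueue<Long> queue = new LinkedBlockingQueue<>();
        final int total = producers * messagesPerProducer;
        final AtomicLong totalLatency = new AtomicLong();
        final CountDownLatch done = new CountDownLatch(1);

        // Single consumer thread: each message carries its enqueue timestamp.
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < total; i++) {
                    long enqueuedAt = queue.take();
                    totalLatency.addAndGet(System.nanoTime() - enqueuedAt);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                done.countDown();
            }
        });
        consumer.start();

        // Many producer threads competing with the consumer for CPU time.
        Thread[] workers = new Thread[producers];
        for (int p = 0; p < producers; p++) {
            workers[p] = new Thread(() -> {
                for (int i = 0; i < messagesPerProducer; i++) {
                    queue.add(System.nanoTime());
                }
            });
            workers[p].start();
        }

        try {
            for (Thread w : workers) {
                w.join();
            }
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while measuring", e);
        }
        return total == 0 ? 0.0 : (double) totalLatency.get() / total;
    }

    public static void main(String[] args) {
        // 96 producers mirrors the 96 concurrent writes mentioned above.
        double avg = measureAverageLatencyNanos(96, 1000);
        System.out.printf("average enqueue-to-dequeue latency: %.3f ms%n", avg / 1_000_000.0);
    }
}
```

The numbers this prints depend on the machine and JVM and are not comparable to the 2.5 ms figure. The sketch only shows where the two timestamps are taken.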
Decreasing the number of other threads, such as `concurrent_counter_writes` and `concurrent_materialized_view_writes`, reduced the latency a bit.
Can I change any other parameters to reduce the latency?
I've tried message coalescing, but it didn't reduce the latency.