osdir.com

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Inter-node messaging latency


Hi,

Thank you for the reply.
I've measured LWT throughput in 4.0.

I used the cassandra-stress tool to insert rows with LWT for 3 minutes on
i3.xlarge and i3.4xlarge
For 3.11, I modified the tool to support LWT.
Before each measurement, I cleaned up all Cassandra data.

The throughput in 4.0 is 5 % faster than 3.11.
The CPU load of i3.4xlarge (16 vCPUs) is only up to 75% in both versions.
And, the throughput was slower than 4 times that of i3.xlarge.
I think the throughput wasn't bounded by CPU also in 4.0.

The CPU load of i3.4xlarge is up to 80 % with non-LWT write.

I wonder what is the bottleneck for writes on a many-core machine if the
issue about messaging has been resolved in 4.0.
Can I use up CPU to insert rows by changing any parameter?

# LWT insert
* Cassandra 3.11.3
| instance type | # of threads | concurrent_writes | Throughput [op/s] |
|     i3.xlarge |           64 |                32 |              2815 |
|    i3.4xlarge |          256 |               128 |              9506 |
|    i3.4xlarge |          512 |               256 |             10540 |

* Cassandra 4.0 (trunk)
| instance type | # of threads | concurrent_writes | Throughput [op/s] |
|     i3.xlarge |           64 |                32 |              2951 |
|    i3.4xlarge |          256 |               128 |              9816 |
|    i3.4xlarge |          512 |               256 |             11055 |

* Environment
- 3 node cluster
- Replication factor: 3
- Node instance: AWS EC2 i3.xlarge / i3.4xlarge

* C* configuration
- Apache Cassandra 3.11.3 / 4.0 (trunk)
- commitlog_sync: batch
- concurrent_writes: 32, 256
- native_transport_max_threads: 128(default), 256 (when concurrent_writes
is 256)

Thanks,
Yuji


2018年11月26日(月) 17:27 sankalp kohli <kohlisankalp@xxxxxxxxx>:

> Inter-node messaging is rewritten using Netty in 4.0. It will be better to
> test it using that as potential changes will mostly land on top of that.
>
> On Mon, Nov 26, 2018 at 7:39 AM Yuji Ito <yuji@xxxxxxxxxxxxxxxxx> wrote:
>
>> Hi,
>>
>> I'm investigating LWT performance with C* 3.11.3.
>> It looks that the performance is bounded by messaging latency when many
>> requests are issued concurrently.
>>
>> According to the source code, the number of messaging threads per node is
>> only 1 thread for incoming and 1 thread for outbound "small" message to
>> another node.
>>
>> I guess these threads are frequently interrupted because many threads are
>> executed when many requests are issued.
>> Especially, I think it affects the LWT performance when many LWT requests
>> which need lots of inter-node messaging are issued.
>>
>> I measured that latency. It took 2.5 ms in average to enqueue a message
>> at a node and to receive the message at the **same** node with 96
>> concurrent LWT writes.
>> Is it normal? I think it is too big latency, though a message was sent to
>> the same node.
>>
>> Decreasing numbers of other threads like `concurrent_counter_writes`,
>> `concurrent_materialized_view_writes` reduced a bit the latency.
>> Can I change any other parameter to reduce the latency?
>> I've tried using message coalescing, but they didn't reduce that.
>>
>> * Environment
>> - 3 node cluster
>> - Replication factor: 3
>> - Node instance: AWS EC2 i3.xlarge
>>
>> * C* configuration
>> - Apache Cassandra 3.11.3
>> - commitlog_sync: batch
>> - concurrent_reads: 32 (default)
>> - concurrent_writes: 32 (default)
>>
>> Thanks,
>> Yuji
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@xxxxxxxxxxxxxxxxxxxx
>> For additional commands, e-mail: dev-help@xxxxxxxxxxxxxxxxxxxx
>
>