
Re: Compacting more than the actual used space


You can check nodetool cfstats to see what the compression ratio is.
The values you're reporting are entirely plausible: a compression ratio of 0.2 is quite common depending on the data you're storing (the compressed size is then 20% of the original data).
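As a back-of-the-envelope check, here is a small shell sketch using the figures from the question below (roughly 2.5 TB of SSTable data versus the 1.09 TB load reported by "nodetool status") to see what compression ratio those numbers would imply:

```shell
# Figures taken from the question below (assumed, for illustration):
# ~2.5 TB of SSTable data vs a 1.09 TB load reported by "nodetool status".
original_tb=2.5
reported_tb=1.09

# Compression ratio as cfstats reports it: compressed size / original size.
ratio=$(awk "BEGIN { printf \"%.2f\", $reported_tb / $original_tb }")
echo "implied compression ratio: $ratio"
# prints: implied compression ratio: 0.44
```

A ratio around 0.44 is well within the range seen in practice, so the numbers alone don't rule out compression as the explanation.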

Compaction throughput changes are taken into account by already-running compactions starting with Cassandra 2.2, if I remember correctly. In that case your compaction could be bound by CPU, not I/O.
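For reference, the throughput cap can be inspected and changed at runtime with nodetool (these are standard nodetool subcommands; the keyspace and table names in the last line are placeholders):

```shell
# Show the current compaction throughput cap in MB/s (0 means unthrottled).
nodetool getcompactionthroughput

# Remove the throttle; on 2.2+ this should also apply to compactions
# that are already in progress.
nodetool setcompactionthroughput 0

# Per-table stats, including the compression ratio mentioned above
# ("my_keyspace" / "my_table" are placeholders for your own names).
nodetool cfstats my_keyspace.my_table | grep -i "compression ratio"
```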

Cheers

On Mon, 5 Nov 2018 at 20:41, Pedro Gordo <pedro.gordo1986@xxxxxxxxx> wrote:
Hi

We have an ongoing compaction for roughly 2.5 TB, but "nodetool status" reports a load of 1.09 TB. Even taking into account that the load presented by "nodetool status" is the compressed size, I very much doubt that compression would reduce 2.5 TB down to 1.09 TB.
Also worth noting: even if this is the biggest table, there are other tables in the system, so the 1.09 TB reported is not just for the table being compacted.

What could lead to results like this? We have 4 attached volumes for data directories. Could this be a likely cause of such a discrepancy?

Bonus question: changing the compaction throughput to 0 (removing the throttling) had no impact on the current compaction. Do new compaction throughput values only come into effect when a new compaction kicks in?

Cheers

Pedro Gordo
--
-----------------
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting