
Re: Sporadic high IO bandwidth and Linux OOM killer


I had a few instances in the past that showed that unresponsiveness behaviour. Back then I saw with iotop/htop/dstat etc. that the system was stuck on a single thread running at full throttle for seconds at a time. According to iotop that was the kswapd0 process. That system was on Ubuntu 16.04, specifically "Ubuntu 16.04.4 LTS".

From there I started to dig into what the kswapd process was doing on a system with no swap, and found that it is also involved in handling mmapped files. This erratic (allow me to say erratic) behaviour was not showing up when I was on 3.0.6 but started right after upgrading to 3.0.17.
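As a rough illustration of how mmapped SSTables show up at the OS level (assuming a Linux node; `CassandraDaemon` is the usual JVM main class, and the fallback to the current shell's PID is just for illustration):

```shell
# Count the memory mappings a process holds versus the kernel limit --
# mmapped SSTables drive this number up on Cassandra nodes.
# On a live node you would use: PID=$(pgrep -f CassandraDaemon)
PID=${PID:-$$}
MAPS=$(wc -l < /proc/"$PID"/maps)
LIMIT=$(cat /proc/sys/vm/max_map_count)
echo "process $PID holds $MAPS mappings (vm.max_map_count: $LIMIT)"
```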

By "load" I refer to the load as reported by `nodetool status`. On my systems, when disk_access_mode is auto (i.e. mmap), the process virtual memory is roughly the sum of the node load plus the JVM heap size. Of course this is just what I noted on my systems; I'm not really sure whether that should be the case on yours too.
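A quick way to eyeball the JVM side of that sum (a sketch, assuming a Linux box; it degrades gracefully when no Cassandra process is running):

```shell
# Report the Cassandra JVM's virtual memory size, to compare against the
# load column of `nodetool status` plus the configured heap size.
PID=$(pgrep -f CassandraDaemon || true)
if [ -n "$PID" ]; then
  MSG=$(ps -o vsz= -p "$PID" | awk '{printf "JVM virtual size: %.1f GiB", $1/1024/1024}')
else
  MSG="no CassandraDaemon process found"
fi
echo "$MSG"
```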

I hope someone with more experience than me will comment on your settings. Reading the configuration file, concurrent writers and compactors should be 2 at minimum. I can confirm that when I tried in the past to lower concurrent_compactors to 1, really bad things happened (high system load, high message drop rate, ...).
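For reference, the relevant cassandra.yaml knobs look like this (values here are purely illustrative, not recommendations for your hardware):

```yaml
# cassandra.yaml -- illustrative values only; tune to cores and disks.
concurrent_compactors: 2   # keep at 2 or above; 1 caused me trouble
concurrent_writes: 32
```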

I have the "feeling" that when running on constrained hardware, tuning the underlying kernel is a must. I agree with Jonathan H. that you should think about increasing the instance size; CPU and memory matter a lot.
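The kernel VM knobs I would look at first can be inspected read-only like this (a sketch for a Linux host; changing them needs root and `sysctl -w`, and the right values depend on your workload):

```shell
# Print kernel VM settings that commonly matter for Cassandra on small
# instances: swap pressure, mapping limit, and dirty-page writeback ratios.
for knob in swappiness max_map_count dirty_ratio dirty_background_ratio; do
  printf 'vm.%-24s %s\n' "$knob" "$(cat /proc/sys/vm/$knob)"
done
```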


On Wed, Dec 5, 2018 at 10:36 PM Oleksandr Shulgin <oleksandr.shulgin@xxxxxxxxxx> wrote:
On Wed, 5 Dec 2018, 19:34 Riccardo Ferrari <ferrarir@xxxxxxxxx> wrote:
Hi Alex,

I saw that behaviour in the past.


Thank you for the reply!

Do you refer to the kswapd issue only, or have you observed more problems that match the behavior I have described?

I can tell you the kswapd0 usage is connected to the `disk_access_mode` property. On 64-bit systems it defaults to mmap.

Hm, that's interesting, I will double-check.

That also explains why your virtual memory is so high (it roughly matches the node load, right?).

Not sure what you mean by "load" here. We have a bit less than 1.5 TB per node on average.