Now I know why those files weren't "removed". They are removed, but very slowly. In my case (Flink 1.3) the problem is in the line
client.delete().inBackground(backgroundCallback, executor).forPath(path);
where the deletion runs in the background on an executor pool whose size is 2. The more files/dirs I have in "high-availability.storageDir" and "state.backend.fs.checkpointdir",
the longer each delete operation takes and the more operations queue up in the pool. In my case the main problem is that I have 12 jobs deployed on the cluster and the checkpoint interval is set to 5 seconds.
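To illustrate the queueing effect, here is a minimal sketch using only plain java.util.concurrent. It is not Flink or Curator code: the class name DeleteBacklogDemo, the method queuedAfterSubmitting, and the latch standing in for a slow delete are all made up for the example. It only shows that a pool with 2 threads can run 2 tasks at once, and everything else waits in the queue.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of Flink or Curator.
public class DeleteBacklogDemo {

    /**
     * Submits `deletes` blocking tasks (stand-ins for slow background deletes)
     * to a fixed pool of `poolSize` threads and returns how many tasks are
     * still waiting in the pool's queue while the running ones are blocked.
     */
    public static int queuedAfterSubmitting(int poolSize, int deletes) throws InterruptedException {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                poolSize, poolSize, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        // The latch keeps every running task busy until we have measured the queue.
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < deletes; i++) {
                executor.execute(() -> {
                    try {
                        release.await(); // simulate a delete that has not finished yet
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            // Tasks beyond the first `poolSize` could not get a thread and are queued.
            return executor.getQueue().size();
        } finally {
            release.countDown(); // let all tasks finish
            executor.shutdown();
            executor.awaitTermination(10, TimeUnit.SECONDS);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("queued with 2 threads: " + queuedAfterSubmitting(2, 12));
    }
}
```

With 12 submitted tasks and 2 threads, 10 tasks sit in the queue. If new deletes arrive faster than the 2 threads finish the old ones, that queue keeps growing.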
I know that I need to increase the interval between checkpoints; I will increase it to 1 or 5 minutes, depending on each job's business logic.
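As a rough back-of-the-envelope check, the small sketch below computes how many checkpoints per minute the 12 jobs trigger at each interval. It assumes that all jobs use the same interval and that the cleanup load grows roughly with the number of checkpoints, which may not hold exactly. The class and method names are made up for the example.

```java
// Hypothetical helper, only for the arithmetic; not part of Flink.
public class CheckpointRate {

    /** Checkpoints triggered per minute across all jobs, assuming one shared interval. */
    public static double checkpointsPerMinute(int jobs, long intervalSeconds) {
        return jobs * 60.0 / intervalSeconds;
    }

    public static void main(String[] args) {
        int jobs = 12;
        long[] intervalsSeconds = {5, 60, 300}; // 5 seconds, 1 minute, 5 minutes
        for (long interval : intervalsSeconds) {
            System.out.println(interval + " s interval: "
                    + checkpointsPerMinute(jobs, interval) + " checkpoints/min");
        }
    }
}
```

So moving from 5 seconds to 1 minute reduces the rate from 144 to 12 checkpoints per minute, and 5 minutes reduces it to 2.4 per minute.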
But I still have a question: where is the size of the executor pool set? I was analyzing the Flink code and still don't know where the size is set. Maybe one of the users knows where the pool is created.
On 22.04.2018 17:22, Szymon Szczypiński wrote: