[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: BEAM-6018: memory leak in thread pool instantiation

Thanks for the context Dan, that was helpful.

On Fri, Nov 9, 2018 at 10:09 AM Udi Meiri <ehudm@xxxxxxxxxx> wrote:
The reasoning unbounded threadpool is explained as:
/* The SDK requires an unbounded thread pool because a step may create X writers
* each requiring their own thread to perform the writes otherwise a writer may
* block causing deadlock for the step because the writers buffer is full.
* Also, the MapTaskExecutor launches the steps in reverse order and completes
* them in forward order thus requiring enough threads so that each step's writers
* can be active.

On Thu, Nov 8, 2018 at 11:41 PM Dan Halperin <dhalperi@xxxxxxxxxx> wrote:

On Thu, Nov 8, 2018 at 2:12 PM Udi Meiri <ehudm@xxxxxxxxxx> wrote:
Both options risk delaying worker shutdown if the executor's shutdown() hasn't been called, which is I guess why the executor in GcsOptions.java creates daemon threads.

My guess (and it really is a guess at this point) is that this was a fix for DirectRunner issues - want that to exit quickly!

On Thu, Nov 8, 2018 at 1:02 PM Lukasz Cwik <lcwik@xxxxxxxxxx> wrote:
Not certain, it looks like we should have been caching the executor within the GcsUtil as a static instance instead of creating one each time. Could have been missed during code review / slow code changes over time. GcsUtil is not well "loved".

On Thu, Nov 8, 2018 at 11:00 AM Udi Meiri <ehudm@xxxxxxxxxx> wrote:
I've identified a memory leak when GcsUtil.java instantiates a ThreadPoolExecutor (https://issues.apache.org/jira/browse/BEAM-6018).
The code uses the getExitingExecutorService wrapper, which leaks memory. The question is, why is that wrapper necessary if executor.shutdown(); is later unconditionally called?