
Re: Cleanup resources on pipeline cancelation

Hi Andrew,

Beam currently does not have a generalized cleanup story, so the answer has usually been ad hoc. For a bounded source we can (1) clean up any resources created for splitting once splitting is done, and (2) clean up resources created for a given reader when the reader exits (the last advance() call).
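To make (2) concrete, here is a rough sketch of the pattern in plain Java. It does not use the Beam SDK: TempFileReader, scratchPath() and the temp file standing in for an external resource are all hypothetical names chosen for illustration, and the class only mirrors the advance()/close() shape of a bounded reader.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a bounded reader. It mirrors the advance()/close()
// shape of a Beam reader but has no dependency on the Beam SDK.
final class TempFileReader implements AutoCloseable {
  private final Iterator<String> records;
  private final Path scratch; // per-reader temporary resource (assumption: a temp file)
  private String current;
  private boolean released;

  TempFileReader(List<String> input) throws IOException {
    this.records = input.iterator();
    this.scratch = Files.createTempFile("reader-scratch", ".tmp");
  }

  // Returns true while there is another record; releases the resource on the last call.
  boolean advance() throws IOException {
    if (records.hasNext()) {
      current = records.next();
      return true;
    }
    release();
    return false;
  }

  String getCurrent() {
    return current;
  }

  Path scratchPath() {
    return scratch;
  }

  // close() covers the case where the reader is abandoned before it is exhausted.
  @Override
  public void close() throws IOException {
    release();
  }

  // Idempotent, so calling it from both advance() and close() is safe.
  private void release() throws IOException {
    if (!released) {
      Files.deleteIfExists(scratch);
      released = true;
    }
  }
}
```

Releasing in both places is deliberate in this sketch: the final advance() handles the normal end of input, and close() handles a reader that is stopped early.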

I'm not sure what the proper solution for UnboundedSources is, and it might not even make sense to add cleanup logic to an unbounded source that is never expected to end. We might need something more generic (for example, a mechanism to collect temporary resources and delete such resources at pipeline termination).
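As a very rough idea of what such a generic mechanism could look like, here is an in-process sketch in plain Java. Nothing like this exists in Beam today as far as I know: TemporaryResourceRegistry and register() are hypothetical names, and a real version would need the runner to persist the registered resources and invoke the cleanup on termination or cancellation.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical registry, not a Beam API: collects cleanup actions for
// temporary resources and runs them once when the pipeline terminates.
final class TemporaryResourceRegistry implements AutoCloseable {
  private final Deque<Runnable> cleanups = new ArrayDeque<>();

  // Called wherever a temporary resource is created, e.g. in split() or createReader().
  synchronized void register(Runnable cleanup) {
    cleanups.push(cleanup);
  }

  // Runs every cleanup in reverse registration order. A failing cleanup does
  // not stop the others; the first failure is rethrown at the end.
  @Override
  public synchronized void close() {
    RuntimeException first = null;
    while (!cleanups.isEmpty()) {
      try {
        cleanups.pop().run();
      } catch (RuntimeException e) {
        if (first == null) {
          first = e;
        } else {
          first.addSuppressed(e);
        }
      }
    }
    if (first != null) {
      throw first;
    }
  }
}
```

In this sketch an IO would call register() with a closure that deletes whatever it just created, and whoever owns pipeline termination would call close() exactly once.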


On Tue, Jul 31, 2018 at 10:04 PM Romain Manni-Bucau <rmannibucau@xxxxxxxxx> wrote:
Hi Andrew,

IIRC sources should clean up their resources per method since they don't have a better lifecycle. Readers can create longer-lived resources and release them at close time.

On Wed, Aug 1, 2018 at 12:31 AM Andrew Pilloud <apilloud@xxxxxxxxxx> wrote:
Some of our IOs create external resources that need to be cleaned up when a pipeline is terminated. It looks like the org.apache.beam.sdk.io.UnboundedSource interface is called on creation, but there is no call for cleanup. For example, PubsubIO creates a Pubsub subscription in createReader()/split() and it should be deleted at shutdown. Does anyone have ideas on how I might make this happen?

(I filed https://issues.apache.org/jira/browse/BEAM-5051 to track the PubSub-specific issue.)