osdir.com

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Question about checkpoint logic of the Dataflow Runner


https://github.com/apache/beam/blob/bcc34f56eb37046c1b2d62ceb77ea912706383cb/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/StreamingModeExecutionContext.java#L376

The callback is held in memory and an id is output to a a service called windmill (aka streaming engine). In a future work item given to the worker, windmill returns that finalization id which is used to look up the callback in memory and invoke it.

On Wed, Nov 28, 2018 at 4:54 PM flyisland <fly.island@xxxxxxxxx> wrote:
Hi Gurus,

I need to understand the checkpoint logic of the Dataflow Runner, like when and how will the runner trigger a finalize on a checkpoint, is the finalize thread same as the reader thread?

Could you share me the information, or point me to the related source code, thanks in advance!

Island Chen