Re: CEP & checkpoints/savepoints

Hi Ron,

The CEP library is built on top of the DataStream / ProcessFunction API and
holds all necessary state (the state of the pattern matching state machine)
in regular keyed MapState.
Hence, CEP does not require any dedicated configuration for checkpoints and
savepoints beyond the regular application checkpoint configuration.
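
As a rough illustration of what "regular application checkpoint
configuration" means here, a minimal sketch could look like the following.
It assumes the Flink 1.x Java DataStream API; the class name, method name,
and interval values are placeholders chosen for the example, not
recommendations.

```java
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

// Hypothetical helper class; the name and all interval values are illustrative.
public class CepCheckpointConfig {

    public static StreamExecutionEnvironment configure() {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Take a checkpoint every 60 seconds with exactly-once state semantics.
        // The keyed state used by the CEP operator is part of this checkpoint,
        // just like the state of any other operator (e.g. Kafka source offsets).
        env.enableCheckpointing(60_000L, CheckpointingMode.EXACTLY_ONCE);

        // Leave at least 30 seconds between the end of one checkpoint
        // and the start of the next one.
        env.getCheckpointConfig().setMinPauseBetweenCheckpoints(30_000L);

        return env;
    }
}
```

The CEP pattern would then be defined on a stream created from this
environment as usual; nothing CEP-specific appears in the checkpoint setup.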

That's also why there's no dedicated documentation about this subject.

@Dawid or Klou, please correct me if I'm wrong.

Best, Fabian

2018-09-17 19:09 GMT+02:00 Ron Crocker <rcrocker@xxxxxxxxxxxx.invalid>:

> I’m working with CEP to detect when something stops reporting (which is
> very simple), but I need to show the team that the jobs will survive being
> shut down and restarted without either a) declaring that everything stopped
> reporting (false positives) or b) missing things that have indeed stopped
> reporting (false negatives).
> There seems to be NO documentation regarding CEP and
> checkpoints/savepoints. Am I just missing it? Or is it something so simple
> that it should be obvious?
> Our graph is fairly straightforward - keyed stream using event time and a
> Pattern that is essentially a report followed by a report within a time
> window, and we use the timed-out side output as the “events” indicating
> “thing stopped reporting”. It seems that we need to checkpoint/savepoint
> the pattern state along with the normal things checkpointed (e.g., Kafka
> offsets).
> For now, I should be able to sell an assertion from you that it works, but
> official documentation would help.
> Ron
> —
> Ron Crocker
> Distinguished Engineer & Architect
> ( ( •)) New Relic
> rcrocker@xxxxxxxxxxxx
> M: +1 630 363 8835