[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Multiple Streams Connect Watermark

You can also merge all three types into an nary-Either type and union all three inputs together.
However, Flink only supports a binary Either, so you'd have to implement a custom TypeInformation and TypeSerializer to make that work.

Best, Fabian

2018-04-26 20:44 GMT+02:00 Chengzhi Zhao <w.zhaochengzhi@xxxxxxxxx>:
Thanks Fabian for the explanation. 

If I have data with different schemas, it seems the only option I have is to use connect to perform joins (inner, outer), is there any operators that can put more than two streams together (all different schema)?


On Thu, Apr 26, 2018 at 6:05 AM, Fabian Hueske <fhueske@xxxxxxxxx> wrote:
Hi Chengzhi,

Functions in Flink are implemented in a way to preserve the timestamps of elements or assign timestamps which are aligned with the existing watermarks.
For example, the result of a time window aggregation has the end timestamp of the window as a timestamp and records emitted by the onTimer() method have the timestamp of the timer as a record timestamp.
So unless you fiddle with internal APIs to reset the record timestamps of elements, you don't need to worry about generating new watermarks.

Best, Fabian

2018-04-25 20:20 GMT+02:00 Chengzhi Zhao <w.zhaochengzhi@xxxxxxxxx>:
Hi, everyone, 

I am trying to do some join-like pipeline using flink connect operator and CoProcessFunction, I have use case that I need to connect 3+ streams. So I am having something like this:

    ===> C
B                 ==> E

So two streams A and B connect at first with 3 hours late on low watermark, after data has been emitted (the output C stream), a new stream D connect to C and emitted E as final output. I was wondering how the downstream watermark should be defined. Should I give C stream a new watermark for 3 hours delay again? or when I connect stream D, everything will be 6 hours late on low watermark.

I am using BoundedOutOfOrdernessGenerator[1] with maxOutOfOrderness 3 hours

Thanks for your tips and help in advance.