
Re: Access to Kafka Event Time

Hi Vishal,

to answer the original question: it should not be assumed that mutations of the element will be reflected downstream. For your situation, this means that you have to use a ProcessFunction to copy the timestamp of a record into the record itself.

Also, Flink 1.6 will come with the next version of the BucketingSink called StreamingFileSink, where the Bucketer interface was updated to allow access to the element timestamp. The new interface is now called BucketAssigner.
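As a rough illustration of what a timestamp-based bucket assignment could compute, here is a minimal sketch in plain Java. It deliberately does not use any Flink classes: EventTimeBucketId, bucketIdFor and the hourly "yyyy-MM-dd--HH" UTC layout are all hypothetical names and choices made for this example, and wiring the helper into a BucketAssigner implementation (passing in the element timestamp that interface exposes) is left out.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Flink: maps a record timestamp to a bucket id.
final class EventTimeBucketId {

    // Hourly buckets in UTC; both the granularity and the pattern are assumptions.
    private static final DateTimeFormatter HOURLY =
            DateTimeFormatter.ofPattern("yyyy-MM-dd--HH").withZone(ZoneOffset.UTC);

    // timestampMillis is the record's event time in epoch milliseconds.
    static String bucketIdFor(long timestampMillis) {
        return HOURLY.format(Instant.ofEpochMilli(timestampMillis));
    }

    public static void main(String[] args) {
        // 1533141360000 ms is 2018-08-01T16:36:00Z.
        System.out.println(bucketIdFor(1533141360000L)); // prints 2018-08-01--16
    }
}
```

Because the bucket id depends only on the timestamp handed in, records with the same event-time hour land in the same bucket regardless of when they are processed.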


On 1. Aug 2018, at 16:36, Hequn Cheng <chenghequn@xxxxxxxxx> wrote:

Hi Vishal,

> We have a use case where multiple topics are streamed to hdfs and we would want to create buckets based on ingestion time
If I understand correctly, you want to create buckets based on event time. Maybe you can use a window[1]. For example, a tumbling window of 5 minutes groups rows into 5-minute intervals. You can then get the window start time (TUMBLE_START(time_attr, interval)) and end time (TUMBLE_END(time_attr, interval)) when emitting the data.
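To make the 5-minute grouping concrete, the arithmetic behind a tumbling window can be sketched in plain Java as follows. This is only an illustration under stated assumptions: windows are aligned to the epoch with no offset, the class and method names (TumblingWindowBounds, windowStart, windowEnd) are hypothetical, and no Flink API is involved. The start is inclusive and the end is exclusive.

```java
// Hypothetical helper, not Flink API: bounds of an epoch-aligned tumbling window.
final class TumblingWindowBounds {

    // Inclusive start of the window that contains timestampMillis.
    static long windowStart(long timestampMillis, long sizeMillis) {
        // floorMod keeps the result correct for negative timestamps as well.
        return timestampMillis - Math.floorMod(timestampMillis, sizeMillis);
    }

    // Exclusive end of the same window.
    static long windowEnd(long timestampMillis, long sizeMillis) {
        return windowStart(timestampMillis, sizeMillis) + sizeMillis;
    }

    public static void main(String[] args) {
        long fiveMinutes = 5 * 60 * 1000L;
        long ts = 1533141360000L; // 2018-08-01T16:36:00Z
        System.out.println(windowStart(ts, fiveMinutes)); // prints 1533141300000 (16:35:00Z)
        System.out.println(windowEnd(ts, fiveMinutes));   // prints 1533141600000 (16:40:00Z)
    }
}
```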

Best, Hequn

On Wed, Aug 1, 2018 at 8:21 PM, Vishal Santoshi <vishal.santoshi@xxxxxxxxx> wrote:
Any feedback?

On Tue, Jul 31, 2018, 10:20 AM Vishal Santoshi <vishal.santoshi@xxxxxxxxx> wrote:
In fact it may be available elsewhere too (for example in a ProcessFunction), but we have no need to create one: this is just a data relay (Kafka to HDFS), and any intermediate processing should be avoided if possible, IMHO.

On Tue, Jul 31, 2018 at 9:10 AM, Vishal Santoshi <vishal.santoshi@xxxxxxxxx> wrote:
We have a use case where multiple topics are streamed to HDFS, and we would want to create buckets based on ingestion time (the time the events were pushed to Kafka). Our Kafka producers will set that as the event time.

suggests that the "previousElementTimestamp" will provide that timestamp, provided the "EventTime" characteristic is set. It also provides the element. In our case the element will expose a setIngestionTIme(long time) method. Is the element in this method
public long extractTimestamp(Long element, long previousElementTimestamp)
passed by reference, and can it be safely (losslessly) mutated for downstream operators?

That said, there is another place where that record timestamp is available.

Is it possible to change the signature of the 

to add the record timestamp as the last argument?