OSDir


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Sharing state between subtasks


On Tue, Oct 9, 2018 at 1:25 AM Aljoscha Krettek <aljoscha@xxxxxxxxxx> wrote:

> @Elias Do you know if Kafka Consumers do this alignment across multiple
> consumers or only within one Consumer across the partitions that it reads
> from.
>

The behavior is part of Kafka Streams
<https://github.com/apache/kafka/blob/96132e2dbb69a0c6c11cb183bb05cefef4e30557/streams/src/main/java/org/apache/kafka/streams/processor/internals/PartitionGroup.java#L65>,
not the Kafka consumer.  The alignment does not occur across Kafka
consumers, but that is because Kafka Streams, unlikely Flink, uses a single
consumer to fetch records from multiple sources / topics.  The alignment
occurs with the stream task.  Stream tasks keep queues per topic-partition
(which may be from different topics), and select the next record to
processed by selecting the queue with the lowest timestamp.

The equivalent in Flink would be for the Kafka connector source to select
the message among partitions with the lowest timestamp to emit next, and
for multiple input stream operators to select the message among inputs with
the lowest timestamp to process.