


Re: understanding kafka connector - rebalance


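You can use rebalance before keyBy because rebalance returns a DataStream. keyBy then repartitions the records by key, so the earlier rebalance does not affect which parallel instance holds the state for a given key. The API does not allow rebalance on the KeyedStream that keyBy returns, so you are safe.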
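
To illustrate the point, here is a minimal plain-Scala sketch. It is not Flink code: the Event type, the helper names, and the hashCode-modulo partitioner are assumptions made for illustration, and Flink's real key hashing works differently. It only models the idea that round-robin placement ignores the key, while key partitioning depends on nothing but the key.

```scala
object RebalanceThenKeyBy {
  // Hypothetical record type; `id` plays the role of the key in keyBy(_.id).
  final case class Event(id: String, value: Int)

  // Model of rebalance: record i goes to subtask i % parallelism,
  // regardless of its key.
  def rebalance(events: Seq[Event], parallelism: Int): Map[Int, Seq[Event]] =
    events.zipWithIndex
      .groupBy { case (_, i) => i % parallelism }
      .map { case (subtask, pairs) => subtask -> pairs.map(_._1) }

  // Simplified stand-in for key partitioning (an assumption, not Flink's
  // actual hashing): the target subtask is a function of the key only.
  def targetSubtask(key: String, parallelism: Int): Int =
    Math.floorMod(key.hashCode, parallelism)

  // Model of keyBy: subtask -> set of records routed to it.
  def keyBy(events: Seq[Event], parallelism: Int): Map[Int, Set[Event]] =
    events
      .groupBy(e => targetSubtask(e.id, parallelism))
      .map { case (subtask, es) => subtask -> es.toSet }

  def main(args: Array[String]): Unit = {
    val events = Seq(
      Event("a", 1), Event("b", 2), Event("a", 3), Event("c", 4), Event("b", 5))
    val parallelism = 4

    // Round-robin placement only.
    val rebalanced = rebalance(events, parallelism)
    // Key partitioning applied to the original and to the rebalanced records.
    val direct = keyBy(events, parallelism)
    val afterRebalance = keyBy(rebalanced.values.flatten.toSeq, parallelism)

    println(s"after rebalance only:    $rebalanced")
    println(s"keyBy without rebalance: $direct")
    println(s"keyBy after rebalance:   $afterRebalance")
    println(s"same placement:          ${direct == afterRebalance}")
  }
}
```

In this model, records with the same id can sit on different subtasks after rebalance alone, but the keyBy step places them identically whether or not a rebalance ran first.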

On Mon 26 Nov, 2018, 2:25 PM Avi Levi <avi.levi@xxxxxxxxxxxxxx> wrote:
OK, thanks for the clarification. But if I use it with keyed state, where the partitioning is by key, won't rebalancing shuffle that partitioning? E.g.:
  .addSource(source)
  .rebalance
  .keyBy(_.id)
  .mapWithState(...)


On Mon, Nov 26, 2018 at 8:32 AM Taher Koitawala <taher.koitawala@xxxxxxxxx> wrote:
Hi Avi,
No, rebalance does not change the number of Kafka partitions. Let's say you have 6 Kafka partitions and your Flink parallelism is 8. In that case, using rebalance will send records to all parallel instances of the downstream operator in a round-robin fashion.

Regards,
Taher Koitawala
GS Lab Pune
+91 8407979163
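
The 6-partition, parallelism-8 case described above can be sketched in plain Scala. This is a simplified model, not Flink code: the partition-to-subtask assignment (p % parallelism), the record count per partition, and the starting channel of each round-robin cycle are all assumptions chosen for illustration.

```scala
object RebalanceDemo {
  val kafkaPartitions = 6
  val parallelism = 8
  val recordsPerPartition = 16 // hypothetical load, same for every partition

  // Simplified assumption: partition p is read by source subtask p % parallelism.
  def sourceOf(partition: Int): Int = partition % parallelism

  // Number of records emitted by each source subtask that owns a partition.
  def emittedPerSource: Map[Int, Int] =
    (0 until kafkaPartitions)
      .groupBy(sourceOf)
      .map { case (source, ps) => source -> ps.size * recordsPerPartition }

  // Without rebalance: downstream subtask i only sees what source subtask i emits.
  def forwardCounts: Vector[Int] =
    Vector.tabulate(parallelism)(i => emittedPerSource.getOrElse(i, 0))

  // With rebalance: every source subtask cycles through all downstream subtasks.
  def rebalanceCounts: Vector[Int] = {
    val counts = Array.fill(parallelism)(0)
    for ((source, n) <- emittedPerSource; k <- 0 until n)
      counts((source + k) % parallelism) += 1
    counts.toVector
  }

  def main(args: Array[String]): Unit = {
    println(s"records per downstream subtask, no rebalance: $forwardCounts")
    println(s"records per downstream subtask, rebalance:    $rebalanceCounts")
  }
}
```

Under these assumptions, two of the eight downstream subtasks receive nothing without rebalance, while with rebalance every subtask receives an equal share. The number of Kafka partitions stays at 6 in both cases.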


On Mon, Nov 26, 2018 at 11:33 AM Avi Levi <avi.levi@xxxxxxxxxxxxxx> wrote:
Hi
Looking at this example, wouldn't doing the "rebalance" operation (e.g. messageStream.rebalance().map(...)) on a heavily loaded stream slow the stream down? Does the rebalancing occur only when there is a partition change?
It says that "the rebalance call is causing a repartitioning of the data so that all machines". Is it actually changing the number of partitions of the topic to match the number of Flink operators?

Avi