
# Iterative Stream won't loop

Hi,

I am trying to implement a connected components algorithm using DataStream. For this algorithm, I'm separating the data into tumbling windows and trying to process each window independently.

This algorithm is iterative because the labels (colors) of the vertices need to be propagated. Basically, I need to iterate over the following steps:

Input: **vertices** = DataStream of <VertexId, [list of neighbor vertices], label>

Loop:

**labels** = **vertices**.flatMap(emitting a tuple <VertexId, label> for every vertices.f0 and every element of vertices.f1)

.keyBy(VertexId)

.window(...)

.min(label);

**updatedVertices** = **vertices**.join(labels).where(VertexId).equalTo(VertexId)

.windowAll(...)

.apply(re-emit the original **vertices** stream tuples, but keeping the new labels)

End loop
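To make the intended loop concrete, here is a minimal plain-Java sketch of the same min-label propagation applied to the vertices of a single window. It deliberately does not use the Flink API. The class name `LabelPropagation` and the methods `step` and `connectedComponents` are hypothetical names chosen for illustration, and the sketch assumes an undirected graph in which every vertex appears as a key of the adjacency map and starts with its own id as its label.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LabelPropagation {

    /**
     * One pass of the loop body: each vertex keeps its own label and offers it
     * to every neighbor (the flatMap step); each vertex then keeps the smallest
     * label it received (the keyBy + min step).
     */
    static Map<Integer, Integer> step(Map<Integer, List<Integer>> neighbors,
                                      Map<Integer, Integer> labels) {
        Map<Integer, Integer> next = new HashMap<>(labels);
        for (Map.Entry<Integer, List<Integer>> entry : neighbors.entrySet()) {
            int label = labels.get(entry.getKey());
            for (int neighbor : entry.getValue()) {
                next.merge(neighbor, label, Integer::min);
            }
        }
        return next;
    }

    /**
     * Repeats step() until no label changes. Assumption: every vertex is a key
     * of the adjacency map and its initial label is its own id.
     */
    static Map<Integer, Integer> connectedComponents(Map<Integer, List<Integer>> neighbors) {
        Map<Integer, Integer> labels = new HashMap<>();
        for (Integer vertex : neighbors.keySet()) {
            labels.put(vertex, vertex);
        }
        while (true) {
            Map<Integer, Integer> next = step(neighbors, labels);
            if (next.equals(labels)) {
                return labels; // fixed point: nothing left to feed back
            }
            labels = next;
        }
    }
}
```

The `while` loop plays the role that the feedback edge (`closeWith`) would play in the streaming version: tuples whose label changed trigger another pass, and the loop ends once a pass changes nothing.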

I am trying to use IterativeStream to do so. However, although I successfully separate the tuples that need to be fed back into the loop (using filters and closeWith), the subsequent iterations do not happen. So what I get is only the result of the first iteration.

I suppose this might come from the fact that I'm creating a new stream (labels) from the original IterativeStream, joining it with the original stream (vertices), and only then closing the loop with the result.

Do you know whether Flink has some limitation in this respect? If so, would you have a hint about a different approach I could take for this algorithm to avoid it?

Thank you in advance,

Henrique Colao
