osdir.com

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Assign IDs to Operators


Hi Chang,

The partitioning steps, like keyBy() are not operators.  In general you can let Flink's fluent-style API tell you the answer.  If you can call .uid() in the API and it compiles then the thing just before that is an operator ;)

-Jamie


On Wed, Nov 21, 2018 at 5:59 AM Chang Liu <fluency.03@xxxxxxxxx> wrote:
Dear All,

As stated here (https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/savepoints.html), it is highly recommended to assign IDs to Operators, especially for the stateful ones.

My question is: what is the gradually of a so-called Operator.

To be more specific, in the following example, we have the Operators like, addSource and map. I am wondering is shuffle and print also a some kind of Operator?

DataStream<String> stream = env.
  // Stateful source (e.g. Kafka) with ID
  .addSource(new StatefulSource())
  .uid("source-id") // ID for the source operator
  .shuffle()
  // Stateful mapper with ID
  .map(new StatefulMapper())
  .uid("mapper-id") // ID for the mapper
  // Stateless printing sink
  .print(); // Auto-generated ID

Or, in the following example, how many Operator we have (that we can assign IDs to)? 3? KeyBy, window and aggregate?


input
    .keyBy(<key selector>)
    .window(<window assigner>)
    .aggregate(new AverageAggregate)

Then, how many Operators (and which are they) do we have in the following example?

stream.join(otherStream)
    .where(<KeySelector>)
    .equalTo(<KeySelector>)
    .window(<WindowAssigner>)
    .apply(<JoinFunction>)

Many Thanks.

Best regards/祝好,

Chang Liu 刘畅