osdir.com


Re: Select node for a Flink DataStream execution


Hi Rafael,

For standalone clusters, Flink does not appear to provide such a feature.
In general, at the execution level we talk about jobs rather than DataStreams.
If your Flink runs on YARN, you can use YARN's Node Label feature to assign a label to some of the nodes.
Earlier this year I resolved an issue that makes it possible to specify a node label when submitting a Flink job on YARN. [1]
This feature is available in the recently released Flink 1.6.0.
Does that meet your requirements?

[1]: https://issues.apache.org/jira/browse/FLINK-7836
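
As a rough sketch of the YARN route described above, the following Python helper only assembles the commands involved; it does not run them. The label name "ingest", the host names, and the jar name are hypothetical placeholders, and the sketch assumes node labels are already enabled on the ResourceManager and that the `-ynl` (`--yarnnodeLabel`) option of `flink run` is available in your Flink version (1.6.0 or later). Note that the label applies to the whole YARN application, not to individual operators of a job.

```python
import shlex


def label_node_commands(label, nodes):
    """Build the YARN admin commands that create a label and attach it to nodes."""
    # -replaceLabelsOnNode takes a single "host=label host=label" argument.
    mapping = " ".join(f"{node}={label}" for node in nodes)
    return [
        ["yarn", "rmadmin", "-addToClusterNodeLabels", label],
        ["yarn", "rmadmin", "-replaceLabelsOnNode", mapping],
    ]


def flink_submit_command(jar, label):
    """Build a per-job YARN submission that requests containers on labeled nodes."""
    return ["flink", "run", "-m", "yarn-cluster", "-ynl", label, jar]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    commands = label_node_commands("ingest", ["node1", "node2"])
    commands.append(flink_submit_command("my-job.jar", "ingest"))
    for cmd in commands:
        # Print shell-quoted commands so they can be reviewed before running.
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Printing the commands instead of executing them keeps the sketch safe to try on a machine without a cluster; whether the labeled partition is reachable from your YARN queue depends on the scheduler configuration.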

Thanks, vino.

Rafael Leira Osuna <rafael.leira@xxxxxx> wrote on Mon, Aug 13, 2018 at 12:16 AM:
Hi!

I have been searching a lot but I haven't found a solution for this.

Let's suppose some of the steps in the streaming process must be executed
on just a subset of the available nodes/TaskManagers, while the rest of
the tasks are free to be computed anywhere.

**How can I assign a DataStream to be executed ONLY on a subset of
nodes?**

This is required mainly for input/sink tasks, as not every node in the
cluster has the same connectivity / security restrictions.

I'm new to Flink, so please forgive me if I'm asking about something
obvious.

Thanks a lot.

Rafael Leira.

P.S.: Currently, we have a static standalone Flink cluster.