Setting the parallelism in a cluster of machines properly


Hello Flinkers,

I have deployed Flink in a cluster of 17 nodes, each having 8 CPUs. Thus, in
total there are 136 CPUs available.

I have set the parameter taskmanager.numberOfTaskSlots = 8 on all machines,
since they have 8 CPUs each.

But when I run ./flink run -c classpath jarFile -p 136, I get an
error.

I can only set the parallelism to a maximum of 8, which is reasonable from
one point of view. But the documentation [1] says the following:

parallelism.default: The default parallelism to use for programs that have
no parallelism specified. (DEFAULT: 1). For setups that have no concurrent
jobs running, setting this value to NumTaskManagers * NumSlotsPerTaskManager
will cause the system to use all available execution resources for the
program’s execution. Note: The default parallelism can be overwritten for an
entire job by calling setParallelism(int parallelism) on the
ExecutionEnvironment or by passing -p <parallelism> to the Flink
Command-line frontend. It can be overwritten for single transformations by
calling setParallelism(int parallelism) on an operator. See Parallel
Execution for more information about parallelism.
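
The precedence described in that passage can be sketched in plain Java. This is only an illustration under assumptions: ParallelismPrecedence and effective() are hypothetical names, not part of the Flink API, and "job level" here stands for a value given either via setParallelism on the ExecutionEnvironment or via -p.

```java
// Hypothetical helper, NOT Flink API: models the override order from the quoted docs.
final class ParallelismPrecedence {

    // operatorLevel: value from setParallelism(...) on a single operator, or null if unset.
    // jobLevel:      value from the ExecutionEnvironment or -p, or null if unset.
    // configDefault: parallelism.default from the configuration (documented default: 1).
    static int effective(Integer operatorLevel, Integer jobLevel, int configDefault) {
        if (operatorLevel != null) {
            return operatorLevel; // most specific setting wins
        }
        if (jobLevel != null) {
            return jobLevel;      // overrides the configured default for the whole job
        }
        return configDefault;     // nothing specified anywhere
    }

    public static void main(String[] args) {
        System.out.println(effective(null, null, 1));  // prints 1
        System.out.println(effective(null, 136, 1));   // prints 136
        System.out.println(effective(4, 136, 1));      // prints 4
    }
}
```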

So, especially this part: setting this value to NumTaskManagers *
NumSlotsPerTaskManager will cause the system to use all available execution
resources for the program’s execution.

So, for me NumTaskManagers * NumSlotsPerTaskManager = 17 * 8 = 136. Right?
Any idea why this does not work?
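
To make the arithmetic concrete, here is a minimal sketch in plain Java. SlotMath and its methods are hypothetical names, not Flink API, and the sketch assumes a job needs at most one slot per parallel instance. The last call illustrates one scenario that would give a ceiling of 8 (only one TaskManager actually registered); that is an assumption for illustration, not a confirmed diagnosis.

```java
// Hypothetical helper, NOT Flink API: slot arithmetic for the cluster described above.
final class SlotMath {

    // Total slots = NumTaskManagers * NumSlotsPerTaskManager.
    static int totalSlots(int numTaskManagers, int slotsPerTaskManager) {
        return numTaskManagers * slotsPerTaskManager;
    }

    // True if the requested parallelism does not exceed the slots
    // offered by the TaskManagers that are actually registered.
    static boolean fits(int requestedParallelism, int registeredTaskManagers, int slotsPerTaskManager) {
        return requestedParallelism <= totalSlots(registeredTaskManagers, slotsPerTaskManager);
    }

    public static void main(String[] args) {
        System.out.println(totalSlots(17, 8));   // prints 136
        System.out.println(fits(136, 17, 8));    // prints true: all 17 TaskManagers registered
        System.out.println(fits(136, 1, 8));     // prints false: assumed case of a single registered TaskManager
    }
}
```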

Best,
Max

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/config.html


