
Re: Flink CLI does not return after submitting yarn job in detached mode


Hi Marvin777,

That is not correct. This is the Flink on YARN single-job mode, which should use the "-yd" parameter.

Hi Madhav,

I think I have found the problem. The log line "Starting program in interactive mode" in your output comes from here [1].

That branch is taken based on the result of the check method "isUsingInteractiveMode".

The source code for this method is here [2]; it returns true when the "program" field is null. And when is this field null? That is decided here [3].

So, based on the source code, I suggest you explicitly specify the class that contains the main method in the CLI arguments (for example, with the -c option of "flink run").
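To make the check described above concrete, here is a minimal, self-contained sketch of the logic. It is not the Flink source: the class names (InteractiveModeSketch, PackagedProgramSketch, PlanBasedJob, MainMethodJob) and the nested Program interface are hypothetical stand-ins, and it assumes, based on my reading of the code linked at [3], that the "program" field stays null when the entry class only exposes a main method rather than implementing the Program interface.

```java
// Simplified illustration only; all names here are hypothetical stand-ins,
// not the real classes from flink-clients.
public class InteractiveModeSketch {

    /** Stand-in for Flink's Program interface (assumption for this sketch). */
    interface Program {}

    /** An entry class that implements the Program interface. */
    static class PlanBasedJob implements Program {}

    /** An entry class that only has a static main method. */
    static class MainMethodJob {
        public static void main(String[] args) {}
    }

    /** Minimal model of the decision described in the mail above. */
    static class PackagedProgramSketch {
        // Stays null when the entry class does not implement Program.
        private final Program program;

        PackagedProgramSketch(Class<?> entryClass) throws Exception {
            if (Program.class.isAssignableFrom(entryClass)) {
                this.program = (Program) entryClass.getDeclaredConstructor().newInstance();
            } else {
                this.program = null;
            }
        }

        /** Returns true when "program" is null, as described for [2]. */
        boolean isUsingInteractiveMode() {
            return program == null;
        }
    }

    public static void main(String[] args) throws Exception {
        PackagedProgramSketch mainOnly = new PackagedProgramSketch(MainMethodJob.class);
        PackagedProgramSketch planBased = new PackagedProgramSketch(PlanBasedJob.class);
        System.out.println("main-method job interactive: " + mainOnly.isUsingInteractiveMode());
        System.out.println("Program-based job interactive: " + planBased.isUsingInteractiveMode());
    }
}
```

Under these assumptions, a jar whose entry point is a plain main method ends up on the "interactive mode" path, which matches the log line in your output.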



[1]: https://github.com/apache/flink/blob/release-1.4.2/flink-clients/src/main/java/org/apache/flink/client/program/ClusterClient.java#L380

[2]: https://github.com/apache/flink/blob/release-1.4.2/flink-clients/src/main/java/org/apache/flink/client/program/PackagedProgram.java#L276

[3]: https://github.com/apache/flink/blob/release-1.4.2/flink-clients/src/main/java/org/apache/flink/client/program/PackagedProgram.java#L217

Thanks, vino.

On Thu, Aug 16, 2018 at 11:00 AM, Marvin777 <xymaqingxiang777@xxxxxxxxx> wrote:
Hi, Madhav,
 
Try changing the command from

./flink-1.4.2/bin/flink run -m yarn-cluster -yd -yn 2 -yqu "default"  -ytm 2048 myjar.jar 

to

./flink-1.4.2/bin/flink run -m yarn-cluster -d -yn 2 -yqu "default"  -ytm 2048 myjar.jar 




On Thu, Aug 16, 2018 at 5:01 AM, madhav Kelkar <madhav.kelkar@xxxxxxxxx> wrote:
Hi there,
  
I am trying to run a single Flink job on YARN in detached mode. As per the docs for Flink 1.4.2, I am using -yd to do that.

The problem I am having is that the flink bash script doesn't terminate and return until I press Ctrl+C. In detached mode, I would expect the Flink CLI to return as soon as the YARN job is submitted. Is there something I am missing? Here is the exact output I get:



./flink-1.4.2/bin/flink run -m yarn-cluster -yd -yn 2 -yqu "default"  -ytm 2048 myjar.jar \
....program arguments omitted


Using the result of 'hadoop classpath' to augment the Hadoop classpath: /Users/makelkar/work/hadoop-2.7.3/etc/hadoop:/Users/makelkar/work/hadoop-2.7.3/share/hadoop/common/lib/*:/Users/makelkar/work/hadoop-2.7.3/share/hadoop/common/*:/Users/makelkar/work/hadoop-2.7.3/share/hadoop/hdfs:/Users/makelkar/work/hadoop-2.7.3/share/hadoop/hdfs/lib/*:/Users/makelkar/work/hadoop-2.7.3/share/hadoop/hdfs/*:/Users/makelkar/work/hadoop-2.7.3/share/hadoop/yarn/lib/*:/Users/makelkar/work/hadoop-2.7.3/share/hadoop/yarn/*:/Users/makelkar/work/hadoop-2.7.3/share/hadoop/mapreduce/lib/*:/Users/makelkar/work/hadoop-2.7.3/share/hadoop/mapreduce/*:/Users/makelkar/work/hadoop-2.7.3/contrib/capacity-scheduler/*.jar
2018-08-15 14:39:36,873 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2018-08-15 14:39:36,873 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2018-08-15 14:39:36,921 INFO  org.apache.hadoop.yarn.client.RMProxy                         - Connecting to ResourceManager at /0.0.0.0:8032
2018-08-15 14:39:37,226 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Cluster specification: ClusterSpecification{masterMemoryMB=1024, taskManagerMemoryMB=2048, numberTaskManagers=2, slotsPerTaskManager=1}
2018-08-15 14:39:37,651 WARN  org.apache.flink.yarn.YarnClusterDescriptor                   - The configuration directory ('/Users/makelkar/work/flink/flink-1.4.2/conf') contains both LOG4J and Logback configuration files. Please delete or rename one of them.
2018-08-15 14:39:37,660 INFO  org.apache.flink.yarn.Utils                                   - Copying from file:/Users/makelkar/work/flink/flink-1.4.2/conf/logback.xml to hdfs://localhost:9000/user/makelkar/.flink/application_1534188161088_0019/logback.xml

2018-08-15 14:39:37,986 INFO  org.apache.flink.yarn.Utils                                   - Copying from file:/Users/makelkar/work/flink/flink-1.4.2/lib/log4j-1.2.17.jar to hdfs://localhost:9000/user/makelkar/.flink/application_1534188161088_0019/lib/log4j-1.2.17.jar
2018-08-15 14:39:38,011 INFO  org.apache.flink.yarn.Utils                                   - Copying from file:/Users/makelkar/work/flink/flink-1.4.2/lib/flink-dist_2.11-1.4.2.jar to hdfs://localhost:9000/user/makelkar/.flink/application_1534188161088_0019/lib/flink-dist_2.11-1.4.2.jar
2018-08-15 14:39:38,586 INFO  org.apache.flink.yarn.Utils                                   - Copying from file:/Users/makelkar/work/flink/flink-1.4.2/lib/flink-python_2.11-1.4.2.jar to hdfs://localhost:9000/user/makelkar/.flink/application_1534188161088_0019/lib/flink-python_2.11-1.4.2.jar
2018-08-15 14:39:38,603 INFO  org.apache.flink.yarn.Utils                                   - Copying from file:/Users/makelkar/work/flink/flink-1.4.2/conf/log4j.properties to hdfs://localhost:9000/user/makelkar/.flink/application_1534188161088_0019/log4j.properties

2018-08-15 14:39:39,002 INFO  org.apache.flink.yarn.Utils                                   - Copying from file:/Users/makelkar/work/flink/flink-1.4.2/lib/flink-dist_2.11-1.4.2.jar to hdfs://localhost:9000/user/makelkar/.flink/application_1534188161088_0019/flink-dist_2.11-1.4.2.jar
2018-08-15 14:39:39,401 INFO  org.apache.flink.yarn.Utils                                   - Copying from /var/folders/b6/_t_6q0vs3glcggp_8rgyxxl40000gn/T/application_1534188161088_0019-flink-conf.yaml8441703337078262150.tmp to hdfs://localhost:9000/user/makelkar/.flink/application_1534188161088_0019/application_1534188161088_0019-flink-conf.yaml8441703337078262150.tmp
2018-08-15 14:39:39,836 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Submitting application master application_1534188161088_0019
2018-08-15 14:39:39,858 INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl         - Submitted application application_1534188161088_0019
2018-08-15 14:39:39,858 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Waiting for the cluster to be allocated
2018-08-15 14:39:39,859 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Deploying cluster, current state ACCEPTED
2018-08-15 14:39:47,733 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - YARN application has been deployed successfully.
2018-08-15 14:39:47,733 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - The Flink YARN client has been started in detached mode. In order to stop Flink on YARN, use the following command or a YARN web interface to stop it:
yarn application -kill application_1534188161088_0019
Please also note that the temporary files of the YARN session in the home directoy will not be removed.
Cluster started: Yarn cluster with application id application_1534188161088_0019
Using address localhost:51252 to connect to JobManager.
Using the parallelism provided by the remote cluster (2). To use another parallelism, set it at the ./bin/flink client.
Starting execution of program
2018-08-15 14:39:47,757 INFO  org.apache.flink.yarn.YarnClusterClient                       - Starting program in interactive mode


I have to press Ctrl+C to kill this shell script. When I do that, the program prints the messages below:

2018-08-15 14:39:56,332 INFO  org.apache.flink.yarn.YarnClusterClient                       - Shutting down YarnClusterClient from the client shutdown hook
2018-08-15 14:39:56,333 INFO  org.apache.flink.yarn.YarnClusterClient                       - Disconnecting YarnClusterClient from ApplicationMaster

Thanks,
Madhav.