[jira] [Created] (FLINK-10286) Flink Persist Invalid Job Graph in Zookeeper
Sayat Satybaldiyev created FLINK-10286:
Summary: Flink Persist Invalid Job Graph in Zookeeper
Issue Type: Bug
Affects Versions: 1.6.0
Reporter: Sayat Satybaldiyev
In HA mode on Flink 1.6, Flink persists the job graph in ZooKeeper even if the job was not accepted by the JobManager. This is particularly bad because if the JM later dies and restarts, it tries to recover the job, fails, and dies completely.
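To make the expected behavior concrete, here is a minimal sketch of a submission path that persists the job and then removes it again when setup fails. This is not Flink's code: the class `SubmissionSketch`, the in-memory map standing in for the ZooKeeper job graph store, and the simplified scheme check are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not Flink code: every name here is invented for illustration.
class SubmissionSketch {
    // In-memory stand-in for the HA job graph store that Flink keeps in ZooKeeper.
    final Map<String, String> jobGraphStore = new HashMap<>();

    // Stand-in for JobManager setup; it rejects any scheme other than hdfs or file.
    static void setUpJobManager(String checkpointUri) {
        if (!checkpointUri.startsWith("hdfs://") && !checkpointUri.startsWith("file://")) {
            throw new IllegalStateException("Could not set up JobManager for " + checkpointUri);
        }
    }

    // Persists the job first, then rolls the entry back if setup fails.
    boolean submit(String jobId, String checkpointUri) {
        jobGraphStore.put(jobId, checkpointUri);
        try {
            setUpJobManager(checkpointUri);
            return true;
        } catch (IllegalStateException e) {
            // Without this cleanup the rejected job would remain recoverable after a restart.
            jobGraphStore.remove(jobId);
            return false;
        }
    }

    public static void main(String[] args) {
        SubmissionSketch sketch = new SubmissionSketch();
        boolean accepted = sketch.submit("9680f02ae2f3806c3b4da25bfacd0749", "hddd:///tmp/flink/rocksdb");
        System.out.println("accepted=" + accepted + ", persisted=" + sketch.jobGraphStore.size());
    }
}
```

In this sketch a rejected job leaves nothing behind in the store, so a restarted process would have nothing invalid to recover.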
How to reproduce:
1. Have a Flink 1.6 cluster running in HA mode
2. Submit an invalid job. In my case, I put an invalid file system scheme for the RocksDB state backend:
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
RocksDBStateBackend backend = new RocksDBStateBackend("hddd:///tmp/flink/rocksdb");
The program finished with the following exception:
org.apache.flink.client.program.ProgramInvocationException: Could not submit job (JobID: 9680f02ae2f3806c3b4da25bfacd0749)
3. JM does not accept the job. This is the truncated error log from JM:
Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit job.
... 24 more
Caused by: java.util.concurrent.CompletionException: java.lang.RuntimeException: org.apache.flink.runtime.client.JobExecutionException: Could not set up JobManager
Caused by: java.lang.RuntimeException: Failed to start checkpoint ID counter: Could not find a file system implementation for scheme 'hddd'. The scheme is not directly supported by Flink and no Hadoop file system to support this scheme could be loaded.
4. Go to ZK and observe that JM has saved the job to ZK
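The failure in step 2 comes from a misspelled URI scheme, so a client-side check before submission could catch this particular mistake. Below is a minimal sketch using only `java.net.URI`. The class `SchemeCheck` and its allow-list are hypothetical and not part of Flink; the set of schemes actually available depends on the file systems installed in the cluster.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not a Flink API.
class SchemeCheck {
    // Assumed allow-list; adjust it to the file systems your cluster supports.
    static final Set<String> KNOWN_SCHEMES = new HashSet<>(Arrays.asList("file", "hdfs", "s3"));

    // Returns true only if the URI has a scheme that is on the allow-list.
    static boolean hasKnownScheme(String checkpointUri) {
        String scheme = URI.create(checkpointUri).getScheme();
        return scheme != null && KNOWN_SCHEMES.contains(scheme.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(hasKnownScheme("hddd:///tmp/flink/rocksdb"));
        System.out.println(hasKnownScheme("hdfs:///tmp/flink/rocksdb"));
    }
}
```

Such a check only guards against this one class of invalid job; it does not address the underlying problem that a rejected job graph stays in ZooKeeper.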