
Re: FileNotFoundException on starting the job

My guess is that the tmp directory got cleaned on your host and Flink couldn't restore memory state from it upon startup.
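To illustrate why a cleaned directory would produce exactly this exception: the trace fails in BufferSpiller.createSpillingChannel, inside the RandomAccessFile constructor. Opening a RandomAccessFile in "rw" mode creates a missing file, but it does not create a missing parent directory, and in that case it throws FileNotFoundException. Here is a minimal sketch in plain Java, not Flink code. The class name, the "flink-io-demo-" prefix, and the "demo.0.buffer" file name are made up for the demonstration.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;

public class MissingSpillDirDemo {

    /**
     * Tries to open a file under the given directory in "rw" mode.
     * Returns true if the open fails with FileNotFoundException.
     */
    static boolean openFails(File dir) throws IOException {
        // Hypothetical file name, only loosely modelled on the one in the trace.
        File spillFile = new File(dir, "demo.0.buffer");
        try (RandomAccessFile raf = new RandomAccessFile(spillFile, "rw")) {
            return false; // the file was opened (and created if it was missing)
        } catch (FileNotFoundException e) {
            return true;  // typically means the parent directory is gone
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a directory like /tmp/flink-io-<uuid>.
        File dir = Files.createTempDirectory("flink-io-demo-").toFile();
        System.out.println("directory present, open fails: " + openFails(dir));

        // Simulate a tmp cleaner: remove the file, then the directory itself.
        new File(dir, "demo.0.buffer").delete();
        dir.delete();
        System.out.println("directory removed, open fails: " + openFails(dir));
    }
}
```

If that is what happened here, pointing Flink's temporary directories at a location that is not periodically cleaned should avoid it. Assuming Flink 1.5 or later, the option is io.tmp.dirs in flink-conf.yaml (older releases used taskmanager.tmp.dirs). It defaults to the system temp directory, which would explain the /tmp path even though nothing in your setup configures it.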

On Thu, Nov 1, 2018 at 8:51 PM Dmitry Minaev <minaevd@xxxxxxxxx> wrote:
Hi everyone,

I'm having an issue when restarting a job in Flink. I'm doing a simple stop with savepoint and then starting from the savepoint. Savepoints are stored in a separate folder, and there is no configuration for the "/tmp" folder in my setup. There is only 1 task manager and parallelism is 1.

I'm getting FileNotFoundException:

31 Oct 2018 23:40:35,837 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - filter-business-metrics -> Sink: data_feed (1/1) (51ce53532932c33805291dc188d2f99e) switched from DEPLOYING to RUNNING.
31 Oct 2018 23:40:35,837 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - agents-working-on-interactions (1/1) (72a916158d07f2353fb270848d95ba2f) switched from DEPLOYING to RUNNING.
31 Oct 2018 23:40:35,929 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - interaction-details (1/1) (c004e64e90c0dbd3bc007459bc3d7420) switched from RUNNING to FAILED.
java.io.FileNotFoundException: /tmp/flink-io-7bfd6603-c115-463d-bcfc-b97e31be5a37/f7ce787242e6afd91c3cbeccc2f74bc4a7dd0e6e600ff83e51bc5be9a95750f9.0.buffer (No such file or directory)
        at java.io.RandomAccessFile.open0(Native Method)
        at java.io.RandomAccessFile.open(RandomAccessFile.java:316)
        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243)
        at org.apache.flink.streaming.runtime.io.BufferSpiller.createSpillingChannel(BufferSpiller.java:259)
        at org.apache.flink.streaming.runtime.io.BufferSpiller.<init>(BufferSpiller.java:120)
        at org.apache.flink.streaming.runtime.io.BarrierBuffer.<init>(BarrierBuffer.java:149)
        at org.apache.flink.streaming.runtime.io.StreamInputProcessor.<init>(StreamInputProcessor.java:129)
        at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.init(OneInputStreamTask.java:56)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:235)
        at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
        at java.lang.Thread.run(Thread.java:748)

I've checked the logs and there are no errors prior to that. The job was stopped with no issues, and on restart it started normally and switched multiple operators to the RUNNING state. For several other operators, however, it throws this FileNotFoundException.

Any help is appreciated.

-- Regards, Dmitry