osdir.com


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[jira] [Created] (FLINK-10434) Blob file not removed after job execution.


Piotr Szczepanek created FLINK-10434:
----------------------------------------

             Summary: Blob file not removed after job execution.
                 Key: FLINK-10434
                 URL: https://issues.apache.org/jira/browse/FLINK-10434
             Project: Flink
          Issue Type: Bug
    Affects Versions: 1.5.2
         Environment: Flink container running on Yarn.
            Reporter: Piotr Szczepanek


We have Flink container running on Yarn, and we're using YarnClusterClient for job submission. After successfully/failed job execution it looks like blob file for that job is deleted, but there is still handle from Flink process to that file. As a result the file is
not removed from machine, until we restart the whole Flink container.

>From the size comparison it looks like mentioned file is actually our jar with Flink job.

This is quite big problem for us, as we are submitting many jobs, every submission upload new jar and created new blob file, which is never removed from disc until we restart container. We already faced out of disc space.



Results of lsof are:
During job execution:
lsof /flinkDir | grep job_dbafb671b0d60ed8a8ec2651fe59303b
java 11883 yarn mem REG 253,2 112384928 109973177
/flinkDir/yarn/../application_1536668870638_5555/blobStore-a1bcdbd4-5388-4c56-8052-6051f5af38dd/job_dbafb671b0d60ed8a8ec2651fe59303b/blob_p-8771d9ccac35e28d8571ac8957feaaecdebaeadd-7748aec7fe7369ca26181d0f94b1a578
java 11883 yarn 1837r REG 253,2 112384928 109973177
/flinkDir/yarn/../application_1536668870638_5555/blobStore-a1bcdbd4-5388-4c56-8052-6051f5af38dd/job_dbafb671b0d60ed8a8ec2651fe59303b/blob_p-8771d9ccac35e28d8571ac8957feaaecdebaeadd-7748aec7fe7369ca26181d0f94b1a578

After job execution:
lsof /flinkDir | grep job_dbafb671b0d60ed8a8ec2651fe59303b
java 11883 yarn DEL REG 253,2 109973177
/flinkDir/yarn/../application_1536668870638_5555/blobStore-a1bcdbd4-5388-4c56-8052-6051f5af38dd/job_dbafb671b0d60ed8a8ec2651fe59303b/blob_p-8771d9ccac35e28d8571ac8957feaaecdebaeadd-7748aec7fe7369ca26181d0f94b1a578
java 11883 yarn 1837r REG 253,2 112384928 109973177
/flinkDir/yarn/../application_1536668870638_5555/blobStore-a1bcdbd4-5388-4c56-8052-6051f5af38dd/job_dbafb671b0d60ed8a8ec2651fe59303b/blob_p-8771d9ccac35e28d8571ac8957feaaecdebaeadd-7748aec7fe7369ca26181d0f94b1a578
*(deleted)*



After restarting Flink container this handle disappeared.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)