
Re: Retaining uploaded job jars on Flink HA restarts on Kubernetes

Yes, I am aware that Flink won't need the uploaded jars to restart the jobs. But it would have been awesome if it could have retained them.

Thanks all for the help.

- Rohil

On Mon 7 May, 2018, 5:32 PM Sampath Bhat, <sam414255path@xxxxxxxxx> wrote:
Hi Rohil

You do not need to upload the jar again when the JobManager restarts in an HA environment. Only the jar stored in web.upload.dir will be deleted, which is fine. The jars the JobManager needs to restart the jobs are stored in high-availability.storageDir, along with the job graphs and other job-related data. So when HA is enabled and the JobManager restarts for whatever reason, it looks in the high-availability.storageDir location to restart the previously running jobs.
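
As a rough illustration of the settings involved, here is a flink-conf.yaml sketch. The ZooKeeper address and the storageDir path are placeholders I made up, not values from Rohil's configuration; only the web.upload.dir value comes from this thread.

```yaml
# flink-conf.yaml fragment (illustrative sketch, placeholder values)
high-availability: zookeeper
high-availability.zookeeper.quorum: zookeeper:2181      # placeholder address
high-availability.storageDir: file:///data/flink-ha     # placeholder path on the shared volume
web.upload.dir: /data/flink-uploads                     # jars uploaded via the web UI; not used for recovery
```

With a setup along these lines, job recovery reads from high-availability.storageDir, so losing the contents of web.upload.dir should not affect jobs that were already running.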

On Mon, May 7, 2018 at 5:22 PM, Rohil Surana <rohilsurana96@xxxxxxxxx> wrote:
But why was the decision taken to delete the jars automatically rather than retain them? To me it makes sense to keep the uploaded jars so the user doesn't have to upload them again when the JobManager restarts.

- Rohil

On Mon, May 7, 2018 at 12:16 PM, Chesnay Schepler <chesnay@xxxxxxxxxx> wrote:
The jar directory is automatically deleted when a JobManager shuts down.

In other words, there is no way to retain uploaded jars if a JobManager dies, and no way to point a JobManager to a pre-existing directory.

On 07.05.2018 08:18, Chirag Dewan wrote:
I think you are looking for jobmanager.web.tmpdir along with jobmanager.web.upload.dir.

From the documentation:

  • jobmanager.web.tmpdir: This configuration parameter allows defining the Flink web directory to be used by the web interface. The web interface will copy its static files into the directory. Also uploaded job jars are stored in the directory if not overridden. By default, the temporary directory is used.

  • jobmanager.web.upload.dir: The config parameter defining the directory for uploading the job jars. If not specified a dynamic directory will be used under the directory specified by jobmanager.web.tmpdir.
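
For example, the two options might be set in flink-conf.yaml like this. This is only a sketch: the tmpdir path is a made-up placeholder, and the upload path is the one Rohil mentioned.

```yaml
# flink-conf.yaml fragment (illustrative sketch)
jobmanager.web.tmpdir: /tmp/flink-web            # placeholder; static web files are copied here
jobmanager.web.upload.dir: /data/flink-uploads   # uploaded job jars go here instead of under tmpdir
```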



On Sunday, 6 May, 2018, 12:29:43 AM IST, Rohil Surana <rohilsurana96@xxxxxxxxx> wrote:


I have a very basic Flink HA setup on Kubernetes and want to retain job jars across JobManager restarts.

For HA I am using ZooKeeper and an NFS drive mounted on all pods (JobManager and TaskManagers), which is used for checkpoints. I have also set `web.upload.dir: /data/flink-uploads`, where /data is the NFS volume.

Still, when the JobManager is killed, the uploaded jars are lost.

I would really appreciate it if anyone could point out what I am missing.
Here is the link to my flink-conf.yaml - https://pastebin.com/dt7tGTYQ


- Rohil