
Re: Use KubernetesExecutor to launch tasks into a Dask cluster in Kubernetes


@Kyle, what I'm trying to understand is why you would want to run a
static Dask cluster when you can launch Dask containers/pods using the
executor.

Seems like there are a few possible options:

1. Add the Dask pip modules to the Airflow Docker image and reference that
image in the executor_config whenever you need to launch a Dask task. This
would allow you to launch Dask jobs whenever you want, in an elastic manner.
2. If there are benefits to keeping the static Dask cluster, then writing a
DaskOperator would be pretty straightforward. You could use the
DaskExecutor as a scaffold and basically write an operator that submits a
request to the Dask cluster and then monitors the job until the task is
finished. You could also check out the KubernetesPodOperator to see how
that would look.
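For option 1, a minimal sketch of what the per-task override might look like in a DAG file. The image name and task names here are hypothetical, and this assumes the KubernetesExecutor's per-task `executor_config` override:

```python
# Sketch only: "my-airflow-dask:latest" is a hypothetical image built from
# the stock Airflow image with the dask pip packages added on top.
executor_config = {
    "KubernetesExecutor": {
        # Only this task's pod uses the Dask-enabled image; every other
        # task keeps the default worker image.
        "image": "my-airflow-dask:latest",
    }
}

# In the DAG file this dict would be passed to the operator, e.g.:
#
#   PythonOperator(
#       task_id="train_model",          # hypothetical task
#       python_callable=train,          # hypothetical callable
#       executor_config=executor_config,
#   )
```

Since the override is per task, you only pay the cost of the larger Dask image for the tasks that actually need it.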



On Sun, Apr 29, 2018 at 2:58 PM Kyle Hamlin <hamlin.kn@xxxxxxxxx> wrote:

> Hi Fokko,
>
> So it's always been my intention to use the KubernetesExecutor. What I'm
> trying to figure out is how to pair the KubernetesExecutor with a
> Dask cluster, since Dask clusters have many optimizations for ML-type
> tasks.
>
> On Sat, Apr 28, 2018 at 2:29 PM Driesprong, Fokko <fokko@xxxxxxxxxxxxxx>
> wrote:
>
> > Also, one of the main benefits of the Kubernetes Executor is having a
> > Docker image that contains all the dependencies that you need for your
> > job. Personally, I would switch to Kubernetes when it leaves the
> > experimental stage.
> >
> > Cheers, Fokko
> >
> > 2018-04-28 16:27 GMT+02:00 Kyle Hamlin <hamlin.kn@xxxxxxxxx>:
> >
> > > I don't have a Dask cluster yet, but I'm interested in taking
> > > advantage of it for ML tasks. My use case would be bursting a lot of
> > > ML jobs into a Dask cluster all at once.
> > > From what I understand, Dask clusters utilize caching to help speed up
> > > jobs, so I don't know if it makes sense to launch a Dask cluster for
> > > every single ML job. Conceivably, I could just have a single Dask
> > > worker running 24/7, and when it's time to burst, k8s could autoscale
> > > the Dask workers as more ML jobs are launched into the Dask cluster?
> > >
> > > On Fri, Apr 27, 2018 at 10:35 PM Daniel Imberman <
> > > daniel.imberman@xxxxxxxxx>
> > > wrote:
> > >
> > > > Hi Kyle,
> > > >
> > > > So you have a static Dask cluster running in your k8s cluster? Is
> > > > there any reason you wouldn't just launch the Dask cluster for the
> > > > job you're running and then tear it down? I feel like with k8s the
> > > > elasticity is one of the main benefits.
> > > >
> > > > On Fri, Apr 27, 2018 at 12:32 PM Kyle Hamlin <hamlin.kn@xxxxxxxxx>
> > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > If I have a Kubernetes cluster running in DC/OS and a Dask
> > > > > cluster running in that same Kubernetes cluster, is it possible,
> > > > > and does it make sense, to use the KubernetesExecutor to launch
> > > > > tasks into the Dask cluster (these are ML jobs with sklearn)? I
> > > > > feel like there is a bit of inception going on here in my mind
> > > > > and I just want to make sure a setup like this makes sense.
> > > > > Thanks in advance for anyone's input!
> > > > >
> > > >
> > >
> > >
> > > --
> > > Kyle Hamlin
> > >
> >
>
>
> --
> Kyle Hamlin
>