Re: Best Practice of Airflow Setting-Up & Usage


Many thanks for sharing, Manu!

I realise I missed an important question: how many DAGs/tasks is your
Airflow instance dealing with?

I would like to share the current status in my organisation as well:

*- Setting-up*: we use both the "one-time" and the container approach, in
different environments. We plan to migrate all of them to containers, for
the sake of maintainability and faster failure recovery.
*- Executors*: CeleryExecutor. Celery Flower also provides additional
monitoring, which is very helpful.
*- Scale*: we have a few workers, assigned to two different queues.
*- Queues*: we currently use the *queue* feature to handle the environment
dependencies of different tasks (for example, some DAGs need specific
software which is only installed on one worker). I'm also planning to set
up queues based on task nature (CPU-bound, network-bound) in the future.
*- SLA*: our team is looking at ~3 minutes.
*- # of DAGs/Tasks*: we maintain a few hundred DAGs, with about 5 tasks per
DAG on average. The number of DAGs/tasks actually puts pressure on the SLA
as well.
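
To make the planned split by task nature concrete, here is a minimal
sketch in plain Python. The queue names, the `pick_queue` and
`operator_kwargs` helpers, and the task ids are all hypothetical and only
for illustration; the one thing taken from Airflow itself is that
operators accept a `queue` argument.

```python
# Hypothetical queue names: the actual queues used are not named in this thread.
QUEUE_BY_NATURE = {
    "cpu_bound": "cpu_queue",
    "network_bound": "network_queue",
}
DEFAULT_QUEUE = "default"


def pick_queue(task_nature):
    """Return the queue name for a task nature, falling back to the default."""
    return QUEUE_BY_NATURE.get(task_nature, DEFAULT_QUEUE)


def operator_kwargs(task_id, task_nature):
    """Build keyword arguments that could be passed on to an Airflow operator.

    `task_id` and `queue` are real operator arguments; the values used here
    are illustrative assumptions.
    """
    return {"task_id": task_id, "queue": pick_queue(task_nature)}


if __name__ == "__main__":
    print(operator_kwargs("download_files", "network_bound"))
    print(operator_kwargs("train_model", "cpu_bound"))
    print(operator_kwargs("send_report", "unknown_nature"))
```

In a DAG file the resulting dict could be unpacked into an operator call,
for example `BashOperator(bash_command="...", dag=dag, **kwargs)`, and a
worker would then be started so that it listens to the matching queue
(in 1.10, `airflow worker -q network_queue`). A worker serving the
network-bound queue could be given a concurrency well above its number of
cores, as suggested in the original question.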

Looking forward to more input! Thanks!


XD


On Wed, Sep 5, 2018 at 10:29 PM Manu Zhang <owenzhang1990@xxxxxxxxx> wrote:

> Hi Xiaodong,
>
> Thanks for preparing the questions.
>
> Setting-Up: In container (previously Swarm and now K8S)
> Executor: CeleryExecutor
> Scale: two airflow workers
> Queue: No
> SLA: We don't have a hard limit, but it would be unbearable for a DAG to
> take more than one minute to be scheduled.
>
> Airflow has been running steadily and the Web UI is great for monitoring
> DAG status (we added a button to allow users to upload their DAG files,
> though). The main frustration is that everything is in UTC (we are in
> GMT+8), although we can now set up a DAG in a local timezone.
> This has been confusing and inconvenient since users' data is usually
> partitioned by local time.
>
> Thanks,
> Manu Zhang
>
>
> On Wed, Sep 5, 2018 at 9:31 PM airflowuser
> <airflowuser@xxxxxxxxxxxxxx.invalid> wrote:
>
> > Hi,
> >
> > Setting up Airflow for the first time is a BIG DEAL.
> > Contrary to the community's intention of an easy install with SQLite
> > and SequentialExecutor, for an actual working environment you need to
> > change a lot of settings. It doesn't help much that the demo install went
> > smoothly.
> >
> > Support for issues and problems is very limited. There is no real
> > community on Stack Overflow, and on Gitter no one replies other than Ash
> > (and maybe a few others occasionally).
> >
> > Don't take this as criticism. In the end, all of you are donating your
> > time. I am simply writing my impressions. To be honest, we were very close
> > to abandoning this project. May I suggest a paid "premium support" model,
> > which would be a contribution to the community? Support in terms of
> > questions, installation help, etc.
> >
> >
> > To your questions:
> > 1. one-time
> > 2. LocalExecutor
> >
> > Those are not what we wanted; they were the only things we could get to
> > work. Hopefully we will install 1.10.1 from scratch and try to solve all
> > the issues we encountered.
> >
> > 3. I use Queues.
> > 4. Don't use SLAs.
> >
> >
> >
> > ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> > On September 5, 2018 3:56 PM, Deng Xiaodong <xd.deng.r@xxxxxxxxx> wrote:
> >
> > > Hi folks,
> > >
> > > Could you kindly share how your organization sets up Airflow and uses
> > > it, especially in terms of architecture? For example,
> > >
> > > -   Setting-Up: Do you install Airflow in a "one-time" fashion, or in a
> > >     containerized fashion?
> > >
> > > -   Executor: Which executor are you using (LocalExecutor,
> > >     CeleryExecutor, etc.)? I believe most production environments are
> > >     using CeleryExecutor?
> > >
> > > -   Scale: If using Celery, how many worker nodes do you normally add?
> > >     (Of course, this depends on the workload and the performance of your
> > >     worker nodes.)
> > >
> > > -   Queue: Is the Queue feature
> > >     https://airflow.apache.org/concepts.html#queues used in your
> > >     architecture? For what advantage? (For example, explicitly assigning
> > >     network-bound tasks to a worker node whose parallelism can be much
> > >     higher than its # of cores.)
> > >
> > > -   SLA: Do you have any SLA for your scheduling? (This is inspired by
> > >     @yrqls21's PR 3830
> > >     https://github.com/apache/incubator-airflow/pull/3830)
> > >
> > > -   etc.
> > >
> > > Airflow's set-up can be quite flexible, but I believe there are some
> > > best practices, especially in organisations where scalability is
> > > essential.
> > >
> > >     Thanks for sharing in advance!
> > >
> > >     Best regards,
> > >     XD
> > >
> >
> >
> >
>