Re: Best Practice of Airflow Setting-Up & Usage
Thanks for preparing the questions.
Setting-Up: In container (previously Swarm and now K8S)
Scale: two airflow workers
SLA: We don't have a hard limit, but it would be unacceptable for a DAG to
take more than one minute to be scheduled.
Airflow has been running steadily, and the Web UI is great for monitoring DAG
status (we added a button that lets users upload their DAG files, though).
The main frustration is that everything is in UTC (we are in GMT+8),
although we can now set up a DAG in a local timezone.
This has been confusing and inconvenient, since users' data is usually
partitioned by local time.
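To illustrate the mismatch, here is a minimal sketch using only the Python
standard library. The function name local_partition_date and the fixed GMT+8
offset are assumptions for illustration; this is not Airflow API, just the
conversion a task would need before picking a local-time partition.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed GMT+8 offset with no daylight saving time.
LOCAL_TZ = timezone(timedelta(hours=8))


def local_partition_date(execution_date_utc: datetime) -> str:
    """Return the local-time date partition (YYYY-MM-DD) for a UTC timestamp."""
    if execution_date_utc.tzinfo is None:
        # Treat naive timestamps as UTC.
        execution_date_utc = execution_date_utc.replace(tzinfo=timezone.utc)
    return execution_date_utc.astimezone(LOCAL_TZ).strftime("%Y-%m-%d")


# 2018-09-05 18:00 UTC is already 2018-09-06 02:00 in GMT+8.
example = datetime(2018, 9, 5, 18, 0, tzinfo=timezone.utc)
print(local_partition_date(example))  # 2018-09-06
```

Any run between 16:00 and 24:00 UTC falls on the next calendar day locally,
which is where the partition confusion comes from.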
On Wed, Sep 5, 2018 at 9:31 PM airflowuser
> Setting up Airflow for the first time is a BIG DEAL.
> Unlike the community's initial intention of an easy install with SQLite
> and SequentialExecutor, for an actually working environment you need to
> change a lot of settings. It doesn't help much that the demo install went
> The support for issues and problems is very limited. There is no actual
> community on StackOverflow, and on Gitter no one replies other than Ash
> (and maybe a few more occasionally).
> Don't consider this as criticism. In the end, all of you are donating your
> time. I am simply writing my impressions. To be honest, we were very close
> to abandoning this project. May I suggest a paid "premium support" model,
> with the payments going to the community as a contribution? Support in
> terms of questions, installation help, etc.
> To your questions:
> 1. one-time
> 2. LocalExecutor
> Those are not because this is what we wanted; it's because that was the
> only thing that we could make work. Hopefully we will try to install
> 1.10.1 from scratch and try to solve all the issues we encountered.
> 3. I use Queues.
> 4. Don't use SLAs.
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On September 5, 2018 3:56 PM, Deng Xiaodong <xd.deng.r@xxxxxxxxx> wrote:
> > Hi folks,
> > May you kindly share how your organization is setting up Airflow and
> > it? Especially in terms of architecture. For example,
> > - Setting-Up: Do you install Airflow in a "one-time" fashion, or
> > containerization fashion?
> > - Executor: Which executor are you using (LocalExecutor,
> > CeleryExecutor, etc)? I believe most production environments are
> > CeleryExecutor?
> > - Scale: If using Celery, normally how many worker nodes do you add?
> > sure this is up to workloads and performance of your worker nodes).
> > - Queue: Is the Queue feature
> > https://airflow.apache.org/concepts.html#queues used in your
> > architecture? For what advantage? (for example, explicitly assign
> > network-bound tasks to a worker node whose parallelism can be much higher
> > than its # of cores)
> > - SLA: do you have any SLA for your scheduling? (this is inspired by
> > @yrqls21's PR 3830
> > - etc.
> > Airflow's setting-up can be quite flexible, but I believe there is
> > sort of best practice, especially in the organisations where
> > scalability is essential.
> > Thanks for sharing in advance!
> > Best regards,
> > XD