Re: Best Practice of Airflow Setting-Up & Usage


Hi,

Setting up Airflow for the first time is a BIG DEAL.
The community intends the initial install to be easy, with SQLite and the SequentialExecutor, but for an actual working environment you need to change a lot of settings. It doesn't help much that the demo install went smoothly.

Support for issues and problems is very limited. There is no real community on Stack Overflow, and on Gitter no one replies other than Ash (and maybe a few others occasionally).

Don't take this as criticism. In the end, all of you are donating your time. I am simply writing down my impressions. To be honest, we were very close to abandoning this project. May I suggest a paid "premium support" option, with the payment going to the community as a contribution? Support in terms of questions, installation help, etc.


To your questions:
1. one-time
2. LocalExecutor

Those are not what we wanted; they were the only options we could get to work. Hopefully we will install 1.10.1 from scratch and try to solve all the issues we encountered.

3. I use Queues.
4. I don't use SLAs.
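
For anyone making the same move away from the demo defaults, here is a minimal sketch of the kind of settings change involved. It assumes Airflow 1.10.x, where any airflow.cfg option can be overridden through an AIRFLOW__SECTION__KEY environment variable, and a PostgreSQL metadata database reached through psycopg2. The helper name, host, user, password and database name are all hypothetical placeholders, not our actual configuration.

```python
from urllib.parse import quote_plus


def local_executor_env(db_user, db_password, db_host, db_name, db_port=5432):
    """Build environment overrides for a LocalExecutor setup (hypothetical helper).

    Assumes Airflow 1.10.x, where the executor, the metadata database
    connection string and the example-DAG switch all live in the [core]
    section of airflow.cfg.
    """
    # Escape the credentials so characters such as '@' or ':' cannot
    # break the SQLAlchemy connection URL.
    conn = "postgresql+psycopg2://{user}:{pw}@{host}:{port}/{db}".format(
        user=quote_plus(db_user),
        pw=quote_plus(db_password),
        host=db_host,
        port=db_port,
        db=db_name,
    )
    return {
        # The demo default is SequentialExecutor, which runs one task at a time.
        "AIRFLOW__CORE__EXECUTOR": "LocalExecutor",
        # The demo default is SQLite; LocalExecutor needs a database that
        # accepts concurrent connections.
        "AIRFLOW__CORE__SQL_ALCHEMY_CONN": conn,
        # Keep the bundled example DAGs out of a real environment.
        "AIRFLOW__CORE__LOAD_EXAMPLES": "False",
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    env = local_executor_env("airflow", "s3cret@pw", "db.example.internal", "airflow")
    for key in sorted(env):
        print("export {}='{}'".format(key, env[key]))
```

The printed export lines would go into the environment of both the scheduler and the webserver before running "airflow initdb" against the new database. Editing the same keys directly in airflow.cfg works as well; the environment variables just make the change easy to script.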


‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On September 5, 2018 3:56 PM, Deng Xiaodong <xd.deng.r@xxxxxxxxx> wrote:

> Hi folks,
>
> Could you kindly share how your organization is setting up Airflow and using
> it? Especially in terms of architecture. For example,
>
> -   Setting-Up: Do you install Airflow in a "one-time" fashion, or in a
>     containerized fashion?
>
> -   Executor: Which executor are you using (LocalExecutor,
>     CeleryExecutor, etc)? I believe most production environments are using
>     CeleryExecutor?
>
> -   Scale: If using Celery, normally how many worker nodes do you add? (for
>     sure this is up to workloads and performance of your worker nodes).
>
> -   Queue: is the Queue feature
>     https://airflow.apache.org/concepts.html#queues used in your
>     architecture? For what advantage? (for example, explicitly assigning
>     network-bound tasks to a worker node whose parallelism can be much higher
>     than its # of cores)
>
> -   SLA: do you have any SLA for your scheduling? (this is inspired by
>     @yrqls21's PR 3830 https://github.com/apache/incubator-airflow/pull/3830)
>
> -   etc.
>
> Airflow's setting-up can be quite flexible, but I believe there is some
> sort of best practice, especially in the organisations where scalability is
> essential.
>
> Thanks for sharing in advance!
>
> Best regards,
> XD
>