osdir.com

Re: About the project support in Airflow


It seems that all the approaches suggested so far point to running multiple instances of Airflow, but a project concept is very natural, since one user might need to handle different types of tasks.

As for multi-user support, one way is also to deploy multiple instances, but Airflow appears to provide multi-user functionality built in.

So I am not convinced that multiple instances are the right way to support multiple projects.

Thanks,
Song




On Wed, Apr 25, 2018 at 4:25 AM +0800, "Ace Haidrey" <acehaidrey@xxxxxxxxx> wrote:


Looks neat Taylor!

And regarding the original question, going off of what Maxime and Bolke said: at Pandora, it made more sense for us to have an instance per team, since each team has its own system user for prod and the instance can run all processes as that user. Alternatively, you could have a superuser that can sudo as those other system users and host many teams on a single instance, but that is a security concern (what if one team sudos as another team and accidentally overwrites data? There is nothing stopping them from doing it). It depends on what your org setup is, but let me know if there are any questions I can help with.

Ace


> On Apr 24, 2018, at 1:16 PM, Taylor Edmiston wrote:
>
> We use an approach similar to the one Bolke mentioned, running multiple
> Airflow instances.
>
> I haven't read the Pandora article yet, but we have an Astronomer Open
> Edition (fully open source) that bundles similar tools such as Prometheus,
> Grafana, Celery, etc. with Airflow, plus a Docker Compose file, if you're
> looking to get a setup like that up and running quickly.
>
> https://github.com/astronomerio/astronomer/blob/master/examples/airflow-enterprise/docker-compose.yml
> https://github.com/astronomerio/astronomer
>
> *Taylor Edmiston*
>
>
>
> On Tue, Apr 24, 2018 at 3:30 PM, Maxime Beauchemin <
> maximebeauchemin@xxxxxxxxx> wrote:
>
>> Related blog post about multi-tenant Airflow deployment out of Pandora:
>> https://engineering.pandora.com/apache-airflow-at-pandora-1d7a844d68ee
>>
>> On Tue, Apr 24, 2018 at 10:20 AM, Bolke de Bruin wrote:
>>
>>> My suggestion would be to deploy Airflow per project. You could even use
>>> Airflow to manage your CI/CD pipeline.
>>>
>>> B.
>>>
>>> Sent from my iPhone
>>>
>>>> On 24 Apr 2018, at 18:33, Maxime Beauchemin <maximebeauchemin@xxxxxxxxx> wrote:
>>>>
>>>> People have been talking about namespacing DAGs in the past. I'd recommend
>>>> using tags (many to many) instead of categories/projects (one to many).
>>>>
>>>> It should be fairly easy to add this feature. One question is whether tags
>>>> are defined as code or in the UI/db only.
>>>>
>>>> Max
>>>>
>>>>> On Tue, Apr 24, 2018 at 1:48 AM, Song Liu wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> Basically, DAGs are created for a particular project, so if I have many
>>>>> different projects, will Airflow support a project concept and organize
>>>>> them separately?
>>>>>
>>>>> Is this a known requirement, or is there already a plan for it?
>>>>>
>>>>> Thanks,
>>>>> Song
>>>>>
>>>
>>
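
For anyone skimming the thread, the difference Max draws between projects (one to many) and tags (many to many) can be sketched in plain Python. This is only an illustration of the two data models, not Airflow code: the DAG ids, project names, tag names, and helper functions are all made up for the example.

```python
# Hypothetical DAG ids and labels, for illustration only (not an Airflow API).

# One-to-many "project" model: each DAG belongs to exactly one project.
dag_project = {
    "daily_sales_etl": "sales",
    "churn_model_training": "ml",
    "sales_forecast": "sales",
}

# Many-to-many "tag" model: each DAG can carry any number of tags,
# and each tag can be attached to any number of DAGs.
dag_tags = {
    "daily_sales_etl": {"sales", "etl"},
    "churn_model_training": {"ml"},
    "sales_forecast": {"sales", "ml"},
}


def dags_in_project(project):
    """Return the DAG ids whose single project matches, sorted by id."""
    return sorted(d for d, p in dag_project.items() if p == project)


def dags_with_tag(tag):
    """Return the DAG ids that carry the given tag, sorted by id."""
    return sorted(d for d, tags in dag_tags.items() if tag in tags)


if __name__ == "__main__":
    # With projects, sales_forecast can only be listed under "sales".
    print(dags_in_project("ml"))
    # With tags, the same DAG is listed under both "sales" and "ml".
    print(dags_with_tag("ml"))
    print(dags_with_tag("sales"))
```

With the project mapping, a DAG that spans two concerns has to pick one; with tags it can be filed under both, which is the argument for tags over categories made above.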