Re: About the project support in Airflow


Hi James,

Yes, the “multi-user” feature is indeed one way to exhaust Airflow resources, but it is important for letting Airflow work more like a service.
Making Airflow more scalable will take more work, but the direction looks promising.

Thanks,
Song

On 26/04/2018, 3:05 AM, "James Meickle" <jmeickle@xxxxxxxxxxxxxx> wrote:

    Another reason you would want separated infrastructure is that there are a
    lot of ways to exhaust Airflow resources or otherwise cause contention -
    like having too many sensors or sub-DAGs using up all available tasks.
    
    Doesn't seem like a great idea to push for having different teams with
    co-tenancy until there is also per-team control over resource use...
    
    On Tue, Apr 24, 2018 at 8:27 PM, 刘松(Cycle++开发组) <liusong02@xxxxxxxxxx>
    wrote:
    
    > It seems that all the current approaches point to multiple instances of
    > Airflow, but the project concept is very natural, since one user might need
    > to handle different types of tasks.
    >
    > As for multi-user support, one way is also to deploy multiple instances,
    > but it seems that Airflow already provides a built-in multi-user function.
    >
    > So I am not convinced that running multiple instances is the right way to
    > support multiple projects.
    >
    > Thanks,
    > Song
    >
    >
    >
    >
    > On Wed, Apr 25, 2018 at 4:25 AM +0800, "Ace Haidrey" <acehaidrey@xxxxxxxxx> wrote:
    >
    >
    > Looks neat Taylor!
    >
    > And regarding the original question, going off of what Maxime and Bolke
    > said, at Pandora, it made more sense for us to have an instance per team
    > since each team has its own system user for prod and the instance can run
    > all processes as that user. Alternatively you could have a super user that
    > can sudo as those other system users, and have many teams on a single
    > instance but that is a security concern (what if one team sudo's as the
    > other team and accidentally overwrites data - there is nothing stopping
    > them from doing it). It depends what your org set up is, but let me know if
    > there are any questions I can help with.
    >
    > Ace
    >
    >
    > > On Apr 24, 2018, at 1:16 PM, Taylor Edmiston  wrote:
    > >
    > > We use an approach similar to the one Bolke mentioned, running multiple
    > > Airflow instances.
    > >
    > > I haven't read the Pandora article yet, but we have an Astronomer Open
    > > Edition (fully open source) that bundles similar tools like Prometheus,
    > > Grafana, Celery, etc with Airflow and a Docker Compose file if you're
    > > looking to get a setup like that up and running quickly.
    > >
    > > https://github.com/astronomerio/astronomer/blob/master/examples/airflow-enterprise/docker-compose.yml
    > > https://github.com/astronomerio/astronomer
    > >
    > > *Taylor Edmiston*
    > >
    > >
    > >
    > > On Tue, Apr 24, 2018 at 3:30 PM, Maxime Beauchemin <
    > > maximebeauchemin@xxxxxxxxx> wrote:
    > >
    > >> Related blog post about multi-tenant Airflow deployment out of Pandora:
    > >> https://engineering.pandora.com/apache-airflow-at-pandora-1d7a844d68ee
    > >>
    > >> On Tue, Apr 24, 2018 at 10:20 AM, Bolke de Bruin
    > >> wrote:
    > >>
    > >>> My suggestion would be to deploy airflow per project. You could even use
    > >>> airflow to manage your ci/cd pipeline.
    > >>>
    > >>> B.
    > >>>
    > >>> Sent from my iPhone
    > >>>
    > >>>> On 24 Apr 2018, at 18:33, Maxime Beauchemin <maximebeauchemin@xxxxxxxxx> wrote:
    > >>>>
    > >>>> People have been talking about namespacing DAGs in the past. I'd recommend
    > >>>> using tags (many to many) instead of categories/projects (one to many).
    > >>>>
    > >>>> It should be fairly easy to add this feature. One question is whether tags
    > >>>> are defined as code or in the UI/db only.
    > >>>>
    > >>>> Max
    > >>>>
    > >>>>> On Tue, Apr 24, 2018 at 1:48 AM, Song Liu wrote:
    > >>>>>
    > >>>>> Hi,
    > >>>>>
    > >>>>> Basically, DAGs are created for a particular project, so if I have many
    > >>>>> different projects, will Airflow support a project concept and
    > >>>>> organize them separately?
    > >>>>>
    > >>>>> Is this a known requirement, or is there already a plan for it?
    > >>>>>
    > >>>>> Thanks,
    > >>>>> Song
    > >>>>>
    > >>>
    > >>
    >
    >
    >
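
Max's suggestion in the thread, tags (many to many) rather than categories/projects (one to many), can be sketched in plain Python. This is only an illustration of the data model and not Airflow's API: the TagIndex class, its methods, and the DAG ids and tag names below are all hypothetical.

```python
from collections import defaultdict


class TagIndex:
    """Hypothetical many-to-many index between DAG ids and free-form tags."""

    def __init__(self):
        self._tags_by_dag = defaultdict(set)
        self._dags_by_tag = defaultdict(set)

    def tag(self, dag_id, *tags):
        # One DAG can carry any number of tags, and one tag can cover many DAGs.
        for t in tags:
            self._tags_by_dag[dag_id].add(t)
            self._dags_by_tag[t].add(dag_id)

    def dags_for(self, tag):
        # .get avoids creating an empty entry for an unknown tag.
        return sorted(self._dags_by_tag.get(tag, ()))

    def tags_for(self, dag_id):
        return sorted(self._tags_by_dag.get(dag_id, ()))


index = TagIndex()
index.tag("daily_sales_etl", "project-sales", "team-data", "etl")
index.tag("sales_report", "project-sales", "reporting")
index.tag("churn_model", "project-growth", "team-data")

print(index.dags_for("project-sales"))  # ['daily_sales_etl', 'sales_report']
print(index.dags_for("team-data"))      # ['churn_model', 'daily_sales_etl']
```

With a one-to-many project field, daily_sales_etl could belong to either the sales project or the data team's grouping, but not both; with tags it shows up under each.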