osdir.com

Re: About the project support in Airflow


Hi Taylor,

Yes, I know that the RBAC feature will be released as part of the 1.10 release.

# About multi-user support

But then why not deploy one instance of Airflow per user?
With this feature, don’t you think Airflow becomes more of a platform that serves many different users?
Also, the multi-user case would exhaust Airflow’s resources more easily, if we are talking about the scalability of Airflow.

# About multi-project support

You can think of the “project” concept as a logical grouping of DAGs that lets them be organized in a more structured way.
I can’t see how it would hurt the “scalability” of Airflow; as I see it, it just makes the user experience friendlier.
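
To make the grouping idea concrete, here is a minimal sketch in plain Python. It only illustrates a one-to-many mapping from project to DAGs; the DAG ids, project names, and the `group_by_project` helper are all hypothetical and are not part of any Airflow API.

```python
from collections import defaultdict

# Hypothetical DAG ids and project labels, for illustration only.
# Nothing in this sketch is an Airflow API.
DAG_PROJECTS = {
    "ingest_clicks": "analytics",
    "daily_report": "analytics",
    "train_model": "ml",
}


def group_by_project(dag_projects):
    """Return {project: sorted list of DAG ids} from a {dag_id: project} mapping."""
    groups = defaultdict(list)
    for dag_id, project in dag_projects.items():
        groups[project].append(dag_id)
    return {project: sorted(dag_ids) for project, dag_ids in groups.items()}


if __name__ == "__main__":
    # Print each project with the DAGs that belong to it.
    for project, dag_ids in sorted(group_by_project(DAG_PROJECTS).items()):
        print(project, dag_ids)
```

The grouping is pure metadata about the DAGs: it changes how they are listed, not how or where they run.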

So that is why I use the “multi-user support” case to question the suggestion of using multiple instances for “multi-project”:
I think “multi-user” support pushes Airflow in the direction of having to “be more scalable”, whereas “multi-project” just makes it more intuitive and more user-friendly.

Thanks,
Song

On 26/04/2018, 4:50 AM, "Taylor Edmiston" <tedmiston@xxxxxxxxx> wrote:

    Something else that might be relevant for your multi-user use case is the
    new RBAC support that Joy Gao added.
    
    https://github.com/apache/incubator-airflow/pull/3015
    
    *Taylor Edmiston*
    Blog <http://blog.tedmiston.com> | Stack Overflow CV
    <https://stackoverflow.com/story/taylor> | LinkedIn
    <https://www.linkedin.com/in/tedmiston/> | AngelList
    <https://angel.co/taylor>
    
    
    On Wed, Apr 25, 2018 at 3:04 PM, James Meickle <jmeickle@xxxxxxxxxxxxxx>
    wrote:
    
    > Another reason you would want separated infrastructure is that there are a
    > lot of ways to exhaust Airflow resources or otherwise cause contention -
    > like having too many sensors or sub-DAGs using up all available tasks.
    >
    > Doesn't seem like a great idea to push for having different teams with
    > co-tenancy until there is also per-team control over resource use...
    >
    > On Tue, Apr 24, 2018 at 8:27 PM, 刘松(Cycle++开发组) <liusong02@xxxxxxxxxx>
    > wrote:
    >
    > > It seems that all the current approaches point to multiple instances of
    > > Airflow, but the project concept is very natural, since one user might
    > > handle different types of tasks.
    > >
    > > Another thing, about multi-user support: one way is also to deploy
    > > multiple instances, but it seems that Airflow provides multi-user
    > > functionality built in.
    > >
    > > So I am not convinced by the argument for using multiple instances for
    > > the multi-project purpose.
    > >
    > > Thanks,
    > > Song
    > >
    > >
    > >
    > >
    > > On Wed, Apr 25, 2018 at 4:25 AM +0800, "Ace Haidrey" <acehaidrey@xxxxxxxxx
    > > <mailto:acehaidrey@xxxxxxxxx>> wrote:
    > >
    > >
    > > Looks neat Taylor!
    > >
    > > And regarding the original question, going off of what Maxime and Bolke
    > > said, at Pandora, it made more sense for us to have an instance per team
    > > since each team has its own system user for prod and the instance can run
    > > all processes as that user. Alternatively you could have a super user that
    > > can sudo as those other system users, and have many teams on a single
    > > instance but that is a security concern (what if one team sudo's as the
    > > other team and accidentally overwrites data - there is nothing stopping
    > > them from doing it). It depends what your org set up is, but let me know if
    > > there are any questions I can help with.
    > >
    > > Ace
    > >
    > >
    > > > On Apr 24, 2018, at 1:16 PM, Taylor Edmiston  wrote:
    > > >
    > > > We use a similar approach like Bolke mentioned with running multiple
    > > > Airflow instances.
    > > >
    > > > I haven't read the Pandora article yet, but we have an Astronomer Open
    > > > Edition (fully open source) that bundles similar tools like Prometheus,
    > > > Grafana, Celery, etc with Airflow and a Docker Compose file if you're
    > > > looking to get a setup like that up and running quickly.
    > > >
    > > > https://github.com/astronomerio/astronomer/blob/master/examples/airflow-enterprise/docker-compose.yml
    > > > https://github.com/astronomerio/astronomer
    > > >
    > > > *Taylor Edmiston*
    > > >
    > > >
    > > >
    > > > On Tue, Apr 24, 2018 at 3:30 PM, Maxime Beauchemin <
    > > > maximebeauchemin@xxxxxxxxx> wrote:
    > > >
    > > >> Related blog post about multi-tenant Airflow deployment out of Pandora:
    > > >> https://engineering.pandora.com/apache-airflow-at-pandora-1d7a844d68ee
    > > >>
    > > >> On Tue, Apr 24, 2018 at 10:20 AM, Bolke de Bruin wrote:
    > > >>
    > > >>> My suggestion would be to deploy airflow per project. You could even use
    > > >>> airflow to manage your ci/cd pipeline.
    > > >>>
    > > >>> B.
    > > >>>
    > > >>> Sent from my iPhone
    > > >>>
    > > >>>> On 24 Apr 2018, at 18:33, Maxime Beauchemin <maximebeauchemin@xxxxxxxxx> wrote:
    > > >>>>
    > > >>>> People have been talking about namespacing DAGs in the past. I'd recommend
    > > >>>> using tags (many to many) instead of categories/projects (one to many).
    > > >>>>
    > > >>>> It should be fairly easy to add this feature. One question is whether tags
    > > >>>> are defined as code or in the UI/db only.
    > > >>>>
    > > >>>> Max
    > > >>>>
    > > >>>>> On Tue, Apr 24, 2018 at 1:48 AM, Song Liu wrote:
    > > >>>>>
    > > >>>>> Hi,
    > > >>>>>
    > > >>>>> Basically the DAGs are created for a project purpose, so if I have many
    > > >>>>> different projects, will the Airflow support the Project concept and
    > > >>>>> organize them separately ?
    > > >>>>>
    > > >>>>> Is this a known requirement or any plan for this already ?
    > > >>>>>
    > > >>>>> Thanks,
    > > >>>>> Song
    > > >>>>>
    > > >>>
    > > >>
    > >
    > >
    > >
    >