Re: About the project support in Airflow


Hi Feng,

Thanks for the information; I had indeed noticed this work as well.

But if I understand correctly, it focuses on permissions (edit, read, etc.) on the DAG itself.

The “project” concept is a kind of “group”, but it is more meaningful than a “tag”. So if we don’t want to support a “project” concept, is there any other solution for this requirement, or is there some consideration behind that decision?

Many thanks for your help.

Thanks,
Song

On 26/04/2018, 12:28 PM, "Tao Feng" <fengtao04@xxxxxxxxx> wrote:

    Hi Song,
    
    Just a note that we are also working on DAG-level access on top of
    RBAC (AIRFLOW-2267), which should provide DAG-level ACL functionality. The
    WIP PR can be found at
    https://github.com/apache/incubator-airflow/pull/3197
    
    On Wed, Apr 25, 2018 at 7:42 PM, 刘松(Cycle++开发组) <liusong02@xxxxxxxxxx>
    wrote:
    
    > Hi Taylor,
    >
    > Yes, I know that this RBAC feature will be released in the 1.10
    > release.
    >
    > # About multi-user support
    >
    > But why not deploy one instance of Airflow per user?
    > With this feature, don’t you think Airflow becomes more of a platform
    > that serves many different users?
    > The multi-user case would also exhaust Airflow resources more easily, if
    > we are talking about Airflow's scalability.
    >
    > # About multi-project support
    >
    > You could see the “project” concept as a kind of logical grouping of
    > DAGs that lets them be organized in a more structured way.
    > I can’t see how it would hurt the scalability of Airflow; it just makes
    > the user experience friendlier.
    >
    > So that is why I use the “multi-user support” case to question the
    > suggestion of using multiple instances for “multi-project”:
    > I think “multi-user” support pushes Airflow toward having to “be more
    > scalable”, whereas “multi-project” support just makes it more intuitive
    > and user-friendly.
    >
    > Thanks,
    > Song
    >
    > On 26/04/2018, 4:50 AM, "Taylor Edmiston" <tedmiston@xxxxxxxxx> wrote:
    >
    >     Something else that might be relevant for your multi-user use case is the
    >     new RBAC support that Joy Gao added.
    >
    >     https://github.com/apache/incubator-airflow/pull/3015
    >
    >     *Taylor Edmiston*
    >     Blog <http://blog.tedmiston.com> | Stack Overflow CV
    >     <https://stackoverflow.com/story/taylor> | LinkedIn
    >     <https://www.linkedin.com/in/tedmiston/> | AngelList
    >     <https://angel.co/taylor>
    >
    >
    >     On Wed, Apr 25, 2018 at 3:04 PM, James Meickle <jmeickle@xxxxxxxxxxxxxx> wrote:
    >
    >     > Another reason you would want separated infrastructure is that there are a
    >     > lot of ways to exhaust Airflow resources or otherwise cause contention -
    >     > like having too many sensors or sub-DAGs using up all available tasks.
    >     >
    >     > Doesn't seem like a great idea to push for having different teams with
    >     > co-tenancy until there is also per-team control over resource use...
    >     >
    >     > On Tue, Apr 24, 2018 at 8:27 PM, 刘松(Cycle++开发组) <liusong02@xxxxxxxxxx> wrote:
    >     >
    >     > > It seems that all the current approaches point to multiple instances of
    >     > > Airflow, but a project concept is very natural, since one user might
    >     > > handle different types of tasks.
    >     > >
    >     > > Another thing, about multi-user support: one way is also to deploy
    >     > > multiple instances, but it seems that Airflow provides multi-user
    >     > > functionality built in.
    >     > >
    >     > > So I am not convinced about using multiple instances for the
    >     > > multi-project purpose.
    >     > >
    >     > > Thanks,
    >     > > Song
    >     > >
    >     > >
    >     > >
    >     > >
    >     > > On Wed, Apr 25, 2018 at 4:25 AM +0800, "Ace Haidrey" <acehaidrey@xxxxxxxxx> wrote:
    >     > >
    >     > >
    >     > > Looks neat Taylor!
    >     > >
    >     > > And regarding the original question, going off of what Maxime and Bolke
    >     > > said, at Pandora, it made more sense for us to have an instance per team
    >     > > since each team has its own system user for prod and the instance can run
    >     > > all processes as that user. Alternatively you could have a super user that
    >     > > can sudo as those other system users, and have many teams on a single
    >     > > instance but that is a security concern (what if one team sudo's as the
    >     > > other team and accidentally overwrites data - there is nothing stopping
    >     > > them from doing it). It depends what your org set up is, but let me know if
    >     > > there are any questions I can help with.
    >     > >
    >     > > Ace
    >     > >
    >     > >
    >     > > > On Apr 24, 2018, at 1:16 PM, Taylor Edmiston  wrote:
    >     > > >
    >     > > > We use a similar approach like Bolke mentioned with running multiple
    >     > > > Airflow instances.
    >     > > >
    >     > > > I haven't read the Pandora article yet, but we have an Astronomer Open
    >     > > > Edition (fully open source) that bundles similar tools like Prometheus,
    >     > > > Grafana, Celery, etc with Airflow and a Docker Compose file if you're
    >     > > > looking to get a setup like that up and running quickly.
    >     > > >
    >     > > > https://github.com/astronomerio/astronomer/blob/master/examples/airflow-enterprise/docker-compose.yml
    >     > > > https://github.com/astronomerio/astronomer
    >     > > >
    >     > > > *Taylor Edmiston*
    >     > > > Blog | Stack Overflow CV | LinkedIn | AngelList
    >     > > >
    >     > > >
    >     > > >
    >     > > > On Tue, Apr 24, 2018 at 3:30 PM, Maxime Beauchemin <
    >     > > > maximebeauchemin@xxxxxxxxx> wrote:
    >     > > >
    >     > > >> Related blog post about multi-tenant Airflow deployment out of Pandora:
    >     > > >> https://engineering.pandora.com/apache-airflow-at-pandora-1d7a844d68ee
    >     > > >>
    >     > > >> On Tue, Apr 24, 2018 at 10:20 AM, Bolke de Bruin wrote:
    >     > > >>
    >     > > >>> My suggestion would be to deploy airflow per project. You could even use
    >     > > >>> airflow to manage your ci/cd pipeline.
    >     > > >>>
    >     > > >>> B.
    >     > > >>>
    >     > > >>> Sent from my iPhone
    >     > > >>>
    >     > > >>>> On 24 Apr 2018, at 18:33, Maxime Beauchemin <maximebeauchemin@xxxxxxxxx> wrote:
    >     > > >>>>
    >     > > >>>> People have been talking about namespacing DAGs in the past. I'd recommend
    >     > > >>>> using tags (many to many) instead of categories/projects (one to many).
    >     > > >>>>
    >     > > >>>> It should be fairly easy to add this feature. One question is whether tags
    >     > > >>>> are defined as code or in the UI/db only.
    >     > > >>>>
    >     > > >>>> Max
    >     > > >>>>
    >     > > >>>>> On Tue, Apr 24, 2018 at 1:48 AM, Song Liu wrote:
    >     > > >>>>>
    >     > > >>>>> Hi,
    >     > > >>>>>
    >     > > >>>>> Basically, DAGs are created for the purposes of a project, so if I have
    >     > > >>>>> many different projects, will Airflow support a project concept and
    >     > > >>>>> organize them separately?
    >     > > >>>>>
    >     > > >>>>> Is this a known requirement, or is there already a plan for this?
    >     > > >>>>>
    >     > > >>>>> Thanks,
    >     > > >>>>> Song
    >     > > >>>>>
    >     > > >>>
    >     > > >>
    >     > >
    >     > >
    >     > >
    >     >
    >
    >
    >