Re: 1.10.0beta1 now available for download


Hi Jakob,

This ‘release’ is not effectively an RC. We want to have the Kubernetes
executor stabilised, or at least passing its own tests, before we move
to RC status. People also tend to rally to get some extra bugfixes or
some extra features in when we announce “beta” status. Given that
going from 1.9 to 1.10 is a big leap, I think it is important to have a
period to funnel towards an RC/Release.

Gotcha on httpd. However, it still seems like semantics to me. I would
roughly equate a Spark nightly with an Airflow alpha, and a snapshot
with a beta. I.e. for Airflow, ‘alphas’ and ‘betas’ are not releases, neither
from a process perspective nor from a technical perspective.

Practically, I think we need a way to stabilise the tree so we have
reasonable confidence that we can pass a vote for a ‘real’ release, which is a
technical vote of confidence and a process vote of confidence. Voting
on alphas (equivalent to a nightly) and betas would make this a very
cumbersome process. Particularly as a podling: getting 3 votes at the IPMC
is a tough process (I physically went around a conference to
obtain votes last year). If we then get a “no, you can’t have an alpha because
header XYZ is missing”, it kind of defeats the purpose of having alphas
from the process side (which you are basically saying). However, it still
has technical merit.

What would your suggestion be? I’m really afraid of getting stuck
in process, and the process, to me currently, does not seem to have the merit
we are looking for*. We might have a different understanding
of what we consider to be a ‘release’ though. So I am open to suggestions
(also from the wider community here :) ).

Cheers
Bolke

* Don’t misunderstand me here please: for Releases (e.g. 1.10.0 with no extra
label) I’m quite okay with it.

> On 1 May 2018, at 23:51, Jakob Homan <jghoman@xxxxxxxxx> wrote:
> 
> Hey-
>   Correct, we can publish nightlies and SNAPSHOTs, but those are not
> releases.  Also, if a community votes to consider a release alpha or
> beta, it may do so (From the httpd link, "Based on the community's
> confidence in the code, the potential release is tagged as alpha, beta
> or general availability (GA) and the candidate and is voted in that
> manner."), but this is an indicator of the technical quality of the
> actual release, not the point in the release's lifecycle.
> 
>   My question is - if this release is effectively an RC, why not
> make it officially so? What's the goal of the beta compared to an RC?
> As a mentor, I see an invitation for users to come and test some work
> that could potentially be a release.  That's what we ask for during a
> release process, along with the release manager activity, publishing
> to specified locations, etc.  It would be good to demonstrate we can
> do that well.
> 
> Thanks,
> Jakob
> 
> 
> On 1 May 2018 at 14:31, Bolke de Bruin <bdbruin@xxxxxxxxx> wrote:
>> Hi Jakob,
>> 
>> To be honest I’m confused now. In software land (and I assume you know)
>> Alpha -> Beta -> RC -> Release is well known and so well established that I would
>> be surprised if anyone got confused by that. Even the oldest Apache project
>> has alphas and betas (https://httpd.apache.org/dev/release.html) and something
>> called GA, which is equal to a release I guess.
>> 
>> If you expect people to pick up a git tag, build from there and then report back
>> to us, that doesn’t really happen. We always struggle to get enough test surface, and
>> that would diminish it further.
>> 
>> Other projects also “publish” artefacts that have not been voted upon. E.g. Spark has nightly builds and SNAPSHOTs.
>> A snapshot clearly has a different state than a nightly. Apache Flink states that 1.4.2 is their latest stable release,
>> so there seems to be a “non-stable” release as well. I did see that their git repositories only mention “RC-X” tags
>> or branches.
>> 
>> Reading through https://incubator.apache.org/policy/incubation.html#releases it does not mention anywhere
>> that we need to have RCs. It just states that if you want to do a release you need to call a vote and for distribution
>> it must be at a certain location. As mentioned this is a “beta” which is not a “release”. We haven’t released it either as
>> it wasn’t voted upon and no vote was called. It was just made available for convenience of the community.
>> 
>> So I am not sure what is expected from us here. How do we go through a dev -> test -> acc -> prod release process
>> together with the community? The release process you seem to be referring to is only part of the last stage imho. Or
>> do we need to call a vote on every state change?
>> 
>> Cheers
>> Bolke
>> 
>> 
>>> On 1 May 2018, at 22:47, Jakob Homan <jghoman@xxxxxxxxx> wrote:
>>> 
>>> Hey Bolke-
>>>  To be clear, I'm not suggesting anyone is trying to do anything
>>> wrong.  Release wasn't mentioned, but a new tar ball with a new
>>> version number with a 'beta' tag is published in some way for people
>>> to come and test.  How is that different than the expected release/RC
>>> process (specify a git point, offer a tar ball, add an RCx tag and
>>> invite people to test that)?  Seems like a parallel process with lots
>>> of similarities that could confuse both our end users and the IPMC.
>>> 
>>> Thanks,
>>> Jakob
>>> 
>>> On 1 May 2018 at 13:08, Bolke de Bruin <bdbruin@xxxxxxxxx> wrote:
>>>> Hi Jakob,
>>>> 
>>>> Understood. But isn’t that, in this case, just wording? I.e. this is a tarball that we think is beyond just developer testing (alpha) and more towards the enthusiasts (beta), but not a version of the tarball that is for the general public to test (RC) and not a Release (release)? I.e. is the issue in calling it a ‘release’, which in this case is just meta for a tarball? In the original email I never mentioned the word release in conjunction with the beta, I think.
>>>> 
>>>> Cheers
>>>> Bolke
>>>> 
>>>> 
>>>>> On 1 May 2018, at 22:01, Jakob Homan <jghoman@xxxxxxxxx> wrote:
>>>>> 
>>>>> Hey all-
>>>>> With my Mentor hat on, I need to point out that ASF doesn't really
>>>>> have beta releases.  This work is awesome, but really needs to go
>>>>> through the proper steps.  The Release Candidate process is pretty
>>>>> well described:
>>>>> https://incubator.apache.org/policy/incubation.html#releases.  This is
>>>>> particularly important since, as was mentioned, graduation should be
>>>>> imminent and this process will be heavily scrutinized.
>>>>> 
>>>>> -Jakob
>>>>> 
>>>>> On 1 May 2018 at 12:41, James Meickle <jmeickle@xxxxxxxxxxxxxx> wrote:
>>>>>> Thanks for the pointer! I went through and set this up today, using Google
>>>>>> OAuth as the RBAC provider. Overall I'm quite enthusiastic about this move,
>>>>>> but I thought that it might be helpful to collect feedback as someone who
>>>>>> hasn't been following the overall process and is therefore coming at it
>>>>>> with fresh eyes.
>>>>>> 
>>>>>> - The Flask appbuilder security documentation is poor quality (e.g.,
>>>>>> there are some broken sentences); if Airflow is to send people there, it
>>>>>> might be worth PRing some of the docs to at least look more professional.
>>>>>> 
>>>>>> - There's not much documentation out there on how to properly set up an
>>>>>> OAuth app in Google (in my case, using the G+ API). From an adoption POV,
>>>>>> it would be good to screenshot the (current) steps in the process, and
>>>>>> point out which values should be used in which fields on Google. For
>>>>>> example, I had to grep the code base to find the callback URL.
>>>>>> 
>>>>>> - The initial login UI seems over-complex: you have to click the provider
>>>>>> icon, and then click either login or register. The standard for this
>>>>>> workflow is that you login by clicking the desired provider's icon, and
>>>>>> doing so will register you automatically if you aren't already. In my case
>>>>>> I only have one provider, so this menu was even more confusing.
>>>>>> 
>>>>>> - It was not clear to me that the "Public" role has absolutely no
>>>>>> permissions. When I set this as the default role and registered, I could no
>>>>>> longer access the site until I cleared cookies. I thought it was an OAuth
>>>>>> error at first, but it turns out the Public role has fewer effective
>>>>>> permissions than an anonymous user; this resulted in a redirect loop
>>>>>> because I could not even view the homepage. I had to correct this in the
>>>>>> database to be able to log in.
>>>>>> 
>>>>>> - The roles list (at roles/list/ ) is intimidatingly large and hard to
>>>>>> parse. For instance, I couldn't tell at a glance what "user" allows
>>>>>> relative to "viewer". It would be good to have a narrative description of
>>>>>> what each of these roles is intended for, and to present the list of
>>>>>> permissions in a more clustered or diffable way. Permissions lists tend to
>>>>>> only grow, after all.
>>>>>> 
>>>>>> - A "Viewer" currently lacks enough access to see their own profile.
>>>>>> 
>>>>>> - "User Statistics" (userstatschartview/chart/) uses the internal name,
>>>>>> rather than firstname/lastname - which in my case is a `google_idnumber`
>>>>>> name. Should probably show both names.
>>>>>> 
>>>>>> Unrelatedly to RBAC (I think), on this branch on my sandbox instance, tasks
>>>>>> appear to be failing with the only logs present in the UI as:
>>>>>> 
>>>>>> [{'end_of_log': True}, {'end_of_log': True}, {'end_of_log': True},
>>>>>> {'end_of_log': True}, {'end_of_log': True}, {'end_of_log': True}]
>>>>>> 
>>>>>> 
>>>>>> Finally, in case anyone else wanted to test run a similar setup, here is
>>>>>> the webserver_config.py that I ended up using (note that it has Jinja
>>>>>> templating via Ansible):
>>>>>> 
>>>>>> import os
>>>>>> from airflow import configuration as conf
>>>>>> from flask_appbuilder.security.manager import AUTH_OAUTH
>>>>>> basedir = os.path.abspath(os.path.dirname(__file__))
>>>>>>
>>>>>> # The SQLAlchemy connection string.
>>>>>> SQLALCHEMY_DATABASE_URI = conf.get('core', 'SQL_ALCHEMY_CONN')
>>>>>>
>>>>>> # Flask-WTF flag for CSRF
>>>>>> CSRF_ENABLED = True
>>>>>>
>>>>>> # The name to display, e.g. "Airflow Staging Sandbox"
>>>>>> APP_NAME = "Airflow {{ env }} {{ app_config | capitalize }}"
>>>>>>
>>>>>> # Use OAuth
>>>>>> AUTH_TYPE = AUTH_OAUTH
>>>>>>
>>>>>> # Will allow user self registration
>>>>>> AUTH_USER_REGISTRATION = True
>>>>>>
>>>>>> # The default user self registration role
>>>>>> AUTH_USER_REGISTRATION_ROLE = "{{ airflow_rbac_registration_role | default('Viewer') }}"
>>>>>>
>>>>>> # Google OAuth:
>>>>>> OAUTH_PROVIDERS = [{
>>>>>>     # The name of the provider
>>>>>>     'name': 'google',
>>>>>>     # The icon to use
>>>>>>     'icon': 'fa-google',
>>>>>>     # The name of the key that the provider sends
>>>>>>     'token_key': 'access_token',
>>>>>>     # Just in case, whitelist to only @quantopian.com emails
>>>>>>     'whitelist': ['@quantopian.com'],
>>>>>>     # Define the remote app:
>>>>>>     'remote_app': {
>>>>>>         'base_url': 'https://www.googleapis.com/oauth2/v2/',
>>>>>>         'access_token_url': 'https://accounts.google.com/o/oauth2/token',
>>>>>>         'authorize_url': 'https://accounts.google.com/o/oauth2/auth',
>>>>>>         'request_token_url': None,
>>>>>>         'request_token_params': {
>>>>>>             # Uses the Google+ API, requesting the 'email' and 'profile' scopes
>>>>>>             'scope': 'email profile'
>>>>>>         },
>>>>>>         'consumer_key': '{{ vault_airflow_google_oauth_key }}',
>>>>>>         'consumer_secret': '{{ vault_airflow_google_oauth_secret }}'
>>>>>>     }
>>>>>> }]
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Mon, Apr 30, 2018 at 12:54 PM, Jørn A Hansen <jornhansen@xxxxxxxxx>
>>>>>> wrote:
>>>>>> 
>>>>>>> On Mon, 30 Apr 2018 at 15.56, James Meickle <jmeickle@xxxxxxxxxxxxxx>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> Installed this off of the branch, and I do get the Kubernetes executor
>>>>>>>> (incl. demo DAG) and some bug fixes - but I don't see any RBAC feature
>>>>>>>> anywhere I'd think to look. Do I need to set up some config to get that to
>>>>>>>> show up?
>>>>>>> 
>>>>>>> 
>>>>>>> See
>>>>>>> https://github.com/apache/incubator-airflow/blob/v1-10-test/UPDATING.md#new-webserver-ui-with-role-based-access-control
>>>>>>> 
>>>>>>> It had me left wondering as well - so I decided to go hunt for it in the
>>>>>>> RBAC PR. And there it was :-)
>>>>>>> 
>>>>>>> Cheers,
>>>>>>> JornH
>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Mon, Apr 23, 2018 at 2:06 PM, Bolke de Bruin <bdbruin@xxxxxxxxx> wrote:
>>>>>>>> 
>>>>>>>>> Hi All,
>>>>>>>>> 
>>>>>>>>> I am really happy that Fokko and I have created the v1-10-test branch and
>>>>>>>>> subsequently built the first beta of Apache Airflow 1.10!
>>>>>>>>> 
>>>>>>>>> It is available for testing here:
>>>>>>>>> 
>>>>>>>>> https://dist.apache.org/repos/dist/dev/incubator/airflow/1.10.0beta1/
>>>>>>>>> 
>>>>>>>>> Highlights include:
>>>>>>>>> 
>>>>>>>>> * New RBAC web interface in beta
>>>>>>>>> * Timezone support
>>>>>>>>> * First class kubernetes operator
>>>>>>>>> * Experimental kubernetes executor
>>>>>>>>> * Documentation improvements
>>>>>>>>> * Performance optimizations for large DAGs
>>>>>>>>> * many GCP and S3 integration improvements
>>>>>>>>> * many new operators
>>>>>>>>> * many many many bug fixes
>>>>>>>>> 
>>>>>>>>> We are aiming for a fully compliant Apache release so we should be able to
>>>>>>>>> kick off the graduation process after this release. I hope you help us out
>>>>>>>>> getting there!
>>>>>>>>> 
>>>>>>>>> Kind regards,
>>>>>>>>> 
>>>>>>>>> Bolke & Fokko
>>>>>>>> 
>>>>>>> 
>>>> 
>>