
Re: [VOTE] Release 2.8.0, release candidate #1




On Mon, Oct 29, 2018 at 12:40 PM, Ismaël Mejía <iemejia@xxxxxxxxx> wrote:
From the Apache point of view, nothing prevents anyone from doing
intermediate releases for non-LTS releases; the only things needed are
someone willing to do the release and the due vote process.

Agreed. I was not suggesting that we skip a release. I wanted to understand the cost/benefit.
 

However, I don’t know how we will decide this. We are exactly in the
middle of the release cycle and in 3 weeks we will be cutting the next
version, so I am not sure it is worth it. Any thoughts?

My suggestion is to look at this from a user perspective. Are we affecting a significant chunk of users? And could those users stay on 2.7 until we release 2.9? From there we can decide whether this warrants a patch release or not. I do not have information on how large a user base we are affecting. I assume the answer to the second question is yes, and we can suggest that they stay on 2.7 until a new release is out. From that perspective, I would suggest skipping a patch release and waiting for the next regular release.
 

On Mon, Oct 29, 2018 at 6:08 PM Ahmet Altay <altay@xxxxxxxxxx> wrote:
>
>
>
> On Mon, Oct 29, 2018 at 8:55 AM, Kenneth Knowles <kenn@xxxxxxxxxx> wrote:
>>
>> I think we should definitely open a cherry-pick PR to a 2.8.x branch. I think we must not corrupt Maven Central, so if it is published to users this has to be 2.8.1. Ahmet - we are at this point, right?
>
>
> Yes, if someone is willing to make a new release this would be a 2.8.1 release. (2.8.0 is already on Maven Central.)
>
> Side question about the initial LTS discussion. We have decided not to make 2.8.0 an LTS release. Should we wait until the next release to patch this issue? What is the cost/benefit of maintaining this branch?
>
>>
>>
>> Kenn
>>
>> On Mon, Oct 29, 2018 at 8:40 AM Ismaël Mejía <iemejia@xxxxxxxxx> wrote:
>>>
>>> First, thanks Etienne and Kenn for noting the performance issue. I
>>> reviewed the discussed PR. It introduced a new ‘@Experimental’ option
>>> to the Spark runner to change the default source partitioning and
>>> enable users to control it via a predefined size (a prerequisite for
>>> Spark’s dynamicAllocation).
>>>
>>> However, this must not be the default behavior. After looking at the
>>> PR, it seems that things are not as expected and the default is now the
>>> new behavior. I will provide a PR to fix this quickly. However, the
>>> question is: should I cherry-pick it and we do a new RC (since the
>>> release has already 'passed')?
>>> On Mon, Oct 29, 2018 at 2:51 PM Kenneth Knowles <kenn@xxxxxxxxxx> wrote:
>>> >
>>> > I didn't isolate it to a cause and commit, so that is extremely useful to know. To bring some details onto the thread:
>>> >
>>> > query 4: a single aggregation in sliding windows
>>> > query 8: a single join with no other interesting logic
>>> > query 9 (prefix of query 6*): find the winning bid for each auction
>>> > query 6: query 9 followed by a single aggregation
>>> >
>>> > Kenn
>>> >
>>> > * they seem out of order because the original queries were 1-8 and we added 9 later to benchmark the baseline without the aggregation
>>> >
>>> > On Mon, Oct 29, 2018 at 3:28 AM Etienne Chauchot <echauchot@xxxxxxxxxx> wrote:
>>> >>
>>> >> Oops, I just saw that Kenn already mentioned the perf degradation on the Spark runner around 10/05. Sorry for the repetition.
>>> >> Nevertheless, IMHO it is still worth checking PR #6181.
>>> >>
>>> >> Etienne
>>> >>
>>> >> On Monday, October 29, 2018 at 10:42 +0100, Etienne Chauchot wrote:
>>> >>
>>> >> Hey,
>>> >> I would vote -0; here is the explanation:
>>> >>
>>> >> I took a look at Nexmark dashboards for output size and performance for all the runners in all the modes around the date of the release cut to search for regressions.
>>> >>
>>> >> I noted a regression in the performance of the Spark runner. The running times of Query4, Query6, Query8 and Query9 were multiplied by 2 to 3 around 10/05/18. See https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
>>> >> So I searched the commit history of the Spark runner module for what happened around 10/05/18, and I found this commit:
>>> >>
>>> >> e4a1ccbaa10808d88c6ad2a687fe9f6d52392d90: Merge pull request #6181: [BEAM-4783] Add bundleSize for splitting BoundedSources
>>> >>
>>> >> I don't know if it should be considered a blocker, but we should definitely take another look at pull request #6181, which seems to change the way we split on the Spark runner.
>>> >>
>>> >> Best
>>> >> Etienne
>>> >>
>>> >>
>>> >> On Friday, October 26, 2018 at 18:20 +0200, Maximilian Michels wrote:
>>> >>
>>> >> +1 (binding)
>>> >>
>>> >>
>>> >> On 26.10.18 17:45, Kenneth Knowles wrote:
>>> >>
>>> >> Nice. Thanks.
>>> >>
>>> >>
>>> >> +1
>>> >>
>>> >>
>>> >>
>>> >> On Fri, Oct 26, 2018 at 8:44 AM Robert Bradshaw <robertwb@xxxxxxxxxx
>>> >>
>>> >> <mailto:robertwb@xxxxxxxxxx>> wrote:
>>> >>
>>> >>
>>> >>     Thanks Tim!
>>> >>
>>> >>
>>> >>     This was my only hesitation, and sounds like we're in the clear here.
>>> >>
>>> >>
>>> >>     +1 (binding)
>>> >>
>>> >>     On Fri, Oct 26, 2018 at 5:05 PM Tim Robertson
>>> >>
>>> >>     <timrobertson100@xxxxxxxxx <mailto:timrobertson100@gmail.com>> wrote:
>>> >>
>>> >>      >
>>> >>
>>> >>      > A colleague and I tested on 2.7.0 and 2.8.0RC1:
>>> >>
>>> >>      >
>>> >>
>>> >>      > 1. Quickstart on Spark/YARN/HDFS (CDH 5.12.0) (commented in
>>> >>
>>> >>     spreadsheet)
>>> >>
>>> >>      > 2. Our Avro to Avro pipelines on Spark/YARN/HDFS (note we
>>> >>
>>> >>     backport the un-merged BEAM-5036 fix in our code)
>>> >>
>>> >>      > 3. Our Avro to Elasticsearch pipelines on Spark/YARN/HDFS
>>> >>
>>> >>      >
>>> >>
>>> >>      > Everything worked, and performance was similar on both.
>>> >>
>>> >>      > We built using maven pointing at
>>> >>
>>> >>     https://repository.apache.org/content/repositories/orgapachebeam-1049/
>>> >>
>>> >>      >
>>> >>
>>> >>      > Based on this limited testing: +1
>>> >>
>>> >>      >
>>> >>
>>> >>      > Thank you to the release managers,
>>> >>
>>> >>      > Tim
>>> >>
>>> >>      >
>>> >>
>>> >>      >
>>> >>
>>> >>      > On Thu, Oct 25, 2018 at 7:21 PM Tim <timrobertson100@xxxxxxxxx
>>> >>
>>> >>     <mailto:timrobertson100@gmail.com>> wrote:
>>> >>
>>> >>      >>
>>> >>
>>> >>      >> I can do some tests on Spark / YARN tomorrow (CEST timezone).
>>> >>
>>> >>     Sorry I’ve just been too busy to assist.
>>> >>
>>> >>      >>
>>> >>
>>> >>      >> Tim
>>> >>
>>> >>      >>
>>> >>
>>> >>      >> On 25 Oct 2018, at 18:59, Kenneth Knowles <kenn@xxxxxxxxxx
>>> >>
>>> >>     <mailto:kenn@xxxxxxxxxx>> wrote:
>>> >>
>>> >>      >>
>>> >>
>>> >>      >> I tried to do a more thorough job on this.
>>> >>
>>> >>      >>
>>> >>
>>> >>      >>  - I could not reproduce the slowdown in Query 9. I believe the
>>> >>
>>> >>     variance was simply high given the parameters and environment
>>> >>
>>> >>      >>  - I saw the same slowdown in Query 8 when running as part of
>>> >>
>>> >>     the suite, but it vanished when I ran repeatedly on its own, so
>>> >>
>>> >>     again it is probably not good methodology
>>> >>
>>> >>      >>
>>> >>
>>> >>      >> We do have the dashboard at
>>> >>
>>> >>     https://apache-beam-testing.appspot.com/dashboard-admin though no
>>> >>
>>> >>     anomaly detection set up AFAIK.
>>> >>
>>> >>      >>
>>> >>
>>> >>      >>  - There is no issue easily visible in DirectRunner:
>>> >>
>>> >>     https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
>>> >>
>>> >>      >>  - There is a notable degradation in Spark runner on 10/5 for
>>> >>
>>> >>     many queries.
>>> >>
>>> >>     https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
>>> >>
>>> >>      >>  - Something minor happened for Dataflow around 10/1:
>>> >>
>>> >>     https://apache-beam-testing.appspot.com/explore?dashboard=5670405876482048
>>> >>
>>> >>      >>  - Flink runner seems to have had some fantastic improvements
>>> >>
>>> >>     :-)
>>> >>
>>> >>     https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384
>>> >>
>>> >>      >>
>>> >>
>>> >>      >> So if there is a blocker it would really be the Spark runner
>>> >>
>>> >>     perf changes. Of course, all these except Dataflow are using local
>>> >>
>>> >>     instances so may not be representative of larger scale AFAIK.
>>> >>
>>> >>      >>
>>> >>
>>> >>      >> Kenn
>>> >>
>>> >>      >>
>>> >>
>>> >>      >> On Wed, Oct 24, 2018 at 9:48 AM Maximilian Michels
>>> >>
>>> >>     <mxm@xxxxxxxxxx <mailto:mxm@xxxxxxxxxx>> wrote:
>>> >>
>>> >>      >>>
>>> >>
>>> >>      >>> I've run WordCount using Quickstart with the FlinkRunner
>>> >>
>>> >>     (locally and
>>> >>
>>> >>      >>> against a Flink cluster).
>>> >>
>>> >>      >>>
>>> >>
>>> >>      >>> Would give a +1 but waiting to see what Kenn finds.
>>> >>
>>> >>      >>>
>>> >>
>>> >>      >>> -Max
>>> >>
>>> >>      >>>
>>> >>
>>> >>      >>> On 23.10.18 07:11, Ahmet Altay wrote:
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> > On Mon, Oct 22, 2018 at 10:06 PM, Kenneth Knowles
>>> >>
>>> >>     <kenn@xxxxxxxxxx <mailto:kenn@xxxxxxxxxx>
>>> >>
>>> >>      >>> > <mailto:kenn@xxxxxxxxxx <mailto:kenn@xxxxxxxxxx>>> wrote:
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >     You two did so much verification I had a hard time
>>> >>
>>> >>     finding something
>>> >>
>>> >>      >>> >     where my help was meaningful! :-)
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >     I did run the Nexmark suite on the DirectRunner against
>>> >>
>>> >>     2.7.0 and
>>> >>
>>> >>      >>> >     2.8.0 following
>>> >>
>>> >>      >>> >
>>> >>
>>> >>     https://beam.apache.org/documentation/sdks/java/nexmark/#running-smoke-suite-on-the-directrunner-local
>>> >>
>>> >>      >>> >
>>> >>
>>> >>       <https://beam.apache.org/documentation/sdks/java/nexmark/#running-smoke-suite-on-the-directrunner-local>.
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >     It is admittedly a very silly test - the instructions leave
>>> >>
>>> >>      >>> >     immutability enforcement on, etc. But it does appear that
>>> >>
>>> >>     there is a
>>> >>
>>> >>      >>> >     30% degradation in query 8 and 15% in query 9. These are
>>> >>
>>> >>     the pure
>>> >>
>>> >>      >>> >     Java tests, not the SQL variants. The rest of the queries
>>> >>
>>> >>     are close
>>> >>
>>> >>      >>> >     enough that differences are not meaningful.
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> > (It would be a good improvement for us to have alerts on daily
>>> >>
>>> >>      >>> > benchmarks if we do not have such a concept already.)
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >     I would ask for a little more time to see what is going on
>>> >>
>>> >>     here - is it
>>> >>
>>> >>      >>> >     a real performance issue or an artifact of how the tests are
>>> >>
>>> >>      >>> >     invoked, or ...?
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> > Thank you! Much appreciated. Please let us know when you are
>>> >>
>>> >>     done with
>>> >>
>>> >>      >>> > your investigation.
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >     Kenn
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >     On Mon, Oct 22, 2018 at 6:20 PM Ahmet Altay
>>> >>
>>> >>     <altay@xxxxxxxxxx <mailto:altay@xxxxxxxxxx>
>>> >>
>>> >>      >>> >     <mailto:altay@xxxxxxxxxx <mailto:altay@xxxxxxxxxx>>> wrote:
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >         Hi all,
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >         Did you have a chance to review this RC? Between me
>>> >>
>>> >>     and Robert
>>> >>
>>> >>      >>> >         we ran a significant chunk of the validations. Let me
>>> >>
>>> >>     know if
>>> >>
>>> >>      >>> >         you have any questions.
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >         Ahmet
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >         On Thu, Oct 18, 2018 at 5:26 PM, Ahmet Altay
>>> >>
>>> >>     <altay@xxxxxxxxxx <mailto:altay@xxxxxxxxxx>
>>> >>
>>> >>      >>> >         <mailto:altay@xxxxxxxxxx <mailto:altay@xxxxxxxxxx>>>
>>> >>
>>> >>     wrote:
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >             Hi everyone,
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >             Please review and vote on the release candidate
>>> >>
>>> >>     #1 for the
>>> >>
>>> >>      >>> >             version 2.8.0, as follows:
>>> >>
>>> >>      >>> >             [ ] +1, Approve the release
>>> >>
>>> >>      >>> >             [ ] -1, Do not approve the release (please
>>> >>
>>> >>     provide specific
>>> >>
>>> >>      >>> >             comments)
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >             The complete staging area is available for your
>>> >>
>>> >>     review,
>>> >>
>>> >>      >>> >             which includes:
>>> >>
>>> >>      >>> >             * JIRA release notes [1],
>>> >>
>>> >>      >>> >             * the official Apache source release to be
>>> >>
>>> >>     deployed to
>>> >>
>>> >>      >>> >             dist.apache.org [2], which is
>>> >>
>>> >>      >>> >             signed with the key with fingerprint 6096FA00 [3],
>>> >>
>>> >>      >>> >             * all artifacts to be deployed to the Maven Central
>>> >>
>>> >>      >>> >             Repository [4],
>>> >>
>>> >>      >>> >             * source code tag "v2.8.0-RC1" [5],
>>> >>
>>> >>      >>> >             * website pull request listing the release and
>>> >>
>>> >>     publishing
>>> >>
>>> >>      >>> >             the API reference manual [6].
>>> >>
>>> >>      >>> >             * Python artifacts are deployed along with the source
>>> >>
>>> >>      >>> >             release to the dist.apache.org [2].
>>> >>
>>> >>      >>> >             * Validation sheet with a tab for 2.8.0 release
>>> >>
>>> >>     to help with
>>> >>
>>> >>      >>> >             validation [7].
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >             The vote will be open for at least 72 hours. It
>>> >>
>>> >>     is adopted
>>> >>
>>> >>      >>> >             by majority approval, with at least 3 PMC
>>> >>
>>> >>     affirmative votes.
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >             Thanks,
>>> >>
>>> >>      >>> >             Ahmet
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >             [1]
>>> >>
>>> >>      >>> >
>>> >>
>>> >>     https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12343985
>>> >>
>>> >>      >>> >
>>> >>
>>> >>       <https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12343985>
>>> >>
>>> >>      >>> >             [2] https://dist.apache.org/repos/dist/dev/beam/2.8.0
>>> >>
>>> >>      >>> >             <https://dist.apache.org/repos/dist/dev/beam/2.8.0>
>>> >>
>>> >>      >>> >             [3] https://dist.apache.org/repos/dist/dev/beam/KEYS
>>> >>
>>> >>      >>> >             <https://dist.apache.org/repos/dist/dev/beam/KEYS>
>>> >>
>>> >>      >>> >             [4]
>>> >>
>>> >>      >>> >
>>> >>
>>> >>     https://repository.apache.org/content/repositories/orgapachebeam-1049/
>>> >>
>>> >>      >>> >
>>> >>
>>> >>       <https://repository.apache.org/content/repositories/orgapachebeam-1049/>
>>> >>
>>> >>      >>> >             [5] https://github.com/apache/beam/tree/v2.8.0-RC1
>>> >>
>>> >>      >>> >             <https://github.com/apache/beam/tree/v2.8.0-RC1>
>>> >>
>>> >>      >>> >             [6] https://github.com/apache/beam-site/pull/583
>>> >>
>>> >>      >>> >             <https://github.com/apache/beam-site/pull/583> and
>>> >>
>>> >>      >>> > https://github.com/apache/beam/pull/6745
>>> >>
>>> >>      >>> >             <https://github.com/apache/beam/pull/6745>
>>> >>
>>> >>      >>> >             [7]
>>> >>
>>> >>      >>> >
>>> >>
>>> >>     https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1854712816
>>> >>
>>> >>      >>> >
>>> >>
>>> >>       <https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1854712816>
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >
>>> >>
>>> >>
>
>