Re: [VOTE] Apache Beam, version 2.5.0, release candidate #1


Hi,
I've just run the Nexmark queries on the v2.5.0-RC1 tag.
What we can notice:
- query 3 (exercises CoGroupByKey, state and timers) produces a different output size on DR in batch (458) than in streaming (447), and both differ from the other runners (580) => compared with v2.4: the same differences were already there, but with different output size numbers

- query 6 (exercises a specialized combiner) shows different output between the runners => the correct output is 401. It is strange that in batch mode some runners (Flink and Spark, with 103) output fewer sellers. Compared with v2.4: same output

- the response time of query 7 (exercises Max transform, fanout and side input) is very slow on DR => compared with v2.4: comparable execution times

I'm not comparing q10 because it is a write to GCS so it is very specific.

=> Basically no regression compared to v2.4

For the record, here is the output (pending the ongoing PerfKit integration):


1. DR batch

Performance:
  Conf  Runtime(sec)    (Baseline)  Events(/sec)    (Baseline)       Results    (Baseline)
  0000           5,8                     17283,1                      100000              
  0001           3,2                     31104,2                       92000              
  0002           1,2                     82918,7                         351              
  0003           2,2                     46210,7                         458              
  0004           1,2                      8503,4                          40              
  0005           4,0                     25220,7                          12              
  0006           0,9                     11148,3                         401              
  0007          13,2                      7580,9                           1              
  0008           1,5                     67340,1                        6000              
  0009           0,7                     14025,2                         298              
  0010          12,8                      7793,0                           1              
  0011           2,4                     42319,1                        1919              
  0012           1,6                     61462,8                        1919              
==========================================================================================

2. DR streaming

Performance:
  Conf  Runtime(sec)    (Baseline)  Events(/sec)    (Baseline)       Results    (Baseline)
  0000           6,5                     15285,8                      100000              
  0001           3,7                     27397,3                       92000              
  0002           1,4                     69108,5                         351              
  0003           3,2                     31181,8                         447              
  0004           1,2                      8361,2                          40              
  0005           5,3                     18903,6                          12              
  0006           0,9                     11111,1                         401              
  0007          82,5                      1212,2                           1              
  0008           2,0                     51072,5                        6000              
  0009           0,8                     12903,2                         298              
  0010          49,5                      2021,8                           1              
  0011           3,9                     25667,4                        1919              
  0012           2,4                     41067,8                        1919              
==========================================================================================

3. Flink batch
Performance:
  Conf  Runtime(sec)    (Baseline)  Events(/sec)    (Baseline)       Results    (Baseline)
  0000           1,0                     97656,3                      100000              
  0001           0,7                    141643,1                       92000              
  0002           0,4                    228310,5                         351              
  0003           1,6                     64020,5                         580              
  0004           0,7                     13831,3                          40              
  0005           1,4                     72939,5                          12              
  0006           0,5                     20491,8                         103              
  0007           1,3                     74239,0                           1              
  0008           0,8                    121506,7                        6000              
  0009           0,6                     17953,3                         298              
  0010           1,3                     74682,6                           1              
  0011           1,1                     92936,8                        1919              
  0012           0,8                    123001,2                        1919              
==========================================================================================

4. Flink streaming
Performance:
  Conf  Runtime(sec)    (Baseline)  Events(/sec)    (Baseline)       Results    (Baseline)
  0000           5,4                     18677,6                      100000              
  0001           2,8                     35511,4                       92000              
  0002           1,8                     54318,3                         351              
  0003           2,4                     41614,6                         580              
  0004           1,0                     10341,3                          40              
  0005           3,4                     29568,3                          12              
  0006           0,7                     13369,0                         401              
  0007           2,8                     36192,5                           1              
  0008           1,8                     54854,6                        6000              
  0009           0,7                     13369,0                         298              
  0010           3,4                     29841,8                           2              
  0011           5,0                     19932,2                        1919              
  0012           2,6                     38835,0                        1919              
==========================================================================================

5. Spark batch
Performance:
  Conf  Runtime(sec)    (Baseline)  Events(/sec)    (Baseline)       Results    (Baseline)
  0000           1,5                     65445,0                      100000              
  0001           1,3                     79491,3                       92000              
  0002           0,9                    112107,6                         351              
  0003           2,0                     48804,3                         580              
  0004           1,2                      8382,2                          40              
  0005           2,0                     50838,8                          12              
  0006           1,0                      9699,3                         103              
  0007           2,3                     43308,8                           1              
  0008           2,1                     46794,6                        6000              
  0009           1,1                      8976,7                         298              
  0010           1,6                     62111,8                           1              
  0011           2,1                     46598,3                        1919              
  0012           2,3                     43687,2                        1919              
==========================================================================================
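
The cross-runner comparison of the Results column above can be reproduced mechanically. Below is a minimal Python sketch, assuming the tables are available as plain text in the layout shown above. The names `parse_results`, `find_discrepancies` and `TABLES` are illustrative only and are not part of Nexmark or Beam. To keep the sample short, only the rows for queries 3, 5, 6 and 10 are copied in.

```python
from collections import defaultdict

# Excerpts of the tables above (queries 3, 5, 6 and 10 only), one per runner/mode.
TABLES = {
    "DR batch": """
  0003           2,2                     46210,7                         458
  0005           4,0                     25220,7                          12
  0006           0,9                     11148,3                         401
  0010          12,8                      7793,0                           1
""",
    "DR streaming": """
  0003           3,2                     31181,8                         447
  0005           5,3                     18903,6                          12
  0006           0,9                     11111,1                         401
  0010          49,5                      2021,8                           1
""",
    "Flink batch": """
  0003           1,6                     64020,5                         580
  0005           1,4                     72939,5                          12
  0006           0,5                     20491,8                         103
  0010           1,3                     74682,6                           1
""",
    "Flink streaming": """
  0003           2,4                     41614,6                         580
  0005           3,4                     29568,3                          12
  0006           0,7                     13369,0                         401
  0010           3,4                     29841,8                           2
""",
    "Spark batch": """
  0003           2,0                     48804,3                         580
  0005           2,0                     50838,8                          12
  0006           1,0                      9699,3                         103
  0010           1,6                     62111,8                           1
""",
}


def parse_results(table_text):
    """Map each query number (the Conf column) to its Results count."""
    results = {}
    for line in table_text.splitlines():
        fields = line.split()
        # Data rows have four populated columns (Conf, Runtime, Events/sec,
        # Results) because the Baseline columns are empty in this run.
        # Header and separator lines do not match and are skipped.
        if len(fields) == 4 and fields[0].isdigit():
            results[int(fields[0])] = int(fields[3])
    return results


def find_discrepancies(tables):
    """Return {query: {runner: count}} for queries whose counts differ."""
    by_query = defaultdict(dict)
    for runner, text in tables.items():
        for query, count in parse_results(text).items():
            by_query[query][runner] = count
    return {
        query: counts
        for query, counts in by_query.items()
        if len(set(counts.values())) > 1
    }


if __name__ == "__main__":
    for query, counts in sorted(find_discrepancies(TABLES).items()):
        print(f"query {query}: {counts}")
```

On this excerpt the sketch flags queries 3, 6 and 10 and leaves query 5 alone. Query 10 shows up because Flink streaming reports 2 results; as noted above it writes to GCS and is excluded from the comparison, so a real check would probably filter it out.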

On Wednesday, 6 June 2018 at 10:50 +0200, Etienne Chauchot wrote:
Thanks JB for all your work! I believe doing the first Gradle release must have been hard.
I'll run Nexmark on the release and keep you posted.

Best 
Etienne


On Wednesday, 6 June 2018 at 10:44 +0200, Jean-Baptiste Onofré wrote:
Hi everyone,

Please review and vote on the release candidate #1 for the version
2.5.0, as follows:

[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

NB: this is the first release using Gradle, so don't be too harsh ;) A
PR about the release guide will follow thanks to this release.

The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release to be deployed to dist.apache.org
[2], which is signed with the key with fingerprint C8282E76 [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag "v2.5.0-RC1" [5],
* website pull request listing the release and publishing the API
reference manual [6].
* Java artifacts were built with Gradle 4.7 (wrapper) and OpenJDK/Oracle
JDK 1.8.0_172 (Oracle Corporation 25.172-b11).
* Python artifacts are deployed along with the source release to the
dist.apache.org [2].

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

Thanks,
JB

[1]
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12342847
[2] https://dist.apache.org/repos/dist/dev/beam/2.5.0/
[3] https://dist.apache.org/repos/dist/release/beam/KEYS
[4] https://repository.apache.org/content/repositories/orgapachebeam-1041/
[5] https://github.com/apache/beam/tree/v2.5.0-RC1
[6] https://github.com/apache/beam-site/pull/463