Re: [VOTE] Apache Beam, version 2.5.0, release candidate #1


Thank you JB for your work!

I tested simple streaming (KafkaIO) and batch (TextIO / HDFS) pipelines with SparkRunner on a YARN cluster, and they work fine.

WBR,
Alexey

On 8 Jun 2018, at 10:00, Etienne Chauchot <echauchot@xxxxxxxxxx> wrote:

I forgot to vote:
+1 (non-binding).
What I tested:
- no functional or performance regression compared to v2.4
- dependencies in the POMs are OK

Etienne
On Friday, 8 June 2018 at 08:27 +0200, Romain Manni-Bucau wrote:
+1 (non-binding), mainstream usage is not broken by the POM changes and the runtime has no known regression compared to 2.4.0

(side note: kudos to JB for this build tool change release, I know how it can hurt ;))

Romain Manni-Bucau


On Thu, 7 June 2018 at 16:17, Jean-Baptiste Onofré <jb@xxxxxxxxxxxx> wrote:
Thanks for the details, Etienne!

The good news is that the artifacts seem OK and the overall Nexmark
results are consistent with the 2.4.0 release ones.

I'm starting a complete review using the beam-samples as well.

Regards
JB

On 07/06/2018 16:14, Etienne Chauchot wrote:
> Hi,
> I've just run the Nexmark queries on the v2.5.0-RC1 tag.
> What we can notice:
> - query 3 (exercises CoGroupByKey, state and timer) shows different
> output on DR between batch and streaming, and compared with the other
> runners => I compared with v2.4: these differences were already there,
> but with different output size numbers
>
> - query 6 (exercises a specialized combiner) shows different output
> between the runners => the correct output is 401. Strange that in batch
> mode some runners output fewer sellers. I compared with v2.4: same output
>
> - response time of query 7 (exercises the Max transform, fanout and side
> input) is very slow on DR => I compared with v2.4: comparable execution
> times
>
> I'm not comparing q10 because it is a write to GCS, so it is very specific.
>
> => Basically no regression compared to v2.4
>
> For the record here is the output (waiting for ongoing perfkit integration):
>
>
> 1. DR batch
>
> Performance:
> Conf  Runtime(sec)    (Baseline)  Events(/sec)    (Baseline)       Results    (Baseline)
> 0000           5,8                     17283,1                      100000
> 0001           3,2                     31104,2                       92000
> 0002           1,2                     82918,7                         351
> 0003           2,2                     46210,7                         458
> 0004           1,2                      8503,4                          40
> 0005           4,0                     25220,7                          12
> 0006           0,9                     11148,3                         401
> 0007          13,2                      7580,9                           1
> 0008           1,5                     67340,1                        6000
> 0009           0,7                     14025,2                         298
> 0010          12,8                      7793,0                           1
> 0011           2,4                     42319,1                        1919
> 0012           1,6                     61462,8                        1919
> ==========================================================================================
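
Editorial aside: the rows in this report use a decimal comma (for example `13,2` seconds) and leave the Baseline columns empty. A minimal Python sketch for turning one such row into numbers follows. The helper name `parse_row` is hypothetical and not part of Nexmark or Beam, and it assumes every row has exactly four populated fields, as in the tables of this message.

```python
def parse_row(line):
    """Parse one performance row into (conf, runtime_sec, events_per_sec, results).

    Assumes the empty (Baseline) columns are absent, so a row has four fields.
    """
    # Drop the email quote prefix ("> ") and split on runs of whitespace.
    fields = line.lstrip("> ").split()
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    conf, runtime, rate, results = fields
    # The report prints numbers with a decimal comma; Python needs a dot.
    return (
        conf,
        float(runtime.replace(",", ".")),
        float(rate.replace(",", ".")),
        int(results),
    )


row = parse_row("> 0007          13,2                      7580,9                           1")
```

With rows parsed this way, runtimes from two tables can be compared directly, for instance the query 7 runtime in DR batch against DR streaming.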
>
> 2. DR streaming
>
> Performance:
> Conf  Runtime(sec)    (Baseline)  Events(/sec)    (Baseline)       Results    (Baseline)
> 0000           6,5                     15285,8                      100000
> 0001           3,7                     27397,3                       92000
> 0002           1,4                     69108,5                         351
> 0003           3,2                     31181,8                         447
> 0004           1,2                      8361,2                          40
> 0005           5,3                     18903,6                          12
> 0006           0,9                     11111,1                         401
> 0007          82,5                      1212,2                           1
> 0008           2,0                     51072,5                        6000
> 0009           0,8                     12903,2                         298
> 0010          49,5                      2021,8                           1
> 0011           3,9                     25667,4                        1919
> 0012           2,4                     41067,8                        1919
> ==========================================================================================
>
> 3. Flink batch
> Performance:
> Conf  Runtime(sec)    (Baseline)  Events(/sec)    (Baseline)       Results    (Baseline)
> 0000           1,0                     97656,3                      100000
> 0001           0,7                    141643,1                       92000
> 0002           0,4                    228310,5                         351
> 0003           1,6                     64020,5                         580
> 0004           0,7                     13831,3                          40
> 0005           1,4                     72939,5                          12
> 0006           0,5                     20491,8                         103
> 0007           1,3                     74239,0                           1
> 0008           0,8                    121506,7                        6000
> 0009           0,6                     17953,3                         298
> 0010           1,3                     74682,6                           1
> 0011           1,1                     92936,8                        1919
> 0012           0,8                    123001,2                        1919
> ==========================================================================================
>
> 4. Flink streaming
> Performance:
> Conf  Runtime(sec)    (Baseline)  Events(/sec)    (Baseline)       Results    (Baseline)
> 0000           5,4                     18677,6                      100000
> 0001           2,8                     35511,4                       92000
> 0002           1,8                     54318,3                         351
> 0003           2,4                     41614,6                         580
> 0004           1,0                     10341,3                          40
> 0005           3,4                     29568,3                          12
> 0006           0,7                     13369,0                         401
> 0007           2,8                     36192,5                           1
> 0008           1,8                     54854,6                        6000
> 0009           0,7                     13369,0                         298
> 0010           3,4                     29841,8                           2
> 0011           5,0                     19932,2                        1919
> 0012           2,6                     38835,0                        1919
> ==========================================================================================
>
> 5. Spark batch
> Performance:
> Conf  Runtime(sec)    (Baseline)  Events(/sec)    (Baseline)       Results    (Baseline)
> 0000           1,5                     65445,0                      100000
> 0001           1,3                     79491,3                       92000
> 0002           0,9                    112107,6                         351
> 0003           2,0                     48804,3                         580
> 0004           1,2                      8382,2                          40
> 0005           2,0                     50838,8                          12
> 0006           1,0                      9699,3                         103
> 0007           2,3                     43308,8                           1
> 0008           2,1                     46794,6                        6000
> 0009           1,1                      8976,7                         298
> 0010           1,6                     62111,8                           1
> 0011           2,1                     46598,3                        1919
> 0012           2,3                     43687,2                        1919
> ==========================================================================================
>
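
Editorial aside: the cross-runner comparison described in this report (queries 3 and 6 giving different output sizes depending on the runner) can be checked mechanically. The sketch below is only an illustration: the `Results` counts are copied by hand from the five tables in this message, and the names `RUNNERS`, `RESULTS` and `find_disagreements` are hypothetical, not part of Nexmark or Beam.

```python
# Column order for every list in RESULTS.
RUNNERS = ["DR batch", "DR streaming", "Flink batch", "Flink streaming", "Spark batch"]

# "Results" column per Conf id, copied from the five tables in this message.
RESULTS = {
    "0000": [100000, 100000, 100000, 100000, 100000],
    "0001": [92000, 92000, 92000, 92000, 92000],
    "0002": [351, 351, 351, 351, 351],
    "0003": [458, 447, 580, 580, 580],
    "0004": [40, 40, 40, 40, 40],
    "0005": [12, 12, 12, 12, 12],
    "0006": [401, 401, 103, 401, 103],
    "0007": [1, 1, 1, 1, 1],
    "0008": [6000, 6000, 6000, 6000, 6000],
    "0009": [298, 298, 298, 298, 298],
    "0010": [1, 1, 1, 2, 1],
    "0011": [1919, 1919, 1919, 1919, 1919],
    "0012": [1919, 1919, 1919, 1919, 1919],
}


def find_disagreements(results, runners):
    """Return {conf: {runner: count}} for each query whose counts are not all equal."""
    disagreements = {}
    for conf, counts in results.items():
        if len(set(counts)) > 1:
            disagreements[conf] = dict(zip(runners, counts))
    return disagreements


for conf, per_runner in find_disagreements(RESULTS, RUNNERS).items():
    print(conf, per_runner)
```

On this data the check flags configurations 0003, 0006 and 0010. The first two are the ones discussed in the report; 0010 is the GCS write that the report deliberately leaves out of the comparison.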
> On Wednesday, 6 June 2018 at 10:50 +0200, Etienne Chauchot wrote:
>> Thanks JB for all your work! I believe doing the first Gradle release
>> must have been hard.
>> I'll run Nexmark on the release and keep you posted.
>>
>> Best 
>> Etienne
>>
>>
>> On Wednesday, 6 June 2018 at 10:44 +0200, Jean-Baptiste Onofré wrote:
>>> Hi everyone,
>>>
>>> Please review and vote on the release candidate #1 for the version
>>> 2.5.0, as follows:
>>>
>>> [ ] +1, Approve the release
>>> [ ] -1, Do not approve the release (please provide specific comments)
>>>
>>> NB: this is the first release using Gradle, so don't be too harsh ;) A
>>> PR about the release guide will follow thanks to this release.
>>>
>>> The complete staging area is available for your review, which includes:
>>> * JIRA release notes [1],
>>> * the official Apache source release to be deployed to dist.apache.org
>>> [2], which is signed with the key with fingerprint C8282E76 [3],
>>> * all artifacts to be deployed to the Maven Central Repository [4],
>>> * source code tag "v2.5.0-RC1" [5],
>>> * website pull request listing the release and publishing the API
>>> reference manual [6].
>>> * Java artifacts were built with Gradle 4.7 (wrapper) and OpenJDK/Oracle
>>> JDK 1.8.0_172 (Oracle Corporation 25.172-b11).
>>> * Python artifacts are deployed along with the source release to the
>>> dist.apache.org [2].
>>>
>>> The vote will be open for at least 72 hours. It is adopted by majority
>>> approval, with at least 3 PMC affirmative votes.
>>>
>>> Thanks,
>>> JB
>>>
>>> [1]
>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12342847
>>> [2] https://dist.apache.org/repos/dist/dev/beam/2.5.0/
>>> [3] https://dist.apache.org/repos/dist/release/beam/KEYS
>>> [4] https://repository.apache.org/content/repositories/orgapachebeam-1041/
>>> [5] https://github.com/apache/beam/tree/v2.5.0-RC1
>>> [6] https://github.com/apache/beam-site/pull/463
>>>
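
Editorial aside: the adoption rule stated in the vote call (open for at least 72 hours, majority approval, at least 3 PMC affirmative votes) can be sketched as a small check. This is an illustration only. It assumes that only binding (PMC) votes count toward both the majority and the minimum of three, which the vote call does not spell out, and the names `Vote` and `vote_passes` are hypothetical.

```python
from dataclasses import dataclass

MIN_HOURS_OPEN = 72        # "open for at least 72 hours"
MIN_BINDING_PLUS_ONES = 3  # "at least 3 PMC affirmative votes"


@dataclass
class Vote:
    voter: str
    value: int     # +1 or -1
    binding: bool  # True when cast by a PMC member


def vote_passes(votes, hours_open):
    """Apply the rule from the vote call; only binding votes are counted (assumption)."""
    if hours_open < MIN_HOURS_OPEN:
        return False
    plus = sum(1 for v in votes if v.binding and v.value > 0)
    minus = sum(1 for v in votes if v.binding and v.value < 0)
    return plus >= MIN_BINDING_PLUS_ONES and plus > minus


# Placeholder voters, not the actual tally of this thread.
example = [
    Vote("pmc-a", +1, True),
    Vote("pmc-b", +1, True),
    Vote("pmc-c", +1, True),
    Vote("contributor-x", +1, False),
]
print(vote_passes(example, hours_open=80))
```

Under this reading, the non-binding +1 votes earlier in the thread signal community confidence but would not by themselves satisfy the three-vote minimum.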