
Re: [ANNOUNCEMENT] Nexmark included to the CI

Hi Kai,

Good news about TPC-H: it will complement Nexmark.
Regarding Dataflow: we can run Nexmark on Dataflow. Note that Dataflow was the original target of Mark's Nexmark port. We have not done so because we have no Dataflow environment available.

Is someone from Google willing to run Nexmark on Dataflow and add the postCommit script and the perfkit dashboards?



On Wednesday, July 11, 2018 at 20:11 -0700, Kai Jiang wrote:
Hi Etienne,

It's awesome that you are working on these useful dashboards. I am getting the TPC-H benchmark running on the Flink and Dataflow runners. I could work on similar dashboards for the TPC benchmark once the code is merged.
Also, it would be great to have dashboards for Dataflow.


On Wed, Jul 11, 2018 at 6:35 AM Etienne Chauchot <echauchot@xxxxxxxxxx> wrote:
First catch of the Nexmark CI:
It seems that there was a change in the direct runner.

Query3 (exercises state and timers)
- the output size should be constant but has increased today => was there a change in state- and timer-related code?
- the output size of this query differs between batch and streaming modes on the direct runner.
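
To illustrate the kind of check behind this observation, here is a minimal sketch that flags any run whose output size differs from the previous run. The function name and the sample values are hypothetical, chosen only for illustration; they are not real Nexmark measurements and this is not how the dashboards themselves are built.

```python
def find_output_size_changes(sizes):
    """Return (run_index, previous_size, current_size) for every run
    whose output size differs from the run just before it."""
    changes = []
    for i in range(1, len(sizes)):
        if sizes[i] != sizes[i - 1]:
            changes.append((i, sizes[i - 1], sizes[i]))
    return changes


# Illustrative values only (assumption): one output size per CI run.
example_sizes = [100, 100, 100, 120, 120]
print(find_output_size_changes(example_sizes))
```

An empty result means the output size stayed constant across the runs, which is the expected behaviour for a query such as Query3.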


On Wednesday, July 11, 2018 at 15:25 +0200, Etienne Chauchot wrote:
Is anyone interested in creating the scripts and dashboards for the other runners? They can be created by copying the existing scripts and dashboards, then changing one Gradle parameter in the scripts and the table name in the dashboards.

I have created the tickets:

On Wednesday, July 11, 2018 at 15:13 +0200, Etienne Chauchot wrote:

Hi guys,

I'm glad to announce that Beam's CI has much improved! Nexmark is now included in the perfkit dashboards.

At each commit on master, the Nexmark suites are run and the results are plotted on the graphs.

I've created two kinds of dashboards:
- one for performance (run times of the queries)
- one for the size of the output PCollection (which should be constant)

There are dashboards for these runners:
- spark
- flink
- direct runner

Each dashboard contains:
- graphs in batch mode 
- graphs in streaming mode
- graphs for the 13 queries.

That gives more than a hundred graphs (my right finger hurts after so many mouse clicks :) ). They are this detailed so that anyone can focus on the area they are interested in.
Feel free to also create new dashboards with more aggregated data.
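
As a rough check of the "more than a hundred graphs" figure, the sketch below enumerates the combinations listed above. It assumes one graph per combination of dashboard kind, runner, mode and query, which is an interpretation of the lists above rather than something stated explicitly; the query numbering from 0 is also an assumption.

```python
from itertools import product

# Dimensions taken from the lists above.
dashboard_kinds = ["performance", "output size"]
runners = ["spark", "flink", "direct"]
modes = ["batch", "streaming"]
queries = range(13)  # 13 queries; starting at 0 is an assumption

# Assumption: one graph per (kind, runner, mode, query) combination.
graphs = list(product(dashboard_kinds, runners, modes, queries))
graph_count = len(graphs)
print(graph_count)  # 2 * 3 * 2 * 13 = 156
```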

Thanks to Lukasz and Cham for reviewing my PRs and showing how to use perfkit dashboards.

The dashboards are here: