As suggested by Anton below, I opened a PR on the website to reference the Nexmark dashboards.
Since I did not want users to take them as rigorous, neutral benchmarks of the runners / engines, but rather as a piece of CI tooling, I added a disclaimer. Please:
- tell me if you agree with publishing such performance results
- comment on the PR regarding the disclaimer.
On Thursday, July 19, 2018 at 12:30 +0200, Etienne Chauchot wrote:
Yes, good idea, I'll update the Nexmark website page.
On Wednesday, July 18, 2018 at 10:17 -0700, Anton Kedin wrote:
These dashboards look great!
Can we publish the links to the dashboards somewhere for better visibility? E.g. on the Jenkins website, in emails, or on the wiki.
I've been asking around and it sounds like we should be able to get a dedicated Jenkins node for performance tests. Another thing that might help is making the runs a few times longer. They currently run for around 2 seconds each, so the build overhead probably exceeds the actual testing time. Internally at Google we run them with 2000x as many events on Dataflow, but a job of that size won't even complete on the Direct Runner.
I didn't see the query 3 issues, but now that you point it out it looks like a bug to me too.
Yes, I saw that; aside from dedicating Jenkins nodes to Nexmark, I see no other way.
Also, did you see the query 3 output size on the Direct Runner? It should be a straight line and it is not; I'm wondering if there is a problem with the state and timers implementation in the Direct Runner.
On Tuesday, July 17, 2018 at 11:38 -0700, Andrew Pilloud wrote:
I'm noticing the graphs are really noisy. It looks like we are running these on shared Jenkins executors, so our perf tests are fighting with other builds for CPU. I've opened an issue (https://issues.apache.org/jira/browse/BEAM-4804) and am wondering if anyone knows an easy fix to isolate these jobs.
@Etienne: Nice to see the graphs! :)
@Ismael: Good idea, there's no document yet. I think we could create a small google doc with instructions on how to do this.
@Andrew, this is because I did not find a way to set two scales on the Y axis in the perfkit graphs. Indeed, numResults varies from 1 to 100,000, while runtimeSec is usually below 10 s.
On Thursday, July 12, 2018 at 12:04 -0700, Andrew Pilloud wrote:
This is great, and should make performance work much easier! I'm going to get the Beam SQL Nexmark jobs publishing as well (opened https://issues.apache.org/jira/browse/BEAM-4774 to track). I might take on the Dataflow runner as well if no one else volunteers.
I am curious as to why you have two separate graphs for runtime and count rather than graphing count/runtime to get the throughput rate for each run. Or should that be a third graph? It looks like it would just be a small tweak to the query in perfkit.
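For illustration, the derived metric suggested above is just the ratio of the two series that are already collected. A minimal sketch of that computation (the field names numResults and runtimeSec come from the earlier message; the sample values here are made up, not real benchmark data):

```python
# Hypothetical per-run records, using the metric names mentioned in the
# thread (numResults, runtimeSec). The values are illustrative only.
runs = [
    {"query": "query0", "numResults": 100_000, "runtimeSec": 2.0},
    {"query": "query3", "numResults": 580, "runtimeSec": 1.6},
]

# Throughput (results per second) would be the series for a third graph.
for run in runs:
    run["throughput"] = run["numResults"] / run["runtimeSec"]

for run in runs:
    print(f'{run["query"]}: {run["throughput"]:.1f} results/sec')
```

Plotting this single derived series would also sidestep the two-scale Y-axis problem, since throughput is one number per run.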
This is really cool Etienne : ) thanks for working on this.
Out of curiosity, do you know how often the tests run on each runner?
Awesome, Etienne. It is really important for the (user) community to have that visibility, since performance is one of the most important aspects of Beam's quality. Kudos!