
Codespeed deployment for Flink

Hello community,

For almost a year at data Artisans, Nico and I have been maintaining a
setup that continuously evaluates Flink with the benchmarks defined at
https://github.com/dataArtisans/flink-benchmarks. With growing interest,
and after it proved useful a couple of times, we have finally decided to
publish the web UI layer of this setup. It is currently accessible at
the following (maybe not so?) temporary URL:

http://codespeed.dak8s.net:8000

This is a simple web UI to present performance changes over past and
present commits to Apache Flink. It only has a couple of views and the
most useful ones are:

1. Timeline
2. Comparison (I recommend using normalization)

Timeline is useful for spotting unintended regressions or unexpected
improvements. It is updated every six hours.
Comparison is useful for comparing a given branch (for example a pending
PR) with the master branch. More about that later.
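
As a rough illustration of why normalization helps in the comparison
view, here is a minimal sketch. It assumes that normalization means
dividing each benchmark score by the score of the same benchmark on the
baseline branch, so that benchmarks with very different absolute scales
become comparable ratios. The benchmark names and numbers are made up
for the example and are not real measurements.

```python
def normalize(results, baseline):
    """Divide each benchmark score by the baseline score of the same benchmark."""
    normalized = {}
    for name, score in results.items():
        base = baseline.get(name)
        if base:  # skip benchmarks missing from the baseline or with a zero score
            normalized[name] = score / base
    return normalized


# Hypothetical throughput scores, for illustration only.
master = {"benchmarkA": 2000.0, "benchmarkB": 50.0}
pr_branch = {"benchmarkA": 1800.0, "benchmarkB": 55.0}

for name, ratio in sorted(normalize(pr_branch, master).items()):
    print(f"{name}: {ratio:.2f}x of master")
```

Without normalization the difference on benchmarkB (5 units) would be
visually dwarfed by the one on benchmarkA (200 units), although both
are changes of about 10%.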

The codespeed project on its own is just a presentation layer. As
mentioned before, the only currently available benchmarks are defined in
the flink-benchmarks repository, and they are executed periodically or on
demand by Jenkins on a single bare metal machine. The current setup
limits us to micro benchmarks (they are easier to set up, develop and
maintain, and have a quicker feedback loop compared to cluster
benchmarks), but nothing prevents us from setting up other kinds of
benchmarks and uploading their results to our codespeed instance as
well.
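
To give an idea of what uploading a result could look like, below is a
minimal sketch, not the script our Jenkins job uses. It assumes a
codespeed instance that accepts form-encoded POST requests on its
result/add/ endpoint with the field names shown; the URL and the
project, executable and environment names are placeholders that would
have to match what is configured in the target instance.

```python
import urllib.parse
import urllib.request

# Placeholder URL; point this at the codespeed instance you want to feed.
CODESPEED_URL = "http://localhost:8000/"


def build_result(commit_id, benchmark, value, branch="master"):
    """Build the form fields describing a single benchmark result."""
    return {
        "commitid": commit_id,
        "branch": branch,
        "project": "Flink",               # assumed project name
        "executable": "Flink",            # assumed executable name
        "benchmark": benchmark,
        "environment": "benchmark-host",  # assumed environment name
        "result_value": value,
    }


def submit_result(result, base_url=CODESPEED_URL):
    """POST one result to the instance and return the HTTP status code."""
    data = urllib.parse.urlencode(result).encode("utf-8")
    with urllib.request.urlopen(base_url + "result/add/", data) as response:
        return response.status
```

Any benchmark harness that can produce a number per commit could call
submit_result(build_result(...)) after each run, which is why cluster
benchmarks could feed the same instance.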

Regarding the comparison view: currently data Artisans' Flink mirror
repository at https://github.com/dataArtisans/flink is configured to
trigger a benchmark run on every commit pushed to the
benchmark-request branch. (We chose to use data Artisans' repository
here because we needed a custom GitHub hook that we couldn't add to the
apache/flink repository.) Benchmarking usually takes between one and two
hours. One obvious limitation at the moment is that there is only one
comparison view, with one comparison branch, so comparing two PRs at the
same time is impossible. However, we can tackle this problem once it
becomes a real issue and not only a theoretical one.

Piotrek & Nico