I totally agree with your points, especially regarding users' priorities (sticking with a version that already works) and the need to leverage important new features. It is indeed a difficult balance to strike.
I can speak for the part I know: for the Spark runner, the aim was to support the native Spark Dataset API (in place of RDD). For that we needed to upgrade to Spark 2.x (and we will probably leverage Beam Row as well).
But such an upgrade is a good amount of work, which makes it difficult to commit to a schedule such as "if there is a major new feature in an execution engine that we want to leverage, then the upgrade in Beam will be done within x months".
Regarding your point on portability: decoupling the SDK from the runner with a runner harness and an SDK harness might make pipeline authors' work easier when it comes to pipeline maintenance. But, still, if we upgrade the runner libs, users might find that their runner harness no longer works with their engine version.
If such SDK/runner decoupling is 100% functional, then we could imagine having multiple runner harnesses shipping different versions of the runner libs to solve this problem.
But we would then need to support more than one version of the runner libs. We chose not to do this for the Spark runner.
In light of the discussion about Beam LTS releases, I'd like to kick
off a thread about how often we upgrade the execution engine of each
Runner. By upgrade, I mean major/minor versions which typically break
the binary compatibility of Beam pipelines.
For the Flink Runner, we try to track the latest stable version. Some
users reported that this can be problematic, as it requires them to
potentially upgrade their Flink cluster with a new version of Beam.
From a developer's perspective, it makes sense to migrate as early as
possible to the newest version of the execution engine, e.g. to leverage
the newest features. From a user's perspective, you don't care about the
latest features if your use case still works with Beam.
We have to please both parties. So I'd suggest upgrading the execution
engine whenever necessary (e.g. critical new features, end of life of
current version). On the other hand, the upcoming Beam LTS releases will
contain a longer-supported version.
Maybe we don't need to discuss much about this but I wanted to hear what
the community has to say about it. Particularly, I'd be interested in
how the other Runner authors intend to do it.
As far as I understand, once portability is stable, we could
theoretically upgrade the SDK without upgrading the runtime components.
That would allow us to defer the upgrade for a longer time.