[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[PROPOSAL] [community] A more structured approach to reviews and contributions

Hi Flink community members!

As many of you will have noticed, the Flink project activity has gone up
again quite a bit.
There are many more contributions, which is an absolutely great thing to
have :-)

However, we see a continuously growing backlog of pull requests and JIRA
To make sure the community will be able to handle the increased volume, I
think we need to revisit some
approaches and processes. I believe there are a few opportunities to
structure things a bit better, which
should help to scale the development.

The first thing I would like to bring up are *Pull Request Reviews*. Even
though more community members being
active in reviews (which is a really great thing!) the Pull Request backlog
is increasing quite a bit.

Why are pull requests still not merged faster? Looking at the reviews, one
thing I noticed is that most reviews deal
immediately with detailed code issues, and leave out most of the core
questions that need to be answered
before a Pull Request can be merged, like "is this a desired feature?" or
"does this align well with other developments?".
I think that we even make things slightly worse that way: From my personal
experience, I have often thought "oh, this
PR has a review already" and rather looked at another PR, only to find
later that the first review did never decide whether
this PR is actually a good fit for Flink.

There has never been a proper documentation of how to answer these
questions, what to evaluate in reviews,
guidelines for how to evaluate pull requests, other than code quality. I
suspect that this is why so many reviewers
do not address the "is this a good contribution" questions, making pull
requests linger until another committers joins
the review.

Below is an idea for a guide *"How to Review Contributions"*. It outlines
five core aspects to be checked in every
pull request, and suggests a priority for clarifying those. The idea is
that this helps us to better structure reviews, and
to make each reviewer aware what we look for in a review and where to best
bring in their help.

Looking forward to comments!



The draft is in this Google Doc. Please add small textual comments to the
doc, and bigger principle discussions as replies to this mail.


*How to Review Contributions------------------------------This guide is for
all committers and contributors that want to help with reviewing
contributions. Thank you for your effort - good reviews are one the most
important and crucial parts of an open source project. This guide should
help the community to make reviews such that: - Contributors have a good
contribution experience- Reviews are structured and check all important
aspects of a contribution- Make sure we keep a high code quality in Flink-
We avoid situations where contributors and reviewers spend a lot of time to
refine a contribution that gets rejected laterReview ChecklistEvery review
needs to check the following five aspects. We encourage to check these
aspects in order, to avoid spending time on detailed code quality reviews
when there is not yet consensus that a feature or change should be actually
be added.(1) Is there consensus whether the change of feature should go
into to Flink?For bug fixes, this needs to be checked only in case it
requires bigger changes or might break existing programs and
setups.Ideally, this question is already answered from a JIRA issue or the
dev-list discussion, except in cases of bug fixes and small lightweight
additions/extensions. In that case, this question can be immediately marked
as resolved. For pull requests that are created without prior consensus,
this question needs to be answered as part of the review.The decision
whether the change should go into Flink needs to take the following aspects
into consideration: - Does the contribution alter the behavior of features
or components in a way that it may break previous users’ programs and
setups? If yes, there needs to be a discussion and agreement that this
change is desirable. - Does the contribution conceptually fit well into
Flink? Is it too much of special case such that it makes things more
complicated for the common case, or bloats the abstractions / APIs? - Does
the feature fit well into Flink’s architecture? Will it scale and keep
Flink flexible for the future, or will the feature restrict Flink in the
future? - Is the feature a significant new addition (rather than an
improvement to an existing part)? If yes, will the Flink community commit
to maintaining this feature? - Does the feature produce added value for
Flink users or developers? Or does it introduce risk of regression without
adding relevant user or developer benefit?All of these questions should be
answerable from the description/discussion in JIRA and Pull Request,
without looking at the code.(2) Does the contribution need attention from
some specific committers and is there time commitment from these
committers?Some changes require attention and approval from specific
committers. For example, changes in parts that are either very performance
sensitive, or have a critical impact on distributed coordination and fault
tolerance need input by a committer that is deeply familiar with the
component.As a rule of thumb, this is the case when the Pull Request
description answers one of the questions in the template section “Does this
pull request potentially affect one of the following parts” with ‘yes’.This
question can be answered with - Does not need specific attention- Needs
specific attention for X (X can be for example checkpointing, jobmanager,
etc.).- Has specific attention for X by @commiterA, @contributorBIf the
pull request needs specific attention, one of the tagged
committers/contributors should give the final approval.(3) Is the
contribution described well?Check whether the contribution is sufficiently
well described to support a good review. Trivial changes and fixes do not
need a long description. Any pull request that changes functionality or
behavior needs to describe the big picture of these changes, so that
reviews know what to look for (and don’t have to dig through the code to
hopefully understand what the change does).Changes that require longer
descriptions are ideally based on a prior design discussion in the mailing
list or in JIRA and can simply link to there or copy the description from
there.(4) Does the implementation follow the right overall
approach/architecture?Is this the best approach to implement the fix or
feature, or are there other approaches that would be easier, more robust,
or more maintainable?This question should be answerable from the Pull
Request description (or the linked JIRA) as much as possible.We recommend
to check this before diving into the details of commenting on individual
parts of the change.(5) Is the overall code quality good, meeting standard
we want to maintain in Flink?This is the detailed code review of the actual
changes, covering: - Are the changes doing what is described in the design
document or PR description?- Does the code follow the right software
engineering practices? It the code correct, robust, maintainable,
testable?- Are the change performance aware, when changing a performance
sensitive part?- Are the changes sufficiently covered by tests?- Are the
tests executing fast?- Does the code format follow Flink’s checkstyle
pattern?- Does the code avoid to introduce additional compiler
warnings?Some code style guidelines can be found in the [Flink Code Style
<https://flink.apache.org/contribute-code.html#code-style>)Pull Request
Review TemplateAdd the following checklist to the pull request review,
checking the boxes as the questions are answered:  - [ ] Consensus that the
contribution should go into to Flink  - [ ] Does not need specific
attention | Needs specific attention for X | Has attention for X by Y  - [
] Contribution description  - [ ] Architectural approach  - [ ] Overall
code quality*