Re: [PROPOSAL] [community] A more structured approach to reviews and contributions
Thanks for raising this discussion and the previous work! I
strongly support effort to improve the process of contributions and reviews.
As you mentioned above, Flink community responses to contributions a bit
inefficiently, both for latency and quality. In my opinion the most
important things are to make sure reviews of contributions supervised and
push on the stage of a contribution step by step. The latter is similar as
you mentioned in the document.
Here are my observations and suggestions on these theme.
1. The document says that some contributions need need attention from some
specific committers. I definitely agree with that. In fact, under our
current process, every pull request is ideally named in format "[FLINK-XXX]
[module] Title". The module part indicates that in someways. However, there
is a time I make a contribution and know nothing about to whom I could cc.
Take a look at the Scala repo ( https://github.com/scala/scala/ ). It lists
by which committers a module maintain, I think it is really help for new
2. I find that there are quite a number of pull requests of Flink are
inactive because no one comments or take concern of. Despite the reason it
is quite a bad experience. We surely reject some contributions but should
never disregard them. Here comes the idea that we always initially assign a
committer to a new contribution ( ideally, for we might have no so many
committer yet ). They are initially in charge of the contribution and take
the responsibility to nude it. Surely one can reassign to another who is
more properly, since the main issue here is ensure contribution supervised.
3. The document introduce five stages of a review, and take 2 into
consideration. I'd like to add 2 more stage, one of which is
"waiting-for-review" represents no one comments and "stall" represents pull
request cannot go ahead for some reason such as contributor doesn't
response, note that a "stall" pull requests is not rejected.
Having these previous, as Fabian Hueske comments in the document, we can
implement a bot to do such work. GitHub supports tagging pull requests
which meet our requirement of marking stage of a pr. Tags we would to
support include five stages in document, two addition stages as above.
Since a pr closed by rejected or merged and must with a conclusion in the
comments, we don't need that tag.
Rust-lang team implements a bot (
https://github.com/rust-lang-nursery/highfive ) which does these things
amazingly. See how https://github.com/rust-lang/rust/pulls works.
To sum up, I suggestion that we (1) introduce our committer and which
modules they maintain/familiar with at somewhere easy to find. (2) make
sure that every contribution is supervised. (3) Tag pull requests so that
the process more smoothly.
Till Rohrmann <trohrmann@xxxxxxxxxx> 于2018年9月17日周一 下午4:27写道：
> Thanks for writing this up Stephan. I like the steps and hope that it will
> help the community to make the review process better. Thus, +1 for putting
> your proposal to practice.
> On Mon, Sep 17, 2018 at 10:00 AM Stephan Ewen <sewen@xxxxxxxxxx> wrote:
> > Hi Flink community members!
> > As many of you will have noticed, the Flink project activity has gone up
> > again quite a bit.
> > There are many more contributions, which is an absolutely great thing to
> > have :-)
> > However, we see a continuously growing backlog of pull requests and JIRA
> > issues.
> > To make sure the community will be able to handle the increased volume, I
> > think we need to revisit some
> > approaches and processes. I believe there are a few opportunities to
> > structure things a bit better, which
> > should help to scale the development.
> > The first thing I would like to bring up are *Pull Request Reviews*. Even
> > though more community members being
> > active in reviews (which is a really great thing!) the Pull Request
> > is increasing quite a bit.
> > Why are pull requests still not merged faster? Looking at the reviews,
> > thing I noticed is that most reviews deal
> > immediately with detailed code issues, and leave out most of the core
> > questions that need to be answered
> > before a Pull Request can be merged, like "is this a desired feature?" or
> > "does this align well with other developments?".
> > I think that we even make things slightly worse that way: From my
> > experience, I have often thought "oh, this
> > PR has a review already" and rather looked at another PR, only to find
> > later that the first review did never decide whether
> > this PR is actually a good fit for Flink.
> > There has never been a proper documentation of how to answer these
> > questions, what to evaluate in reviews,
> > guidelines for how to evaluate pull requests, other than code quality. I
> > suspect that this is why so many reviewers
> > do not address the "is this a good contribution" questions, making pull
> > requests linger until another committers joins
> > the review.
> > Below is an idea for a guide *"How to Review Contributions"*. It outlines
> > five core aspects to be checked in every
> > pull request, and suggests a priority for clarifying those. The idea is
> > that this helps us to better structure reviews, and
> > to make each reviewer aware what we look for in a review and where to
> > bring in their help.
> > Looking forward to comments!
> > Best,
> > Stephan
> > ====================================
> > The draft is in this Google Doc. Please add small textual comments to the
> > doc, and bigger principle discussions as replies to this mail.
> > *How to Review Contributions------------------------------This guide is
> > all committers and contributors that want to help with reviewing
> > contributions. Thank you for your effort - good reviews are one the most
> > important and crucial parts of an open source project. This guide should
> > help the community to make reviews such that: - Contributors have a good
> > contribution experience- Reviews are structured and check all important
> > aspects of a contribution- Make sure we keep a high code quality in
> > We avoid situations where contributors and reviewers spend a lot of time
> > refine a contribution that gets rejected laterReview ChecklistEvery
> > needs to check the following five aspects. We encourage to check these
> > aspects in order, to avoid spending time on detailed code quality reviews
> > when there is not yet consensus that a feature or change should be
> > be added.(1) Is there consensus whether the change of feature should go
> > into to Flink?For bug fixes, this needs to be checked only in case it
> > requires bigger changes or might break existing programs and
> > setups.Ideally, this question is already answered from a JIRA issue or
> > dev-list discussion, except in cases of bug fixes and small lightweight
> > additions/extensions. In that case, this question can be immediately
> > as resolved. For pull requests that are created without prior consensus,
> > this question needs to be answered as part of the review.The decision
> > whether the change should go into Flink needs to take the following
> > into consideration: - Does the contribution alter the behavior of
> > or components in a way that it may break previous users’ programs and
> > setups? If yes, there needs to be a discussion and agreement that this
> > change is desirable. - Does the contribution conceptually fit well into
> > Flink? Is it too much of special case such that it makes things more
> > complicated for the common case, or bloats the abstractions / APIs? -
> > the feature fit well into Flink’s architecture? Will it scale and keep
> > Flink flexible for the future, or will the feature restrict Flink in the
> > future? - Is the feature a significant new addition (rather than an
> > improvement to an existing part)? If yes, will the Flink community commit
> > to maintaining this feature? - Does the feature produce added value for
> > Flink users or developers? Or does it introduce risk of regression
> > adding relevant user or developer benefit?All of these questions should
> > answerable from the description/discussion in JIRA and Pull Request,
> > without looking at the code.(2) Does the contribution need attention from
> > some specific committers and is there time commitment from these
> > committers?Some changes require attention and approval from specific
> > committers. For example, changes in parts that are either very
> > sensitive, or have a critical impact on distributed coordination and
> > tolerance need input by a committer that is deeply familiar with the
> > component.As a rule of thumb, this is the case when the Pull Request
> > description answers one of the questions in the template section “Does
> > pull request potentially affect one of the following parts” with
> > question can be answered with - Does not need specific attention- Needs
> > specific attention for X (X can be for example checkpointing, jobmanager,
> > etc.).- Has specific attention for X by @commiterA, @contributorBIf the
> > pull request needs specific attention, one of the tagged
> > committers/contributors should give the final approval.(3) Is the
> > contribution described well?Check whether the contribution is
> > well described to support a good review. Trivial changes and fixes do not
> > need a long description. Any pull request that changes functionality or
> > behavior needs to describe the big picture of these changes, so that
> > reviews know what to look for (and don’t have to dig through the code to
> > hopefully understand what the change does).Changes that require longer
> > descriptions are ideally based on a prior design discussion in the
> > list or in JIRA and can simply link to there or copy the description from
> > there.(4) Does the implementation follow the right overall
> > approach/architecture?Is this the best approach to implement the fix or
> > feature, or are there other approaches that would be easier, more robust,
> > or more maintainable?This question should be answerable from the Pull
> > Request description (or the linked JIRA) as much as possible.We recommend
> > to check this before diving into the details of commenting on individual
> > parts of the change.(5) Is the overall code quality good, meeting
> > we want to maintain in Flink?This is the detailed code review of the
> > changes, covering: - Are the changes doing what is described in the
> > document or PR description?- Does the code follow the right software
> > engineering practices? It the code correct, robust, maintainable,
> > testable?- Are the change performance aware, when changing a performance
> > sensitive part?- Are the changes sufficiently covered by tests?- Are the
> > tests executing fast?- Does the code format follow Flink’s checkstyle
> > pattern?- Does the code avoid to introduce additional compiler
> > warnings?Some code style guidelines can be found in the [Flink Code Style
> > Page](https://flink.apache.org/contribute-code.html#code-style
> > <https://flink.apache.org/contribute-code.html#code-style>)Pull Request
> > Review TemplateAdd the following checklist to the pull request review,
> > checking the boxes as the questions are answered: - [ ] Consensus that
> > contribution should go into to Flink - [ ] Does not need specific
> > attention | Needs specific attention for X | Has attention for X by Y -
> > ] Contribution description - [ ] Architectural approach - [ ] Overall
> > code quality*