osdir.com


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Please please please review : Re: PubSub connector (FLINK-9311)


Hi!

We're currently in the release process for 1.7.0 with a feature-freeze in effect, which locks down most (all?) committers to bug-fixes and testing.

I apologize for this PR being ignored for so long, especially so since you started a discussion beforehand as to whether the Flink community would be interested in having such a connector.

I will nag some people to review it once the release is done or have a crack at it myself.

On 02.11.2018 08:23, Niels Basjes wrote:
Hi all
We created this feature almost 2 months ago.
Please review this so it becomes part ofthe release so that others can use
it too.

So far we have not seen any feedback on what we should change/improve to
make this happen.

What should we do?

Niels & Richard

On Wed, Oct 17, 2018, 10:52 Richard Deurwaarder <richard@xxxxxxx wrote:

Hello everyone,

To improve performance we have changed some parts of the code:
* Previously the same threads used to poll PubSub we're used to actually
run the flink pipeline, this caused some issues in the PubSub SDK. We've
changed this so that the PubSub threads are only used to poll PubSub and
the flink pipeline runs in the same thread as the run() method.
* We now expose PubSub options so we are able to tune the PubSub source
functions when the defaults don't give the best performance.

I did run into one issue that we are currently unable to fix outside of
flink when rescaling the job to a lower parallelism:
We use the MessageAcknowledgingSourceBase and this had been designed for
RabbitMQ to run with a parallelism of 1. For PubSub we would like a higher
parallelism and fallback to ATLEAST_ONCE.
When rescaling back to a lower parallelism we need to combine state of 2
tasks but the MessageAcknowledingSourceBase explicitly does not support
this in the initializeState() function.
I've added this commit:

https://github.com/Xeli/flink/commit/698b6f1c802427f940f2f550796539fbfa4b5dfa
to
fix this, this should not change anything for RabbitMQ but does allow
rescaling when needed.


That last issue is not possible for us to fix outside of flink, therefore
we would really like to see this added to 1.7 :)
If it remains difficult to find Google Cloud Platform users to review this,
would it be an option to add the PubSub connector as a 'beta' feature?
Perhaps this would make it more visible to the community?

Best,

Richard

On Fri, Sep 14, 2018 at 4:01 PM Aljoscha Krettek <aljoscha@xxxxxxxxxx>
wrote:

Hi Niels and Richard,

I would be very happy about having a PubSub connector in Flink. Having it
in Flink means that you don't have manual effort for tracking API changes
and I think having a production user is incentive enough for them (you)
to
maintain the connector.

I'm afraid we don't have much PubSub knowledge in the Flink community but
I will try and talk to some folks to have this reviewed.

Best,
Aljoscha

On 14. Sep 2018, at 15:40, Niels Basjes <Niels@xxxxxxxxx> wrote:

Hi all,

We (Richard and I) would really appreciate it if you guys could review
the new feature we created (see below).
It is something we really need in production and thought it would be
best if it can be a native part of the Flink toolset.
Please indicate what we need to change/improve in order to get this
committed.
Thank you.

Niels Basjes

On Mon, Sep 10, 2018 at 10:12 AM Richard Deurwaarder <richard@xxxxxxx>
wrote:
Hello everyone,

A while back I opened this Jira issue:
https://issues.apache.org/jira/browse/FLINK-9311. It is regarding a
new
Flink connector for Google PubSub.

At Bol.com, the company I work for, we are going to use this connector
to
do our financial processing. My colleague, Niels Basjes, and I have
finished implementing this and we are about ready to start running it
in
production.

We would like to donate this code and have opened a pull request (
https://github.com/apache/flink/pull/6594). The pull request
contains:
   - a SourceFunction (with 2 test versions)
   - a SinkFunction
   - an example application
   - and End-to-End tests using a docker container

Yanghua has taken a glance at the code and already provided us with
some
feedback. We would like to invite others to provide additional
feedback.
I hope you find this useful and will consider merging the PR!

Best,

Richard Deurwaarder


--
Best regards / Met vriendelijke groeten,

Niels Basjes