Re: Consuming data from dynamoDB streams to flink


Hi Gordon:

We are starting to implement some of the primitives along this path. Please
let us know if you have any suggestions.
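
To make the plan quoted below easier to comment on, here is a minimal,
self-contained Java sketch of the adapter idea. It does not use the real
Flink or AWS classes: KinesisLikeClient, StubKinesisClient,
FakeDynamoStreamsClient, DynamoStreamsAdapter, SketchKinesisConsumer and
SketchDynamoStreamsConsumer are all hypothetical stand-ins, and the method
names on the fake DynamoDB client are assumptions made purely for
illustration. It only shows the shape of the approach: an adapter exposes
the three Kinesis-style calls, and a consumer subclass swaps in that adapter.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the three AmazonKinesis calls named in the thread.
interface KinesisLikeClient {
    List<String> describeStream(String streamName);   // returns shard ids
    String getShardIterator(String streamName, String shardId);
    List<String> getRecords(String shardIterator);
}

// Hypothetical default client; a real consumer would talk to Kinesis here.
class StubKinesisClient implements KinesisLikeClient {
    public List<String> describeStream(String streamName) { return Collections.emptyList(); }
    public String getShardIterator(String streamName, String shardId) { return ""; }
    public List<String> getRecords(String shardIterator) { return Collections.emptyList(); }
}

// Hypothetical low-level DynamoDB Streams client with its own method names.
class FakeDynamoStreamsClient {
    List<String> listShards(String streamArn) { return List.of("shardId-0001"); }
    String iteratorFor(String streamArn, String shardId) { return streamArn + "/" + shardId; }
    List<String> readChanges(String iterator) { return List.of("INSERT:item-1", "MODIFY:item-1"); }
}

// Adapter: exposes the DynamoDB-style client through the Kinesis-style interface.
class DynamoStreamsAdapter implements KinesisLikeClient {
    private final FakeDynamoStreamsClient delegate;

    DynamoStreamsAdapter(FakeDynamoStreamsClient delegate) { this.delegate = delegate; }

    public List<String> describeStream(String streamName) {
        return delegate.listShards(streamName);
    }
    public String getShardIterator(String streamName, String shardId) {
        return delegate.iteratorFor(streamName, shardId);
    }
    public List<String> getRecords(String shardIterator) {
        return delegate.readChanges(shardIterator);
    }
}

// Simplified consumer: all stream access goes through createClient().
class SketchKinesisConsumer {
    private final String stream;

    SketchKinesisConsumer(String stream) { this.stream = stream; }

    // Subclasses override this to plug in a different client.
    protected KinesisLikeClient createClient() { return new StubKinesisClient(); }

    // One pass over all shards: describeStream -> getShardIterator -> getRecords.
    List<String> pollOnce() {
        KinesisLikeClient client = createClient();
        List<String> out = new ArrayList<>();
        for (String shardId : client.describeStream(stream)) {
            String iterator = client.getShardIterator(stream, shardId);
            out.addAll(client.getRecords(iterator));
        }
        return out;
    }
}

// Subclass that only replaces the internal client with the adapter.
class SketchDynamoStreamsConsumer extends SketchKinesisConsumer {
    SketchDynamoStreamsConsumer(String streamArn) { super(streamArn); }

    @Override
    protected KinesisLikeClient createClient() {
        return new DynamoStreamsAdapter(new FakeDynamoStreamsClient());
    }
}
```

Calling new SketchDynamoStreamsConsumer("some-stream-arn").pollOnce() routes
all three calls through the adapter and returns the fake client's two change
records. The polling loop itself is untouched, which is the property we hope
carries over to the real FlinkKinesisConsumer.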

Thanks!

On Fri, Jun 29, 2018 at 12:31 AM, Ying Xu <yxu@xxxxxxxx> wrote:

> Hi Gordon:
>
> Really appreciate the reply.
>
> Yes, our plan is to build the connector on top of the FlinkKinesisConsumer.
> At a high level, FlinkKinesisConsumer mainly interacts with Kinesis
> through the AmazonKinesis client, more specifically through the following
> three function calls:
>
>    - describeStream
>    - getRecords
>    - getShardIterator
>
> Given that the low-level DynamoDB client (AmazonDynamoDBStreamsClient)
> has already implemented similar calls, it is possible to use that client to
> interact with DynamoDB streams and adapt the results from the DynamoDB
> streams model to the Kinesis model.
>
> It appears this is exactly what the AmazonDynamoDBStreamsAdapterClient
> <https://github.com/awslabs/dynamodb-streams-kinesis-adapter/blob/master/src/main/java/com/amazonaws/services/dynamodbv2/streamsadapter/AmazonDynamoDBStreamsAdapterClient.java>
> does. The adapter client implements the AmazonKinesis client interface
> and is officially supported by AWS. Hence it is possible to replace the
> internal Kinesis client inside FlinkKinesisConsumer with this adapter
> client when interacting with DynamoDB streams. The new class can be a
> subclass of FlinkKinesisConsumer with a new name, e.g.
> FlinkDynamoStreamConsumer.
>
> In the best case this would simply work, but we would like to hear whether
> there are other situations to take care of. In particular, I am wondering
> what the *"resharding behavior"* mentioned in FLINK-4582 refers to.
>
> Thanks a lot!
>
> -
> Ying
>
> On Wed, Jun 27, 2018 at 10:43 PM, Tzu-Li (Gordon) Tai <tzulitai@xxxxxxxxxx>
> wrote:
>
>> Hi!
>>
>> I think it would definitely be nice to have this feature.
>>
>> No previous work has been done on this issue, but AFAIK we should
>> be able to build this on top of the FlinkKinesisConsumer.
>> Whether this should live within the Kinesis connector module or an
>> independent module of its own is still TBD.
>> If you want, I would be happy to look at any concrete design proposals you
>> have for this before you start the actual development efforts.
>>
>> Cheers,
>> Gordon
>>
>> On Thu, Jun 28, 2018 at 2:12 AM Ying Xu <yxu@xxxxxxxx> wrote:
>>
>> > Thanks Fabian for the suggestion.
>> >
>> > *Ying Xu*
>> > Software Engineer
>> > 510.368.1252 <+15103681252>
>> >
>> > On Wed, Jun 27, 2018 at 2:01 AM, Fabian Hueske <fhueske@xxxxxxxxx>
>> > wrote:
>> >
>> > > Hi Ying,
>> > >
>> > > I'm not aware of any effort for this issue.
>> > > You could check with the assigned contributor in Jira if there is some
>> > > previous work.
>> > >
>> > > Best, Fabian
>> > >
>> > > 2018-06-26 9:46 GMT+02:00 Ying Xu <yxu@xxxxxxxx>:
>> > >
>> > > > Hello Flink dev:
>> > > >
>> > > > We have a number of use cases which involve pulling data from
>> > > > DynamoDB streams into Flink.
>> > > >
>> > > > Given that this issue is tracked by FLINK-4582
>> > > > <https://issues.apache.org/jira/browse/FLINK-4582>, we would like to
>> > > > check if any prior work has been completed by the community. We are
>> > > > also very interested in contributing to this effort. Currently, we
>> > > > have a high-level proposal which is based on extending the existing
>> > > > FlinkKinesisConsumer and making it work with DynamoDB streams (via
>> > > > integrating with the AmazonDynamoDBStreams API).
>> > > >
>> > > > Any suggestion is welcome. Thank you very much.
>> > > >
>> > > >
>> > > > -
>> > > > Ying
>> > > >