
Re: Output batch to Kafka

You could go with Chesnay's suggestion, which might be the quickest fix.

Creating a KafkaOutputFormat (possibly wrapping the KafkaProducer) would be a bit cleaner. Would be happy to have that as a contribution, actually ;-)
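For illustration, here is a rough sketch of what such a wrapper could look like. It is only a sketch under assumptions. `BatchOutputFormat`, `RecordProducer` and `ProducerFactory` are hypothetical stand-ins, defined locally so the snippet runs without Flink or Kafka on the classpath. In a real job the class would extend Flink's `RichOutputFormat` (which has the same `open(int, int)`, `writeRecord` and `close` lifecycle, plus `configure`) and the stand-in producer would be a `KafkaProducer` built from `Properties`.

```java
import java.io.IOException;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class KafkaOutputFormatSketch {

    // Stand-in for the open/writeRecord/close lifecycle of Flink's OutputFormat.
    interface BatchOutputFormat<T> {
        void open(int taskNumber, int numTasks) throws IOException;
        void writeRecord(T record) throws IOException;
        void close() throws IOException;
    }

    // Hypothetical stand-in for the few producer calls the wrapper needs.
    interface RecordProducer<T> {
        void send(String topic, T value);
        void flush();
        void close();
    }

    // The format object is shipped to the workers, so it carries a serializable
    // factory and only creates the (non-serializable) producer in open().
    interface ProducerFactory<T> extends Serializable {
        RecordProducer<T> create();
    }

    static final class KafkaOutputFormat<T> implements BatchOutputFormat<T> {
        private final String topic;
        private final ProducerFactory<T> factory;
        private transient RecordProducer<T> producer;

        KafkaOutputFormat(String topic, ProducerFactory<T> factory) {
            this.topic = topic;
            this.factory = factory;
        }

        @Override
        public void open(int taskNumber, int numTasks) {
            // One producer per parallel task instance.
            producer = factory.create();
        }

        @Override
        public void writeRecord(T record) throws IOException {
            if (producer == null) {
                throw new IOException("writeRecord called before open");
            }
            producer.send(topic, record);
        }

        @Override
        public void close() {
            if (producer == null) {
                return;
            }
            try {
                // Flush before closing so buffered records are not lost.
                producer.flush();
            } finally {
                producer.close();
                producer = null;
            }
        }
    }

    // In-memory producer used only to exercise the sketch.
    static final class InMemoryProducer implements RecordProducer<String> {
        final List<String> sent = new ArrayList<>();
        boolean flushed;
        boolean closed;

        @Override public void send(String topic, String value) { sent.add(topic + ":" + value); }
        @Override public void flush() { flushed = true; }
        @Override public void close() { closed = true; }
    }

    static List<String> demo() {
        InMemoryProducer producer = new InMemoryProducer();
        KafkaOutputFormat<String> format = new KafkaOutputFormat<>("events", () -> producer);
        try {
            format.open(0, 1);
            format.writeRecord("a");
            format.writeRecord("b");
            format.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (!producer.flushed || !producer.closed) {
            throw new IllegalStateException("producer was not flushed and closed");
        }
        return producer.sent;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The part that matters is the lifecycle: create the producer in open(), send in writeRecord(), and flush before closing in close(), since the producer buffers records asynchronously.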

If you care about producing "exactly once" using Kafka Transactions (Kafka 0.11+), it may be a tad bit more involved - please let me know if you want details there.

On Tue, Jun 5, 2018 at 8:10 AM, Chesnay Schepler <chesnay@xxxxxxxxxx> wrote:
This depends a little bit on your requirements.
If it is just about reading data from HDFS and writing it into Kafka, then it should be possible to simply wrap a KafkaProducer in a RichMapFunction that you use as a sink in your DataSet program.
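A minimal sketch of that idea, under assumptions: `RichMapper` and `RecordProducer` below are hypothetical stand-ins, defined locally so the snippet runs on its own. In Flink the class would extend `RichMapFunction` (whose `open` takes a `Configuration`), and the producer would be a real `KafkaProducer`. The mapper passes every record through unchanged and sends it to Kafka as a side effect. In an actual DataSet program the mapped DataSet would presumably still need a terminal output (for example a `DiscardingOutputFormat`) so that the job has a sink and gets executed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

class KafkaMapSinkSketch {

    // Stand-in for the open/map/close lifecycle of Flink's RichMapFunction.
    interface RichMapper<I, O> {
        void open();
        O map(I value);
        void close();
    }

    // Hypothetical stand-in for the few producer calls used here.
    interface RecordProducer<T> {
        void send(String topic, T value);
        void flush();
        void close();
    }

    // Sends each record to the topic and returns it unchanged.
    static final class KafkaWritingMapper<T> implements RichMapper<T, T> {
        private final String topic;
        private final Supplier<RecordProducer<T>> factory;
        private RecordProducer<T> producer;

        KafkaWritingMapper(String topic, Supplier<RecordProducer<T>> factory) {
            this.topic = topic;
            this.factory = factory;
        }

        @Override
        public void open() {
            producer = factory.get();
        }

        @Override
        public T map(T value) {
            producer.send(topic, value);
            return value;
        }

        @Override
        public void close() {
            if (producer != null) {
                // Flush first: sends are buffered and would otherwise be lost.
                producer.flush();
                producer.close();
                producer = null;
            }
        }
    }

    // Records what was sent, in place of a real Kafka connection.
    static final class RecordingProducer implements RecordProducer<String> {
        final List<String> sent = new ArrayList<>();

        @Override public void send(String topic, String value) { sent.add(topic + ":" + value); }
        @Override public void flush() { }
        @Override public void close() { }
    }

    static List<String> demo() {
        RecordingProducer recording = new RecordingProducer();
        KafkaWritingMapper<String> mapper = new KafkaWritingMapper<>("events", () -> recording);
        List<String> input = Arrays.asList("x", "y");
        List<String> passedThrough = new ArrayList<>();

        mapper.open();
        for (String line : input) {
            passedThrough.add(mapper.map(line));
        }
        mapper.close();

        if (!passedThrough.equals(input)) {
            throw new IllegalStateException("mapper must not change the records");
        }
        return recording.sent;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```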

However you could also use the Streaming API for that.

On 05.06.2018 00:39, Oleksandr Nitavskyi wrote:

Hello Squirrels,


Flink has a wonderful Kafka connector. We need to move data from HDFS to Kafka. Confluent proposes using Kafka Connect for this, but it would probably be easier for us to use Flink for such a task: a much higher level of abstraction, fewer details to manage, and a better fit for our context.


Do you know if there is a way to write data to Kafka using the batch approach?



Kind Regards

Oleksandr Nitavskyi