Re: Questions in sink exactly once implementation


Hi Henry,

Yes, achieving exactly-once through atomic commits is heavy for MySQL. However, you don't have to buffer data if you choose option 2. If the result data is idempotent, you can simply overwrite old records with new ones, and this also achieves exactly-once.
There is a document about End-to-End Exactly-Once Processing in Apache Flink[1], which may be helpful for you.
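
The idempotent overwrite described above can be sketched as follows. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for MySQL, and the table `results`, its columns, and the helper `write_idempotent` are hypothetical names. In MySQL the comparable statements would be `REPLACE INTO` or `INSERT ... ON DUPLICATE KEY UPDATE`.

```python
import sqlite3


def write_idempotent(conn, records):
    """Upsert (key, value) pairs keyed on the primary key.

    Writing the same records again overwrites the rows with identical
    values, so a replay after a failure does not change the result.
    """
    conn.executemany(
        "INSERT OR REPLACE INTO results (k, v) VALUES (?, ?)", records
    )
    conn.commit()


# Hypothetical result table, keyed so that each result has one row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (k TEXT PRIMARY KEY, v INTEGER)")

batch = [("user_a", 3), ("user_b", 5)]
write_idempotent(conn, batch)

# Simulate recovery: the records since the last checkpoint are replayed.
write_idempotent(conn, batch)

rows = conn.execute("SELECT k, v FROM results ORDER BY k").fetchall()
print(rows)  # [('user_a', 3), ('user_b', 5)]
```

This only works when each record maps to a deterministic key and the replayed value is the same (or a newer correct value); appending rows without a key would produce duplicates on replay.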

Best, Hequn




On Fri, Oct 12, 2018 at 5:21 PM 徐涛 <happydexutao@xxxxxxxxx> wrote:
Hi
        I am reading the book “Introduction to Apache Flink”, which mentions two ways to achieve exactly-once at the sink:
        1. The first way is to buffer all output at the sink and commit it atomically when the sink receives a checkpoint record.
        2. The second way is to eagerly write data to the output, keeping in mind that some of this data might be “dirty” and replayed after a failure. If there is a failure, we need to roll back the output, overwriting the dirty data and effectively deleting what has already been written to the output.
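
The first approach (buffer, then commit on checkpoint) can be sketched as below. This is a plain-Python simulation, not Flink API: the class `BufferingSink` and its methods `on_checkpoint` and `on_failure` are hypothetical names, and a list stands in for the external system.

```python
class BufferingSink:
    """Holds records in memory and publishes them only at a checkpoint."""

    def __init__(self):
        self.committed = []  # stands in for the external system
        self.pending = []    # records received since the last checkpoint

    def write(self, record):
        self.pending.append(record)

    def on_checkpoint(self):
        # A single extend stands in for one atomic commit of the buffer.
        self.committed.extend(self.pending)
        self.pending = []

    def on_failure(self):
        # Nothing uncommitted was ever visible, so just drop the buffer.
        self.pending = []


sink = BufferingSink()
for record in [1, 2, 3]:
    sink.write(record)
sink.on_checkpoint()

sink.write(4)
sink.on_failure()        # record 4 was never committed

for record in [4, 5]:    # the source replays from the last checkpoint
    sink.write(record)
sink.on_checkpoint()

print(sink.committed)  # [1, 2, 3, 4, 5]
```

The cost is visible in the sketch: output becomes visible only at checkpoints, and the buffer grows with the number of records between them.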

        I read the code of the Elasticsearch sink and found that there is a flushOnCheckpoint option; if it is set to true, changes accumulate until a checkpoint is made. I guess this guarantees at-least-once delivery: although the sink flushes in batches, the flush is not an atomic action, so it cannot guarantee exactly-once delivery.

        My questions are:
        1. Many sinks do not support transactions, so in that case I have to choose option 2 to achieve exactly-once. Then I have to buffer all the records between checkpoints and delete them, which is a rather heavy operation.
        2. I guess a MySQL sink should support exactly-once delivery, because MySQL supports transactions. But in that case the batch I execute is determined by the number of actions between checkpoints rather than by a fixed size, 100 for example. When there are a lot of items between checkpoints, this is a heavy operation too.
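
To make the concern in question 2 concrete, here is a rough sketch of a sink that sends statements in fixed-size batches inside one open transaction and commits only at a checkpoint. It is an assumption-laden illustration, not how any Flink connector is implemented: SQLite stands in for MySQL, and `TransactionalSink`, `BATCH_SIZE`, and the `events` table are hypothetical.

```python
import sqlite3

BATCH_SIZE = 100  # hypothetical fixed batch size


class TransactionalSink:
    """Sends rows in batches, but commits once per checkpoint."""

    def __init__(self, conn):
        self.conn = conn
        self.batch = []

    def write(self, record):
        self.batch.append(record)
        if len(self.batch) >= BATCH_SIZE:
            self._send_batch()

    def _send_batch(self):
        if self.batch:
            # Rows go to the database now but stay uncommitted.
            self.conn.executemany(
                "INSERT INTO events (id, payload) VALUES (?, ?)", self.batch
            )
            self.batch = []

    def on_checkpoint(self):
        self._send_batch()
        self.conn.commit()    # everything since the last checkpoint at once

    def on_failure(self):
        self.batch = []
        self.conn.rollback()  # discard all uncommitted rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.commit()

sink = TransactionalSink(conn)
for i in range(250):
    sink.write((i, "ok"))
sink.on_checkpoint()

for i in range(250, 400):
    sink.write((i, "dirty"))
sink.on_failure()

count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(count)  # 250
```

In this sketch the client-side buffer never exceeds `BATCH_SIZE`, but the database still has to hold one transaction open for the whole checkpoint interval, which is where the weight of the approach ends up.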

Best
Henry