Re: S3 for state backend in Flink 1.4.0

A heads up on this front:

  - For state backends during checkpointing, I would suggest using flink-s3-fs-presto, which is quite a bit faster than flink-s3-fs-hadoop because it avoids a bunch of unnecessary metadata operations.

  - We have started work on re-writing the Bucketing Sink to make it work with the shaded S3 filesystems (like flink-s3-fs-presto). We are also adding a more powerful internal abstraction that uses multipart uploads for faster incremental persistence of result chunks on checkpoints. This should be in 1.6, happy to share more as soon as it is out...
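
For the checkpointing suggestion above, a minimal flink-conf.yaml sketch might look like the following. It assumes Flink 1.4 key names and that the flink-s3-fs-presto jar has been copied from opt/ into lib/. The bucket name, path, and credential values are placeholders, not values from this thread, so please check the keys against the docs for your version.

```yaml
# Sketch only: bucket, path, and credentials below are hypothetical placeholders.
state.backend: filesystem
# Key name as used in Flink 1.4; later releases may use a different key.
state.backend.fs.checkpointdir: s3://my-bucket/flink/checkpoints
# Credentials for the shaded S3 filesystem (can be omitted if IAM roles are used).
s3.access-key: <your-access-key>
s3.secret-key: <your-secret-key>
```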

On Wed, Feb 7, 2018 at 3:56 PM, Marchant, Hayden <hayden.marchant@xxxxxxxx> wrote:
We actually got it working. Essentially, it's an implementation of the Hadoop FileSystem, and it was written with the idea that it can be used with Spark (since Spark has broader adoption than Flink as of now). We managed to get it configured, and found the latency to be much lower than with the s3 connector. There are far fewer copy operations happening under the hood when using this native API, which explains the better performance.

Happy to provide assistance offline if you're interested.


-----Original Message-----
From: Edward Rojas [mailto:edward.rojascl@gmail.com]
Sent: Thursday, February 01, 2018 6:09 PM
To: user@xxxxxxxxxxxxxxxx
Subject: RE: S3 for state backend in Flink 1.4.0

Hi Hayden,

It seems like a good alternative. But I see it's intended to work with Spark. Did you manage to get it working with Flink?

I ran some tests, but I get several errors when trying to create a file, either for checkpointing or for saving data.

Thanks in advance,

Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/