Re: support/docs for compression in StreamingFileSink


Just noticed one detail about using the BulkWriter interface: you can no
longer assign a rolling policy. That makes sense for formats like ORC/Parquet,
but perhaps not for simple text compression.
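
For reference, here is a minimal sketch of the wrapping approach described in
the quoted message below. It is an illustration under stated assumptions, not
a tested Flink integration. BulkWriterLike is a local stand-in that declares
the same three methods as Flink's BulkWriter (addElement, flush, finish) so
the snippet compiles without Flink on the classpath. GzipLineWriter and
GzipLineWriterDemo are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Assumption: local stand-in mirroring the three methods of Flink's
// BulkWriter<T>, so this sketch has no Flink dependency.
interface BulkWriterLike<T> {
    void addElement(T element) throws IOException;
    void flush() throws IOException;
    void finish() throws IOException;
}

// Hypothetical writer: one UTF-8 line per record, gzip-compressed.
class GzipLineWriter implements BulkWriterLike<String> {
    private final GZIPOutputStream gzip;

    GzipLineWriter(OutputStream out) throws IOException {
        // syncFlush = true so flush() pushes pending compressed data downstream.
        this.gzip = new GZIPOutputStream(out, true);
    }

    @Override
    public void addElement(String element) throws IOException {
        gzip.write(element.getBytes(StandardCharsets.UTF_8));
        gzip.write('\n');
    }

    @Override
    public void flush() throws IOException {
        gzip.flush();
    }

    @Override
    public void finish() throws IOException {
        // Writes the gzip trailer but does not close the underlying stream,
        // which is owned by whoever handed it to the writer.
        gzip.finish();
    }
}

class GzipLineWriterDemo {
    // Helper that decompresses a gzip byte array back into a string.
    static String gunzip(byte[] compressed) throws IOException {
        try (GZIPInputStream in =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        GzipLineWriter writer = new GzipLineWriter(sink);
        writer.addElement("first");
        writer.addElement("second");
        writer.finish();
        // Round-trip the compressed bytes to check they are a valid gzip stream.
        System.out.print(gunzip(sink.toByteArray()));
    }
}
```

In an actual job, the writer would presumably implement Flink's BulkWriter
directly and be created by a BulkWriter.Factory, whose create method receives
the output stream, with the factory passed to
StreamingFileSink.forBulkFormat. Calling finish() rather than close() on the
GZIPOutputStream is deliberate: the sink, not the writer, is expected to
close the underlying stream.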



On Wed, Nov 14, 2018 at 1:43 PM Addison Higham <addisonj@xxxxxxxxx> wrote:

> Hi all,
>
> I am moving some code to use the StreamingFileSink. Currently, it doesn't
> look like there is any native support for compression (gzip or otherwise)
> built into Flink when using the StreamingFileSink. This seems like a really
> common need that, as far as I could tell, isn't represented in JIRA.
>
> After a fair amount of digging, it seems like the way to do that is to
> implement the BulkWriter interface, where you can trivially wrap the
> output stream with something like a GZIPOutputStream.
>
> Until compression functionality is built into the StreamingFileSink, it
> might make sense to add some docs on how to use compression with the
> StreamingFileSink.
>
> I am willing to spend a bit of time documenting that, but before I do, I
> wanted to make sure that this is in fact the correct way to think about
> the problem, and to get your thoughts.
>
> Thanks!