
Re: support/docs for compression in StreamingFileSink

Hi Addison,

I think it is a good idea to add some more details to the documentation.
It would be great if you could contribute a section on how to enable
compression.

Concerning the RollingPolicy, I've pulled in Klou, who might give you more
details about the design decisions.


On Wed, Nov 14, 2018 at 10:07 PM Addison Higham <addisonj@xxxxxxxxx> wrote:

> Just noticed one detail about using the BulkWriter interface: you can no
> longer assign a rolling policy. That makes sense for formats like ORC/Parquet,
> but perhaps not for simple text compression.
>
> On Wed, Nov 14, 2018 at 1:43 PM Addison Higham <addisonj@xxxxxxxxx> wrote:
> > Hi all,
> >
> > I am moving some code to use the StreamingFileSink. Currently, it doesn't
> > look like there is any native support for compression (gzip or otherwise)
> > built into Flink when using the StreamingFileSink. This seems like a
> > really common need that, as far as I could tell, isn't represented in
> > JIRA.
> >
> > After a fair amount of digging, it seems like the way to do that is to
> > implement the BulkWriter interface, where you can trivially wrap an
> > OutputStream with something like a GZIPOutputStream.
> >
> > Until compression functionality is built into the StreamingFileSink, it
> > might make sense to add some docs on how to use compression with it.
> >
> > I am willing to spend a bit of time documenting that, but before I do, I
> > wanted to make sure that this is in fact the correct way to think about
> > this problem, and to get your thoughts.
> >
> > Thanks!
> >
> >
> >
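
For reference, the approach described in the original message (wrapping the output stream in a GZIPOutputStream inside a BulkWriter) might look roughly like the sketch below. This is only an illustration under stated assumptions, not code from the thread or from Flink: `SimpleBulkWriter` is a hypothetical stand-in that mirrors the three methods of Flink's `BulkWriter<T>` (`addElement`, `flush`, `finish`) so that the example compiles without Flink on the classpath, and `GzipStringWriter` and `roundTrip` are names made up for the example. Writing one UTF-8 record per line is likewise an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class GzipBulkWriterSketch {

    // Hypothetical stand-in for Flink's BulkWriter<T>; with Flink available,
    // the writer below would implement the real interface instead.
    interface SimpleBulkWriter<T> {
        void addElement(T element) throws IOException;
        void flush() throws IOException;
        void finish() throws IOException;
    }

    // Writes each record as one UTF-8 line into a gzip stream.
    static final class GzipStringWriter implements SimpleBulkWriter<String> {
        private final GZIPOutputStream gzip;

        GzipStringWriter(OutputStream out) throws IOException {
            this.gzip = new GZIPOutputStream(out);
        }

        @Override
        public void addElement(String element) throws IOException {
            gzip.write(element.getBytes(StandardCharsets.UTF_8));
            gzip.write('\n');
        }

        @Override
        public void flush() throws IOException {
            gzip.flush();
        }

        @Override
        public void finish() throws IOException {
            // Writes the gzip trailer but leaves the underlying stream open,
            // since the caller (the sink, in Flink) owns that stream.
            gzip.finish();
        }
    }

    // Compresses the given lines through the writer, then decompresses them
    // again so the result can be inspected.
    static String roundTrip(String... lines) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        GzipStringWriter writer = new GzipStringWriter(sink);
        for (String line : lines) {
            writer.addElement(line);
        }
        writer.finish();

        ByteArrayOutputStream plain = new ByteArrayOutputStream();
        try (InputStream in =
                new GZIPInputStream(new ByteArrayInputStream(sink.toByteArray()))) {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) != -1) {
                plain.write(buffer, 0, read);
            }
        }
        return new String(plain.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        System.out.print(roundTrip("first record", "second record"));
    }
}
```

In an actual job, a `BulkWriter.Factory` would presumably create such a writer around the `FSDataOutputStream` it is handed, and the factory would be passed to `StreamingFileSink.forBulkFormat(...)`. As noted earlier in the thread, that path does not let you assign a rolling policy.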