Re: Encoding options (delta, rle, ...) in pyarrow bindings


Hi Sebastian -- Uwe is referring to Parquet files. We don't yet have
in-memory RLE or delta encoding in the Arrow columnar format. I suspect
this will eventually be added, since it can matter a great deal for
in-memory query execution performance.

Wes
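
For readers unfamiliar with the two encodings named above, here is a minimal pure-Python sketch of the ideas behind run-length and delta encoding. The function names are made up for illustration. This is not how Parquet lays the data out on disk (its RLE and delta encodings are bit-packed and block-based), and it is not an Arrow API.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original list."""
    return [value for value, count in runs for _ in range(count)]


def delta_encode(values):
    """Store the first value, then the difference between neighbours."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Rebuild the original values with a running sum over the deltas."""
    out = []
    total = 0
    for delta in deltas:
        total += delta
        out.append(total)
    return out
```

Run-length encoding pays off on columns with long runs of repeated values; delta encoding pays off on slowly changing sequences such as timestamps or sorted IDs.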

On Fri, Nov 2, 2018, 2:18 PM Uwe L. Korn <uwelk@xxxxxxxxxx> wrote:

> Hello Sebastian,
>
> Currently you can only switch between plain encoding and dictionary
> encoding combined with run-length encoding, using the `use_dictionary`
> flag on
> https://arrow.apache.org/docs/python/generated/pyarrow.parquet.write_table.html#pyarrow.parquet.write_table
> Other encodings are so far implemented only on the read path; we cannot
> write delta encodings yet.
>
> Uwe
>
> On Fri, Nov 2, 2018, at 12:53 PM, Sebastian Himberger wrote:
> > Hi,
> >
> > I hope this is the right list. I couldn't find a "users" list on the
> > website so please forgive me if I am interrupting here.
> >
> > I am developing an application using the pyarrow module. Reading
> > through the documentation, I couldn't find a way to specify an
> > encoding like delta or run length for a column. Is this not supported
> > yet or am I missing something?
> >
> > Thanks so much,
> > Sebastian
>