Re: [RUST] [DISCUSS] Changing type of array lengths


Thanks for raising the issue, Paddy. In C++/Python/R we often work
with very large contiguous datasets, so having support for 64-bit
lengths is important. If supporting this in Rust is not a hardship, I
think it's a good idea.

For IPC (shared memory) or RPC (Flight / gRPC), in many cases it would
make sense to break things into smaller chunks. We have an interface
to slice a table (which may be either contiguous or chunked
internally) into chunks of a desired size (like 64K or similar).

https://github.com/apache/arrow/blob/master/cpp/src/arrow/table.h#L266
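
For the Rust side, the splitting step might look roughly like the
sketch below. It only computes the index ranges for the smaller
arrays, assuming lengths are held as usize. The name chunk_ranges and
its signature are made up for illustration; this is not an existing
function in the Arrow Rust crate.

```rust
use std::ops::Range;

/// Hypothetical helper (not part of the Arrow Rust crate): split an array of
/// `len` elements into consecutive index ranges, each holding at most
/// `max_chunk_len` elements. The last range may be shorter than the rest.
fn chunk_ranges(len: usize, max_chunk_len: usize) -> Vec<Range<usize>> {
    assert!(max_chunk_len > 0, "max_chunk_len must be non-zero");
    let mut ranges = Vec::new();
    let mut start = 0;
    while start < len {
        // `end` never exceeds `len`, so this addition cannot overflow.
        let end = start + (len - start).min(max_chunk_len);
        ranges.push(start..end);
        start = end;
    }
    ranges
}
```

A receiver limited to i32 lengths could be served by calling
chunk_ranges(len, i32::MAX as usize) and slicing the array once per
returned range, while a 64K chunk size would just pass 65_536 instead.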

- Wes
On Thu, Dec 6, 2018 at 8:20 PM paddy horan <paddyhoran@xxxxxxxxxxx> wrote:
>
> All,
>
> As part of the PR for ARROW-3347 there was a discussion regarding the type that should be used for anything that measures the length of an array, i.e.  len and capacity.
>
> The result of this discussion was that the Rust implementation should switch to using usize as the type for representing len and capacity.  This would mean supporting a way to split larger arrays into smaller arrays when passing data from one implementation to another.  The exact size of these smaller arrays would depend on the implementation you are passing data to.  C++ supports array lengths up to the i64 maximum, but **all** implementations support lengths up to the i32 maximum as specified by the spec.  The full discussion is here:
> https://github.com/apache/arrow/pull/2858
>
> This is not a major change so I’ll push it to 0.13, but I wanted to open up the discussion before making the change, since the previous debate was hidden in a PR.  In particular, Andy and Chao, are you in favor of this change?
>
> Paddy