[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [DISCUSS] Developing a standard memory layout for in-memory records / "row-oriented" data


> I'm not sure this makes sense as an external stable api. I definitely think it is useful as an internal representation for use within a particular algorithm. I also think that can be informed by the particular algorithm that you're working on.

I agree that this is definitely needed in certain algorithms, e.g.
certain types of hashing. And the memory layout that's best for a
given algorithm might change. Since we have a number of support
structures already in place for columnar data, like dictionary
encoding, it would be easier to implement these things for
row-oriented data.

I think the question is really about open standards. Our original
focus when we started the project was to develop an open standard for
columnar data. It seems valuable to have one for row-oriented data.
Given how many systems have developed their own internal formats, it
seems like an inevitability. This begs the questions: if it does not
happen here, then where? and if not now, then when?

That being said, it's hard to say how feasible the project would be
until we gather more requirements and non-requirements.

On Wed, Jun 27, 2018 at 3:20 AM, Siddharth Teotia <siddharth@xxxxxxxxxx> wrote:
> I am wondering if this can be considered as an opportunity to implement
> support in Arrow for building high performance in-memory row stores for low
> latency and high throughput key based queries. In other words, we can
> design the in-memory record format keeping efficient RDMA reads as one of
> the goals too. Consider two data structures in memory -- a  hash table and
> a row-store comprising of records in Arrow row format. Hashtable points to
> row store and information can be read from both data structures without
> interrupting the CPU on server. This client-server code-path support can
> also be incorporated into Arrow Flight
>
> On Tue, Jun 26, 2018 at 7:49 PM, Jacques Nadeau <jacques@xxxxxxxxxx> wrote:
>
>> I'm not sure this makes sense as an external stable api. I definitely think
>> it is useful as an internal representation for use within a particular
>> algorithm. I also think that can be informed by the particular algorithm
>> that you're working on.
>>
>> We definitely had this requirement in Dremio and came up with an internal
>> representation that we are happy with for the use in hash tables. I'll try
>> to dig up the design docs we had around this but the actual
>> pivoting/unpivoting code that we developed can be seen here: [1], [2].
>>
>> Our main model is two blocks: a fixed width block and a variable width
>> block (with the fixed width block also carrying address & length of the
>> variable data). Fixed width is randomly accessible and variable width is
>> randomly accessible through fixed width.
>>
>> [1]
>> https://github.com/dremio/dremio-oss/blob/master/sabot/
>> kernel/src/main/java/com/dremio/sabot/op/common/ht2/Pivots.java
>> [2]
>> https://github.com/dremio/dremio-oss/blob/master/sabot/
>> kernel/src/main/java/com/dremio/sabot/op/common/ht2/Unpivots.java
>>
>> On Tue, Jun 26, 2018 at 10:20 AM, Wes McKinney <wesmckinn@xxxxxxxxx>
>> wrote:
>>
>> > hi Antoine,
>> >
>> > On Sun, Jun 24, 2018 at 1:06 PM, Antoine Pitrou <antoine@xxxxxxxxxx>
>> > wrote:
>> > >
>> > > Hi Wes,
>> > >
>> > > Le 24/06/2018 à 08:24, Wes McKinney a écrit :
>> > >>
>> > >> If this sounds interesting to the community, I could help to kickstart
>> > >> a design process which would likely take a significant amount of time.
>> > >> The requirements could be complex (i.e. we might want to support
>> > >> variable-size record fields while also providing random access
>> > >> guarantees).
>> > >
>> > > What do you call "variable-sized" here? A scheme where the length of a
>> > > record's field is determined by the value of another field in the same
>> > > record?
>> >
>> > As an example, here is a fixed size record
>> >
>> > record foo {
>> >   a: int32;
>> >   b: float64;
>> >   c: uint8;
>> > }
>> >
>> > With padding suppose this is 16 bytes per record; so if we have a
>> > column of these, then random accessing any value in any record is
>> > simple.
>> >
>> > Here's a variable-length record:
>> >
>> > record bar {
>> >   a: string;
>> >   b: list<int32>;
>> > }
>> >
>> > What I've seen done to represent this in memory is to have a fixed
>> > size record followed by a sidecar containing the variable-length data,
>> > so the fixed size portion might look something like
>> >
>> > a_offset: int32;
>> > a_length: int32;
>> > b_offset: int32;
>> > b_length: int32;
>> >
>> > So from this, you can do random access into the record. If you wanted
>> > to do random access on a _column_ of such records, it is similar to
>> > our current variable-length Binary type. So it might be that the
>> > underlying Arrow memory layout would be FixedSizeBinary for fixed-size
>> > records and variable Binary for variable-size records.
>> >
>> > - Wes
>> >
>> > >
>> > >
>> > >
>> > > Regards
>> > >
>> > > Antoine.
>> >
>>