I started this branch to clean up the Vector Data classes to make it easier to add higher-level Table and Vector operators, but as the Data classes are fairly embedded in the core, it led to a larger refactor of the DataTypes, Vectors, Visitors, and IPC readers and writers.
While I was updating the IPC readers and writers, I took the opportunity to back-port all the Node and WhatWG (browser) streams integration that we've built for Graphistry. Putting it in the Arrow JS library lets us better ensure zero-copy when possible, empowers library consumers to easily build streaming applications in both server and browser environments, and (selfishly) reduces complexity in my code base. It also advances a longer-term personal goal to more closely adhere to the structure and organization of ArrowCPP when reasonable.
A non-exhaustive list of updates includes:
* Updates the Table, Schema, RecordBatch, Visitor, Vector, Data, and DataTypes to ensure the generic type signatures cascade recursively through the type declarations
* New io primitives that abstract over the (mutually exclusive) file and stream APIs in both node and browser environments
* New RecordBatchReaders and RecordBatchWriters that directly use the zero-copy node and browser io primitives
* A consolidated reflective Visitor implementation that supports late binding to shortcut traversal, provides an easy API for building higher level Vector operators
* Fixed bugs/added support for reading and writing DictionaryBatch deltas (tricky)
* Updated all the dependencies and did some config file gardening to make debugging tests easier
* Added a bunch of new tests
I'd be more than happy to help shepherd a 0.4.0 release of what's in arrow/master if that's what everyone wants to do. But in the interest of cutting a more feature-rich release and preventing customers paying the cost of updating twice in a short time span, I vote we hold off for another day or two and merge + release the work in the refactor branch.
Paul
On 12/9/18 10:51 AM, Wes McKinney wrote: