osdir.com

Re: Timeline for Arrow 0.12.0 release


hi Andy,

I can see an argument for having some basic native function kernel
support in Rust. One of the things that Gandiva has begun is a
Protobuf-based serialized representation of projection
and filter expressions. In the long run I would like to see a more
complete relational algebra / logical query plan that can be submitted
for execution. There are complexities, though, such as bridging
iteration of data sources written in Rust, say, with a query engine
written in C++. You would need to provide some kind of a callback
mechanism for the query engine to request the next chunk of a dataset
to be materialized.
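
As a rough illustration of the callback idea above, here is a minimal
sketch in plain Rust. Every name in it (`Chunk`, `sum_all`, `RangeSource`)
is hypothetical and not part of Arrow or Gandiva. The `sum_all` function
stands in for a query engine that pulls data on demand, and a `Vec<i64>`
stands in for a record batch. A real bridge to a C++ engine would
presumably go through an FFI boundary, which this sketch does not attempt.

```rust
// Hypothetical sketch: none of these names come from Arrow or Gandiva.

/// A chunk of a dataset. Assumption: a real engine would exchange Arrow
/// record batches here rather than a plain vector.
pub type Chunk = Vec<i64>;

/// Stand-in for a query engine that pulls its input on demand. It invokes
/// the `next_chunk` callback whenever it needs more data and stops once the
/// callback returns `None`.
pub fn sum_all<F>(mut next_chunk: F) -> i64
where
    F: FnMut() -> Option<Chunk>,
{
    let mut total = 0i64;
    while let Some(chunk) = next_chunk() {
        total += chunk.iter().sum::<i64>();
    }
    total
}

/// Stand-in for a data source written in Rust that materializes
/// chunks lazily, only when asked.
pub struct RangeSource {
    next: i64,
    end: i64,
    chunk_len: i64,
}

impl RangeSource {
    /// Covers the half-open range `start..end`. A chunk length below 1 is
    /// raised to 1 so that every call makes progress.
    pub fn new(start: i64, end: i64, chunk_len: i64) -> Self {
        RangeSource { next: start, end, chunk_len: chunk_len.max(1) }
    }

    /// Produce the next chunk, or `None` once the range is exhausted.
    pub fn next_chunk(&mut self) -> Option<Chunk> {
        if self.next >= self.end {
            return None;
        }
        let stop = (self.next + self.chunk_len).min(self.end);
        let chunk: Chunk = (self.next..stop).collect();
        self.next = stop;
        Some(chunk)
    }
}
```

With `let mut source = RangeSource::new(0, 10, 4);`, the call
`sum_all(|| source.next_chunk())` drives the source through three chunks
and only ever holds one of them in memory at a time.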

It will be interesting to see what contributors will be motivated
enough to build over the next few years. At the end of the day, Apache
projects are do-ocracies.

- Wes
On Fri, Dec 7, 2018 at 6:22 AM Andy Grove <andygrove73@xxxxxxxxx> wrote:
>
> I've added one PR to the list (https://github.com/apache/arrow/pull/3119)
> to update the project to use Rust 2018 Edition.
>
> I'm also considering removing one PR from the list and would like to get
> opinions here.
>
> I have a PR (https://github.com/apache/arrow/pull/3033) to add some basic
> math and comparison operators to primitive arrays. These are baby steps
> towards implementing more query execution capabilities such as projection,
> selection, etc., but Chao made a good point that other Rust implementations
> don't have these kinds of capabilities and I am now wondering if this is a
> distraction. We already have Gandiva and the new efforts in Ursa Labs and
> it would probably make more sense to look at having Rust bindings for the
> query execution capabilities there rather than having a competing (and less
> capable) implementation in Rust.
>
> Thoughts?
>
> Andy.
>
> On Thu, Dec 6, 2018 at 8:42 PM paddy horan <paddyhoran@xxxxxxxxxxx> wrote:
>
> > Other than Andy’s PR below I’m going to try and find time to work on
> > ARROW-3827, I’ll bump it to 0.13 if I can’t find the time early next week.
> > There is nothing else in the 0.12 backlog for Rust.  It would be nice to
> > get the parquet merge in though.
> >
> >
> >
> > Paddy
> >
> >
> >
> > ________________________________
> > From: Andy Grove <andygrove73@xxxxxxxxx>
> > Sent: Thursday, December 6, 2018 10:20:48 AM
> > To: dev@xxxxxxxxxxxxxxxx
> > Subject: Re: Timeline for Arrow 0.12.0 release
> >
> > I have PRs pending for all the Rust issues that I want to get into 0.12.0
> > and would appreciate some reviews so I can go ahead and merge:
> >
> > https://github.com/apache/arrow/pull/3033 (covers ARROW-3880 and ARROW-3881
> > - add math and comparison operations to primitive arrays)
> > https://github.com/apache/arrow/pull/3096 (ARROW-3885 - Rust release
> > process)
> > https://github.com/apache/arrow/pull/3111 (ARROW-3838 - CSV Writer)
> >
> > With these in place I plan on writing a tutorial for reading a CSV file,
> > performing some operations on primitive arrays and writing the output to a
> > new CSV file.
> >
> > I am deferring ARROW-3882 (casting for primitive arrays) to 0.13.0
> >
> > Thanks,
> >
> > Andy.
> >
> > On Tue, Dec 4, 2018 at 7:57 PM Andy Grove <andygrove73@xxxxxxxxx> wrote:
> >
> > > I'd love to tackle the three related issues for supporting simple
> > > math/comparison operations on primitive arrays and casting primitive
> > > arrays, but since the change to use the Rust specialization feature I'm a
> > > bit stuck and need some assistance applying the math operations to the
> > > numeric types and not the boolean primitives. I have added a comment to
> > > https://github.com/apache/arrow/pull/3033 ... if I can get help solving
> > > for this PR then I should be able to handle the others. I'll also do some
> > > research and try to figure this out myself.
> > >
> > > Andy.
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Tue, Dec 4, 2018 at 7:03 PM Wes McKinney <wesmckinn@xxxxxxxxx> wrote:
> > >
> > >> Andy, Paddy, or other Rust developers -- could you review the 6 issues
> > >> in TODO in the 0.12 backlog and either assign them or move them to the
> > >> next release if they aren't going to be completed this week or next?
> > >>
> > >>
> > >> On Fri, Nov 30, 2018 at 4:34 PM Wes McKinney <wesmckinn@xxxxxxxxx> wrote:
> > >> >
> > >> > hi folks,
> > >> >
> > >> > Tomorrow is December 1. The last major Arrow release (0.11.0) took
> > >> > place on October 8. Given how much work has happened in the project in
> > >> > the last ~2 months, I think it would be great to complete the next
> > >> > major release before the end-of-year holidays set in.
> > >> >
> > >> > I've been curating the JIRA backlog the last couple of weeks, and have
> > >> > just created a 0.12.0 release wiki page to help us stay organized
> > >> >
> > >> >
> > >> > https://cwiki.apache.org/confluence/display/ARROW/Arrow+0.12.0+Release
> > >> >
> > >> > Given that there are only 3 full working weeks between now and
> > >> > Christmas, I think we should be in a position to cut a release by the
> > >> > end of the week of December 10, i.e. by Friday December 14. Not all of
> > >> > the TODO issues have to be completed to make the release, but it would
> > >> > be good to push to complete as much as possible. Please help by
> > >> > reviewing the backlog, and if possible, assigning issues to yourself
> > >> > that you'd like to pursue in the next 2 weeks.
> > >> >
> > >> > Let me know if this sounds reasonable, or if you have any concerns.
> > >> >
> > >> > Thanks
> > >> > Wes
> > >>
> > >
> >