Re: Timeline for Arrow 0.12.0 release


I agree that we should aim for time-based releases. Let's discuss a
time-based release schedule (my preference would be ~every 2 months)
for 2019 after we get 0.12 out.

On Wed, Dec 12, 2018 at 3:15 AM Antoine Pitrou <antoine@xxxxxxxxxx> wrote:
>
>
> I think we should aim for time-based releases in general (rather than a
> specific set of features), but delaying this one sounds good to me.
>
> Regards
>
> Antoine.
>
>
> On 12/12/2018 at 01:34, Wes McKinney wrote:
> > hi all,
> >
> > I'm looking at the 0.12 backlog and I am not too comfortable with the
> > things that would have to be cut to get a release out next week.
> > Additionally, not a lot of developers are going to be working the week
> > of December 24 because of the Christmas and New Year's holidays, so
> > even if we did release, it might not get seen by a lot of people until
> > after the New Year.
> >
> > Based on this, I would suggest we push to complete as much work as
> > possible (from the 0.12 backlog and beyond) by the end of the year,
> > and release as soon as possible in 2019. Of course, anyone is welcome
> > to contribute work that is not found in the 0.12 milestone =)
> >
> > Any objections?
> >
> > Thanks
> > Wes
> >
> > On Mon, Dec 10, 2018 at 8:04 AM Andy Grove <andygrove73@xxxxxxxxx> wrote:
> >>
> >> Cool. I will continue to add primitive operations but I am now adding this
> >> in a separate source file to keep it separate from the core array code.
> >>
> >> I'm not sure how important it will be to support Rust data sources with
> >> Gandiva. I can see that each language should be able to construct the
> >> logical query plan to submit to Gandiva and let Gandiva handle execution. I
> >> think the more interesting part is how do we support language-specific
> >> lambda functions as part of that logical query plan. Maybe it is possible
> >> to compile the lambda down to LLVM (I haven't started learning about LLVM
> >> in detail yet so this is wild speculation on my part). Another option is
> >> for Gandiva to support calling into shared libraries and that maybe is
> >> simpler for languages that support building C-native shared libraries (Rust
> >> supports this with zero overhead).
> >>
> >> Andy.
> >>
> >>
> >>
> >>
> >> On Sun, Dec 9, 2018 at 11:42 AM Wes McKinney <wesmckinn@xxxxxxxxx> wrote:
> >>
> >>> hi Andy,
> >>>
> >>> I can see an argument for having some basic native function kernel
> >>> support in Rust. One of the things that Gandiva has begun is a
> >>> Protobuf-based serialized representation of projection
> >>> and filter expressions. In the long run I would like to see a more
> >>> complete relational algebra / logical query plan that can be submitted
> >>> for execution. There are complexities, though, such as bridging
> >>> iteration of data sources written in Rust, say, with a query engine
> >>> written in C++. You would need to provide some kind of a callback
> >>> mechanism for the query engine to request the next chunk of a dataset
> >>> to be materialized.
> >>>
> >>> It will be interesting to see what contributors will be motivated
> >>> enough to build over the next few years. At the end of the day, Apache
> >>> projects are do-ocracies.
> >>>
> >>> - Wes
> >>>
> >>> On Fri, Dec 7, 2018 at 6:22 AM Andy Grove <andygrove73@xxxxxxxxx> wrote:
> >>>>
> >>>> I've added one PR to the list (https://github.com/apache/arrow/pull/3119)
> >>>> to update the project to use Rust 2018 Edition.
> >>>>
> >>>> I'm also considering removing one PR from the list and would like to get
> >>>> opinions here.
> >>>>
> >>>> I have a PR (https://github.com/apache/arrow/pull/3033) to add some basic
> >>>> math and comparison operators to primitive arrays. These are baby steps
> >>>> towards implementing more query execution capabilities such as projection,
> >>>> selection, etc., but Chao made a good point that other Rust implementations
> >>>> don't have these kinds of capabilities and I am now wondering if this is a
> >>>> distraction. We already have Gandiva and the new efforts in Ursa Labs and
> >>>> it would probably make more sense to look at having Rust bindings for the
> >>>> query execution capabilities there rather than having a competing (and less
> >>>> capable) implementation in Rust.
> >>>>
> >>>> Thoughts?
> >>>>
> >>>> Andy.
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On Thu, Dec 6, 2018 at 8:42 PM paddy horan <paddyhoran@xxxxxxxxxxx> wrote:
> >>>>
> >>>>> Other than Andy’s PR below I’m going to try and find time to work on
> >>>>> ARROW-3827; I’ll bump it to 0.13 if I can’t find the time early next week.
> >>>>> There is nothing else in the 0.12 backlog for Rust.  It would be nice to
> >>>>> get the Parquet merge in though.
> >>>>>
> >>>>>
> >>>>>
> >>>>> Paddy
> >>>>>
> >>>>>
> >>>>>
> >>>>> ________________________________
> >>>>> From: Andy Grove <andygrove73@xxxxxxxxx>
> >>>>> Sent: Thursday, December 6, 2018 10:20:48 AM
> >>>>> To: dev@xxxxxxxxxxxxxxxx
> >>>>> Subject: Re: Timeline for Arrow 0.12.0 release
> >>>>>
> >>>>> I have PRs pending for all the Rust issues that I want to get into 0.12.0
> >>>>> and would appreciate some reviews so I can go ahead and merge:
> >>>>>
> >>>>> https://github.com/apache/arrow/pull/3033 (covers ARROW-3880 and ARROW-3881
> >>>>> - add math and comparison operations to primitive arrays)
> >>>>> https://github.com/apache/arrow/pull/3096 (ARROW-3885 - Rust release process)
> >>>>> https://github.com/apache/arrow/pull/3111 (ARROW-3838 - CSV Writer)
> >>>>>
> >>>>> With these in place I plan on writing a tutorial for reading a CSV file,
> >>>>> performing some operations on primitive arrays and writing the output to a
> >>>>> new CSV file.
> >>>>>
> >>>>> I am deferring ARROW-3882 (casting for primitive arrays) to 0.13.0
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>> Andy.
> >>>>>
> >>>>> On Tue, Dec 4, 2018 at 7:57 PM Andy Grove <andygrove73@xxxxxxxxx> wrote:
> >>>>>
> >>>>>> I'd love to tackle the three related issues for supporting simple
> >>>>>> math/comparison operations on primitive arrays and casting primitive arrays,
> >>>>>> but since the change to use Rust's specialization feature I'm a bit stuck and
> >>>>>> need some assistance applying the math operations to the numeric types and
> >>>>>> not the boolean primitives. I have added a comment to
> >>>>>> https://github.com/apache/arrow/pull/3033 ... if I can get help solving
> >>>>>> for this PR then I should be able to handle the others. I'll also do some
> >>>>>> research and try and figure this out myself.
> >>>>>>
> >>>>>> Andy.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Tue, Dec 4, 2018 at 7:03 PM Wes McKinney <wesmckinn@xxxxxxxxx> wrote:
> >>>>>>
> >>>>>>> Andy, Paddy, or other Rust developers -- could you review the 6 issues
> >>>>>>> in TODO in the 0.12 backlog and either assign them or move them to the
> >>>>>>> next release if they aren't going to be completed this week or next?
> >>>>>>>
> >>>>>>>
> >>>>>>> On Fri, Nov 30, 2018 at 4:34 PM Wes McKinney <wesmckinn@xxxxxxxxx> wrote:
> >>>>>>>>
> >>>>>>>> hi folks,
> >>>>>>>>
> >>>>>>>> Tomorrow is December 1. The last major Arrow release (0.11.0) took
> >>>>>>>> place on October 8. Given how much work has happened in the project in
> >>>>>>>> the last ~2 months, I think it would be great to complete the next
> >>>>>>>> major release before the end-of-year holidays set in.
> >>>>>>>>
> >>>>>>>> I've been curating the JIRA backlog the last couple of weeks, and have
> >>>>>>>> just created a 0.12.0 release wiki page to help us stay organized:
> >>>>>>>>
> >>>>>>>> https://cwiki.apache.org/confluence/display/ARROW/Arrow+0.12.0+Release
> >>>>>>>>
> >>>>>>>> Given that there are only 3 full working weeks between now and
> >>>>>>>> Christmas, I think we should be in position to cut a release by the
> >>>>>>>> end of the week of December 10, i.e. by Friday December 14. Not all of
> >>>>>>>> the TODO issues have to be completed to make the release, but it would
> >>>>>>>> be good to push to complete as much as possible. Please help by
> >>>>>>>> reviewing the backlog, and if possible, assigning issues to yourself
> >>>>>>>> that you'd like to pursue in the next 2 weeks.
> >>>>>>>>
> >>>>>>>> Let me know if this sounds reasonable, or if you have any concerns.
> >>>>>>>>
> >>>>>>>> Thanks
> >>>>>>>> Wes
> >>>>>>>
> >>>>>>
> >>>>>
> >>>