Re: Gandiva Initiative


This is really exciting, thanks a lot for sharing!

In case anybody wants to try this out from Python, I wrote up some Cython
bindings (very limited so far, but they can already be used to construct
some computation graphs and do some benchmarks):
https://github.com/apache/arrow/pull/2153

They are developed in the Arrow repo for now. It would be great if we could
find a good way to integrate the two projects and build systems
seamlessly (for example, setting up a Cython environment in the Gandiva repo
in a way that interoperates well with PyArrow would be hard right now).

-- Philipp.

On Thu, Jun 21, 2018 at 4:26 PM, Wes McKinney <wesmckinn@xxxxxxxxx> wrote:

> hi Jacques,
>
> This is very exciting! LLVM codegen for Arrow has been on my wishlist
> since the early days of the project. I always considered it more of a
> "when" question than an "if".
>
> I will take a closer look at the codebase to make some comments, but
> my biggest initial question is whether we could work to make Gandiva
> the official community-supported LLVM framework for creating
> JIT-compiled Arrow kernels. In the Ursa Labs (a new lab I am building
> to focus 90+% on Apache Arrow development) tech roadmap we discussed
> the need for a subgraph compiler using LLVM:
> https://ursalabs.org/tech/#subgraph-compilation-code-generation.
>
> I would be interested in getting involved in the project, and I
> expect in time many others will, as well. An obvious question would be
> whether you would be interested in donating the project to Apache
> Arrow and continuing the work there. We would benefit from common
> build, testing/CI, and packaging/deployment infrastructure. I'm keen
> to see JIT-powered predicate pushdown in Parquet files, for example.
> Philipp and I could look into building a Gandiva backend for compiling
> a subset of expressions originating from Ibis, a lazy-evaluation DSL
> system with an API similar to pandas
> (https://github.com/ibis-project/ibis).
>
> best
> Wes
>
> On Thu, Jun 21, 2018 at 4:13 PM, Dimitri Vorona
> <alendit@xxxxxxxxxxxxxx.invalid> wrote:
> > Hey Jacques,
> >
> > Great stuff! I'm actually researching the integration of arrow and flight
> > into a main memory database which also uses LLVM for dynamic query
> > generation! Excited to have a more detailed look at Gandiva!
> >
> > Cheers,
> > Dimitri.
> >
> > On Thu, Jun 21, 2018, 21:15 Jacques Nadeau <jacques@xxxxxxxxxx> wrote:
> >
> >> Hey Guys,
> >>
> >> Dremio just open sourced a new framework for processing data in Arrow data
> >> structures [1], built on top of the Apache Arrow C++ APIs and leveraging
> >> LLVM (Apache licensed). It also includes Java APIs that leverage the Apache
> >> Arrow Java libraries. I expect the developers who have been working on this
> >> will introduce themselves soon. To read more about it, take a look at our
> >> Ravindra's blog post (he's the lead developer driving this work): [2].
> >> Hopefully people will find this interesting/useful.
> >>
> >> Let us know what you all think!
> >>
> >> thanks,
> >> Jacques
> >>
> >>
> >> [1] https://github.com/dremio/gandiva
> >> [2] https://www.dremio.com/announcing-gandiva-initiative-for-apache-arrow/
> >>
>