Re: Gandiva Initiative
This is really exciting, thanks a lot for sharing!
In case anybody wants to try this out from Python, I wrote up some Cython
bindings (very limited so far, but they can already be used to construct
some computation graphs and do some benchmarks):
They are developed in the Arrow repo for now; it would be great if we could
find a good solution to integrate the two projects and build systems
seamlessly (for example setting up a Cython environment in the Gandiva repo
in a way that interoperates well with PyArrow would be hard right now).
On Thu, Jun 21, 2018 at 4:26 PM, Wes McKinney <wesmckinn@xxxxxxxxx> wrote:
> hi Jacques,
> This is very exciting! LLVM codegen for Arrow has been on my wishlist
> since the early days of the project. I always considered it more of a
> "when" question than an "if".
> I will take a closer look at the codebase to make some comments, but
> my biggest initial question is whether we could work to make Gandiva
> the official community-supported LLVM framework for creating
> JIT-compiled Arrow kernels. In the Ursa Labs (a new lab I am building
> to focus 90+% on Apache Arrow development) tech roadmap we discussed
> the need for a subgraph compiler using LLVM:
> I would be interested in getting involved in the project, and I
> expect in time many others will, as well. An obvious question would be
> whether you would be interested in donating the project to Apache
> Arrow and continuing the work there. We would benefit from common
> build, testing/CI, and packaging/deployment infrastructure. I'm keen
> to see JIT-powered predicate pushdown in Parquet files, for example.
> Phillip and I could look into building a Gandiva backend for compiling
> a subset of expressions originating from Ibis, a lazy-evaluation DSL
> system with an API similar to pandas.
> On Thu, Jun 21, 2018 at 4:13 PM, Dimitri Vorona
> <firstname.lastname@example.org> wrote:
> > Hey Jacques,
> > Great stuff! I'm actually researching the integration of arrow and flight
> > into a main memory database which also uses LLVM for dynamic query
> > generation! Excited to have a more detailed look at Gandiva!
> > Cheers,
> > Dimitri.
> > On Thu, Jun 21, 2018, 21:15 Jacques Nadeau <jacques@xxxxxxxxxx> wrote:
> >> Hey Guys,
> >> Dremio just open sourced a new framework for processing data in Arrow
> >> structures, built on top of the Apache Arrow C++ APIs and leveraging
> >> LLVM (Apache licensed). It also includes Java APIs that leverage the
> >> Arrow Java libraries. I expect the developers who have been working on
> >> it will introduce themselves soon. To read more about it, take a look at
> >> Ravindra's blog post (he's the lead developer driving this work).
> >> Hopefully people will find this interesting/useful.
> >> Let us know what you all think!
> >> thanks,
> >> Jacques
> >>  https://github.com/dremio/gandiva
> >>  https://www.dremio.com/announcing-gandiva-initiative-