Re: Gandiva Initiative


Hello Antoine,

the LLVM API is an interesting point. I've been using PyArrow and Numba together quite a bit, and this would definitely clash. A quick Google search did not reveal any workaround for this issue. In the other cases where we have such clashes, Boost and jemalloc, the library itself already provides the infrastructure to vendor it with a private namespace. LLVM does not seem to have such infrastructure.

From my experience, the maintainers of llvmlite (the Python package for LLVM which Numba uses) have been quite quick in updating to new LLVM versions. I would expect that we would also update quite frequently. Thus the newest releases would work nicely together (assuming that we finally get the infrastructure for monthly releases running). The problematic situation would be when the user has two Python packages installed that were built against differing LLVM versions. In the case of conda this would probably be detected at the package manager level, but with pip we would probably face the problems only at execution time.

Uwe

On Sun, Jun 24, 2018, at 7:02 PM, Antoine Pitrou wrote:
> 
> Hi,
> 
> I think JIT-compiling of kernels operating on Arrow data is an important
> development path, but just for the record, LLVM doesn't have a stable
> C++ API (the API changes at each feature release).  Just something to
> keep in mind for the ensuing packaging discussions ;-)
> 
> (it also raises interesting questions such as "what happens if a user
> wants to use both PyArrow and Numba in a given process, and they don't
> target the same LLVM API version")
> 
> Regards
> 
> Antoine.
> 
> 
> Le 22/06/2018 à 01:26, Wes McKinney a écrit :
> > hi Jacques,
> > 
> > This is very exciting! LLVM codegen for Arrow has been on my wishlist
> > since the early days of the project. I always considered it more of a
> > "when" question than an "if".
> > 
> > I will take a closer look at the codebase to make some comments, but
> > my biggest initial question is whether we could work to make Gandiva
> > the official community-supported LLVM framework for creating
> > JIT-compiled Arrow kernels. In the Ursa Labs (a new lab I am building
> > to focus 90+% on Apache Arrow development) tech roadmap we discussed
> > the need for a subgraph compiler using LLVM:
> > https://ursalabs.org/tech/#subgraph-compilation-code-generation.
> > 
> > I would be interested in getting involved in the project, and I
> > expect in time many others will, as well. An obvious question would be
> > whether you would be interested in donating the project to Apache
> > Arrow and continuing the work there. We would benefit from common
> > build, testing/CI, and packaging/deployment infrastructure. I'm keen
> > to see JIT-powered predicate pushdown in Parquet files, for example.
> > Phillip and I could look into building a Gandiva backend for compiling
> > a subset of expressions originating from Ibis, a lazy-evaluation DSL
> > system with a similar API to pandas
> > (https://github.com/ibis-project/ibis).
> > 
> > best
> > Wes
> > 
> > On Thu, Jun 21, 2018 at 4:13 PM, Dimitri Vorona
> > <alendit@xxxxxxxxxxxxxx.invalid> wrote:
> >> Hey Jacques,
> >>
> >> Great stuff! I'm actually researching the integration of arrow and flight
> >> into a main memory database which also uses LLVM for dynamic query
> >> generation! Excited to have a more detailed look at Gandiva!
> >>
> >> Cheers,
> >> Dimitri.
> >>
> >> On Thu, Jun 21, 2018, 21:15 Jacques Nadeau <jacques@xxxxxxxxxx> wrote:
> >>
> >>> Hey Guys,
> >>>
> >>> Dremio just open sourced a new framework for processing data in Arrow data
> >>> structures [1], built on top of the Apache Arrow C++ APIs and leveraging
> >>> LLVM (Apache licensed). It also includes Java APIs that leverage the Apache
> >>> Arrow Java libraries. I expect the developers who have been working on this
> >>> will introduce themselves soon. To read more about it, take a look at
> >>> Ravindra's blog post (he's the lead developer driving this work): [2].
> >>> Hopefully people will find this interesting/useful.
> >>>
> >>> Let us know what you all think!
> >>>
> >>> thanks,
> >>> Jacques
> >>>
> >>>
> >>> [1] https://github.com/dremio/gandiva
> >>> [2] https://www.dremio.com/announcing-gandiva-initiative-for-apache-arrow/
> >>>