Re: Algorithmic explorations of bitmaps vs. sentinel values
On 16/10/2018 at 14:05, Wes McKinney wrote:
> hi folks,
> I explored a bit the performance implications of using validity
> bitmaps (like the Arrow columnar format) vs. sentinel values (like
> NaN, INT32_MIN) for nulls:
> The vectorization results may be of interest to those implementing
> analytic functions targeting the Arrow memory format. There are probably
> some other optimizations that can be employed, too.
This is a nice write-up. It may also be possible to speed things up
further using explicit SIMD operations.
For the non-null case, it should be relatively doable, see e.g.
For the with-nulls case, it might be possible to do something with SIMD
masks, but I'm not competent to propose anything concrete :-)
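
To make the two null representations concrete, here is a minimal sketch
of a sum kernel for each. It is not taken from the write-up: the names
SumSentinel, SumBitmap and kNullSentinel are hypothetical, INT32_MIN as
the sentinel and LSB-first bit order in the bitmap are assumptions, and
no explicit SIMD intrinsics are used. The loops are written branch-free
(multiplying by a 0/1 validity flag) in the hope that the compiler can
auto-vectorize them.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

// Assumption for this sketch: INT32_MIN marks a null slot.
constexpr int32_t kNullSentinel = std::numeric_limits<int32_t>::min();

// Sentinel version: compare each value against the sentinel and
// multiply by the resulting 0/1 flag instead of branching.
int64_t SumSentinel(const int32_t* values, size_t length) {
  int64_t total = 0;
  for (size_t i = 0; i < length; ++i) {
    const int64_t is_valid = values[i] != kNullSentinel;
    total += is_valid * static_cast<int64_t>(values[i]);
  }
  return total;
}

// Bitmap version: bit i of valid_bits (LSB-first within each byte,
// an assumption here) is 1 when values[i] is non-null.
int64_t SumBitmap(const int32_t* values, const uint8_t* valid_bits,
                  size_t length) {
  int64_t total = 0;
  const size_t full_bytes = length / 8;
  for (size_t b = 0; b < full_bytes; ++b) {
    const uint8_t bits = valid_bits[b];
    const int32_t* chunk = values + b * 8;
    if (bits == 0xFF) {
      // Fast path: all 8 values are valid, so no masking is needed.
      for (int j = 0; j < 8; ++j) total += chunk[j];
    } else {
      // Mixed path: mask each value with its validity bit.
      for (int j = 0; j < 8; ++j) {
        total += static_cast<int64_t>((bits >> j) & 1) * chunk[j];
      }
    }
  }
  // Tail: the remaining (length % 8) values, one bit at a time.
  for (size_t i = full_bytes * 8; i < length; ++i) {
    total += static_cast<int64_t>((valid_bits[i / 8] >> (i % 8)) & 1) *
             values[i];
  }
  return total;
}
```

The all-valid fast path in SumBitmap is the "non-null case" reduced to a
plain loop, which is where explicit SIMD would be easiest to apply. The
mixed path is where SIMD masks would come in.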
> Caveat: it's entirely possible I made some mistakes in my code. I
> checked the various implementations for correctness only, and did not
> dig too deeply beyond that.
> - Wes