[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Migrating Beam SQL to Calcite's code generation


That seems pretty amazing.

On Mon, Sep 17, 2018 at 3:22 PM Mingmin Xu <mingmxus@xxxxxxxxx> wrote:
Awesome work, we should call Calcite operator functions if available.

I haven't get time to read the PR yet, for those impacted would keep existing implementation. One example is, I notice FLOOR/CEIL only supports months/years recently which is quite a surprise to me.

Mingmin

On Mon, Sep 17, 2018 at 3:03 PM Anton Kedin <kedin@xxxxxxxxxx> wrote:
This is pretty amazing! Thank you for doing this!

Regards,
Anton

On Mon, Sep 17, 2018 at 2:27 PM Andrew Pilloud <apilloud@xxxxxxxxxx> wrote:
I've adapted Calcite's EnumerableCalc code generation to generate the BeamCalc DoFn. The primary purpose behind this change is so we can take advantage of Calcite's extensive SQL operator implementation. This deletes ~11000 lines of code from Beam (with ~350 added), significantly increases the set of supported SQL operators, and improves performance and correctness of currently supported operators. Here is my work in progress: https://github.com/apache/beam/pull/6417

There are a few bugs in Calcite that this has exposed:

Fixed in Calcite master:
  • CALCITE-2321 - The type of a union of CHAR columns of different lengths should be VARCHAR
  • CALCITE-2447 - Some POWER, ATAN2 functions fail with NoSuchMethodException
Pending PRs:
  • CALCITE-2529 - linq4j should promote integer to floating point when generating function calls
  • CALCITE-2530 - TRIM function does not throw exception when the length of trim character is not 1(one)
More work:
  • CALCITE-2404 - Accessing structured-types is not implemented by the runtime
  • (none yet) - Support multi character TRIM extension in Calcite
I would like to push these changes in with these minor regressions. Do any of these Calcite bugs block this functionality being adding to Beam?

Andrew


--
----
Mingmin