[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: TableFunctions/TableMacros and Cursors?


Thanks.  This is exactly what we're hoping to use Calcite for --
essentially use SQL generate a list of Primary/Secondary Keys to access a
massive data lake and return the subset as a virtualized table.  Had
started on an approach building an adapter with push-downs on joins so that
we could have a table that was generated by being the right-hand side of a
join, but was getting tripped up on getting the left hand results into the
TableScan.  Finding your 2014 blog post on TableMacros made me fopeful we
could find a simpler/more elegant solution through table functions.

In terms of what's here, the validator correctly gets that the argument
being passed into the function (the fromType) is CURSOR, it just doesn't
have a registered type for Enumerable/Queryable in nodeToTypeMap to
determine that the method parameter should also be CURSOR.  I'll keep
digging down this rode.  Hoping that it might be as simple as hooking into
SqlValidatorImpl and calling setValidatedNodeType.

If I get anywhere with this, will fire away with updates/questions as we
learn more.  Thanks for the feedback!


On Wed, May 23, 2018 at 3:15 PM Julian Hyde <jhyde@xxxxxxxxxx> wrote:

> These kinds of queries — data flow networks that combine SQL and
> user-defined table functions — are like representing an ETL process in SQL.
> It’s a very interesting problem space, and it’s a shame we have not managed
> to take it very far. (I’d love to do stuff like pushing filters through
> table functions, re-ordering table functions, materializing intermediate
> results, and tracking lineage at a column level as data flows through table
> functions. ETL could really use a query optimization approach!)
> I do think that table functions with cursor arguments are the right way to
> do this in standard SQL. (Or table macros, a Calcite extension, which are
> the same for parsing/validation purposes, but behave differently at
> planning time.)
> You’re referring to the tests ReflectiveSchemaTest.testOperator,
> TableFunctionTest.testTableFunctionCursorInputs and
> TableFunctionTest.testTableFunctionCursorsInputs. They are all disabled,
> and I don’t think they ever worked. (There is a test for a table function
> with a scalar argument but no cursor arguments,
> TableFunctionTest.testTableFunction, and that works just fine.)
> I was a bit surprised that the validator thinks actual argument types are
> OTHER. It should be able to figure out that the type is CURSOR (and
> therefore match the formal parameter type CURSOR). And furthermore it ought
> to be able to figure out the row-type of the cursor.
> SqlValidatorTest.testCollectionTableWithCursorParam tests all this stuff,
> and it passes. That makes me optimistic. The validator seems to doing the
> hard part correctly, so maybe there is something minor broken in the
> sql-to-rel or planning process.
> Julian
> > On May 23, 2018, at 10:10 AM, Jeffrey Peden <jpeden@xxxxxxxxxxxxx>
> wrote:
> >
> > Hi All,
> >
> > Hoping for a pointer in the right direction.  It looks like that at one
> > point Calcite/Optiq supported table generation via
> > TableFunctions/TableMacros by passing the result set of another query
> into
> > the function via a CURSOR.
> >
> >>> select * from table(s.process(2,cursor(select * from
> > table(s.GenerateStrings(1)))))
> >
> > Currently, it appears this no longer works, as the test cases in
> > TableFunctionTest have been commented out, and when attempting to build
> > functions that take an Enumerable or Queryable as input from a cursor (as
> > above), they fall down on SqlTypeAssignmentRules throwing an assertion
> > error that "No assign rules for OTHER defined" in checking canCastFrom().
> >
> > My questions:
> > 1) Would TableFunctions/TableMacros still be the preferred way to
> generate
> > a new Table as the result of another query?  (Or is there another
> approach
> > that I'm missing)
> > 2) If so, is there an easy (i.e. without patching up the type checking
> > system) way to bring this functionality back to life?
> > 3) If not an easy way, any pointers on where to start on (2)?
> >
> > Thanks a ton!
> >
> > --Jeff

Jeffrey Peden, Founder, CraveLabs
Email: jpeden@xxxxxxxxxxxxx
Voice: 781-367-9352
Linked In: http://linkd.in/jpeden