osdir.com

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: TableFunctions/TableMacros and Cursors?


These kinds of queries — data flow networks that combine SQL and user-defined table functions — are like representing an ETL process in SQL. It’s a very interesting problem space, and it’s a shame we have not managed to take it very far. (I’d love to do stuff like pushing filters through table functions, re-ordering table functions, materializing intermediate results, and tracking lineage at a column level as data flows through table functions. ETL could really use a query optimization approach!)

I do think that table functions with cursor arguments are the right way to do this in standard SQL. (Or table macros, a Calcite extension, which are the same for parsing/validation purposes, but behave differently at planning time.)

You’re referring to the tests ReflectiveSchemaTest.testOperator, TableFunctionTest.testTableFunctionCursorInputs and TableFunctionTest.testTableFunctionCursorsInputs. They are all disabled, and I don’t think they ever worked. (There is a test for a table function with a scalar argument but no cursor arguments, TableFunctionTest.testTableFunction, and that works just fine.)

I was a bit surprised that the validator thinks actual argument types are OTHER. It should be able to figure out that the type is CURSOR (and therefore match the formal parameter type CURSOR). And furthermore it ought to be able to figure out the row-type of the cursor.

SqlValidatorTest.testCollectionTableWithCursorParam tests all this stuff, and it passes. That makes me optimistic. The validator seems to doing the hard part correctly, so maybe there is something minor broken in the sql-to-rel or planning process.

Julian
 

> On May 23, 2018, at 10:10 AM, Jeffrey Peden <jpeden@xxxxxxxxxxxxx> wrote:
> 
> Hi All,
> 
> Hoping for a pointer in the right direction.  It looks like that at one
> point Calcite/Optiq supported table generation via
> TableFunctions/TableMacros by passing the result set of another query into
> the function via a CURSOR.
> 
>>> select * from table(s.process(2,cursor(select * from
> table(s.GenerateStrings(1)))))
> 
> Currently, it appears this no longer works, as the test cases in
> TableFunctionTest have been commented out, and when attempting to build
> functions that take an Enumerable or Queryable as input from a cursor (as
> above), they fall down on SqlTypeAssignmentRules throwing an assertion
> error that "No assign rules for OTHER defined" in checking canCastFrom().
> 
> My questions:
> 1) Would TableFunctions/TableMacros still be the preferred way to generate
> a new Table as the result of another query?  (Or is there another approach
> that I'm missing)
> 2) If so, is there an easy (i.e. without patching up the type checking
> system) way to bring this functionality back to life?
> 3) If not an easy way, any pointers on where to start on (2)?
> 
> Thanks a ton!
> 
> --Jeff