osdir.com

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: TableFunctions/TableMacros and Cursors?


If you’re working on something, please create a JIRA case to track your work. We can have conversations there.

I recall — a long time ago, in the code that later evolved into Calcite — that we had to devise a way pass the row-type of cursor arguments in a “back channel” because the type derivation mechanism didn’t allow rich type information to be passed. You may come across the remnants of that back-channel and be able to re-instate it. SqlValidatorImpl.cursorScopes may be it, and there may be others.

Julian


> On May 23, 2018, at 12:30 PM, Jeffrey Peden <jpeden@xxxxxxxxxxxxx> wrote:
> 
> Julian,
> 
> Thanks.  This is exactly what we're hoping to use Calcite for --
> essentially use SQL generate a list of Primary/Secondary Keys to access a
> massive data lake and return the subset as a virtualized table.  Had
> started on an approach building an adapter with push-downs on joins so that
> we could have a table that was generated by being the right-hand side of a
> join, but was getting tripped up on getting the left hand results into the
> TableScan.  Finding your 2014 blog post on TableMacros made me fopeful we
> could find a simpler/more elegant solution through table functions.
> 
> In terms of what's here, the validator correctly gets that the argument
> being passed into the function (the fromType) is CURSOR, it just doesn't
> have a registered type for Enumerable/Queryable in nodeToTypeMap to
> determine that the method parameter should also be CURSOR.  I'll keep
> digging down this rode.  Hoping that it might be as simple as hooking into
> SqlValidatorImpl and calling setValidatedNodeType.
> 
> If I get anywhere with this, will fire away with updates/questions as we
> learn more.  Thanks for the feedback!
> 
> --JP
> 
> On Wed, May 23, 2018 at 3:15 PM Julian Hyde <jhyde@xxxxxxxxxx> wrote:
> 
>> These kinds of queries — data flow networks that combine SQL and
>> user-defined table functions — are like representing an ETL process in SQL.
>> It’s a very interesting problem space, and it’s a shame we have not managed
>> to take it very far. (I’d love to do stuff like pushing filters through
>> table functions, re-ordering table functions, materializing intermediate
>> results, and tracking lineage at a column level as data flows through table
>> functions. ETL could really use a query optimization approach!)
>> 
>> I do think that table functions with cursor arguments are the right way to
>> do this in standard SQL. (Or table macros, a Calcite extension, which are
>> the same for parsing/validation purposes, but behave differently at
>> planning time.)
>> 
>> You’re referring to the tests ReflectiveSchemaTest.testOperator,
>> TableFunctionTest.testTableFunctionCursorInputs and
>> TableFunctionTest.testTableFunctionCursorsInputs. They are all disabled,
>> and I don’t think they ever worked. (There is a test for a table function
>> with a scalar argument but no cursor arguments,
>> TableFunctionTest.testTableFunction, and that works just fine.)
>> 
>> I was a bit surprised that the validator thinks actual argument types are
>> OTHER. It should be able to figure out that the type is CURSOR (and
>> therefore match the formal parameter type CURSOR). And furthermore it ought
>> to be able to figure out the row-type of the cursor.
>> 
>> SqlValidatorTest.testCollectionTableWithCursorParam tests all this stuff,
>> and it passes. That makes me optimistic. The validator seems to doing the
>> hard part correctly, so maybe there is something minor broken in the
>> sql-to-rel or planning process.
>> 
>> Julian
>> 
>> 
>>> On May 23, 2018, at 10:10 AM, Jeffrey Peden <jpeden@xxxxxxxxxxxxx>
>> wrote:
>>> 
>>> Hi All,
>>> 
>>> Hoping for a pointer in the right direction.  It looks like that at one
>>> point Calcite/Optiq supported table generation via
>>> TableFunctions/TableMacros by passing the result set of another query
>> into
>>> the function via a CURSOR.
>>> 
>>>>> select * from table(s.process(2,cursor(select * from
>>> table(s.GenerateStrings(1)))))
>>> 
>>> Currently, it appears this no longer works, as the test cases in
>>> TableFunctionTest have been commented out, and when attempting to build
>>> functions that take an Enumerable or Queryable as input from a cursor (as
>>> above), they fall down on SqlTypeAssignmentRules throwing an assertion
>>> error that "No assign rules for OTHER defined" in checking canCastFrom().
>>> 
>>> My questions:
>>> 1) Would TableFunctions/TableMacros still be the preferred way to
>> generate
>>> a new Table as the result of another query?  (Or is there another
>> approach
>>> that I'm missing)
>>> 2) If so, is there an easy (i.e. without patching up the type checking
>>> system) way to bring this functionality back to life?
>>> 3) If not an easy way, any pointers on where to start on (2)?
>>> 
>>> Thanks a ton!
>>> 
>>> --Jeff
>> 
>> 
> 
> -- 
> Jeffrey Peden, Founder, CraveLabs
> Email: jpeden@xxxxxxxxxxxxx
> Voice: 781-367-9352
> Linked In: http://linkd.in/jpeden