
Re: Question about complex rule operands and Rel tree in general

For validation (i.e. logic that doesn’t modify the tree), I would use a visitor. RelVisitor may suffice.
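To make the visitor idea concrete, here is a minimal, self-contained sketch of a read-only check for the example in the question (a Filter directly over a Project with more than three fields). It does not use Calcite's classes: PlanNode and WideProjectUnderFilterChecker are hypothetical stand-ins invented for illustration. Calcite's RelVisitor offers a similarly shaped visit(node, ordinal, parent) callback, so the same logic should carry over, but treat this as a sketch rather than the recommended implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy stand-in for a plan node (hypothetical; not Calcite's RelNode). */
final class PlanNode {
  final String kind;            // e.g. "Filter", "Project", "Scan"
  final int fieldCount;         // number of output fields
  final List<PlanNode> inputs;  // child nodes, possibly empty

  PlanNode(String kind, int fieldCount, PlanNode... inputs) {
    this.kind = kind;
    this.fieldCount = fieldCount;
    this.inputs = Arrays.asList(inputs);
  }
}

/** Read-only visitor: walks the whole tree and records violations. */
final class WideProjectUnderFilterChecker {
  final List<String> violations = new ArrayList<>();

  /** Checks one parent/child edge, then recurses into the inputs. */
  void visit(PlanNode node, PlanNode parent) {
    if (parent != null
        && parent.kind.equals("Filter")
        && node.kind.equals("Project")
        && node.fieldCount > 3) {
      violations.add("Filter over Project with " + node.fieldCount + " fields");
    }
    for (PlanNode input : node.inputs) {
      visit(input, node);
    }
  }

  /** Returns true if the tree rooted at {@code root} has no violations. */
  static boolean isValid(PlanNode root) {
    WideProjectUnderFilterChecker checker = new WideProjectUnderFilterChecker();
    checker.visit(root, null);
    return checker.violations.isEmpty();
  }
}
```

The visitor never modifies the tree; it only collects findings, which is what makes it a reasonable fit for validation as opposed to transformation.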

There are also a few “whole tree” transformations, e.g. column pruning. Use sparingly.

You are correct that rules and their operands do not “scale” to match large sections of the tree. We could in principle extend matching a little (e.g. better handling of Union with many inputs) but the locality is mostly a good thing. In a Volcano graph, there are multiple nodes in each equivalence set, therefore huge numbers of paths through the graph. Deep matches would quickly become intractable.

I strongly recommend using traits, and in particular predicates (RelMdPredicates / RelOptPredicateList). Let’s suppose you want to know whether a particular input column is always equal to 5. You could write a rule that looks for a Project several layers down whose expression is the literal 5. But much better is to look at the predicates. Predicates are propagated up the tree, which means you don’t need to look at the structure, and you can reason and act locally.

Similar arguments apply for sort and distribution (which are also traits).
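The predicate approach can be illustrated with a toy model. The sketch below is not Calcite code: Rel, Scan, EqFilter, Permute and PassThrough are hypothetical classes invented for this example. Each node derives "column = constant" facts from its input, so a consumer can ask the top node whether a column is always 5 without looking at what lies beneath. In Calcite, the analogous question would be put to RelMetadataQuery.getPulledUpPredicates(RelNode), which returns a RelOptPredicateList.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy plan node that exposes known "column = constant" facts (hypothetical). */
abstract class Rel {
  /** Maps an output column index to the constant it is known to equal. */
  abstract Map<Integer, Integer> constants();

  /** Local question: is the given output column always equal to value? */
  static boolean alwaysEquals(Rel rel, int column, int value) {
    Integer known = rel.constants().get(column);
    return known != null && known == value;
  }
}

/** Leaf node: nothing is known about its columns. */
final class Scan extends Rel {
  Map<Integer, Integer> constants() {
    return new HashMap<>();
  }
}

/** Filter "column = value": adds one fact to what the input already knows. */
final class EqFilter extends Rel {
  final Rel input;
  final int column;
  final int value;

  EqFilter(Rel input, int column, int value) {
    this.input = input;
    this.column = column;
    this.value = value;
  }

  Map<Integer, Integer> constants() {
    Map<Integer, Integer> facts = new HashMap<>(input.constants());
    facts.put(column, value);
    return facts;
  }
}

/** Projection of input columns: output i reads input column sources[i]. */
final class Permute extends Rel {
  final Rel input;
  final int[] sources;

  Permute(Rel input, int... sources) {
    this.input = input;
    this.sources = sources;
  }

  Map<Integer, Integer> constants() {
    Map<Integer, Integer> in = input.constants();
    Map<Integer, Integer> out = new HashMap<>();
    for (int i = 0; i < sources.length; i++) {
      Integer known = in.get(sources[i]);
      if (known != null) {
        out.put(i, known);  // re-index the fact to the output column
      }
    }
    return out;
  }
}

/** Operator that keeps columns as-is (e.g. a sort), so facts pass through. */
final class PassThrough extends Rel {
  final Rel input;

  PassThrough(Rel input) {
    this.input = input;
  }

  Map<Integer, Integer> constants() {
    return input.constants();
  }
}
```

With a plan such as new PassThrough(new Permute(new EqFilter(new Scan(), 2, 5), 2, 0)), calling Rel.alwaysEquals(plan, 0, 5) consults only the top node, however many layers sit between it and the filter that established the fact.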

If you are able to package your logic into a RelOptRule you will be pleased with the results. It composes beautifully and efficiently with the hundreds of other rules, and with all the flavors of metadata.


> On Nov 8, 2018, at 12:50 PM, Андрей Цвелодуб <a.tsvelodub@xxxxxxxxx> wrote:
> Hello everyone!
> I have a question that I can't find an answer to, so maybe someone could
> help me.
> As a part of Rel Rules, there is always an operand, that matches a part of
> the tree, and says if the rule should be executed.
> The operand can be complex, so I can say for example - match an Aggregate
> on top of Project on top of Filter. AFAIU, this operand will only match if
> exactly these three nodes are somewhere in the tree.
> But here is my question - what if I want a rule that will match a more
> generic structure, like this
>> Aggregate
>> -...
>> --* any number of any nodes in any levels
>> ---...
>> ----  Project
> Is there an official way to do that?
> My first approach was to match any Aggregate and then try to inspect the
> underlying tree in matches()/onMatch(), but this turned out to be quite
> unreliable since it involves inspecting RelSubsets (and this shouldn't be
> done, as follows from
> https://lists.apache.org/thread.html/ee2349272e9d344228595c0940820b2fc525cc6115388c48e99495a6@%3Cdev.calcite.apache.org%3E).
> In case I'm doing it all wrong, I can formulate my question even broader -
> is there a mechanism to perform validation of the execution tree during the
> planning process, i.e. skip some plans as unimplementable based on their
> internal structure. As an example imagine I want to say that in
> JdbcConvention, all plans that have a Filter node, over a Project node that
> has more than three fields, should not be implemented. (Modifying cost
> calculation is also not an option since the plan still has RelSubsets)
> I hope this makes sense, and thanks in advance!
> Best Regards,
> Andrew Tsvielodub