[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Help with EnumerableMergeJoinRule which is losing a RelCollection trait

We found a strange behaviour in an execution plan, basially we have an
EnumerableMergeJoin which has as input two non-sorted

all the details are in this issue on HerdDB


Cut and paste from the issue in the bottom of this email

Any help is very appreciated, maybe some ring bells ....


SELECT * FROM license t0, customer c WHERE c.customer_id = t0.customer_id

It seems that Calcite is planning a Merge Join, but the tables are not
sorted according to the merge keys.

"License" table:
TABLE PK (non clustered): [license_id]
COL: license_id serialPos: 0 (serialPos is the index of the colum for Calcite)
COL: application serialPos: 1
COL: creation serialPos: 2
COL: data serialPos: 3
COL: deleted serialPos: 4
COL: modification serialPos: 5
COL: signature serialPos: 6
COL: customer_id serialPos: 7

"Customer" table:
TABLE PK (non clustered): [customer_id]
COL: customer_id serialPos: 0
COL: contact_email serialPos: 1
COL: contact_person serialPos: 2
COL: creation serialPos: 3
COL: deleted serialPos: 4
COL: modification serialPos: 5
COL: name serialPos: 6
COL: vetting serialPos: 7

the join is on PK (non clustered) column of table customer,
and the "customer_id" column of table 'license' which is not sorted
naturally by 'customerid' (we do not have clustered indexes !!)

This is the plan:

EnumerableMergeJoin(condition=[=($7, $9)], joinType=[inner]): rowcount
= 15.75, cumulative cost = {59.75 rows, 24.0 cpu, 0.0 io}, id = 114
EnumerableTableScan(table=[[herd, license]]): rowcount = 15.0,
cumulative cost = {15.0 rows, 16.0 cpu, 0.0 io}, id = 28
EnumerableTableScan(table=[[herd, customer]]): rowcount = 7.0,
cumulative cost = {7.0 rows, 8.0 cpu, 0.0 io}, id = 29

EnumerableTableScan does not contain any information which tells that
the Scan MUST be sorted according to the join keys (field 7 in
"licence", and field 0 in "customer")

Here in Calcite code the additional 'Collation' is lost as the
"replace" does not contain any 'RelCollation', so the inputs of the
join are not transformed


is it a bug in Calcite or in how we are passing data to Calcite ?
Tables do not have any impliticit "collation" in HerdDB so we are not
passing any 'RelCollation'

Thank you