[Discuss] Make flattening on Struct/Row optional

Hi Community,

While trying to support the Row type in Apache Beam SQL on top of Calcite, I
realized that the Row-flattening logic loses the Row's structure information
after projections. There is a use case where users want to mix the Beam
programming model with Beam SQL to process a dataset. The following is an
example of that use case:

dataset.apply(something user defined)
            .apply(SELECT ...)
            .apply(something user defined)

As you can see, after the SQL statement is applied, the data structure
should be preserved for further processing.

The most straightforward way, to me, is to make Struct flattening optional so
that I can disable it and preserve the Row structure. Is it feasible to make
that happen? What would happen if Calcite simply did not flatten Structs in
the flattener? (I tried to disable it but got exceptions in the optimizer. I
wasn't sure whether that was a minor thing to fix, or whether Struct
flattening was a design choice and the impact of the change would be huge.)

Additionally, if there is a way to keep information that I can use to
reconstruct the Row after projections, that might be OK as well. Does this
idea exist in Calcite? If it does not, how does it compare with disabling
Struct flattening?
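
To make the second idea concrete, here is a rough, language-neutral sketch in
plain Python. It does not use any Calcite or Beam API; the names flatten and
rebuild are hypothetical and exist only for illustration. It assumes each
flattened column carries the path of field names it came from, so the nested
Row can be rebuilt after a projection drops some columns.

```python
from typing import Any, Dict, List, Tuple

Path = Tuple[str, ...]


def flatten(row: Dict[str, Any], prefix: Path = ()) -> List[Tuple[Path, Any]]:
    """Flatten a nested dict into (path, value) columns, depth first."""
    columns: List[Tuple[Path, Any]] = []
    for name, value in row.items():
        path = prefix + (name,)
        if isinstance(value, dict):
            # Nested struct: recurse, keeping the parent path as a prefix.
            columns.extend(flatten(value, path))
        else:
            columns.append((path, value))
    return columns


def rebuild(columns: List[Tuple[Path, Any]]) -> Dict[str, Any]:
    """Rebuild a nested dict from (path, value) columns."""
    row: Dict[str, Any] = {}
    for path, value in columns:
        node = row
        for name in path[:-1]:
            # Create intermediate structs on demand.
            node = node.setdefault(name, {})
        node[path[-1]] = value
    return row


nested = {"id": 1, "address": {"city": "Paris", "zip": "75001"}}
flat = flatten(nested)

# A projection that keeps only id and address.city (hypothetical example).
kept = {("id",), ("address", "city")}
projected = [(path, value) for path, value in flat if path in kept]

restored = rebuild(projected)
print(restored)
```

The retained paths are the only extra information needed here. Whether the
optimizer could carry such metadata through its rewrites is exactly the open
question above.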