Re: Building and visualizing the Beam SQL graph


Not answering the original question, but doesn't "explain" satisfy the SQL use case?

Going forward, we probably want to solve this in a more general way. We have at least three ways to represent the pipeline:
 - how the runner executes it;
 - what it looks like when constructed;
 - what the user described in the DSL.
And there will probably be more if extra layers are built on top of the DSLs.

If possible, we should be able to map any level of abstraction to any other, so that pipelines are easier to understand and debug.


On Mon, Jun 11, 2018 at 12:17 PM Kenneth Knowles <klk@xxxxxxxxxx> wrote:
In other words, revert https://github.com/apache/beam/pull/4705/files, at least in spirit? I agree :-)

Kenn

On Mon, Jun 11, 2018 at 11:39 AM Andrew Pilloud <apilloud@xxxxxxxxxx> wrote:
We are currently converting the Calcite Rel tree to Beam by recursively building a tree of nested PTransforms. This results in a weird nested graph in the Dataflow UI, where each node contains its inputs nested inside it. I'm going to change the internal data structure used for converting the tree from a PTransform to a PCollection, which will result in a more accurate representation of the tree structure being built and should simplify the code as well. This will not change the public interface to SQL, which will remain a PTransform. Any thoughts or objections?

I was also wondering whether there are tools for visualizing the Beam graph aside from the Dataflow runner UI. What other tools exist?

Andrew
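
For readers following the thread, the nesting effect Andrew describes can be illustrated with a toy model. This is only a sketch and does not use the Beam or Calcite APIs: `RelNode`, `nested_step_names`, and `flat_step_names` are hypothetical names made up for this example, and the assumption is that a step's displayed name is scoped under whichever transform expands it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RelNode:
    """Toy stand-in for a relational plan node (hypothetical, not the Calcite API)."""
    name: str
    inputs: List["RelNode"] = field(default_factory=list)


def nested_step_names(node: RelNode, scope: str = "") -> List[str]:
    """Model of the nested approach: each node expands its inputs inside
    itself, so every input's step name is scoped under its consumer."""
    path = f"{scope}/{node.name}" if scope else node.name
    names: List[str] = []
    for child in node.inputs:
        names.extend(nested_step_names(child, path))
    names.append(path)
    return names


def flat_step_names(node: RelNode) -> List[str]:
    """Model of the flat approach: inputs are converted to collections
    first, then the node is applied to them at the top level."""
    names: List[str] = []
    for child in node.inputs:
        names.extend(flat_step_names(child))
    names.append(node.name)
    return names


# A three-node plan: Project <- Filter <- Scan
plan = RelNode("Project", [RelNode("Filter", [RelNode("Scan")])])

print(nested_step_names(plan))  # ['Project/Filter/Scan', 'Project/Filter', 'Project']
print(flat_step_names(plan))    # ['Scan', 'Filter', 'Project']
```

In the nested model the scan ends up two levels deep inside the projection, which mirrors the "each node contains its inputs" graph; in the flat model the steps appear side by side in data-flow order.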