
[DISCUSS] Long-term goal of making flink-table Scala-free

Hi everyone,

as you all know, the Table & SQL API is currently implemented in Scala. This decision was made a long time ago when the initial code base was created as part of a master's thesis. The community kept Scala because of the nice language features that enable a fluent Table API like table.select('field.trim()) and because Scala allows for quick prototyping (e.g. multi-line comments for code generation). The committers insisted on not splitting the code base across two programming languages.

However, the flink-table module is becoming an increasingly important part of the Flink ecosystem. Connectors, formats, and the SQL client are actually implemented in Java but need to interoperate with flink-table, which makes these modules dependent on Scala. As mentioned in an earlier mail thread, using Scala for API classes also exposes member variables and methods in Java that should not be exposed to users [1]. Java is still the most important API language, and right now we treat it as a second-class citizen. I just noticed that you even need to add Scala as a dependency if you just want to implement a ScalarFunction, because of method clashes between `public String toString()` and `public scala.Predef.String toString()`.

Given the size of the current code base, reimplementing the entire flink-table code in Java is a goal that we might never reach. However, we should at least treat the symptoms and keep this in mind as a long-term goal. My suggestion would be to convert user-facing and runtime classes and split the code base into multiple modules:

> flink-table-java {depends on flink-table-core}
Implemented in Java. Java users can use this. This would require converting classes like TableEnvironment and Table.

> flink-table-scala {depends on flink-table-core}
Implemented in Scala. Scala users can use this.

> flink-table-common
Implemented in Java. Connectors, formats, and UDFs can use this. It contains interface classes such as descriptors, table sink, table source.

> flink-table-core {depends on flink-table-common and flink-table-runtime}
Implemented in Scala. Contains the current main code base.

> flink-table-runtime
Implemented in Java. This would require converting the classes in o.a.f.table.runtime, but it could also improve the runtime.
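
To make the proposed layout easier to reason about, here is a rough sketch that models the module list above as a dependency graph and checks which modules would end up free of Scala, directly or transitively. The class name ModuleGraph and its helper methods are made up for illustration and are not part of Flink. The edges and implementation languages are copied from the list above, and modules without a stated dependency are assumed to have no flink-table dependencies.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Flink: models the proposed module split.
public class ModuleGraph {

    // Direct dependencies exactly as listed in the proposal.
    static final Map<String, List<String>> DEPS = new LinkedHashMap<>();

    // Modules the proposal describes as "Implemented in Scala".
    static final Set<String> SCALA_MODULES =
            new HashSet<>(Arrays.asList("flink-table-scala", "flink-table-core"));

    static {
        DEPS.put("flink-table-java", Arrays.asList("flink-table-core"));
        DEPS.put("flink-table-scala", Arrays.asList("flink-table-core"));
        DEPS.put("flink-table-common", Collections.<String>emptyList());
        DEPS.put("flink-table-core",
                Arrays.asList("flink-table-common", "flink-table-runtime"));
        DEPS.put("flink-table-runtime", Collections.<String>emptyList());
    }

    // Returns the module itself plus everything it depends on transitively.
    static Set<String> closure(String module) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> stack = new ArrayDeque<>();
        stack.push(module);
        while (!stack.isEmpty()) {
            String current = stack.pop();
            if (seen.add(current)) {
                for (String dep : DEPS.getOrDefault(current, Collections.<String>emptyList())) {
                    stack.push(dep);
                }
            }
        }
        return seen;
    }

    // A module is Scala-free if neither it nor any transitive dependency is a Scala module.
    static boolean scalaFree(String module) {
        for (String m : closure(module)) {
            if (SCALA_MODULES.contains(m)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        for (String module : DEPS.keySet()) {
            System.out.println(module + " scala-free: " + scalaFree(module));
        }
    }
}
```

Under these assumptions, only flink-table-common and flink-table-runtime come out Scala-free, which is what connectors, formats, and UDFs need. flink-table-java would still pull in Scala transitively through flink-table-core until the core itself is converted.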

What do you think?



[1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Convert-main-Table-API-classes-into-traits-tp21335.html