Thank you everybody for your feedback!
I think we can conclude that the most popular option, according to the discussion above, is number 3. I’m not sure whether we need a separate vote for that, but please let me know if we do.
So, for now, I’d split the work into the following steps:
a) Create a new module “hadoop-mapreduce-format” that implements support for MapReduce OutputFormat through a new HadoopMapreduceFormat.Write class. For that, I just need to slightly change PR 6306, which I created recently (renaming the module and the classes).
b) Move all source and test classes of “hadoop-input-format” into the “hadoop-mapreduce-format” module and create a new class HadoopMapreduceFormat.Read there to support MapReduce InputFormat.
c) Deprecate the old HadoopInputFormat.Read (in the old “hadoop-input-format” module) and turn it into a proxy class for the newly created HadoopMapreduceFormat.Read (to keep API compatibility).
These 3 steps should be completed within one release cycle (approx. in 2.8). For steps “b” and “c” I’d create a separate PR, to avoid the huge commit that would result from including step “a” as well.
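To make step “c” more concrete, here is a rough sketch of what I mean by a deprecated proxy class. It is only an illustration: the classes below are simplified, hypothetical stand-ins (not the real Beam transforms), the configuration is a plain String, and the describe() method exists only to show where the call ends up.

```java
// Hypothetical, simplified stand-ins for illustration; not the real Beam classes.
public class DeprecatedProxySketch {

  /** Stand-in for the new class in the "hadoop-mapreduce-format" module. */
  static class HadoopMapreduceFormat {
    static Read read() {
      return new Read(null);
    }

    static class Read {
      private final String configuration; // assumption: a String instead of a Hadoop Configuration

      Read(String configuration) {
        this.configuration = configuration;
      }

      Read withConfiguration(String configuration) {
        return new Read(configuration);
      }

      String describe() {
        return "HadoopMapreduceFormat.Read[" + configuration + "]";
      }
    }
  }

  /** Stand-in for the old class in the "hadoop-input-format" module. */
  @Deprecated
  static class HadoopInputFormat {
    static Read read() {
      return new Read(HadoopMapreduceFormat.read());
    }

    /** Keeps the old API surface but forwards every call to the new class. */
    @Deprecated
    static class Read {
      private final HadoopMapreduceFormat.Read delegate;

      Read(HadoopMapreduceFormat.Read delegate) {
        this.delegate = delegate;
      }

      Read withConfiguration(String configuration) {
        return new Read(delegate.withConfiguration(configuration));
      }

      String describe() {
        return delegate.describe();
      }
    }
  }

  public static void main(String[] args) {
    // Existing user code keeps compiling (with a deprecation warning)
    // while the actual work is done by the new class.
    System.out.println(HadoopInputFormat.read().withConfiguration("conf").describe());
    // prints "HadoopMapreduceFormat.Read[conf]"
  }
}
```

With this shape, users of the old module only see a deprecation warning during the 2.8 cycle, and the old class contains no logic of its own that would need to be maintained until it is removed.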
Then, in the next release after that:
d) Completely remove the “hadoop-input-format” module (approx. in 2.9).
We leave the other two Hadoop modules (common and file-system) as they are.
I hope that this is a correct summary of what the community decided and that I can move forward.
Please let me know if there are any objections to this plan or other suggestions.