[DISCUSS] Unsustainable situation with ptests
I believe we have reached a state (maybe we reached it a while ago) that is no longer sustainable: so many tests are failing or timing out that it is not possible to verify whether a patch breaks some critical part of the system. It also seems to me that, because of the timeouts (maybe due to infra, maybe not), ptest runs are taking even longer than usual, which in turn creates an even longer queue of patches.
There is an ongoing effort to improve ptests usability (https://issues.apache.org/jira/browse/HIVE-19425), but apart from that, we need to make an effort to stabilize existing tests and bring that failure count to zero.
Hence, I am suggesting *we stop committing any patch before we get a green run*. If someone thinks this proposal is too radical, please come up with an alternative, because I do not think it is OK to have the ptest runs in their current state. Other projects of a certain size (e.g., Hadoop, Spark) are always green; we should be able to do the same.
Finally, once we get to zero failures, I suggest we be less tolerant of committing without a clean ptest run. If there is a failure, we need to fix it or revert the patch that caused it, and only then continue developing.
Please, let’s all work together as a community to fix this issue; that is the only way to get to zero quickly.
PS. I assume the flaky tests will come into the discussion. Let's first see how many of those we have, then we can work on a fix.