Re: Hive QA batches timing out


I have a feeling that the same issue sometimes happens in other tests, but I agree: disabling it will make our lives easier until the real cause is uncovered and fixed.

cheers,
Zoltan


On 07/09/2018 09:05 AM, Deepak Jaiswal wrote:
Thanks Zoltan for the analysis. Perhaps we should disable the test in the meantime as it is blocking several people from committing.

I can go ahead and create a patch for it.

Regards,
Deepak

On 7/8/18, 11:33 PM, "Zoltan Haindrich" <kirk@xxxxxx> wrote:

     Hello,

     Thank you Deepak for taking a closer look! From what you've found, I noticed that the runtime of TestReplicationScenariosAcidTables has jumped to ~2000 sec in the runs which failed. It seems this problem has been around for a long time: I've found Jira tickets in which this test "timed out" and the HiveQA comment was dated April 03, so it's not entirely new.

     What prevents this test from completing successfully seems to be that it has difficulty closing down the metastore client, which goes on for a while. I don't know if this is an acid/replication/metastore/? issue, but it seems intermittent. I have a hunch that it might happen more reliably with this test. I've opened HIVE-20121 to investigate this.

     2018-07-08T22:07:33,461 DEBUG [main] metastore.HiveMetaStoreClient: Unable to shutdown metastore client. Will try closing transport directly.
     org.apache.thrift.transport.TTransportException: Cannot write to null outputStream

     Some links to more or less recent logs:
     http://104.198.109.242/logs/PreCommit-HIVE-Build-12481/failed/240_UTBatch_itests__hive-unit_9_tests/maven-test.txt

     The hive.log is ~200M:
     http://104.198.109.242/logs/PreCommit-HIVE-Build-12481/failed/240_UTBatch_itests__hive-unit_9_tests/logs/hive.log
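
     With a log of that size, one rough way to see how often the shutdown failure quoted above occurs is to stream the file line by line and count matches per minute. The sketch below is only an illustration: the function name and the local file path are hypothetical, and it assumes every log line starts with a timestamp laid out like the one in the excerpt (2018-07-08T22:07:33,461).

```python
from collections import Counter

# Message text copied from the log excerpt in this thread.
SHUTDOWN_MSG = "Unable to shutdown metastore client"


def count_shutdown_failures(lines):
    """Count lines containing SHUTDOWN_MSG, grouped by minute.

    `lines` is any iterable of strings, e.g. an open file object, so a
    ~200M log is never loaded into memory at once.
    """
    per_minute = Counter()
    for line in lines:
        if SHUTDOWN_MSG in line:
            # Assumed layout: 'YYYY-MM-DDTHH:MM:SS,mmm ...'; the first
            # 16 characters identify the minute.
            per_minute[line.lstrip()[:16]] += 1
    return per_minute


# Usage against a locally downloaded copy (hypothetical path):
#   with open("hive.log", errors="replace") as f:
#       for minute, n in sorted(count_shutdown_failures(f).items()):
#           print(minute, n)
```

     A long run of consecutive minutes with many matches would be consistent with the client repeatedly failing to close, though the count by itself does not say which test triggered it.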

     cheers,
     Zoltan

On 07/08/2018 06:49 PM, Deepak Jaiswal wrote:
     > I am seeing tests timing out in my latest ptest run,
     >
     > https://builds.apache.org/job/PreCommit-HIVE-Build/12468/testReport
     > https://builds.apache.org/job/PreCommit-HIVE-Build/12468/console
     >
     > TestAlterTableMetadata - did not produce a TEST-*.xml file (likely timed out) (batchId=240)
     > TestAutoPurgeTables - did not produce a TEST-*.xml file (likely timed out) (batchId=240)
     > TestLocationQueries - did not produce a TEST-*.xml file (likely timed out) (batchId=240)
     > TestReplicationScenariosAcidTables - did not produce a TEST-*.xml file (likely timed out) (batchId=240)
     > TestSemanticAnalyzerHookLoading - did not produce a TEST-*.xml file (likely timed out) (batchId=240)
     > TestSparkStatistics - did not produce a TEST-*.xml file (likely timed out) (batchId=240)
     >
     >
     > From the Hive QA homepage, the last stable build was 12444 whereas the current run is 12473. I looked at some of the runs in between and it looks like most of the runs are failing due to the above batch of unit tests.
     >
     > Regards,
     > Deepak
     >