[jira] [Created] (HIVE-19562) Flaky test: TestMiniSparkOnYarn FileNotFoundException in spark-submit


Sahil Takiar created HIVE-19562:
-----------------------------------

             Summary: Flaky test: TestMiniSparkOnYarn FileNotFoundException in spark-submit
                 Key: HIVE-19562
                 URL: https://issues.apache.org/jira/browse/HIVE-19562
             Project: Hive
          Issue Type: Sub-task
          Components: Spark
            Reporter: Sahil Takiar
            Assignee: Sahil Takiar


Seeing sporadic failures during test setup. Specifically, when spark-submit runs, this error (or a similar one) gets thrown:

{code}
2018-05-15T10:55:02,112  INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient: Exception in thread "main" java.io.FileNotFoundException: File file:/tmp/spark-56e217f7-b8a5-4c63-9a6b-d737a64f2820/__spark_libs__7371510645900072447.zip does not exist
2018-05-15T10:55:02,113  INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient:      at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:641)
2018-05-15T10:55:02,113  INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient:      at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:867)
2018-05-15T10:55:02,113  INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient:      at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:631)
2018-05-15T10:55:02,113  INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient:      at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
2018-05-15T10:55:02,113  INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient:      at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:365)
2018-05-15T10:55:02,113  INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient:      at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:316)
2018-05-15T10:55:02,113  INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient:      at org.apache.spark.deploy.yarn.Client.copyFileToRemote(Client.scala:356)
2018-05-15T10:55:02,113  INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient:      at org.apache.spark.deploy.yarn.Client.org$apache$spark$deploy$yarn$Client$$distribute$1(Client.scala:478)
2018-05-15T10:55:02,113  INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient:      at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:565)
2018-05-15T10:55:02,113  INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient:      at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:863)
2018-05-15T10:55:02,113  INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient:      at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:169)
2018-05-15T10:55:02,113  INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient:      at org.apache.spark.deploy.yarn.Client.run(Client.scala:1146)
2018-05-15T10:55:02,113  INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient:      at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1518)
2018-05-15T10:55:02,113  INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient:      at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:879)
2018-05-15T10:55:02,113  INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient:      at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)
2018-05-15T10:55:02,113  INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient:      at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
2018-05-15T10:55:02,113  INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient:      at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
2018-05-15T10:55:02,113  INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient:      at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
{code}

Essentially, Spark is writing some files for container localization to a tmp dir, and that tmp dir is getting deleted. We have seen a lot of issues with writing files to {{/tmp/}} in the past, so it's probably best to write these files to a test-specific dir.
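
A rough sketch of what "write these files to a test-specific dir" could look like. This is not the actual patch: the class and method names ({{TestLocalDirs}}, {{createSparkLocalDir}}, {{sparkOverrides}}) are hypothetical, and it assumes the spark-submit client picks its scratch location from {{spark.local.dir}} (falling back to {{java.io.tmpdir}}), which would need to be confirmed against the Spark version the tests run with.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only; not part of Hive or Spark.
final class TestLocalDirs {
  private TestLocalDirs() {}

  /** Creates <baseDir>/<testName>/spark-local (and any missing parents) and returns it. */
  static Path createSparkLocalDir(Path baseDir, String testName) throws IOException {
    Path dir = baseDir.resolve(testName).resolve("spark-local");
    return Files.createDirectories(dir).toAbsolutePath();
  }

  /**
   * Builds the config overrides a test would pass to the Spark client.
   * Assumption: spark.local.dir controls where the client stages the
   * __spark_libs__ archive before copying it for container localization.
   */
  static Map<String, String> sparkOverrides(Path sparkLocalDir) {
    Map<String, String> conf = new LinkedHashMap<>();
    conf.put("spark.local.dir", sparkLocalDir.toString());
    return conf;
  }
}
```

In a test, {{baseDir}} would typically be the module's own scratch directory (for example something under {{target/tmp}}) rather than {{/tmp/}}, so that nothing outside the test run cleans it up while spark-submit is still staging files.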


