osdir.com

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Beam SparkRunner and Spark KryoSerializer problem


Folks,

Its someone using the SparkRunner out there with the Spark KryoSerializer ?

We are being force to use the not so efficient 'JavaSerializer' with Spark because we face the following exception:

<exception>
Exception in thread "main" java.lang.RuntimeException: org.apache.spark.SparkException: Job aborted due to stage failure: Exception while getting task result: com.esotericsoftware.kryo.KryoException: Unable to find class: org.apache.beam.runners.core.metrics.MetricsContainerImpl$$Lambda$31/1875283985
Serialization trace:
factory (org.apache.beam.runners.core.metrics.MetricsMap)
counters (org.apache.beam.runners.core.metrics.MetricsContainerImpl)
metricsContainers (org.apache.beam.runners.core.metrics.MetricsContainerStepMap)
metricsContainers (org.apache.beam.runners.spark.io.SparkUnboundedSource$Metadata)
at org.apache.beam.runners.spark.SparkPipelineResult.runtimeExceptionFrom(SparkPipelineResult.java:55)
at org.apache.beam.runners.spark.SparkPipelineResult.beamExceptionFrom(SparkPipelineResult.java:72)
at org.apache.beam.runners.spark.SparkPipelineResult.access$000(SparkPipelineResult.java:41)
at org.apache.beam.runners.spark.SparkPipelineResult$StreamingMode.stop(SparkPipelineResult.java:163)
at org.apache.beam.runners.spark.SparkPipelineResult.offerNewState(SparkPipelineResult.java:198)
at org.apache.beam.runners.spark.SparkPipelineResult.waitUntilFinish(SparkPipelineResult.java:101)
at org.apache.beam.runners.spark.SparkPipelineResult.waitUntilFinish(SparkPipelineResult.java:87)
at org.apache.beam.examples.BugWithKryoOnSpark.main(BugWithKryoOnSpark.java:75)
</exception>

I created a jira ticket and attached a project example on it, https://issues.apache.org/jira/browse/BEAM-4597

Any feedback is appreciated.

--

JC