
Executing a pipeline from Datalab - result.wait_until_finish() error

Hello all,

When I run the pipeline with 4 samples (a very small dataset), I don't get any error with either DirectRunner or DataflowRunner.

When I run it with a 50-sample dataset, I get the following error from result.wait_until_finish().
What does this error mean?

KeyError                                  Traceback (most recent call last)
<ipython-input-38-4d2dc0c0717f> in <module>()
      1 result = pc.run()
----> 2 result.wait_until_finish()

/usr/local/envs/py2env/lib/python2.7/site-packages/apache_beam/runners/dataflow/dataflow_runner.pyc in wait_until_finish(self, duration)
    771       while thread.isAlive():
    772         time.sleep(5.0)
--> 773       if self.state != PipelineState.DONE:
    774         # TODO(BEAM-1290): Consider converting this to an error log based on the
    775         # resolution of the issue.

/usr/local/envs/py2env/lib/python2.7/site-packages/apache_beam/runners/dataflow/dataflow_runner.pyc in state(self)
    741     }
--> 743     return (api_jobstate_map[self._job.currentState] if self._job.currentState
    744             else PipelineState.UNKNOWN)

KeyError: CurrentStateValueValuesEnum(JOB_STATE_PENDING, 9)
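
Reading the traceback, the failing line is a plain dictionary lookup: `api_jobstate_map[self._job.currentState]` raises KeyError when the job reports a state (here JOB_STATE_PENDING) that has no entry in the map. The sketch below only illustrates that failure mode. The map contents, the string values, and the function names are hypothetical stand-ins, not the actual Beam source.

```python
# Hypothetical stand-in for the map used in dataflow_runner.py; the real
# map uses enum values and PipelineState constants, not plain strings.
API_JOBSTATE_MAP = {
    "JOB_STATE_RUNNING": "RUNNING",
    "JOB_STATE_DONE": "DONE",
    "JOB_STATE_FAILED": "FAILED",
}


def strict_state(current_state):
    """Same shape as the lookup in the traceback: an unmapped state raises KeyError."""
    return API_JOBSTATE_MAP[current_state] if current_state else "UNKNOWN"


def tolerant_state(current_state):
    """Alternative (an assumption, not Beam's behaviour): fall back to UNKNOWN."""
    return API_JOBSTATE_MAP.get(current_state, "UNKNOWN")


if __name__ == "__main__":
    try:
        strict_state("JOB_STATE_PENDING")
    except KeyError as err:
        print("strict lookup failed with KeyError:", err)
    print("tolerant lookup returned:", tolerant_state("JOB_STATE_PENDING"))
```

If this reading is right, the error would come from the installed SDK's state map not covering a state the service can report, rather than from the pipeline code itself.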