
Re: How to use "PortableRunner" in Python SDK?


A quick follow-up on using the current PortableRunner.

I followed the exact three steps that Ankur and Maximilian shared in https://beam.apache.org/roadmap/portability/#python-on-flink. The wordcount example is still hanging after 10 minutes. I also tried specifying explicit input/output args, using either a GCS folder or the local file system, but neither works.
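For anyone trying to reproduce this, here is a minimal sketch of the kind of invocation described above, with explicit input/output args and a timeout so that a hang shows up as an error instead of blocking forever. The job endpoint `localhost:8099`, the helper name `wordcount_command`, and the opt-in `--run` switch are assumptions made for illustration; they are not taken from this thread, so adjust them to whatever your job server actually listens on.

```python
import subprocess
import sys


def wordcount_command(job_endpoint, output, input_path=None):
    """Build the argv for the wordcount example against a portable job server.

    `job_endpoint` is an assumed host:port of an already running job server.
    """
    cmd = [
        sys.executable, "-m", "apache_beam.examples.wordcount",
        "--runner=PortableRunner",
        "--job_endpoint=" + job_endpoint,
        "--output=" + output,
    ]
    if input_path:
        # Optional explicit input, e.g. a local file or a GCS path.
        cmd.append("--input=" + input_path)
    return cmd


if __name__ == "__main__":
    cmd = wordcount_command("localhost:8099", "/tmp/test_output")
    print(" ".join(cmd))
    # Only launch the pipeline when explicitly asked to (hypothetical switch).
    if "--run" in sys.argv[1:]:
        try:
            # Give up after 10 minutes instead of hanging indefinitely.
            subprocess.run(cmd, timeout=600, check=True)
        except subprocess.TimeoutExpired:
            print("wordcount did not finish within 10 minutes")
```

Running the script without `--run` only prints the command, which is handy for comparing against what the python-on-flink instructions ask for.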

I spent some time looking into it but have no conclusion yet. At this point, though, I guess it does not matter much any more, given that we already plan to merge PortableRunner into the Java reference runner (i.e. :beam-runners-reference-job-server).

It would still be appreciated if someone could try out the python-on-flink instructions, in case this is just due to my local machine setup. Thanks!



On Thu, Nov 8, 2018 at 5:04 PM Ruoyun Huang <ruoyun@xxxxxxxxxx> wrote:
Thanks Maximilian! 

I am working on migrating the existing PortableRunner to use the Java ULR (Link to Notes). If this issue is non-trivial to solve, I would vote for removing this default behavior as part of the consolidation.

On Thu, Nov 8, 2018 at 2:58 AM Maximilian Michels <mxm@xxxxxxxxxx> wrote:
In the long run, we should get rid of the Docker-inside-Docker approach,
which was only intended for testing anyways. It would be cleaner to
start the SDK harness container alongside the JobServer container.

Short term, I think it should be easy to either fix the permissions of
the mounted "docker" executable or use a Docker image for the JobServer
which comes with Docker pre-installed.

JIRA: https://issues.apache.org/jira/browse/BEAM-6020

Thanks for reporting this Ruoyun!

-Max

On 08.11.18 00:10, Ruoyun Huang wrote:
> Thanks Ankur and Maximilian.
>
> Just for reference, in case other people encounter the same error
> message: the "permission denied" error in my original email is exactly
> due to the docker-inside-docker issue that Ankur mentioned. Thanks, Ankur!
> I didn't make the connection when you said it and had to discover it the
> hard way (I thought my docker installation was messed up).
>
> On Tue, Nov 6, 2018 at 1:53 AM Maximilian Michels <mxm@xxxxxxxxxx> wrote:
>
>     Hi,
>
>     Please follow
>     https://beam.apache.org/roadmap/portability/#python-on-flink
>
>     Cheers,
>     Max
>
>     On 06.11.18 01:14, Ankur Goenka wrote:
>      > Hi,
>      >
>      > The Portable Runner requires a job server uri to work with. The
>      > current default job server docker image is broken because of docker
>      > inside docker issue.
>      >
>      > Please refer to
>      > https://beam.apache.org/roadmap/portability/#python-on-flink for
>      > how to run a wordcount using Portable Flink Runner.
>      >
>      > Thanks,
>      > Ankur
>      >
>      > On Mon, Nov 5, 2018 at 3:41 PM Ruoyun Huang <ruoyun@xxxxxxxxxx> wrote:
>      >
>      >     Hi, Folks,
>      >
>      >     I want to try out the Python PortableRunner by using the
>      >     following command:
>      >
>      >     sdk/python: python -m apache_beam.examples.wordcount --output=/tmp/test_output --runner PortableRunner
>      >
>      >     It fails with the following error message:
>      >
>      >     Caused by: java.lang.Exception: The user defined 'open()' method caused an exception: java.io.IOException: Cannot run program "docker": error=13, Permission denied
>      >     at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498)
>      >     at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
>      >     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712)
>      >     ... 1 more
>      >     Caused by: org.apache.beam.repackaged.beam_runners_java_fn_execution.com.google.common.util.concurrent.UncheckedExecutionException: java.io.IOException: Cannot run program "docker": error=13, Permission denied
>      >     at org.apache.beam.repackaged.beam_runners_java_fn_execution.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4994)
>      >     ... 7 more
>      >
>      >
>      >
>      >     My py2 environment is properly configured, because DirectRunner
>      >     works. I also tested my docker installation with 'docker run
>      >     hello-world' and saw no issue.
>      >
>      >
>      >     Thanks.
>      >     --
>      >     ================
>      >     Ruoyun  Huang
>      >
>
>
>
> --
> ================
> Ruoyun  Huang
>


--
================
Ruoyun  Huang


