
Re: Community Examples Repository

The way I see it, the examples repo need not be versioned. Each example should specify its dependencies the same way any external application would. In Python this would be the requirements.txt, where we can specify apache-beam>=2.5.0, for example. For Java, the dependencies can be specified in the pom.xml file. This would show users how to use Beam for their own applications, as Jesse mentioned.

A high friction point I've experienced in Java is creating a new pom file, so having a repository of examples that can be copy-pasted with a minimal and working pom file would be great.

In terms of packaging, I think we can get away with it just being a collection of independent examples with a testing infrastructure. Testing should be trigger-able at the root, but each sample should be tested in its own isolated environment. Since there are no dependencies, all examples can be tested in parallel. For every example, create its virtual environment, run all the tests and destroy it.
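To make the testing idea concrete, here is a minimal sketch in Python. It assumes that every example directory contains its own requirements.txt and that its tests can be found by unittest discovery; neither is settled anywhere in this thread. The names find_examples, check_example and check_all are hypothetical and not part of Beam.

```python
import shutil
import subprocess
import sys
import tempfile
import venv
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def find_examples(root):
    """Treat every directory under root that holds a requirements.txt as one example."""
    return sorted(p.parent for p in Path(root).rglob("requirements.txt"))


def check_example(example_dir):
    """Create a throwaway virtualenv, install the example's deps, run its tests, destroy the env."""
    env_dir = Path(tempfile.mkdtemp(prefix="example-env-"))
    try:
        venv.create(env_dir, with_pip=True)
        bin_dir = env_dir / ("Scripts" if sys.platform == "win32" else "bin")
        python = str(bin_dir / "python")
        # Install exactly what the example declares, as an external user would.
        subprocess.run(
            [python, "-m", "pip", "install", "-r", "requirements.txt"],
            cwd=example_dir,
            check=True,
        )
        # Assumption: tests are discoverable by unittest from the example's directory.
        result = subprocess.run([python, "-m", "unittest", "discover"], cwd=example_dir)
        return result.returncode == 0
    except subprocess.CalledProcessError:
        # Dependency installation failed, so the example counts as broken.
        return False
    finally:
        shutil.rmtree(env_dir, ignore_errors=True)


def check_all(root, run_one=check_example, workers=4):
    """Run every example from the root, in parallel, each in its own environment."""
    root = Path(root)
    examples = find_examples(root)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(run_one, examples))
    return {str(e.relative_to(root)): ok for e, ok in zip(examples, results)}
```

Calling check_all(".") from the repository root would return a mapping from each example's relative path to whether it passed. The run_one parameter is only there so the per-example step can be swapped out, for instance with a Maven invocation for the Java examples.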

If we want the examples to also live in PyPI, then we would require versioning (major.minor.patch). Since the exact version number doesn't really matter, we can just have the major.minor part mirror the current apache-beam version for consistency, and as new examples get added / modified we can bump the patch version indefinitely.
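A rough sketch of that versioning rule in Python follows. It assumes the patch counter keeps increasing and never resets when the apache-beam major.minor changes, which is only one reading of "indefinitely". The function name next_examples_version is hypothetical.

```python
def next_examples_version(beam_version, current_examples_version):
    """Mirror Beam's major.minor and bump the examples' own patch counter by one."""
    beam_major, beam_minor = beam_version.split(".")[:2]
    # Assumption: the patch number is never reset, so it only ever grows.
    patch = int(current_examples_version.split(".")[2])
    return f"{beam_major}.{beam_minor}.{patch + 1}"
```

For instance, with apache-beam at 2.5.0 and the examples package at 2.5.3, the next examples release would be 2.5.4.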

On Fri, Aug 3, 2018 at 3:33 PM Charles Chen <ccy@xxxxxxxxxx> wrote:
We should separate out the decision for (1) whether examples should be packaged separately upon release and (2) where the example will live code-wise, i.e. whether we want another repo.  With respect to the first item, I think the proposal needs more detail before we can decide here--for example, if we separate out the packaging for the examples, we need to change our build process and potentially release additional PyPI packages and this should be thought about before we can make a decision.

On Fri, Aug 3, 2018 at 3:23 PM Pablo Estrada <pabloem@xxxxxxxxxx> wrote:
Hello all,
I see a number of mixed responses. I think it would be helpful to push for a decision by calling for a vote. 

Also, the proposal has a number of parts, so perhaps we could ask David and other contributors to the proposal to outline a couple of alternatives that we can all vote on (e.g. #1 no examples repo, #2 all examples to new repo, #3 examples repo, but some examples remain in main repo).

The outcome may be no change at all, or some change, but at least we'll have a definite decision from the community.

Does that sound reasonable?

On Thu, Aug 2, 2018 at 11:09 AM Ankur Goenka <goenka@xxxxxxxxxx> wrote:
I like the initiative, but I feel that fragmenting the codebase will make it harder to discover examples. Having examples in a separate repo makes it easier to forget that examples should get the same love as the rest of the codebase.
The other challenge is the tooling and integration, which is harder with multiple repos.
It makes sense to isolate the examples and make them more obvious.
A sub project of examples as mentioned in the discussion might be sufficient without having much overhead.


On Thu, Aug 2, 2018 at 10:52 AM Kai Jiang <jiangkai@xxxxxxxxx> wrote:
Agreed with Rui. We could also add more SQL examples (for instance, with different IOs) for everyone to get started with.


On 2018/08/02 17:40:32, Rui Wang <ruwang@xxxxxxxxxx> wrote:
> I might have missed it: do the examples to be moved include those which are not
> under example/? For example, there are some BeamSQL examples in
> org/apache/beam/sdk/extensions/sql/example
> <https://github.com/apache/beam/tree/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/example>
> .
> It's better to keep the BeamSQL examples where they are because the related API
> might still change.
> -Rui
> On Thu, Aug 2, 2018 at 8:58 AM Ahmet Altay <altay@xxxxxxxxxx> wrote:
> > Robert, I agree with you in general. However, there is also a second
> > motivation. There is an increase in new PRs that are coming to add new
> > examples. This is great; however, the core code (including distributions) is
> > not a great place to host such examples. An examples repo would help in
> > this case. It could also serve as an entry point for new contributors.
> >
> >
> >
> > On Thu, Aug 2, 2018 at 12:40 AM, Robert Bradshaw <robertwb@xxxxxxxxxx>
> > wrote:
> >
> >> I have to admit I'm generally -1 on moving examples to a separate
> >> repository. In particular, I think it would actually inhibit the
> >> stated goals of increasing visibility and better keeping them up to
> >> date, and for all the reasons we just migrated the beam-site directory
> >> in. It seems the primary motivation is that it's difficult in Java to
> >> have a portion of the repo that depends on another as if it were
> >> "external" (i.e. the way others would use Beam) rather than being a
> >> sub-project of Beam. Is this not doable?
> >> On Wed, Aug 1, 2018 at 10:59 PM Charles Chen <ccy@xxxxxxxxxx> wrote:
> >> >
> >> > I would also prefer that examples be linked to releases so that we can
> >> build and test them during development; i.e. if your commit breaks
> >> wordcount, we want to know right away so we can revert.  Perhaps we can
> >> keep these in the repo but more clearly modularize the artifacts we release?
> >> >
> >> > For the Python SDK, if we separate this out in any way, there is the
> >> separate issue of dealing with namespace packages (which are kind of broken
> >> and poorly supported:
> >> https://github.com/pypa/python-packaging-user-guide/issues/265), if we
> >> want to keep the examples under the apache_beam.examples module path.  See
> >> also https://packaging.python.org/guides/packaging-namespace-packages/.
> >> >
> >> > On Wed, Aug 1, 2018 at 9:29 PM jb@xxxxxxxxxxxx <jb@xxxxxxxxxxxx> wrote:
> >> >>
> >> >> Hi,
> >> >>
> >> >> I don't have a problem with moving the examples to a dedicated repository.
> >> However, IMHO, we have to:
> >> >>
> >> >> 1. Keep a build of examples linked to latest core release/SNAPSHOT
> >> >> 2. Include the examples in the distribution (convenient for the users)
> >> >>
> >> >> On another topic, I think it would be better to avoid using Google
> >> Docs for this kind of discussion and to share directly on the mailing list (at
> >> least a summary or light details).
> >> >>
> >> >> Regards
> >> >> JB
> >> >>
> >> >> On Thursday, August 02, 2018 00:12 CEST, David Cavazos <
> >> dcavazos@xxxxxxxxxx> wrote:
> >> >>
> >> >>
> >> >> Hi everyone!
> >> >>
> >> >> We wanted to migrate the examples from the core repository to a new
> >> Beam community examples repository. As the number of examples grows, it
> >> makes sense to modularize and decouple the core functionality from the
> >> examples.
> >> >>
> >> >> We will also create some guidelines with the best practices for new
> >> examples to be submitted.
> >> >>
> >> >> For more details, feel free to take a look and comment on the proposal.
> >> >>
> >> >> Cheers,
> >> >> David
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >>
> >
> >