Re: Splitting the repo


Hi,

I agree that splitting up Beam into separate repositories would cause more pain than gain.

To a large degree we already have independent modules, e.g. runners/* or sdks/*. This is not the case for the core, though, and it would be desirable to break it up further.

> possibly even with their own build system (unified only through a
> top-level "build everything" script that descends into each subdir and
> runs the appropriate command).

This is almost what we have. Yes, there are some dependencies on the Beam Gradle Plugin, but even with completely independent build directories, you'd still want shared config/tasks across the projects (which might bring you back to a setup similar to what we have).
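For illustration only, the quoted "build everything" idea could be sketched roughly as below. This is not Beam's actual build: the marker files, the commands in BUILD_COMMANDS, and the function names are all hypothetical, chosen just to show a top-level script that descends into each subdirectory and runs whatever command fits that directory.

```python
import subprocess
from pathlib import Path

# Hypothetical mapping from a marker file to the build command for that
# kind of subproject. These entries are assumptions for illustration,
# not Beam's real configuration.
BUILD_COMMANDS = {
    "build.gradle": ["gradle", "build"],
    "go.mod": ["go", "build", "./..."],
    "setup.py": ["python", "setup.py", "build"],
}


def plan_builds(root):
    """Return (subdir, command) pairs for each immediate subdirectory
    of root that contains a known marker file, in sorted order."""
    plan = []
    for subdir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        for marker, command in BUILD_COMMANDS.items():
            if (subdir / marker).exists():
                plan.append((subdir, command))
                break  # first matching marker wins
    return plan


def build_everything(root, run=subprocess.run):
    """Descend into each planned subdirectory and run its build command.

    The run parameter is injectable so the script can be dry-run or tested
    without invoking real build tools.
    """
    for subdir, command in plan_builds(root):
        run(command, cwd=subdir, check=True)
```

Even in a sketch this small, the mapping table is effectively shared configuration, which is the point made above: independent build directories still tend to grow a common layer.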

One of the pain points seems to be portability, which "polluted" some parts of the project (e.g. the legacy runners). As mentioned in this thread, that could have been solved with an abstraction. But the lack of an abstraction also forced us to adopt the portable pipeline code more quickly.

-Max

On 10.10.18 10:51, Romain Manni-Bucau wrote:
Yes to the split.

As for cleanliness, it is closely tied to the build tools and to the fake environments needed for modules that are not native to the build tool (for instance Go under Gradle, which is Java-first). This is why having a real build that is natural for each language would be beneficial IMO.

On Wed, Oct 10, 2018, 11:38, Jean-Baptiste Onofré <jb@xxxxxxxxxxxx> wrote:

    Correct, it's more "module splitting" than repositories indeed.

    Regards
    JB

    On 10/10/2018 10:35, Robert Bradshaw wrote:
     > Gotcha. So this is more about dividing the code (particularly core) into
     > finer modules, rather than splitting the modules into separate
     > repositories, right?
     >
     > On Wed, Oct 10, 2018 at 10:29 AM Jean-Baptiste Onofré <jb@xxxxxxxxxxxx> wrote:
     >
     >     The purpose is that we have a monolithic core today, mostly providing
     >     abstract classes.
     >
     >     The idea is to have something more API oriented with interface/SPI.
     >
     >     Our users would then be able to pick the part of the core they want,
     >     resulting in lighter artifacts, and for us, it gives a more flexible
     >     approach.
     >
     >     Regards
     >     JB
     >
     >     On 10/10/2018 10:26, Robert Bradshaw wrote:
     >     > My question was not whether we should split the repo, but why?
     >     > (Dividing things into more (or fewer) modules within a single repo is a
     >     > separate question.) Maybe I'm just not following what you mean by
     >     > "more API oriented." It would force stabler APIs.
     >     >
     >     > On Wed, Oct 10, 2018 at 10:18 AM Jean-Baptiste Onofré <jb@xxxxxxxxxxxx> wrote:
     >     >
     >     >     Hi,
     >     >
     >     >     +1, I even think we could split the core deeper.
     >     >
     >     >     I discussed introducing core-sql, core-schema, core-sdf, ...
     >     >     with Luke and Reuven.
     >     >
     >     >     It's not a huge effort, and it would allow us to move forward on
     >     >     Beam's "more API oriented" approach.
     >     >
     >     >     Regards
     >     >     JB
     >     >
     >     >     On 10/10/2018 10:12, Robert Bradshaw wrote:
     >     >     > Hi everyone,
     >     >     >
     >     >     > While IMHO it's too early to even be able to split the repo, it's
     >     >     > not too early to talk about it, and I wanted to spin this off to
     >     >     > keep the other thread focused.
     >     >     >
     >     >     > In particular, I am trying to figure out exactly what is hoped to be
     >     >     > gained by splitting things up. In my experience, a single project that
     >     >     > spans multiple repos has always come with excessive overhead and pain.
     >     >     > Of note, we recently merged the website and dataflow-worker into the
     >     >     > main repo *exactly* to avoid this pain (though the latter was
     >     >     > particularly bad due to one of the repos being private).
     >     >     >
     >     >     > If need be, I don't see any reason we can't have a single repo with
     >     >     > directories
     >     >     >
     >     >     > model/
     >     >     > website/
     >     >     > java/
     >     >     > go/
     >     >     > ...
     >     >     >
     >     >     > possibly even with their own build system (unified only through a
     >     >     > top-level "build everything" script that descends into each subdir and
     >     >     > runs the appropriate command). I'm not saying we should do this (there
     >     >     > is value in having a single consistent build system, etc.) but it's
     >     >     > possible. We could probably even make separate releases out of this
     >     >     > single repo (if we wanted, though given that our releases are time-based
     >     >     > rather than feature-based, I don't see much advantage here).
     >     >     >
     >     >     > Also, there was the comment.
     >     >     >
     >     >     > On Wed, Oct 10, 2018 at 7:35 AM Romain Manni-Bucau <rmannibucau@xxxxxxxxx> wrote:
     >     >     >>
     >     >     >> Side note: beam portability would be saner if added on top of others
     >     >     >> than the opposite which is done today.
     >     >     >
     >     >     > I think you brought this up before, Romain. I'm still trying to
     >     >     > wrap my head around what you mean here. Could you elaborate
     >     >     > what such a structure would look like?
     >     >
     >     >     --
     >     >     Jean-Baptiste Onofré
     >     >     jbonofre@xxxxxxxxxx
     >     >     http://blog.nanthrax.net
     >     >     Talend - http://www.talend.com
     >     >
     >
     >     --
     >     Jean-Baptiste Onofré
     >     jbonofre@xxxxxxxxxx
     >     http://blog.nanthrax.net
     >     Talend - http://www.talend.com
     >

    --
    Jean-Baptiste Onofré
    jbonofre@xxxxxxxxxx
    http://blog.nanthrax.net
    Talend - http://www.talend.com