Re: Nightly tests for Arrow


A Confluence page sounds good to me. I'll create it.

On Wed, Oct 10, 2018 at 8:06 PM Wes McKinney <wesmckinn@xxxxxxxxx> wrote:

> How would you all like to manage this project? Maybe we should create
> a Confluence wiki page to enumerate the different facets of the effort
> and make sure we create JIRA issues to plot a course to where we want
> to go. Would someone like to take point on this?
>
> Thanks
> Wes
> On Tue, Oct 9, 2018 at 2:33 PM Krisztián Szűcs
> <szucs.krisztian@xxxxxxxxx> wrote:
> >
> > On Tue, Oct 9, 2018 at 6:02 PM Antoine Pitrou
> > <antoine@xxxxxxxxxx> wrote:
> >
> > >
> > > Le 09/10/2018 à 17:54, Wes McKinney a écrit :
> > > > hi folks,
> > > >
> > > > After the packaging automation work for 0.10 was completed, we have
> > > > stalled out a bit on one of the objectives of this framework, which is
> > > > to allow contributors to define and add new tasks that can be run on
> > > > demand or as part of a nightly job.
> > > >
> > > > So we have some problems to solve:
> > > >
> > > > * How to define a task we wish to validate (like building the API
> > > > documentation, or building Arrow with some particular build
> > > > parameters) as a new Crossbow task -- document this well so that
> > > > people have some instructions to follow
> > >
> > Crossbow does indeed lack documentation on this. Defining a task requires
> > a CI configuration and commands per platform, plus a section in tasks.yml.
> > However, I don't think this is straightforward enough. It is not as simple
> > as writing a bash/batch script: we still need to define the configuration
> > management pieces, which makes user friendliness harder to achieve.
> >
> > > > * How to add a task to some kind of a nightly build manifest
> > >
> > > * Where to schedule and run the nightly jobs
> > >
> > Currently the nightly builds are submitted by this nightly Travis script:
> > https://github.com/kszucs/crossbow/blob/trigger-nightly-builds/.travis.yml
> > We can have an arbitrary number of branches to trigger custom jobs, but
> > this requires manual Travis setup, and the ergonomics are still not
> > satisfying.
> >
> > > > * Reporting nightly build failures to the mailing list
> > >
> > I regularly check the nightly builds, which occasionally fail, mostly
> > due to transient failures.
> > For example, the last conda nightlies failed because conda-build has some
> > issues with libarchive; during the feedstock updates I couldn't even
> > rerender them locally.
> > BTW, to send the errors to the mailing list we need to set the
> > CROSSBOW_EMAIL env variable:
> > https://github.com/apache/arrow/blob/master/dev/tasks/crossbow.py#L475
> > (We might want to use a centralized crossbow repository with proper
> > permissions, though.)
> >
> > > >
> > > > In terms of scalability requirements, this needs to accommodate 50-100
> > > > tasks.
> > >
> > The current tasks.yml contains a lot of duplication, which bothers me, but
> > it provides more flexibility than having another "matrix" definition and
> > implementation. I don't have a user-friendly solution for that yet.
> > Parallelization is another question: a single crossbow repo can run ~5
> > Travis jobs and a single AppVeyor job simultaneously. However, we can
> > improve that by introducing more CI services, e.g. pipelines and/or
> > CircleCI.
> >
> > CI service agnostic?
> > Ideally we should abstract away the CI service (the worker itself), where
> > we do the configuration management right now; see the ".<service>.yml"
> > files:
> > https://github.com/apache/arrow/tree/master/dev/tasks/conda-recipes
> > But then we need to create another, custom (I hope not yml) "dialect" to
> > define build requirements (e.g. node, python, ruby, clang). It's quite
> > hard to design an easy and flexible interface for that.
> >
> > > >
> > > > This won't be the last time we need to do some infrastructure work to
> > > > scale our testing process, but this will help with testing things that
> > > > we want to make sure work but without having to increase the size of
> > > > our CI matrix.
> > >
> > > One question which came to my mind is how to develop, debug and maintain
> > > the nightly tasks without waiting for the nightly Travis run for
> > > validation.  It doesn't seem easy to trigger a "nightly" build from the
> > > Travis UI.
> > >
> > Good point! Triggering is not the actual issue; evaluating the outcome is.
> > We can submit builds if the PR touches e.g. the task definitions, but we
> > cannot really wait for the results, so triggering builds could be useless.
> >
> > Actually this can be solved by a GitHub integration bot Wes has
> > mentioned, with manual triggering and approval.
> >
> > >
> > > Regards
> > >
> > > Antoine.
> > >
> > All in all, I feel usability is crucial here. A couple of examples of
> > what a straightforward task definition should look like would be handy.
> > Handling and defining task dependencies is another question too (I'm
> > experimenting with a prototype, though).
> >
> > Regards, Krisztian
>
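
To make the tasks.yml duplication vs. "matrix" trade-off from the quoted
message a bit more concrete, here is a minimal sketch of expanding a compact
matrix-style definition into explicit per-combination tasks. Everything in it
is an assumption for illustration only: the template keys, the script paths,
the task names and the expand_tasks helper are hypothetical and do not
reflect the actual tasks.yml schema or the Crossbow code.

```python
from itertools import product

# Hypothetical compact definitions: the keys, script paths and axis values
# are illustrative only, not the real tasks.yml schema used by Crossbow.
TASK_TEMPLATES = {
    "docs-build": {
        "script": "ci/build_docs.sh",  # hypothetical script path
        "matrix": {"platform": ["linux"], "python": ["3.6"]},
    },
    "cpp-build": {
        "script": "ci/build_cpp.sh",  # hypothetical script path
        "matrix": {"platform": ["linux", "osx"], "python": ["2.7", "3.6"]},
    },
}


def expand_tasks(templates):
    """Expand each template's matrix into one explicit task per combination."""
    tasks = {}
    for name, template in sorted(templates.items()):
        # Sort the axis names so the generated task names are deterministic.
        axes = sorted(template["matrix"])
        for values in product(*(template["matrix"][axis] for axis in axes)):
            params = dict(zip(axes, values))
            suffix = "-".join(params[axis] for axis in axes)
            tasks[f"{name}-{suffix}"] = {"script": template["script"], **params}
    return tasks


if __name__ == "__main__":
    for task_name, task in expand_tasks(TASK_TEMPLATES).items():
        print(task_name, task)
```

The expanded dictionary plays the role of the explicit, duplicated task list,
while the templates play the role of the matrix definition. The flexibility
cost mentioned in the thread shows up as soon as one combination needs a
setting that the shared template cannot express.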