Re: Gandiva snapshot releases


I am not familiar with snapshot deployment processes for Java packages.
Since there are so many Apache projects that are Java-based, we should look
at what others are doing. If you need help with anything that requires PMC
karma I can try to help.

On Mon, Oct 15, 2018, 7:36 AM Krisztián Szűcs <szucs.krisztian@xxxxxxxxx>
wrote:

> Actually, we should deploy to GitHub releases as well, see
>
> https://github.com/apache/arrow/blob/master/dev/tasks/conda-recipes/travis.linux.yml#L59
>
> Then we could download the jars from GitHub directly, similar to
> https://github.com/kszucs/crossbow/releases/tag/nightly-73-ubuntu-xenial
>
> Once we have a working build cycle deploying to GitHub, we can set up another
> deployment to Maven (however, I can't manage Maven permissions), so the
> Maven credential issue shouldn't block the actual artifact building
> process.
>
> On Mon, Oct 15, 2018 at 1:12 PM Praveen Kumar <praveen@xxxxxxxxxx> wrote:
>
> > Hey Kristian,
> >
> > Yes, you are right, I am planning to use an encrypted variable in Travis.
> >
> > But which token do I use for the deployment? My OSSRH account will not have
> > deploy permissions in the apache/arrow Maven repository, so I was wondering
> > which token to use. Once this is clarified, I will raise the PR that creates
> > the jar and deploys it.
> >
> > Would you prefer to discuss the token on the PR too? If yes, I will raise
> > the PR by tomorrow. So far I have tested on my private repo
> > (https://github.com/praveenbingo/crossbow), but the actual deploy would be
> > configured on https://github.com/dremio/crossbow.
> >
> > Thx.
> >
> > On Mon, Oct 15, 2018 at 4:09 PM Krisztián Szűcs <
> szucs.krisztian@xxxxxxxxx
> > >
> > wrote:
> >
> > > Hi Praveen,
> > >
> > > I assume we're planning to run it on Travis, so we need to pass an
> > > encrypted env variable:
> > >
> > >
> >
> https://docs.travis-ci.com/user/environment-variables/#defining-variables-in-repository-settings
> > >
> > > Have you created the crossbow task for creating the jar? If you submit a
> > > PR, we could discuss the deployment steps further there.
> > >
> > > On Mon, Oct 15, 2018 at 5:53 AM Praveen Kumar <praveen@xxxxxxxxxx>
> > wrote:
> > >
> > > > Hi Kristian/Wes,
> > > >
> > > > Can you please advise on the deploy tokens? Also, do you want to include
> > > > the Arrow jars in the snapshot deploy?
> > > >
> > > > Thx.
> > > >
> > > > On Fri, Oct 12, 2018 at 11:50 AM Praveen Kumar <praveen@xxxxxxxxxx>
> > > wrote:
> > > >
> > > > > Hi Kristian,
> > > > >
> > > > > Thanks for reviewing.
> > > > >
> > > > > Yup, that is our plan too; we are targeting the Ubuntu release first. We
> > > > > will pick up the Mac build and the combiner later, as required.
> > > > >
> > > > > As for the frequency of deployments, we would deploy at least once a day,
> > > > > with the flexibility to trigger manually too.
> > > > >
> > > > > Thx.
> > > > >
> > > > > On Thu, Oct 11, 2018 at 9:41 PM Krisztián Szűcs <
> > > > szucs.krisztian@xxxxxxxxx>
> > > > > wrote:
> > > > >
> > > > >> On Thu, Oct 11, 2018 at 12:58 PM Praveen Kumar <
> praveen@xxxxxxxxxx>
> > > > >> wrote:
> > > > >>
> > > > >> > Hi All,
> > > > >> >
> > > > > >> > I spent some time today understanding crossbow and it looks great!
> > > > > >> >
> > > > > >> > To unblock ourselves immediately, we are going to do the Ubuntu deploy
> > > > > >> > first, followed by the Mac deploy and the fat jar deployment.
> > > > >> >
> > > > > >> > To confirm our understanding, we would be doing the following:
> > > > > >> >
> > > > > >> > 1. Create a queue repo similar to the one here
> > > > > >> > (https://github.com/praveenbingo/crossbow) but under the dremio org.
> > > > >> >
> > > > > >> Correct, although we might want a centralized crossbow repo to deploy
> > > > > >> scheduled (e.g. nightly) packages.
> > > > >>
> > > > > >> > 2. Have the repo kick off crossbow builds for each OS that we want.
> > > > >> >
> > > > > >> Correct. To run the tasks:
> > > > > >> `python crossbow.py submit gandiva-osx gandiva-ubuntu`
> > > > > >> It returns the build identifier, e.g. `build-123`.
> > > > >>
> > > > > >> > 3. In addition to the OS builds, there would be another build which would
> > > > > >> > just wait for the OS builds to finish (with some timeout) and, once they
> > > > > >> > are done, package the fat jar and deploy it to Maven.
> > > > >> >
> > > > > >> Basically yes, but depending on the build times it might be worth building
> > > > > >> the fat jar locally instead (of course you can trigger another task which
> > > > > >> does the same thing, just remotely). Currently the artifact downloading is
> > > > > >> built into the `sign` command, but we can quickly factor that out:
> > > > > >> `python crossbow.py sign build-123`
> > > > >>
> > > > > >> I'd like to generalize task dependencies, but this is definitely the
> > > > > >> quickest to start with.
> > > > >>
> > > > >> >
> > > > > >> > The only thing that I am unclear about is the Maven deploy tokens. Since I
> > > > > >> > am not a committer with permissions to push to the Maven repo, I would need
> > > > > >> > keys to be configured in the dremio/crossbow environment variables.
> > > > >> >
> > > > >> How often do We want to ship fat jars?
> > > > >>
> > > > >> >
> > > > > >> > Wes - do Siddharth/Jacques have permissions to push to the Maven repo, and
> > > > > >> > can I use the same?
> > > > > >> >
> > > > > >> > Also, it looks like the release scripts here
> > > > > >> > <https://github.com/apache/arrow/blob/master/dev/release/01-perform.sh>
> > > > > >> > would need to be changed as well if we want to deploy the fat jar as part
> > > > > >> > of releases.
> > > > >> >
> > > > >> Correct.
> > > > >>
> > > > >> >
> > > > > >> > Kristian - can you please review the proposed steps and let me know if they
> > > > > >> > look correct to you?
> > > > >> >
> > > > >>  Absolutely!
> > > > >>
> > > > > >> BTW, if you want to unblock yourself first, then it's enough to have a
> > > > > >> single task which builds the Ubuntu libs and the fat jar (in a single CI
> > > > > >> build), and we can handle the dependent task (fat jar building) after we
> > > > > >> introduce another child (Mac or Windows). So we could skip the third step
> > > > > >> in the first iteration.
> > > > >>
> > > > >> >
> > > > >> > Thx.
> > > > >> >
> > > > >> >
> > > > >> > On Wed, Oct 10, 2018 at 11:33 PM Praveen Kumar <
> > praveen@xxxxxxxxxx>
> > > > >> wrote:
> > > > >> >
> > > > >> > > Hi Wes,
> > > > >> > >
> > > > > >> > > I'll take this to completion. Will send out a proposal tomorrow.
> > > > >> > >
> > > > >> > > Thx.
> > > > >> > >
> > > > >> > > On Wed, Oct 10, 2018, 23:32 Wes McKinney <wesmckinn@xxxxxxxxx
> >
> > > > wrote:
> > > > >> > >
> > > > >> > >> hi folks,
> > > > >> > >>
> > > > > >> > >> How would you like to proceed on this? I'm tracking many projects
> > > > > >> > >> right now so I want to make sure someone else is "in charge" on this
> > > > > >> > >> effort
> > > > >> > >>
> > > > >> > >> Thanks,
> > > > >> > >> Wes
> > > > >> > >> On Sat, Oct 6, 2018 at 10:37 AM Wes McKinney <
> > > wesmckinn@xxxxxxxxx>
> > > > >> > wrote:
> > > > >> > >> >
> > > > > >> > >> > > We could create a worker-pool-like abstraction where the workers are
> > > > > >> > >> > > the CI services, but that would require a scheduler to poll the finished
> > > > > >> > >> > > jobs and then submit the dependent ones. This sounds a bit inconvenient:
> > > > > >> > >> > > where would that scheduler run: locally, on a CI, or self-hosted?
> > > > >> > >> >
> > > > > >> > >> > Inevitably we're going to need to build some kind of job scheduler,
> > > > > >> > >> > whether it uses Airflow or Luigi or some other tool of our own
> > > > > >> > >> > devising.
> > > > > >> > >> >
> > > > > >> > >> > Apache Arrow is eventually going to need a host where we can manage
> > > > > >> > >> > such workflows. I'm looking into the possibility of a physical
> > > > > >> > >> > CUDA-equipped host that could be made available to Arrow developers to
> > > > > >> > >> > use for testing and benchmarking. I may need to run the machine out of
> > > > > >> > >> > my home (we did something similar for pandas -- a physical machine that
> > > > > >> > >> > we can SSH into).
> > > > > >> > >> >
> > > > > >> > >> > All this idealism aside -- we take the shortest path possible for this
> > > > > >> > >> > particular packaging job, and make improvements as we can going
> > > > > >> > >> > forward.
> > > > >> > >> > On Sat, Oct 6, 2018 at 9:31 AM Krisztián Szűcs
> > > > >> > >> > <szucs.krisztian@xxxxxxxxx> wrote:
> > > > >> > >> > >
> > > > > >> > >> > > I see now, so the jar would contain all three shared libraries.
> > > > >> > >> > >
> > > > > >> > >> > > We could create a worker-pool-like abstraction where the workers are the
> > > > > >> > >> > > CI services, but that would require a scheduler to poll the finished jobs
> > > > > >> > >> > > and then submit the dependent ones. This sounds a bit inconvenient: where
> > > > > >> > >> > > would that scheduler run: locally, on a CI, or self-hosted?
> > > > > >> > >> > >
> > > > > >> > >> > > Another approach would be to use the worker to schedule the next task,
> > > > > >> > >> > > similar to how dask's worker_client [1] launches tasks from tasks.
> > > > > >> > >> > > There could be synchronization problems though. This approach requires
> > > > > >> > >> > > bootstrapping crossbow on each CI job, but that would:
> > > > > >> > >> > > - make crossbow less CI-dependent (so it could use Azure Pipelines as well)
> > > > > >> > >> > > - unify the artifact uploading and downloading logic, which is required in
> > > > > >> > >> > >   order to support dependent tasks
> > > > > >> > >> > > - leave far less redundancy in task definitions
> > > > > >> > >> > >
> > > > > >> > >> > > What do you think? I'd prefer the second one.
> > > > >> > >> > >
> > > > >> > >> > > [1]
> > > > >> > >> > >
> > > > >> > >>
> > > > >> >
> > > > >>
> > > >
> > >
> >
> https://github.com/dask/distributed/blob/master/docs/source/task-launch.rst
> > > > >> > >> > >
> > > > >> > >> > > On Sat, Oct 6, 2018 at 10:57 AM Wes McKinney <
> > > > >> wesmckinn@xxxxxxxxx>
> > > > >> > >> wrote:
> > > > >> > >> > >
> > > > > >> > >> > > > It seems the complicated part of this will be having a dependent task
> > > > > >> > >> > > > that packages up the 3 shared libraries, one for each platform, after
> > > > > >> > >> > > > the individual packaging tasks are run. How would you propose handling
> > > > > >> > >> > > > that?
> > > > >> > >> > > > On Fri, Oct 5, 2018 at 8:03 AM Krisztián Szűcs
> > > > >> > >> > > > <szucs.krisztian@xxxxxxxxx> wrote:
> > > > >> > >> > > > >
> > > > >> > >> > > > > Ohh, just read the thread, sorry!
> > > > >> > >> > > > >
> > > > > >> > >> > > > > So crossbow is located here:
> > > > > >> > >> > > > > https://github.com/apache/arrow/tree/master/dev/tasks
> > > > > >> > >> > > > > I suggest "forking" the python-wheels directory, which contains three
> > > > > >> > >> > > > > templated ymls for osx, win and linux builds. For building on linux,
> > > > > >> > >> > > > > something like the following should be sufficient:
> > > > > >> > >> > > > > https://gist.github.com/kszucs/39154876d60c4109ff59b678afd65b19
> > > > > >> > >> > > > > Then you need another entry in the tasks.yml, for example:
> > > > > >> > >> > > > > jar-gandiva-linux:
> > > > > >> > >> > > > >   platform: linux
> > > > > >> > >> > > > >   template: gandiva-jars/travis.linux.yml
> > > > > >> > >> > > > >   params:
> > > > > >> > >> > > > >     # arbitrary params which are available from the templated yml
> > > > > >> > >> > > > >     ...
> > > > > >> > >> > > > >   artifacts:
> > > > > >> > >> > > > >     # these are the expected artifacts from the build
> > > > > >> > >> > > > >     - gandiva-SNAPSHOT-{version}.jar
> > > > > >> > >> > > > >     ...
> > > > >> > >> > > > >
> > > > > >> > >> > > > > Of course crossbow is wired towards the current packaging requirements,
> > > > > >> > >> > > > > so we will likely need to adjust it to the newly appearing requirements.
> > > > >> > >> > > > >
> > > > >> > >> > > > > Feel free to reach me on gitter @kszucs.
> > > > >> > >> > > > > On Oct 4 2018, at 2:02 pm, Wes McKinney <
> > > > wesmckinn@xxxxxxxxx
> > > > >> >
> > > > >> > >> wrote:
> > > > >> > >> > > > > >
> > > > >> > >> > > > > > hi Praveen,
> > > > > >> > >> > > > > > Probably the best way to accomplish this is to use our new Crossbow
> > > > > >> > >> > > > > > infrastructure for task automation on Travis CI and Appveyor rather
> > > > > >> > >> > > > > > than trying to do all of this within the CI entries. This is how we
> > > > > >> > >> > > > > > are producing all of our binary artifacts for releases now --
> > > > > >> > >> > > > > > presumably in future ASF releases, we will want to include a
> > > > > >> > >> > > > > > platform-independent Gandiva JAR in our release votes, so this all
> > > > > >> > >> > > > > > needs to end up in Crossbow anyway. The intent is for the Crossbow
> > > > > >> > >> > > > > > system to take on responsibility for all packaging automation rather
> > > > > >> > >> > > > > > than using the normal CI for that.
> > > > > >> > >> > > > > >
> > > > > >> > >> > > > > > Krisztian, do you have time to help Praveen and the Gandiva crew with
> > > > > >> > >> > > > > > this project? This will be an important test to document and improve
> > > > > >> > >> > > > > > Crossbow for such use cases
> > > > >> > >> > > > > >
> > > > >> > >> > > > > > Thanks
> > > > >> > >> > > > > > Wes
> > > > >> > >> > > > > > On Thu, Oct 4, 2018 at 7:14 AM Praveen Kumar <
> > > > >> > >> praveen@xxxxxxxxxx>
> > > > >> > >> > > > wrote:
> > > > >> > >> > > > > > >
> > > > >> > >> > > > > > > Hi Folks,
> > > > > >> > >> > > > > > > As part of https://issues.apache.org/jira/browse/ARROW-3385, we are
> > > > > >> > >> > > > > > > planning to perform a snapshot release of the Gandiva jar on each commit to
> > > > > >> > >> > > > > > > master. This would be a platform-independent jar that contains the core
> > > > > >> > >> > > > > > > Gandiva library and its JNI bridge packaged for Mac, Windows and *nix
> > > > > >> > >> > > > > > > platforms.
> > > > >> > >> > > > > > >
> > > > > >> > >> > > > > > > The current plan is to deploy separate snapshot jars for each OS through
> > > > > >> > >> > > > > > > entries in the Gandiva CI matrix and then have a combine step that pulls in
> > > > > >> > >> > > > > > > each OS-specific jar and builds a jar that has all the native libraries.
> > > > > >> > >> > > > > > > This build/deploy would happen only for commits on the master branch and not
> > > > > >> > >> > > > > > > for pull requests.
> > > > >> > >> > > > > > >
> > > > > >> > >> > > > > > > Does the plan sound OK? If not, please let us know if there is a better way
> > > > > >> > >> > > > > > > to achieve the same.
> > > > >> > >> > > > > > >
> > > > > >> > >> > > > > > > If it sounds OK, can someone please help with the following:
> > > > > >> > >> > > > > > > 1. It looks like we only do Travis builds and not Appveyor for master in
> > > > > >> > >> > > > > > >    Arrow. Any reason for this?
> > > > > >> > >> > > > > > > 2. Even if we did Appveyor, is there a way to sequence the builds, e.g. wait
> > > > > >> > >> > > > > > >    for Appveyor to complete before kicking off Travis? We would need the
> > > > > >> > >> > > > > > >    DLL to be pre-built.
> > > > > >> > >> > > > > > > 3. Someone would need to configure the credentials to use for the OSSRH
> > > > > >> > >> > > > > > >    deployment. The credentials would need access to deploy to
> > > > > >> > >> > > > > > >    org.apache.arrow.
> > > > >> > >> > > > > > > Thanks ahead!
> > > > >> > >> > > >
> > > > >> > >>
> > > > >> > >
> > > > >> >
> > > > >>
> > > > >
> > > >
> > >
> >
>