upcoming "build on branches" feature: msg#00054 python.buildbot.devel
Here are my notes on the feature I'm starting on this week. The details still need some working out, but I wanted to share what I'm thinking so everybody can chime in. I'll be doing this work on a branch, so that "0.7.0" will be the first release to contain the new code (even if there are 0.6.7-0.6.9 bugfix releases first). My goal is to include the long-discussed "try" feature in that release as well, since it is a natural extension of the "build this branch" framework.

Be aware that I'm just copying in notes from my buildbot diary here, so I'm working out the ideas and terminology as I go along. Later stuff tends to supersede earlier stuff, and nothing's nailed down yet.

-- begin notes --

Now I want to take on SF#1200394, building branches. Each VC system has a different approach.

The general idea is that each build has three input properties which might change from build to build (as opposed to the setup of their steps, which is fixed by the config file):

 branch
 revision
 patch

'patch' is for the 'try' feature. 'revision' comes from Changes: some versionstamp which will build everything in all the Changes (and nothing else). 'branch' is the new one.

Each Build should get a .branch attribute. Each VC step should look in its Build to see what branch to use. None means HEAD or Trunk or whatever default the VC system normally provides (which may require some configuration: CVS has a default HEAD, but SVN needs it provided as part of the svnurl).

The branch name will get sent to the slave VC commands in different forms. Conveniently this doesn't require changes to the slave code.

 Arch/Baz: this is just the args['version'] key
 CVS: args['branch']
 SVN: needs to be appended to args['svnurl']
 Darcs: "branches" are equivalent to repositories, so really the branch name would replace args['repourl'], but that is a security issue
 Git: good question, maybe args['repourl'] too
 P4: not applicable, would involve modifying the viewspec
 Monotone: will be args['branch'].
   Checkouts require either a revision ID or a branch name. Each revision ID belongs to a single branch.

** Update versus Clobber

You can't update a tree into a different branch (at least it doesn't seem like a good idea). It probably makes sense to put a .buildbot.branch stamp in the checked out tree, to remember which branch was last used. If the branch stamp is different than what we want to wind up with, clobber the tree first.

Or, we could reserve a Builder (and therefore a builddir) for HEAD, and use a separate one for branches. The HEAD Builder would use mode=update as usual, while the branches would always use mode=clobber.

Another point on the disk-versus-network axis is to use mode=copy, but use a different copydir when working with any branch. Really that would mean mode=copy for HEAD, but for other branches do an rmtree(branchdir) then mode=copy with copydir=branchdir.

** Security

The buildslaves are pwned by anyone who can commit code to a repository that they pull from. Branches add a wrinkle to this. Now the slaves are owned by anyone who can both commit changes to a given branch and then get the buildmaster to build from that branch.

Most VC systems don't offer particularly fine-grained ACLs: write access to the repository is an all-or-nothing affair. SVN probably does better. If the repository is using per-branch permissions, then either the admin needs to provide a list of acceptable branches, or they must accept that the slaves will build code for anything in the repository regardless of ACLs that might prevent certain people from committing to certain branches. (i.e. the buildbot will be more permissive than the repository).

** How named-branch Builds are triggered

The most obvious way is that someone specially requests one, with forceBuild(). It should acquire a branch= argument (and the Builder should have a default branch in case one is not provided).

Another way is to have Changes trigger non-trunk builds.
This would work for an environment in which branches are shared and need to be maintained just like HEAD. Ideally, the normal buildbot behavior should just be a degenerate case where HEAD is the only version tracked.

It might be easiest to do this by having a configured list of branches which are allowed to trigger builds. The default case just has ["HEAD"] (or [None]). This would allow well-established/shared branches to be tracked, while other personal branches could be built only on demand. A separate list of valid branch names would provide for ACL security (for SVN repositories which bother with it), but should default to allowing everything.

To do this, each Change must have a .branch attribute. The ChangeSource is responsible for providing it. This is easier to determine for some VC systems than others:

 Arch/Baz: easy, each commit is part of a specific 'version'
 CVS: Thomas' cvstag change pulls this from the mail parser
 SVN: this is tricky, since the URL is just base://base/branch../dir/file . Given just the URL of the file that was changed, you can't tell where the branch ends and the tree begins. The ChangeSource will need logic to do the split correctly, possibly with a predefined set of branch names.
 Darcs: equivalent to the repository name. To follow multiple "branches", the notifier must watch multiple repositories.
 Git: dunno
 P4: dunno, it uses a path scheme like SVN
 Monotone: each revision has a specific branch. The change source needs to extract it and include it in the change notification.

** Status display

It should be clear when a non-HEAD build is running (so it is obvious why other builds aren't). It should also not be confused with HEAD builds. Status targets that track current "tree is good / tree is broken" status should handle each branch separately, and remember that non-HEAD builds will not necessarily be re-built once they've been fixed.

The tag= attribute that thomasvs added should be reworked into a branch= attribute.
Each BuildStatus should have a .getBranch() or something (maybe part of .getSourceStamp).

** all-builder passing

Building from branches (like the 'try' feature) is useful to answer the question "Is it safe to commit my changes to the trunk?". Sometimes developers want to force a build on a specific Builder, but more often they want to run their changes on all Builders, just like committing it would.

It would be convenient to have some logic that watches all Builders and correlates their Builds with the inbound Changes (or Forces), so it could tag each Change or Force with a pass/fail status based upon a set of Builders.

One component of this is changing forceBuild() to be more like submitBuild(), which is a less-granular form of submitChange(). submitBuild() distributes BuildRequests to all Builders, while submitChange() distributes Changes to them. The Builders are free to deal with Changes as they see fit, ignoring them or scheduling Builds to incorporate those Changes. The BuildRequests bypass this mechanism, and just get scheduled to run the next time the slave is free. (this also hints that Builders should have a queue of pending builds, and some prioritization logic, and that the Change distribution should be a bit more clever).

side note: we might need a better name for the thing submitted by this submitBuild(). "Build" is a single build of a single sourcestamp by a single Builder. The thing we're talking about is a suite of builds (all of the same sourcestamp) across multiple Builders.

The other component is status reporting. The developer who submits their personal branch for testing can watch the waterfall display until the component Builds finish, and eyeball-merge the results together, but really there should be something to track all the Builds and notify the developer upon the first failure or the last success.

So imagine a BuildSet object. It takes a source stamp and a list of Builders.
It has one "soon" Deferred which fires on the first failure (if no Builds fail, it fires when the last Build finishes). It also has a "done" Deferred which doesn't fire at all until all Builds have completed (but when it fires it still indicates whether something went wrong).

This builder list should have some symbolic names, like "all", which the buildmaster admin can configure to a list of all the Builders that developers are responsible for keeping happy. This should use thomas' tag= attributes.

I really want to generalize this. I'm thinking that, instead of individual Builders/Builds making when-to-build decisions, there should be a layer which accepts Changes and (later) emits BuildSets. The current distinction between, say, a "quick" Builder and a set of "full" builders should be replaced with two instances of this BuildSet-producing layer, with different treeStableTimer values and different isFileImportant implementations. Ideally they should share Builders (so each Builder is just an architecture); that requires the quick/full distinction being expressed in the sourcestamp, which I'm not entirely comfortable with.

The objects that this layer emits should be communicated to the status targets too, with messages like:

 BuildSetStarted: put in the queue on the various builders
 BuildStarted: one of the component builds has begun
 BuildFinished: one of the component builds has finished
 BuildSetFailed: the first failure was observed
 BuildSetFinished: all builds have finished

Ok, one down side to this approach is that Builds are capable of testing multiple changes at once (more or less, with the obvious loss of granularity). The twisted 'reactors' builder takes an hour or two to run, and if it was asked to match Build for Build with the faster Builders, it would never catch up. The current "build anything that's stable" approach is a lot more effective.
This doesn't necessarily invalidate the BuildSet thing (manually requested BuildSets would behave as before, it's just Change-triggered BuildSets that are the issue). The BuildSet could be an output thing instead of an input thing, watching the Changes that come in and tracking all the Builds that contained them. (this would make it a status thing, but not a scheduling thing).

Hrm. Create a BuildRequest object that contains the source stamp. These are passed to Builders, which put them on the queue. submitBuild() creates regular ones, which are built independently. When the Change-driven scheduler layer decides a set of Changes are stable enough to build, it creates a BuildSet which creates mergeable BuildRequest objects. When the Builder gets a second mergeable BuildRequest, it is allowed to merge the two together.

Ooh, yes. The mergeability depends upon the branch (that is, the BuildRequest has both a "canBeMerged" flag and a branch name, and merges can only happen between BuildRequests that are both mergeable and have the same branch name). This would allow incoming Changes from multiple branches to be built as soon as possible. Merging will never delay a build, and the mergeable flag means that explicitly requested builds will run with exactly the set of sources requested.

The scheduling thing.. let's call it a Scheduler. It decides when to submit BuildRequests to the Builders. (the Builders get to decide when to actually run the resulting Builds, however, so the Scheduler doesn't control all aspects of scheduling). In particular, Locks are still between Builds, not BuildRequests.

However, Dependencies should (somehow) go through Schedulers. I think it will be enough to have each Scheduler list the set of other Schedulers that they depend upon (and maybe have them list the actual Schedulers, not just their names, to make it impossible to create a loop). Internally, the downstream one registers with the upstream one to receive information about Changes.
It needs to know when the upstream one has decided that a given Change has passed or failed. It can also find out when the Change has been ignored (which probably counts as success here). All Schedulers are guaranteed to get all Changes, so the downstream one can do 'd=upstream.getResults(changenum)', which will fire with SUCCESS or FAILURE (or maybe SKIPPED). The downstream Scheduler has a potential BuildSet, which waits until the upstream one has passed all of its component Changes.

Schedulers should be able to take input from more than just ChangeSources. Consider the case where you want to trigger a build every time some remote buildmaster has completed a successful build of some library that this project uses. You'd have a BuildTrigger object of some kind which subscribes to a PBListener status port on the remote buildmaster. You might want these to be called something else, or somehow mark them as not taking Changes. Or, maybe just let them take Changes too, just ignore them.

** control flow

 change-driven builds:
  ChangeSources send Changes to all Schedulers
  Schedulers send BuildRequests to (some) Builders
  Builders accumulate/prioritize BuildRequests, create and start Builds
  Schedulers watch BuildRequests, trigger dependent Schedulers

 higher level builds:
  BuildSources(sp?) give BuildSets to master
   use Schedulers instead?
  master sends BuildRequests to (specified) Builders

** Pieces to implement

 branch= arg/attribute on:
  VC slave commands (with .buildbot-branch stamp, clobber if changed)
  VC steps
  Build
  default value in Builder
  forceBuild (one Builder at a time)
 Change.branch
 BuildSet
  BuildSet(branch=None?, source_stamp, patch=None, builders)
  .waitUntilFirstFailure
  .waitUntilFinished
  control.addBuildSet(buildset)
  BuildSet status notification
   forms the basis for Problem tracking
 BuildRequest(source_stamp, branch, mergeable)
  .waitUntilFinished
  Builder.addRequest(buildrequest)
 Scheduler(name, treeStableTimer, branches, fileIsImportant, builders)
  c['schedulers'] = [s1,s2]
  default scheduler delivers to all builders
  changemaster.addChange: submits to all Schedulers
 eventually: turn BuildRequest/Build into Build/BuildProcess

** more thoughts

The status events created by BuildSets finishing will form the basis for Problem tracking.

Schedulers can also subscribe to hear about build status. The "retry transient failed builds" scheduler would do this to watch for builds that had failed in one builder (but not the others of the set) and re-try them. I don't know how to relate this to the BuildSet, though, unless something could say that the BuildSet hadn't really failed yet because some retries might still be pending.

-- end notes --

Let me know what y'all think.

cheers,
 -Brian
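
For readers who want something concrete, the merge rule from the notes (two BuildRequests may be merged only when both are mergeable and both name the same branch) can be sketched roughly as follows. This is only an illustration in present-day Python, not buildbot code: the field names, can_merge(), BuilderQueue and add_request() are all hypothetical names chosen for the example, and a real Builder would also have to handle prioritization and Locks.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BuildRequest:
    """Hypothetical stand-in for the BuildRequest described in the notes."""
    branch: Optional[str]   # None stands for HEAD/trunk
    mergeable: bool         # False for explicitly requested builds
    changes: List[str] = field(default_factory=list)


def can_merge(a: BuildRequest, b: BuildRequest) -> bool:
    # Both requests must be mergeable and must name the same branch.
    return a.mergeable and b.mergeable and a.branch == b.branch


class BuilderQueue:
    """Hypothetical per-Builder queue of pending BuildRequests."""

    def __init__(self) -> None:
        self.pending: List[BuildRequest] = []

    def add_request(self, req: BuildRequest) -> BuildRequest:
        # Fold the new request into the first compatible queued one;
        # otherwise queue it on its own.
        for queued in self.pending:
            if can_merge(queued, req):
                queued.changes.extend(req.changes)
                return queued
        self.pending.append(req)
        return req


queue = BuilderQueue()
queue.add_request(BuildRequest(None, True, ["c1"]))
queue.add_request(BuildRequest(None, True, ["c2"]))         # merges with c1
queue.add_request(BuildRequest("release-1", True, ["c3"]))  # other branch
queue.add_request(BuildRequest(None, False, ["c4"]))        # explicit request
print(len(queue.pending))  # 3
```

In this sketch the explicit (non-mergeable) request stays separate, so it would be built with exactly the sources that were requested, while the two Change-driven HEAD requests share one entry.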