Re: RevML DTD: msg#00000 (version-control.revml)
[CCed to the revml list with permission]

On Thu, Jan 06, 2005 at 02:54:07PM +1100, Peter Miller wrote:
> Hi Guys,
>
> I'm thinking of adding revml support to Aegis, and I have a few
> questions.

Excellent, we'll help you as best we can.

> There is plenty of VCP documentation on the web site
> (http://public.perforce.com/public/revml/index.html), but no RevML
> documentation, and no RevML DTD,

See this link for the latest: http://public.perforce.com/public/revml/revml.dtd

NOTE: CPAN should not have any code left on it aside from the backpan; trying to distribute production code that way turns out to be problematic due to the number of prerequisites and the difficulty many people have in installing all of them. It also broadens and deepens the number and complexity of configurations we need to support.

> Examples of valid RevML files would be nice, too.

I can send you some. VCP generates only valid XML (it checks elements against the DTD).

> How come you introduced <char code="0xNN"> instead of using the existing
> &#xNN; mechanism? Maybe some words of explanation in the DTD would
> help.

It's an XML thing: no matter what XML method you use, you are not allowed to encode any code point below a space (32), with the exception of a few control characters like carriage return and line feed. Even in XML 1.1 you can't encode a NUL (0x00). So we need a non-builtin way to carry the occasional illegal character through XML.

> The rest of my questions revolve around the addition of repository types
> that the DTD has never heard of. With a new VC/SCM project starting
> about every 3 months at the moment, it seems that an extensible DTD,
> with no assumptions about the set of all VC/SCM systems, would be a good
> idea.

Most of RevML has few assumptions other than a series of revisions linked in some way.
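As an illustration of the <char code="0xNN"> point above, here is a minimal Python sketch of how a writer might carry XML-illegal control characters through a document. It is not VCP's code: the function name `revml_escape`, the self-closing `<char .../>` form, and the lowercase two-digit hex formatting are assumptions made for the example.

```python
import re
from xml.sax.saxutils import escape

# C0 control characters that XML 1.0 does not allow in a document,
# i.e. everything below 0x20 except TAB (0x09), LF (0x0A) and CR (0x0D).
_ILLEGAL = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def revml_escape(text: str) -> str:
    """Escape markup characters, then replace each illegal control
    character with a <char code="0xNN"/> element (assumed form)."""
    escaped = escape(text)  # handles &, < and >
    return _ILLEGAL.sub(
        lambda m: '<char code="0x%02x"/>' % ord(m.group()), escaped
    )


if __name__ == "__main__":
    print(revml_escape("a\x00b<c"))
```

The result can be embedded in element content and parsed by an ordinary XML parser; a reader would then have to turn each `<char>` element back into the corresponding character.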
> You have the ability to add <COMMENT>s to a change set, but what if the
> originating system has *numerous* attributes for each change set, and the
> description is only one of them? What if a system supports arbitrary
> user defined change set attributes?

Systems like Subversion and, I presume, Aegis, would need their own element; the DTD above defines <cvs_info>, <p4_info>, etc. in each rev, currently as PCDATA blobs, but we can define structured information into them at some point as well.

> What if a system supports file attributes beyond the ones in the DTD?

We'd open the DTD up to allow them. Let's define them :).

> What if a system supports arbitrary user defined change set attributes?

We'd add a "named attribute" element like what you show. And I agree with your choice of elements and not attributes.

> For example:
> <attribute><name>X-Aegis-cause</name>
> <value>internal_bug</value> </attribute>
> <attribute><name>UUID</name>
> <value>8112422d-bddf-496a-bf3c-23a4ac283fc8</value> </attribute>
> could be used to allow system specific attributes, for systems which
> have yet to be invented, or which the DTD authors will never use,
> without DTD changes.

I want to capture standard stuff in the DTD to prevent accidental or overly creative misuse. By standardizing the commonly available pieces in the DTD, including element ordering where convenient, we narrow the range of variation and limit accidental dependence on unspecified ordering, for instance.

> I don't understand the use intended for the BRANCH_MAP_ID and
> BRANCH_MAP_ID forms.

Those are not present any more; they were intended to declare a set of <branch>es early in the RevML file and then be able to use <branch_map> and related elements to specify how to do the mapping when copying. The <rev>s can then refer to them throughout the file, but it turns out to add no value to the current implementation.
Today, we merely insist that the sender (VCP::Source::* in practice) establish its own unique list of branch IDs and mention them in the <rev> tags; there is no forward declaration of <branch>es today. We can bring them back if ever we have a need, but for now, KISS applies.

> Given the presence of the <REP_TYPE>, why is the rep_type redundantly
> present in the names of all the <*_BRANCH_ID> forms?

You see a false start :).

> Given the presence of the <REP_TYPE>, why is the rep_type redundantly
> present in the names of all the <*_INFO> forms?

It is not now, not sure why it ever was.

> Would it not be
> possible to have a simple <INFO> form, with some kind of extensible
> content?

VCP support for PVCS never got off the ground; that was (and is) provisional engineering. <pvcs_info> should be struck from the DTD until such time as it is needed.

> What if a change set moves a file *and* changes it? (This is common for
> include files and their #ifdef insulation for idempotency.) Shouldn't
> the last line of <!element rev> say (delete | (move,delta?) |
> ((content|(base_name?,base_rev_id,delta)),digest)) instead?

That has not been considered. <delete>, <move> and friends have been replaced with the more generic <action> element, and all tools are on scout's honor to use only certain strings ("edit", "add", "branch", "delete", and a few others to deal with branching/merging semantics to date). I've not implemented any backends that support a discrete "move", but "move" would be my choice there. I should document these strings in the current DTD. Any unrecognized string is treated as an "edit" by VCP::Dest::*.

> What if a system supports file attributes beyond the ones in the DTD?
> (Only comment is provided - is that the change set comment

Yes, although two of the four systems (CVS, VSS) have no changeset concept and so the DTD does not assume changesets.
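The fallback rule described above for <action> strings can be sketched in a few lines of Python. This is only an illustration of the stated behavior, not VCP's implementation: the function name `effective_action` is hypothetical, and the set contains only the four strings named in the message, since the additional branching/merging strings are not listed there.

```python
# Action strings named explicitly in the message. The "few others" used for
# branching/merging are not listed there, so this set is knowingly incomplete.
KNOWN_ACTIONS = {"edit", "add", "branch", "delete"}


def effective_action(action: str) -> str:
    """Return the action a destination would apply: a recognized string is
    kept as-is, anything unrecognized is treated as an "edit"."""
    return action if action in KNOWN_ACTIONS else "edit"


if __name__ == "__main__":
    for raw in ("add", "delete", "move"):
        print(raw, "->", effective_action(raw))
```

With this rule a "move" emitted by a new source would silently degrade to an "edit" on older destinations, which is one reason to document the accepted strings in the DTD.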
I'd like to see some explicit support for declaring changeset-wide information and then referring to it in individual revs, but that would mean a whole lot more logic to handle indirection and save little or no disk space when RevML is compressed.

> You have a comment that <REPOSITORY_*> specific tags as needed, but that
> is not future friendly. By using an <attribute> <name>blah</name>
> <value>blah</value> </attribute> style, all you need is a convention for
> the names, with no change to the DTD.

Agreed, but I want to limit the ad-hoc use of a generic form to truly generic attributes. Common attributes should be embodied in the DTD to encourage standardization: once a common attribute escapes into the wild encapsulated in a generic form, it can never be recaptured in a standard form without having every tool support both forms (ugh).

> The <TYPE> form is too limited.

It is sufficient for the systems we've used RevML with; as other systems are added we will look at other techniques. I like MIME types, but we'd also have to maintain two mappings of MIME types, one "to" and one "from", the simpler types used by more widespread systems like CVS, Perforce, VSS, etc.

> Other well known attributes could include branch-id, Comment, Executable
> (true or false), label, lock, mod-time, rev-id, user-id, UUID, ...etc.
> This has the advantage that if a system chooses to ignore an attribute,
> they don't have to support the grammar for the ones they are ignoring.

Ignoring portions of an XML grammar is easy :). Coping with multiple authors who do not happen to choose the same spelling for a <name> is difficult, I think. But we're flexible; we want to encourage well-known names by ensconcing them as element names, not forbid ad-hoc extensions.

> Plus, they can all have X-system-blah-blah extensions. The ones that
> support arbitrary user defined attributes could have User-blah-blah
> attributes, too.

Nice approach, actually.
I like the idea of a <user_attribute> and <site_attribute> if an SCM makes some true semantic difference between them.

> When the <TYPE> also has the charset given, it becomes possible to
> map the XML encoding into the correct file encoding.

Agreed. I presume the XML would be encoded without a character set using UTF-*, code pages are so 1970s :). But we don't enforce that and haven't had to deal with it yet.

> Note that some systems give each file a unique ID (at least two that I
> know of use the standard GUID/UUID format) which is immutable; they
> model filenames as an editable attribute of a file, thus a file rename
> is a simple change of the filename attribute.

The <rev id="..."> should contain the GUID/UUID while the <name> should be its current public identity.

We do look to extend the RevML DTD as new needs come along; let's discuss how.

Thanks,

Barrie
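For readers following the named-attribute proposal quoted in this message, here is a minimal Python sketch of how a consumer might read such elements. It is an assumption-laden illustration: the `<attributes>` wrapper element and the function name `read_attributes` are hypothetical, and the fragment has not been validated against the RevML DTD, which does not define `<attribute>` at the time of this message.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment built from the example in the quoted proposal; the
# <attributes> wrapper exists only to give the sketch a single root element.
FRAGMENT = """
<attributes>
  <attribute><name>X-Aegis-cause</name><value>internal_bug</value></attribute>
  <attribute><name>UUID</name><value>8112422d-bddf-496a-bf3c-23a4ac283fc8</value></attribute>
</attributes>
"""


def read_attributes(xml_text: str) -> dict:
    """Collect <attribute><name>/<value> pairs into a dict keyed by name."""
    root = ET.fromstring(xml_text)
    attrs = {}
    for node in root.iter("attribute"):
        name = node.findtext("name")
        if name is None:
            continue  # skip malformed entries that lack a <name>
        attrs[name.strip()] = node.findtext("value", default="").strip()
    return attrs


if __name__ == "__main__":
    for key, value in read_attributes(FRAGMENT).items():
        print(f"{key} = {value}")
```

The lookup is by exact name, which shows the spelling concern raised in the reply: two tools that write "UUID" and "uuid" would produce different keys unless a naming convention is agreed on.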