logo       
Google Custom Search
    AddThis Social Bookmark Button
-->

Re: Re: patch domains idea: msg#00196

Subject: Re: Re: patch domains idea
Saturday, January 29, 2005 3:03 AM, Aaron Denney wrote:

> On 2005-01-27, David Roundy <droundy@xxxxxxxxxxxxxxx> wrote:
> > The idea is called "patch domains".  A patch domain is a description of
> > "what a patch acts on".  Like the concept of "patches" and "trees" in
the
> > existing patch theory, a "domain" is an abstract concept, but with
> > reasonably strict laws governing the behavior of any implementation.
>
> Neat, and quite possibly workable.  I'm somewhat dubious about how
> much they'll actually help though -- they'll grow and interact
> through use and have the same high time behaviour.  As I see it,
> it's just a means of "chunking" so that there are fewer things in that
> behaviour.

I had an idea sort of like this a while back, but didn't get any of my magic
inspirations.  I was thinking of something like making a "set of affected
files" for a given patch.  Then with some efficient set operations (like a
HashSet from Java) you could quickly find out if two patches touch any of
the same files.  If not, then they commute automatically.  I was thinking of
it sort of like a collision algorithm.  (and was trying to see if I could
apply any known 2D/3D etc collision algorithms, but didn't come up with any
ideas on this either)

The one catch I keep seeing is that the best efficiency is reached if this
information (like hash tables of affected filenames) were cached somewhere,
so that darcs didn't have to recompute it each time.

The problem with caching it is that the patches change when they
non-trivially commute, meaning that the affected files would need to be
recalculated anyway (like if a filename changes due to commuting with a mv
or rm).  I'm not sure how often patches commute in typical operations, so a
cache might still be a savings.

The other issue is that it would be nice to have arbitrarily-fine-grained
collision checking, so that eventually patches at the beginning of a file
wouldn't collide with patches at the end. (but actually, their line numbers
would still need to change, so they do sort of collide...  maybe "maps" are
the only kind of patch that can dodge a collision, and we don't have maps
within files yet, so maybe we should just focus outside the file?)

Another somewhat related idea is to give the patch-files an order and then
tree them.  Then, calculate a collision set for each parent node in the tree
by unioning the collision sets of its children.  That would let you commute
a patch from one end to the other in as little as O(log n) time, if it
didn't collide with anything, and if you had done all the hashing ahead of
time.  But, every time you re-order any patch you would have O(log n) hash
tables to modify.

And one last idea I'll toss in- you could try to get darcs to analyze
natural partitions between files, such that the fewest number of patches lie
accross the partition.  For example, I usually put my changes to ./doc and
./src in different patch files.  If darcs noticed this, it could compare all
patches first on whether they hit one "side" of ./src and ./doc, and then
(assuming it doesn't hit both sides) reduce all further comparisons to
sub-partitions of ./src, or of ./doc.

-Mike


<Prev in Thread] Current Thread [Next in Thread>