At 02:54 PM 4/14/2007, Paul Smith wrote:
> Hi all; sorry this is a bit slow. Soccer season is starting and I'm
> very busy! This email will be a bit of a brain dump so please bear with
> me.
Likewise. Busy, distracted, skimming this thread where I should be
studying it, etc....
> Before anything else, it's important to realize that there are two
> distinct, yet interdependent capabilities we are discussing here: the
> first is the ability to use a separate algorithm for OOD determination,
> and that's the one we've been talking about so far. But the second one
> is at least as important and, in my opinion, the more challenging
> design-wise: that is stateful make; the ability to keep some state
> information across invocations of make. I don't think there are too
> many OOD algorithms that you can choose that wouldn't require persistent
> state. Deciding how to store that state, especially when you don't
> really know what format it will be in (obviously the state to be stored
> will vary with the OOD algorithm chosen), provide it to the OOD
> algorithm, etc. is a design challenge.
I should preface my remarks by acknowledging that many of you,
certainly Paul, have been thinking about make much longer and harder
than I have so I may well be misunderstanding or oversimplifying some
issues. But I hope you'll at least consider my argument that it
doesn't need to be as complicated as this.
The way I've always imagined this is that make would deliberately
*not* address the problem of persistence. Instead it would defer that
to the particular OOD override, which at least has the virtue of
laziness (pace Larry Wall). Let's start by considering the
"competition"; ClearCase is certainly the best-known, most widely
deployed tool which currently offers advanced OOD detection and it
keeps its persistent data in a network database. So let's say I or
someone else wants to implement full ClearCase-like functionality
using GNU make. A database may well be preferred to sentinel files in
such an environment. In fact if we want to be able to share our
stateful knowledge with someone not operating in the exact same file
tree, that may be necessary.
In fact let's take this to its logical conclusion: what if one could
reverse-engineer the CC network protocol and wanted to tap into its
database for OOD decisions? I have no plan (or hope) of doing this
but it's not inconceivable that the vendor might contribute an
implementation, and in any case it serves to illustrate the point
that make need not handle its own persistence.
Let's consider a possible design off the top of my head. Say we
define a struct containing all potentially pertinent data for OOD
decisions (using short names for now because I'm a bad typist):
typedef struct ood {
    int    ood_version;
    char **ood_targets;  // vector of paths to targets
    char **ood_prereqs;  // vector of paths to prereqs
    char **ood_envp;     // traditional environ vector
    char  *ood_cwd;      // current working directory
    char  *ood_script;   // the build script
} ood_t;
That's all I can think of which would affect a go/no-go decision but
by using a struct and storing ood_version we give it a faint OO gloss
which would allow for extensibility in case something else turns up.
Make always calls the ood() function and passes it the above struct
whenever such a determination needs to be made. The default algorithm
would be to simply compare the dates of the targets to those of the
prereqs, in which case state is handled for us as it always has been.
If an override algorithm is detected, the override function not only
has all the information it needs for the OOD decision, it has enough
data to store that state too[*], which it could store in a file or by
writing to a socket or whatever.
[*] I see the first flaw already, which is that as Paul said the
state must be stored *after* the build script is run while the OOD
decision is made before. So this basically means you'd need both an
ood_pre() and an ood_post() function.
To my mind dependence on sentinel/stamp files is at least arguably a
hack and I'd prefer that the design didn't require them. I also think
anything which keeps the core of GNU make simpler is a good thing. Of
course pushing persistence off to the user might make the
implementation of these extensions a little more complex but you
could deal with that by taking the same code you'd use for storing
file-based state and sticking it into a documented library instead of
linking it into the make program.
Expanding on a previous point: it seems impossible to encode details
such as "connect to port 9382 on machine foobar and send the names
and MD5 states of the prereqs down the wire, then let me know what
answer comes back" in a make variable like .OUT_OF_DATE. You'd
basically be forced to write a little client program to do so and run
it with $(shell) which would lead to performance issues, especially
on Windows which is not optimized for quick cheap process creation.
OTOH I do see that the ability to use target-specific settings would
be quite elegant.
So to sum up: my argument is that OOD is conceptually pretty simple:
(1) find all the places where datestamp comparison is done now and
bring them all through one API, and (2) come up with a way for that
API to be interposed. Am I missing something important?
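A rough sketch of those two steps in C, under stated assumptions: every name here (struct ood_req, ood_check(), ood_set_override(), ood_always()) is hypothetical, the request record is trimmed to three fields, and the hook is a simple function pointer rather than whatever loading mechanism make would really use.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical, trimmed-down request record (NULL-terminated vectors). */
struct ood_req {
    int    ood_version;
    char **ood_targets;
    char **ood_prereqs;
};

typedef bool (*ood_fn)(const struct ood_req *);

/* Step 1: the one place where datestamp comparison lives. */
static bool ood_by_timestamp(const struct ood_req *req)
{
    for (size_t t = 0; req->ood_targets[t] != NULL; t++) {
        struct stat tst;
        if (stat(req->ood_targets[t], &tst) != 0)
            return true;                        /* missing target */
        for (size_t p = 0; req->ood_prereqs[p] != NULL; p++) {
            struct stat pst;
            if (stat(req->ood_prereqs[p], &pst) != 0 ||
                pst.st_mtime > tst.st_mtime)
                return true;                    /* missing or newer prereq */
        }
    }
    return false;
}

/* Step 2: the interposition point. NULL means "use the default". */
static ood_fn ood_hook = NULL;

void ood_set_override(ood_fn fn) { ood_hook = fn; }

/* Every OOD question in the program would go through this call. */
bool ood_check(const struct ood_req *req)
{
    return ood_hook != NULL ? ood_hook(req) : ood_by_timestamp(req);
}

/* Example override: always rebuild, ignoring timestamps entirely. */
bool ood_always(const struct ood_req *req)
{
    (void)req;
    return true;
}
```

An override that talks to a database or a socket would plug in the same way as ood_always() does, without make knowing where the answer came from.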
David B
PS Both my model and yours would appear to suffer from an obvious
race condition; what if something happens to change one or more
prereqs between the "pre" moment (when OOD determination is made) and
the "post" moment (when state is stored), either as the result of a
badly designed build script or what ClearCase calls "interference
from another process"? It seems some transitional state must be
stored within the make process. Maybe building on your idea, the
ood_pre() function could return a char pointer which would be null if
the target is up to date and otherwise a valid string. This string is
then passed into the ood_post() call for it to use as desired. The
typical use would be to remember size/date/MD5 of the prereqs from
the pre call and check that they're unchanged in the post.