Hi Ed,
>I think that in practice only grabbers which can get a correct, unique
>programme identifier will bother to put one in the file. Assuming
>that we take the strong meaning of same-id where only the *same
>programme* (and not just two programmes with the same XML) are allowed
>to share an id.
>
>It would be possible for grabbers to make up their own ids, and add a
>unique token to them to make them unique across files, but I'm not
>sure there would be much point.
I don't think this is strong enough. The DTD should make it clear that the
ID must *only* be populated when either:
1. The underlying listing source provides a programme identifier that can
be used to detect duplicates across days and across files
2. By some (very magic) means the grabber itself is able to produce a
programme identifier that can be used to detect duplicates across days and
across files.
If we allow grabbers to invent their own identifiers that *can't* detect
duplicates across days and across files, and we have no way to distinguish
"those that can" from "those that can't", then a single grabber that can't
ruins things for those that can.
Alternatively, we can allow any grabber that wants to use ID/IDREF to do
so, and those ids will only be expected to detect duplicates within a
single file. Grabbers can take the identifier from wherever they like or
invent it. Then, in a separate element, we can expose the 'true' programme
identifier from the underlying listing source for those grabbers that have
it.
This latter approach is my preference because it separates the decision of
whether to use ID/IDREF (and thus shrink file size/identify duplicates
within a file) from the ability to expose the cross-file programme id.
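To make the split concrete, here is a rough Python sketch of how a consumer
might treat the two kinds of identifier. The markup is purely illustrative:
the `id`/`idref` attributes on `programme` and the `source-id` element are
names I have made up for this example, not anything in the current DTD, and
the helper functions are hypothetical too.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: "id"/"idref" are only meaningful inside one file,
# while <source-id> carries the listing source's own identifier.
FILE_A = """<tv>
  <programme id="p1" start="20030101200000">
    <title>News</title>
    <source-id>EP0001</source-id>
  </programme>
  <programme idref="p1" start="20030102020000"/>
  <programme id="p2" start="20030101210000">
    <title>Film</title>
  </programme>
</tv>"""

# A second file is free to reuse the local id "p1".
FILE_B = """<tv>
  <programme id="p1" start="20030103200000">
    <title>News</title>
    <source-id>EP0001</source-id>
  </programme>
</tv>"""


def resolve_local(xml_text):
    """Expand idref showings within one file into (start, details) pairs."""
    root = ET.fromstring(xml_text)
    by_id = {p.get("id"): p for p in root.iter("programme") if p.get("id")}
    showings = []
    for p in root.iter("programme"):
        details = by_id[p.get("idref")] if p.get("idref") else p
        showings.append((p.get("start"), details))
    return showings


def group_across_files(*files):
    """Group showings by <source-id>; programmes without one are skipped."""
    groups = {}
    for text in files:
        for start, details in resolve_local(text):
            key = details.findtext("source-id")  # None if the grabber had none
            if key is not None:
                groups.setdefault(key, []).append(start)
    return groups
```

With these two files, `group_across_files(FILE_A, FILE_B)` collects all three
showings of the news programme under `EP0001`, even though both files use the
local id `p1`, and the film (which has no source identifier) is simply left
out of cross-file matching.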
Lloyd