This seems a good thing to do! Some feedback and questions...
- Please don't do anything that will interfere with the operation of
entity resolvers. If an entity resolver returns anything other than null,
don't "re-resolve" the result with a catalog. IOW, a catalog should be
consulted only if there is no entity resolver or the resolver can't
resolve it.
- Likewise, _after_ a catalog has returned a resolved system identifier
(one that is different from the original), the entity resolver should be
given a chance to resolve it. If you don't do this, catalogs won't be
able to use application-defined schemes supported only by the
application's entity resolver.
- It should go without saying that the catalog should apply to schemas
and DTDs as well as external entities. The catalog should respect the
caching grammar pool, by supplying it with the original entity
identifiers and, if that fails to obtain a result, by calling it again
with the catalog-resolved identifier(s).
- Features that require the parser to read an XML file in order to read
an XML file are detrimental to performance. The design should follow the
model of the caching grammar pool, to allow applications to re-use the
same "compiled" catalog for multiple instance/schema parses.
- I assume when you say "preference for system or public matches" you
are talking about "prefer system identifier" mode and "prefer public
identifier" mode, as in the catalog spec? I guess this would set the
default and the catalog itself could still override it with prefer
attributes?
- Will you support jar: scheme URIs?
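
To make the lookup order in the first two points concrete, here is a minimal sketch using only the SAX interfaces in the JDK. It is not the Xerces or XML Commons API: the class name CatalogChainResolver is made up, and the Map is a hypothetical stand-in for an already-parsed ("compiled") catalog that could be shared across parses.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/**
 * Hypothetical sketch: application resolver first, then the catalog, then the
 * application resolver again with the catalog-resolved system identifier.
 */
class CatalogChainResolver implements EntityResolver {
    private final EntityResolver appResolver;         // may be null
    private final Map<String, String> systemCatalog;  // stand-in for a compiled catalog

    CatalogChainResolver(EntityResolver appResolver, Map<String, String> systemCatalog) {
        this.appResolver = appResolver;
        this.systemCatalog = systemCatalog;
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId)
            throws SAXException, IOException {
        // 1. The application's resolver goes first; a non-null answer is final
        //    and is never re-resolved through the catalog.
        if (appResolver != null) {
            InputSource direct = appResolver.resolveEntity(publicId, systemId);
            if (direct != null) {
                return direct;
            }
        }
        // 2. Only then consult the catalog.
        String mapped = (systemId == null) ? null : systemCatalog.get(systemId);
        if (mapped == null || mapped.equals(systemId)) {
            return null; // no catalog match: let the parser use its default behaviour
        }
        // 3. Offer the catalog-resolved identifier to the application's resolver,
        //    so the catalog can point at application-defined schemes.
        if (appResolver != null) {
            InputSource viaApp = appResolver.resolveEntity(publicId, mapped);
            if (viaApp != null) {
                return viaApp;
            }
        }
        // 4. Otherwise hand the catalog-resolved identifier back to the parser.
        return new InputSource(mapped);
    }
}

class CatalogChainDemo {
    public static void main(String[] args) throws Exception {
        // Hypothetical application resolver that only understands "app:" URIs.
        EntityResolver app = (pub, sys) ->
                (sys != null && sys.startsWith("app:"))
                        ? new InputSource(new StringReader("<!-- served by the application -->"))
                        : null;

        Map<String, String> catalog = new HashMap<>();
        catalog.put("http://example.org/doc.dtd", "app:dtds/doc.dtd");
        catalog.put("http://example.org/other.dtd", "file:///tmp/other.dtd");

        EntityResolver chain = new CatalogChainResolver(app, catalog);
        // Catalog maps to an "app:" URI, which the application resolver then serves.
        System.out.println(chain.resolveEntity(null, "http://example.org/doc.dtd")
                .getCharacterStream() != null);
        // Catalog maps to a URI the application resolver does not handle.
        System.out.println(chain.resolveEntity(null, "http://example.org/other.dtd")
                .getSystemId());
    }
}
```

Because the catalog is passed in already parsed, the same map (or whatever the real compiled form turns out to be) can back any number of resolver instances, which is the re-use I am asking for in the performance point.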
Bob Foster
Michael Glavassevich wrote:
In the upcoming release of Xerces-J we are going to be including OASIS
XML Catalog [1] support which will allow users to resolve locations for
their schemas (using their target namespaces), DTDs and other external
entities using a catalog. The XML Commons resolver [2] which implements
XML Catalogs will be included with the distribution and used by the
parser to resolve URIs.
We plan on exposing this feature through a new parser property, probably
something like: http://apache.org/xml/properties/xml-catalog-manager.
The class which corresponds to this property will encapsulate the XML
Commons catalogs and have methods which resolve external identifiers and
other URIs using a list of catalogs the user sets on this object. It
will also allow the user to specify their preference for system or
public matches in the catalog and their preference for whether they wish
Xerces to resolve the literal system identifier (this is the system id
as it appeared in the document) or the expanded system identifier (this
is the system id after it has been absolutized against some base URI).
Based on Norman Walsh's comments [3], the literal system identifier
should be used, but since the spec doesn't say anything about it, it's
better to give the user the option. The default would be to use the
literal system id. It is intended that the user will be able to share
the Xerces xml-catalog-manager with several parsers as well as using it
stand-alone to resolve other URIs. The catalog spec defines a
processing instruction that may appear in instance documents to
specify additional catalogs for the parser to use. This won't be
supported in this release but may be in a future one if there is enough
interest from users.
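
The difference between the literal and the expanded system identifier can be illustrated with plain java.net.URI. This is only a sketch: the class SystemIdForms, its method names, and the useLiteral flag are hypothetical and merely model the preference described above; they are not part of Xerces.

```java
import java.net.URI;

class SystemIdForms {
    /** Absolutizes a literal system identifier against the base URI of the referring entity. */
    static String expand(String literalSystemId, String baseUri) {
        return URI.create(baseUri).resolve(literalSystemId).toString();
    }

    /** Picks the form handed to the catalog; useLiteral models the proposed user preference. */
    static String forCatalogLookup(String literalSystemId, String baseUri, boolean useLiteral) {
        return useLiteral ? literalSystemId : expand(literalSystemId, baseUri);
    }

    public static void main(String[] args) {
        String literal = "dtds/doc.dtd";                 // as written in the document
        String base = "http://example.org/docs/a.xml";   // base URI of the document
        System.out.println(forCatalogLookup(literal, base, true));   // dtds/doc.dtd
        System.out.println(forCatalogLookup(literal, base, false));  // http://example.org/docs/dtds/doc.dtd
    }
}
```

A catalog entry written against one form will not match the other, which is why the choice matters for catalog authors.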
[1] http://www.oasis-open.org/committees/entity/spec.html
[2] http://xml.apache.org/commons/components/resolver/index.html
[3] http://lists.oasis-open.org/archives/entity-resolution-comment/200304/msg00008.html
Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas@xxxxxxxxxx