[ http://issues.apache.org/jira/browse/XERCESJ-1104?page=comments#action_12331504 ]
William Eliot Kimber commented on XERCESJ-1104:
-----------------------------------------------
I may be misunderstanding how the code works, but while I understand that the
behavior of the resolveEntity() method is as you describe, it is the schema
loader, which knows that it is loading a schema, that is calling this method.
Therefore, it could just as easily call a different method.
That is, the SAX parser can't be reporting anything (other than a CDATA
attribute value) because SAX parsers are not, as far as I know, schema aware.
But maybe I'm misunderstanding how the parser works? I'm basing this
assumption on the fact that since a schema reference is not an entity reference
but, at the XML level, just an attribute, there would be no way per the current
SAX API for a SAX parser to handle this as a SAX event (i.e.,
startSchemaReference or something) as opposed to handling it as part of the SAX
event handler's business logic.
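As a small illustration of the point that a schema location hint reaches a SAX handler as nothing more than an attribute, here is a minimal sketch. It assumes the JDK's built-in JAXP parser rather than the org.apache.xerces classes, so that it runs without Xerces on the classpath; the class name, the urn:example:doc namespace and the example.org URL are made up for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of Xerces.
class SchemaHintAsAttribute {
    static final String XSI = "http://www.w3.org/2001/XMLSchema-instance";

    /**
     * Parses the document without validation and returns "type=value" for
     * every xsi:schemaLocation attribute the handler sees.
     */
    static List<String> schemaHints(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // needed to look attributes up by namespace
        final List<String> hints = new ArrayList<>();
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)),
            new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    // The hint is just one more attribute on the element.
                    int i = atts.getIndex(XSI, "schemaLocation");
                    if (i >= 0) {
                        hints.add(atts.getType(i) + "=" + atts.getValue(i));
                    }
                }
            });
        return hints;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<doc xmlns='urn:example:doc'"
            + " xmlns:xsi='" + XSI + "'"
            + " xsi:schemaLocation='urn:example:doc http://example.org/schemas/doc.xsd'/>";
        System.out.println(schemaHints(xml));
    }
}
```

Nothing in this parse fetches the schema; whatever is done with the attribute value is up to the handler or to a schema-aware layer sitting on top of the parser.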
I've started experimenting with a fix, which is as follows:
1. Extend the XMLEntityResolver to add a new method, resolveResourceByUri()
(named so it doesn't conflict with the resolveURI() method on a related
interface).
2. In the XMLSchemaLoader's resolveDocument() method, replace the current line:
    return entityResolver.resolveEntity(desc);
with:
    return entityResolver.resolveResourceByUri(desc);
where the systemId value of the XML resource description is the target URI
from the schema location hint.
This is somewhat abusing the notion of "entity resolver" as schemas are not
entities in the XML sense, but it's the least disruptive change I could think
of.
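The distinction the change is meant to honor, system entries versus uri entries in a catalog, can be seen in isolation with the following sketch. It assumes the JDK's javax.xml.catalog API (Java 9 or later), which is not what Xerces' XMLCatalogResolver uses; the class name, the example.org URL and the local/doc.xsd target are placeholders.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.catalog.Catalog;
import javax.xml.catalog.CatalogFeatures;
import javax.xml.catalog.CatalogManager;

// Hypothetical demo class; uses the JDK catalog API, not Xerces.
class CatalogEntryKinds {
    /**
     * Writes a throwaway catalog holding a single <uri> entry for schemaUri
     * and returns { system lookup result, uri lookup result }.
     * Assumes schemaUri contains no characters that need XML escaping.
     */
    static String[] lookup(String schemaUri) throws IOException {
        Path dir = Files.createTempDirectory("catalog-demo");
        Path catalogFile = dir.resolve("catalog.xml");
        String xml =
            "<catalog xmlns=\"urn:oasis:names:tc:entity:xmlns:xml:catalog\">\n"
            + "  <uri name=\"" + schemaUri + "\" uri=\"local/doc.xsd\"/>\n"
            + "</catalog>\n";
        Files.write(catalogFile, xml.getBytes(StandardCharsets.UTF_8));

        Catalog catalog =
            CatalogManager.catalog(CatalogFeatures.defaults(), catalogFile.toUri());
        return new String[] {
            catalog.matchSystem(schemaUri), // consults system-type entries only
            catalog.matchURI(schemaUri)     // consults uri-type entries
        };
    }

    public static void main(String[] args) throws IOException {
        String[] result = lookup("http://example.org/schemas/doc.xsd");
        System.out.println("matchSystem: " + result[0]);
        System.out.println("matchURI:    " + result[1]);
    }
}
```

With only a uri entry in the catalog, the system lookup finds nothing while the uri lookup returns the remapped location. A loader that only ever asks the entity-style question would therefore never see the uri entry.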
What I haven't been able to do yet (ran out of time yesterday) is build a test
case that demonstrates that this works when processing an XML document with
validation turned on--I know how to configure a validating Xerces SAX parser (I
use one with Saxon) but I haven't figured out what I need to put around that to
actually trigger the schema processing. Here's my test case:
    public void testSchemaLoaderViaCatalog() {
        XMLReader reader = new SAXParser();
        URL catalogUrl = this.getClass().getResource("catalog-01.xml");
        String[] catalogs = {catalogUrl.toExternalForm(),};
        // Create catalog resolver and set a catalog list.
        XMLCatalogResolver resolver = new XMLCatalogResolver();
        resolver.setPreferPublic(true);
        resolver.setCatalogList(catalogs);
        // Set the resolver on the parser.
        try {
            reader.setProperty(
                "http://apache.org/xml/properties/internal/entity-resolver",
                resolver);
        } catch (Throwable e) {
            e.printStackTrace();
            fail(e.getMessage());
        }
        DOMParser dp = new DOMParser();
        dp.setEntityResolver(reader.getEntityResolver());
        URL sourceDoc = this.getClass().getResource("doc-1.xml");
        Document doc = null;
        try {
            dp.setFeature("http://xml.org/sax/features/validation", true);
            dp.setFeature("http://apache.org/xml/features/validation/dynamic",
                true);
            dp.setFeature("http://apache.org/xml/features/validation/schema",
                true);
            dp.setFeature("http://apache.org/xml/features/validation/schema/normalized-value",
                true);
            dp.parse(sourceDoc.toExternalForm());
            doc = dp.getDocument();
        } catch (Throwable e) {
            e.printStackTrace();
            fail(e.getMessage());
        }
        Element root = doc.getDocumentElement();
        assertEquals("oneElementDocument", root.getNodeName());
    }
This runs but it should in fact fail at the moment because the input document
isn't valid against the schema (and the test schema may or may not be
valid--I'm still figuring that one out--don't ask).
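One possible reason a test like this passes on an invalid document: schema validity errors are reported to the parser's ErrorHandler as recoverable errors, and parse() itself only throws on fatal ones, so a test with no ErrorHandler registered may never notice them. The sketch below shows that pattern under some assumptions: it uses the JDK's JAXP DocumentBuilder rather than the Xerces DOMParser, and the tiny schema, the class name and the temp file are made up for the example.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical demo class; JAXP stands in for the Xerces-native parser here.
class CollectingValidation {
    // Made-up schema: the root element must contain an integer.
    static final String SCHEMA =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='oneElementDocument' type='xs:int'/>"
        + "</xs:schema>";

    /** Validates a one-element document and returns the reported errors. */
    static List<String> validate(String content) throws Exception {
        Path xsd = Files.createTempFile("doc", ".xsd");
        Files.write(xsd, SCHEMA.getBytes(StandardCharsets.UTF_8));
        String xml = "<oneElementDocument"
            + " xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'"
            + " xsi:noNamespaceSchemaLocation='" + xsd.toUri() + "'>"
            + content + "</oneElementDocument>";

        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setValidating(true);
        // JAXP property that switches validation from DTD to XML Schema.
        factory.setAttribute(
            "http://java.sun.com/xml/jaxp/properties/schemaLanguage",
            XMLConstants.W3C_XML_SCHEMA_NS_URI);

        final List<String> errors = new ArrayList<>();
        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { /* ignored here */ }
            public void error(SAXParseException e) { errors.add(e.getMessage()); }
            public void fatalError(SAXParseException e) throws SAXParseException {
                throw e;
            }
        });
        builder.parse(new InputSource(new StringReader(xml)));
        return errors;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("valid:   " + validate("42"));
        System.out.println("invalid: " + validate("not a number"));
    }
}
```

In a JUnit test the same idea would be to register such a handler on the parser and assert that the collected list is empty (or not) after parsing.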
> Resolution of schemaLocation URIs should be via URI resolution, not entity
> resolution
> -------------------------------------------------------------------------------------
>
> Key: XERCESJ-1104
> URL: http://issues.apache.org/jira/browse/XERCESJ-1104
> Project: Xerces2-J
> Type: Bug
> Components: XNI
> Versions: 2.6.0, 2.7.1
> Environment: All
> Reporter: William Eliot Kimber
>
> The resolveSchema() method of XSDHandler uses the entityResolver to resolve
> URIs in schema location hints. This has the effect that, when using a catalog
> resolver, SYSTEM entries are used to resolve the schema URIs.
> However, schemas are not entities, and therefore their URI references should
> not be resolved via an entity resolver but via a URI resolver. They should
> therefore be resolved via URI catalog entries, not SYSTEM entries.
> That is, by the OASIS Entity Resolution spec one would expect to declare URI
> entries to remap schema location URIs but this does not work.
> I'm happy to develop a fix but it may take me a while to figure out exactly
> how to go about it.
> Because this behavior has been around for a while (since at least version
> 2.6) and is documented in at least one tutorial I found, it will probably be
> necessary to control the use of an entityResolver or URI resolver through a
> system property.