I've written a small validating parser using the Xerces SAX parser.
However, I have found a performance difference between two similar implementations.
The code for both is shown below:
import org.apache.xerces.parsers.SAXParser;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

public static void validate(String instance)
{
    try
    {
        XMLReader parser = null;
        SAXParser parser2 = new SAXParser();
        try {
            parser = XMLReaderFactory.createXMLReader(
                "org.apache.xerces.parsers.SAXParser"
            );
        } catch (SAXException se) {
            // Fall back to the default XMLReader if the Xerces one cannot be created.
            parser = XMLReaderFactory.createXMLReader();
            System.out.println("oops");
        }
        // Validate the document and report validity errors.
        parser.setFeature("http://xml.org/sax/features/validation", true);
        parser2.setFeature("http://xml.org/sax/features/validation", true);
        // Turn on XML Schema validation by inserting the XML Schema
        // validator in the pipeline.
        parser.setFeature(
            "http://apache.org/xml/features/validation/schema", true
        );
        parser2.setFeature(
            "http://apache.org/xml/features/validation/schema", true
        );
        // Set the external schema location.
        parser.setProperty(
            "http://apache.org/xml/properties/schema/external-noNamespaceSchemaLocation",
            "file:///myschema.xsd"
        );
        parser2.setProperty(
            "http://apache.org/xml/properties/schema/external-noNamespaceSchemaLocation",
            "file:///myschema.xsd"
        );
        // ErrorHandler here is a custom class (not shown) that provides errorSeen().
        ErrorHandler errors = new ErrorHandler();
        parser.setErrorHandler(errors);
        parser2.setErrorHandler(errors);
        // Parse the XML document with each parser and print the wall-clock times.
        System.out.println("XMLReader:");
        System.out.println("Start: " + System.currentTimeMillis());
        parser.parse(instance);
        System.out.println("End: " + System.currentTimeMillis());
        System.out.println("-------");
        System.out.println("SAXParser:");
        System.out.println("Start: " + System.currentTimeMillis());
        parser2.parse(instance);
        System.out.println("End: " + System.currentTimeMillis());
        if (!errors.errorSeen())
        {
            System.out.println("Successfully validated " + instance);
        }
    }
    catch (Exception e)
    {
        System.out.print("Problem parsing the file:");
        System.out.println(e.getMessage());
        e.printStackTrace();
    }
}
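The ErrorHandler class instantiated in the code is not included in this post. As a rough illustration only, a handler of that kind might look like the sketch below. It assumes the class implements org.xml.sax.ErrorHandler and simply remembers whether anything was reported. The name RecordingErrorHandler and everything except the errorSeen() method are made up for this example.

```java
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

// Hypothetical handler: records whether any error was reported during a parse.
public class RecordingErrorHandler implements ErrorHandler {

    private boolean errorSeen = false;

    @Override
    public void warning(SAXParseException e) {
        // Warnings are printed but do not count as errors.
        System.out.println("Warning: " + describe(e));
    }

    @Override
    public void error(SAXParseException e) {
        // Recoverable errors, which include schema validity errors.
        errorSeen = true;
        System.out.println("Error: " + describe(e));
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXParseException {
        // Fatal errors are recorded and rethrown to stop the parse.
        errorSeen = true;
        System.out.println("Fatal: " + describe(e));
        throw e;
    }

    // True once error() or fatalError() has been called at least once.
    public boolean errorSeen() {
        return errorSeen;
    }

    private static String describe(SAXParseException e) {
        return "line " + e.getLineNumber() + ": " + e.getMessage();
    }
}
```

Sharing one such instance between both parsers, as the code above does, means errorSeen() reports an error from either parse.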
The output from the four calls to System.currentTimeMillis() is shown below
(the exact values vary from run to run, of course):
XMLReader:
Start: 1084970176879
End: 1084970181622
-------
SAXParser:
Start: 1084970181625
End: 1084970182964
This tells me that the SAXParser takes 1339 milliseconds to complete while the
XMLReader takes 4743 milliseconds to complete!
What causes this difference? I expected both implementations to be
approximately as fast.
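One way to check whether a gap like this depends on which parse runs first would be to time each reader over several rounds instead of once. The sketch below is only an illustration under assumptions: it uses the JDK's built-in JAXP parser and a tiny in-memory document rather than the Xerces classes and schema from the code above, and the names TimingSketch, SAMPLE_XML, newReader and timeParse are invented for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class TimingSketch {

    // Hypothetical stand-in for the real instance document.
    public static final String SAMPLE_XML =
        "<root><item>1</item><item>2</item></root>";

    // Creates a fresh reader from the JDK's JAXP factory
    // (an assumption; the original code uses Xerces classes directly).
    public static XMLReader newReader() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        reader.setContentHandler(new DefaultHandler());
        return reader;
    }

    // Parses the document once and returns the elapsed time in milliseconds.
    public static long timeParse(XMLReader reader, String xml) throws Exception {
        long start = System.nanoTime();
        reader.parse(new InputSource(new StringReader(xml)));
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws Exception {
        XMLReader first = newReader();
        XMLReader second = newReader();
        // Time both readers in every round so neither is measured only once.
        for (int round = 1; round <= 5; round++) {
            long firstMs = timeParse(first, SAMPLE_XML);
            long secondMs = timeParse(second, SAMPLE_XML);
            System.out.println("round " + round
                + ": first=" + firstMs + " ms, second=" + secondMs + " ms");
        }
    }
}
```

If the numbers even out after the first round, the gap is tied to one-time start-up work rather than to the parser class itself. If it persists, the two configurations really do behave differently.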
I hope someone can clear this up for me.
Thanks in advance.
Greetings,
Pieter van der Spek
---- West Consulting B.V. - www.west.nl
---- Tu Delft / Computer Science - www.tudelft.nl