
Re: Moving XMLC to Xerces 2 won't help...: msg#00057

java.enhydra.xmlc

Subject: Re: Moving XMLC to Xerces 2 won't help...


Hi David,

Thank you very much for the more in-depth explanation.  It really cleared most things up for me.  Sounds like existing XMLC apps would have no compatibility problems with Parser and DomImpl wrapped into XMLC's namespace and it also solves the compatibility issues with existing environments.  If we can do this in the way you explain, it sounds like an excellent solution!

I have one other question, though.

Will it matter that in certain cases, a document will have a different implementation than that of XMLC's DOMImpl classes?  For instance, let's say Xerces2 is in CATALINA_HOME/common/endorsed and XMLC is in WEB-INF/lib.  If I bootstrap the DOM by doing...

DocumentBuilderFactory dbfactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dbuilder = dbfactory.newDocumentBuilder();
DOMImplementation dImpl = dbuilder.getDOMImplementation();
Document doc = dImpl.createDocument(null, "MyRootElement", null);

Or, alternatively, by using DOM3

DOMImplementation domImpl =
    DOMImplementationRegistry.getDOMImplementation("XML 1.0");
Document doc = domImpl.createDocument(null, "MyRootElement", null);


Won't there be a clash of implementations?  Won't the DOM implementation in this case be Xerces2?  Does it matter?  I guess I'm probably sounding a bit naive here.  I'm just not sure how all these interactions affect each other.
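
One way to see which implementation each bootstrap path actually hands back is to print the concrete classes behind the returned objects. The following is only a minimal sketch, not anything taken from XMLC: the class name DomBootstrapDemo and its helper methods are made up for illustration, and the registry path assumes the DOMImplementationRegistry.newInstance() form from the final DOM Level 3 Java binding rather than the static call used in the snippet above.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.Document;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;

// Hypothetical demo class; none of these names come from XMLC.
class DomBootstrapDemo {

    /** Bootstraps through JAXP, as in the first snippet. */
    static DOMImplementation viaJaxp() {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .getDOMImplementation();
        } catch (Exception e) {
            throw new IllegalStateException("JAXP bootstrap failed", e);
        }
    }

    /** Bootstraps through the DOM Level 3 registry; returns null if nothing matches. */
    static DOMImplementation viaRegistry() {
        try {
            return DOMImplementationRegistry.newInstance()
                    .getDOMImplementation("XML 1.0");
        } catch (Exception e) {
            throw new IllegalStateException("registry bootstrap failed", e);
        }
    }

    /** Reports the concrete classes of the implementation and of a document it creates. */
    static String describe(DOMImplementation impl) {
        Document doc = impl.createDocument(null, "MyRootElement", null);
        return impl.getClass().getName() + " -> " + doc.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("JAXP:     " + describe(viaJaxp()));
        DOMImplementation fromRegistry = viaRegistry();
        System.out.println("Registry: "
                + (fromRegistry == null ? "no match" : describe(fromRegistry)));
    }
}
```

Running this inside the webapp in question (rather than from a standalone command line) is what would show whether the endorsed Xerces2 or some other implementation wins in that environment.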


BTW, I was totally mistaken in thinking that Xerces2 doesn't support DOM3.  I was thinking that the org.w3c.dom.ls (Load and Save) package (which I've used already) was DOM2.  It is, in fact, DOM3.  The one thing I knew wasn't implemented yet was DOMImplementationRegistry.  That is the key to everything since it allows you to bootstrap the DOM without requiring JAXP or any implementation specific references and since Xerces2 didn't have that, I assumed that it didn't support DOM3.  How wrong I was.

So, Xerces2 *does* support DOM3, just not all of it yet.  See the Javadocs here:
http://xml.apache.org/xerces2-j/javadocs/api/index.html

Look specifically at the org.w3c.dom.ls package.

See the specs for Java Language Binding here:
http://www.w3.org/TR/DOM-Level-3-Core/java-binding.html
and
http://www.w3.org/TR/DOM-Level-3-LS/java-binding.html

Also, here is the latest on status of the DOM specs at the w3c:
http://www.w3.org/DOM/Activity.html

This, along with the org.xml.sax.ext package that Mark mentioned might be extremely useful.

Do we lose this functionality if we wrap the existing Xerces1 implementation up in XMLC?  We do, at least in the XMLC internals.  If XMLC application developers started using DOM3 and various other newer standard interfaces which have an implementation which is not the xerces version internal to XMLC and manipulate documents obtained by XMLC, do we run into any collisions?  Doesn't the call to hasFeature() get pretty messed up since the Document, if created by Xerces2, does have the feature, but then XMLC can't deal with that feature because it is older, and vice-versa, make things extremely confusing?  How is this all handled in a clean way....or is this not an issue, somehow?
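
For what it's worth, hasFeature() is answered by whichever implementation created the document, so code that handles documents from mixed sources could ask each document's own implementation instead of assuming one global answer. This is a rough sketch under that assumption; FeatureProbeDemo and its methods are hypothetical names, and the feature strings are just the standard DOM ones ("Core", "LS").

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class; not part of XMLC or Xerces.
class FeatureProbeDemo {

    /** Creates an empty document from whatever JAXP implementation is visible. */
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
        } catch (Exception e) {
            throw new IllegalStateException("could not create a document", e);
        }
    }

    /** Asks the implementation that owns this document, not a globally discovered one. */
    static boolean supports(Document doc, String feature, String version) {
        return doc.getImplementation().hasFeature(feature, version);
    }

    public static void main(String[] args) {
        Document doc = newDocument();
        System.out.println("Implementation: " + doc.getImplementation().getClass().getName());
        System.out.println("Core 2.0: " + supports(doc, "Core", "2.0"));
        System.out.println("Core 3.0: " + supports(doc, "Core", "3.0"));
        System.out.println("LS 3.0:   " + supports(doc, "LS", "3.0"));
    }
}
```

A check like this only says what the document's implementation claims to support. It does not say whether XMLC's internals can cope with that feature, which is the open question here.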

Hmmm....  Unless I get good answers on this, I'm still leaning toward wanting a Xerces2 implementation of XMLC.  However, if the above turn out not to be issues, then I guess I'm right back with you in your suggestion to internalize the DOM and Parser implementations.

Jake

At 06:28 AM 1/24/2003 +0800, you wrote:
In regard to wrapping XMLC's version of Xerces1 into our own namespace.....

I'm not clear on how this would work?  Does this mean that people using XMLC would no longer access, say.... org.w3c.dom.Document and the like?  Because that is the issue at hand.  In order to think about putting XMLC in a webapp's WEB-INF/lib directory, it can't have any endorsed packages.  org.w3c.dom is an endorsed package and the JDK or jars added to endorsed directories will override any we add to XMLC.  The same goes for org.xml.sax and javax.xml.*.  If we do that and we want to keep XMLC totally independent of any outside parser containing those packages, then we necessarily force coders to do something like access org.enhydra.xmlc.dom.Document rather than org.w3c.dom.Document.  I don't think this is a good idea at all!  What XMLC does internally shouldn't affect peoples' use of the standard API's for DOM and XML programming.  This would actually hinder Barracuda since, in theory, it should work generically with any DOM templating engine, not just XMLC even though the focus has been 100% on XMLC thus far.
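
A quick way to check which location actually supplies a given class in a particular deployment is to ask the class for its code source. This is a small diagnostic sketch, not part of XMLC; ClassOriginDemo and originOf are invented names, and classes loaded by the JDK itself may report no location at all, which the code treats as "JDK / bootstrap".

```java
import java.net.URL;
import java.security.CodeSource;
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical diagnostic helper; the names are illustrative only.
class ClassOriginDemo {

    /** Returns "class name <- location", where the location is a jar/directory URL if known. */
    static String originOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        URL location = (source == null) ? null : source.getLocation();
        String from = (location == null) ? "JDK / bootstrap" : location.toString();
        return type.getName() + " <- " + from;
    }

    public static void main(String[] args) {
        // The standard interface everyone codes against.
        System.out.println(originOf(org.w3c.dom.Document.class));
        // The concrete factory that JAXP discovered in this environment.
        System.out.println(originOf(DocumentBuilderFactory.newInstance().getClass()));
        // For comparison: a class that certainly comes from the application itself.
        System.out.println(originOf(ClassOriginDemo.class));
    }
}
```

Run from inside a webapp, output like this would show whether org.w3c.dom.Document comes from the JDK, an endorsed directory, or WEB-INF/lib.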

I wasn't very clear in my previous post. There are basically three parts of Xerces: the Parser, the DOMImpl, and the W3C/JAXP interfaces. Only the Parser and DOMImpl would be moved into XMLC's private namespace. The W3C/JAXP interfaces would be packaged in separate jars, and those jars can probably be deleted if you deploy XMLC on a system that already has those interfaces.

As I see it, we either need to move on to support the latest Xerces2 or go with something like DOM4J.  The DOM4J thing will probably make it harder to port existing apps to XMLC since many use the DOM directly in addition to XMLC's api, but it would solve our problems in the long run since DOM4J doesn't use endorsed API's except in providing objects that transform DOM and XML standard API objects to DOM4J objects and vice-versa.  Sticking with standard DOM stuff is the way to go if we want to keep compatibility with existing XMLC apps, but now we have to choose an implementation (at least for LazyDOM).  The one that everyone uses currently is Xerces2.  Maybe we can ask the Xerces group what the plans for Xerces2 are.  Will it be dropped and refactored for a newer and better Xerces3 branch like Xerces1 was, or does Xerces2 have some staying power?  My gut feeling is that the latter is true.

As I said in my first post, XMLC depends on the implementation of Xerces, not just the official API. If we move to Xerces 2 in its current form, we will just end up with the same problem. We would need a specific version of Xerces 2 to be bundled with XMLC, and that causes the same problem as shipping a patched Xerces with XMLC.

Let me put it more concretely. Let's say we move XMLC to Xerces 2.2.1 and, very luckily, we don't need to patch Xerces. However, this just means that this new version of XMLC is tied to Xerces 2.2.1. We can't be sure it will even work with Xerces 2.2.2, because XMLC will depend on the internal implementation of 2.2.1, which could potentially change in 2.2.2.

My gut feeling is that we can't build XMLC simply on top of the DOM API and JAXP. DOM/JAXP is a processing API. It lets a Java program read in an XML file and process it programmatically. It does not maintain extra information about what the document originally looked like. XMLC, however, is a presentation API. It has to maintain that information to make sure documents are rendered correctly and changed as little as possible. The APIs we need to access this information simply do not exist in DOM/JAXP. DOM4J may present a similar problem.

The other issue we have in wrapping Xerces1 into our own namespace is that we will then never have any motivation to move it out.  We lose accessibility to performance increases that a package like Xerces2 might provide over Xerces1.  We may also have difficulty working with  new API's such as Jaxen which deal with standard API's + DOM4J, JDOM, and a couple others.  It seems like wrapping everything into our own namespace causes more problems than it solves.

XMLC has a different usage pattern than most XML applications. Most XML applications simply read the XML file, extract the data, and are done with the document. XMLC's usage pattern is to construct the document, copy it, modify it a bit, and output it, and the copy/modify step gets repeated a lot. This was the original motivation for developing LazyDOM. For performance gains, we can probably do better by enhancing LazyDOM than by upgrading to the latest Xerces. This is just my feeling; I don't have numbers to back it up yet.
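
To make the construct/copy/modify/output pattern concrete, here is a rough sketch using only the standard DOM and JAXP APIs. It is not LazyDOM and not XMLC-generated code: TemplateCopyDemo, the element names "page" and "greeting", and the helper methods are all invented for illustration. Note that it deep-copies the whole template on every render.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo of the usage pattern; plain DOM, not LazyDOM.
class TemplateCopyDemo {

    /** Builds the template once; a real XMLC page class would come from a compiled document. */
    static Document buildTemplate() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            Element page = doc.createElement("page");
            Element greeting = doc.createElement("greeting");
            greeting.setTextContent("placeholder");
            page.appendChild(greeting);
            doc.appendChild(page);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException("could not build template", e);
        }
    }

    /** Copies the template, changes one node in the copy, and serializes the copy. */
    static String render(Document template, String text) {
        try {
            Document copy = (Document) template.cloneNode(true); // deep copy per request
            Element greeting = (Element) copy.getElementsByTagName("greeting").item(0);
            greeting.setTextContent(text);

            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(copy), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not render page", e);
        }
    }

    public static void main(String[] args) {
        Document template = buildTemplate();
        // The copy/modify/output step is what gets repeated for every request.
        System.out.println(render(template, "Hello, Jake"));
        System.out.println(render(template, "Hello, David"));
    }
}
```

The cost of that eager deep copy is paid even when only one node changes, and that cost is the one at stake in this usage pattern.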

Please point out the errors in my logic.  Otherwise, the issues I have raised need to be addressed before we move on wrapping all this stuff into our own namespace.  Maybe I just don't understand how it would work.  If so, please explain.

I hope that explains it.

I think the main problem with XMLC is that it is difficult to deploy because it requires users to change the configuration of the underlying environment. This adds confusion for new users and causes problems when deploying XMLC-based applications in a hosted environment. The conflict with the environment comes from the following two sources.

1. The incompatibilities between the W3C API in the underlying system and the one XMLC depends on.

Richard has solved the first issue by using callByReflection in DOMOps. DOMOps calls the XMLC-specific DOM Level 3 methods on a standard DOM Level 2 interface using reflection. This just means XMLC no longer depends on the existence of the Level 3 interfaces to work; however, it does not mean XMLC works with any DOM Level 2 implementation. The implementation still needs to have the Level 3 methods that XMLC needs.

2. The conflict between the Xerces Parser/DOMImpl classes when the underlying system already has them.

Moving the Xerces Parser/DOMImpl to XMLC's private name space would solve this problem.
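
The reflection approach described under point 1 can be sketched roughly as follows. This is not the actual DOMOps code: the callByReflection signature here is an assumption, and OldNode, NewerImpl, and OlderImpl are stand-in types invented so the example runs on its own without depending on any particular DOM implementation.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical illustration of calling a newer method through reflection.
class ReflectiveCallDemo {

    /** Stand-in for an older ("Level 2") interface that lacks the newer method. */
    interface OldNode {
        String getNodeName();
    }

    /** An implementation that also offers a newer method not declared on OldNode. */
    public static class NewerImpl implements OldNode {
        public String getNodeName() { return "p"; }
        public String getTextContent() { return "hello"; }
    }

    /** An implementation that only has what the older interface declares. */
    public static class OlderImpl implements OldNode {
        public String getNodeName() { return "p"; }
    }

    /** Looks up a no-argument public method on the concrete class and invokes it. */
    static Object callByReflection(Object target, String methodName) {
        try {
            Method method = target.getClass().getMethod(methodName);
            return method.invoke(target);
        } catch (NoSuchMethodException e) {
            // Compiling against the old interface was fine, but this implementation
            // does not provide the newer method at run time.
            throw new UnsupportedOperationException(
                    target.getClass().getName() + " lacks " + methodName + "()", e);
        } catch (IllegalAccessException | InvocationTargetException e) {
            throw new IllegalStateException("call to " + methodName + "() failed", e);
        }
    }

    public static void main(String[] args) {
        OldNode newer = new NewerImpl();
        System.out.println(callByReflection(newer, "getTextContent"));

        OldNode older = new OlderImpl();
        try {
            callByReflection(older, "getTextContent");
        } catch (UnsupportedOperationException e) {
            System.out.println("Not supported: " + e.getMessage());
        }
    }
}
```

The second call shows the caveat from point 1: reflection removes the compile-time dependency on the newer interface, but the call still fails at run time if the implementation does not have the method.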

A minor reason this is a good solution is that it's the only feasible solution to the problem at hand. We have very limited resources available to work on the internals of XMLC. Hopefully, by making XMLC easier to use and deploy, we can attract more developers. Then we would have the resources to address the performance issues and how to really move to Xerces 2. ;)

David
