RE: [xmlc] xmlc2.3 include-ignorable-whitespace feature: msg#00019 (java.enhydra.xmlc)
Quoting 石鑫 <shixin129@xxxxxxx>:

> I have some html pages like this :
> ...
> <ul id="DemoList">
>   <li></li>
>   <li></li>
> </ul>
> ...
> and use "xmlcObject.getElementDemoList().getChildNodes()" to get all <li>
> elements,
> but in xmlc2.3 the "getChildNodes()" return a NodeList contains
> org.w3c.dom.Text Object.
> Now I use "xmlcObject.getElementDemoList().getElementsByTagName("li")"
> instead.

Your solution is more reliable than getChildNodes(). However, I want to explore this a bit more. Read on...

Are you using the HTML DOM or the XHTML DOM? The "include-ignorable-whitespace" feature applies *only* to the latter. Since HTML isn't validated, there's no way for the parser to know what whitespace is ignorable. As such, the parser makes no attempt to remove whitespace, because without the DTD telling it what to remove, any attempt may remove important whitespace.

I think there might be some confusion here. You originally expressed concern that "include-ignorable-whitespace" was "false" and wanted to be able to configure it, presumably, to "true". However, based on your example above, you are concerned that you are getting extra whitespace nodes in places where, arguably, they ought not to be. This is exactly what setting "include-ignorable-whitespace" to "false" is for. It removes ignorable whitespace. I think this is pretty much what one would want (and what you seem to be trying to achieve), and I don't see any benefit in making the feature configurable.

My guess is that you are using the HTML DOM, not the XHTML DOM. If you upgraded from XMLC-2.2.xx and, all of a sudden, began seeing extra Text nodes where they weren't created before, such as between children of <ul>, <ol>, etc., this is because Xerces-1.4.4 (which is what XMLC-2.2.xx uses) strips whitespace from HTML where Xerces2 (or NekoHTML) does not. IMO, Xerces2/NekoHTML is doing the right thing and Xerces-1.4.4 is doing the wrong thing.
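To make the difference between the two calls concrete, here is a minimal sketch that uses only the JDK's built-in non-validating XML parser as a stand-in for an XMLC-generated HTML DOM. The class name `ChildElements` and its helpers `childElements` and `parseRoot` are hypothetical names chosen for illustration; they are not part of XMLC. The sketch assumes only that the whitespace between the <li> elements reaches the DOM as Text nodes, as described above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class for illustration; not part of XMLC.
final class ChildElements {

    /** Returns only the direct element children of parent, skipping Text and other node types. */
    public static List<Element> childElements(Node parent) {
        List<Element> result = new ArrayList<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                result.add((Element) child);
            }
        }
        return result;
    }

    /** Parses markup without a DTD, so the parser cannot tell which whitespace is ignorable. */
    public static Element parseRoot(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse markup", e);
        }
    }

    public static void main(String[] args) {
        Element ul = parseRoot("<ul id=\"DemoList\">\n  <li></li>\n  <li></li>\n</ul>");
        // Counts every child node, including whitespace-only Text nodes.
        System.out.println("getChildNodes():        " + ul.getChildNodes().getLength());
        // Counts direct element children only.
        System.out.println("childElements():        " + childElements(ul).size());
        // Counts all <li> descendants, not just direct children.
        System.out.println("getElementsByTagName(): " + ul.getElementsByTagName("li").getLength());
    }
}
```

One design point worth noting: getElementsByTagName("li") returns every <li> descendant, so with nested lists it would also pick up the inner items. Filtering getChildNodes() by node type keeps only the direct children, whichever parser built the tree.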
Without a DTD to validate against, Xerces-1.4.4 has no business removing whitespace. For instance, it might remove whitespace inside <pre> tags without a DTD to tell it not to.

The best way to avoid this problem is to use the XHTML DOM, which uses the validating XML parser instead of the non-validating HTML parser.

That said, it is possible to mimic include-ignorable-whitespace="false" in the HTML parser if we are very careful about following the rules of the XHTML 1.0 Transitional DTD. If you would like to take a crack at it, take a look at XercesHTMLDOMParser.java [1]. I even have a limited attempt that I commented out. Look at the commented-out characters() method. That method might actually be correctly implemented as-is, but I wasn't 100% sure that it would be correct, so I left it commented out. It could be uncommented in a future release, but we'd have to be sure it isn't removing whitespace where it shouldn't.

[1] http://cvs.forge.objectweb.org/cgi-bin/viewcvs.cgi/xmlc/xmlc/xmlc/modules/xmlc/src/org/enhydra/xml/xmlc/parsers/xerces/XercesHTMLDOMParser.java

> XMLC is compatible with OSGi , XMLCObject can be easily used in OSGi HTTP
> Service or in Eclipse RCP , but jsp can't.
> I use XMLC with OSGi for a long time, it work very well.

I'm interested in this. Do you have any external references that could show me and others how to integrate XMLC with OSGi? You're not obligated to, but if it isn't too much trouble, it would be much appreciated.

> Sorry for my weak english.

Hey, no problem. You don't see me being able to speak Chinese, do you? You're one big step ahead of me!

Jake

> Curry
>
> > Date: Thu, 24 May 2007 01:37:13 -0500
> > To: xmlc@xxxxxxxxxxxxx
> > From: hoju@xxxxxxxx
> > Subject: Re: [xmlc] xmlc2.3 include-ignorable-whitespace feature
> >
> > Well, right now it isn't configurable, though it could be added as an option in the metadata in the future. Can you explain why you need ignorable whitespace to be included?
> > The DTD defines whitespace as ignorable or not. Why would it lie? Can I assume you are using XHTML? Please describe what is getting broken so I can better understand the problem. And are you using XMLC's DOMFormatter to output your markup or some other mechanism?
> >
> > BTW, I'm curious, how are you using OSGI with XMLC?
> >
> > Jake
> >
> > At 07:53 PM 5/23/2007, you wrote:
> > >Hi,
> > >
> > > I upgrade my application (xmlc + osgi) to xmlc 2.3 , then I found that
> > > "include-ignorable-whitespace" feature default is false. I can't find how
> > > to configure.
> > >
> > > Who have a good idea?
> > >
> > > Curry

--
You receive this message as a subscriber of the xmlc@xxxxxxxxxxxxx mailing list.
To unsubscribe: mailto:xmlc-unsubscribe@xxxxxxxxxxxxx
For general help: mailto:sympa@xxxxxxxxxxxxx?subject=help
ObjectWeb mailing lists service home page: http://www.objectweb.org/wws