
Re: [xmlc] <br clear="none"></br>: msg#00016

java.enhydra.xmlc

Subject: Re: [xmlc] <br clear="none"></br>

At 12:00 PM 4/17/2007, you wrote:
>Quoting Erik Rasmussen <rasmussenerik@xxxxxxxxx>:
>
>> Ugh! Everything is just working now. Both <textarea></textarea> and
>> <textarea/> in my document are getting output the proper way. My
>> memory of the exact circumstance where we were seeing the problem
>> before is hazy.
>>
>> I'm a little annoyed that I'm coming out of this looking like an
>> idiot, but my program is working, XMLC is working properly, and
>> there's no work needed from you, so I guess all has ended well.
>>
>
>I'll probably look into it a bit anyway, just to see if I see something odd.
>

Ok, I do see one thing, but it's not exactly a bug (debatable), only a little confusing. When using the XHTML DOM, element names are lower-case. If the XHTML DOM is used and an explicit oo.setFormat(OutputOptions.FORMAT_HTML) is supplied, things will basically work, but you will see oddities like <br></br> and <hr></hr> being output instead of the proper <br> and <hr>, respectively.

Both the XMLFormatter and HTMLFormatter define sets of tags that get special handling when the document is an XHTML or HTML document. The XMLFormatter presumes elements are in lower-case and actually forces lower-case when FORMAT_XHTML is used and the document is detected to be HTML, not XHTML (such as when not using the XMLC XHTML DOM). However, the HTMLFormatter always assumes tags are in upper-case and makes no attempt to detect that the DOM contains lower-case elements. Comparisons of lower-case tags against the upper-case special-handling categories never match, which is why <br> is not recognized as a tag with an empty content model: "br" is being looked up against "BR" in a Map or a Set, so there is no opportunity for equalsIgnoreCase() unless the code is modified.
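
To illustrate the mismatch described above, here is a minimal sketch in plain Java. It is not XMLC source code: the class name, the method names, and the contents of the tag set are all assumptions made up for the example. It only shows why a lower-case "br" misses an upper-case Set entry, and how normalizing the name before the lookup would avoid that.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; these names do not exist in XMLC.
final class TagCaseDemo {
    // Upper-case names, the way an HTML-oriented formatter might store its
    // "empty content model" category (assumed contents).
    static final Set<String> EMPTY_TAGS =
            new HashSet<>(Arrays.asList("BR", "HR", "IMG", "INPUT", "META", "LINK"));

    // Exact lookup: a lower-case name from an XHTML DOM never matches.
    static boolean isEmptyTagExact(String tagName) {
        return EMPTY_TAGS.contains(tagName);
    }

    // Normalized lookup: force upper-case before consulting the set.
    static boolean isEmptyTagNormalized(String tagName) {
        return EMPTY_TAGS.contains(tagName.toUpperCase(Locale.ROOT));
    }

    // Render an element with no children, with or without an end tag.
    static String render(String tagName, boolean emptyContentModel) {
        return emptyContentModel
                ? "<" + tagName + ">"
                : "<" + tagName + "></" + tagName + ">";
    }

    public static void main(String[] args) {
        System.out.println(render("br", isEmptyTagExact("br")));      // prints <br></br>
        System.out.println(render("br", isEmptyTagNormalized("br"))); // prints <br>
    }
}
```

Using Locale.ROOT keeps the case conversion independent of the default locale, which matters for names containing "i" under a Turkish locale.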

I will think about checking in a fix for this before the next 2.3 release. It would entail detecting lower-case elements and forcing them upper-case. Then again, maybe there could be an output option recognized only by the HTMLFormatter, something like oo.setForceHTMLLowerCase(true). Output for HTML would then be lower-case even though the DOM stores it upper-case. This would be ignored by the XMLFormatter. I'd probably also want to dump any attributes like "xml:lang" (maybe copy the value over to "lang") and "xmlns". I'd also need to replace XHTML DTDs with their HTML equivalents.
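
The attribute cleanup mentioned above could look roughly like the following sketch, which uses only the standard org.w3c.dom API. The class and method names are hypothetical and not part of XMLC, and the sketch assumes attributes can be matched by their qualified names ("xml:lang", "xmlns"). It does not touch the DOCTYPE or element-name case.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper; not part of XMLC.
final class HtmlAttributeCleanup {
    // Recursively drop XML-only attributes before HTML output:
    // copy xml:lang to lang (unless lang is already set), then remove
    // xml:lang and xmlns.
    static void stripXmlOnlyAttributes(Element element) {
        if (element.hasAttribute("xml:lang")) {
            if (!element.hasAttribute("lang")) {
                element.setAttribute("lang", element.getAttribute("xml:lang"));
            }
            element.removeAttribute("xml:lang");
        }
        element.removeAttribute("xmlns"); // no-op when the attribute is absent
        for (Node child = element.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                stripXmlOnlyAttributes((Element) child);
            }
        }
    }

    // Build a tiny sample tree: <html xmlns=... xml:lang="en"><body xml:lang="fr" lang="de"/></html>
    static Element sample() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element html = doc.createElement("html");
            html.setAttribute("xmlns", "http://www.w3.org/1999/xhtml");
            html.setAttribute("xml:lang", "en");
            Element body = doc.createElement("body");
            body.setAttribute("xml:lang", "fr");
            body.setAttribute("lang", "de");
            html.appendChild(body);
            doc.appendChild(html);
            return html;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element html = sample();
        stripXmlOnlyAttributes(html);
        System.out.println("html lang = " + html.getAttribute("lang"));
        System.out.println("html has xmlns = " + html.hasAttribute("xmlns"));
    }
}
```

Keeping an existing "lang" value rather than overwriting it is a design choice of this sketch; a real implementation might prefer the "xml:lang" value instead.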

I'm not 100% sure this is worth it. If one has chosen to use the XHTML DOM and taken care to write valid XHTML documents, why would they want to output it as HTML? One answer might be "so that developers can write markup that validates in their editors, but still be able to output basic HTML to less capable user-agents that only understand HTML and foul up on XHTML". This is certainly a legitimate use-case. I'm still not sure whether it's worth the effort, especially when I put so much effort into allowing HTML to be formatted as XHTML in the XMLFormatter.

Comments?


Jake

>Jake
>
>> Thanks for writing and taking the time to support such an excellent
>> project.
>>
>
>No problem. Just trying to get things stabilized for a 2.3 release, though I
>have no specific date for that yet.
>
>Jake
>
>> Erik
>>
>> On Apr 17, 2007, at 5:09 PM, Jacob Kjome wrote:
>> > Quoting Erik Rasmussen <rasmussenerik@xxxxxxxxx>:
>> >
>> >> I'm using the org.enhydra.xml.io.DOMFormatter directly. I can't use
>> >> the XMLCContext because I'm not using servlets and
>> >> HttpServletRequest/Response.
>> >>
>> >> I think I figured it out. I had a stray
>> >> outputOptions.setFormat(OutputOptions.FORMAT_HTML) that I added
>> >> around the time of the upgrade to beta3. The reason that I added it
>> >> might be a bug you could fix, though.
>> >>
>> >
>> > Ahhhhh.... yes, that would probably do it. I'll have to look at the
>> > HTMLFormatter to see how it is dealing with XHTML documents. What I
>> > might do is force FORMAT_XHTML when it is detected that the current
>> > document is XML-based, causing the XMLFormatter to be used. However,
>> > the HTMLFormatter shouldn't be outputting <br></br> anyway. I'll try
>> > to look at it tonight.
>> >
>> >> It's kind of like the opposite of the <br> problem. Just like how
>> >> some browsers don't handle self-closed <script/> tags properly, a
>> >> self-closed <textarea/> tag will be rendered as a textarea
>> >> containing the rest of the document. So there should *never* be a
>> >> <textarea/> in an HTML or XHTML document.
>> >>
>> >> Could the setEnableXHTMLCompatibility() flag handle this?
>> >>
>> >
>> > Yes, it should. However, are you actually seeing this when you use
>> > FORMAT_XHTML (or when you don't specify the format explicitly, in
>> > which case FORMAT_XHTML gets chosen for XHTML documents
>> > automatically)? You're saying that...
>> >
>> > this:
>> > <textarea></textarea>
>> >
>> > turns into this?
>> > <textarea />
>> >
>> > Is that right? If you can verify that, that would be helpful. I'll
>> > test at home tonight.
>> >
>> >> Is there even a way for a dtd to require pcdata content?
>> >>
>> >
>> > Well, any way you look at it, modifying the DTD is not a good option.
>> > And just because a tag is written in the short form doesn't mean it
>> > can't hold PCDATA. It just means that, currently, there is nothing
>> > inside the tag. Both the long and short form are correct. Of course
>> > some browsers don't agree, which is why we prefer one form over the
>> > other; hence setEnableXHTMLCompatibility(true).
>> >
>> >> Thanks for your help, (sorry for wasting your time)
>> >
>> > No problem. Looks like it ferreted out another issue that we can
>> > address.
>> >
>> >
>> > Jake
>> >
>> >> Erik
>> >>
>> >> On Apr 17, 2007, at 8:11 AM, Jacob Kjome wrote:
>> >>> BTW, how are you actually writing the document? Are you using
>> >>> XMLCContext (this is a webapp, right)? Here's the pattern....
>> >>>
>> >>> XMLCContext context = XMLCContext.getContext(servletObj);
>> >>> //OR
>> >>> //XMLCContext context = XMLCContext.getContext(servletContextObj);
>> >>>
>> >>> .....
>> >>> ....
>> >>> ....
>> >>> org.enhydra.xml.io.OutputOptions oo =
>> >>>     context.createOutputOptions(req, resp, xmlObj);
>> >>> oo.setEnableXHTMLCompatibility(true);
>> >>> oo.setUseAposEntity(false);
>> >>> oo.setOmitXMLHeader(true);
>> >>> //optionally override the MIME type defined by the DOMFactory implementation
>> >>> oo.setMIMEType("text/html");
>> >>> context.writeDOM(req, resp, oo, xmlObj);
>> >>>
>> >>>
>> >>> I want to make sure you are using XMLC's DOM formatting and not
>> >>> some other non-specialized formatter. I can't guarantee what other
>> >>> formatters will output.
>> >>>
>> >>>
>> >>> Jake
>> >>>
>> >>> At 12:52 AM 4/17/2007, you wrote:
>> >>>>
>> >>>> That's curious. I just tested using the HEAD (same as 2.3-beta3
>> >>>> plus one or two irrelevant updates) source and I always get <br />
>> >>>> under the XHTML 1.0 strict DTD and <br clear="none" /> under the
>> >>>> XHTML transitional DTD. I never get <br clear="none"></br>. And
>> >>>> it's not just setEnableXHTMLCompatibility(true) that goes into
>> >>>> choosing how the tag is ended. My memory failed me a little bit
>> >>>> there. Everything appears to be working from what I can see.
>> >>>>
>> >>>> Are you absolutely sure you are using 2.3-beta3? I can't seem to
>> >>>> reproduce <br clear="none"></br>. Can you send a source document
>> >>>> that contains <br />, which comes out as <br clear="none"></br>?
>> >>>> I'm a bit baffled by your findings. Originally I just assumed it
>> >>>> was a case that I must have overlooked, but based on a review of
>> >>>> the code and on my testing, it seems to be accounted for and
>> >>>> behaving properly.
>> >>>>
>> >>>>
>> >>>> Jake
>> >>>>
>> >>>> At 03:12 PM 4/16/2007, you wrote:
>> >>>>> Dunno if it helps, but this BR breaking began after an upgrade
>> >>>>> from 2.3-beta to 2.3-beta3.
>> >>>>>
>> >>>>> Erik
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> --
>> >>>> You receive this message as a subscriber of the xmlc@xxxxxxxxxxxxx
>> >>>> mailing list.
>> >>>> To unsubscribe: mailto:xmlc-unsubscribe@xxxxxxxxxxxxx
>> >>>> For general help: mailto:sympa@xxxxxxxxxxxxx?subject=help
>> >>>> ObjectWeb mailing lists service home page: http://www.objectweb.org/wws
>> >>>
>> >>>
>> >>>
>> >>
>> >>
>> >>
>> >
>> >
>> >
>> >
>> >
>>
>>
>>
>
>
>
>
>
>


