[jira] Resolved: (XERCESJ-1005) White space around ampersand is lost while : msg#00050

text.xml.xerces-j.devel

Subject: [jira] Resolved: (XERCESJ-1005) White space around ampersand is lost while parsing xml

Message:

The following issue has been resolved as INCOMPLETE.

Resolver: Michael Glavassevich
Date: Tue, 31 Aug 2004 11:22 AM

Here's the sequence of SAX events generated for that document. Note that all of
the characters are reported.

setDocumentLocator(locator=org.apache.xerces.parsers.AbstractSAXParser$LocatorProxy@cdfc9c)
startDocument()
startElement(uri="",localName="abc",qname="abc",attributes={})
characters(text="a and b ")
startEntity(name="amp")
characters(text="&")
endEntity(name="amp")
characters(text=" c")
endElement(uri="",localName="abc",qname="abc")
endDocument()
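
A trace like the one above can be reproduced with a small handler. The following is only a sketch: the class name SaxTrace is made up for illustration, it uses whatever SAX parser the JDK provides by default rather than a specific Xerces build, and it omits the startEntity/endEntity events, which would need a LexicalHandler. How the text is split across characters events may differ between parsers, but the concatenation of all chunks should preserve the white space.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Xerces or Castor.
class SaxTrace {
    // Parses the given XML string and returns a readable list of SAX events.
    static List<String> trace(String xml) throws Exception {
        List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                events.add("startElement(qname=\"" + qName + "\")");
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Record each chunk exactly as the parser reports it.
                events.add("characters(text=\"" + new String(ch, start, length) + "\")");
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("endElement(qname=\"" + qName + "\")");
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        trace("<abc>a and b &amp; c</abc>").forEach(System.out::println);
    }
}
```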

I'm fairly certain there was another bug report like this related to Castor
before. My guess is that whatever handler is registered on the parser has some
clever whitespace-stripping code in its implementation of
ContentHandler.characters() and applies it to each characters event
separately. Xerces isn't doing this.
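
To illustrate the suspected failure mode, here is a minimal sketch that replays the three characters chunks from the trace above into two handlers. Both handler classes and the class name ChunkedTextDemo are hypothetical; this is not Castor's actual code, only an assumption about what per-event trimming would look like. One handler trims every chunk, the other accumulates the chunks untouched.

```java
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo; the handlers below are illustrations, not Castor code.
class ChunkedTextDemo {
    // Suspected bug: strips white space from every chunk independently.
    static class TrimmingHandler extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(new String(ch, start, length).trim());
        }
    }

    // Safer pattern: buffer all chunks and decide about trimming
    // only once the element's full text is known.
    static class AccumulatingHandler extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    // Feeds each chunk to the handler as a separate characters event.
    static void replay(DefaultHandler handler, String... chunks) throws SAXException {
        for (String chunk : chunks) {
            char[] ch = chunk.toCharArray();
            handler.characters(ch, 0, ch.length);
        }
    }

    static String trimmed(String... chunks) throws SAXException {
        TrimmingHandler h = new TrimmingHandler();
        replay(h, chunks);
        return h.text.toString();
    }

    static String accumulated(String... chunks) throws SAXException {
        AccumulatingHandler h = new AccumulatingHandler();
        replay(h, chunks);
        return h.text.toString();
    }

    public static void main(String[] args) throws SAXException {
        String[] chunks = {"a and b ", "&", " c"};
        System.out.println(trimmed(chunks));      // prints "a and b&c"
        System.out.println(accumulated(chunks));  // prints "a and b & c"
    }
}
```

The trimming handler reproduces the reported output "a and b&c" even though every character was delivered by the parser.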
---------------------------------------------------------------------
View the issue:
http://issues.apache.org/jira/browse/XERCESJ-1005

Here is an overview of the issue:
---------------------------------------------------------------------
Key: XERCESJ-1005
Summary: White space around ampersand is lost while parsing xml
Type: Bug

Status: Resolved
Priority: Minor
Resolution: INCOMPLETE

Project: Xerces2-J
Versions:
2.6.2

Assignee:
Reporter: Sumit Arora

Created: Mon, 30 Aug 2004 12:42 PM
Updated: Tue, 31 Aug 2004 11:22 AM

Description:
White space on either side of the ampersand symbol (&) disappears when
parsing XML. For example, the XML

<abc>a and b &amp; c</abc>

generates the element:

a and b&c

instead of:

a and b & c

A similar problem existed with parsing of double quotes (&quot;) in version 2.3
but seems to have been fixed in later versions.

I am seeing this behavior while using Castor (which uses the Xerces API
internally), so I am not sure which component of the Xerces API causes this
behavior. More specifically, I see this behavior in the
unmarshal(java.io.Reader) method of the org.exolab.castor.xml.Unmarshaller
class.


---------------------------------------------------------------------
JIRA INFORMATION:
This message is automatically generated by JIRA.

If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa

If you want more information on JIRA, or have a bug to report see:
http://www.atlassian.com/software/jira

