Re: Xerces Unmarshaller bug removing single whitespace: msg#00062

Subject: Re: Xerces Unmarshaller bug removing single whitespace
Hi Thimo,

There's no such thing as a "Xerces Unmarshaller", so I have no idea what
library you're referring to, but it certainly doesn't come from this
project. I doubt this is a problem with Xerces. I suspect the Unmarshaller
classes you're using are the source of the odd behaviour, possibly because
they don't handle multiple calls to the SAX characters() callback [1]
correctly.

A ContentHandler written like:

private StringBuffer buf = new StringBuffer();

public void characters(char[] ch, int start, int length)
   throws SAXException {
   // trim() runs on each chunk, not on the complete text
   buf.append(new String(ch, start, length).trim());
}

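would cause whitespace to be dropped from seemingly random points in the
document (like you're seeing). A parser may deliver one run of text in
several characters() calls, and where it splits depends on its internal
buffering, which would explain why adding line breaks earlier in the file
makes the problem disappear.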
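
To make the effect concrete, here is a minimal sketch that feeds the same
text to two handlers in two chunks, the way a parser might when the text
straddles a buffer boundary. The class and method names
(ChunkedCharactersDemo, trimPerChunk, trimAtEnd) and the split point are
made up for illustration; they are not taken from your code or from the
unmarshalling library you're using.

```java
import org.xml.sax.helpers.DefaultHandler;

public class ChunkedCharactersDemo {

    /** Problematic: trims every chunk, so a space at a chunk boundary is lost. */
    static class TrimPerChunkHandler extends DefaultHandler {
        final StringBuilder buf = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            buf.append(new String(ch, start, length).trim());
        }
    }

    /** Safer: appends chunks untouched and trims once, when the element ends. */
    static class TrimAtEndHandler extends DefaultHandler {
        final StringBuilder buf = new StringBuilder();
        String value;

        @Override
        public void characters(char[] ch, int start, int length) {
            buf.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            value = buf.toString().trim();
            buf.setLength(0); // reset for the next element
        }
    }

    /** Delivers each chunk as a separate characters() call, as a parser may. */
    static String trimPerChunk(String... chunks) {
        TrimPerChunkHandler handler = new TrimPerChunkHandler();
        for (String chunk : chunks) {
            handler.characters(chunk.toCharArray(), 0, chunk.length());
        }
        return handler.buf.toString();
    }

    /** Same chunked delivery, followed by the endElement() call that trims. */
    static String trimAtEnd(String... chunks) {
        TrimAtEndHandler handler = new TrimAtEndHandler();
        for (String chunk : chunks) {
            handler.characters(chunk.toCharArray(), 0, chunk.length());
        }
        handler.endElement("", "subjectmark", "subjectmark");
        return handler.value;
    }

    public static void main(String[] args) {
        System.out.println(trimPerChunk("No specific", " subject")); // No specificsubject
        System.out.println(trimAtEnd("No specific", " subject"));    // No specific subject
    }
}
```

If the handler in your library trims (or otherwise post-processes) inside
characters(), moving that step to endElement() is the usual fix.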

Thanks.

[1]
http://xerces.apache.org/xerces2-j/javadocs/api/org/xml/sax/ContentHandler.html#characters(char[],%20int,%20int)

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas@xxxxxxxxxx

"Thimo von Rauchhaupt" <Thimo.Rauchhaupt@xxxxxxxx> wrote on 11/30/2007
08:54:56 AM:

> Hello,
>
> When using Xerces (2.9.0 as well as 2.9.1) for unmarshalling it removes
> (from line 101:)
>
> <subjectmark><![CDATA[No specific subject]]></subjectmark>
>
> the single whitespace between "specific" and "subject". In the loaded
> object the String value " No specificsubject" can be found.
>
> The strange behavior is that if I enter some line breaks above the last
> object tag (question) from
>
>    </question>
>    <question>
>
> To
>
>    </question>
>
>
>
>
>
>
>    <question>
>
> the bug does not occur. Also strange is that the same tag (subjectmark)
> with the same value occurs many times in the file, but only this one is
> parsed wrongly.
>
> My questions are:
> 1) Can anybody tell me if I did something wrong?
> 2) Is this a bug? Can anybody tell me how to report this bug / in which
> component? The bug reporting page is awfully complicated. I can only
> read old bug reports; I cannot find a data entry page.
>
> Many thanks in advance,
> Thimo
>
>
> P.S.: My java code is:
>
> // aFileToImport is the attached file AnonymizedImport.xml
> FileInputStream fis = new FileInputStream(aFileToImport);
> // Exporter.DEFAULT_ENCODING means UTF8
> InputStreamReader isr =
>     new InputStreamReader(fis, Exporter.DEFAULT_ENCODING);
>
> Unmarshaller tempUnmarshaller = new Unmarshaller();
> Mapping tempMapping = new Mapping();
>
> // Exporter.XML_MAPPING_FILE: see attached file import.xml
> tempMapping.loadMapping(
>     Exporter.class.getClassLoader().getResource(Exporter.XML_MAPPING_FILE));
> tempUnmarshaller.setMapping(tempMapping);
> tempUnmarshaller.setDebug(stdlog.isDebugEnabled());
> ImportExportBean tempImportBean =
>     (ImportExportBean) tempUnmarshaller.unmarshal(isr);
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: j-users-unsubscribe@xxxxxxxxxxxxxxxxx
> For additional commands, e-mail: j-users-help@xxxxxxxxxxxxxxxxx


