
Re: Character encodings: msg#00245

text.xml.exist

Subject: Re: Character encodings

Hi,

we finally tracked down the character encoding issue to a problem in the
Apache xmlrpc lib. I uploaded a patched xmlrpc library to

http://prdownloads.sourceforge.net/exist/xmlrpc-1.2-patched.jar

Replace the old jar in $EXIST_HOME/lib/core with the patched one, and rename
the patched jar to xmlrpc-1.2.jar (otherwise start.jar won't find it).

For the interested, here's what I changed:

In org.apache.xmlrpc.XmlRpc.parse(), I replaced line 421:

parser.parse(new InputSource (is));

with

InputStreamReader reader = new InputStreamReader(is, getEncoding());
parser.parse(new InputSource (reader));

The SAX parser assumed the wrong encoding for the input stream, even though
the XML declaration correctly declares UTF-8. I don't really know why. I tried
different parsers, but the result was always the same.
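
To illustrate the change: when the InputSource wraps a Reader, the parser
receives already-decoded characters and no longer has to detect the byte
encoding itself. Below is a minimal standalone sketch that uses the JDK's
built-in SAX parser rather than the xmlrpc classes. The class and method names
are made up for the example, and the encoding argument stands in for
getEncoding().

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of the xmlrpc library.
final class ExplicitEncodingParse {

    // Parses the stream and returns all character data found in it.
    // The bytes are decoded by the Reader, not by the SAX parser.
    static String parseText(InputStream is, String encoding) {
        final StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times per text node, so accumulate.
                text.append(ch, start, length);
            }
        };
        try {
            InputStreamReader reader = new InputStreamReader(is, encoding);
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(reader), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Non-ASCII text written with escapes to stay independent of the
        // source file encoding.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<value>Gr\u00fc\u00dfe</value>";
        InputStream is =
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(parseText(is, "UTF-8"));
    }
}
```

With a character stream, the parser ignores the encoding named in the XML
declaration, so the encoding passed to the InputStreamReader has to match the
bytes actually sent.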

Also, org.apache.xmlrpc.XmlWriter.chardata() limits the range of valid XML
characters to < 0xFF. I had to learn that the XMLRPC spec officially allows
only ASCII characters in strings; everything else is supposed to be Base64
encoded. If we accepted that, we couldn't use the string type anywhere. I
have therefore simply removed the limit.
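
For illustration only, here is a rough sketch of what character-data escaping
without an upper limit could look like. This is not the actual Apache
XmlWriter code. The class and method names are hypothetical, and the handling
of control characters is an assumption made for the example.

```java
// Hypothetical sketch; not the org.apache.xmlrpc.XmlWriter implementation.
final class CharDataSketch {

    // Escapes XML markup characters and passes every other character
    // through unchanged, including characters above 0xFF.
    static String escapeCharData(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '<':
                    out.append("&lt;");
                    break;
                case '>':
                    out.append("&gt;");
                    break;
                case '&':
                    out.append("&amp;");
                    break;
                case '\t':
                case '\n':
                case '\r':
                    // Whitespace control characters are legal in XML.
                    out.append(c);
                    break;
                default:
                    // Assumption: still reject the remaining control
                    // characters below 0x20, which XML 1.0 does not allow.
                    if (c < 0x20) {
                        throw new IllegalArgumentException(
                                "Invalid XML character: 0x"
                                + Integer.toHexString(c));
                    }
                    // No upper bound: non-ASCII characters are written as-is
                    // and rely on the writer's output encoding (e.g. UTF-8).
                    out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeCharData("a<b & \u4e2d\u6587"));
    }
}
```

Writing the characters as-is only works if the writer's output encoding can
represent them, which is why the reading side has to decode with the same
encoding.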

I'm not really happy with these fixes. I really think we should move from
XMLRPC to REST as the main interface sooner or later.

Wolfgang



