osdir.com
mailing list archive

Subject: Illegal characters, can xmlbeans be forgiving? - msg#00062

List: text.xml.xmlbeans.user

Hi,

My application parses XML from many different sources (it's an RSS
reader/podcast receiver).
Before I switched to XMLBeans I was using an XML parser called NanoXML, which
didn't mind some illegal characters, especially when wrapped in CDATA.
Now XMLBeans stumbles over the illegal chars below: (“) (it throws an
exception).

....
<description><![CDATA[
Miljenko “Mike� Grgich first gained international recognition at
the celebrated “Paris Tasting� of 1976. They had chosen Mike’s 1973
Chateau Montelena Chardonnay as the finest white wine in the world.
Today, Mike oversees daily operations at his winery Grgich Hills. His
aim, year after year, is to improve the quality of their [...]]]></description>
......

Is there any way I can set an option to ignore illegal chars and go on? For me
this could be a deal-breaker. Unfortunately, I can't expect all XML out on the
web to be "nice and tidy".

Thanks for the help!
Cheers / Christophe


RE: Illegal characters, can xmlbeans be forgiving?

Hi Christophe

It's very unlikely that the characters are the problem - all Unicode characters
are allowed in XML - see e.g. http://www.xml.com/axml/testaxml.htm (section 2.2)
and hence in XmlBeans. What is more likely is that the characters are not
encoded (as bytes) in the way XmlBeans expects. By default XmlBeans assumes
UTF-8 encoding. Yours are probably ISO8859_1 or some such thing.

If you want to play around with character encoding have a look at
XmlOptions.setCharacterEncoding().

Cheers,
Lawrence

> -----Original Message-----
> From: Christophe Bouhier (MC/ECM) [mailto:Christophe.Bouhier@xxxxxxxxxxxx]
> Sent: Wednesday, December 14, 2005 6:04 PM
> To: 'user@xxxxxxxxxxxxxxxxxxx'
> Subject: Illegal characters, can xmlbeans be forgiving?
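
The encoding mismatch described in the reply can be reproduced with the JDK alone. The following is only a rough sketch, not anything taken from XMLBeans: the class name EncodingCheck and its two helper methods are made up for illustration, and windows-1252 is just a guess at what a feed might really be using (the reply suggests ISO8859_1 as another candidate). It decodes the raw bytes strictly as UTF-8 and falls back to a caller-chosen charset only when that fails.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of XMLBeans.
public class EncodingCheck {

    /** Decodes strictly: malformed input raises an exception instead of being silently replaced. */
    public static String decodeStrict(byte[] bytes, Charset charset) throws CharacterCodingException {
        return charset.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes))
                .toString();
    }

    /** Tries UTF-8 first; on failure uses the fallback charset (a guess, not a detection). */
    public static String decodeWithFallback(byte[] bytes, Charset fallback) {
        try {
            return decodeStrict(bytes, StandardCharsets.UTF_8);
        } catch (CharacterCodingException e) {
            return new String(bytes, fallback);
        }
    }

    public static void main(String[] args) {
        // Assumption: the feed was really written in windows-1252.
        Charset cp1252 = Charset.forName("windows-1252");
        String text = "\u201CMike\u201D"; // "Mike" in curly quotes

        // In windows-1252 the curly quotes are the single bytes 0x93 and 0x94,
        // which are not valid on their own in UTF-8.
        byte[] raw = text.getBytes(cp1252);
        try {
            decodeStrict(raw, StandardCharsets.UTF_8);
            System.out.println("decoded as UTF-8 without errors");
        } catch (CharacterCodingException e) {
            System.out.println("not valid UTF-8: " + e);
        }

        // With the fallback charset the original text comes back intact.
        System.out.println(decodeWithFallback(raw, cp1252).equals(text));

        // The reverse mistake: UTF-8 bytes of a left curly quote read as
        // windows-1252 become three unrelated characters.
        String misread = new String("\u201C".getBytes(StandardCharsets.UTF_8), cp1252);
        System.out.println(misread.length());
    }
}
```

Once the text has been decoded into a String, it can be handed to the parser as characters rather than bytes, or the guessed charset name can be tried with the XmlOptions.setCharacterEncoding() call mentioned above. Which charset a given feed really uses still has to come from its XML declaration or HTTP headers; the fallback here is only a guess.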