Hi,
I agree with the points you mentioned, but the problem is exactly that
Xerces doesn't behave the way you describe (and the way I expected it to).
The characters() method I use is the standard SAX characters() method (the
same as the one in your Fibonacci example):
-------
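// Accumulate only the slice of the array that belongs to this call.
public void characters(char[] text, int start, int length) {
    if (processing) {
        buffer.append(text, start, length);
    }
}

// Start a fresh buffer when the element of interest opens.
public void startElement(...) {
    if (localName.equals(codeName)) {
        buffer = new StringBuffer();
        processing = true;
    }
}

// Print the accumulated text when the element closes.
public void endElement(...) {
    if (localName.equals(codeName)) {
        processing = false;
        System.out.println(buffer.toString());
        buffer.setLength(0);
    }
}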
-------
It's code I'd expect to work. The problem is that
- every char array passed to characters() has start=0 and a length covering
the full length of the passed text (so the array contains no data outside
the range I am meant to read)
- it repeats what was passed before, with all chars of the array marked as
valid (to be accepted)
So what I get for "1\n2\n3\n4\n" is the following (the real lines are
longer, not just one digit):
"1\n" : start =0, length =2
"1\n" : start =0, length =2
"2\n" : start =0, length =2
"1\n" : start =0, length =2
"2\n" : start =0, length =2
"3\n" : start =0, length =2
"1\n" : start =0, length =2
"2\n" : start =0, length =2
"3\n" : start =0, length =2
"4\n" : start =0, length =2
To me it seems that there may be a version mismatch in InputSource or in
other classes which pass the document to SAX itself.
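
A self-contained test case along the following lines could help narrow this
down. It is only a sketch under stated assumptions: the class name
CharactersTrace, the element name "code" and the sample document are made up
for illustration, and it uses whichever parser JAXP's SAXParserFactory finds
on the classpath, which may not be the same Xerces build as above. It logs
the start, length and array size of every characters() call and accumulates
the text the same way as the handler above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical test class, not part of Xerces or of the original application.
public class CharactersTrace {

    // Parses xml and returns the accumulated text of each element named tag.
    public static List<String> collect(String xml, final String tag) throws Exception {
        final List<String> results = new ArrayList<String>();
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // needed so that localName is populated
        SAXParser parser = factory.newSAXParser();

        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder buffer = new StringBuilder();
            private boolean processing = false;

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (localName.equals(tag)) {
                    buffer.setLength(0);
                    processing = true;
                }
            }

            @Override
            public void characters(char[] text, int start, int length) {
                // Log the raw arguments of every call to see how the parser chunks the text.
                System.err.println("characters: start=" + start
                        + ", length=" + length + ", array size=" + text.length);
                if (processing) {
                    // Copy only the slice that belongs to this call.
                    buffer.append(text, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (localName.equals(tag)) {
                    processing = false;
                    results.add(buffer.toString());
                }
            }
        };

        parser.parse(new InputSource(new StringReader(xml)), handler);
        return results;
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample document: CDATA content inside a <code> element.
        String xml = "<root><code><![CDATA[1\n2\n3\n4\n]]></code></root>";
        for (String text : collect(xml, "code")) {
            System.out.print(text);
        }
    }
}
```

If this prints each line once with one parser jar and repeats lines with
another, that would point at the parser build rather than at the handler.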
Michal
> At 3:48 PM +0000 1/16/04, Michal Sankot wrote:
> >I have problem with SAX bit of Xerces. I use SAX to get lines of an
> >element of specified tag and print them out.
> >I was using older version of Xerces with which it run fine. When I
> >replaced old xerces.jar with new xercesImpl.jar SAX starts to behave
> >weird.
> >
> >CDATA element content which is "1\n2\n3\n4\n" is returned by characters()
> >method as
> >"1\n"
> >"1\n"
> >"2\n"
> >
> >"1\n"
> >
> >"2\n"
> >"3\n"
> >"1\n"
> >"2\n"
> >"3\n"
> >"4\n"
> >strange (and frustrating), isn't it ?
> >
>
> I don't think so. Remember SAX parsers are not required to report all
> character data in a single call to characters. They can and do split
> nodes across multiple calls. You need to buffer and accumulate the
> data until you're ready to use it.
>
> You also have one or both of two other problems. The char array
> passed to characters() is not minimal. It normally contains other
> data not related to the current invocation. You need to use the start
> and length arguments to extract the sub-array relevant to the current
> call.
>
> Finally, the array passed to characters may be reused by the parser.
> You should not store it. Any data you need should be copied into some
> other object. See
>
> http://www.cafeconleche.org/books/xmljava/chapters/ch06s07.html
>
> for more discussion of these points.
> --