Re: Thought on future of XMLC (java.enhydra.xmlc, msg#00123)
On Friday 29 November 2002 11:38, Arno Schatz wrote:
> David,
>
> you would need to modify the DOM implementation such that it will be aware
> if there was any change which the application does to the tree. Also you
> would need more information from the parsing process: When the DOM is
> created you would need to store the beginning offset and ending offset
> (from the original HTML string) within the DOM node. The original HTML
> string must be stored in memory of course (little overhead). In the output
> process, for each DOM node we would need to look if it was changed in
> some way by the application. If yes, produce the html from the node output
> as it is done now. If no, take the substring from the original html from
> the beginning offset to the ending offset and return that as result.

That's basically what the LazyDOM does, with two small but important differences:

The LazyDOM doesn't store the *original* HTML, but rather caches the HTML that is constructed by the "standard" output process. The reason for this is simply that it is much (orders of magnitude :-) easier to create HTML text from a DOM than to create a DOM-like structure from broken HTML-like text.

The other difference is that the LazyDOM caches preformatted text per DOM node - so you still have to walk the tree and output each node. But the tree walk really isn't that much of a performance hit - the big hit is the text conversion (especially detecting characters that need to be converted to HTML entities).

That said, changing the text cache so that a complete, unchanged subtree can be output in a single operation is something I've wanted to do for a while now, and I'll probably implement it along the way when XMLC is changed to no longer depend on a specific version of Xerces - so expect this for XMLC 3.something :-)

> If you look at the changes we really make, (even if we consider URL
> mapping) mostly we are changing some leaves. (copying template rows is not
> really a changing operation on the node, as you still can use the original
> html for outputting, because the original html is of course immutable)
>
> So we would have
> 1) the size of the html as memory overhead.
> 2) need to change the parsing process to keep track of the offsets
> 3) need to have a DOM implementation which has a modified flag and a
> beginning and ending offset (probably integer)

LazyDOM already does that, minus the offset stuff.

> And we get
> 1) quite some speed in spitting out html (over the current process)

A bit, but not too much faster than the LazyDOM is my guess.

> 2) large parts of the output html will be exactly what the input (the
> original html) was.

But you spend a huge amount of time on parsing "HTML-like" stuff and forcing it into something that resembles a DOM - that's a can of worms I definitely don't want to open.

--
Richard Kunze
[ t]ivano Software, Bahnhofstr. 18, 63263 Neu-Isenburg
Tel.: +49 6102 80 99 07 - 0, Fax.: +49 6102 80 99 07 - 1
http://www.tivano.de, kunze@xxxxxxxxx
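
For readers following the thread, here is a rough sketch of the per-node text cache idea discussed in this message. It is not the LazyDOM code: `CachedNodeSketch`, `Node`, `escape`, `render` and the `conversions` counter are hypothetical names made up for illustration, and attributes, comments and the real DOM interfaces are left out. The only point it tries to show is that the tree is still walked on every output pass, while the costly entity conversion runs only for nodes whose text changed.

```java
import java.util.ArrayList;
import java.util.List;

class CachedNodeSketch {
    // Counts how often the expensive entity conversion actually runs.
    static int conversions = 0;

    // Hypothetical node type: an element (tag != null) or a text node (tag == null).
    static final class Node {
        private final String tag;
        private String text;                              // raw text, text nodes only
        private final List<Node> children = new ArrayList<>();
        private String cached;                            // preformatted text; null means stale

        private Node(String tag, String text) {
            this.tag = tag;
            this.text = text;
        }

        static Node element(String tag, Node... kids) {
            Node n = new Node(tag, null);
            for (Node k : kids) {
                n.children.add(k);
            }
            return n;
        }

        static Node text(String text) {
            return new Node(null, text);
        }

        // Any modification invalidates this node's cached output.
        // (Only meaningful for text nodes in this sketch.)
        void setText(String newText) {
            this.text = newText;
            this.cached = null;
        }

        // The tree is always walked; only stale text nodes are converted again.
        void write(StringBuilder out) {
            if (tag == null) {
                if (cached == null) {
                    cached = escape(text);
                }
                out.append(cached);
                return;
            }
            out.append('<').append(tag).append('>');
            for (Node child : children) {
                child.write(out);
            }
            out.append("</").append(tag).append('>');
        }
    }

    // The costly step: scan every character and replace the ones that need entities.
    static String escape(String s) {
        conversions++;
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    static String render(Node root) {
        StringBuilder out = new StringBuilder();
        root.write(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Node leaf = Node.text("a & b");
        Node root = Node.element("p", leaf, Node.element("b", Node.text("x")));
        System.out.println(render(root));   // first pass converts both text nodes
        System.out.println(render(root));   // second pass reuses the cached text
        leaf.setText("c < d");              // invalidates one leaf only
        System.out.println(render(root));
        System.out.println("conversions: " + conversions);
    }
}
```

Caching whole unchanged subtrees, as mentioned for a later XMLC version, would additionally need each element to hold the output of its entire subtree and to be invalidated whenever any descendant changes; the sketch does not attempt that.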