
Re: Thought on future of XMLC: msg#00118

java.enhydra.xmlc

Subject: Re: Thought on future of XMLC

Arno,

The problem with the approach you are proposing is that it's impossible to predict which parts of an HTML page will be modified and which won't. It may be possible for small projects that only use simple get/set methods, but a lot of XMLC programming is done with the DOM API, which can potentially traverse the entire page.

An alternative is possible with LazyDOM. The DOM is a tree structure. At each node, we can keep a serialized string showing how the node and its subtree would look after serialization. Since LazyDOM keeps track of which nodes are modified, we can treat a modified node's serialized copy as invalid and traverse its subtree to generate the new string. However, this would cause a large increase in memory usage, approximately O(filesize * height of the DOM tree * 2). For a 50K page with a depth of 10 levels, it comes out to about 2M of memory (ASCII becomes Unicode in Java). Some smart pruning of the tree is necessary to reduce the memory footprint enough to make this a feasible solution.
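To make the caching idea concrete, here is a minimal sketch in Java. It does not use the real LazyDOM classes: `CachedNode`, its fields, and the `serializations` counter are all hypothetical names invented for illustration, and text is written without any escaping. Each node keeps a serialized copy of its subtree, and a change clears the copy on that node and on every ancestor.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical node type for illustration; this is not the LazyDOM API. */
class CachedNode {
    private final String tag;
    private String text;                       // text emitted before the children
    private final List<CachedNode> children = new ArrayList<>();
    private CachedNode parent;
    private String cached;                     // serialized subtree, null when stale
    static int serializations = 0;             // counts rebuilds, for illustration only

    CachedNode(String tag, String text) {
        this.tag = tag;
        this.text = text;
    }

    /** Appends a child and returns this node so calls can be chained. */
    CachedNode add(CachedNode child) {
        child.parent = this;
        children.add(child);
        invalidate();
        return this;
    }

    void setText(String newText) {
        text = newText;
        invalidate();
    }

    /** A change makes the cached string of this node and all ancestors stale. */
    private void invalidate() {
        for (CachedNode n = this; n != null; n = n.parent) {
            n.cached = null;
        }
    }

    /** Rebuilds only stale nodes; unchanged subtrees return their cached string. */
    String serialize() {
        if (cached == null) {
            serializations++;
            StringBuilder sb = new StringBuilder();
            sb.append('<').append(tag).append('>').append(text);
            for (CachedNode c : children) {
                sb.append(c.serialize());
            }
            sb.append("</").append(tag).append('>');
            cached = sb.toString();
        }
        return cached;
    }
}
```

In this sketch every level of the tree holds its own copy of the text below it, which is where the "height of the DOM tree" factor in the memory estimate comes from. Pruning could mean, for example, caching only at selected levels, but that is a design choice the sketch does not explore.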

David Li
---
"It spells Mac OS X but pronounces NeXTSTEP"

On Friday, Nov 29, 2002, at 05:10 Asia/Shanghai, Arno Schatz wrote:

Hi Jake,

Sorry for not explaining properly. I guess some others understood me only because I had mentioned this somewhere else before.

When the DOM tree is created, there are a lot of nodes which will not be changed by the program. (Mostly an application program only changes nodes which have an id attribute.) So there are whole subtrees of the created DOM tree which will never be changed by the application. Such a subtree is created from an HTML string (a substring of the original HTML page). So if you want to output such an unchanged subtree, you could output the original string from the HTML file. To generate the output from the DOM, XMLC runs through the whole tree, even through these unchanged nodes, and generates the HTML. If it had a reference to the original HTML, it could output the part of the original HTML the subtree was created from.
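A rough sketch of this idea in Java follows. It is not XMLC code: `SourceNode`, its offsets, and the `replace` method are hypothetical, and it assumes a changed node is re-emitted as a complete new markup string. Each node remembers which slice of the original HTML it was parsed from, and an unchanged subtree is written by copying that slice.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical node that remembers its source range; this is not XMLC's DOM. */
class SourceNode {
    final int start;                 // offset of the subtree's first character
    final int end;                   // offset just past the subtree's last character
    private final List<SourceNode> children = new ArrayList<>();
    private SourceNode parent;
    private boolean dirty;           // true if this node or a descendant changed
    private String replacement;      // new markup for a changed node, else null

    SourceNode(int start, int end) {
        this.start = start;
        this.end = end;
    }

    /** Appends a child (children must be added in document order) and returns it. */
    SourceNode add(SourceNode child) {
        child.parent = this;
        children.add(child);
        return child;
    }

    /** Replaces this subtree's markup and flags every ancestor as dirty. */
    void replace(String markup) {
        replacement = markup;
        for (SourceNode n = this; n != null; n = n.parent) {
            n.dirty = true;
        }
    }

    /** Copies original text for clean subtrees and descends only into dirty ones. */
    void write(String original, StringBuilder out) {
        if (!dirty) {
            out.append(original, start, end);
            return;
        }
        if (replacement != null) {
            out.append(replacement);
            return;
        }
        int pos = start;
        for (SourceNode c : children) {
            out.append(original, pos, c.start);   // markup between children
            c.write(original, out);
            pos = c.end;
        }
        out.append(original, pos, end);
    }
}
```

With this layout, a page where only one element with an id is changed is written as a few substring copies plus the new markup for that element, instead of a full traversal.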

The current time consumption for outputting HTML is quite high, as you might know. But there are other ways to speed it up as well. So the question between me and Mark is whether it is better to use the original HTML string or the HTML produced by the DOM.

Is that understandable?

Arno


