webhiker@xxxxxxxxxx wrote:
One thing you forgot to test, and hence include in your report, was
performance when using non "large" documents.
In writing the XML:DB benchmarks (xmldbench.sourceforge.net) I've noted
that using XML documents of about 30K in size, eXist
falls over at anything above about 30 000 resources, whereas
Xindice 1.0 AND 1.1b happily perform up to about 100 000
documents.
Your test data of 5MB documents would have missed this important note,
and I still question the reasoning in having a 5MB xml document anyway -
usually this is symptomatic of a design problem. (IMHO)
I wouldn't take that generalization too seriously. XML happens to
be a common serialization format for a lot of content, and I've
commonly seen 25-100MB XML documents which I'd hardly characterize
as "design problems". E.g., the ITIS zoological database has chunks
already broken up from the bigger database, each as big as
25MB. I assume the XML serializations of the Cyc ontology will be
very large, like 100MB or bigger. It really depends on the demands
of any specific application. It might be quite inappropriate to
break up certain documents into smaller pieces, especially if one
wants to conserve the ID namespace, etc.
Murray
......................................................................
Murray Altheim http://kmi.open.ac.uk/people/murray/
Knowledge Media Institute
The Open University, Milton Keynes, Bucks, MK7 6AA, UK .
The New Zealand Herald : Latest World News
Kitten survives street sweeper
http://www.nzherald.co.nz/latestnewsstory.cfm?storyID=3539584
[must be an important kitten]