
Re: Java OpenGroupware API - annotations: msg#00018

cms.opengroupware.xmlrpc.devel

Subject: Re: Java OpenGroupware API - annotations

On Wednesday, Sep 3, 2003, at 15:38 Europe/Berlin, Werner Schuster wrote:
Hmm... one thing I noted in this conversation,
is the different conceptions about how high-level Enumerators (Iterators)
are;

Yes. Enumerators and Arrays are just concepts living at the same level of "abstractness" (Stack would be another one).
Yet in the case at hand (records fetched using XML-RPC over slow, high-latency links), enumerators match the actual data flow while arrays do not. So arrays should be built upon a stream (stream = enumerator).

I personally think of Enumerators as more high-level than Lists; the idea
is that with an Enumerator I don't have to think about what type of Container
I am accessing; I simply call "next()" until nothing is left;

If you work on streamed data, the list needs to be built upon an enumerator. If you have indexed data, the enumerator needs to be built upon a list. ;-)
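
The two directions can be sketched in Java roughly as follows. This is only an illustration: the class `Adapters` and its method names are made up for this example and are not part of JOGI.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical helper class, for illustration only.
final class Adapters {
    // Streamed source: the list is built by draining the enumerator.
    // Memory use grows with the number of elements received.
    static <T> List<T> listFrom(Iterator<T> stream) {
        List<T> out = new ArrayList<>();
        while (stream.hasNext()) {
            out.add(stream.next());
        }
        return out;
    }

    // Indexed source: the enumerator is just a cursor over the list.
    static <T> Iterator<T> iteratorFrom(final List<T> indexed) {
        return new Iterator<T>() {
            private int pos = 0;

            @Override
            public boolean hasNext() {
                return pos < indexed.size();
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return indexed.get(pos++);
            }
        };
    }
}
```

Which of the two is the "natural" primitive depends on where the data comes from; the other one is then a thin wrapper.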

Of course it depends on the underlying data structure which is a stream in case of HTTP. It may certainly be less obvious that XML-RPC results should also be treated as a stream - but for practical purposes it should be done that way.

(well, one could then claim that HTTP/XML-RPC then is the wrong base technology ;-) - this may even be true, yet SQL databases also work in a sequential way, so this matches pretty well)

A List, for me, is lower level, as it is more similar to the actual
memory layout (one element after the other) and I have to use a counter
var to walk the list.

Of course. If you have a linked list, the enumerator will be more like the data structure and indexed access will be unnatural ;-) Now don't tell me that you never need linked lists ;-)

So, in my eyes, the Enumerator (Iterator) is more of a convenience
thing that most people will use;

No. It's just a different approach to accessing data.

I have a problem with only offering an Enumerator and basing all
accesses on it (eg. some convenience List class, that uses it);

You decide, I only want to bring up issues you'll run into ;-)

...image sample...
But it's a real pain in the neck, if you want to access a specific pixel;

Yes, enumerators are usually more difficult to use. But scalable APIs *are* difficult to use ;-)
If network IO is as inexpensive as direct memory access and if we have 100GB RAM machines, enumerators may not be worth the effort. But we have practical constraints.

This is getting theoretical, but interesting nevertheless ;-):

I suppose, the programmers of this lib realized this and added some
methods for indexed access (ie. something like getAt(x,y) );
The problem was, they didn't access the data directly, but by way of
the iterators... If you're not groaning with disbelief yet, think
about this: if you want to access the pixel at pos. (100,100), you have to
have 100*100 iterations with the iterator...; Now imagine the performance
of a program, that has to walk the pixel data from the bottom-right to
the top-left pixel...

Whether or not this makes sense depends heavily on the application *and* on the data format in memory.
In case the whole image (eg consider a high-resolution digital camera image, which easily takes hundreds of megabytes) is really in memory as a 1:1 matrix, direct access of course makes more sense. But if it is kept eg in RLE encoding to reduce the memory requirement (in RAM!) by a factor of 50 - enumerators actually *do* make a lot of sense.
Scanning the data with a GHz CPU will be way faster than consuming 50 times more memory.
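
To make the RLE case concrete, here is a minimal Java sketch. `RleImage` is a made-up class for this example (not taken from any real imaging library), and it assumes a row-major image stored as pairs of run length and pixel value.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical run-length-encoded image: runs[i] consecutive pixels
// (row-major order) all have the value values[i].
final class RleImage implements Iterable<Integer> {
    private final int width;
    private final int[] runs;
    private final int[] values;

    RleImage(int width, int[] runs, int[] values) {
        if (runs.length != values.length) {
            throw new IllegalArgumentException("runs and values differ in length");
        }
        this.width = width;
        this.runs = runs.clone();
        this.values = values.clone();
    }

    // Sequential scan: follows the encoding directly and needs
    // no decoded copy of the image.
    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int run = 0;
            private int left = runs.length > 0 ? runs[0] : 0;

            @Override
            public boolean hasNext() {
                // Skip exhausted (or empty) runs.
                while (run < runs.length && left == 0) {
                    run++;
                    if (run < runs.length) {
                        left = runs[run];
                    }
                }
                return run < runs.length;
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                left--;
                return values[run];
            }
        };
    }

    // Indexed access: has to walk the runs from the start on every call,
    // so the cost grows with the number of runs before the target pixel.
    int getAt(int x, int y) {
        long target = (long) y * width + x;
        long seen = 0;
        for (int i = 0; i < runs.length; i++) {
            seen += runs[i];
            if (target < seen) {
                return values[i];
            }
        }
        throw new IndexOutOfBoundsException("(" + x + "," + y + ")");
    }
}
```

With this layout a full scan touches each run once, while repeated `getAt` calls rescan the run table each time - the reverse of the trade-off for a plain 1:1 matrix.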

I know, that DB and Image Manipulations are different domains and what's
bad for one need not be bad for the other, but that's one reason I am
a bit sceptical about using only Enumerators.

No, it pretty much shows the same issues. You are assuming that the image is kept completely in memory as a 1:1 matrix - but for a lot of digital media or print applications it often is not. And while providing indexed access may be more convenient, it is very likely to be much slower - often with results so slow that it can't be used in practice.

It's just about choosing the right algorithm/abstraction for the task at hand. If you only want to deal with 100 contact records in JOGI - indexed access is just fine and way easier (and probably faster as well). If you want to deal with larger databases, it still will be easier, but load times and memory requirements will be abysmal.

If you strictly focus on an OGo specific GUI client, this may be true.
In any other case it is not.
Mostly you will *not* keep them in memory - because you can't!

Hmm... but thinking about OG.o ... would there really be such
huge amounts of data? I mean, how many Person, Enterprise, ... objects
are we talking about here?

I guess at least something like 100,000 to 1,000,000 objects. This obviously depends on the installation.

Note: the issue is not only about RAM (though that's probably the bigger problem). It is also about retrieval speed/API-latency over HTTP/XML.

JSP pages
- render object, forget object, render object, forget object
Export tools
- export CSV line, forget object, export CSV line, forget object
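
A rough Java sketch of the "export CSV line, forget object" pattern. The `Person` type here is a stand-in invented for the example, not the real JOGI class, and the two exported fields are arbitrary.

```java
import java.io.PrintWriter;
import java.util.Iterator;

// Hypothetical export tool, for illustration only.
final class CsvExport {
    // Stand-in record type; the real JOGI Person class may look different.
    static final class Person {
        final String name;
        final String email;

        Person(String name, String email) {
            this.name = name;
            this.email = email;
        }
    }

    // Writes one CSV line per record and keeps no reference to the record
    // afterwards, so memory use stays constant however many records arrive.
    static int export(Iterator<Person> people, PrintWriter out) {
        int count = 0;
        while (people.hasNext()) {
            Person p = people.next();
            out.print(quote(p.name) + "," + quote(p.email) + "\n");
            count++;
        }
        out.flush();
        return count;
    }

    // Minimal CSV quoting: wrap in double quotes, double any embedded quote.
    static String quote(String field) {
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }
}
```

The loop never needs to know how many records there are in total, which is the point of driving it from an enumerator.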

True, it would be nice to work with a constant amount of data
with one of these (basically just keeping the currently used
elements in memory);

Of course.

Glow GUI Client
- map JOGI Person to SDBC Contact, forget object, etc
OGo GUI Client
- extract attributes you need (name,street,email), forget object

Hmmm... I don't think so; for GUI applications, the MVC approach would
be used, so the Objects returned by JOGI would be used as models that
store the data, and the GUI uses this data to display it; so, if you
want to keep displaying the data, you need the stuff in memory;

That would be highly inefficient for larger datasets.

Hmm... it seems, like there might actually be some need for ... hm...
two APIs in JOGI; they would be like DOM and SAX for XML;
- the current JOGI implementation would be like DOM, which is easy to
use, but memory intensive and unusable for big amounts of data;
- the other would only allow data to be processed serially as it comes in,
but would use only constant memory; but it would still offer high-level
objects (Account, Person,...) to easily access the data (so you don't
have to fiddle with XmlRpc results);

Again, the comparison of SAX and DOM is excellent :-) DOM is just fine for small documents, but for big ones it doesn't scale.

But I do not agree that we need two separate APIs. The fetch API using the list can be easily layered upon the Enumerator one since this is natural (results coming in a streamed way, the list is being built).

Although... offering a streaming API that really uses the advantages of
streaming (ie. allowing access to the first elements while the rest
is still on the road) has its own set of problems;
Well... actually only one problem: we would have to write our own XmlRpc
implementation, because currently available ones don't do that (at least
as far as I know); if you call a function that returns a list of things,
the method call returns when the *whole* list has been retrieved and made
available as a Java List;

Well, why break the JOGI API just because the initial backend implementation is broken ;-)
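
As a sketch of what a single API could look like in Java (8 or later): the enumerator is the primitive, the list fetch is a convenience on top, and a backend that only gets complete lists from its XML-RPC library can still sit behind the same interface. `RecordSource` and `EagerBackend` are hypothetical names made up for this example; nothing here is the actual JOGI API.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical fetch interface: one API, with the list layered on the enumerator.
interface RecordSource<T> {
    // Primitive operation: hand out records one at a time.
    Iterator<T> enumerate();

    // Convenience operation built on the primitive: collect everything.
    default List<T> fetchAll() {
        List<T> all = new ArrayList<>();
        for (Iterator<T> it = enumerate(); it.hasNext(); ) {
            all.add(it.next());
        }
        return all;
    }
}

// Stand-in for an initial backend whose XML-RPC library only returns the
// complete result: it enumerates an already loaded list. A truly streaming
// backend could replace it later without changing any caller.
final class EagerBackend<T> implements RecordSource<T> {
    private final List<T> alreadyLoaded;

    EagerBackend(List<T> alreadyLoaded) {
        this.alreadyLoaded = alreadyLoaded;
    }

    @Override
    public Iterator<T> enumerate() {
        return alreadyLoaded.iterator();
    }
}
```

Callers written against `enumerate()` get no streaming benefit from `EagerBackend`, but they also do not have to change once a streaming backend exists.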


Anyway, I have made all my points. Decide now but don't complain later - after all it's established practice that Java stuff is rewritten from scratch every six months ;-)

regards,
Helge
--
OpenGroupware.org - http://www.opengroupware.org

--
OpenGroupware.org XML-RPC
xmlrpc@xxxxxxxxxxxxxxxxx
http://mail.opengroupware.org/mailman/listinfo/xmlrpc


