osdir.com
mailing list archive
Mozy Online Backup: 2GB Free. Automatic. Secure.

Subject: Re: IIBTree.multiunion and list comprehensions - msg#00025

List: web.zope.zodb

Date: Prev Next Index Thread: Prev Next Index
On Wed, 10 Dec 2003 09:46:04 -0200
Christian Robottom Reis <kiko@xxxxxxxxxxxx> wrote:

> On Tue, Dec 09, 2003 at 11:37:03PM -0500, Casey Duncan wrote:
> > > multiunion(tree.keys())
> >
> > The *keys* of the tree are sets? Huh?
>
> You *know* I meant tree.values() <wink>

8^)

> > > takes about the same time to run as
> > >
> > > [value for set in tree.keys() for value in set] (*)
> >
> > So you create a set out of this get the unique members?
>
> Actually, all I want is a sequence of values -- it could be a set or a
> list.
>
> What I'm doing is collating all the values in all the sets in my BTree.
> This BTree holds as values IITreeSets containing a number of integer
> OIDs. When I want to pick up *all* the values in the BTree, I need to
> join all these IITreeSets together and produce one big set. The fact
> that it's unique doesn't really matter in my case given that any given
> OID appears only once [as a value].

In that case it may be cheapest for you not to do the union up front at all.
How about creating a lazy iterator that just walks down every member of every
set in the tree? The big downside of that is that you can't really do anything
with the result set besides iterating it, such as sorting or intersecting it.

> > > Note that multiunion returns an IISet(), while the latter returns a
> > > list; converting between is pretty cheap given the context.
> >
> > Note that creating IITreeSets is often fastest because it allocates
> > new buckets as it grows rather than resizing a single memory block. An
> > IISet is essentially just a sorted C array of unique ints.
>
> IIRC, IITreeSets also have the advantage of "unghosting as you go",
> whereas IISets are unpersisted completely when accessing one of its
> items. I'm wondering why multiunion creates IISets instead of
> IITreeSets, though.

Probably because they are a lot simpler, and they are intended for in memory
use. Its also possible (although I don't know if its implemented this way) for
multiunion to know the maximum possible size up front. In that case the memory
for the set's array could all be allocated once. In fact, the C code could just
work with an int array and then make it into an IISet at end. Plus searches and
insertions are simpler operations on an IISet.

-Casey

_______________________________________________
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list - ZODB-Dev@xxxxxxxx
http://mail.zope.org/mailman/listinfo/zodb-dev



Was this page helpful?
Yes No
Thread at a glance:

Previous Message by Date: click to view message preview

Re: [Problem] "_v_" variables too volatile

Unless I do not here strong objections, I will extend the ZODB such that: +1 +1 from me too. * setting a "_v_" attribute of a "Persistent" object sets a "_p_volatile" attribute. Im not keen on the name, because objects in this state are _less_ volatile than others. I think the name should say "lives until the end of a full transaction", but I dont have a good suggestion. _p_vital _p_animate _p_living _p_operative _p_zoetic * "incrGC" gets an argument "flushVolatile", defaulting to "1". Can we use True and False for boolean values in a Python API? -- Steve Alexander _______________________________________________ For more information about ZODB, see the ZODB Wiki: http://www.zope.org/Wikis/ZODB/ ZODB-Dev mailing list - ZODB-Dev@xxxxxxxx http://mail.zope.org/mailman/listinfo/zodb-dev

Next Message by Date: click to view message preview

Re: [Problem] "_v_" variables too volatile

On Wed, 10 Dec 2003 12:32:19 +0100 Dieter Maurer <dieter@xxxxxxxxxxxx> wrote: [snip] > Unless I do not here strong objections, I will extend > the ZODB such that: > > * setting a "_v_" attribute of a "Persistent" object > sets a "_p_volatile" attribute. > > * "incrGC" gets an argument "flushVolatile", defaulting to "1". > If not set, objects with "_p_volatile" are not flushed > > * in "subtransaction commit/abort" "incrGC" is called > with "flushVolatile=0". An application may also wish to set this flag itself if it knows that it will need this object again in subsequent transactions. The GC could defer purging objects with this flag set until after objects without it set are purged first. As for the name, I agree that _p_volatile is not the best. How about: _p_enduring _p_held _p_lingering _p_kept -Casey _______________________________________________ For more information about ZODB, see the ZODB Wiki: http://www.zope.org/Wikis/ZODB/ ZODB-Dev mailing list - ZODB-Dev@xxxxxxxx http://mail.zope.org/mailman/listinfo/zodb-dev

Previous Message by Thread: click to view message preview

Re: IIBTree.multiunion and list comprehensions

On Tue, Dec 09, 2003 at 11:37:03PM -0500, Casey Duncan wrote: > > multiunion(tree.keys()) > > The *keys* of the tree are sets? Huh? You *know* I meant tree.values() <wink> > > takes about the same time to run as > > > > [value for set in tree.keys() for value in set] (*) > > So you create a set out of this get the unique members? Actually, all I want is a sequence of values -- it could be a set or a list. What I'm doing is collating all the values in all the sets in my BTree. This BTree holds as values IITreeSets containing a number of integer OIDs. When I want to pick up *all* the values in the BTree, I need to join all these IITreeSets together and produce one big set. The fact that it's unique doesn't really matter in my case given that any given OID appears only once [as a value]. > > Note that multiunion returns an IISet(), while the latter returns a > > list; converting between is pretty cheap given the context. > > Note that creating IITreeSets is often fastest because it allocates > new buckets as it grows rather than resizing a single memory block. An > IISet is essentially just a sorted C array of unique ints. IIRC, IITreeSets also have the advantage of "unghosting as you go", whereas IISets are unpersisted completely when accessing one of its items. I'm wondering why multiunion creates IISets instead of IITreeSets, though. Take care, -- Christian Robottom Reis | http://async.com.br/~kiko/ | [+55 16] 261 2331 _______________________________________________ For more information about ZODB, see the ZODB Wiki: http://www.zope.org/Wikis/ZODB/ ZODB-Dev mailing list - ZODB-Dev@xxxxxxxx http://mail.zope.org/mailman/listinfo/zodb-dev

Next Message by Thread: click to view message preview

Re: IIBTree.multiunion and list comprehensions

On Wed, Dec 10, 2003 at 09:27:51AM -0500, Casey Duncan wrote: > > What I'm doing is collating all the values in all the sets in my BTree. > > This BTree holds as values IITreeSets containing a number of integer > > OIDs. When I want to pick up *all* the values in the BTree, I need to > > join all these IITreeSets together and produce one big set. The fact > > that it's unique doesn't really matter in my case given that any given > > OID appears only once [as a value]. > > In that case it may be cheapest for you not to do the union up front > at all. How about creating a lazy iterator that just walks down every > member of every set in the tree? The big downside of that is that you > can't really do anything with the result set besides iterating it, > such as sorting or intersecting it. This is probably a good idea (though it requires Python 2.2, which we haven't yet in IndexedCatalog). You last phrase confused me, however: do you mean I *can* sort and intersect the result set? If so, it would be neat, since what I want to do in many cases is intersect this with another set (resulting from a boolean query). My main worry with *that*, however, is that intersecting will end up having a similar cost to doing multiunion, since the cost I'm concerned about is the first-run cost (IOW, the cost of unpersisting the actual sets) and AFAIHS multiunion is too fast to notice once the sets are in-memory. I realize that first-time costs aren't really important for long-running web applications; however, we're running a desktop system, and having the initial operation execute slowly doesn't give a nice `first impression', if you believe in such things. Take care, -- Christian Robottom Reis | http://async.com.br/~kiko/ | [+55 16] 261 2331 _______________________________________________ For more information about ZODB, see the ZODB Wiki: http://www.zope.org/Wikis/ZODB/ ZODB-Dev mailing list - ZODB-Dev@xxxxxxxx http://mail.zope.org/mailman/listinfo/zodb-dev
Sign up for updates to this mailing list. email:
Loading Comments...
Home | News | Patents | Sitemap | FAQ | advertise

Advertising by