|
|
Mozy Online Backup: 2GB Free. Automatic. Secure.
Subject: Re: IIBTree.multiunion and list comprehensions - msg#00025
List: web.zope.zodb
On Wed, 10 Dec 2003 09:46:04 -0200
Christian Robottom Reis <kiko@xxxxxxxxxxxx> wrote:
> On Tue, Dec 09, 2003 at 11:37:03PM -0500, Casey Duncan wrote:
> > > multiunion(tree.keys())
> >
> > The *keys* of the tree are sets? Huh?
>
> You *know* I meant tree.values() <wink>
8^)
> > > takes about the same time to run as
> > >
> > > [value for set in tree.keys() for value in set] (*)
> >
> > So you create a set out of this get the unique members?
>
> Actually, all I want is a sequence of values -- it could be a set or a
> list.
>
> What I'm doing is collating all the values in all the sets in my BTree.
> This BTree holds as values IITreeSets containing a number of integer
> OIDs. When I want to pick up *all* the values in the BTree, I need to
> join all these IITreeSets together and produce one big set. The fact
> that it's unique doesn't really matter in my case given that any given
> OID appears only once [as a value].
In that case it may be cheapest for you not to do the union up front at all.
How about creating a lazy iterator that just walks down every member of every
set in the tree? The big downside of that is that you can't really do anything
with the result set besides iterating it, such as sorting or intersecting it.
> > > Note that multiunion returns an IISet(), while the latter returns a
> > > list; converting between is pretty cheap given the context.
> >
> > Note that creating IITreeSets is often fastest because it allocates
> > new buckets as it grows rather than resizing a single memory block. An
> > IISet is essentially just a sorted C array of unique ints.
>
> IIRC, IITreeSets also have the advantage of "unghosting as you go",
> whereas IISets are unpersisted completely when accessing one of its
> items. I'm wondering why multiunion creates IISets instead of
> IITreeSets, though.
Probably because they are a lot simpler, and they are intended for in memory
use. Its also possible (although I don't know if its implemented this way) for
multiunion to know the maximum possible size up front. In that case the memory
for the set's array could all be allocated once. In fact, the C code could just
work with an int array and then make it into an IISet at end. Plus searches and
insertions are simpler operations on an IISet.
-Casey
_______________________________________________
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/
ZODB-Dev mailing list - ZODB-Dev@xxxxxxxx
http://mail.zope.org/mailman/listinfo/zodb-dev
Was this page helpful?
Thread at a glance:
Previous Message by Date:
click to view message preview
Re: [Problem] "_v_" variables too volatile
Unless I do not here strong objections, I will extend
the ZODB such that:
+1
+1 from me too.
* setting a "_v_" attribute of a "Persistent" object
sets a "_p_volatile" attribute.
Im not keen on the name, because objects in this state are _less_ volatile
than others. I think the name should say "lives until the end of a full
transaction", but I dont have a good suggestion.
_p_vital
_p_animate
_p_living
_p_operative
_p_zoetic
* "incrGC" gets an argument "flushVolatile", defaulting to "1".
Can we use True and False for boolean values in a Python API?
--
Steve Alexander
_______________________________________________
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/
ZODB-Dev mailing list - ZODB-Dev@xxxxxxxx
http://mail.zope.org/mailman/listinfo/zodb-dev
Next Message by Date:
click to view message preview
Re: [Problem] "_v_" variables too volatile
On Wed, 10 Dec 2003 12:32:19 +0100
Dieter Maurer <dieter@xxxxxxxxxxxx> wrote:
[snip]
> Unless I do not here strong objections, I will extend
> the ZODB such that:
>
> * setting a "_v_" attribute of a "Persistent" object
> sets a "_p_volatile" attribute.
>
> * "incrGC" gets an argument "flushVolatile", defaulting to "1".
> If not set, objects with "_p_volatile" are not flushed
>
> * in "subtransaction commit/abort" "incrGC" is called
> with "flushVolatile=0".
An application may also wish to set this flag itself if it knows that it will
need this object again in subsequent transactions. The GC could defer purging
objects with this flag set until after objects without it set are purged first.
As for the name, I agree that _p_volatile is not the best. How about:
_p_enduring
_p_held
_p_lingering
_p_kept
-Casey
_______________________________________________
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/
ZODB-Dev mailing list - ZODB-Dev@xxxxxxxx
http://mail.zope.org/mailman/listinfo/zodb-dev
Previous Message by Thread:
click to view message preview
Re: IIBTree.multiunion and list comprehensions
On Tue, Dec 09, 2003 at 11:37:03PM -0500, Casey Duncan wrote:
> > multiunion(tree.keys())
>
> The *keys* of the tree are sets? Huh?
You *know* I meant tree.values() <wink>
> > takes about the same time to run as
> >
> > [value for set in tree.keys() for value in set] (*)
>
> So you create a set out of this get the unique members?
Actually, all I want is a sequence of values -- it could be a set or a
list.
What I'm doing is collating all the values in all the sets in my BTree.
This BTree holds as values IITreeSets containing a number of integer
OIDs. When I want to pick up *all* the values in the BTree, I need to
join all these IITreeSets together and produce one big set. The fact
that it's unique doesn't really matter in my case given that any given
OID appears only once [as a value].
> > Note that multiunion returns an IISet(), while the latter returns a
> > list; converting between is pretty cheap given the context.
>
> Note that creating IITreeSets is often fastest because it allocates
> new buckets as it grows rather than resizing a single memory block. An
> IISet is essentially just a sorted C array of unique ints.
IIRC, IITreeSets also have the advantage of "unghosting as you go",
whereas IISets are unpersisted completely when accessing one of its
items. I'm wondering why multiunion creates IISets instead of
IITreeSets, though.
Take care,
--
Christian Robottom Reis | http://async.com.br/~kiko/ | [+55 16] 261 2331
_______________________________________________
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/
ZODB-Dev mailing list - ZODB-Dev@xxxxxxxx
http://mail.zope.org/mailman/listinfo/zodb-dev
Next Message by Thread:
click to view message preview
Re: IIBTree.multiunion and list comprehensions
On Wed, Dec 10, 2003 at 09:27:51AM -0500, Casey Duncan wrote:
> > What I'm doing is collating all the values in all the sets in my BTree.
> > This BTree holds as values IITreeSets containing a number of integer
> > OIDs. When I want to pick up *all* the values in the BTree, I need to
> > join all these IITreeSets together and produce one big set. The fact
> > that it's unique doesn't really matter in my case given that any given
> > OID appears only once [as a value].
>
> In that case it may be cheapest for you not to do the union up front
> at all. How about creating a lazy iterator that just walks down every
> member of every set in the tree? The big downside of that is that you
> can't really do anything with the result set besides iterating it,
> such as sorting or intersecting it.
This is probably a good idea (though it requires Python 2.2, which we
haven't yet in IndexedCatalog). You last phrase confused me, however: do
you mean I *can* sort and intersect the result set? If so, it would be
neat, since what I want to do in many cases is intersect this with
another set (resulting from a boolean query).
My main worry with *that*, however, is that intersecting will end up
having a similar cost to doing multiunion, since the cost I'm concerned
about is the first-run cost (IOW, the cost of unpersisting the actual
sets) and AFAIHS multiunion is too fast to notice once the sets are
in-memory.
I realize that first-time costs aren't really important for long-running
web applications; however, we're running a desktop system, and having
the initial operation execute slowly doesn't give a nice `first
impression', if you believe in such things.
Take care,
--
Christian Robottom Reis | http://async.com.br/~kiko/ | [+55 16] 261 2331
_______________________________________________
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/
ZODB-Dev mailing list - ZODB-Dev@xxxxxxxx
http://mail.zope.org/mailman/listinfo/zodb-dev
|
|