On 2018-06-27 11:11:37 +1200, Gregory Ewing wrote:
> Bart wrote:
> > x = set(range(10_000_000))
> > This used up 460MB of RAM (the original 100M I tried exhausted the memory).
> > The advantage of Pascal-style sets is that that same set will occupy
> > only 1.25MB, as it is a bit-map.
> That's true, but they're also extremely limited compared to
> the things you can do with Python sets. They're really two
> quite different things, designed for different use cases.
> Compact sets of integers are possible in Python, but not
> as a built-in type. This suggests that they're not needed
Also, when such sets are needed, a simple array of bits may not be
sufficient. For example, sets may be sparse, or they may have almost all
except a few bits set. In these cases a tree or RLE representation is
much smaller and faster. There are a number of such implementations
(Judy Arrays and Roaring Bitmaps come to mind). Each has its
advantages and limitations, so its probably better to leave the choice
to the author.
 Judy is a C library. Roaring bitmaps are a data structure which has
been implemented in several languages. I haven't checked whether
there are packages on pypi. Shouldn't be too hard to implement if
one needs it.
_ | Peter J. Holzer | we build much bigger, better disasters now
|_|_) | | because we have much more sophisticated
| | | hjp at hjp.at | management tools.
__/ | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 833 bytes
Desc: not available