random.sample with large weighted sample-sets?
I'm not coming up with the right keywords to find what I'm hunting.
I'd like to randomly sample a modestly compact list with weighted
distributions, so I might have
data = (
and I'd like to random.sample() it as if it were a 100-element list.
However, ideally, this could be done in O(size-of-data) storage
rather than requiring the build-out of the entire set just for
sampling purposes, as the actual data can get a bit large. For this
small toy data-set, I can use
sample_me = sum(([s]*n for s,n in data), [])
but for large counts, the list returned from sum() grinds my system
because I start swapping. What am I missing? (links to relevant
keywords/searches/algorithms welcome in lieu of actually answering
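
For what it's worth, one possible approach is to sample positions in a "virtual" expanded list and map each position back to its item through running totals of the counts. The sketch below assumes Python 3 (where range() is lazy) and assumes data is a sequence of (item, count) pairs; the name weighted_sample and the example values are made up for illustration and are not from the original data. Storage is roughly O(len(data) + k) rather than O(sum of counts).

```python
import bisect
import itertools
import random


def weighted_sample(data, k, rng=random):
    """Sample k items without replacement from (item, count) pairs,
    as if each item appeared `count` times, without expanding the list.

    `weighted_sample` is a hypothetical helper name, not a stdlib function.
    """
    items = [item for item, _ in data]
    # cumulative[i] = number of virtual slots covered by items[0..i]
    cumulative = list(itertools.accumulate(count for _, count in data))
    total = cumulative[-1] if cumulative else 0
    # range() is lazy, so drawing k distinct positions never builds
    # a list of length `total`.
    positions = rng.sample(range(total), k)
    # bisect_right finds the first running total strictly greater than
    # the position, which is the item owning that virtual slot.
    return [items[bisect.bisect_right(cumulative, p)] for p in positions]


if __name__ == "__main__":
    # Hypothetical toy data: counts add up to 100.
    example = (("apple", 20), ("orange", 50), ("grape", 30))
    print(weighted_sample(example, 5))
```

Because the positions are distinct, an item can never be drawn more often than its count, which matches what random.sample() would do on the fully built-out list. On Python 3.9 and later, random.sample() also accepts a counts= keyword argument that covers the same use case directly.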