osdir.com



random.sample with large weighted sample-sets?


How efficient does this thing need to be?

You can always just turn it into a two-dimensional sampling problem by
thinking of the data as a function f(x), where x is an item and f(x)
is its weight: generate a random x=xr uniformly over the items, then
generate a random y in [0, max(f)], where max(f) is the largest weight.
The xr is accepted if y <= f(xr), or rejected (and another attempt
made) if y > f(xr).
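
A minimal sketch of that accept/reject loop, assuming the weights are
non-negative numbers held in a list and that one item is drawn per call
(so repeated calls sample with replacement). The function name
weighted_choice_rejection and the rng parameter are made up for
illustration; they are not part of the random module.

```python
import random

def weighted_choice_rejection(weights, rng=random):
    """Return an index i with probability proportional to weights[i].

    Hypothetical helper: rejection sampling over (index, height) pairs.
    """
    if not weights or any(w < 0 for w in weights):
        raise ValueError("weights must be non-empty and non-negative")
    w_max = max(weights)              # this is max(f)
    if w_max <= 0:
        raise ValueError("at least one weight must be positive")
    n = len(weights)
    while True:
        xr = rng.randrange(n)         # random x, uniform over the items
        y = rng.random() * w_max      # random y in [0, max(f))
        # Strict "<" so an item with weight 0 can never be accepted.
        if y < weights[xr]:
            return xr                 # accepted
        # otherwise rejected: loop and make another attempt

if __name__ == "__main__":
    weights = [1, 0, 3]
    picks = [weighted_choice_rejection(weights) for _ in range(10)]
    print(picks)
```

Each attempt is accepted with probability mean(f)/max(f), so this is
cheap when the weights are fairly even and slow when a few items carry
almost all of the weight.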