Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    primarykey
    data
    text
    <p><a href="http://code.activestate.com/recipes/117241/" rel="noreferrer">This activestate recipe</a> gives an easy-to-follow approach, specifically the version in the comments that doesn't require you to pre-normalize your weights:</p> <pre><code>import random def weighted_choice(items): """items is a list of tuples in the form (item, weight)""" weight_total = sum((item[1] for item in items)) n = random.uniform(0, weight_total) for item, weight in items: if n &lt; weight: return item n = n - weight return item </code></pre> <p>This will be slow if you have a large list of items. A binary search would probably be better in that case... but would also be more complicated to write, for little gain if you have a small sample size. <a href="http://code.activestate.com/recipes/498229/" rel="noreferrer">Here's an example of the binary search approach in python</a> if you want to follow that route.</p> <p>(I'd recommend doing some quick performance testing of both methods on your dataset. The performance of different approaches to this sort of algorithm is often a bit unintuitive.)</p> <hr> <p><strong>Edit:</strong> I took my own advice, since I was curious, and did a few tests.</p> <p>I compared four approaches:</p> <p><em>The weighted_choice function above.</em></p> <p><em>A binary-search choice function like so:</em></p> <pre><code>def weighted_choice_bisect(items): added_weights = [] last_sum = 0 for item, weight in items: last_sum += weight added_weights.append(last_sum) return items[bisect.bisect(added_weights, random.random() * last_sum)][0] </code></pre> <p><em>A compiling version of 1:</em></p> <pre><code>def weighted_choice_compile(items): """returns a function that fetches a random item from items items is a list of tuples in the form (item, weight)""" weight_total = sum((item[1] for item in items)) def choice(uniform = random.uniform): n = uniform(0, weight_total) for item, weight in items: if n &lt; weight: return item n = n - weight return item return choice </code></pre> <p><em>A compiling version of 2:</em></p> <pre><code>def weighted_choice_bisect_compile(items): """Returns a function that makes a weighted random choice from items.""" added_weights = [] last_sum = 0 for item, weight in items: last_sum += weight added_weights.append(last_sum) def choice(rnd=random.random, bis=bisect.bisect): return items[bis(added_weights, rnd() * last_sum)][0] return choice </code></pre> <p>I then built a big list of choices like so:</p> <pre><code>choices = [(random.choice("abcdefg"), random.uniform(0,50)) for i in xrange(2500)] </code></pre> <p>And an excessively simple profiling function:</p> <pre><code>def profiler(f, n, *args, **kwargs): start = time.time() for i in xrange(n): f(*args, **kwargs) return time.time() - start </code></pre> <p><strong>The results:</strong></p> <p>(Seconds taken for 1,000 calls to the function.)</p> <ul> <li>Simple uncompiled: 0.918624162674</li> <li>Binary uncompiled: 1.01497793198</li> <li>Simple compiled: 0.287325024605</li> <li>Binary compiled: 0.00327413797379</li> </ul> <p>The "compiled" results include the average time taken to compile the choice function once. (I timed 1,000 compiles, then divided that time by 1,000, and added the result to the choice function time.)</p> <p>So: if you have a list of items+weights which change very rarely, the binary compiled method is <em>by far</em> the fastest.</p>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. VO
      singulars
      1. This table or related slice is empty.
    2. VO
      singulars
      1. This table or related slice is empty.
    3. VO
      singulars
      1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload