StackOverflow2013


plurals

PO
primarykey
Id
526300
data
AcceptedAnswerId
0
AnswerCount
0
ClosedDate
CommentCount
2
CommunityOwnedDate
CreationDate
2009-02-08T20:26:28.310
FavoriteCount
0
LastActivityDate
2009-02-09T04:48:48.227
LastEditDate
2009-02-09T04:48:48.227
LastEditorUserId
53192
OwnerUserId
53192
ParentId
526255
PostTypeId
2
Score
26
ViewCount
0
LastEditorDisplayName
David
text
Body
<a href="http://code.activestate.com/recipes/117241/" rel="noreferrer">This ActiveState recipe</a> gives an easy-to-follow approach, specifically the version in the comments that doesn't require you to pre-normalize your weights:
<pre><code>import random

def weighted_choice(items):
    """items is a list of tuples in the form (item, weight)"""
    weight_total = sum((item[1] for item in items))
    n = random.uniform(0, weight_total)
    for item, weight in items:
        if n < weight:
            return item
        n = n - weight
    return item
</code></pre>
This will be slow if you have a large list of items, because every call walks the list linearly. A binary search would probably be better in that case... but it would also be more complicated to write, for little gain if you have a small sample size. <a href="http://code.activestate.com/recipes/498229/" rel="noreferrer">Here's an example of the binary search approach in Python</a> if you want to follow that route.
(I'd recommend doing some quick performance testing of both methods on your dataset. The performance of different approaches to this sort of algorithm is often a bit unintuitive.)
<hr>
Edit: I took my own advice, since I was curious, and did a few tests. I compared four approaches:
<ol>
<li>The weighted_choice function above.</li>
<li>A binary-search choice function like so:
<pre><code>import bisect

def weighted_choice_bisect(items):
    added_weights = []
    last_sum = 0
    for item, weight in items:
        last_sum += weight
        added_weights.append(last_sum)
    return items[bisect.bisect(added_weights, random.random() * last_sum)][0]
</code></pre></li>
<li>A compiling version of 1:
<pre><code>def weighted_choice_compile(items):
    """returns a function that fetches a random item from items

    items is a list of tuples in the form (item, weight)"""
    weight_total = sum((item[1] for item in items))
    def choice(uniform = random.uniform):
        n = uniform(0, weight_total)
        for item, weight in items:
            if n < weight:
                return item
            n = n - weight
        return item
    return choice
</code></pre></li>
<li>A compiling version of 2:
<pre><code>def weighted_choice_bisect_compile(items):
    """Returns a function that makes a weighted random choice from items."""
    added_weights = []
    last_sum = 0
    for item, weight in items:
        last_sum += weight
        added_weights.append(last_sum)
    def choice(rnd=random.random, bis=bisect.bisect):
        return items[bis(added_weights, rnd() * last_sum)][0]
    return choice
</code></pre></li>
</ol>
I then built a big list of choices like so:
<pre><code># xrange is Python 2; use range on Python 3
choices = [(random.choice("abcdefg"), random.uniform(0,50)) for i in xrange(2500)]
</code></pre>
And an excessively simple profiling function:
<pre><code>import time

def profiler(f, n, *args, **kwargs):
    start = time.time()
    for i in xrange(n):
        f(*args, **kwargs)
    return time.time() - start
</code></pre>
The results: (Seconds taken for 1,000 calls to the function.)
<ul>
<li>Simple uncompiled: 0.918624162674</li>
<li>Binary uncompiled: 1.01497793198</li>
<li>Simple compiled: 0.287325024605</li>
<li>Binary compiled: 0.00327413797379</li>
</ul>
The "compiled" results include the average time taken to compile the choice function once. (I timed 1,000 compiles, then divided that time by 1,000, and added the result to the choice function time.)
So: if you have a list of items and weights that changes very rarely, the binary compiled method is by far the fastest.
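For readers on Python 3, the "precompute the running totals once, then binary-search per call" idea from the answer can be sketched as follows. This is a minimal sketch, not the answer's original code: the name `make_weighted_chooser` and the `rng` parameter are hypothetical, introduced here only so the random source can be swapped out, and the weights are assumed to be positive.

```python
import bisect
import itertools
import random


def make_weighted_chooser(items, rng=random):
    """Return a zero-argument function that picks an item by weight.

    items is a list of (item, weight) tuples; weights are assumed positive.
    make_weighted_chooser and rng are illustrative names, not from the answer.
    """
    values = [item for item, _ in items]
    # Running totals, e.g. weights [1, 3, 6] become [1, 4, 10].
    cumulative = list(itertools.accumulate(weight for _, weight in items))
    total = cumulative[-1]

    def choose():
        # bisect_right returns the index of the first running total
        # that is strictly greater than the draw.
        index = bisect.bisect_right(cumulative, rng.random() * total)
        # Guard against a floating-point draw that rounds up to `total`.
        return values[min(index, len(values) - 1)]

    return choose
```

The setup cost is paid once in `make_weighted_chooser`; each later call to the returned function does only a binary search. If a dependency-free one-liner is enough, the standard library's `random.choices(population, weights=..., k=...)` (Python 3.6 and later) covers the same use case.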
Tags
Title
singulars
PostAcceptedAnswerId
1. This table or related slice is empty.
PostParentId
1. POProbability distribution in Python
 singulars
 PostTypePostTypeId
 PTQuestion
PostTypePostTypeId
1. PTAnswer
UserLastEditorUserId
1. USDavid
UserOwnerUserId
1. USDavid
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. POProbability distribution in Python
 singulars
 PostTypePostTypeId
 PTQuestion
PostsParentIdCreationDate
1. This table or related slice is empty.
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
2. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
3. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
CommentsPostId
1. CO I do not understand why the latter functions are "compiled" while the first ones are not ("uncompiled"). Could you explain or direct me to some info on this? Thanks a lot!
 singulars
 PostPostId
 PO
 UserUserId
 USNicholas Leonard
2. CO "Compile" isn't exactly the correct word, honestly -- it's the "Factory Pattern". Those functions pre-calculate as much work as possible, then return a new function (a "closure") that does just the choosing part.
 singulars
 PostPostId
 PO
 UserUserId
 USDavid
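The closure idea described in the comment above can be illustrated with a much smaller example. This is only a sketch of the general pattern; `make_normalizer` is a hypothetical name that does not appear in the answer.

```python
def make_normalizer(weights):
    """Factory: do the expensive setup once, return a cheap closure."""
    total = sum(weights)  # computed once, when the factory is called

    def normalize(weight):
        # `total` is remembered from the enclosing call; it is not recomputed here.
        return weight / total

    return normalize


normalize = make_normalizer([1, 1, 2])
share = normalize(2)  # reuses the stored total of 4
```

The "compiled" functions in the answer follow the same shape: the outer function sums or accumulates the weights, and the inner function it returns only draws a random number and looks up the result.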
