Instead of summing the dot product of each pair, which requires `n * m` operations, you can sum all of the vectors in each list first and do a single final dot product, which only takes `n + m` operations.

Before:

```
import itertools
import numpy as np

def calc_slow(L1, L2):
    result = 0
    for n, m in itertools.product(L1, L2):
        result += np.dot(n, m)
    return result
```

After:

```
def calc_fast(L1, L2):
    # Accumulate the element-wise sum of the vectors in each list.
    L1_sums = np.zeros(len(L1[0]))
    L2_sums = np.zeros(len(L2[0]))
    for vec in L1:
        L1_sums += vec
    for vec in L2:
        L2_sums += vec
    # A single dot product of the two summed vectors.
    return np.dot(L1_sums, L2_sums)
```

EDIT: An even faster version, taking advantage of numpy:

```
def calc_superfast(L1, L2):
    return np.dot(np.array(L1).sum(0), np.array(L2).sum(0))
```

The output is the same:

```
print X[0], Y[0], calc_slow(X[0], Y[0])
print X[0], Y[0], calc_fast(X[0], Y[0])
```

prints:

```
[[1, 2, 3], [2, 4, 5], [2, 3, 4]] [[12, 14, 15], [12, 13, 14]] 711
[[1, 2, 3], [2, 4, 5], [2, 3, 4]] [[12, 14, 15], [12, 13, 14]] 711.0
```

Timing it shows a significant improvement. Timing code:

```
import random
import time

def rand_vector(size=3):
    return [random.randint(1, 100) for _ in xrange(size)]

def rand_list(length=200):
    return [rand_vector() for _ in xrange(length)]

print "Generating lists..."
L1 = rand_list(200)
L2 = rand_list(200)

print "Running slow..."
s = time.time()
print calc_slow(L1, L2)
print "Slow for (%d, %d) took %.2fs" % (len(L1), len(L2), time.time() - s)

print "Running fast..."
s = time.time()
print calc_fast(L1, L2)
print "Fast for (%d, %d) took %.2fs" % (len(L1), len(L2), time.time() - s)
```

Sample outputs:

```
Generating lists...
Running slow...
75715569
Slow for (100, 100) took 1.48s
Running fast...
75715569.0
Fast for (100, 100) took 0.03s

Generating lists...
Running slow...
309169971
Slow for (200, 200) took 5.29s
Running fast...
309169971.0
Fast for (200, 200) took 0.04s

Running fast...
3.05185703539e+12
Fast for (20000, 20000) took 1.94s
```

The operation for two lists of 20000 elements each took only ~2 seconds, whereas I predict it would take several hours with the old method.

---

The reason this works is that you can group the operations together. Assuming you have the following lists:

```
L1 = [a, b, c], [d, e, f], [g, h, i]
L2 = [u, v, w], [x, y, z]
```

Then summing the dot product of each pair would look like this:

```
a*u + b*v + c*w + a*x + b*y + c*z +
d*u + e*v + f*w + d*x + e*y + f*z +
g*u + h*v + i*w + g*x + h*y + i*z
```

You can factor out the `u`, `v`, `w`, `x`, `y`, and `z` elements:

```
u*(a + d + g) + v*(b + e + h) + w*(c + f + i) +
x*(a + d + g) + y*(b + e + h) + z*(c + f + i)
```

Then you can further factor out those sums:

```
(u + x)*(a + d + g) + (v + y)*(b + e + h) + (w + z)*(c + f + i)
```

which is just the dot product of the summed vectors from each initial list.
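To see the identity concretely, here is a quick sanity check (a minimal sketch, using the example data from the output above that produces 711), showing that the pairwise sum and the factored form agree:

```
import itertools
import numpy as np

L1 = [[1, 2, 3], [2, 4, 5], [2, 3, 4]]
L2 = [[12, 14, 15], [12, 13, 14]]

# Slow form: sum the dot product of every pair (n * m dot products).
pairwise = sum(np.dot(a, b) for a, b in itertools.product(L1, L2))

# Factored form: a single dot product of the per-list vector sums.
factored = np.dot(np.sum(L1, axis=0), np.sum(L2, axis=0))

assert pairwise == factored == 711
```

This works because the dot product is linear in each argument, so the whole double sum distributes onto the two vector sums.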