**How to eliminate the alternation effect:**

The very last line of the Proof of Theorem 2 reads,

> By reversing the roles of H and W, the update rule for W can similarly be shown to be nonincreasing.

Thus we can surmise that updating `H` can be done independently of updating `W`. That means after updating `H`:

```python
H = H * H_coeff
```

we should also update the intermediate value `WH` before updating `W`:

```python
WH = W.dot(H)
W = W * W_coeff
```

Both updates decrease the divergence.

Try it: just stick `WH = W.dot(H)` before the computation for `W_coeff`, and the alternation effect goes away.

---

**Simplifying the code:**

When dealing with NumPy arrays, use their `mean` and `sum` methods, and avoid the Python builtin `sum` function:

```python
avg_V = sum(sum(V))/n/m
```

can be written as

```python
avg_V = V.mean()
```

and

```python
divergence = sum(sum(V * np.log(V/WH) - V + WH))  # equation (3)
```

can be written as

```python
divergence = ((V * np.log(V_over_WH)) - V + WH).sum()
```

Avoid the Python builtin `sum` function because

- it is slower than the NumPy `sum` method, and
- it is not as versatile as the NumPy `sum` method. (It does not allow you to specify the axis on which to sum. We managed to eliminate two calls to Python's `sum` with one call to NumPy's `sum` or `mean` method.)

---

**Eliminate the triple for-loop:**

A bigger improvement in both speed and readability can be had by replacing

```python
H_coeff = np.zeros(H.shape)
for a in range(r):
    for mu in range(m):
        for i in range(n):
            H_coeff[a, mu] += W[i, a] * V[i, mu] / WH[i, mu]
        H_coeff[a, mu] /= sum(W)[a]
H = H * H_coeff
```

with

```python
V_over_WH = V/WH
H *= (np.dot(V_over_WH.T, W) / W.sum(axis=0)).T
```

---

**Explanation:**

If you look at the equation (5) update rule for `H`, first notice that the indices for `V` and `(W H)` are identical. So you can replace `V / (W H)` with

```python
V_over_WH = V/WH
```

Next, note that in the numerator we are summing over the index i, which is the first index in both `W` and `V_over_WH`. We can express that as matrix multiplication:

```python
np.dot(V_over_WH.T, W).T
```

And the denominator is simply:

```python
W.sum(axis=0)
```

If we divide the numerator by the denominator

```python
(np.dot(V_over_WH.T, W) / W.sum(axis=0)).T
```

we get a matrix indexed by the two remaining indices, alpha and mu, in that order. That is the same as the indices for `H`, so we want to multiply `H` by this ratio element-wise — and NumPy's `*` multiplies arrays element-wise by default. Thus, we can express the entire update rule for `H` as

```python
H *= (np.dot(V_over_WH.T, W) / W.sum(axis=0)).T
```

---

**So, putting it all together:**

```python
import numpy as np
np.random.seed(1)

def update(V, W, H, WH, V_over_WH):
    # equation (5)
    H *= (np.dot(V_over_WH.T, W) / W.sum(axis=0)).T
    WH = W.dot(H)
    V_over_WH = V / WH

    W *= np.dot(V_over_WH, H.T) / H.sum(axis=1)
    WH = W.dot(H)
    V_over_WH = V / WH
    return W, H, WH, V_over_WH

def factor(V, r, iterations=100):
    n, m = V.shape
    avg_V = V.mean()
    W = np.random.random(n * r).reshape(n, r) * avg_V
    H = np.random.random(r * m).reshape(r, m) * avg_V
    WH = W.dot(H)
    V_over_WH = V / WH
    for i in range(iterations):
        W, H, WH, V_over_WH = update(V, W, H, WH, V_over_WH)
        # equation (3)
        divergence = ((V * np.log(V_over_WH)) - V + WH).sum()
        print("At iteration {i}, the Kullback-Leibler divergence is {d}".format(
            i=i, d=divergence))
    return W, H

V = np.arange(0.01, 1.01, 0.01).reshape(10, 10)
# V = np.arange(1, 101).reshape(10, 10).astype('float')
W, H = factor(V, 6)
```
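To see the alternation fix in action, here is a minimal sketch (made-up 6×6 data and rank 2, not from the question) that refreshes `WH` between the two updates and checks that the divergence never increases:

```python
import numpy as np

def kl_divergence(V, W, H):
    # equation (3): generalized Kullback-Leibler divergence of V from WH
    WH = W.dot(H)
    return (V * np.log(V / WH) - V + WH).sum()

rng = np.random.RandomState(1)
V = rng.rand(6, 6) + 0.1      # strictly positive, as NMF assumes
W = rng.rand(6, 2) + 0.1
H = rng.rand(2, 6) + 0.1

divergences = [kl_divergence(V, W, H)]
for _ in range(20):
    WH = W.dot(H)
    H *= (np.dot((V / WH).T, W) / W.sum(axis=0)).T   # equation (5)
    WH = W.dot(H)              # the fix: refresh WH before updating W
    W *= np.dot(V / WH, H.T) / H.sum(axis=1)         # matching W update
    divergences.append(kl_divergence(V, W, H))

# Each combined update step is nonincreasing, as Theorem 2 promises.
print(all(b <= a + 1e-9 for a, b in zip(divergences, divergences[1:])))
```

Without the mid-step `WH = W.dot(H)`, the `W` update is computed against a stale `WH` and the divergence can bounce between iterations.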
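The difference between the builtin `sum` and the NumPy method can be checked at the interpreter; a small sketch with a made-up 2×3 array:

```python
import numpy as np

V = np.arange(1.0, 7.0).reshape(2, 3)   # [[1., 2., 3.], [4., 5., 6.]]

# Python's builtin sum iterates over the first axis only, so a 2-D
# array needs two nested calls to get a grand total:
print(sum(V))          # array([5., 7., 9.])
print(sum(sum(V)))     # 21.0

# The NumPy method sums everything in one call, and accepts an axis:
print(V.sum())         # 21.0
print(V.sum(axis=0))   # array([5., 7., 9.])
print(V.sum(axis=1))   # array([ 6., 15.])
print(V.mean())        # 3.5
```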
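As a sanity check on the vectorization of the triple for-loop, this sketch (with made-up small dimensions) computes the `H` coefficient both ways and compares:

```python
import numpy as np

rng = np.random.RandomState(0)
n, m, r = 5, 4, 3
V = rng.rand(n, m) + 0.1      # strictly positive, as NMF assumes
W = rng.rand(n, r) + 0.1
H = rng.rand(r, m) + 0.1
WH = W.dot(H)

# Triple for-loop version of the equation (5) coefficient for H.
H_coeff = np.zeros(H.shape)
for a in range(r):
    for mu in range(m):
        for i in range(n):
            H_coeff[a, mu] += W[i, a] * V[i, mu] / WH[i, mu]
        H_coeff[a, mu] /= W.sum(axis=0)[a]

# Vectorized version: same numbers, no Python loops.
V_over_WH = V / WH
H_coeff_vectorized = (np.dot(V_over_WH.T, W) / W.sum(axis=0)).T

print(np.allclose(H_coeff, H_coeff_vectorized))   # True
```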