Latent Semantic Analysis in Python discrepancy
<p>I'm trying to follow the <a href="http://en.wikipedia.org/wiki/Latent_semantic_analysis" rel="nofollow">Wikipedia article on latent semantic indexing</a> in Python using the following code:</p>
<pre><code>documentTermMatrix = array([[ 0., 1., 0., 1., 1., 0., 1.],
                            [ 0., 1., 1., 0., 0., 0., 0.],
                            [ 0., 0., 0., 0., 0., 1., 1.],
                            [ 0., 0., 0., 1., 0., 0., 0.],
                            [ 0., 1., 1., 0., 0., 0., 0.],
                            [ 1., 0., 0., 1., 0., 0., 0.],
                            [ 0., 0., 0., 0., 1., 1., 0.],
                            [ 0., 0., 1., 1., 0., 0., 0.],
                            [ 1., 0., 0., 1., 0., 0., 0.]])

u, s, vt = linalg.svd(documentTermMatrix, full_matrices=False)

sigma = diag(s)
## remove extra dimensions...
numberOfDimensions = 4
for i in range(4, len(sigma) -1):
    sigma[i][i] = 0

queryVector = array([[ 0.], # same as first column in documentTermMatrix
                     [ 0.],
                     [ 0.],
                     [ 0.],
                     [ 0.],
                     [ 1.],
                     [ 0.],
                     [ 0.],
                     [ 1.]])
</code></pre>
<p>How the math says it should work:</p>
<pre><code>dtMatrixToQueryAgainst = dot(u, dot(s,vt))
queryVector = dot(inv(s), dot(transpose(u), queryVector))
similarityToFirst = cosineDistance(queryVector, dtMatrixToQueryAgainst[:,0])
# gives 'matrices are not aligned' error. should be 1 because they're the same
</code></pre>
<p>What does work, with math that looks incorrect (from <a href="http://www.gototheboard.com/articles/An_Example_of_Latent_Semantic_Indexing" rel="nofollow">here</a>):</p>
<pre><code>dtMatrixToQueryAgainst = dot(s, vt)
queryVector = dot(transpose(u), queryVector)
similarityToFirst = cosineDistance(queryVector, dtMatrixToQueryAgainst[:,0])
# gives 1, which is correct
</code></pre>
<p>Why does the second route work and the first not, when everything I can find about the math of LSA shows the first as correct? I feel like I'm missing something obvious...</p>
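<p>One possible reading of the discrepancy, offered as a sketch and not as a definitive answer: in the first route the query is folded into the reduced space (length k) but then compared against a column of the reconstructed term-document matrix (length 9), so the shapes cannot align. The folded query would instead be compared against a column of the truncated <code>vt</code>. The second route scales both sides by the singular values, which keeps them in the same k-dimensional space. The example below assumes NumPy, uses <code>k = 4</code> as in the question, and introduces a hypothetical <code>cosine_similarity</code> helper in place of the question's undefined <code>cosineDistance</code>.</p>

```python
import numpy as np

# Term-document matrix from the question: rows are terms, columns are documents.
X = np.array([[0., 1., 0., 1., 1., 0., 1.],
              [0., 1., 1., 0., 0., 0., 0.],
              [0., 0., 0., 0., 0., 1., 1.],
              [0., 0., 0., 1., 0., 0., 0.],
              [0., 1., 1., 0., 0., 0., 0.],
              [1., 0., 0., 1., 0., 0., 0.],
              [0., 0., 0., 0., 1., 1., 0.],
              [0., 0., 1., 1., 0., 0., 0.],
              [1., 0., 0., 1., 0., 0., 0.]])


def cosine_similarity(a, b):
    """Hypothetical helper standing in for the question's cosineDistance."""
    a = np.ravel(a)
    b = np.ravel(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


k = 4  # number of dimensions kept, as in the question
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Truncate by slicing instead of zeroing entries of a diagonal matrix.
U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]

# The query is the first document (first column of X).
query = X[:, 0]

# Route 1: fold the query in with inv(Sigma_k) @ U_k.T @ q.
# Dividing by s_k is the same as multiplying by inv(diag(s_k)).
query_k = (U_k.T @ query) / s_k
# Column j of Vt_k is document j in the same k-dimensional space.
similarity_to_first = cosine_similarity(query_k, Vt_k[:, 0])

# Route 2: leave the singular values on both sides (the variant that
# worked in the question). Both vectors are again k-dimensional.
scaled_similarity = cosine_similarity(U_k.T @ query,
                                      (np.diag(s_k) @ Vt_k)[:, 0])

print(similarity_to_first, scaled_similarity)
```

<p>Both comparisons stay inside the k-dimensional space, so the folded query and the first document point in the same direction either way. Note also that <code>linalg.svd</code> returns <code>s</code> as a 1-D array of singular values, so an expression such as <code>inv(s)</code> would need the diagonal matrix <code>sigma</code> (or element-wise division) instead.</p>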