How do I detect zero-vectors that make k-means cosine crash Matlab?
<p>I'm running kmeans on a large dataset and I always get the error below:</p>
<pre><code>Error using kmeans (line 145)
Some points have small relative magnitudes, making them effectively zero.
Either remove those points, or choose a distance other than 'cosine'.

Error in runkmeans (line 7)
[L, C]=kmeans(data, 10, 'Distance', 'cosine', 'EmptyAction', 'drop')
</code></pre>
<p>My problem is that even when I add 1 to all the vectors, I still get this error. I would expect it to pass then, but apparently there are still too many zeros (that is what is causing it, right?).</p>
<p>My question is this: what is the condition that makes Matlab decide that a point has "a small relative magnitude" and "is effectively zero"?</p>
<p>I want to remove all these points from my dataset using python before I hand the data over to Matlab, because I need to compare my results with a gold standard that I process in python.</p>
<p>Thanks in advance!</p>
<p><strong>EDIT-ANSWER</strong></p>
<p>The correct answer was given below, but in case someone finds this question through Google, here's how you remove the "effectively zero-vectors" from your matrix in python. Every row (!) is a data point, so you may need to transpose in python or Matlab before running kmeans:</p>
<pre><code>import numpy as np

def getxnorm(data):
    # Euclidean norm of every row (one row per data point)
    return np.sqrt(np.sum(data ** 2, axis=1))

def remove_zero_vector(data, startxnorm, excluded=None):
    # startxnorm: row norms of the original, unfiltered matrix
    if excluded is None:  # avoid a mutable default shared between calls
        excluded = []
    eps = 2.2204e-16
    xnorm = getxnorm(data)
    if np.min(xnorm) &lt;= (eps * np.max(xnorm)):
        local_index = np.where(xnorm == np.min(xnorm))[0][0]
        # first original row with this norm that is not excluded yet,
        # so rows with identical norms get distinct global indices
        matches = np.where(startxnorm == np.min(xnorm))[0]
        global_index = next(i for i in matches if i not in excluded)
        data = np.delete(data, local_index, 0)  # data with zero vector removed
        excluded.append(global_index)  # add global index to list of excluded vectors
        return remove_zero_vector(data, startxnorm, excluded)
    else:
        return (data, excluded)
</code></pre>
<p>I'm sure there's a much more scipythonic way of doing this, but it'll do :-)</p>
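A vectorized variant of the removal step above might look like the sketch below. It applies the same condition used in the post (row norm at most eps times the largest row norm) to all rows at once instead of recursing. The function name `drop_effectively_zero_rows` and the `rel_tol` parameter are made up for illustration, and whether this threshold matches Matlab's internal check exactly is an assumption, not something confirmed here.

```python
import numpy as np

def drop_effectively_zero_rows(data, rel_tol=np.finfo(float).eps):
    """Split off rows whose norm is tiny relative to the largest row norm.

    Hypothetical helper: returns (kept_rows, excluded_indices), where
    excluded_indices refer to row positions in the original matrix.
    """
    data = np.asarray(data, dtype=float)
    norms = np.linalg.norm(data, axis=1)  # one Euclidean norm per row
    if norms.size == 0:
        return data, np.array([], dtype=int)
    # Same condition as in the post: norm <= eps * max(norm).
    # Assumption: this approximates the check kmeans performs.
    mask = norms <= rel_tol * norms.max()
    return data[~mask], np.flatnonzero(mask)

if __name__ == "__main__":
    X = np.array([[1.0, 2.0], [0.0, 0.0], [3.0, 4.0], [1e-20, 0.0]])
    kept, excluded = drop_effectively_zero_rows(X)
    print(kept.shape, excluded)
```

Because the mask is computed once against the original matrix, the excluded indices are already global, so no separate bookkeeping of the starting norms is needed.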