Computing sparse pairwise distance matrix in R
<p>I have an <code>NxM</code> matrix (<code>N</code> points in <code>M</code> dimensions) and I want to compute the <code>NxN</code> matrix of Euclidean distances between the <code>N</code> points. In my problem, <code>N</code> is about 100,000. As I plan to use this matrix for a k-nearest neighbor algorithm, I only need to keep the <code>k</code> smallest distances for each point, so the resulting <code>NxN</code> matrix is very sparse. This is in contrast to what comes out of <code>dist()</code>, for example, which would result in a dense matrix (and probably storage problems for my size <code>N</code>).</p>
<p>The packages for kNN that I've found so far (<code>knnflex</code>, <code>kknn</code>, etc.) all appear to use dense matrices. Also, the <code>Matrix</code> package does not offer a pairwise distance function.</p>
<p>Closer to my goal, I see that the <code>spam</code> package has a <code>nearest.dist()</code> function that allows one to consider only distances less than some threshold, <code>delta</code>. In my case, however, a particular value of <code>delta</code> may produce too many distances (so that I effectively have to store the <code>NxN</code> matrix densely) or too few distances (so that some points have fewer than <code>k</code> neighbors and I can't use kNN).</p>
<p>I have seen previous discussion on trying to perform <a href="https://stackoverflow.com/questions/3039646/k-means-clustering-in-r-on-very-large-sparse-matrix">k-means clustering</a> using the <code>bigmemory/biganalytics</code> packages, but it doesn't seem like I can leverage these methods in this case.</p>
<p>Does anybody know a function/implementation that will compute a distance matrix in a sparse fashion in R? My (dreaded) backup plan is to have two <code>for</code> loops and save results in a <code>Matrix</code> object.</p>
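<p>For concreteness, the backup plan might look roughly like the sketch below, with the inner loop replaced by a block-wise matrix product. The function name <code>knn_dist_sparse</code> and the <code>chunk_size</code> argument are hypothetical (they are not from any package); only base R and <code>Matrix::sparseMatrix()</code> are assumed. It is still an O(N^2 M) brute-force computation, so it is an illustration of the idea rather than a recommended solution.</p>

```r
library(Matrix)

# Hypothetical helper (not part of any package): returns a sparse N x N matrix
# whose row i holds the Euclidean distances from point i to its k nearest
# other points; all remaining entries are left empty.
knn_dist_sparse <- function(X, k, chunk_size = 500L) {
  n <- nrow(X)
  stopifnot(k >= 1, k < n)
  sq_norms <- rowSums(X^2)
  starts <- seq(1L, n, by = chunk_size)

  pieces <- lapply(starts, function(s) {
    rows <- s:min(s + chunk_size - 1L, n)
    m <- length(rows)
    # Squared distances from this block of rows to all N points:
    # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 * a.b
    d2 <- outer(sq_norms[rows], sq_norms, "+") -
      2 * tcrossprod(X[rows, , drop = FALSE], X)
    d2[cbind(seq_len(m), rows)] <- Inf  # exclude each point's distance to itself
    # k x m matrix: column i lists the k nearest neighbours of rows[i]
    nn <- matrix(
      vapply(seq_len(m), function(i) order(d2[i, ])[seq_len(k)], integer(k)),
      nrow = k
    )
    local_i <- rep(seq_len(m), each = k)
    j <- as.vector(nn)
    data.frame(
      i = rows[local_i],
      j = j,
      # pmax() guards against tiny negative values from rounding
      x = sqrt(pmax(d2[cbind(local_i, j)], 0))
    )
  })

  trip <- do.call(rbind, pieces)
  sparseMatrix(i = trip$i, j = trip$j, x = trip$x, dims = c(n, n))
}

# Small illustrative data set: four points in the plane
X <- matrix(c(0, 0,
              1, 0,
              0, 3,
              10, 10), ncol = 2, byrow = TRUE)
D <- knn_dist_sparse(X, k = 1, chunk_size = 3L)
```

<p>Each block materialises a dense <code>chunk_size x N</code> matrix, so <code>chunk_size</code> would have to be chosen to fit in memory for <code>N</code> around 100,000. The result is not symmetric in general, since <code>j</code> being among the nearest neighbors of <code>i</code> does not imply the reverse.</p>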