If you want a faster implementation of the Mahalanobis distance, you can rewrite your algorithm to mimic the one used by R. It's pretty straightforward.

I modified your function `mahalanobis_arma` slightly so that `mu` is a `rowvec`.

Basically, I just translated the R code to RcppArmadillo:

```r
mahalanobis
function (x, center, cov, inverted = FALSE, ...)
{
    x <- if (is.vector(x))
        matrix(x, ncol = length(x))
    else as.matrix(x)
    x <- sweep(x, 2, center)
    if (!inverted)
        cov <- solve(cov, ...)
    setNames(rowSums((x %*% cov) * x), rownames(x))
}
<bytecode: 0x6e5b408>
<environment: namespace:stats>
```

Here it is:

```cpp
#include <RcppArmadillo.h>
#include <Rcpp.h>

// [[Rcpp::depends("RcppArmadillo")]]

// [[Rcpp::export]]
arma::vec Mahalanobis(arma::mat x, arma::rowvec center, arma::mat cov) {
  int n = x.n_rows;
  arma::mat x_cen;
  x_cen.copy_size(x);
  for (int i = 0; i < n; i++) {
    x_cen.row(i) = x.row(i) - center;
  }
  return sum((x_cen * cov.i()) % x_cen, 1);
}

// [[Rcpp::export]]
arma::vec mahalanobis_arma(arma::mat x, arma::rowvec mu, arma::mat sigma) {
  int n = x.n_rows;
  arma::vec md(n);
  for (int i = 0; i < n; i++) {
    arma::mat x_i = x.row(i) - mu;
    arma::mat Y = arma::solve(sigma, arma::trans(x_i));
    md(i) = arma::as_scalar(x_i * Y);
  }
  return md;
}
```

Now, let's compare this new Armadillo version (`Mahalanobis`), your first version (`mahalanobis_arma`), and the R implementation (`mahalanobis`).

I saved this C++ code as `mahalanobis.cpp`:

```r
require(RcppArmadillo)
sourceCpp("mahalanobis.cpp")

set.seed(1)
x <- matrix(rnorm(10000 * 10), ncol = 10)
Sx <- cov(x)

all.equal(c(Mahalanobis(x, colMeans(x), Sx)),
          mahalanobis(x, colMeans(x), Sx))
## [1] TRUE

all.equal(mahalanobis_arma(x, colMeans(x), Sx),
          Mahalanobis(x, colMeans(x), Sx))
## [1] TRUE

require(rbenchmark)
benchmark(Mahalanobis(x, colMeans(x), Sx),
          mahalanobis(x, colMeans(x), Sx),
          mahalanobis_arma(x, colMeans(x), Sx),
          order = "elapsed")
##                                   test replications elapsed relative user.self sys.self user.child sys.child
## 1      Mahalanobis(x, colMeans(x), Sx)          100   0.124    1.000     0.173    0.077          0         0
## 2      mahalanobis(x, colMeans(x), Sx)          100   0.741    5.976     0.804    0.670          0         0
## 3 mahalanobis_arma(x, colMeans(x), Sx)          100   4.509   36.363     4.386    4.626          0         0
```

As you can see, the new implementation is faster than the R one. I'm pretty sure we can do better by using a Cholesky decomposition to solve the covariance system, or by using another matrix decomposition (a sketch of the Cholesky idea follows at the end of this answer).

Finally, we can just plug this `Mahalanobis` function into your `dmvnorm` and test it:

```r
require(mvtnorm)

set.seed(1)
sigma <- matrix(c(4, 2, 2, 3), ncol = 2)
x <- rmvnorm(n = 5000000, mean = c(1, 2), sigma = sigma, method = "chol")

all.equal(mvtnorm::dmvnorm(x, t(1:2), .2 + diag(2), FALSE),
          c(dmvnorm(x, t(1:2), .2 + diag(2), FALSE)))
## [1] TRUE

benchmark(mvtnorm::dmvnorm(x, t(1:2), .2 + diag(2), FALSE),
          dmvnorm(x, t(1:2), .2 + diag(2), FALSE),
          order = "elapsed")
##                                                 test replications elapsed relative user.self sys.self user.child sys.child
## 2           dmvnorm(x, t(1:2), 0.2 + diag(2), FALSE)          100  35.366    1.000    31.117    4.193          0         0
## 1 mvtnorm::dmvnorm(x, t(1:2), 0.2 + diag(2), FALSE)          100  60.770    1.718    56.666   13.236          0         0
```

It's almost twice as fast now.
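Since your `dmvnorm` isn't reproduced in this answer, here is roughly what a `Mahalanobis`-based version could look like. The name `dmvnorm_arma` and the exact signature are my assumptions, so treat this as a sketch rather than the code that was benchmarked; it would sit in the same `mahalanobis.cpp` file so that `Mahalanobis` from above is in scope.

```cpp
// Sketch only (assumed name and signature): a multivariate normal density
// built on the Mahalanobis() function above, using
// log|sigma| = sum(log(eigenvalues)) and
// log f(x) = -0.5 * (k * log(2*pi) + log|sigma| + d^2).
// [[Rcpp::export]]
arma::vec dmvnorm_arma(arma::mat x, arma::rowvec mean,
                       arma::mat sigma, bool logd = false) {
  arma::vec dist   = Mahalanobis(x, mean, sigma);                 // squared distances
  double logdet    = arma::sum(arma::log(arma::eig_sym(sigma)));  // log determinant
  double log2pi    = std::log(2.0 * arma::datum::pi);
  arma::vec logres = -0.5 * (x.n_cols * log2pi + logdet + dist);
  if (logd) return logres;
  return arma::exp(logres);
}
```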
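And here is one way the Cholesky idea mentioned above could look. This is only a sketch (the name `mahalanobis_chol` is mine, and I haven't benchmarked it): factor `cov = L * L.t()` once, do a single triangular solve for all rows, and take the sums of squares.

```cpp
// Sketch only: Mahalanobis distances via a Cholesky factorisation.
// With cov = L * L.t(), the squared distance for row x_i is
// || solve(L, (x_i - center)') ||^2.
// [[Rcpp::export]]
arma::vec mahalanobis_chol(arma::mat x, arma::rowvec center, arma::mat cov) {
  arma::mat x_cen = x;
  x_cen.each_row() -= center;                        // centre every row
  arma::mat L = arma::chol(cov, "lower");            // cov = L * L.t()
  // one triangular solve for all observations (columns of x_cen.t())
  arma::mat z = arma::solve(arma::trimatl(L), x_cen.t());
  return arma::sum(arma::square(z), 0).t();          // column-wise sums of squares
}
```

It could be checked the same way as above, e.g. `all.equal(c(mahalanobis_chol(x, colMeans(x), Sx)), mahalanobis(x, colMeans(x), Sx))`.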
 
