<p>The problem is solved by deleting <code>dlclose(lib_handle);</code> from the .cpp file. This yields the following:</p>
<pre><code>#include &lt;Rcpp.h&gt;
#include &lt;dlfcn.h&gt;

using namespace Rcpp;
using namespace std;

// Signature of the gpuQR function exported by gpuQR.so
typedef void (*func)(int*, int*, float*, int*, float*);

RcppExport SEXP gpuQR_Rcpp(SEXP x_, SEXP n_rows_, SEXP n_cols_)
{
    vector&lt;float&gt; x = as&lt;vector&lt;float&gt; &gt;(x_);
    int n_rows = as&lt;int&gt;(n_rows_);
    int n_cols = as&lt;int&gt;(n_cols_);
    vector&lt;float&gt; scale(n_cols);

    // Load the shared library at run time; the handle is deliberately not closed
    void* lib_handle = dlopen("path/gpuQR.so", RTLD_LAZY);
    if (!lib_handle)
    {
        Rcout &lt;&lt; dlerror() &lt;&lt; endl;
    } else {
        func gpuQR = (func) dlsym(lib_handle, "gpuQR");
        gpuQR(&amp;n_rows, &amp;n_cols, &amp;(x[0]), &amp;n_rows, &amp;(scale[0]));
    }

    // Rescale the entries below the diagonal (column-major storage)
    for(int ii = 1; ii &lt; n_rows; ii++)
    {
        for(int jj = 0; jj &lt; n_cols; jj++)
        {
            if(ii &gt; jj) { x[ii + jj * n_rows] *= scale[jj]; }
        }
    }

    return wrap(x);
}
</code></pre>
<p>The .cpp file can be compiled in <em>R</em> using:</p>
<pre><code>library(Rcpp)

PKG_LIBS &lt;- sprintf('%s $(LAPACK_LIBS) $(BLAS_LIBS) $(FLIBS)', Rcpp:::RcppLdFlags())
PKG_CPPFLAGS &lt;- sprintf('%s', Rcpp:::RcppCxxFlags())
Sys.setenv(PKG_LIBS = PKG_LIBS , PKG_CPPFLAGS = PKG_CPPFLAGS)
R &lt;- file.path(R.home(component = 'bin'), 'R')
file &lt;- 'path/gpuQR_Rcpp.cpp'
cmd &lt;- sprintf('%s CMD SHLIB %s', R, paste(file, collapse = ' '))
system(cmd)
</code></pre>
<p>The .c file that links to <em>culatools</em> is:</p>
<pre><code>#include&lt;cula.h&gt;

void gpuQR(const int *m, const int *n, float *a, const int *lda, float *tau)
{
    culaInitialize();
    culaSgeqrf(m[0], n[0], a, lda[0], tau);
    culaShutdown();
}
</code></pre>
<p>It can be compiled using:</p>
<pre><code>gcc -c -I/usr/local/cula/include gpuQR.c
gcc -shared -Wl,-rpath,/usr/local/cula/lib64 -L/usr/local/cula/lib64 -lcula_lapack -o gpuQR.so gpuQR.o
</code></pre>
<p>The QR decomposition can then be performed in <em>R</em> using:</p>
<pre><code>dyn.load('path/gpuQR_Rcpp.so')

set.seed(100)

n_row &lt;- 3
n_col &lt;- 3
A &lt;- matrix(rnorm(n_row * n_col), n_row, n_col)

res &lt;- .Call('gpuQR_Rcpp', c(A), n_row, n_col)
matrix(res, n_row, n_col)
           [,1]       [,2]       [,3]
[1,]  0.5250958 -0.8666927  0.8594266
[2,] -0.2504899 -0.3878644 -0.1277837
[3,]  0.1502908  0.4742033 -0.8804248

qr(A)$qr
           [,1]       [,2]       [,3]
[1,]  0.5250957 -0.8666925  0.8594266
[2,] -0.2504899 -0.3878643 -0.1277838
[3,]  0.1502909  0.4742033 -0.8804247
</code></pre>
<p>Here are the results from a benchmark using an NVIDIA GeForce 9400M GPU with 16 CUDA cores:</p>
<pre><code>library(rbenchmark)  # provides benchmark()

n_row &lt;- 1000; n_col &lt;- 1000
A &lt;- matrix(rnorm(n_row * n_col), n_row, n_col)
B &lt;- A; dim(B) &lt;- NULL

res &lt;- benchmark(.Call('gpuQR_Rcpp', B, n_row, n_col),
                 qr(A),
                 columns = c('test', 'replications', 'elapsed', 'relative'),
                 order = 'relative')

                                  test replications elapsed relative
1 .Call("gpuQR_Rcpp", B, n_row, n_col)          100  38.037    1.000
2                                qr(A)          100 152.575    4.011
</code></pre>
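<p>The nested loop in <code>gpuQR_Rcpp</code> multiplies every entry strictly below the diagonal of column <code>jj</code> by <code>scale[jj]</code>, relying on column-major storage where element (ii, jj) sits at index <code>ii + jj * n_rows</code>. The following is a minimal standalone sketch of that step only, without Rcpp or CULA. The helper name <code>scale_below_diagonal</code> is hypothetical and does not appear in the code above; it is meant as an illustration that can be checked on a small matrix, not as a replacement for the wrapper.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the answer's code): multiply every
// strictly-below-diagonal entry of a column-major matrix by that column's
// scale factor, mirroring the nested loop in gpuQR_Rcpp.
void scale_below_diagonal(std::vector<float>& x,
                          std::size_t n_rows,
                          std::size_t n_cols,
                          const std::vector<float>& scale) {
    // Assumption: x holds the full n_rows x n_cols matrix and there is one
    // scale factor per column, as in the wrapper above.
    assert(x.size() == n_rows * n_cols);
    assert(scale.size() == n_cols);

    for (std::size_t jj = 0; jj < n_cols; ++jj) {
        // Column-major storage: element (ii, jj) lives at ii + jj * n_rows.
        // Starting at ii = jj + 1 visits exactly the entries with ii > jj.
        for (std::size_t ii = jj + 1; ii < n_rows; ++ii) {
            x[ii + jj * n_rows] *= scale[jj];
        }
    }
}
```

<p>Iterating column by column and starting the inner loop at <code>jj + 1</code> touches the same entries as the <code>if(ii &gt; jj)</code> test, but walks memory contiguously and skips the comparison.</p>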