Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    text
    copied!<p>A 3-dimensional cumulative distribution function is a function of 3 variables. If you estimate it on a grid, it could be represented as a 3-dimensional array, but it would be imprecise and huge (your function returns a 1-dimensional array, so it is not what it is computing).</p> <p>Given a point <code>x</code>, just compute the proportion of points all of whose coordinates are less than those of <code>x</code>.</p> <pre><code>z &lt;- matrix(runif(60000), 3, 20000) emp.cdf &lt;- function(z) function(x) mean( apply( z &lt;= x, 2, all ) ) emp.cdf(z)( c(.5,.5,.5) ) # Approximately 1/8 </code></pre> <p>The following reproduces the plots in the document you cite:</p> <pre><code>n &lt;- 10 z &lt;- matrix(runif(2*n), 2, n) f &lt;- emp.cdf(z) g &lt;- function(u,v) f(c(u,v)) persp( outer( sort(z[1,]), sort(z[2,]), Vectorize(g) ) ) x &lt;- seq(0,1,length=100) persp( outer( x, x, Vectorize(g) ) ) </code></pre> <p>If you want to evaluate the cumulative probability distribution on the initial points, you can just use <code>apply</code> (if you wanted to evaluate it on a grid, you could use <code>expand.grid</code> to build it).</p> <pre><code>n &lt;- 100 z &lt;- matrix(runif(3*n), 3, n) f &lt;- emp.cdf(z) p &lt;- apply( z, 2, f ) </code></pre> <p>But this algorithm is quadratic: there are <code>n</code> probabilities to compute, and for each of them, we examine all <code>3*n</code> coordinates. For your 20,000 points, that will take a while.</p> <p>You can use a divide-and-conquer approach to speed up the computations, but it is not straightforward: pick up a point at random, use it to separate the space into 8 octants, recursively compute the number of points in each octant; you can then use the resulting <a href="http://en.wikipedia.org/wiki/Octree" rel="nofollow">tree</a> to compute the probability at any point, by examining only a fraction of the points.</p> <p>This is not unlike the preprocessing step used to compute the <a href="http://en.wikipedia.org/wiki/Nearest_neighbor_search#Space_partitioning" rel="nofollow">k-nearest neighbours</a>, or to speed up <a href="http://en.wikipedia.org/wiki/Barnes%E2%80%93Hut_simulation" rel="nofollow">n-body simulations</a>.</p>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload