    <p>Since the question was originally tagged 'bioinformatics', I'll mention the <a href="http://bioconductor.org" rel="nofollow noreferrer">Bioconductor</a> package <em><a href="http://bioconductor.org/packages/release/bioc/html/IRanges.html" rel="nofollow noreferrer">IRanges</a></em> (and its companion for ranges on genomes, <em><a href="http://bioconductor.org/packages/release/bioc/html/GenomicRanges.html" rel="nofollow noreferrer">GenomicRanges</a></em>).</p>
<pre><code>&gt; library(IRanges)
&gt; xx &lt;- c(1,1,1,1,1,1,0,0,0,0,1,1,1,1)
&gt; sl = slice(Rle(xx), 1)
&gt; sl
Views on a 14-length Rle subject

views:
    start end width
[1]     1   6     6 [1 1 1 1 1 1]
[2]    11  14     4 [1 1 1 1]
</code></pre>
    <p>The result could be coerced to a matrix, but a matrix is often not convenient for whatever the next step is.</p>
<pre><code>&gt; matrix(c(start(sl), end(sl)), ncol=2)
     [,1] [,2]
[1,]    1    6
[2,]   11   14
</code></pre>
    <p>Other operations might start on the <code>Rle</code>, e.g.,</p>
<pre><code>&gt; xx = c(2,2,2,3,3,3,0,0,0,0,4,4,1,1)
&gt; r = Rle(xx)
&gt; m = cbind(start(r), end(r))[runValue(r) != 0,,drop=FALSE]
&gt; m
     [,1] [,2]
[1,]    1    3
[2,]    4    6
[3,]   11   12
[4,]   13   14
</code></pre>
    <p><strong>See the help page <code>?Rle</code></strong> for the full flexibility of the <code>Rle</code> class. To go from a matrix like the one above to a new Rle, as asked in the comment below, one might create a new Rle of appropriate length and then subset-assign using an IRanges as the index.</p>
<pre><code>&gt; r = Rle(0L, max(m))
&gt; r[IRanges(m[,1], m[,2])] = 1L
&gt; r
integer-Rle of length 14 with 3 runs
  Lengths: 6 4 4
  Values : 1 0 1
</code></pre>
    <p>One could expand this to a full vector</p>
<pre><code>&gt; as(r, "integer")
 [1] 1 1 1 1 1 1 0 0 0 0 1 1 1 1
</code></pre>
    <p>but often it's better to continue the analysis on the Rle. The class is very flexible, so one way of going from <code>xx</code> to an integer vector of 1's and 0's is</p>
<pre><code>&gt; as(Rle(xx) &gt; 0, "integer")
 [1] 1 1 1 1 1 1 0 0 0 0 1 1 1 1
</code></pre>
    <p>Again, though, it often makes sense to stay in Rle space. And <a href="https://stackoverflow.com/questions/17121205/r-matrix-to-indexes/17121362#17121362">Arun</a>'s answer to your separate question is probably best of all.</p>
    <p><strong>Performance</strong> (speed) is important, although in this case I think the flexibility of the Rle class would make up for poor performance, and a matrix is an unlikely end-point for a typical analysis. Nonetheless, the IRanges infrastructure <em>is</em> performant. Compare the three implementations</p>
<pre><code>eddi &lt;- function(xx)
    matrix(which(diff(c(0,xx,0)) != 0) - c(0,1), ncol = 2, byrow = TRUE)

iranges = function(xx) {
    sl = slice(Rle(xx), 1)
    matrix(c(start(sl), end(sl)), ncol=2)
}

iranges.1 = function(xx) {
    r = Rle(xx)
    cbind(start(r), end(r))[runValue(r) != 0, , drop=FALSE]
}
</code></pre>
    <p>with</p>
<pre><code>&gt; library(microbenchmark)
&gt; xx = sample(c(0, 1), 1e5, TRUE)
&gt; microbenchmark(eddi(xx), iranges(xx), iranges.1(xx), times=10)
Unit: milliseconds
          expr       min        lq    median        uq      max neval
      eddi(xx)  45.88009  46.69360  47.67374 226.15084 234.8138    10
   iranges(xx) 112.09530 114.36889 229.90911 292.84153 294.7348    10
 iranges.1(xx)  31.64954  31.72658  33.26242  35.52092 226.7817    10
</code></pre>
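    <p>For comparison, the same start/end matrix can be sketched in base R with <code>rle()</code>, without any Bioconductor dependency. This is only an illustration of the run-length idea and was not part of the timings above; the function name <code>base_rle</code> and the column names are made up here.</p>
<pre><code>base_rle &lt;- function(xx) {
    r &lt;- rle(xx)                       # base-R run-length encoding
    ends &lt;- cumsum(r$lengths)          # last index of each run
    starts &lt;- ends - r$lengths + 1     # first index of each run
    # keep one row per non-zero run; drop=FALSE keeps a matrix for a single run
    cbind(start = starts, end = ends)[r$values != 0, , drop = FALSE]
}

xx &lt;- c(1,1,1,1,1,1,0,0,0,0,1,1,1,1)
m &lt;- base_rle(xx)
</code></pre>
    <p>Unlike the <code>Rle</code> class, the plain list returned by <code>rle()</code> does not support subsetting or arithmetic, so this is mainly useful when the matrix really is the end-point.</p>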