Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    primarykey
    data
    text
    <p>Sorry for digging up an old question. I couldn't look up the function to do this job on the train, so I wrote one myself.</p> <p>I was proud to find out that it's a tiny bit faster.<br> It's less flexible though.</p> <p>But it plays nice with <code>ave</code>, which is what I needed.</p> <pre><code>repeat.before = function(x) { # repeats the last non NA value. Keeps leading NA ind = which(!is.na(x)) # get positions of nonmissing values if(is.na(x[1])) # if it begins with a missing, add the ind = c(1,ind) # first position to the indices rep(x[ind], times = diff( # repeat the values at these indices c(ind, length(x) + 1) )) # diffing the indices + length yields how often } # they need to be repeated x = c(NA,NA,'a',NA,NA,NA,NA,NA,NA,NA,NA,'b','c','d',NA,NA,NA,NA,NA,'e') xx = rep(x, 1000000) system.time({ yzoo = na.locf(xx,na.rm=F)}) ## user system elapsed ## 2.754 0.667 3.406 system.time({ yrep = repeat.before(xx)}) ## user system elapsed ## 0.597 0.199 0.793 </code></pre> <h2>Edit</h2> <p>As this became my most upvoted answer, I was reminded often that I don't use my own function, because I often need zoo's <code>maxgap</code> argument. Because zoo has some weird problems in edge cases when I use dplyr + dates that I couldn't debug, I came back to this today to improve my old function.</p> <p>I benchmarked my improved function and all the other entries here. For the basic set of features, <code>tidyr::fill</code> is fastest while also not failing the edge cases. The Rcpp entry by @BrandonBertelsen is faster still, but it's inflexible regarding the input's type (he tested edge cases incorrectly due to a misunderstanding of <code>all.equal</code>).</p> <p>If you need <code>maxgap</code>, my function below is faster than zoo (and doesn't have the weird problems with dates). </p> <p>I put up the <a href="http://rpubs.com/rubenarslan/repeat_last_na_locf" rel="noreferrer">documentation of my tests</a>.</p> <h3>new function</h3> <pre><code>repeat_last = function(x, forward = TRUE, maxgap = Inf, na.rm = FALSE) { if (!forward) x = rev(x) # reverse x twice if carrying backward ind = which(!is.na(x)) # get positions of nonmissing values if (is.na(x[1]) &amp;&amp; !na.rm) # if it begins with NA ind = c(1,ind) # add first pos rep_times = diff( # diffing the indices + length yields how often c(ind, length(x) + 1) ) # they need to be repeated if (maxgap &lt; Inf) { exceed = rep_times - 1 &gt; maxgap # exceeding maxgap if (any(exceed)) { # any exceed? ind = sort(c(ind[exceed] + 1, ind)) # add NA in gaps rep_times = diff(c(ind, length(x) + 1) ) # diff again } } x = rep(x[ind], times = rep_times) # repeat the values at these indices if (!forward) x = rev(x) # second reversion x } </code></pre> <p>I've also put the function in my <a href="https://github.com/rubenarslan/formr" rel="noreferrer">formr package</a> (Github only).</p>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. VO
      singulars
      1. This table or related slice is empty.
    2. VO
      singulars
      1. This table or related slice is empty.
    3. VO
      singulars
      1. This table or related slice is empty.
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload