
    <p>Code below. The initial question was not fully specified, so I asked for these clarifications:</p>
    <ol>
    <li>Is it guaranteed that at least the first and/or last entries are non-NA? <strong>[No]</strong></li>
    <li>What to do if all entries in a row are NA? <strong>[Leave as-is]</strong></li>
    <li>Do you care how ties are split, i.e. how to treat the middle NA in <code>1 3 NA NA NA 5 7</code>? <strong>[Don't care / left]</strong></li>
    <li>Do you have an upper bound (S) on the longest contiguous span of NAs in a row? (I'm thinking of a recursive solution if S is small, or a dataframe solution with <em><code>ifelse</code></em> if S is large and the number of rows and columns is large.) <strong>[Worst-case S could be pathologically large, hence recursion should not be used]</strong></li>
    </ol>
    <p>geoffjentry, re your solution: your bottlenecks will be the serial calculation of <em><code>nearest.non.na.pos</code></em> and the serial assignment <em><code>dat[na.pos] &lt;- dat[non.na.pos[nearest.non.na.pos]]</code></em>. For a large gap of length G, all we really need to compute is that the first G/2 (rounded up) items fill from the left and the rest fill from the right. (I could post an answer using <em><code>ifelse</code></em> but it would look similar.) Are your criteria <strong>runtime</strong>, big-O efficiency, temp memory usage, or code legibility?</p>
    <p>A couple of possible tweaks:</p>
    <ul>
    <li>you only need to compute <em><code>N &lt;- length(dat)</code></em> once</li>
    <li>common-case speed-up: <em><code>if (length(na.pos) == 0)</code></em> skip the row, since it has no NAs</li>
    <li><em><code>if (length(na.pos) == length(dat)-1)</code></em> is the (rare) case where there is only one non-NA entry, hence we fill the entire row with it</li>
    </ul>
    <p>Outline solution:</p>
    <p>Sadly na.locf will not fill along the rows of a dataframe in a single call, so you must apply it row-wise:</p>
    <pre><code>library(zoo)  # provides na.locf

na.fill_from_nn &lt;- function(x) {
  row.na &lt;- is.na(x)
  fillFromLeft  &lt;- na.locf(x, na.rm=FALSE)
  fillFromRight &lt;- na.locf(x, fromLast=TRUE, na.rm=FALSE)
  disagree &lt;- rle(fillFromLeft != fillFromRight)
  for (loc in (disagree)) {
    # ... resolve conflicts, row-wise
  }
}

# apply over rows (MARGIN = 1); sapply(dat, ...) would iterate over columns instead
t(apply(dat, 1, na.fill_from_nn))
</code></pre>
    <p>Alternatively, since as you say contiguous NAs are rare, do a fast-and-dumb <em><code>ifelse</code></em> to fill isolated NAs from the left. This operates on the whole dataframe at once, which makes the common case fast. Then handle all the other cases with a row-wise for-loop. (This will affect the tiebreak on middle elements in a long span of NAs, but you say you don't care.)</p>
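<p>One way the "first half of each gap fills from the left, the rest from the right" idea could be made concrete is sketched below. This is only an illustration under assumptions, not the code from either answer: the name <code>fill_nearest</code> is hypothetical, and base R <code>cummax</code>/<code>cummin</code> on index positions stand in for <code>na.locf</code> so that the sketch runs without extra packages. It follows the clarifications above: ties go left, all-NA rows are left as-is, and there is no recursion.</p>

```r
# Hypothetical helper: fill each NA from its nearest non-NA neighbour.
# Ties go to the left; vectors with no NAs or only NAs are returned unchanged.
fill_nearest <- function(x) {
  ok <- !is.na(x)
  if (all(ok) || !any(ok)) return(x)

  n   <- length(x)
  idx <- seq_along(x)

  # Position of the nearest non-NA at or to the left (0 if there is none).
  left <- cummax(ifelse(ok, idx, 0L))
  # Position of the nearest non-NA at or to the right (n + 1 if there is none).
  right <- rev(cummin(rev(ifelse(ok, idx, n + 1L))))

  # Take the right neighbour only if there is no left one,
  # or if the right one exists and is strictly closer.
  use_right <- left == 0L | (right <= n & (right - idx) < (idx - left))
  x[ifelse(use_right, right, left)]
}

fill_nearest(c(1, 3, NA, NA, NA, 5, 7))
# 1 3 3 3 5 5 7

# Row-wise use on a numeric data frame (apply returns rows as columns, hence t()).
dat <- data.frame(a = c(1, NA), b = c(NA, NA), c = c(NA, 4), d = c(2, NA))
filled <- t(apply(as.matrix(dat), 1, fill_nearest))
```

<p>Because both neighbour positions come from cumulative scans, the cost per row is linear in the row length regardless of how long the NA spans are, which is the property the worst-case-S answer asks for.</p>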
 
