Note that there are some explanatory texts on larger screens.

plurals
  1. POOptimizing random-access bilinear sampling
    primarykey
    data
    text
    <p>I'm working on an old-school "image warp" filter. Essentially, I have a 2D array of pixels (ignoring for the present the question of whether they are color, grayscale, float, RGBA, etc.) and another 2D array of vectors (with floating-point components), with the image being <em>at least</em> as large as the vector array. In pseudo code, I want to do this:</p> <pre><code>FOR EACH PIXEL (x,y) vec = vectors[x,y] // Get vector val = get(img, x + vec.x, y + vec.y) // Get input at &lt;x,y&gt; + vec output[x,y] = val // Write to output </code></pre> <p>The catch is that <code>get()</code> needs to bilinearly sample the input image, because the vectors can refer to subpixel coordinates. But unlike bilinear sampling in, say, texture mapping where we can work the interpolation math into a loop so it's all just adds, here the reads are from random locations. So <code>get()</code>'s definition looks something like this:</p> <pre><code>FUNCTION get(in,x,y) ix = floor(x); iy = floor(y) // Integer upper-left coordinates xf = x - ix; yf = y - iy // Fractional parts a = in[ix,iy]; b = in[iy+1,iy] // Four bordering pixel values b = in[ix,iy+1]; d = in[ix+1,iy+1] ab = lerp(a,b,xf) // Interpolate cd = lerp(c,d,xf) RETURN lerp(ab,cd,yf) </code></pre> <p>and <code>lerp()</code> is simply</p> <pre><code>FUNCTION lerp(a,b,x) RETURN (1-x)*a + x*b </code></pre> <p>Assuming that neither the input image nor the vector array is known in advance, what kind of high-level optimizations are possible? (Note: "Use the GPU" is cheating.) One thing I can think of is rearranging the interpolation math in <code>get()</code> so that we can cache the pixel reads and intermediate calculations for a given (ix,iy). That way, if successive accesses are to the same subpixel quad, we can avoid some work. If the vector array is known in advance, then we can rearrange it so that the coordinates passed to <code>get()</code> tend to be more local. This might help with cache locality as well, but at the expense of having the writes to <code>output</code> be all over the place. But then we can't do fancy things like scale the vectors on the fly, or even move the warp effect from its original precalculated location.</p> <p>The only other possibility would be to use fixed-point vector components, perhaps with <em>very</em> limited fractional parts. E.g., if the vectors only have 2-bit fractional components, then there are only 16 subpixel regions that could be accessed. We could precompute the weights for these, and avoid much of the interpolation math altogether, but with a hit to quality.</p> <p>Any other ideas? I want to accumulate a few different methods before I implement them and see which is the best. If someone could point me to source code of a fast implementation, that would be great.</p>
    singulars
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload