<h1>Fast method to copy memory with translation - ARGB to BGR</h1>
<h2>Overview</h2>
<p>I have an image buffer that I need to convert to another format. The source image buffer has four channels, 8 bits per channel: Alpha, Red, Green, and Blue. The destination buffer has three channels, 8 bits per channel: Blue, Green, and Red.</p>
<p>So the brute force method is:</p>
<pre><code>// Assume a 32 x 32 pixel image
#define IMAGESIZE (32*32)

typedef struct{ UInt8 Alpha; UInt8 Red; UInt8 Green; UInt8 Blue; } ARGB;
typedef struct{ UInt8 Blue; UInt8 Green; UInt8 Red; } BGR;

ARGB orig[IMAGESIZE];
BGR  dest[IMAGESIZE];

int x;

// Copy the three colour channels of each pixel; Alpha is dropped.
for(x = 0; x &lt; IMAGESIZE; x++)
{
    dest[x].Red   = orig[x].Red;
    dest[x].Green = orig[x].Green;
    dest[x].Blue  = orig[x].Blue;
}
</code></pre>
<p>However, I need more speed than a loop with three byte copies per pixel provides. I'm hoping there are a few tricks I can use to reduce the number of memory reads and writes, given that I'm running on a 32-bit machine.</p>
<h2>Additional info</h2>
<p>Every image is a multiple of at least 4 pixels. So we could read 16 ARGB bytes and write them out as 12 BGR bytes per loop iteration. Perhaps this fact can be used to speed things up, especially as it falls nicely on 32-bit boundaries.</p>
<p>I have access to OpenCL. It requires moving the entire buffer into GPU memory and then moving the result back out, but OpenCL can work on many portions of the image simultaneously, and large memory block moves are actually quite efficient, so this may be a worthwhile exploration.</p>
<p>While I've given the example of small buffers above, I really am moving HD video (1920x1080) buffers around, sometimes larger and mostly smaller. So while a 32x32 situation may be trivial, copying 8.3MB of image data byte by byte is really, really bad.</p>
<p>I'm running on Intel processors (Core 2 and above), so there are streaming and data-processing instructions that I know exist but am not familiar with. Pointers on where to look for specialized data handling instructions would be good.</p>
<p>This is going into an OS X application, and I'm using Xcode 4. If assembly is painless and the obvious way to go, I'm fine traveling down that path, but not having done it on this setup before makes me wary of sinking too much time into it.</p>
<p>Pseudo-code is fine - I'm not looking for a complete solution, just the algorithm and an explanation of any trickery that might not be immediately clear.</p>
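<p>To make the "16 ARGB bytes in, 12 BGR bytes out" idea concrete, here is a minimal sketch of what I have in mind. It assumes a little-endian machine (true for the Intel processors mentioned above), and the names <code>pack_bgr</code>, <code>argb_to_bgr_words</code> and <code>argb_to_bgr_bytes</code> are made up for illustration. It only shows the word-level shuffling; whether it actually beats the byte loop would need measuring.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte-wise reference: source bytes are A,R,G,B; destination bytes are B,G,R. */
static void argb_to_bgr_bytes(const uint8_t *src, uint8_t *dst, size_t pixel_count)
{
    for (size_t i = 0; i < pixel_count; i++) {
        dst[3 * i + 0] = src[4 * i + 3]; /* Blue  */
        dst[3 * i + 1] = src[4 * i + 2]; /* Green */
        dst[3 * i + 2] = src[4 * i + 1]; /* Red   */
    }
}

/* ASSUMPTION: little-endian. Loading bytes A,R,G,B as one 32-bit word gives
   A | R<<8 | G<<16 | B<<24. This returns B | G<<8 | R<<16, top byte zero. */
static uint32_t pack_bgr(uint32_t p)
{
    return (p >> 24) | ((p >> 8) & 0x0000FF00u) | ((p << 8) & 0x00FF0000u);
}

/* Four pixels per iteration: four 32-bit loads in, three 32-bit stores out.
   memcpy is used for the loads and stores so alignment and aliasing are safe. */
static void argb_to_bgr_words(const uint8_t *src, uint8_t *dst, size_t pixel_count)
{
    assert(pixel_count % 4 == 0); /* every image is a multiple of 4 pixels */

    for (size_t i = 0; i < pixel_count; i += 4) {
        uint32_t in[4], out[3];
        memcpy(in, src + 4 * i, sizeof in);

        uint32_t p0 = pack_bgr(in[0]);
        uint32_t p1 = pack_bgr(in[1]);
        uint32_t p2 = pack_bgr(in[2]);
        uint32_t p3 = pack_bgr(in[3]);

        out[0] = p0 | (p1 << 24);         /* B0 G0 R0 B1 */
        out[1] = (p1 >> 8) | (p2 << 16);  /* G1 R1 B2 G2 */
        out[2] = (p2 >> 16) | (p3 << 8);  /* R2 B3 G3 R3 */

        memcpy(dst + 3 * i, out, sizeof out);
    }
}
```

<p>The byte-wise function is only there as a reference to check the word-based version against.</p>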