RGB to grayscale conversion with ARM NEON
    <p>I'm trying to convert from RGB to grayscale efficiently, so I took a function from <a href="http://computer-vision-talks.com/2011/02/a-very-fast-bgra-to-grayscale-conversion-on-iphone/" rel="nofollow">here</a> that explains how to convert from RGBA to grayscale. Now I'm trying to do the same with plain RGB. I changed a few things, but it doesn't seem to work correctly, and I don't know why. Does anyone see my mistake?</p>
<pre><code>void neon_asm_convert(uint8_t * __restrict dest, uint8_t * __restrict src, int numPixels)
{
    __asm__ volatile(
        "lsr %2, %2, #3 \n"
        "# build the three constants: \n"
        "mov r4, #28 \n"                 // Blue channel multiplier
        "mov r5, #151 \n"                // Green channel multiplier
        "mov r6, #77 \n"                 // Red channel multiplier
        "vdup.8 d4, r4 \n"
        "vdup.8 d5, r5 \n"
        "vdup.8 d6, r6 \n"
        "0: \n"
        "# load 8 pixels: \n"            // RGBR
        "vld4.8 {d0-d3}, [%1]! \n"
        "# do the weight average: \n"
        "vmull.u8 q7, d0, d4 \n"
        "vmlal.u8 q7, d1, d5 \n"
        "vmlal.u8 q7, d2, d6 \n"
        "# shift and store: \n"
        "vshrn.u16 d7, q7, #8 \n"        // Divide q7 by 256 and store in d7
        "vst1.8 {d7}, [%0]! \n"
        "subs %2, %2, #1 \n"             // Decrement iteration count
        "# load 8 pixels: \n"
        "vld4.8 {d8-d11}, [%1]! \n"      // Other GBRG
        "# do the weight average: \n"
        "vmull.u8 q7, d3, d4 \n"
        "vmlal.u8 q7, d8, d5 \n"
        "vmlal.u8 q7, d9, d6 \n"
        "# shift and store: \n"
        "vshrn.u16 d7, q7, #8 \n"        // Divide q7 by 256 and store in d7
        "vst1.8 {d7}, [%0]! \n"
        "subs %2, %2, #1 \n"             // Decrement iteration count
        "# load 8 pixels: \n"
        "vld4.8 {d0-d3}, [%1]! \n"
        "# do the weight average: \n"
        "vmull.u8 q7, d10, d4 \n"
        "vmlal.u8 q7, d11, d5 \n"
        "vmlal.u8 q7, d0, d6 \n"
        "# shift and store: \n"
        "vshrn.u16 d7, q7, #8 \n"        // Divide q7 by 256 and store in d7
        "vst1.8 {d7}, [%0]! \n"
        "subs %2, %2, #1 \n"             // Decrement iteration count
        "# do the weight average: \n"
        "vmull.u8 q7, d1, d4 \n"
        "vmlal.u8 q7, d2, d5 \n"
        "vmlal.u8 q7, d3, d6 \n"
        "# shift and store: \n"
        "vshrn.u16 d7, q7, #8 \n"        // Divide q7 by 256 and store in d7
        "vst1.8 {d7}, [%0]! \n"
        "subs %2, %2, #1 \n"             // Decrement iteration count
        "bne 0b \n"                      // Repeat until iteration count reaches zero
        :
        : "r"(dest), "r"(src), "r"(numPixels)
        : "r4", "r5", "r6"
    );
}
</code></pre>
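
For comparison, here is a minimal sketch of the same weighted sum written with NEON intrinsics instead of inline assembly. It is not the linked article's code. It assumes tightly packed 3-byte pixels in R, G, B order, so it loads with `vld3_u8` (a three-way de-interleave) rather than the four-way `vld4.8` used for RGBA data. The function names `rgb_to_gray` and `rgb_to_gray_scalar` are made up for illustration. A scalar loop handles the leftover pixels and also serves as the fallback when NEON is not available.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Scalar reference using the question's fixed-point weights (77 + 151 + 28 = 256).
 * Assumption: pixels are packed as R, G, B bytes with no padding. */
static void rgb_to_gray_scalar(uint8_t *dest, const uint8_t *src, size_t numPixels)
{
    for (size_t i = 0; i < numPixels; ++i) {
        unsigned r = src[3 * i + 0];
        unsigned g = src[3 * i + 1];
        unsigned b = src[3 * i + 2];
        dest[i] = (uint8_t)((77u * r + 151u * g + 28u * b) >> 8);
    }
}

/* Hypothetical entry point: NEON for blocks of 8 pixels, scalar for the rest. */
void rgb_to_gray(uint8_t *dest, const uint8_t *src, size_t numPixels)
{
    size_t i = 0;
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    const uint8x8_t wr = vdup_n_u8(77);   /* red weight   */
    const uint8x8_t wg = vdup_n_u8(151);  /* green weight */
    const uint8x8_t wb = vdup_n_u8(28);   /* blue weight  */
    for (; i + 8 <= numPixels; i += 8) {
        /* Reads 24 bytes and splits them: val[0] = R, val[1] = G, val[2] = B. */
        uint8x8x3_t rgb = vld3_u8(src + 3 * i);
        uint16x8_t acc = vmull_u8(rgb.val[0], wr);   /* widen to 16 bits */
        acc = vmlal_u8(acc, rgb.val[1], wg);
        acc = vmlal_u8(acc, rgb.val[2], wb);
        vst1_u8(dest + i, vshrn_n_u16(acc, 8));      /* divide by 256, narrow */
    }
#endif
    /* Tail pixels, or the whole image on non-NEON builds. */
    rgb_to_gray_scalar(dest + i, src + 3 * i, numPixels - i);
}
```

With these weights a white pixel (255, 255, 255) maps to 255 and a pure red pixel (255, 0, 0) maps to 76, because (77 * 255) >> 8 = 76. Using intrinsics also leaves register allocation and clobber tracking to the compiler.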