Fast Gaussian Blur image filter with ARM NEON
<p>I'm trying to make a fast mobile version of a Gaussian blur image filter.</p>
<p>I've read other questions, like: <a href="https://stackoverflow.com/questions/9158818/fast-gaussian-blur-on-unsigned-char-image-arm-neon-intrinsics-ios-dev?answertab=votes#tab-top">Fast Gaussian blur on unsigned char image- ARM Neon Intrinsics- iOS Dev</a></p>
<p>For my purpose I only need a fixed-size (7x7), fixed-sigma (2) Gaussian filter.</p>
<p>So, before optimizing for ARM NEON, I'm implementing a 1D Gaussian kernel in C++ and comparing its performance with the OpenCV GaussianBlur() method directly in the mobile environment (Android with NDK). This way I end up with much simpler code to optimize.</p>
<p>However, my implementation turns out to be 10 times slower than the OpenCV4Android version. I've read that OpenCV4 Tegra has an optimized GaussianBlur implementation, but I don't think that standard OpenCV4Android has those kinds of optimizations, so why is my code so slow?</p>
<p>Here is my implementation (note: reflect101 is used for pixel reflection when applying the filter near borders):</p>
<pre><code>Mat myGaussianBlur(Mat src){
    Mat dst(src.rows, src.cols, CV_8UC1);
    Mat temp(src.rows, src.cols, CV_8UC1);
    float sum, x1, y1;

    // coefficients of 1D gaussian kernel with sigma = 2
    double coeffs[] = {0.06475879783, 0.1209853623, 0.1760326634, 0.1994711402, 0.1760326634, 0.1209853623, 0.06475879783};
    //Normalize coeffs
    float coeffs_sum = 0.9230247873f;
    for (int i = 0; i &lt; 7; i++){
        coeffs[i] /= coeffs_sum;
    }

    // filter vertically
    for(int y = 0; y &lt; src.rows; y++){
        for(int x = 0; x &lt; src.cols; x++){
            sum = 0.0;
            for(int i = -3; i &lt;= 3; i++){
                y1 = reflect101(src.rows, y - i);
                sum += coeffs[i + 3]*src.at&lt;uchar&gt;(y1, x);
            }
            temp.at&lt;uchar&gt;(y,x) = sum;
        }
    }

    // filter horizontally
    for(int y = 0; y &lt; src.rows; y++){
        for(int x = 0; x &lt; src.cols; x++){
            sum = 0.0;
            for(int i = -3; i &lt;= 3; i++){
                // reflect against the row length (src.cols), not src.rows
                x1 = reflect101(src.cols, x - i);
                sum += coeffs[i + 3]*temp.at&lt;uchar&gt;(y, x1);
            }
            dst.at&lt;uchar&gt;(y,x) = sum;
        }
    }

    return dst;
}
</code></pre>
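The reflect101 helper is not shown in the question. Below is a minimal sketch of what it might look like, assuming it mirrors out-of-range indices around the border pixel without repeating it (the same convention as OpenCV's BORDER_REFLECT_101). The signature reflect101(len, idx) is inferred from the calls in the code and is an assumption, not the author's actual code.

```cpp
// Hypothetical helper (assumed signature, inferred from the calls in the question).
// Maps any index into [0, len) by mirroring around the first and last pixel
// without duplicating them, e.g. for len = 5: ... 2 1 | 0 1 2 3 4 | 3 2 ...
static inline int reflect101(int len, int idx) {
    if (len == 1) return 0;              // a single pixel reflects onto itself
    while (idx < 0 || idx >= len) {
        if (idx < 0)
            idx = -idx;                  // mirror around index 0
        else
            idx = 2 * (len - 1) - idx;   // mirror around index len - 1
    }
    return idx;
}
```

With a 7-tap kernel the index is at most 3 pixels out of range, so for any image with at least 4 rows and columns the loop body runs at most once per call.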