StackOverflow2013

Drastic performance differences: debug vs release
<p>I have a simple algorithm which converts a Bayer image channel (BGGR, RGGB, GBRG, GRBG) to RGB (demosaicing, but without neighbors). In my implementation I have pre-set offset vectors which help me translate the Bayer channel index to its corresponding RGB channel indices. The only problem is that I'm getting awful performance in debug mode with MSVC11. Under release, for a 3264x2540 input the function completes in ~60ms. For the same input in debug, the function completes in ~20,000ms. That's more than a 300x difference, and since some developers are running my application in debug, it's unacceptable.</p>
<p>My code:</p>
<pre><code>void ConvertBayerToRgbImageDemosaic(int* BayerChannel, int* RgbChannel, int Width, int Height, ColorSpace ColorSpace)
{
    int rgbOffsets[4]; //translates a color's location in the Bayer block to its location in the RGB block. So R-&gt;0, G-&gt;1, B-&gt;2
    std::vector&lt;int&gt; bayerToRgbOffsets[4]; //the offsets from every color in the Bayer block to the (bayer) indices it will be copied to (R,B are copied to all indices, Gr to R and Gb to B).

    //calculate offsets according to color space
    switch (ColorSpace)
    {
    case ColorSpace::BGGR:
        /*
        B G
        G R
        */
        rgbOffsets[0] = 2; //B-&gt;2
        rgbOffsets[1] = 1; //G-&gt;1
        rgbOffsets[2] = 1; //G-&gt;1
        rgbOffsets[3] = 0; //R-&gt;0
        //B is copied to every pixel in its block
        bayerToRgbOffsets[0].push_back(0);
        bayerToRgbOffsets[0].push_back(1);
        bayerToRgbOffsets[0].push_back(Width);
        bayerToRgbOffsets[0].push_back(Width + 1);
        //Gb is copied to its neighbouring B
        bayerToRgbOffsets[1].push_back(-1);
        bayerToRgbOffsets[1].push_back(0);
        //Gr is copied to its neighbouring R
        bayerToRgbOffsets[2].push_back(0);
        bayerToRgbOffsets[2].push_back(1);
        //R is copied to every pixel in its block
        bayerToRgbOffsets[3].push_back(-Width - 1);
        bayerToRgbOffsets[3].push_back(-Width);
        bayerToRgbOffsets[3].push_back(-1);
        bayerToRgbOffsets[3].push_back(0);
        break;
    // ... other color spaces
    }

    for (auto row = 0; row &lt; Height; row++)
    {
        for (auto col = 0, bayerIndex = row * Width; col &lt; Width; col++, bayerIndex++)
        {
            auto colorIndex = (row%2)*2 + (col%2); //0...3, for example in BGGR: 0-&gt;B, 1-&gt;Gb, 2-&gt;Gr, 3-&gt;R
            //iteration over bayerToRgbOffsets is O(1) since it is sized either 2 or 4.
            std::for_each(bayerToRgbOffsets[colorIndex].begin(), bayerToRgbOffsets[colorIndex].end(),
                [&amp;](int colorOffset)
                {
                    //rgbOffsets is indexed by the Bayer color of the current pixel
                    auto rgbIndex = (bayerIndex + colorOffset) * 3 + rgbOffsets[colorIndex];
                    RgbChannel[rgbIndex] = BayerChannel[bayerIndex];
                });
        }
    }
}
</code></pre>
<p>What I've tried: I tried turning on optimization (/O2) for the debug build, with no significant difference. I tried replacing the inner <code>for_each</code> statement with a plain old <code>for</code> loop, but to no avail. I have a very similar algorithm which converts Bayer to "green" RGB (without copying the data to neighboring pixels in the block) in which I'm not using <code>std::vector</code>, and there I see the expected runtime difference between debug and release (2x-3x). So, could the <code>std::vector</code> be the problem? If so, how do I overcome it?</p>
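<p>One way to test the <code>std::vector</code> hypothesis is to keep the same offset tables but store them in plain fixed-size arrays, so the inner loop touches no container iterators at all. MSVC debug builds enable checked iterators by default (<code>_ITERATOR_DEBUG_LEVEL</code> is 2), which may be where the time goes, though that is only a guess until it is measured. The following is a minimal sketch for BGGR only, assuming even image dimensions; <code>OffsetList</code> and <code>ConvertBggrToRgbSketch</code> are hypothetical names that are not part of the original code.</p>

```cpp
#include <cassert>

// Hypothetical fixed-size stand-in for one std::vector<int> of offsets.
// A Bayer color has at most four copy targets, so four slots are enough.
struct OffsetList {
    int count;       // number of valid entries in offsets
    int offsets[4];  // offsets (in Bayer indices) of the pixels to write to
};

// Sketch for the BGGR layout only; other color spaces would fill the
// two tables differently. Assumes Width and Height are even.
void ConvertBggrToRgbSketch(const int* bayer, int* rgb, int width, int height)
{
    assert(width % 2 == 0 && height % 2 == 0);

    // Channel written by each Bayer color: B->2, Gb->1, Gr->1, R->0.
    const int rgbOffsets[4] = { 2, 1, 1, 0 };

    // Same offsets the original pushes into the vectors.
    const OffsetList targets[4] = {
        { 4, { 0, 1, width, width + 1 } },    // B: every pixel in its block
        { 2, { -1, 0, 0, 0 } },               // Gb: neighbouring B and itself
        { 2, { 0, 1, 0, 0 } },                // Gr: itself and neighbouring R
        { 4, { -width - 1, -width, -1, 0 } }  // R: every pixel in its block
    };

    for (int row = 0; row < height; ++row) {
        for (int col = 0; col < width; ++col) {
            const int bayerIndex = row * width + col;
            const int colorIndex = (row % 2) * 2 + (col % 2); // 0..3
            const OffsetList& list = targets[colorIndex];
            for (int i = 0; i < list.count; ++i) {
                const int rgbIndex =
                    (bayerIndex + list.offsets[i]) * 3 + rgbOffsets[colorIndex];
                rgb[rgbIndex] = bayer[bayerIndex];
            }
        }
    }
}
```

<p>Calling it on a single 2x2 block with B=10, Gb=20, Gr=30, R=40 should give every pixel R=40 and B=10, with G=20 in the top row and G=30 in the bottom row. If this version shows only the usual 2x-3x gap between debug and release, that would point at the debug-mode container checks rather than the algorithm itself.</p>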
