Note that there are some explanatory texts on larger screens.

plurals
  1. POC++ performance: checking a block of memory for having specific values in specific cells
    text
    copied!<p>I'm doing a research on 2D Bin Packing algorithms. I've asked <a href="https://stackoverflow.com/questions/4904049/php-array-performance">similar question</a> regarding PHP's performance - it was too slow to pack - and now the code is converted to C++.</p> <p>It's still pretty slow. What my program does is consequently allocating blocks of dynamic memory and populating them with a character 'o'</p> <pre><code>char* bin; bin = new (nothrow) char[area]; if (bin == 0) { cout &lt;&lt; "Error: " &lt;&lt; area &lt;&lt; " bytes could not be allocated"; return false; } for (int i=0; i&lt;area; i++) { bin[i]='o'; } </code></pre> <p>(their size is between 1kb and 30kb for my datasets) </p> <p>Then the program checks different combinations of 'x' characters inside of current memory block. </p> <pre><code>void place(char* bin, int* best, int width) { for (int i=best[0]; i&lt;best[0]+best[1]; i++) for (int j=best[2]; j&lt;best[2]+best[3]; j++) bin[i*width+j] = 'x'; } </code></pre> <p>One of the functions that checks the non-overlapping gets called millions of times during a runtime.</p> <pre><code>bool fits(char* bin, int* pos, int width) { for (int i=pos[0]; i&lt;pos[0]+pos[1]; i++) for (int j=pos[2]; j&lt;pos[2]+pos[3]; j++) if (bin[i*width+j] == 'x') return false; return true; } </code></pre> <p>All other stuff takes only a percent of the runtime, so I need to make these two guys (fits and place) faster. Who's the culprit?</p> <p>Since I only have two options 'x' and 'o', I could try to use just one bit instead of the whole byte the char takes. But I'm more concerned with the speed, you think it would make the things faster?</p> <p>Thanks!</p> <p>Update: I replaced <code>int* pos</code> with <code>rect pos</code> (the same for <code>best</code>), as MSalters suggested. At first I saw improvement, but I tested more with bigger datasets and it seems to be back to normal runtimes. I'll try other techniques suggested and will keep you posted.</p> <p>Update: using <code>memset</code> and <code>memchr</code> sped up things about twice. Replacing 'x' and 'o' with '\1' and '\0' didn't show any improvement. <code>__restrict</code> wasn't helpful either. Overall, I'm satisfied with the performance of the program now since I also made some improvements to the algorithm itself. I'm yet to try using a bitmap and compiling with -02 (-03)... Thanks again everybody.</p>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload