Proper parallelization using parfor
<p>I have written some image processing code in MATLAB that I'd like to speed up using parallel processing. I picked the task that takes longest: applying a Gaussian blur to the image. With the help of the File Exchange I already have a Gaussian blur that is faster than imfilter(). However, it still does not scale. Here is my test code:</p>
<pre><code>clear all
clc

image_paths = dir('C:\pics\Baustahl\3Bleche(3)\*.png');
image_paths = sort({image_paths.name});
count = 600;
Img = cell(1, count);
Mean = cell(1, count);

% Load the first channel of each image and scale it to [0, 1]
for i = 1:count
    Img{i} = imread(['C:\pics\Baustahl\3Bleche(3)\' image_paths{i}]);
    Img{i} = Img{i}(:,:,1);
    Img{i} = single(Img{i})./255;
end
clear vars bilder

fprintf(1, 'Starting processing....\n');
starttime = tic;
% Blur every image; the second parfor argument caps the number of workers
parfor (i = 1:count, 4)
    Mean{i} = imgaussian(Img{i}, 25, 81);
end
elapsedtime = toc(starttime);
fprintf(1, 'Finished processing. (%d Files in %.1fs, %.1f files/second)\n', count, elapsedtime, count/elapsedtime);
fprintf(1, '\n');
clear vars Img Mean
</code></pre>
<p>My system has a Q6600 and 4GB of RAM (which is sufficient), and if I restrict MATLAB to one core I get:</p>
<blockquote> <p>Finished processing. (600 Files in 12.0s, 49.8 files/second)</p> </blockquote>
<p>Two cores:</p>
<blockquote> <p>Finished processing. (600 Files in 7.5s, 80.3 files/second)</p> </blockquote>
<p>Using all four cores I get the following:</p>
<blockquote> <p>Finished processing. (600 Files in 5.7s, 104.7 files/second)</p> </blockquote>
<p>That's a speedup of about two, although the performance should quadruple. Every iteration is independent of the others, so this is really well suited for parallel processing. <strong>Why does it scale so badly?</strong></p>
<p>Things I have tried:</p>
<ol>
<li>I installed OpenCV and the EmguCV wrapper for C#. Using a standard for-loop I get ~52 files/second; using Parallel.ForEach() I get ~200 files/second (as I expected).</li>
<li><p>Using a different number of files. This does not change the files/second once I choose enough files:</p> <blockquote> <p>20 Files in 0.3s, 72.7 files/second<br> 60 Files in 0.6s, 94.3 files/second<br> 200 Files in 1.9s, 105.3 files/second</p> </blockquote></li>
<li><p>Using a normal for-loop in MATLAB changes nothing. Apparently it does the same amount of parallelization (auto-magically) as the parfor-loop.</p></li>
<li>Explicitly specifying that it should use 4 worker threads (see above) does nothing at all. (Setting it to 1 does not even degrade performance.)</li>
</ol>
<p>The files are approx. 640x480 grayscale and have similar contents.</p>
<p>Any thoughts?</p>
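<p>To put the scaling into numbers, here is a rough calculation of speedup and parallel efficiency relative to the one-core run, using only the throughput figures quoted above. The symbols r_n (files/second on n cores), S_n (speedup) and E_n (efficiency) are notation introduced here for illustration, not part of the measurements:</p>
<pre><code>\[ S_n = \frac{r_n}{r_1}, \qquad E_n = \frac{S_n}{n} \]
\[ S_2 = \frac{80.3}{49.8} \approx 1.61, \qquad E_2 \approx 0.81 \]
\[ S_4 = \frac{104.7}{49.8} \approx 2.10, \qquad E_4 \approx 0.53 \]
</code></pre>
<p>So going from two to four cores adds only about 0.5x of extra speedup, which is the part that looks wrong for fully independent iterations.</p>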