Dataset array indexing is very slow with Statistics Toolbox
<p>Why is indexing into a dataset array so slow? A peek into the dataset.subsref function shows that all the columns of the dataset are stored in a cell array. Yet plain cell indexing is much, much faster than dataset indexing, even though dataset indexing is just indexing into a cell array under the hood. My guess is that this has to do with overhead in MATLAB OOP. Any ideas on how to speed this up?</p>
<pre class="lang-matlab prettyprint-override"><code>%% Using R2011a, PCWIN64
feature accel off;  % turn off JIT

dat = (1:1e6)';
dat2 = repmat({'abc'}, 1e6, 1);
celldat = {dat dat2};
ds = dataset(dat, dat2);
N = 1e2;

% Time direct cell indexing
tic;
for j = 1:N
    tmp = celldat{2};
end
toc;

% Time dataset indexing
tic;
for j = 1:N
    tmp2 = ds.dat2;  % 2.778sec spent on line 262 of dataset.subsref
end
toc;

feature accel on;  % turn JIT back on
</code></pre>
<pre class="lang-none prettyprint-override"><code>Elapsed time is 0.000165 seconds.
Elapsed time is 2.778995 seconds.
</code></pre>
<p><strong>EDIT:</strong> I've updated the example to be more like the problem I'm seeing. A huge amount of time is spent on line 262 of dataset.subsref - "b = a.data{varIndex};". It's very strange to me since it is a simple cell dereference. I'm wondering if there is an OOP trick that will allow me to index into "a.data" without the strange overhead.</p>
<p><strong>EDIT2:</strong> As per Andrew's suggestion, I've submitted this as a bug to MathWorks. Will update if I hear anything from them.</p>
<p><strong>EDIT3:</strong> MathWorks responded and said they are aware of the problem now and will fix it in a future release. They noted that the problem is specific to cell arrays and suggested avoiding them if possible.</p>
 
