<p>Many thanks to everyone for the help provided. I've been thinking a lot about everybody's input, and I believe that for the simulation setup the following is the best solution:</p>
<pre><code>SetAttributes[linkedList, HoldAllComplete];
temporarySymbols = linkedList[];

SetAttributes[bookmarkSymbol, Listable];
bookmarkSymbol[symbol_] :=
  With[{old = temporarySymbols}, temporarySymbols = linkedList[old, symbol]];

registerConfiguration[index_] := registerConfiguration[index] =
  Module[{cs = linkedList[], bookmarkConfiguration, accumulator},
    (* remember the symbols we generate so we can remove them later *)
    bookmarkSymbol[{cs, bookmarkConfiguration, accumulator}];
    getCs[index] := List @@ Flatten[cs, Infinity, linkedList];
    getCsAndFreqs[index] := {getCs[index], accumulator /@ getCs[index]};
    accumulator[_] = 0;
    bookmarkConfiguration[c_] := bookmarkConfiguration[c] =
      With[{oldCs = cs}, cs = linkedList[oldCs, c]];
    Function[c,
      bookmarkConfiguration[c];
      accumulator[c]++;
    ]
  ]

pattern = Verbatim[RuleDelayed][
  Verbatim[HoldPattern][HoldPattern[registerConfiguration[_Integer]]], _];

clearSimulationData := Block[{symbols},
  DownValues[registerConfiguration] =
    DeleteCases[DownValues[registerConfiguration], pattern];
  symbols = List @@ Flatten[temporarySymbols, Infinity, linkedList];
  (*Print["symbols to purge: ", symbols];*)
  ClearAll /@ symbols;
  temporarySymbols = linkedList[];
]
</code></pre>
<p>It is based on Leonid's solution from one of the previous posts, extended with belsairus' suggestion to include extra indexing for configurations that have already been processed. The previous approaches are adapted so that configurations can be registered and extracted naturally, using more or less the same code. This kills two birds with one stone, since bookkeeping and retrieval are strongly interrelated.</p>
<p>This approach will work better when one wants to add simulation data incrementally (all curves are normally noisy, so one has to add runs incrementally to obtain good plots).
The sparse array approach will work better when data are generated in one go and then analyzed, but I do not recall ever having been in that situation myself.</p>
<p>Also, I was rather naive in thinking that data extraction and generation could be treated separately. In this particular case it seems one should keep both perspectives in mind. I profoundly apologise for bluntly dismissing earlier suggestions in this direction (there were a few implicit ones).</p>
<p>There are some minor open problems that I do not know how to handle. For example, when clearing the symbols I cannot clear headers like accumulator$164; I can only clear the subvalues associated with them. I have no clue why. Also, if <code>With[{oldCs=cs}, cs = linkedList[oldCs, c]];</code> is changed into something like <code>cs = linkedList[cs, c];</code>, configurations are not stored. I have no clue why the second option does not work either. But these minor problems are well-defined satellite issues that one can address in the future. By and large, the problem seems solved thanks to the generous help from all involved.</p>
<p>Many thanks again for all the help.</p>
<p>Regards Zoran</p>
<p>p.s. There are some timings, but to understand what is going on I will append the code used for benchmarking. In brief, the idea is to generate lists of configurations and simply Map through them, invoking registerConfiguration. This essentially simulates the data generation process.
Here is the code used for testing:</p>
<pre><code>fillSimulationData[sampleArg_] :=
  MapIndexed[registerConfiguration[#2[[1]]][#1] &amp;, sampleArg, {2}];

sampleForIndex[index_] := Block[{nsamples, min, max},
  min = Max[1, Floor[(9/10) maxSamplesPerIndex]];
  max = maxSamplesPerIndex;
  nsamples = RandomInteger[{min, max}];
  RandomInteger[{1, 10}, {nsamples, ntypes}]
];

generateSample := Table[sampleForIndex[index], {index, 1, nindexes}];

measureGetCsTime :=
  ((First @ Timing[getCs[#]]) &amp; /@ Range[1, nindexes]) // Max

measureGetCsAndFreqsTime :=
  ((First @ Timing[getCsAndFreqs[#]]) &amp; /@ Range[1, nindexes]) // Max

reportSampleLength[sampleArg_] :=
  StringForm["Total number of confs = ``, smallest accumulator length ``, largest accumulator length = ``",
    Sequence @@ {Total[#], Min[#], Max[#]} &amp; [Length /@ sampleArg]]
</code></pre>
<p>The first example is relatively modest:</p>
<pre><code>clearSimulationData;
nindexes = 100; maxSamplesPerIndex = 1000; ntypes = 2;
largeSample1 = generateSample;
reportSampleLength[largeSample1]
(* Total number of confs = 94891, smallest accumulator length 900, largest accumulator length = 1000 *)
First @ Timing @ fillSimulationData[largeSample1]
</code></pre>
<p>gives 1.375 secs, which I think is fast.</p>
<pre><code>With[{times = Table[measureGetCsTime, {50}]},
  ListPlot[times, Joined -&gt; True, PlotRange -&gt; {0, Max[times]}]]
</code></pre>
<p>gives times around 0.016 secs, and</p>
<pre><code>With[{times = Table[measureGetCsAndFreqsTime, {50}]},
  ListPlot[times, Joined -&gt; True, PlotRange -&gt; {0, Max[times]}]]
</code></pre>
<p>gives the same times.
Now for the real killer:</p>
<pre><code>nindexes = 10; maxSamplesPerIndex = 100000; ntypes = 10;
largeSample3 = generateSample;
largeSample3 // Short
(* {{{2,2,1,5,1,3,7,9,8,2},92061,{3,8,6,4,9,9,7,8,7,2}},8,{{4,10,1,5,9,8,8,10,8,6},95498,{3,8,8}}} *)
</code></pre>
<p>reported as</p>
<pre><code>Total number of confs = 933590, smallest accumulator length 90760, largest accumulator length = 96876
</code></pre>
<p>gives generation times of about 1.969 to 2.016 secs, which is unbelievably fast. I mean, this amounts to going through a gigantic list of about one million elements and applying a function to each element.</p>
<p>The extraction times for configs and {configs, freqs} are roughly 0.015 and 0.03 secs respectively.</p>
<p>To me this is mind-blowing speed that I would never have expected from Mathematica!</p>
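<p>On the open question of why <code>cs = linkedList[cs, c]</code> does not store configurations: a likely explanation, assuming the cause is the <code>HoldAllComplete</code> attribute set on <code>linkedList</code>, is that the inner <code>cs</code> is never evaluated, so the list ends up holding the symbol <code>cs</code> itself rather than its previous value. <code>With</code> sidesteps this by evaluating the old value first and substituting it into the held expression. A minimal sketch of the difference (the names <code>csHeld</code> and <code>csInjected</code> are hypothetical and exist only for this illustration):</p>
<pre><code>SetAttributes[linkedList, HoldAllComplete];

(* Direct assignment: the csHeld inside linkedList stays unevaluated *)
csHeld = linkedList[];
csHeld = linkedList[csHeld, {1, 2}];
csHeld = linkedList[csHeld, {3, 4}];
csHeld
(* linkedList[csHeld, {3, 4}] : the {1, 2} entry has been lost *)

(* With evaluates csInjected first, then substitutes that value *)
csInjected = linkedList[];
With[{old = csInjected}, csInjected = linkedList[old, {1, 2}]];
With[{old = csInjected}, csInjected = linkedList[old, {3, 4}]];
List @@ Flatten[csInjected, Infinity, linkedList]
(* should give the stored entries in insertion order: {{1, 2}, {3, 4}} *)
</code></pre>
<p>If this is indeed the cause, wrapping the inner symbol in <code>Evaluate</code> would not help either, since <code>HoldAllComplete</code> also shields its arguments from <code>Evaluate</code>; the <code>With</code> form used in the code above remains the simplest way to inject the value.</p>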
 
