<p><em><strong>Summary</strong></em>: <strong>The time difference is explained by the time it takes to allocate the arrays</strong>. The last array allocated with calloc takes just a bit more time to allocate, whereas the others (or all of them when using mmap) take virtually no time. The real allocation in memory is probably deferred until the memory is first accessed.</p>
<p>I don't know enough about the internals of memory allocation on Linux, but I ran a slightly modified version of your script: I added a third array and some extra iterations per array operation. I also took into account Old Pro's remark that the time to allocate the arrays was not being measured.</p>
<p>Conclusion: allocating with calloc takes longer than allocating with mmap (mmap takes virtually no time when you allocate the memory; the work is probably postponed until the memory is first accessed), and with my program there is almost no difference in overall execution time between mmap and calloc.</p>
<p>First remark: in both cases the memory is allocated in the memory mapping region and not in the heap. To verify this, I added a quick-and-dirty pause so you can check the memory mapping of the process (/proc/&lt;pid&gt;/maps).</p>
<p>Now to your question: the last array allocated with calloc seems to be really allocated in memory (not postponed). arr1 and arr2 now behave exactly the same (the first iteration is slow, subsequent iterations are faster), while arr3 is faster on the first iteration because its memory was allocated earlier. When the A macro is defined, it is arr1 that benefits from this. My guess is that the kernel has preallocated the memory for the last calloc. Why? I don't know... I also tested with only one array (removing all occurrences of arr2 and arr3), and then I get roughly the same time for all 10 iterations of arr1.</p>
<p>Both malloc and mmap behave the same (results not shown below): the first iteration is slow and subsequent iterations are faster, for all 3 arrays.</p>
<p>Note: all results were consistent across the various gcc optimisation flags (-O0 to -O3), so the behaviour does not seem to come from some kind of gcc optimisation.</p>
<p>Note2: Tests were run on Ubuntu Precise Pangolin (kernel 3.2) with GCC 4.6.3.</p>
<pre><code>#include &lt;stdlib.h&gt;
#include &lt;stdio.h&gt;
#include &lt;sys/mman.h&gt;
#include &lt;time.h&gt;

#define SIZE 500002816
#define ITERATION 10

/* Select the allocation routine at compile time:
   -DUSE_MMAP, -DUSE_MALLOC or -DUSE_CALLOC. */
#if defined(USE_MMAP)
#  define ALLOC(a, b) (mmap(NULL, (a) * (b), PROT_READ | PROT_WRITE, \
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0))
#elif defined(USE_MALLOC)
#  define ALLOC(a, b) (malloc((b) * (a)))
#elif defined(USE_CALLOC)
#  define ALLOC calloc
#else
#  error "No alloc routine specified"
#endif

int main(void)
{
  clock_t start, finish, gstart, gfinish;
  start = clock();
  gstart = start;
  /* Defining A reverses the allocation order, so arr3 is allocated last. */
#ifdef A
  unsigned int *arr1 = ALLOC(sizeof(unsigned int), SIZE);
  unsigned int *arr2 = ALLOC(sizeof(unsigned int), SIZE);
  unsigned int *arr3 = ALLOC(sizeof(unsigned int), SIZE);
#else
  unsigned int *arr3 = ALLOC(sizeof(unsigned int), SIZE);
  unsigned int *arr2 = ALLOC(sizeof(unsigned int), SIZE);
  unsigned int *arr1 = ALLOC(sizeof(unsigned int), SIZE);
#endif
  finish = clock();
  unsigned int i, j;
  double intermed, finalres;

  /* Time spent in the three allocation calls. */
  intermed = ((double)(finish - start))/CLOCKS_PER_SEC;
  printf("Time to create: %.2f\n", intermed);

  printf("arr1 addr: %p\narr2 addr: %p\narr3 addr: %p\n",
         (void *)arr1, (void *)arr2, (void *)arr3);

  /* Fill arr1 ITERATION times, timing each pass. */
  finalres = 0;
  for (j = 0; j &lt; ITERATION; j++)
  {
    start = clock();
    {
      for (i = 0; i &lt; SIZE; i++)
        arr1[i] = (i + 13) * 5;
    }
    finish = clock();

    intermed = ((double)(finish - start))/CLOCKS_PER_SEC;
    finalres += intermed;
    printf("Time A: %.2f\n", intermed);
  }
  printf("Time A (average): %.2f\n", finalres/ITERATION);

  /* Same for arr2. */
  finalres = 0;
  for (j = 0; j &lt; ITERATION; j++)
  {
    start = clock();
    {
      for (i = 0; i &lt; SIZE; i++)
        arr2[i] = (i + 13) * 5;
    }
    finish = clock();

    intermed = ((double)(finish - start))/CLOCKS_PER_SEC;
    finalres += intermed;
    printf("Time B: %.2f\n", intermed);
  }
  printf("Time B (average): %.2f\n", finalres/ITERATION);

  /* Same for arr3. */
  finalres = 0;
  for (j = 0; j &lt; ITERATION; j++)
  {
    start = clock();
    {
      for (i = 0; i &lt; SIZE; i++)
        arr3[i] = (i + 13) * 5;
    }
    finish = clock();

    intermed = ((double)(finish - start))/CLOCKS_PER_SEC;
    finalres += intermed;
    printf("Time C: %.2f\n", intermed);
  }
  printf("Time C (average): %.2f\n", finalres/ITERATION);

  gfinish = clock();

  intermed = ((double)(gfinish - gstart))/CLOCKS_PER_SEC;
  printf("Global Time: %.2f\n", intermed);

  return 0;
}
</code></pre>
<p>Results:</p>
<p>Using USE_CALLOC</p>
<blockquote>
<pre><code>Time to create: 0.13
arr1 addr: 0x7fabcb4a6000
arr2 addr: 0x7fabe917d000
arr3 addr: 0x7fac06e54000
Time A: 0.67
Time A: 0.48
...
Time A: 0.47
Time A (average): 0.48
Time B: 0.63
Time B: 0.47
...
Time B: 0.48
Time B (average): 0.48
Time C: 0.45
...
Time C: 0.46
Time C (average): 0.46
</code></pre>
</blockquote>
<p>With USE_CALLOC and A</p>
<blockquote>
<pre><code>Time to create: 0.13
arr1 addr: 0x7fc2fa206010
arr2 addr: 0x7fc2dc52e010
arr3 addr: 0x7fc2be856010
Time A: 0.44
...
Time A: 0.43
Time A (average): 0.45
Time B: 0.65
Time B: 0.47
...
Time B: 0.46
Time B (average): 0.48
Time C: 0.65
Time C: 0.48
...
Time C: 0.45
Time C (average): 0.48
</code></pre>
</blockquote>
<p>Using USE_MMAP</p>
<blockquote>
<pre><code>Time to create: 0.0
arr1 addr: 0x7fe6332b7000
arr2 addr: 0x7fe650f8e000
arr3 addr: 0x7fe66ec65000
Time A: 0.55
Time A: 0.48
...
Time A: 0.45
Time A (average): 0.49
Time B: 0.54
Time B: 0.46
...
Time B: 0.49
Time B (average): 0.50
Time C: 0.57
...
Time C: 0.40
Time C (average): 0.43
</code></pre>
</blockquote>
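<p>To look at the deferred-allocation effect in isolation, here is a minimal sketch, separate from the measured program above. It maps an anonymous region, then times the mmap call itself, a first pass that writes one byte per page, and a second identical pass. The names touch_pages, seconds_between and time_lazy_mapping, and the LAZY_DEMO_MAIN guard, are hypothetical and chosen only for illustration. The 4096-byte page-size fallback is an assumption. If the kernel really postpones the work until first access, I would expect the first pass to be the slow one, but the sketch does not assert that.</p>

```c
#define _DEFAULT_SOURCE            /* exposes MAP_ANONYMOUS on glibc in strict modes */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

/* Write one byte per page so that every page of the region is touched.
   volatile keeps the compiler from dropping the repeated stores. */
static void touch_pages(volatile unsigned char *buf, size_t len, size_t page_size)
{
    for (size_t off = 0; off < len; off += page_size)
        buf[off] = 1;
}

/* CPU seconds elapsed between two clock() readings. */
static double seconds_between(clock_t start, clock_t finish)
{
    return (double)(finish - start) / CLOCKS_PER_SEC;
}

/*
 * Map len bytes anonymously and report the CPU time of the mmap call,
 * of the first pass over the pages, and of a second pass.
 * Returns 0 on success, -1 if mmap fails or the sanity check fails.
 */
static int time_lazy_mapping(size_t len, double *t_map,
                             double *t_first, double *t_second)
{
    assert(len >= 2);

    long ps = sysconf(_SC_PAGESIZE);
    size_t page_size = ps > 1 ? (size_t)ps : 4096;   /* fallback is an assumption */

    clock_t t0 = clock();
    unsigned char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    clock_t t1 = clock();
    if (buf == MAP_FAILED)
        return -1;

    touch_pages(buf, len, page_size);   /* first access to every page */
    clock_t t2 = clock();
    touch_pages(buf, len, page_size);   /* pages have already been accessed once */
    clock_t t3 = clock();

    *t_map = seconds_between(t0, t1);
    *t_first = seconds_between(t1, t2);
    *t_second = seconds_between(t2, t3);

    /* Anonymous mappings start zero-filled; byte 1 was never written. */
    int ok = (buf[0] == 1 && buf[1] == 0);
    munmap(buf, len);
    return ok ? 0 : -1;
}

#ifdef LAZY_DEMO_MAIN   /* hypothetical guard: build with -DLAZY_DEMO_MAIN */
int main(void)
{
    double t_map, t_first, t_second;

    if (time_lazy_mapping((size_t)256 << 20, &t_map, &t_first, &t_second) != 0)
        return 1;
    printf("mmap call:    %.4f\n", t_map);
    printf("first touch:  %.4f\n", t_first);
    printf("second touch: %.4f\n", t_second);
    return 0;
}
#endif
```

<p>Built with the guard defined (for example gcc -O2 -DLAZY_DEMO_MAIN), this prints the three timings for a 256 MiB region. Like the program above, it uses clock(), so the numbers are CPU time, which includes the kernel time spent handling page faults.</p>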