
<p>Valgrind can actually do a quite precise measurement for you. All you need is to write the simplest possible example that calls your function.</p>
<p>For example, a program that just prints its arguments (passed to the <code>main()</code> function) using a for-loop and <code>std::cout</code> produces the following output:</p>
<pre><code>zaufi@gentop /work/tests $ valgrind --tool=drd --show-stack-usage=yes ./stack-usage-test-1
==26999== drd, a thread error detector
==26999== Copyright (C) 2006-2012, and GNU GPL'd, by Bart Van Assche.
==26999== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==26999== Command: ./stack-usage-test-1
==26999==
./stack-usage-test-1
==26999== thread 1 finished and used 11944 bytes out of 8388608 on its stack. Margin: 8376664 bytes.
==26999==
==26999== For counts of detected and suppressed errors, rerun with: -v
==26999== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
</code></pre>
<p>As you can see, the only thread uses almost 12K of stack. Most of that space was certainly "wasted" <strong>before</strong> <code>main()</code> was entered. To get a better measurement, run the target function in a separate thread. Something like this:</p>
<pre><code>#include <iostream>
#include <thread>

int main(int argc, char* argv[])
{
    // Run the code under measurement in its own thread so that
    // DRD reports its stack usage separately from the main thread.
    auto thr = std::thread([](){std::cout << __PRETTY_FUNCTION__ << std::endl;});
    thr.join();
    return 0;
}
</code></pre>
<p>This code produces the following output:</p>
<pre><code>==27029== thread 2 finished and used 1840 bytes out of 8384512 on its stack. Margin: 8382672 bytes.
==27029== thread 1 finished and used 11992 bytes out of 8388608 on its stack. Margin: 8376616 bytes.
</code></pre>
<p>That is definitely better. By measuring a function that does nothing, you get the <em>minimum stack usage</em> (about 1840 bytes in the last example). So if you call your target function in a separate thread, you have to subtract 1840 bytes (or even less) from the result.</p>
<hr>
<p>You can do almost the same yourself using the following simple algorithm:</p>
<ol>
<li>Allocate an 8M buffer on the heap (the default stack size for Linux/pthreads, but any other reasonable size will do).</li>
<li>Fill it with some pattern.</li>
<li>Start a new thread whose stack is the buffer you just allocated and filled (using <code>pthread_attr_setstack()</code> or friends).</li>
<li>Within that thread, call your target function as early as possible and exit.</li>
<li>In the main thread, after <code>pthread_join()</code> succeeds, scan the buffer to find the area where the pattern you wrote has been overwritten.</li>
</ol>
<p>Even in that case, it is better to first measure a thread that does nothing, just to get the minimum usage as above.</p>
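<p>The do-it-yourself algorithm above could be sketched roughly as follows. This is only a minimal illustration, not a tested tool. The names <code>kStackSize</code>, <code>kPattern</code>, <code>target_function</code>, <code>thread_entry</code>, <code>empty_entry</code> and <code>measure_stack_usage</code> are made up for the example. It assumes Linux with pthreads, a 4096-byte page size, and a stack that grows towards lower addresses. Depending on the C library, some per-thread bookkeeping may also be placed inside the supplied buffer, which is one more reason to take a baseline with a thread that does nothing.</p>

```cpp
#include <pthread.h>

#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical constants for this sketch.
constexpr std::size_t kStackSize = 8u * 1024 * 1024;  // 8M, as in step 1
constexpr unsigned char kPattern = 0xA5;              // arbitrary fill byte

// Stand-in for the function being measured: it touches a 4 KB local buffer.
void target_function() {
    volatile char scratch[4096];
    for (std::size_t i = 0; i < sizeof scratch; ++i) scratch[i] = 1;
}

// Thread entry that calls the target right away and exits (step 4).
void* thread_entry(void*) {
    target_function();
    return nullptr;
}

// Thread entry that does nothing, used to obtain the baseline.
void* empty_entry(void*) { return nullptr; }

// Runs `entry` on a pattern-filled stack and returns the number of bytes
// between the lowest overwritten address and the top of the buffer.
// Returns 0 if allocation or thread setup fails.
std::size_t measure_stack_usage(void* (*entry)(void*)) {
    void* mem = nullptr;
    // Step 1: page-aligned buffer (4096 is an assumed page size).
    if (posix_memalign(&mem, 4096, kStackSize) != 0) return 0;
    // Step 2: fill it with the pattern.
    std::memset(mem, kPattern, kStackSize);

    pthread_attr_t attr;
    pthread_attr_init(&attr);
    pthread_t tid;
    std::size_t used = 0;
    // Steps 3 and 4: run the thread on the prepared buffer and wait for it.
    if (pthread_attr_setstack(&attr, mem, kStackSize) == 0 &&
        pthread_create(&tid, &attr, entry, nullptr) == 0 &&
        pthread_join(tid, nullptr) == 0) {
        // Step 5: the stack grows downwards, so scan from the lowest
        // address until the first byte that no longer holds the pattern.
        const unsigned char* p = static_cast<const unsigned char*>(mem);
        std::size_t untouched = 0;
        while (untouched < kStackSize && p[untouched] == kPattern) ++untouched;
        used = kStackSize - untouched;
    }
    pthread_attr_destroy(&attr);
    std::free(mem);
    return used;
}
```

<p>Calling <code>measure_stack_usage(thread_entry)</code> and subtracting <code>measure_stack_usage(empty_entry)</code> gives an estimate for the target function alone. The scan can slightly underestimate usage if the deepest bytes written happen to equal the fill pattern.</p>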
