std::promise::set_value on FIFO thread pinned to a core doesn't wake std::future
<p>I am trying to create a system which has a deterministic realtime response.</p>

<p>I create a number of <a href="http://code.google.com/p/cpuset/" rel="nofollow" title="cpusets on google code"><code>cpusets</code></a>, move all non-critical tasks and unpinned kernel threads to one set, and then pin each of my realtime threads to its own cpuset, each of which consists of a single cpu.</p>

<pre><code># non-critical tasks and unpinned kernel threads
cset proc --move --fromset=root --toset=system
cset proc --kthread --fromset=root --toset=system

# realtime threads
cset proc --move --toset=shield/RealtimeTest1/thread1 --pid=17651
cset proc --move --toset=shield/RealtimeTest1/thread2 --pid=17654
</code></pre>

<p>My scenario is this:</p>

<ul>
<li>Thread 1: <code>SCHED_OTHER</code>, pinned to <code>set1</code>, waiting on <code>std::future&lt;void&gt;</code></li>
<li>Thread 2: <code>SCHED_FIFO</code>, pinned to <code>set2</code>, calls <code>std::promise&lt;void&gt;::set_value()</code></li>
</ul>

<p><strong>Thread 1 blocks forever.</strong> However, <strong>if I change Thread 2 to be <code>SCHED_OTHER</code></strong>, Thread 1 is able to continue.</p>

<p>I have run an <code>strace -f</code> to get more insight; it seems Thread 1 is waiting on a <code>futex</code> (I assume the internals of <code>std::future</code>) but is never woken up.</p>

<p>I'm absolutely stymied - is there any way to have a thread pin itself to a core and set its scheduler to FIFO, and then use a <code>std::promise</code> to wake up another thread which is waiting for it to complete this so-called realtime setup?</p>

<p>The code for thread1 creating thread2 is as follows:</p>

<pre><code>// Thread1:
std::promise&lt;void&gt; p;
std::future&lt;void&gt; f = p.get_future();

_thread = std::move(std::thread(std::bind(&amp;Dispatcher::Run, this, std::ref(p))));

LOG_INFO &lt;&lt; "waiting for thread2 to start" &lt;&lt; std::endl;

if (f.valid())
    f.wait();
</code></pre>

<p>and the <em>Run</em> function for thread2 is as follows:</p>

<pre><code>// Thread2:
LOG_INFO &lt;&lt; "started: threadId=" &lt;&lt; Thread::GetId() &lt;&lt; std::endl;

Realtime::Service* rs = Service::Registry::Lookup&lt;Realtime::Service&gt;();
if (rs)
    rs-&gt;ConfigureThread(this-&gt;Name()); // this does the pinning and FIFO etc

LOG_INFO &lt;&lt; "thread2 has started" &lt;&lt; std::endl;
p.set_value(); // indicate fact that the thread has started
</code></pre>

<p>The strace output follows:</p>

<ul>
<li>Thread 1 is <code>[pid 17651]</code></li>
<li>Thread 2 is <code>[pid 17654]</code></li>
</ul>

<p>In the interests of brevity I have removed some of the output.</p>

<pre><code>//////// Thread 1 creates thread 2 and waits on a future ////////

[pid 17654] gettid() = 17654
[pid 17651] write(2, "09:29:52 INFO waiting for thread"..., 4309:29:52 INFO waiting for thread2 to start
 &lt;unfinished ...&gt;
[pid 17654] gettid( &lt;unfinished ...&gt;
[pid 17651] &lt;... write resumed&gt; ) = 43
[pid 17654] &lt;... gettid resumed&gt; ) = 17654
[pid 17651] futex(0xd52294, FUTEX_WAIT_PRIVATE, 1, NULL &lt;unfinished ...&gt;
[pid 17654] gettid() = 17654
[pid 17654] write(2, "09:29:52 INFO thread2 started: t"..., 6109:29:52 INFO thread2 started: threadId=17654
) = 61

//////// &lt;snip&gt; thread2 performs pinning, FIFO, etc &lt;/snip&gt; ////////

[pid 17654] write(2, "09:29:52 INFO thread2 has starte"..., 3409:29:52 INFO thread2 has started
) = 34
[pid 17654] futex(0xd52294, FUTEX_CMP_REQUEUE_PRIVATE, 1, 2147483647, 0xd52268, 2) = 1
[pid 17651] &lt;... futex resumed&gt; ) = 0
[pid 17654] futex(0xd522c4, FUTEX_WAKE_PRIVATE, 2147483647 &lt;unfinished ...&gt;
[pid 17651] futex(0xd52268, FUTEX_WAKE_PRIVATE, 1 &lt;unfinished ...&gt;
[pid 17654] &lt;... futex resumed&gt; ) = 0
[pid 17651] &lt;... futex resumed&gt; ) = 0

//////// blocks here forever ////////
</code></pre>

<p>You can see that pid 17651 (thread1) reports <code>futex resumed</code>, but is it maybe running on the wrong cpu and getting blocked behind thread2 which is running as <code>FIFO</code>?</p>

<p><strong>Update: It seems this is an issue with threads <em>not running on the cpus they have been pinned to</em>.</strong></p>

<p><code>top -p 17649 -H</code> with <code>f,j</code> to bring up the <code>last used cpu</code> shows that <strong>thread 1 and thread 2 are indeed running on the same cpu</strong>.</p>

<pre><code>top - 10:00:59 up 18:17,  3 users,  load average: 7.16, 7.61, 4.18
Tasks:   3 total,   2 running,   1 sleeping,   0 stopped,   0 zombie
Cpu(s):  7.1%us,  0.1%sy,  0.0%ni, 89.5%id,  0.0%wa,  0.0%hi,  3.3%si,  0.0%st
Mem:   8180892k total,   722800k used,  7458092k free,    43364k buffers
Swap:  8393952k total,        0k used,  8393952k free,   193324k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  P COMMAND
17654 root      -2   0 54080  35m 7064 R  100  0.4   5:00.77 3 RealtimeTest
17649 root      20   0 54080  35m 7064 S    0  0.4   0:00.05 2 RealtimeTest
17651 root      20   0 54080  35m 7064 R    0  0.4   0:00.00 3 RealtimeTest
</code></pre>

<p>However, if I look at the <code>cpuset</code> filesystem, I can see that my tasks are <em>supposedly</em> pinned to the cpus I requested:</p>

<pre><code>/cpusets/shield/RealtimeTest1 $ for i in `find -name tasks`; do echo $i; cat $i; echo "------------"; done

./thread1/tasks
17651
------------
./main/tasks
17649
------------
./thread2/tasks
17654
------------
</code></pre>

<p>Displaying the cpuset config:</p>

<pre><code>$ cset set --list -r
cset:
         Name       CPUs-X    MEMs-X Tasks Subs Path
 ------------ ---------- - ------- - ----- ---- ----------
         root         0-23 y     0-1 y   279    2 /
       system 0,2,4,6,8,10 n       0 n   202    0 /system
       shield 1,3,5,7,9,11 n       1 n     0    2 /shield
RealtimeTest1      1,3,5,7 n       1 n     0    4 /shield/RealtimeTest1
      thread1            3 n       1 n     1    0 /shield/RealtimeTest1/thread1
      thread2            5 n       1 n     1    0 /shield/RealtimeTest1/thread2
         main            1 n       1 n     1    0 /shield/RealtimeTest1/main
</code></pre>

<p>From this I would say that thread2 is <em>supposed</em> to be on cpu 5, but top says it's running on cpu 3.</p>

<p>Interestingly, <code>sched_getaffinity</code> reports what <code>cpuset</code> does - that thread1 is on cpu 3 and thread2 is on cpu 5.</p>

<p>However, looking at <code>/proc/17649/task</code> to find the <code>last_cpu</code> each of its tasks ran on:</p>

<pre><code>/proc/17649/task $ for i in `ls -1`; do cat $i/stat | awk '{print $1 " is on " $(NF - 5)}'; done
17649 is on 2
17651 is on 3
17654 is on 3
</code></pre>

<p><strong><code>sched_getaffinity</code> reports one thing, but reality is another.</strong></p>

<p>Interestingly, the <code>main</code> thread [<code>pid 17649</code>] is supposed to be on cpu 1 (according to the <code>cset</code> output), but in fact it is running on cpu 2 (which is on another socket).</p>

<p>So I would say that <code>cpuset</code> is not working?</p>

<p>My machine configuration is:</p>

<pre><code>$ cat /etc/SuSE-release
SUSE Linux Enterprise Server 11 (x86_64)
VERSION = 11
PATCHLEVEL = 1

$ uname -a
Linux foobar 2.6.32.12-0.7-default #1 SMP 2010-05-20 11:14:20 +0200 x86_64 x86_64 x86_64 GNU/Linux
</code></pre>
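<p>The gap between the affinity mask and the cpu a thread actually executes on can also be checked from inside the process. What follows is only a minimal sketch, assuming Linux with glibc. The helper names (<code>allowed_cpus</code>, <code>pin_self_to_cpu</code>, <code>make_self_fifo</code>, <code>running_on_allowed_cpu</code>) are hypothetical and are not part of the code in the question, and the sketch pins with <code>pthread_setaffinity_np</code> rather than with <code>cset</code>, so it may behave differently from the cpuset setup described here.</p>

```cpp
// Linux-only sketch: pin the calling thread, optionally request SCHED_FIFO,
// and compare the affinity mask with the cpu the thread is executing on.
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // needed for CPU_SET, sched_getcpu, pthread_setaffinity_np
#endif
#include <pthread.h>
#include <sched.h>
#include <vector>

// Returns the cpus the calling thread is allowed to run on (empty on error).
std::vector<int> allowed_cpus() {
    cpu_set_t set;
    CPU_ZERO(&set);
    std::vector<int> cpus;
    if (sched_getaffinity(0, sizeof(set), &set) != 0)  // 0 = calling thread
        return cpus;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu)
        if (CPU_ISSET(cpu, &set))
            cpus.push_back(cpu);
    return cpus;
}

// Pins the calling thread to a single cpu. Returns true on success.
bool pin_self_to_cpu(int cpu) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set) == 0;
}

// Requests SCHED_FIFO for the calling thread at the given priority.
// This usually needs root or CAP_SYS_NICE, so failure is reported, not fatal.
bool make_self_fifo(int priority) {
    sched_param param{};
    param.sched_priority = priority;
    return pthread_setschedparam(pthread_self(), SCHED_FIFO, &param) == 0;
}

// True if the cpu the thread is executing on right now is in its affinity mask.
bool running_on_allowed_cpu() {
    const int now = sched_getcpu();
    for (int cpu : allowed_cpus())
        if (cpu == now)
            return true;
    return false;
}
```

<p>Logging the result of <code>running_on_allowed_cpu()</code> in each thread just before <code>f.wait()</code> and <code>p.set_value()</code> would show whether the kernel is honouring the mask at that moment, independently of what <code>top</code> or <code>cset</code> report.</p>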
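<p>The <code>awk</code> one-liner counts back from the end of the line (<code>$(NF - 5)</code>), which depends on how many fields a given kernel prints. According to proc(5), the cpu a task last ran on is field 39 (<code>processor</code>) of the <code>stat</code> line, so counting from the front is less fragile. A rough sketch of that parsing step follows; the function name <code>last_cpu_from_stat</code> is hypothetical, and it assumes the caller has already read one <code>stat</code> line into a string.</p>

```cpp
#include <sstream>
#include <string>

// Returns field 39 ("processor", the cpu last executed on) from one line of
// /proc/<pid>/task/<tid>/stat, or -1 if the line cannot be parsed.
int last_cpu_from_stat(const std::string& stat_line) {
    // Field 2 (comm) is wrapped in parentheses and may itself contain spaces
    // or ')', so start counting fields after the last ')'.
    const std::string::size_type close = stat_line.rfind(')');
    if (close == std::string::npos)
        return -1;

    std::istringstream rest(stat_line.substr(close + 1));
    std::string field;
    // The first token after ')' is field 3 (state).
    for (int index = 3; rest >> field; ++index) {
        if (index == 39) {
            try {
                return std::stoi(field);
            } catch (...) {
                return -1;  // field 39 was not a number
            }
        }
    }
    return -1;  // fewer than 39 fields
}
```

<p>Reading each <code>/proc/17649/task/*/stat</code> file with <code>std::getline</code> and passing the line to this function would give the same kind of report as the shell loop, without relying on the total field count.</p>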