StackOverflow2013

<p>Start with an empty linked list of waiting threads. The head should be initialized to 0.</p>
<p>Use CAS (compare-and-swap) to insert a thread at the head of the list of waiters. If the head is -1, do not insert or wait. You can safely use CAS to insert items at the head of a linked list if you do it right.</p>
<p>After being inserted, the waiting thread should wait on SIGUSR1. Use sigwait() to do this. SIGUSR1 must be blocked in the waiting thread (for example with pthread_sigmask()) before the thread adds itself to the list.</p>
<p>When ready, the signaling thread uses CAS to set the head of the wait list to -1. This prevents any more threads from adding themselves to the wait list. Then the signaling thread iterates over the threads in the wait list and calls pthread_kill(thread, SIGUSR1) to wake up each waiting thread.</p>
<p>If SIGUSR1 is sent before the call to sigwait(), the signal stays pending because it is blocked, and sigwait() returns immediately. Thus there is no race between adding a thread to the wait list and calling sigwait().</p>
<p>EDIT:</p>
<p>Why is CAS faster than a mutex? Layman's answer (I'm a layman): it's faster for some things in some situations, because it has lower overhead when there is NO race. So if you can reduce your concurrency problem down to changing 8, 16, 32, 64, or 128 bits of contiguous memory, and a race is not going to happen very often, CAS wins. CAS is basically a slightly fancier, more expensive mov instruction right where you were going to do a regular "mov" anyway. On x86 it's a "lock cmpxchg".</p>
<p>A mutex, on the other hand, is a whole bunch of extra stuff that dirties other cache lines, uses more memory barriers, and so on (although CAS itself also acts as a memory barrier on x86 and x64). Then of course you have to unlock the mutex, which is probably about the same amount of extra stuff.</p>
<p>Here is how you add an item to a linked list using CAS:</p>
<pre><code>while (1)
{
  pOldHead = pHead;          // snapshot of the world; start of the race
  pItem->pNext = pOldHead;   // link to the snapshot, not to a second read of pHead
  if (CAS(&pHead, pOldHead, pItem))  // end of the race: succeeds only if pHead is still pOldHead
    break; // success
}
</code></pre>
<p>So how often do you think your code is going to have multiple threads at that CAS line at the exact same time? In reality, not very often. We ran tests that just looped adding millions of items from multiple threads at the same time, and it happened way less than 1% of the time. In a real program, it might never happen.</p>
<p>Obviously, if there is a race you have to go back and do that loop again, but in the case of a linked list, what does that cost you?</p>
<p>The downside is that you can't do very complex things to that linked list if you are going to use that method to add items to the head. Try implementing a doubly linked list. What a pain.</p>
<p><strong>EDIT:</strong> </p>
<p>In the code above I use a macro, CAS. If you are using Linux, CAS is a macro around __sync_bool_compare_and_swap. See <a href="http://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Atomic-Builtins.html" rel="nofollow">gcc atomic builtins</a>. If you are using Windows, CAS is a macro around something like InterlockedCompareExchange. Here is what an inline function on Windows might look like:</p>
<pre><code>inline bool CAS(volatile WORD* p, const WORD nOld, const WORD nNew) {
  // The intrinsic returns a signed SHORT; cast back to WORD so values >= 0x8000 compare correctly.
  return (WORD)InterlockedCompareExchange16((short*)p, nNew, nOld) == nOld;
}
inline bool CAS(volatile DWORD* p, const DWORD nOld, const DWORD nNew) {
  return InterlockedCompareExchange((long*)p, nNew, nOld) == nOld;
}
inline bool CAS(volatile QWORD* p, const QWORD nOld, const QWORD nNew) {
  return InterlockedCompareExchange64((LONGLONG*)p, nNew, nOld) == nOld;
}
inline bool CAS(void*volatile* p, const void* pOld, const void* pNew) {
  return InterlockedCompareExchangePointer(p, (PVOID)pNew, (PVOID)pOld) == pOld;
}
</code></pre>
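<p>For the Linux side, here is a minimal sketch of the whole wait/signal scheme described at the top, assuming GCC or Clang with pthreads. The names Waiter, g_head, CLOSED, block_sigusr1, wait_for_event, signal_event, count_waiters, waiter_main and run_demo are made up for this illustration and are not from any library. It is one possible reading of the steps above, not a hardened implementation.</p>

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>

/* CAS as a thin macro over the GCC/Clang builtin mentioned above. */
#define CAS(p, old_val, new_val) __sync_bool_compare_and_swap((p), (old_val), (new_val))

typedef struct Waiter {            /* hypothetical wait-list node */
    pthread_t thread;
    struct Waiter *next;
} Waiter;

#define CLOSED ((Waiter *)-1)      /* head value -1: event already signaled */
#define MAX_DEMO_THREADS 16

static Waiter *volatile g_head = NULL;   /* empty list: head is 0 */

/* Block SIGUSR1 so it stays pending until sigwait() collects it.
   Call before creating threads; new threads inherit the mask. */
static void block_sigusr1(void) {
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    pthread_sigmask(SIG_BLOCK, &set, NULL);
}

/* Returns 1 if the caller waited and was woken, 0 if the event had already fired. */
static int wait_for_event(Waiter *self) {
    sigset_t set;
    int sig;
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    self->thread = pthread_self();
    for (;;) {
        Waiter *old = g_head;              /* snapshot */
        if (old == CLOSED) return 0;       /* do not insert or wait */
        self->next = old;
        if (CAS(&g_head, old, self)) break;
    }
    sigwait(&set, &sig);                   /* returns at once if SIGUSR1 is already pending */
    return 1;
}

/* Closes the list and wakes every waiter; returns how many were woken. */
static int signal_event(void) {
    Waiter *list;
    int woken = 0;
    do { list = g_head; } while (!CAS(&g_head, list, CLOSED));
    while (list != NULL && list != CLOSED) {
        Waiter *next = list->next;         /* read first: the node lives on the waiter's stack */
        pthread_kill(list->thread, SIGUSR1);
        woken++;
        list = next;
    }
    return woken;
}

/* Demo helper: number of nodes currently on the list. */
static int count_waiters(void) {
    int n = 0;
    for (Waiter *w = g_head; w != NULL && w != CLOSED; w = w->next) n++;
    return n;
}

static void *waiter_main(void *arg) {
    Waiter self;
    (void)arg;
    return (void *)(intptr_t)wait_for_event(&self);
}

/* Starts n waiters, signals once all are queued, and returns the number
   woken (or -1 if that differs from the number that actually waited). */
static int run_demo(int n) {
    pthread_t threads[MAX_DEMO_THREADS];
    int woken, waited = 0;
    assert(n > 0 && n <= MAX_DEMO_THREADS);
    block_sigusr1();
    for (int i = 0; i < n; i++)
        pthread_create(&threads[i], NULL, waiter_main, NULL);
    while (count_waiters() < n) sched_yield();
    woken = signal_event();
    for (int i = 0; i < n; i++) {
        void *result;
        pthread_join(threads[i], &result);
        waited += (int)(intptr_t)result;
    }
    return woken == waited ? woken : -1;
}
```

<p>Call run_demo(4) from main and build with -pthread. Each node lives on its waiter's stack, which is why signal_event reads the next pointer before sending the signal: once woken, the waiter may return and the node is gone.</p>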
