Your modified version introduces a race condition:

- Thread A: `if (S->Value <= 0)` // Value = 1
- Thread B: `if (S->Value <= 0)` // Value = 1
- Thread A: `S->Value--;` // Value = 0
- Thread B: `S->Value--;` // Value = -1

Both threads have acquired a count=1 semaphore. Oops. Note that there's another problem even if they're non-preemptible (see below), but for completeness, here's a discussion of atomicity and how *real* locking protocols work.

When working with protocols like this, it's very important to nail down exactly which atomic primitives you are using. Atomic primitives are such that they seem to execute instantaneously, without being interleaved with any other operations. You cannot just take a big function and call it atomic; you have to *make* it atomic somehow, using other atomic primitives.

Most CPUs offer a primitive called 'atomic compare and exchange'. I'll abbreviate it `cmpxchg` from here on. The semantics are like so:

```c
bool cmpxchg(long *ptr, long old, long new) {
    if (*ptr == old) {
        *ptr = new;
        return true;
    } else {
        return false;
    }
}
```

`cmpxchg` is not *implemented* with this code; it's done in the CPU hardware, but it behaves a bit like this, only atomically.

Now, let's add to this some additional helpful functions (built out of other primitives):

- `add_waitqueue(waitqueue)` - sets our process state to sleeping and adds us to a wait queue, but *continues executing* (ATOMIC)
- `schedule()` - switches threads. If we're in a sleeping state, we don't run again until awakened (BLOCKING)
- `remove_waitqueue(waitqueue)` - removes our process from a wait queue, then sets our state to awakened if it isn't already (ATOMIC)
- `memory_barrier()` - ensures that any reads/writes logically before this point *actually are performed before this point*, avoiding nasty memory-ordering issues (we'll assume all other atomic primitives come with a free memory barrier, although this isn't always true) (CPU/COMPILER PRIMITIVE)

Here's how a typical semaphore acquisition routine will look. It's a bit more complex than your example, because I've explicitly nailed down which atomic operations I'm using:

```c
void sem_down(sem *pSem)
{
    while (1) {
        long spec_count = pSem->count;
        read_memory_barrier(); // make sure spec_count doesn't start changing on us! pSem->count may keep changing though

        if (spec_count > 0) {
            if (cmpxchg(&pSem->count, spec_count, spec_count - 1)) // ATOMIC
                return; // got the semaphore without blocking
            else
                continue; // count is stale, try again
        } else { // semaphore count is zero
            add_waitqueue(pSem->wqueue); // ATOMIC
            // recheck the semaphore count, now that we're in the waitqueue - it may have changed
            if (pSem->count == 0)
                schedule(); // NOT ATOMIC
            remove_waitqueue(pSem->wqueue); // ATOMIC
            // loop around again to try to acquire the semaphore
        }
    }
}
```

You'll note that the actual test for a non-zero `pSem->count`, in a real-world `semaphore_down` function, is accomplished by `cmpxchg`. You can't trust any other read; the value can change an instant after you read it. We simply can't separate the value check from the value modification.

The `spec_count` here is *speculative*. This is important: I'm essentially making a guess at what the count will be. It's a pretty good guess, but it's a guess. `cmpxchg` will fail if my guess is wrong, at which point the routine has to loop and try again. If I guess 0, then either I will be woken up (as the count ceases to be zero while I'm on the waitqueue), or I will notice it's not zero anymore in the recheck before `schedule()`.
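For the other direction, here is a minimal sketch of what a matching `sem_up` could look like, built from the same primitives. It assumes a `wake_up(waitqueue)` helper (not defined above) that marks any sleepers on the queue as awakened, so their `schedule()` returns and they re-run the `sem_down` loop:

```c
void sem_up(sem *pSem)
{
    while (1) {
        long spec_count = pSem->count;   // speculative read, same idea as in sem_down
        read_memory_barrier();

        if (cmpxchg(&pSem->count, spec_count, spec_count + 1)) // ATOMIC
            break;                       // count published; a stale guess just loops and retries
    }
    wake_up(pSem->wqueue); // ASSUMED PRIMITIVE: wake sleepers so they recheck the count
}
```

The increment-then-wake order matters: a waiter that saw the count as zero is already on the waitqueue by then, so it either notices the new count in its recheck or gets its `schedule()` cut short by the wake-up.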
You should also note that there is no possible way to make a function that contains a blocking operation atomic. It's nonsensical: atomic functions, by definition, appear to execute instantaneously, interleaved with nothing else whatsoever, while a blocking function, by definition, waits for something else to happen. The two are inconsistent. Likewise, no atomic operation can be 'split up' across a blocking operation, which is exactly what happens in your example.

Now, you could do away with a lot of this complexity by declaring the function non-preemptible: by using locks or other methods, you simply ensure only one thread is ever running (not counting blocking, of course) in the semaphore code at a time. But a problem still remains. Start with a value of 0, where C has taken the semaphore down twice, then:

- Thread A: `if (S->Value <= 0)` // Value = 0
- Thread A: Block....
- Thread B: `if (S->Value <= 0)` // Value = 0
- Thread B: Block....
- Thread C: `S->Value++` // Value = 1
- Thread C: `Wakeup(A)`
- (Thread C calls signal() again)
- Thread C: `S->Value++` // Value = 2
- Thread C: `Wakeup(B)`
- (Thread C calls wait())
- Thread C: `if (S->Value <= 0)` // Value = 2
- Thread C: `S->Value--` // Value = 1
- // A and B have been woken
- Thread A: `S->Value--` // Value = 0
- Thread B: `S->Value--` // Value = -1

You could probably fix this with a loop that rechecks `S->Value` after every wakeup (a rough sketch follows at the end of this answer) - again, assuming you are on a single-processor machine and your semaphore code is non-preemptible. Unfortunately, these assumptions are false on all desktop OSes :)

For more discussion of how *real* locking protocols work, you might be interested in the paper "[Fuss, Futexes and Furwocks: Fast Userlevel Locking in Linux](http://www.kernel.org/doc/ols/2002/ols2002-pages-479-495.pdf)".
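As a footnote to the single-CPU, non-preemptible case above, here is a minimal sketch of that loop-and-recheck fix. The `Semaphore` type and the `Block()`/`Wakeup()` primitives are the hypothetical ones from the example under discussion (none of them are defined in this answer), and the code is only safe because, by assumption, nothing can preempt the path between the check and the decrement:

```c
/* Only valid under the stated assumptions: one CPU, and the semaphore code
 * cannot be preempted except while blocked. Block() and Wakeup() are the
 * hypothetical primitives from the example being discussed. */
void wait(Semaphore *S)
{
    while (S->Value <= 0)  // recheck after every wakeup - another woken thread
        Block(S);          //   may have consumed the count before we ran again
    S->Value--;            // safe only because nothing can interleave here
}

void signal(Semaphore *S)
{
    S->Value++;
    Wakeup(S);             // wake a blocked waiter; it will recheck in its loop
}
```

In the A/B/C trace above, whichever of A or B runs second now finds the count already back at zero, loops, and blocks again instead of driving the value negative.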