Note that there are some explanatory texts on larger screens.

plurals
  1. POLock-free queue algorithm, repeated reads for consistency
    primarykey
    data
    text
    <p>I'm studying the <a href="http://www.cs.rochester.edu/research/synchronization/pseudocode/queues.html" rel="nofollow">lock-free (en-,de-)queue algorithms of Michael and Scott</a>. The problem is I can't explain/understand (nor the paper does, apart from the comments in the code itself) a couple of lines.</p> <p>Enqueue:</p> <pre><code> enqueue(Q: pointer to queue_t, value: data type) E1: node = new_node() // Allocate a new node from the free list E2: node-&gt;value = value // Copy enqueued value into node E3: node-&gt;next.ptr = NULL // Set next pointer of node to NULL E4: loop // Keep trying until Enqueue is done E5: tail = Q-&gt;Tail // Read Tail.ptr and Tail.count together E6: next = tail.ptr-&gt;next // Read next ptr and count fields together E7: if tail == Q-&gt;Tail // Are tail and next consistent? // Was Tail pointing to the last node? E8: if next.ptr == NULL // Try to link node at the end of the linked list E9: if CAS(&amp;tail.ptr-&gt;next, next, &lt;node, next.count+1&gt;) E10: break // Enqueue is done. Exit loop E11: endif E12: else // Tail was not pointing to the last node // Try to swing Tail to the next node E13: CAS(&amp;Q-&gt;Tail, tail, &lt;next.ptr, tail.count+1&gt;) E14: endif E15: endif E16: endloop // Enqueue is done. Try to swing Tail to the inserted node E17: CAS(&amp;Q-&gt;Tail, tail, &lt;node, tail.count+1&gt;) </code></pre> <p>Why is E7 needed? Does correctness depend on it? Or is it merely an optimization? This <code>if</code> can fail if another thread successfully executed E17, or D10 below, (and changed Q->Tail) while the first thread has executed E5 but not yet E7. But what if E17 is executed right after the first thread executes E7? </p> <p><strong>edit</strong>: Does this last sentence prove that E7 cannot be more than an optimization? My intuition is that <em>it does</em>, since I give a scenario were "apparently" the <code>if</code> statement would make the wrong decision, yet the algorithm would still be supposed to work correctly. But then we could replace the <code>if</code>'s condition with a random condition, without affecting correctness. Any hole in this argument?</p> <p>Dequeue:</p> <pre><code>dequeue(Q: pointer to queue_t, pvalue: pointer to data type): boolean D1: loop // Keep trying until Dequeue is done D2: head = Q-&gt;Head // Read Head D3: tail = Q-&gt;Tail // Read Tail D4: next = head.ptr-&gt;next // Read Head.ptr-&gt;next D5: if head == Q-&gt;Head // Are head, tail, and next consistent? D6: if head.ptr == tail.ptr // Is queue empty or Tail falling behind? D7: if next.ptr == NULL // Is queue empty? D8: return FALSE // Queue is empty, couldn't dequeue D9: endif // Tail is falling behind. Try to advance it D10: CAS(&amp;Q-&gt;Tail, tail, &lt;next.ptr, tail.count+1&gt;) D11: else // No need to deal with Tail // Read value before CAS // Otherwise, another dequeue might free the next node D12: *pvalue = next.ptr-&gt;value // Try to swing Head to the next node D13: if CAS(&amp;Q-&gt;Head, head, &lt;next.ptr, head.count+1&gt;) D14: break // Dequeue is done. Exit loop D15: endif D16: endif D17: endif D18: endloop D19: free(head.ptr) // It is safe now to free the old node D20: return TRUE // Queue was not empty, dequeue succeeded </code></pre> <p>Again, why D5 is needed? Correctness or optimization? I'm not sure what "consistency" these tests give, since it seems they can get inconsistent right after the <code>if</code> succeeds.</p> <p>This looks like a standard technique. Can someone explain the motivation behind it? To me, it seems like the intention is to avoid doing an (expensive) CAS in those few cases it can be noticed that it would definitely fail, but at the cost of always doing an extra read, which is not supposed to be so much cheaper itself (e.g. in Java, <code>Q-&gt;Tail</code> would be required to be volatile, so we would know we are not merely reading a copy cached in a register but reading the real thing, which would be translated in prepending the read with a fence of some sort), so I'm not sure what's really going on here... thanks.</p> <p><strong>edit</strong> This has been ported to Java, more precisely in <a href="http://fuseyism.com/classpath/doc/java/util/concurrent/ConcurrentLinkedQueue-source.html" rel="nofollow">ConcurrentLinkedQueue</a>, e.g. E7 is line 194, while D5 is line 212.</p>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload