Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    primarykey
    data
    text
    <p>I'm not entirely sure this is correct, but I thought I'd throw it out for comment. First, let me introduce a lockless disjoint set algorithm, which will form an important part of my proposed algorithm.</p> <h2>Lockless disjoint set algorithm</h2> <p>I assume the presence of a two-pointer-sized compare-and-swap operation on your choice of CPU architecture. This is available on x86 and x64 architectures at the least.</p> <p>The algorithm is largely the same as described on <a href="http://en.wikipedia.org/wiki/Disjoint-set_data_structure" rel="nofollow">the Wikipedia page for the single threaded case</a>, with some modifications for safe lockless operation. First, we require that the rank and parent elements to both be pointer-sized, and aligned to 2*sizeof(pointer) in memory, for atomic CAS later on.</p> <p>Find() need not change; the worst case is that the path compression optimization will fail to have full effect in the presence of simultaneous writers.</p> <p>Union() however, must change:</p> <pre><code>function Union(x, y) redo: x = Find(x) y = Find(y) if x == y return xSnap = AtomicRead(x) -- read both rank and pointer atomically ySnap = AtomicRead(y) -- this operation may be done using a CAS if (xSnap.parent != x || ySnap.parent != y) goto redo -- Ensure x has lower rank (meaning y will be the new root) if (xSnap.rank &gt; ySnap.rank) swap(xSnap, ySnap) swap(x, y) -- if same rank, use pointer value as a fallback sort else if (xSnap.rank == ySnap.rank &amp;&amp; x &gt; y) swap(xSnap, ySnap) swap(x, y) yNew = ySnap yNew.rank = max(yNew.rank, xSnap.rank + 1) xNew = xSnap xNew.parent = y if (!CAS(y, ySnap, yNew)) goto redo if (!CAS(x, xSnap, xNew)) goto redo return </code></pre> <p>This should be safe in that it will never form loops, and will always result in a proper union. We can confirm this by observing that:</p> <ul> <li>First, prior to termination, one of the two roots will always end up with a parent pointing to the other. Therefore, as long as there is no loop, the merge succeeds.</li> <li>Second, rank always increases. After comparing the order of x and y, we know x has lower rank than y at the time of the snapshot. In order for a loop to form, another thread would need to have increased x's rank first, then merged x and y. However in the CAS that writes x's parent pointer, we check that rank has not changed; therefore, y's rank must remain greater than x.</li> </ul> <p>In the event of simultaneous mutation, it <em>is</em> possible that y's rank may be increased, then return to redo due to a conflict. However, this implies that either y is no longer a root (in which case rank is irrelevant) or that y's rank has been increased by another process (in which case the second go around will have no effect and y will have correct rank).</p> <p>Therefore, there should be no chance of loops forming, and this lockless disjoint-set algorithm <em>should</em> be safe.</p> <p>And now on to the application to your problem...</p> <h2>Assumptions</h2> <p>I make the assumption that ridge segments can only intersect at their endpoints. If this is not the case, you will need to alter phase 1 in some manner.</p> <p>I also make the assumption that co-habitation of a single integer pixel location is sufficient for ridge segments can be connected. If not, you will need to change the array in phase 1 to hold multiple candidate ridge segments+disjoint-set pairs, and filter through to find ones that are actually connected.</p> <p>The disjoint set structures used in this algorithm shall carry a reference to a line segment in their structures. In the event of a merge, we choose one of the two recorded segments arbitrarily to represent the set.</p> <h2>Phase 1: Local line identification</h2> <p>We start by dividing the map into sectors, each of which will be processed as a seperate job. Multiple jobs may be processed in different threads, but each job will be processed by only one thread. If a ridge segment crosses a sector, it is split into two segments, one for each sector.</p> <p>For each sector, an array mapping pixel position to a disjoint-set structure is established. Most of this array will be discarded later, so its memory requirements should not be too much of a burden.</p> <p>We then proceed over each line segment in the sector. We first choose a disjoint set representing the entire line the segment forms a part of. We first look up each endpoint in the pixel-position array to see if a disjoint set structure has already been assigned. If one of the endpoints is already in this array, we use the assigned disjoint set. If <em>both</em> are in the array, we perform a merge on the disjoint sets, and use the new root as our set. Otherwise, we create a new disjoint-set, and associate with the disjoint-set structure a reference to the current line segment. We then write back into the pixel-position array our new disjoint set's root for each of our endpoints.</p> <p>This process is repeated for each line segment in the sector; by the end, we will have identified all lines completely within the sector by a disjoint set.</p> <p>Note that since the disjoint sets are not yet shared between threads, there's no need to use compare-and-swap operations yet; simply use the normal single-threaded union-merge algorithm. Since we do not free any of the disjoint set structures until the algorithm completes, allocation can also be made from a per-thread bump allocator, making memory allocation (virtually) lockless and O(1).</p> <p>Once a sector is completely processed, all data in the pixel-position array is discarded; however data corresponding to pixels on the edge of the sector is copied to a new array and kept for the next phase.</p> <p>Since iterating over the entire image is O(x*y), and disjoint-merge is effectively O(1), this operation is O(x*y) and requires working memory O(m+2*x*y/k+k^2) = O(x*y/k+k^2), where t is the number of sectors, k is the width of a sector, and m is the number of partial line segments in the sector (depending on how often lines cross borders, m may vary significantly, but it will never exceed the number of line segments). The memory carried over to the next operation is O(m + 2*x*y/k) = O(x*y/k)</p> <h2>Phase 2: Cross-sector merges</h2> <p>Once all sectors have been processed, we then move to merging lines that cross sectors. For each border between sectors, we perform lockless merge operations on lines that cross the border (ie, where adjacent pixels on each side of the border have been assigned to line sets).</p> <p>This operation has running time O(x+y) and consumes O(1) memory (we must retain the memory from phase 1 however). Upon completion, the edge arrays may be discarded.</p> <h2>Phase 3: Collecting lines</h2> <p>We now perform a multi-threaded map operation over all allocated disjoint-set structure objects. We first skip any object which is not a root (ie, where obj.parent != obj). Then, starting from the representative line segment, we move out from there and collect and record any information desired about the line in question. We are assured that only one thread is looking at any given line at a time, as intersecting lines would have ended up in the same disjoint-set structure.</p> <p>This has O(m) running time, and memory usage dependent on what information you need to collect about these line segments.</p> <h2>Summary</h2> <p>Overall, this algorithm should have O(x*y) running time, and O(x*y/k + k^2) memory usage. Adjusting k gives a tradeoff between transient memory usage on the phase 1 processes, and the longer-term memory usage for the adjacency arrays and disjoint-set structures carried over into phase 2.</p> <p>Note that I have not actually tested this algorithm's performance in the real world; it is also possible that I have overlooked concurrency issues in the lockless disjoint-set union-merge algorithm above. Comments welcome :)</p>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. VO
      singulars
      1. This table or related slice is empty.
    2. VO
      singulars
      1. This table or related slice is empty.
    3. VO
      singulars
      1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload