Note that there are some explanatory texts on larger screens.

plurals
  1. POFinding pixels that make an image unique within a list, can you improve on brute force?
    primarykey
    data
    text
    <p>Suppose I have a list of strings where each string is</p> <ul> <li>exactly 4 characters long and</li> <li>unique within the list.</li> </ul> <p>For each of these strings I want to identify the position of the characters within the string that make the string unique.</p> <p>So for a list of three strings</p> <pre><code>abcd abcc bbcb </code></pre> <p>For the first string I want to identify the character in 4th position <em>d</em> since <em>d</em> does not appear in the 4th position in any other string. </p> <p>For the second string I want to identify the character in 4th position <em>c</em>.</p> <p>For the third string it I want to identify the character in 1st position <em>b</em> AND the character in 4th position, also <em>b</em>.</p> <p>This could be concisely represented as</p> <pre><code>abcd -&gt; ...d abcc -&gt; ...c bbcb -&gt; b..b </code></pre> <p>If you consider the same problem but with a list of binary numbers</p> <pre><code>0101 0011 1111 </code></pre> <p>Then the result I want would be</p> <pre><code>0101 -&gt; ..0. 0011 -&gt; .0.. 1111 -&gt; 1... </code></pre> <p>Staying with the binary theme I can use XOR to identify which bits are unique within <strong>two</strong> binary numbers since</p> <pre><code>0101 ^ 0011 = 0110 </code></pre> <p>which I can interpret as meaning that in this case the 2nd and 3rd bits (reading left to right) are unique between these two binary numbers. This technique might be a red herring unless somehow it can be extended to the larger list.</p> <p>A brute-force approach would be to look at each string in turn, and for each string to iterate through vertical slices of the remainder of the strings in the list. </p> <p>So for the list</p> <pre><code>abcd abcc bbcb </code></pre> <p>I would start with</p> <pre><code>abcd </code></pre> <p>and iterate through vertical slices of</p> <pre><code>abcc bbcb </code></pre> <p>where these vertical slices would be</p> <pre><code>a | b | c | c b | b | c | b </code></pre> <p>or in list form, "ab", "bb", "cc", "cb".</p> <p>This would result in four comparisons</p> <pre><code>a : ab -&gt; . (a is not unique) b : bb -&gt; . (b is not unique) c : cc -&gt; . (c is not unique) d : cb -&gt; d (d is unique) </code></pre> <p>or concisely</p> <pre><code>abcd -&gt; ...d </code></pre> <p>Maybe it's wishful thinking, but I have a feeling that there should be an elegant and general solution that would apply to an arbitrarily large list of strings (or binary numbers). But if there is I haven't yet been able to see it.</p> <p>I hope to use this algorithm to to derive minimal signatures from a collection of unique images (bitmaps) in order to efficiently identify those images at a future time. If future efficiency wasn't a concern I would use a simple hash of each image.</p> <p>Can you improve on brute force? </p> <p><strong>Edit</strong> The approach I'm warming to is building a map of pixels to images</p> <pre><code>sprawl[Tuple&lt;x=10, y=33,color=f1fefd&gt;] =&gt; { image17, image23, ... } sprawl[Tuple&lt;x=10, y=34,color=f1fef0&gt;] =&gt; { image11 ... } </code></pre> <p>and then using that map to identify the minimal set of signature pixels for each image.</p> <p>If a pixel (identified by x, y, color) references just one image then I have found a perfect (minimal) signature for that image.</p> <p>It's more complicated if an image has no unique pixels, but since I know all images are unique within the list I should be able to combine two or more pixel references (but as few as possible) to deduce the image.</p> <p><strong>Update</strong> </p> <p>I've been working on an algorithm for this. My problem is very similar to <a href="https://stackoverflow.com/questions/2249908/optimized-ocr-black-white-pixel-algorithm">this one</a>, and I've written up my algorithm as an <a href="https://stackoverflow.com/questions/2249908/optimized-ocr-black-white-pixel-algorithm/2873004#2873004">answer to that question</a>. This update is to flag the attention of anyone still following (I see five bookmarks). I'm working on this in isolation so any and all feedback is welcome, even if just to observe that I haven't made myself clear!</p>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload