Compare two images the python/linux way
<p>I'm trying to prevent duplicate images from being uploaded.</p>
<p>I have two JPGs. Looking at them, I can see that they are in fact identical. But for some reason they have different file sizes (one is pulled from a backup, the other is another upload), and so they have different md5 checksums.</p>
<p>How can I efficiently and confidently compare two images in the same sense that a human would be able to see that they are clearly identical?</p>
<p>Example: <a href="http://static.peterbe.com/a.jpg" rel="noreferrer">http://static.peterbe.com/a.jpg</a> and <a href="http://static.peterbe.com/b.jpg" rel="noreferrer">http://static.peterbe.com/b.jpg</a></p>
<p><strong>Update</strong></p>
<p>I wrote this script:</p>
<pre><code>import math, operator
from PIL import Image

def compare(file1, file2):
    image1 = Image.open(file1)
    image2 = Image.open(file2)
    # Per-band pixel counts for each image
    h1 = image1.histogram()
    h2 = image2.histogram()
    # Root-mean-square difference between the two histograms
    rms = math.sqrt(reduce(operator.add,
                           map(lambda a,b: (a-b)**2, h1, h2))/len(h1))
    return rms

if __name__=='__main__':
    import sys
    file1, file2 = sys.argv[1:]
    print compare(file1, file2)
</code></pre>
<p>Then I downloaded the two visually identical images and ran the script. Output:</p>
<pre><code>58.9830484122
</code></pre>
<p>Can anybody tell me what a suitable cutoff should be?</p>
<p><strong>Update II</strong></p>
<p>The difference between a.jpg and b.jpg is that the second one has been saved with PIL:</p>
<pre><code>b=Image.open('a.jpg')
b.save(open('b.jpg','wb'))
</code></pre>
<p>This apparently applies some very light quality modifications. I've now solved my problem by applying the same PIL save to the file being uploaded, without doing anything else with it, and it now works!</p>
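For readers on Python 3, the two ideas in the question (the histogram RMS from the first update and the "re-save, then checksum" normalization from Update II) could be sketched roughly as follows. This is only a sketch under assumptions: the names `rms_difference` and `normalized_md5` are hypothetical helpers not present in the original, Pillow is assumed to be the installed PIL implementation, and the images are assumed to be JPEG-compatible (for example RGB). Because Python 3 uses true division, the RMS may differ slightly from the figure reported above. Whether re-saving produces byte-identical output for visually identical inputs depends on the encoder and its settings; the question only reports that it worked in this case.

```python
import hashlib
import io
import math
import sys


def rms_difference(h1, h2):
    """Root-mean-square difference between two equal-length histograms."""
    if len(h1) != len(h2):
        # Images in different modes (e.g. RGB vs. L) yield different bin counts.
        raise ValueError("histograms must have the same number of bins")
    squared = sum((a - b) ** 2 for a, b in zip(h1, h2))
    return math.sqrt(squared / len(h1))


def compare(file1, file2):
    """Histogram RMS between two image files, as in the first update."""
    # Pillow is imported lazily so rms_difference() stays usable without it.
    from PIL import Image

    with Image.open(file1) as image1, Image.open(file2) as image2:
        return rms_difference(image1.histogram(), image2.histogram())


def normalized_md5(path):
    """MD5 of the image after re-saving it through Pillow (the Update II idea).

    Hypothetical helper: assumes the image can be written as JPEG.
    """
    from PIL import Image

    buffer = io.BytesIO()
    with Image.open(path) as image:
        image.save(buffer, format="JPEG")
    return hashlib.md5(buffer.getvalue()).hexdigest()


if __name__ == "__main__" and len(sys.argv) == 3:
    file1, file2 = sys.argv[1:]
    print("histogram RMS:", compare(file1, file2))
    print("same normalized md5:", normalized_md5(file1) == normalized_md5(file2))
```

An RMS of 0 means the histograms match exactly, which is necessary but not sufficient for the images to be identical, since two different pictures can share a histogram. Any non-zero cutoff is a judgment call that would need tuning on real uploads.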