There is a quick and dirty solution: simply slide a window over the target image, compute some measure of similarity at each location, and pick the location with the highest similarity. Then compare that similarity to a threshold: if the score is above the threshold, you conclude the image is there and that's its location; if the score is below the threshold, the image isn't there.

As a similarity measure, you can use normalized correlation or the sum of squared differences (aka L2 norm). As people mentioned, this will not deal with scale changes. So you also rescale your original image multiple times and repeat the process above with each scaled version (a short OpenCV sketch of this follows at the end of this answer). Depending on the size of your input image and the range of possible scales, this may be good enough, and it's easy to implement.

A proper solution is to use affine invariants. Try looking up "wide-baseline stereo matching"; people have studied this problem in that context. The methods used generally look something like this:

## Preprocessing of the original image

- Run an "interest point detector". This will find a few points in the image which are easily localizable, e.g. corners. There are many detectors; a detector called "Harris-Affine" works well and is pretty popular (so implementations probably exist). Another option is the Difference-of-Gaussians (DoG) detector; it was developed for SIFT and works well too.
- At each interest point, extract a small sub-image (e.g. 30x30 pixels).
- For each sub-image, compute a "descriptor", some representation of the image content in that window. Again, many descriptors exist. Things to look at are how well the descriptor describes the image content (you want two descriptors to match only if they are similar) and how invariant it is (you want it to be the same even after scaling). In your case, I'd recommend using SIFT. It is not as invariant as some other descriptors, but it copes with scale well, and in your case scale is the only thing that changes.

At the end of this stage, you will have a set of descriptors (the second sketch below shows this stage).

## Testing (with the new test image)

- First, run the same interest point detector as in the preprocessing stage to get a set of interest points, and compute the same descriptor for each point, as above. Now you have a set of descriptors for the target image as well.
- Next, look for matches. Ideally, each descriptor from your original image will have some pretty similar descriptor in the target image. (Since the target image is larger, there will also be "leftover" descriptors, i.e. points that don't correspond to anything in the original image.) So if enough of the original descriptors match with enough similarity, you know the target is there. Moreover, since the descriptors are location-specific, you will also know where in the target image the original image is. (The third sketch below shows this matching step.)
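To make the quick-and-dirty approach concrete, here is a minimal multi-scale sketch using OpenCV's `matchTemplate` with normalized correlation. The file names, the scale range, and the threshold are assumptions you'd replace and tune for your own data:

```python
# Multi-scale sliding-window matching (a sketch, not a polished
# implementation). "template.png" and "scene.png" are placeholders.
import cv2
import numpy as np

template = cv2.imread("template.png", cv2.IMREAD_GRAYSCALE)
scene = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)

best_score, best_loc, best_scale = -1.0, None, 1.0
for scale in np.linspace(0.5, 2.0, 16):  # assumed range of scales
    resized = cv2.resize(template, None, fx=scale, fy=scale)
    if resized.shape[0] > scene.shape[0] or resized.shape[1] > scene.shape[1]:
        continue  # the window cannot be larger than the scene
    # Slide the window over the scene and score every location with
    # normalized correlation (higher = more similar).
    scores = cv2.matchTemplate(scene, resized, cv2.TM_CCOEFF_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(scores)
    if max_val > best_score:
        best_score, best_loc, best_scale = max_val, max_loc, scale

THRESHOLD = 0.8  # assumed; tune on real data
if best_score >= THRESHOLD:
    print(f"found at {best_loc}, scale {best_scale:.2f}, score {best_score:.2f}")
else:
    print("image not present")
```

Note the cost grows with the number of scales you try, which is why this only stays practical for a modest scale range.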
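For the preprocessing stage, recent OpenCV builds (4.4+) ship SIFT in the main package, combining the DoG interest point detector with the SIFT descriptor. A minimal sketch, assuming `cv2.SIFT_create` is available in your build:

```python
# Preprocessing sketch: detect interest points in the original image
# and compute a descriptor at each one. "template.png" is a placeholder.
import cv2

original = cv2.imread("template.png", cv2.IMREAD_GRAYSCALE)
sift = cv2.SIFT_create()
# detectAndCompute runs the DoG interest point detector and computes a
# 128-dimensional SIFT descriptor around each detected point.
keypoints, descriptors = sift.detectAndCompute(original, None)
print(f"{len(keypoints)} interest points detected")
```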
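For the testing stage, one common way to find the matches is a nearest-neighbour search with Lowe's ratio test, then a robust fit to recover the location. This sketch is self-contained, so it recomputes the original's descriptors; in practice you'd reuse the ones precomputed above. `MIN_MATCHES` and the 0.75 ratio are assumed values to tune:

```python
# Testing sketch: match descriptors between original and target image,
# keep only unambiguous matches, and recover where the original appears.
import cv2
import numpy as np

sift = cv2.SIFT_create()
original = cv2.imread("template.png", cv2.IMREAD_GRAYSCALE)
scene = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)
kp_orig, des_orig = sift.detectAndCompute(original, None)
kp_scene, des_scene = sift.detectAndCompute(scene, None)

# For each original descriptor, find its two nearest neighbours in the
# scene; keep the match only if the best one is clearly better than the
# runner-up (Lowe's ratio test), which discards ambiguous matches.
matcher = cv2.BFMatcher()
pairs = matcher.knnMatch(des_orig, des_scene, k=2)
good = [m for m, n in pairs if m.distance < 0.75 * n.distance]

MIN_MATCHES = 10  # assumed; "enough matches" depends on your images
if len(good) >= MIN_MATCHES:
    src = np.float32([kp_orig[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp_scene[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    # Because matches are location-specific, a robustly fitted transform
    # tells you where (and at what scale) the original image appears.
    H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    print(f"found: {len(good)} good matches, {int(inliers.sum())} inliers")
else:
    print(f"not found: only {len(good)} good matches")
```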