Fuzzy Template matching?
<p>I'm attempting to wrap my head around the basics of CV. The bit that initially got me interested was template matching (it was mentioned in a PyCon talk unrelated to CV), so I figured I'd start there.</p>
<p>I started with this image:</p>
<p><img src="https://i.stack.imgur.com/cn7PB.jpg" alt="Scene from SMB3"></p>
<p>Out of which I want to detect Mario. So I cut him out:</p>
<p><img src="https://i.stack.imgur.com/auXZU.png" alt="The Plumber"></p>
<p>I understand the concept of sliding the template around the image to see the best fit, and following a tutorial, I'm able to find Mario with the following code:</p>
<pre><code>import time
import cv  # legacy OpenCV Python bindings

def match_template(img, template):
    s = time.time()
    img_size = cv.GetSize(img)
    template_size = cv.GetSize(template)

    # The result map holds one score per valid placement of the template.
    img_result = cv.CreateImage((img_size[0] - template_size[0] + 1,
                                 img_size[1] - template_size[1] + 1),
                                cv.IPL_DEPTH_32F, 1)
    cv.Zero(img_result)

    cv.MatchTemplate(img, template, img_result, cv.CV_TM_CCORR_NORMED)
    min_val, max_val, min_loc, max_loc = cv.MinMaxLoc(img_result)
    # inspect.getargspec(cv.MinMaxLoc)
    print min_val
    print max_val
    print min_loc
    print max_loc

    # Draw a box around the best-scoring placement.
    cv.Rectangle(img, max_loc,
                 (max_loc[0] + template.width, max_loc[1] + template.height),
                 cv.Scalar(120.), 2)
    print time.time() - s

    cv.NamedWindow("Result")
    cv.ShowImage("Result", img)
    cv.WaitKey(0)
    cv.DestroyAllWindows()
</code></pre>
<p>So far so good, but then I came to realize that this is incredibly fragile. It will only ever find Mario with that specific background, and with that specific animation frame being displayed.</p>
<p>So I'm curious: given that Mario will always have the same Mario-ish attributes (size, colors), is there a technique with which I could find him regardless of whether his current frame is standing still or one of the various run cycle sprites? Kind of like the fuzzy matching that you can do on strings, but for images.</p>
<p>Maybe since he's the only red thing, there is a way of simply tracking the red pixels?</p>
<p>The whole other issue is removing the background from the template. Maybe that would help the MatchTemplate function find Mario even though he doesn't exactly match the template? As of now, I'm not entirely sure how that would work (I see that there is a mask param in MatchTemplate, but I'll have to investigate further).</p>
<p>My main question is whether template matching is the way to go about detecting an image that is mostly the same but varies (like when he's walking), or whether there is another technique I should look into.</p>
<h2>Update:</h2>
<h2>Attempts at matching other Marios</h2>
<hr>
<p>Going off of mmgp's suggestion that it should be workable for matching other things, I ran a couple of tests.</p>
<p>I used this as the template to match:</p>
<p><img src="https://i.stack.imgur.com/EYs9B.png" alt="Super mario"></p>
<p>And then took a couple of screenshots to test the matching against.</p>
<p>For the first, I successfully find Mario, and get a max value of 1.</p>
<p><img src="https://i.stack.imgur.com/RyYor.png" alt="enter image description here"></p>
<p>However, trying to find jumping Mario results in a complete misfire.</p>
<p><img src="https://i.stack.imgur.com/zBL1Y.png" alt="Misfire"></p>
<p>Now granted, the Mario in the template and the Mario in the scene are facing opposite directions, as well as being different animation frames, but I would think they still match a <em>lot</em> more than anything else in the image -- if only for the colors alone. But it targets the platform as being the closest match to the template.</p>
<p>Note that the max value for this one was <code>0.728053808212</code>.</p>
<p>Next I tried a scene <em>without</em> Mario to see what would happen.</p>
<p><img src="https://i.stack.imgur.com/szqDb.png" alt="enter image description here"></p>
<p>But oddly enough, I get the <em>exact</em> same result as the image with jumping Mario -- right down to the similarity value: <code>0.728053808212</code>. Mario being in the picture is just as accurate as him <em>not</em> being in the picture.</p>
<p>Really strange! I don't know the actual details of the underlying algorithm, but I'd imagine, from a standard deviation perspective, the boxes in the scene that at least match the red in template Mario's suit would be closer to the mean distance than a blue platform, right? So, it's extra confusing that it's not even in the general area of where I would expect it to be.</p>
<p>I'm guessing this is user error on my end, or maybe just a misunderstanding.</p>
<p>Why would a scene with a similar Mario have as much of a match as a scene with no Mario at all?</p>
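One possible contributor to the identical scores above (an editorial sketch, not something confirmed in the thread): `CV_TM_CCORR_NORMED` normalizes the raw cross-correlation without subtracting the mean, so a uniformly bright patch can still score highly against a textured template. The snippet below hand-rolls both normalizations in plain Python on made-up pixel values. `ccorr_normed`, `ccoeff_normed`, `template`, and `flat_patch` are illustrative names, not OpenCV calls, and returning 0.0 for a zero denominator is a choice made for this sketch.

```python
import math

def ccorr_normed(patch, template):
    """Normalized cross-correlation of two equal-length pixel lists.

    No mean is subtracted, which mirrors the CCORR_NORMED formula.
    """
    num = sum(p * t for p, t in zip(patch, template))
    den = math.sqrt(sum(p * p for p in patch) * sum(t * t for t in template))
    return num / den if den else 0.0  # 0.0 on a zero denominator is this sketch's choice

def ccoeff_normed(patch, template):
    """Same score after subtracting each list's mean (the CCOEFF_NORMED idea)."""
    mean_p = sum(patch) / len(patch)
    mean_t = sum(template) / len(template)
    return ccorr_normed([p - mean_p for p in patch],
                        [t - mean_t for t in template])

# Hypothetical 3x3 template with a strong checker-like pattern, flattened.
template = [200, 40, 200, 40, 200, 40, 200, 40, 200]
# Hypothetical featureless region of the scene (e.g. a flat platform).
flat_patch = [180] * 9

self_score = ccorr_normed(template, template)    # perfect match
flat_ccorr = ccorr_normed(flat_patch, template)  # high despite no structure
flat_ccoeff = ccoeff_normed(flat_patch, template)  # structure-less patch scores 0

print(self_score, flat_ccorr, flat_ccoeff)
```

If this effect is what is happening here, switching the method to `cv.CV_TM_CCOEFF_NORMED` would be a cheap thing to try.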
    1. Given the nature of the application you are working with, template matching can work pretty well and should be considered until you can show a good example of where it fails. Your application doesn't require either scale invariance or translation invariance, and needs only a bit of rotation invariance -- the only problem I can see is when Mario is ducking. To solve that, just use a second template where Mario is ducking and run the template matching twice. Matching by red colors is certainly possible, but you will need to combine it with another tool to make it adequate.
    2. @mmgp I added a few examples of the failures I'm getting. I'm having pretty poor luck matching anything other than an exact copy of my template. I added a couple of pictures so you can see what's happening.
    3. Fair enough. Here is a result I can get: http://i.imgur.com/6RfS51J.png (yellow rectangle); for the image without Mario, the maximum cross-correlated point I got was ~0.34. Clarification: I used only the red channel, computed the interior morphological gradient of both the input and the template, and only then performed the cross-correlation. The maximum value I got was ~0.44, which is low but justifiable. The template is very different from the Mario in that other frame: for instance, now I see two hands, two eyes, two feet, and just about everything else is at least slightly different too.
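The pipeline described in the last comment (red channel, interior morphological gradient, then cross-correlation) could be sketched roughly as follows. This is a plain-Python approximation on a tiny synthetic red channel, assuming a 3x3 square structuring element and a normalized cross-correlation score; the comment does not say which element or normalization was used, and every function name here is hypothetical.

```python
import math

def erode3x3(img):
    """Grayscale erosion with a 3x3 square; borders use in-bounds neighbours only."""
    h, w = len(img), len(img[0])
    return [[min(img[yy][xx]
                 for yy in range(max(0, y - 1), min(h, y + 2))
                 for xx in range(max(0, x - 1), min(w, x + 2)))
             for x in range(w)]
            for y in range(h)]

def interior_gradient(img):
    """Interior morphological gradient: image minus its erosion (inner outlines)."""
    eroded = erode3x3(img)
    return [[img[y][x] - eroded[y][x] for x in range(len(img[0]))]
            for y in range(len(img))]

def ccorr_normed(a, b):
    """Normalized cross-correlation of two equal-length lists (assumed score)."""
    den = math.sqrt(sum(v * v for v in a) * sum(v * v for v in b))
    return sum(x * y for x, y in zip(a, b)) / den if den else 0.0

def best_match(scene, template):
    """Slide the template over the scene; return (best score, (x, y) top-left)."""
    th, tw = len(template), len(template[0])
    flat_t = [v for row in template for v in row]
    best_score, best_loc = -1.0, None
    for y in range(len(scene) - th + 1):
        for x in range(len(scene[0]) - tw + 1):
            patch = [v for row in scene[y:y + th] for v in row[x:x + tw]]
            s = ccorr_normed(patch, flat_t)
            if s > best_score:
                best_score, best_loc = s, (x, y)
    return best_score, best_loc

# Synthetic "red channel": dim background (50) with a 3x3 bright blob (200).
scene = [[50] * 8 for _ in range(6)]
for y in range(2, 5):
    for x in range(4, 7):
        scene[y][x] = 200

# Template: same shape but different brightness and background level.
template = [[0] * 5 for _ in range(5)]
for y in range(1, 4):
    for x in range(1, 4):
        template[y][x] = 255

# Match on outlines rather than raw intensities.
score, loc = best_match(interior_gradient(scene), interior_gradient(template))
print(score, loc)
```

Because only the outlines survive the gradient step, the differing fill and background levels in this toy case no longer affect the match.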
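The first comment's suggestion of matching with more than one template (for example standing and ducking) and keeping the best hit might look like this. It is a toy sketch on small integer grids with a hand-written, mean-subtracted normalized score; `slide`, `best_of`, and the two tiny "sprites" are invented for illustration and stand in for real `MatchTemplate` calls.

```python
import math

def ccoeff_normed(a, b):
    """Mean-subtracted normalized correlation of two equal-length lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    den = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    return sum(x * y for x, y in zip(da, db)) / den if den else 0.0

def slide(scene, template):
    """Return (best score, (x, y) top-left) over all placements of the template."""
    th, tw = len(template), len(template[0])
    flat_t = [v for row in template for v in row]
    best_score, best_loc = -2.0, None
    for y in range(len(scene) - th + 1):
        for x in range(len(scene[0]) - tw + 1):
            patch = [v for row in scene[y:y + th] for v in row[x:x + tw]]
            s = ccoeff_normed(patch, flat_t)
            if s > best_score:
                best_score, best_loc = s, (x, y)
    return best_score, best_loc

def best_of(scene, templates):
    """Match every template once and keep the highest-scoring one."""
    results = {name: slide(scene, t) for name, t in templates.items()}
    name = max(results, key=lambda n: results[n][0])
    return name, results[name][0], results[name][1]

# Hypothetical sprites: a 3-row "standing" pose and a 2-row "ducking" pose.
templates = {
    "standing": [[9, 1], [1, 9], [9, 9]],
    "ducking": [[1, 9], [9, 1]],
}

# Scene with a flat background (5) and the ducking sprite at x=2, y=1.
scene = [
    [5, 5, 5, 5, 5],
    [5, 5, 1, 9, 5],
    [5, 5, 9, 1, 5],
    [5, 5, 5, 5, 5],
]

name, score, loc = best_of(scene, templates)
print(name, score, loc)
```

With real sprites the scores would rarely be this clean, so a minimum-score threshold for "no Mario found" would likely be needed as well.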