Recognizing similar shapes at random scale and translation
Playing around with finding stuff on a graphical screen, I'm currently at a loss about how to find a given shape within an image. The shape in the image could have a different scale and will be at some unknown x,y offset, of course.

Aside from pixel artifacts resulting from different scales, there is also a little noise in both images, so I need a somewhat tolerant search.

Here's the image I am looking for:

[Image: Farmerama frame] https://i.stack.imgur.com/jrIB2.png

It should show up somewhere in a screen dump of my (dual) screen buffer, roughly 3300 x 1200 pixels in size. I'd of course expect to find it in a browser window, but that information shouldn't be necessary.

The object of this exercise (so far) is to come up with a result that says:

- Yes, the wooden frame (of that approximate color and that, possibly slightly truncated, shape) was found on my screen (or not); and
- the game's client area (the black area inside the frame) occupies the rectangle from (x1,y1) to (x2,y2).

I would like to be robust against scaling and the noise that's likely to be introduced by dithering. On the other hand, I can rule out some of the usual CV challenges, such as rotation or non-rigidity. That frame shape is dead easy for the human brain to discern; how hard can it be for a dedicated piece of software? This is an Adobe Flash application, and until recently I had thought that perceiving the images from a game GUI would be easy as pie.

I'm looking for an algorithm that finds the x,y translation at which the greatest overlap between the needle and the haystack occurs, if possible without having to iterate through a series of candidate scale factors. Ideally, the algorithm could abstract out the "shape-ness" of the images in a way that's independent of scale.

I've read some interesting things about Fourier transforms for accomplishing something similar: given a target image at the same scale, an FFT and some matrix math yield the points in the bigger image that correspond to the search pattern. But I don't have the theoretical background to put this into practice, nor do I know whether this approach will gracefully handle the scale problem. Help would be appreciated!

Technology: I'm programming in Clojure/Java but could adapt algorithms from other languages. I think I should be able to interface with libraries that follow C calling conventions, but I would prefer a pure Java solution.

---

You may be able to understand why I've shied away from presenting the actual image. It's just a silly game, but the task of screen-reading it is proving much more challenging than I had thought.

I'm obviously able to do an exhaustive search of my screen buffer for the very pixels (excluding the black) that make up my image, and that even runs in under a minute. But my ambition was to find that wooden frame using a technique that matches the shape regardless of differences that might arise from scaling and dithering.

Dithering, in fact, is one of many frustrations I'm having with this project. I've been working on extracting some useful vectors by edge extraction, but edges are woefully elusive because the pixels of any given area have widely inconsistent colors, so it's hard to tell real edges from local dithering artifacts. I had no idea that such a simple-looking game would produce graphics that are so hard for software to perceive.

Should I start off by locally averaging pixels before I look for features? Should I reduce color depth by throwing out the least significant bits of the pixel color values?

I'm trying for a pure Java solution (actually programming in a Clojure/Java mix), so I'm not wild about OpenCV (which installs .DLLs or .so files with C code). Please don't worry about my choice of language; the learning experience is much more interesting to me than performance.
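To make those last two questions concrete, here is the sort of preprocessing I have in mind, as a rough pure-Java sketch. The 4x4-style block size and the number of discarded bits are arbitrary placeholder parameters, and the screen dump is assumed to be available as a BufferedImage (e.g. from java.awt.Robot.createScreenCapture); nothing here is meant as a finished implementation.

```java
import java.awt.image.BufferedImage;

/** Rough preprocessing sketch: block-average the screen dump and drop the
 *  low-order bits of each color channel to tame dithering noise.
 *  Block size and discarded-bit count are illustrative choices only. */
public class Preprocess {

    /** Downscale by averaging each block-by-block tile of pixels
     *  (trailing pixels that don't fill a full tile are dropped). */
    public static BufferedImage blockAverage(BufferedImage src, int block) {
        int w = src.getWidth() / block, h = src.getHeight() / block;
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                long r = 0, g = 0, b = 0;
                for (int dy = 0; dy < block; dy++) {
                    for (int dx = 0; dx < block; dx++) {
                        int rgb = src.getRGB(x * block + dx, y * block + dy);
                        r += (rgb >> 16) & 0xFF;
                        g += (rgb >> 8) & 0xFF;
                        b += rgb & 0xFF;
                    }
                }
                int n = block * block;
                out.setRGB(x, y, (int) ((r / n) << 16 | (g / n) << 8 | (b / n)));
            }
        }
        return out;
    }

    /** Zero out the lowest 'bits' bits of every color channel. */
    public static BufferedImage reduceDepth(BufferedImage src, int bits) {
        int mask = (0xFF << bits) & 0xFF;
        BufferedImage out = new BufferedImage(src.getWidth(), src.getHeight(),
                                              BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < src.getHeight(); y++) {
            for (int x = 0; x < src.getWidth(); x++) {
                int rgb = src.getRGB(x, y);
                int r = ((rgb >> 16) & 0xFF) & mask;
                int g = ((rgb >> 8) & 0xFF) & mask;
                int b = (rgb & 0xFF) & mask;
                out.setRGB(x, y, (r << 16) | (g << 8) | b);
            }
        }
        return out;
    }
}
```

Both passes use only java.awt.image.BufferedImage from the JDK, so this would stay pure Java; whether it actually makes the edges easier to extract is exactly what I'm unsure about.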
Comments:

1. It is not clear what kind of frequency-domain approach you are referring to. My best guess, given the problem, is comparison by Fourier descriptors. These can easily be made rotation-, translation-, and scale-invariant, which makes them helpful for your problem. You begin by extracting each contour of the connected components in your binary image, then sample each one and compute its Fourier descriptors. The same is done for the "needle" image. Then you can try matching shapes using these descriptors (a rough sketch of this idea appears after these comments). But there are many other methods for this task, depending on other hidden (or forgotten) requirements.
2. Also check out SIFT and SURF if these algorithms aren't familiar to you; Gary Bradski's book Learning OpenCV can provide some guidance. Several commercial vision libraries ($$) have implementations of "robust shape matching" that simplify setup. http://en.wikipedia.org/wiki/SURF
3. Carl, could you post some of the original sample images (and/or a link to an archive of sample images)? Are you looking for a robust solution, an easy solution, a fun/complex solution just to test, or the "optimal" solution (for some problem domain/market)? There are statistical descriptors, Fourier descriptors, etc., but there are also techniques that may be a little easier to get your head around and may work well enough for your purpose. (I also retagged your question to add "opencv" and "image-processing" so that it gets a bit more attention.)
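As a rough illustration of the Fourier-descriptor idea from the first comment (a sketch, not a drop-in implementation): the contour is assumed to have already been extracted as arrays of x and y coordinates by some other step; a naive O(n^2) DFT is used for clarity; the DC term is skipped for translation invariance, magnitudes are normalized by the first harmonic for scale invariance, and only magnitudes are kept so that rotation and the choice of starting point don't matter.

```java
/** Sketch: invariant Fourier descriptors for a closed contour given as
 *  (x, y) point arrays. Naive O(n^2) DFT for clarity; the contour
 *  extraction that produces the points is assumed to happen elsewhere. */
public class FourierDescriptors {

    /** xs/ys: contour coordinates; k: number of descriptors to keep. */
    public static double[] describe(double[] xs, double[] ys, int k) {
        int n = xs.length;
        double[] desc = new double[k];
        double scale = 0;
        for (int f = 1; f <= k; f++) {            // skip f = 0: translation invariance
            double re = 0, im = 0;
            for (int t = 0; t < n; t++) {
                double ang = -2 * Math.PI * f * t / n;
                // accumulate (xs[t] + i*ys[t]) * e^{i*ang}
                re += xs[t] * Math.cos(ang) - ys[t] * Math.sin(ang);
                im += xs[t] * Math.sin(ang) + ys[t] * Math.cos(ang);
            }
            double mag = Math.hypot(re, im);      // magnitude: rotation/start-point invariant
            if (f == 1) scale = mag;              // normalize by first harmonic: scale invariant
            desc[f - 1] = mag / scale;            // desc[0] is always 1 by construction
        }
        return desc;
    }

    /** Compare two descriptor vectors with plain Euclidean distance. */
    public static double distance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }
}
```

A small distance between the descriptors of a screen contour and those of the frame's contour would then suggest a match, independent of where on the screen (and at what size) the frame appears.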
 
