StackOverflow2013


Algorithm to detect corners of paper sheet in photo
<p>What is the best way to detect the corners of an invoice/receipt/sheet-of-paper in a photo? This is to be used for subsequent perspective correction, before OCR.</p>
<h2>My current approach has been:</h2>
<p>RGB > Gray > Canny edge detection with thresholding > Dilate(1) > Remove small objects(6) > Clear border objects > Pick largest blob based on convex area > [corner detection - not implemented]</p>
<p>I can't help but think there must be a more robust 'intelligent'/statistical approach to handle this type of segmentation. I don't have a lot of training examples, but I could probably get 100 images together.</p>
<h2>Broader context:</h2>
<p>I'm using MATLAB to prototype, and planning to implement the system in OpenCV and Tesseract-OCR. This is the first of a number of image processing problems I need to solve for this specific application, so I'm looking to roll my own solution and re-familiarize myself with image processing algorithms.</p>
<p>Here are some sample images that I'd like the algorithm to handle. If you'd like to take up the challenge, the large images are at <a href="http://madteckhead.com/tmp">http://madteckhead.com/tmp</a></p>
<p><a href="http://madteckhead.com/tmp/IMG_0773_sml.jpg">case 1 http://madteckhead.com/tmp/IMG_0773_sml.jpg</a> <a href="http://madteckhead.com/tmp/IMG_0774_sml.jpg">case 2 http://madteckhead.com/tmp/IMG_0774_sml.jpg</a> <a href="http://madteckhead.com/tmp/IMG_0775_sml.jpg">case 3 http://madteckhead.com/tmp/IMG_0775_sml.jpg</a> <a href="http://madteckhead.com/tmp/IMG_0776_sml.jpg">case 4 http://madteckhead.com/tmp/IMG_0776_sml.jpg</a></p>
<h2>In the best case this gives:</h2>
<p><a href="http://madteckhead.com/tmp/IMG_0773_canny.jpg">case 1 - canny http://madteckhead.com/tmp/IMG_0773_canny.jpg</a> <a href="http://madteckhead.com/tmp/IMG_0773_postcanny.jpg">case 1 - post canny http://madteckhead.com/tmp/IMG_0773_postcanny.jpg</a> <a href="http://madteckhead.com/tmp/IMG_0773_blob.jpg">case 1 - largest blob http://madteckhead.com/tmp/IMG_0773_blob.jpg</a></p>
<h2>However it fails easily on other cases:</h2>
<p><a href="http://madteckhead.com/tmp/IMG_0774_canny.jpg">case 2 - canny http://madteckhead.com/tmp/IMG_0774_canny.jpg</a> <a href="http://madteckhead.com/tmp/IMG_0774_postcanny.jpg">case 2 - post canny http://madteckhead.com/tmp/IMG_0774_postcanny.jpg</a> <a href="http://madteckhead.com/tmp/IMG_0774_blob.jpg">case 2 - largest blob http://madteckhead.com/tmp/IMG_0774_blob.jpg</a></p>
<p>Thanks in advance for all the great ideas! I love SO!</p>
<h2>EDIT: Hough Transform Progress</h2>
<p>Q: What algorithm would cluster the Hough lines to find corners? Following advice from the answers, I was able to use the Hough transform, pick lines, and filter them. My current approach is rather crude. I've made the assumption that the invoice will always be less than 15 degrees out of alignment with the image. I end up with reasonable results for lines if this is the case (see below), but I am not entirely sure of a suitable algorithm to cluster the lines (or vote) to extrapolate the corners. The Hough lines are not continuous, and in the noisy images there can be parallel lines, so some form of distance-from-origin metric is required. Any ideas?</p>
<p><a href="http://madteckhead.com/tmp/IMG_0773_hough.jpg">case 1 http://madteckhead.com/tmp/IMG_0773_hough.jpg</a> <a href="http://madteckhead.com/tmp/IMG_0774_hough.jpg">case 2 http://madteckhead.com/tmp/IMG_0774_hough.jpg</a> <a href="http://madteckhead.com/tmp/IMG_0775_hough.jpg">case 3 http://madteckhead.com/tmp/IMG_0775_hough.jpg</a> <a href="http://madteckhead.com/tmp/IMG_0776_hough.jpg">case 4 http://madteckhead.com/tmp/IMG_0776_hough.jpg</a></p>
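<p>To make the clustering question concrete, here is a minimal sketch of one possible approach, written in plain Python rather than MATLAB. It is not a tested solution for the sample images. It assumes the Hough lines arrive as (rho, theta) pairs in the normal form x*cos(theta) + y*sin(theta) = rho, reuses the 15 degree tilt assumption, and assumes that the widest gap between parallel lines separates the two opposite page edges. All function names (<code>page_corners</code>, <code>two_edges</code>, and so on) and the <code>width</code>/<code>height</code> parameters are hypothetical, chosen only for illustration.</p>

```python
import math


def classify(theta, max_tilt):
    """Label a line 'v' (near-vertical), 'h' (near-horizontal) or None."""
    t = theta % math.pi  # undirected angle of the line normal, in [0, pi)
    if min(t, math.pi - t) <= max_tilt:
        return "v"
    if abs(t - math.pi / 2) <= max_tilt:
        return "h"
    return None


def span_vertical(rho, theta, height):
    """Points where a near-vertical line crosses the rows y=0 and y=height."""
    c, s = math.cos(theta), math.sin(theta)
    return ((rho / c, 0.0), ((rho - height * s) / c, float(height)))


def span_horizontal(rho, theta, width):
    """Points where a near-horizontal line crosses the columns x=0 and x=width."""
    c, s = math.cos(theta), math.sin(theta)
    return ((0.0, rho / s), (float(width), (rho - width * c) / s))


def mean_segment(segments):
    """Average the two end points of a group of segments."""
    n = len(segments)
    return tuple(
        (sum(seg[i][0] for seg in segments) / n,
         sum(seg[i][1] for seg in segments) / n)
        for i in (0, 1)
    )


def two_edges(segments, axis):
    """Split segments at the widest gap in mid-position along `axis`
    (0 = x, 1 = y) and return the averaged segment of each side."""
    if len(segments) < 2:
        raise ValueError("need at least two lines per orientation")

    def mid(seg):
        return (seg[0][axis] + seg[1][axis]) / 2

    ordered = sorted(segments, key=mid)
    gaps = [mid(b) - mid(a) for a, b in zip(ordered, ordered[1:])]
    cut = gaps.index(max(gaps)) + 1
    return mean_segment(ordered[:cut]), mean_segment(ordered[cut:])


def intersect(a, b):
    """Intersection of the infinite lines through segments a and b."""
    (x1, y1), (x2, y2) = a
    (x3, y3), (x4, y4) = b
    d = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if abs(d) < 1e-12:
        raise ValueError("lines are parallel")
    det_a = x1 * y2 - y1 * x2
    det_b = x3 * y4 - y3 * x4
    return ((det_a * (x3 - x4) - (x1 - x2) * det_b) / d,
            (det_a * (y3 - y4) - (y1 - y2) * det_b) / d)


def page_corners(lines, width, height, max_tilt_deg=15.0):
    """Return corners as [top-left, top-right, bottom-right, bottom-left]."""
    tilt = math.radians(max_tilt_deg)
    vertical, horizontal = [], []
    for rho, theta in lines:
        kind = classify(theta, tilt)
        if kind == "v":
            vertical.append(span_vertical(rho, theta, height))
        elif kind == "h":
            horizontal.append(span_horizontal(rho, theta, width))
    left, right = two_edges(vertical, axis=0)
    top, bottom = two_edges(horizontal, axis=1)
    return [intersect(top, left), intersect(top, right),
            intersect(bottom, right), intersect(bottom, left)]
```

<p>Measuring each line by where it crosses the image frame, instead of comparing raw rho values, sidesteps the wrap-around between theta near 0 and theta near pi, and gives the "distance" metric needed to tell parallel lines apart. The widest-gap split is the weakest assumption here: strong parallel lines inside the page (table rules, for example) or in the background could be mistaken for an edge, so a voting or RANSAC-style step may still be needed on the noisy cases.</p>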
