Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    text
    copied!<p>In order to compute the warping, you need to compute a <a href="http://en.wikipedia.org/wiki/Homography" rel="noreferrer">homography</a> between the four corners of your input rectangle and the screen.</p> <p>Since your webcam polygon seems to have an arbitrary shape, a full perspective homography can be used to convert it to a rectangle. It's not that complicated, and you can solve it with a mathematical function (should be easily available) known as <a href="http://en.wikipedia.org/wiki/Singular_value_decomposition" rel="noreferrer">Singular Value Decomposition or SVD</a>.</p> <p><strong>Background information:</strong></p> <p>For planar transformations like this, you can easily describe them with a homography, which is a 3x3 matrix <code>H</code> such that if any point on or in your webcam polygon, say <code>x1</code> were multiplied by <code>H</code>, i.e. <code>H*x1</code>, we would get a point on the screen (rectangular), i.e. <code>x2</code>.</p> <p>Now, note that these points are represented by their homogeneous coordinates which is nothing but adding a third coordinate (the reason for which is beyond the scope of this post). So, suppose your coordinates for <code>X1</code> were, <code>(100,100)</code>, then the homogeneous representation would be a column vector <code>x1 = [100;100;1]</code> (where <code>;</code> represents a new row).</p> <p>Ok, so now we have 8 homogeneous vectors representing 4 points on the webcam polygon and the 4 corners of your screen - this is all we need to compute a homography.</p> <p><strong>Computing the homography:</strong></p> <p><strong><em>A little math:</em></strong> I'm not going to get into the math, but briefly this is how we solve it:</p> <p>We know that 3x3 matrix <code>H</code>,</p> <pre><code>H = h11 h12 h13 h21 h22 h23 h31 h32 h33 where hij represents the element in H at the ith row and the jth column </code></pre> <p>can be used to get the new screen coordinates by <code>x2 = H*x1</code>. Also, the result will be something like <code>x2 = [12;23;0.1]</code> so to get it in the screen coordinates, we normalize it by the third element or <code>X2 = (120,230)</code> which is <code>(12/0.1,23/0.1)</code>.</p> <p>So this means each point in your webcam polygon (<code>WP</code>) can be multiplied by <code>H</code> (and then normalized) to get your screen coordinates (<code>SC</code>), i.e.</p> <pre><code>SC1 = H*WP1 SC2 = H*WP2 SC3 = H*WP3 SC4 = H*WP4 where SCi refers to the ith point in screen coordinates and WPi means the same for the webcam polygon </code></pre> <p><strong><em>Computing H:</em></strong> (the quick and painless explanation)</p> <p>Pseudocode:</p> <pre><code>for n = 1 to 4 { // WP_n refers to the 4th point in the webcam polygon X = WP_n; // SC_n refers to the nth point in the screen coordinates // corresponding to the nth point in the webcam polygon // For example, WP_1 and SC_1 is the top-left point for the webcam // polygon and the screen coordinates respectively. x = SC_n(1); y = SC_n(2); // A is the matrix which we'll solve to get H // A(i,:) is the ith row of A // Here we're stacking 2 rows per point correspondence on A // X(i) is the ith element of the vector X (the webcam polygon coordinates, e.g. (120,230) A(2*n-1,:) = [0 0 0 -X(1) -X(2) -1 y*X(1) y*X(2) y]; A(2*n,:) = [X(1) X(2) 1 0 0 0 -x*X(1) -x*X(2) -x]; } </code></pre> <p>Once you have A, just compute <code>svd(A)</code> which will give decompose it into <em>U,S,V<sup>T</sup></em> (such that <em>A = USV<sup>T</sup></em>). The vector corresponding to the smallest singular value is <code>H</code> (once you reshape it into a 3x3 matrix). </p> <p>With <code>H</code>, you can retrieve the "warped" coordinates of your widget marker location by multiplying it with <code>H</code> and normalizing.</p> <p><strong>Example:</strong></p> <p>In your particular example if we assume that your screen size is 800x600,</p> <pre><code>WP = 98 119 583 569 86 416 80 409 1 1 1 1 SC = 0 799 0 799 0 0 599 599 1 1 1 1 where each column corresponds to corresponding points. </code></pre> <p>Then we get:</p> <pre><code>H = -0.0155 -1.2525 109.2306 -0.6854 0.0436 63.4222 0.0000 0.0001 -0.5692 </code></pre> <p>Again, I'm not going into the math, but if we normalize <code>H</code> by <code>h33</code>, i.e. divide each element in <code>H</code> by <code>-0.5692</code> in the example above,</p> <pre><code>H = 0.0272 2.2004 -191.9061 1.2042 -0.0766 -111.4258 -0.0000 -0.0002 1.0000 </code></pre> <p>This gives us a lot of insight into the transformation. </p> <ul> <li><code>[-191.9061;-111.4258]</code> defines the translation of your points (in pixels)</li> <li><code>[0.0272 2.2004;1.2042 -0.0766]</code> defines the <a href="http://en.wikipedia.org/wiki/Transformation_matrix#Affine_transformations" rel="noreferrer">affine transformation</a> (which is essentially scaling and rotation). </li> <li>The last <code>1.0000</code> is so because we scaled <code>H</code> by it and </li> <li><code>[-0.0000 -0.0002]</code> denotes the projective transformation of your webcam polygon.</li> </ul> <p>Also, you can check if <code>H</code> is accurate my multiplying <code>SC = H*WP</code> and normalizing each column with its last element:</p> <pre><code>SC = H*WP 0.0000 -413.6395 0 -411.8448 -0.0000 0.0000 -332.7016 -308.7547 -0.5580 -0.5177 -0.5554 -0.5155 </code></pre> <p>Dividing each column, by it's last element (e.g. in column 2, <code>-413.6395/-0.5177</code> and <code>0/-0.5177</code>):</p> <pre><code>SC -0.0000 799.0000 0 799.0000 0.0000 -0.0000 599.0000 599.0000 1.0000 1.0000 1.0000 1.0000 </code></pre> <p>Which is the desired result.</p> <p><strong>Widget Coordinates:</strong></p> <p>Now, your widget coordinates can be transformed as well <code>H*[452;318;1]</code>, which (after normalizing is <code>(561.4161,440.9433)</code>.</p> <p>So, this is what it would look like after warping: <img src="https://imgur.com/QLNDQ.png" alt="Warped"></p> <p>As you can see, the green <code>+</code> represents the widget point after warping.</p> <p><strong>Notes:</strong></p> <ol> <li>There are some nice pictures in this <a href="http://plus.maths.org/issue23/features/criminisi/index.html" rel="noreferrer">article</a> explaining homographies.</li> <li>You can play with transformation matrices <a href="http://mathforum.org/mathimages/index.php/Transformation_Matrix" rel="noreferrer">here</a></li> </ol> <h3>MATLAB Code:</h3> <pre><code>WP =[ 98 119 583 569 86 416 80 409 1 1 1 1 ]; SC =[ 0 799 0 799 0 0 599 599 1 1 1 1 ]; A = zeros(8,9); for i = 1 : 4 X = WP(:,i); x = SC(1,i); y = SC(2,i); A(2*i-1,:) = [0 0 0 -X(1) -X(2) -1 y*X(1) y*X(2) y]; A(2*i,:) = [X(1) X(2) 1 0 0 0 -x*X(1) -x*X(2) -x]; end [U S V] = svd(A); H = transpose(reshape(V(:,end),[3 3])); H = H/H(3,3); </code></pre> <h3>A</h3> <pre><code> 0 0 0 -98 -86 -1 0 0 0 98 86 1 0 0 0 0 0 0 0 0 0 -119 -416 -1 0 0 0 119 416 1 0 0 0 -95081 -332384 -799 0 0 0 -583 -80 -1 349217 47920 599 583 80 1 0 0 0 0 0 0 0 0 0 -569 -409 -1 340831 244991 599 569 409 1 0 0 0 -454631 -326791 -799 </code></pre>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload