Measure variation of data points from a line; To Catch a Dip
<h1>How can I measure this area in C++?</h1>

<p><em>(update: I posted the solution and code as an answer rather than edit the question again)</em></p>

<p><img src="https://i.stack.imgur.com/uThZH.png" alt="How to quantify the blue area between ideal curve and measured curve"><br>
The ideal line (dashed red) is the plot from the starting point with the average rise added at each angle of measurement; I obtain this via the average. The black line is the measured test data. How can I quantify the area of the dip in blue? The X-axis is unitized, so the slopes and the math are simplified.</p>

<p>I could determine a cutoff for the size of areas like this and then flag the part for retesting or failure. Rarely, another dip appears closer to the right, but setting a cutoff value for standard deviation usually fails those parts.</p>

<h2>Update</h2>

<p>Diego's answer helped me visualize this. Now that I can see what I'm trying to do, I'll work on the algorithm to implement the "homemade dip detector". :)<br>
<img src="https://i.stack.imgur.com/0PQJw.png" alt="Better visualization of the problem"></p>

<hr>

<h2>Why?</h2>

<p>I created a <a href="https://forum.sparkfun.com/viewtopic.php?f=14&amp;t=36254" rel="nofollow noreferrer" title="Forum thread describing the hardware">test bench</a> to test the throttle position sensors I'm selling. I'm trying to programmatically quantify how straight the plot is by analyzing the data collected. This one particular model is vexing me.</p>

<p>Sample plot of a part I prefer not to sell:
<img src="https://i.stack.imgur.com/gfx2w.jpg" alt="Test data has a curve in it"></p>

<p>The X axis is evenly spaced angles of throttle opening. The stepper motor turns the input shaft, stopping every 0.75° to measure the output on a 10-bit ADC, which gets translated to the Y axis. The plot is the translation of <code>data[idx]</code> to <code>idx,value</code> mapped to <code>(x,y)</code> bitmap coordinates. I then draw lines between the points within the bitmap using Bresenham's algorithm.</p>

<p><em>My other TPS products produce <a href="http://ca-cycleworks.com/media/catalog/product/p/f/pf1c_printout.jpg" rel="nofollow noreferrer" title="Picture of a PF1C tps plot">amazingly linear output</a>.</em></p>

<p>The lower (left) portion of the plot is crucial to normal usage of any motor vehicle; it's what you use when driving around town, entering parking lots, etc. This particular part has a tendency to develop a dip around 15° of opening, and I wish to use the program to quantify this "dip" in the curve and rely less upon the tester's intuition. In the example above, the plot dips but doesn't return to what an ideal line might be.</p>

<p>Even though this is an embedded application, printing the report takes 10 seconds, so I do not consider stepping through an array of 120 data points multiple times a waste of cycles. Also, since I'm using a <a href="http://www.digilentinc.com/Products/Detail.cfm?Prod=CHIPKIT-UC32" rel="nofollow noreferrer" title="Manufacturer Website">uC32 PIC32 microcontroller</a>, there's plenty of memory, so I have the luxury of being able to ponder this problem within the controller.</p>

<hr>

<h2>What I'm trying already</h2>

<p><strong>Array of rise between test points:</strong> I dismiss the X-axis entirely, considering it unitized, and make an array of the change from one reading to the next. This array is what contributes to the report's "Min rise between points: 0 Max: 14". I call this array <kbd>deltas</kbd>.</p>
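<p><em>For concreteness, a minimal sketch of how such a <kbd>deltas</kbd> array and the average rise might be built from the raw readings. The names <code>data</code>, <code>numPoints</code>, and <code>buildDeltas</code> are placeholders for illustration, not the actual code on the bench.</em></p>

<pre><code>// Sketch: build the deltas array (rise between consecutive ADC readings)
// and return the average rise, i.e. the slope of the ideal line.
float buildDeltas(const int* data, int numPoints, float* deltas) {
    float avgRise = 0.0f;
    for (int idx = 1; idx &lt; numPoints; idx++) {
        deltas[idx - 1] = (float)(data[idx] - data[idx - 1]);
        avgRise += deltas[idx - 1];
    }
    return avgRise / (numPoints - 1);   // slope of the dashed red ideal line
}
</code></pre>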
<p>I've tried using <strong>standard deviation</strong> on <kbd>deltas</kbd>; however, during testing I have found that a low Std Dev is not a reliable measure for this part. If the dip quickly returns to the original line implied by the early data points, the Std Dev can be deceptively low (observed to be as low as 2.3), but the part is still something I wouldn't want to use. I tried setting a cutoff at 2.6, but it failed too many parts with great plots. <em>The other, more linear part linked to above can reliably count on Std Dev for quality.</em></p>

<p><strong>Kurtosis</strong> seems not to apply to this situation at all. I learned of <a href="http://en.wikipedia.org/wiki/Kurtosis" rel="nofollow noreferrer" title="Wiki article">Kurtosis</a> today and found a <a href="http://www.johndcook.com/skewness_kurtosis.html" rel="nofollow noreferrer">Statistics Library</a> which includes Kurtosis and Skewness. During continued testing, I found that neither measure showed a trend in sign or amplitude that corresponded to passing or failing. That same gentleman has shared a linear regression library, but I believe Lin Reg is unrelated to my situation, as I am comfortable with the assumption that the AVG of <kbd>deltas</kbd> is my ideal line. Linear Regression and R^2 are more for finding a line from less ideal data or much larger sets.</p>

<p><strong>Comparing each delta to AVG and Std Dev:</strong> I set up a monitor to check each delta against the final average of the <kbd>deltas</kbd> data. Here, too, I couldn't find a reliable metric. Too many good parts would not pass a test restricting any delta to within 2x Std Dev of the Average. Ultimately, the only restriction I could settle on is that each delta be within <code>AVG+Std Dev</code> of the AVG itself. Anything more restrictive would fail otherwise good parts, and the elusive dip around 15° opening can still sneak through this test.</p>

<p><strong>Homemade dip detector:</strong> When feeding <kbd>deltas</kbd> to the serial monitor of the computer, I observed consecutive negative <kbd>deltas</kbd> during the dip, so I programmed in a dip detector, but it feels very crude to me. If there are 5 or more negative <kbd>deltas</kbd> in a row, I sum them. I have seen that if I take the sum of the dip's differences from AVG and divide it by the number of negative deltas, a value over 2.9 or 3 could mean a fail. I have observed dips lasting from 6 to 15 deltas. Readily observable dips would have their differences from AVG sum up to -35.</p>
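<p><em>For reference, a minimal sketch of that crude dip detector. Since the raw deltas in the sample output below stay positive, this sketch interprets "negative deltas" as deltas falling below the average rise; the function name, parameters, and handling of the 2.9 cutoff are placeholders for illustration, not the code running on the bench.</em></p>

<pre><code>// Sketch of the homemade dip detector: look for a run of 5 or more
// consecutive deltas below the average rise, accumulate how far they
// fall below it, and compare the per-delta shortfall against a cutoff.
bool detectDip(const float* deltas, int count, float avgRise) {
    int   runLength    = 0;     // consecutive below-average deltas so far
    float runShortfall = 0.0f;  // accumulated (delta - avgRise), negative in a dip

    for (int idx = 0; idx &lt; count; idx++) {
        float diff = deltas[idx] - avgRise;
        if (diff &lt; 0.0f) {
            runLength++;
            runShortfall += diff;
        } else {
            runLength    = 0;
            runShortfall = 0.0f;
        }
        // 5+ consecutive below-average deltas averaging ~3 counts low -> a dip
        if (runLength &gt;= 5 &amp;&amp; (-runShortfall / runLength) &gt; 2.9f) {
            return true;
        }
    }
    return false;
}
</code></pre>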
<p><strong>Trending accumulated variation from the AVG:</strong> The above made me think that watching the summation of the <kbd>deltas</kbd>' differences from AVG as it wanders could be the answer. Meaning, I step through the array and sum the difference of each delta from AVG. I thought I was on to something until a good part blew this theory. I was seeing a trend: the fewer points at which the running sum strayed from the AVG by more than 2x AVG, the straighter the line appeared. Many ideal parts would only show 8 or fewer delta points where the <code>sumOfDiffs</code> would stray from the AVG very far.</p>

<pre><code>float sumOfDiffs = 0.0;
for( int idx = 0; idx &lt; stop; idx++ ){
    float spread = deltas[idx] - line-&gt;AdcAvgRise;
    sumOfDiffs = sumOfDiffs + spread;
    ...
    testVal = 2*line-&gt;AdcAvgRise;
    if( sumOfDiffs &gt; testVal || sumOfDiffs &lt; -testVal ){
        flag = 'S';
    }
    ...
}
</code></pre>

<p>And then a part with a fantastic linear plot came through with 58 data points where <code>sumOfDiffs</code> was more than twice the AVG! I find this amazing, as at the end of the ~120 data points, the <code>sumOfDiffs</code> value is -0.000057.</p>

<p>During testing, the final <code>sumOfDiffs</code> result would often register as 0.000000, and only on exceptionally bad parts would it be greater than 0.000100. I found this quite surprising, actually: how a "bad part" can have accumulated such great accuracy.</p>

<p><strong>Sample output from monitoring sumOfDiffs:</strong> The output below shows a dip happening. The test flags ('S') every point at which the running <code>sumOfDiffs</code> is more than 2x the AVG away from the AVG. This dip lasts from <kbd>deltas</kbd> <code>idx</code> 23 through 49; it starts at 17.25° and lasts for 19.5°.</p>

<pre><code>Avg rise: 6.75  Std dev: 2.577

idx: delta  diff from avg  sumOfDiffs  Flag
 23:   5        -1.75        -14.05     S
 24:   6        -0.75        -14.80     S
 25:   7         0.25        -14.55     S
 26:   5        -1.75        -16.30     S
 27:   3        -3.75        -20.06     S
 28:   3        -3.75        -23.81     S
 29:   7         0.25        -23.56     S
 30:   4        -2.75        -26.31     S
 31:   2        -4.75        -31.06     S
 32:   8         1.25        -29.82     S
 33:   6        -0.75        -30.57     S
 34:   9         2.25        -28.32     S
 35:   8         1.25        -27.07     S
 36:   5        -1.75        -28.82     S
 37:  15         8.25        -20.58     S
 38:   7         0.25        -20.33     S
 39:   5        -1.75        -22.08     S
 40:   9         2.25        -19.83     S
 41:  10         3.25        -16.58     S
 42:   9         2.25        -14.34     S
 43:   3        -3.75        -18.09     S
 44:   6        -0.75        -18.84     S
 45:  11         4.25        -14.59     S
 47:   3        -3.75        -16.10     S
 48:   8         1.25        -14.85     S
 49:   8         1.25        -13.60     S

Final Sum of diffs: 0.000030

RunningStats analysis:
  NumDataValues= 125
  Mean= 6.752
  StandardDeviation= 2.577
  Skewness= 0.251
  Kurtosis= -0.277
</code></pre>

<hr>

<p><strong>Sobering note about quality:</strong> what started me on this journey was learning that major automotive OEM suppliers consider a 4-point test to be the standard measure for these parts. My <a href="https://www.facebook.com/photo.php?fbid=10151332087166366&amp;set=a.10151332087146366.1073741825.144922416365&amp;type=3&amp;theater" rel="nofollow noreferrer" title="Pic of v1 bench on Facebook">first test bench</a> used an Arduino with 8k of RAM, had neither a TFT display nor a printer, and had a mechanical resolution of only 3°! Back then I simply tested that the <kbd>deltas</kbd> stayed within arbitrary total bounds and chose a limit on how big any single delta could be. My 120+ point test feels high class compared to that 30-point test from before, but that test had no idea about these dips.</p>
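<p><em>For illustration only, a minimal sketch of one way to quantify the blue dip area asked about at the top of this question: build the ideal line from the starting reading plus the average rise per step, and accumulate how far the measured curve falls below it. Because the X-axis is unitized, the "area" reduces to a sum of vertical shortfalls. The names and the reset-on-recovery behaviour are assumptions, not the solution posted in the answer.</em></p>

<pre><code>// Sketch: measure the dip area as the accumulated shortfall of the measured
// data below the ideal line (start value + avgRise per step). X is unitized,
// so the area is simply a running sum of vertical differences.
float measureDipArea(const int* data, int numPoints, float avgRise) {
    float worstArea   = 0.0f;  // largest dip area found so far
    float currentArea = 0.0f;  // shortfall accumulated in the current dip

    for (int idx = 0; idx &lt; numPoints; idx++) {
        float ideal     = data[0] + avgRise * idx;  // ideal line at this step
        float shortfall = ideal - data[idx];        // positive inside a dip

        if (shortfall &gt; 0.0f) {
            currentArea += shortfall;               // still inside the dip
            if (currentArea &gt; worstArea) worstArea = currentArea;
        } else {
            currentArea = 0.0f;                     // back at or above the ideal line
        }
    }
    return worstArea;   // compare against a cutoff to flag the part for retest
}
</code></pre>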