Finding mean with least number of iterations
<p>I have a list of measurements with the following properties:</p> <ol> <li>The measurements are expensive: the fewer measurements, the better.</li> <li>They are all positive. In fact, there is a positive lower limit and I can't get any values below that. This lower limit is what I need to know with some confidence.</li> <li>They cluster around one or more median values.</li> <li>I know that another, "better" median exists when I find an outlier that is smaller than <code>median - 2*variance</code>, because the distance between the "best" median and the lower limit is always smaller than twice the width of the normal distribution.</li> </ol> <p>Goal: Find the best median in the fewest iterations, with a confidence of, say, 90%.</p> <p>I'd prefer the smallest value, but the smallest median is good enough.</p> <p>What I'm looking for is a piece of code that I can feed the measurements to and that tells me the median and how confident it is that this median is the one I seek.</p> <p>Background: I want to time Java methods. I could run the test for a couple of minutes to average the outliers out, but when looking at the data, it's pretty obvious to a human that the values quickly accumulate around the median value.</p> <p>That is, unless the JIT kicks in and the median suddenly jumps. Eventually, you end up with a curve that is very steep to the left of the smallest median (i.e. the variance on the left side of the median is low) and has a long, soft slope on the right side, with a bump where the pre-JIT median was.</p> <p><a href="http://www.pdark.de/7ac2ce265a6a19ff53cb8591536ce8c2eae8ea74/performance.zip" rel="nofollow">Sample test data (13KB)</a></p> <p><code>testConnect-count.csv</code> is a histogram of the values; <code>testConnect-history.csv</code> is the sequence of measurements. The goal is to find an algorithm that returns the smaller median, around <code>115000</code>, by reading the smallest number of values from <code>testConnect-history.csv</code>.</p>
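<p>For illustration only, here is a minimal Python sketch of the kind of routine described above. It is one possible reading of the rules, not a known solution. Everything named in it is an assumption: the functions <code>median_ci</code> and <code>estimate_lower_median</code>, the parameters <code>rel_width</code>, <code>min_cluster</code> and <code>warmup</code>, and the use of a scaled median absolute deviation as a stand-in for the "width" in <code>median - 2*variance</code>. The confidence it reports comes from the standard distribution-free (binomial order-statistic) interval for a median, so it says how well the median of the current cluster is pinned down, not how likely it is that no lower cluster will show up later.</p>

```python
import math
import statistics
from typing import NamedTuple


class Estimate(NamedTuple):
    median: float
    low: float          # lower end of the confidence interval
    high: float         # upper end of the confidence interval
    coverage: float     # achieved confidence level (>= requested)
    samples_used: int   # measurements read so far
    converged: bool     # True if the stopping rule was met


def median_ci(sorted_vals, confidence=0.90):
    """Narrowest order-statistic interval whose coverage is >= confidence.

    For n samples, [x_(k+1), x_(n-k)] (1-based) covers the true median
    with probability 1 - 2 * P(Binomial(n, 1/2) <= k).
    Returns (low, high, coverage), or None if n is too small.
    """
    n = len(sorted_vals)
    best = None
    cdf = 0.0
    for k in range(n // 2):
        # log of C(n, k) / 2^n, computed via lgamma to avoid overflow
        log_p = (math.lgamma(n + 1) - math.lgamma(k + 1)
                 - math.lgamma(n - k + 1) - n * math.log(2.0))
        cdf += math.exp(log_p)
        coverage = 1.0 - 2.0 * cdf
        if coverage < confidence:
            break
        best = (sorted_vals[k], sorted_vals[n - k - 1], coverage)
    return best


def estimate_lower_median(measurements, confidence=0.90, rel_width=0.02,
                          min_cluster=8, warmup=0):
    """Read measurements one by one and stop as early as the rules allow.

    Assumptions (not from the question): the spread is 1.4826 * MAD, a
    single value below median - 2 * spread restarts the cluster, and
    the run stops once the interval is narrower than rel_width * median
    and more than `warmup` values have been read.
    """
    cluster = []
    result = None
    for used, value in enumerate(measurements, start=1):
        if len(cluster) >= min_cluster:
            med = statistics.median(cluster)
            mad = statistics.median([abs(v - med) for v in cluster])
            spread = 1.4826 * mad
            if spread > 0 and value < med - 2 * spread:
                # Low outlier: treat it as the start of a better cluster.
                cluster = []
                result = None
        cluster.append(value)
        ci = median_ci(sorted(cluster), confidence)
        if ci is None:
            continue  # too few values in this cluster for the interval
        low, high, coverage = ci
        med = statistics.median(cluster)
        result = Estimate(med, low, high, coverage, used, False)
        if used > warmup and high - low <= rel_width * med:
            return result._replace(converged=True)
    return result  # data ran out; None if the last cluster was too small
```

<p>The values from <code>testConnect-history.csv</code> could be passed in as any iterable of numbers. Note that without a <code>warmup</code> budget this rule can stop on the pre-JIT cluster, because nothing in the data seen so far reveals that a lower median will appear later.</p>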
 
