StackOverflow2013

Note that there are some explanatory texts on larger screens.

plurals

PO
text
Body
copied!<p>I think a good metric will be:</p> <pre><code>let the result set be s1,s2,...,sk let MAX be max{sum(si) for each i} f({s1,...,sk}) = Sigma(MAX-sum(si)) for each i) </code></pre> <p>the upside: a perfect distribution will yield always 0! <br>the downside: if there is none perferct solution, the best result will not yield 0.</p> <p>a greedy heuristic for this problem will be:</p> <pre><code>sorted<-sort(S) (let's say sorted[0] is the highest) s1=s2=...=sk= {} for each x in sorted: s <- find_min() (*) s.add(x) </code></pre> <p>where find_min() yields s such that sum(s) <= sum(si) for each si.</p> <p>this solution's will yield f (metrics defined above) such that <code>f(sol) <= (k-1)*max{S}</code> (from here it is a proof for this bound):</p> <p><br><strong>claim</strong>: for each subset s, <code>MAX- sum(s) <= max{S}</code> <br><strong>proof</strong> - by induction: at each step, the claim is true for the temporary solution. <br>in each step, let MAX be max{sum(si)} at the start of the iteration (before the add)!</p> <pre><code>base: the set of subsets at start is {},{},.. MAX=sum(si)=0 for each si. step: assume the assumption is true for iteration i, we'll show it is also true for iteration i+1: let s be the set that x was added to, then MAX-sum(s) <= max{S} (induction assumption). if sum(s) + x <= MAX: we are done, MAX was not changed. else: we sorted the elements at start, so x <= max{S}, and thus if s was chosen (sum(si) >= sum(s) for each si) and sum(s) + x > MAX then: for each si, sum(si) + x >= sum(s) + x, so sum(s)+x - sum(si) <= x <= max{S}. since sum(s)+x will be the MAX next iteration, we are done. </code></pre> <p>because for each set <code>MAX-sum(si) <= max{S}</code> (and obviously, for the max set, <code>MAX-sum(si)=0</code>), at overall <code>Sigma(MAX-sum(si)) <= (k-1)*max{S}</code>, as promised.</p> <p><strong>EDIT :</strong> <br>I had some spare time, so I programmed both heuristics suggested by me and by @Akhil, and both metrics, first of all, both results are conclusive (according to <a href="http://en.wikipedia.org/wiki/Wilcoxon_signed-rank_test" rel="nofollow noreferrer">Wilcoxon</a>'s pair-t test), but which is better is defined by which metric you choose, surprisingly, the algorithm which tried to minimize f() (@Akhil`s) scored lower for this same f, but higher for the second metric. <img src="https://i.stack.imgur.com/UkaWg.png" alt="@Akhil's metrics graph"></p> <p><img src="https://i.stack.imgur.com/O1V5O.png" alt="@Amit's metrics graph"></p>

Querying!

Guidance

An individual column

Larger individual text columns get their own page to allow for proper reading.

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload