Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    primarykey
    data
    text
    <p><strong>On the problem's complexity</strong><br> Since an <em>exact</em> count is requested (as opposed to say asking if <em>at least</em> x inputs are on), the problem is very clearly <code>O(n)</code> : </p> <ul> <li>one needs to visit every single input, and,</li> <li>the work for each input is independent from n (while the work for a given input may vary based on the particular value of the input, the amount of work [for this input] doesn't vary if the number of inputs is changed.)</li> </ul> <p>We can certainly implement sub-optimal algorithms which, for example, would [unnecessarily] visit all other inputs as each input is being processed, making it an O(n^2) implementation, but that is of course silly.</p> <p>This being asserted, the questions is therefore probably about...<br> <strong>tricks which would make the implementation faster</strong><br> It should be noted that while such tricks may exist, the <em>complexity</em> of the algorithm/problem remains stubbornly at O(n).<br></p> <p>Trick 1: <strong>Better storage for inputs</strong><br> Unfortunately, the question indicates that the inputs come from named variables and that the cost for any conversion of the input [for the sake of allowing faster counting] would have to be taken into account for the evaluation of the overall performance of the algorithm. Although this eventually depends on the underlying language, runtime etc., the need to account for the conversion cost very likely condemns any algorithm based on alternate storage to be slower than solutions which keep the inputs as-is.</p> <p>Trick 2: <strong>short-circuit the evaluation</strong><br> The idea is to return <code>false</code> as soon (or shortly after) as either</p> <ul> <li>the running count of inputs that are on is bigger than X the number (or, if we are counting the number of input that are off, when this count exceeds (n - X))</li> <li>the number of inputs left to test plus the running count of inputs that are on is less that X. (or something similar in the case of counting the off inputs).</li> </ul> <p>This trick is relatively straight forward, but the extra cost for computing the values needed in the early exit tests may offset the gains made by [statically] exiting early.</p> <p>Trick 3: <strong>use reverse logic</strong>: count the number of inputs that are off rather than these are are on. (or count both).<br> The cost/benefits of this approach depends on both the number of on input to test for (the X of the question) and on the statistical prior we may have about the inputs (is the number on on-inputs at a given time relatively uniformly distibuted or do we tend to have only a few inputs on (or off)).</p> <p>The <a href="https://stackoverflow.com/questions/6695333/most-efficient-algorithm-for-determining-if-x-out-of-n-inputs-are-true/6696154#6696154">solution proposed by Chris Acheson</a> provides a baseline for the use of both Trick 2 and Trick 3. Assuming that we could make a few assumptions about the distribution of the inputs' state, additional performance improvements to this baseline would be driven such "priors": some quick heuristics done prior to the counting of the inputs per se would determine whether which state we should count (on or off or both), which limit we should test for etc. and branch to the corresponding "versions" of the algorithm.</p> <p>Additional gains are also possible, if we know the individual probability of a given input to be on or off, as we'd then test for the most (or least) likely ones first, to quickly get to our "short circuit value".</p> <p><strong>On the best-case/worse-case "complexity" of these optimized algorithms</strong><br> Assuming that</p> <ul> <li>the number of inputs that are on at a given time is uniformly distributed</li> <li>all inputs have a 50% change of being on at a given time</li> <li>X is randomly selected between 1 and n</li> </ul> <p>A combination of Tricks #2 and #3 could be <code>O(X/2)</code> on average (I need to do the math, but that seems right). However I think it wiser to talk in terms of <code>number of operations</code> relative to X and/or n, rather than misusing the O-notation...<br> Assuming that all operations roughly incur the same cost</p> <ul> <li>Initialize a counter or variable</li> <li>Test an input or a variable</li> <li>addition or subtraction</li> <li>etc</li> </ul> <p>It is easier and also more exact to compute the total number of operations needed for a given algorithm to complete, and hence to use such counts, for various best/worse/average cases to help decide on specific algorithms.<br> To illustrate this, a naive implementation that merely would systematically count all on-inputs and only compare the counter at the end, would be of O(n) complexity and complete in all cases in roughly 1 + 2*n + 1 operations. Such an algorithm may prove to be overall ,better than a fancier algorithm which while being, say, O(X), O((X+n)/2) and O(n) in the best, average and worse cases, respectively, may well use X*3, (X+n)* 1.5, and n*3 operations in these same cases. </p>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. VO
      singulars
      1. This table or related slice is empty.
    2. VO
      singulars
      1. This table or related slice is empty.
    3. VO
      singulars
      1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload