Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    primarykey
    data
    text
    <p>While I adore the grep solution for its elegance and for reminding (or teaching) me about a method in Enumerable that I'd forgotten (or overlooked completely), it's slow, slow, slow. I agree 100% that creating the <code>Array#mode</code> method is a good idea, however - this is Ruby, we don't need a library of functions that act on arrays, we can create a mixin that adds the necessary functions <em>into</em> the Array class itself.</p> <p>But the inject(Hash) alternative uses a sort, which we also don't really need: we just want the value with the highest occurrence.</p> <p>Neither of the solutions address the possibility that more than one value may be the mode. Maybe that's not an issue in the problem as stated (can't tell). I think I'd want to know if there was a tie, though, and anyway, I think we can improve a little on the performance.</p> <pre><code>require 'benchmark' class Array def mode1 sort_by {|i| grep(i).length }.last end def mode2 freq = inject(Hash.new(0)) { |h,v| h[v] += 1; h } sort_by { |v| freq[v] }.last end def mode3 freq = inject(Hash.new(0)) { |h,v| h[v] += 1; h } max = freq.values.max # we're only interested in the key(s) with the highest frequency freq.select { |k, f| f == max } # extract the keys that have the max frequency end end arr = Array.new(1_000) { |i| rand(100) } # something to test with Benchmark.bm(30) do |r| res = {} (1..3).each do |i| m = "mode#{i}" r.report(m) do 100.times do res[m] = arr.send(m).inspect end end end res.each { |k, v| puts "%10s = %s" % [k, v] } end </code></pre> <p>And here's output from a sample run.</p> <pre><code> user system total real mode1 34.375000 0.000000 34.375000 ( 34.393000) mode2 0.359000 0.000000 0.359000 ( 0.359000) mode3 0.219000 0.000000 0.219000 ( 0.219000) mode1 = 41 mode2 = 41 mode3 = [[41, 17], [80, 17], [72, 17]] </code></pre> <p>The "optimised" mode3 took 60% of the time of the previous record-holder. Note also the multiple highest-frequency entries.</p> <p>EDIT</p> <p>A few months down the line, I noticed <a href="https://stackoverflow.com/questions/412169/ruby-how-to-find-item-in-array-which-has-the-most-occurrences/1909239#1909239">Nilesh's answer</a>, which offered this:</p> <pre><code>def mode4 group_by{|i| i}.max{|x,y| x[1].length &lt;=&gt; y[1].length}[0] end </code></pre> <p>It doesn't work with 1.8.6 out of the box, because that version doesn't have Array#group_by. ActiveSupport has it, for the Rails developers, although it seems about 2-3% slower than mode3 above. Using the (excellent) <a href="http://github.com/marcandre/backports" rel="nofollow noreferrer">backports</a> gem, though, produces a 10-12% <em>gain</em>, as well as delivering a whole pile of 1.8.7 and 1.9 extras.</p> <p>The above applies to <strong>1.8.6</strong> only - and mainly only if installed on Windows. Since I have it installed, here's what you get from IronRuby 1.0 (on .NET 4.0):</p> <pre><code>========================== IronRuby ===================================== (iterations bumped to **1000**) user system total real mode1 (I didn't bother :-)) mode2 4.265625 0.046875 4.312500 ( 4.203151) mode3 0.828125 0.000000 0.828125 ( 0.781255) mode4 1.203125 0.000000 1.203125 ( 1.062507) </code></pre> <p>So in the event that performance is super-critical, benchmark the options on your Ruby version &amp; OS. <a href="http://en.wiktionary.org/wiki/your_mileage_may_vary" rel="nofollow noreferrer">YMMV</a>.</p>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. VO
      singulars
      1. This table or related slice is empty.
    2. VO
      singulars
      1. This table or related slice is empty.
    3. VO
      singulars
      1. This table or related slice is empty.
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload