
<p>For this specific case, use either a <a href="http://docs.python.org/2/library/collections.html#collections.Counter" rel="nofollow"><code>collections.Counter()</code></a> or a <a href="http://docs.python.org/2/library/collections.html#collections.defaultdict" rel="nofollow"><code>collections.defaultdict()</code></a> object instead:</p>

<pre><code>import collections

# `string` is the input text whose characters are being counted
dct = collections.defaultdict(int)
for c in string:
    dct[c] += 1
</code></pre>

<p>or</p>

<pre><code>dct = collections.Counter(string)
</code></pre>

<p>Both are subclasses of the standard <code>dict</code> type. The <code>Counter</code> type adds some helpful functionality, such as summing two counters or listing the most common items that have been counted. The <code>defaultdict</code> class can also be given other default factories; use <code>defaultdict(list)</code>, for example, to collect things into lists per key.</p>

<p>To compare the performance of two different approaches, use the <a href="http://docs.python.org/2/library/timeit.html" rel="nofollow"><code>timeit</code> module</a>:</p>

<pre><code>&gt;&gt;&gt; import timeit
&gt;&gt;&gt; def intest(dct, values):
...     for c in values:
...         if c in dct:
...             dct[c] += 1
...         else:
...             dct[c] = 1
...
&gt;&gt;&gt; def get(dct, values):
...     for c in values:
...         dct[c] = dct.get(c, 0) + 1
...
&gt;&gt;&gt; values = list(range(10)) * 10
&gt;&gt;&gt; timeit.timeit('test(dct, values)', 'from __main__ import values, intest as test; dct={}')
22.210275888442993
&gt;&gt;&gt; timeit.timeit('test(dct, values)', 'from __main__ import values, get as test; dct={}')
27.442166090011597
</code></pre>

<p>This shows that using <code>in</code> is a little faster than <code>dict.get()</code>.</p>

<p>There is, however, a <em>third</em> option to consider: catching the <code>KeyError</code> exception:</p>

<pre><code>&gt;&gt;&gt; def tryexcept(dct, values):
...     for c in values:
...         try:
...             dct[c] += 1
...         except KeyError:
...             dct[c] = 1
...
&gt;&gt;&gt; timeit.timeit('test(dct, values)', 'from __main__ import values, tryexcept as test; dct={}')
18.023509979248047
</code></pre>

<p>which happens to be the fastest of the three, because in this test data at most 1 in 10 lookups hits a new key, so the exception is rarely raised.</p>

<p>Last but not least, the two alternatives I proposed:</p>

<pre><code>&gt;&gt;&gt; def default(dct, values):
...     for c in values:
...         dct[c] += 1
...
&gt;&gt;&gt; timeit.timeit('test(dct, values)', 'from __main__ import values, default as test; from collections import defaultdict; dct=defaultdict(int)')
15.277361154556274
&gt;&gt;&gt; timeit.timeit('Counter(values)', 'from __main__ import values; from collections import Counter')
38.657804012298584
</code></pre>

<p>So the <code>Counter()</code> type is the slowest, but <code>defaultdict</code> is <em>very</em> fast indeed. <code>Counter()</code> objects do a lot more work, though, and the extra functionality can bring ease of development and execution speed benefits elsewhere.</p>
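<p>As a rough illustration of the extra functionality mentioned above, the sketch below sums two counters, lists the most common items, and groups values with <code>defaultdict(list)</code>. The sample text and the variable names (<code>text</code>, <code>combined</code>, <code>positions</code>) are made up for this example and do not come from the original question.</p>

```python
import collections

# Hypothetical sample input, chosen only for illustration.
text = "mississippi"

# Counter tallies every hashable item of an iterable in one call.
counts = collections.Counter(text)

# most_common(n) returns the n highest counts as (item, count) pairs.
top_two = counts.most_common(2)

# Adding two counters sums the counts key by key.
combined = counts + collections.Counter("miss")

# defaultdict(list) creates an empty list the first time a key is seen,
# so items can be appended without checking for the key first.
positions = collections.defaultdict(list)
for index, char in enumerate(text):
    positions[char].append(index)

print(top_two)
print(combined["s"])   # 4 from "mississippi" plus 2 from "miss" -> 6
print(positions["s"])  # indexes of "s" -> [2, 3, 5, 6]
```

<p>Whether the convenience outweighs the slower construction time measured above depends on how the counts are used afterwards.</p>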
 
