<p>For your smaller example, with few distinct elements, you can use a set and a dict comprehension:</p>
<pre><code>&gt;&gt;&gt; mylist = [1,1,1,1,1,1,2,3,2,2,2,2,3,3,4,5,5,5,5]
&gt;&gt;&gt; {k:mylist.count(k) for k in set(mylist)}
{1: 6, 2: 5, 3: 3, 4: 1, 5: 4}
</code></pre>
<p>To break it down, <code>set(mylist)</code> reduces the list to its unique values:</p>
<pre><code>&gt;&gt;&gt; set(mylist)
set([1, 2, 3, 4, 5])
</code></pre>
<p>Then the dictionary comprehension steps through the unique values and looks up each one's count in the list.</p>
<p>For a small list like this, it is also <em>significantly</em> faster than using Counter and faster than using setdefault:</p>
<pre><code>from __future__ import print_function
from collections import Counter
from collections import defaultdict
import random

mylist=[1,1,1,1,1,1,2,3,2,2,2,2,3,3,4,5,5,5,5]*10

def s1(mylist):
    # set and dict comprehension: one list.count() scan per unique value
    return {k:mylist.count(k) for k in set(mylist)}

def s2(mylist):
    # Counter
    return Counter(mylist)

def s3(mylist):
    # setdefault
    mydict=dict()
    for index in mylist:
        mydict[index] = mydict.setdefault(index, 0) + 1
    return mydict

def s4(mylist):
    # fromkeys / count
    mydict={}.fromkeys(mylist,0)
    for k in mydict:
        mydict[k]=mylist.count(k)
    return mydict

def s5(mylist):
    # dict.get
    mydict={}
    for k in mylist:
        mydict[k]=mydict.get(k,0)+1
    return mydict

def s6(mylist):
    # defaultdict
    mydict=defaultdict(int)
    for i in mylist:
        mydict[i] += 1
    return mydict

def s7(mylist):
    # fromkeys / loop
    mydict={}.fromkeys(mylist,0)
    for e in mylist:
        mydict[e]+=1
    return mydict

if __name__ == '__main__':
    import timeit
    n=1000000
    print(timeit.timeit("s1(mylist)", setup="from __main__ import s1, mylist",number=n))
    print(timeit.timeit("s2(mylist)", setup="from __main__ import s2, mylist, Counter",number=n))
    print(timeit.timeit("s3(mylist)", setup="from __main__ import s3, mylist",number=n))
    print(timeit.timeit("s4(mylist)", setup="from __main__ import s4, mylist",number=n))
    print(timeit.timeit("s5(mylist)", setup="from __main__ import s5, mylist",number=n))
    print(timeit.timeit("s6(mylist)", setup="from __main__ import s6, mylist, defaultdict",number=n))
    print(timeit.timeit("s7(mylist)", setup="from __main__ import s7, mylist",number=n))
</code></pre>
<p>On my machine that prints (Python 3, times in seconds):</p>
<pre><code>18.123854104997008   # set and dict comprehension
78.54796334600542    # Counter
33.98185228800867    # setdefault
19.0563529439969     # fromkeys / count
34.54294775899325    # dict.get
21.134678319009254   # defaultdict
22.760544238000875   # fromkeys / loop
</code></pre>
<p>For larger lists, such as 10 million integers drawn from about 1,500 distinct values, use defaultdict or fromkeys in a loop:</p>
<pre><code>from __future__ import print_function
from collections import Counter
from collections import defaultdict
import random

mylist = [random.randint(0,1500) for _ in range(10000000)]

def s1(mylist):
    # set and dict comprehension: one list.count() scan per unique value
    return {k:mylist.count(k) for k in set(mylist)}

def s2(mylist):
    # Counter
    return Counter(mylist)

def s3(mylist):
    # setdefault
    mydict=dict()
    for index in mylist:
        mydict[index] = mydict.setdefault(index, 0) + 1
    return mydict

def s4(mylist):
    # fromkeys / count
    mydict={}.fromkeys(mylist,0)
    for k in mydict:
        mydict[k]=mylist.count(k)
    return mydict

def s5(mylist):
    # dict.get
    mydict={}
    for k in mylist:
        mydict[k]=mydict.get(k,0)+1
    return mydict

def s6(mylist):
    # defaultdict
    mydict=defaultdict(int)
    for i in mylist:
        mydict[i] += 1
    return mydict

def s7(mylist):
    # fromkeys / loop
    mydict={}.fromkeys(mylist,0)
    for e in mylist:
        mydict[e]+=1
    return mydict

if __name__ == '__main__':
    import timeit
    n=1
    print(timeit.timeit("s1(mylist)", setup="from __main__ import s1, mylist",number=n))
    print(timeit.timeit("s2(mylist)", setup="from __main__ import s2, mylist, Counter",number=n))
    print(timeit.timeit("s3(mylist)", setup="from __main__ import s3, mylist",number=n))
    print(timeit.timeit("s4(mylist)", setup="from __main__ import s4, mylist",number=n))
    print(timeit.timeit("s5(mylist)", setup="from __main__ import s5, mylist",number=n))
    print(timeit.timeit("s6(mylist)", setup="from __main__ import s6, mylist, defaultdict",number=n))
    print(timeit.timeit("s7(mylist)", setup="from __main__ import s7, mylist",number=n))
</code></pre>
<p>Prints:</p>
<pre><code>2825.2697427899984   # set and dict comprehension
42.607481333994656   # Counter
22.77713537499949    # setdefault
2853.11187016801     # fromkeys / count
23.241977066005347   # dict.get
15.023175164998975   # defaultdict
18.28165417900891    # fromkeys / loop
</code></pre>
<p>You can see that the solutions that rely on <code>count</code> rescan the whole list once per distinct value, so even a moderate number of distinct values in a large list makes them catastrophically slower than the single-pass solutions.</p>
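<p>The gap between the two benchmark runs can be sketched with a rough cost model rather than a timing. The snippet below is only an illustration under stated assumptions: the function names, the fixed seed, and the reduced list size of 20,000 are hypothetical choices made so it runs quickly, and "element visits" is a simplification that ignores hashing and interpreter overhead.</p>

```python
from collections import Counter, defaultdict
import random


def count_with_list_count(items):
    """One full scan of `items` per distinct value (the s1 approach)."""
    return {value: items.count(value) for value in set(items)}


def count_single_pass(items):
    """A single scan of `items`, however many distinct values there are (the s6 approach)."""
    counts = defaultdict(int)
    for value in items:
        counts[value] += 1
    return dict(counts)


def estimated_element_visits(items):
    """Illustrative cost model, not a measurement: list elements touched by each approach."""
    n, k = len(items), len(set(items))
    return {"list.count": n * k, "single pass": n}


small = [1, 1, 1, 1, 1, 1, 2, 3, 2, 2, 2, 2, 3, 3, 4, 5, 5, 5, 5]

random.seed(0)  # hypothetical seed, only to make the sketch repeatable
large = [random.randint(0, 1500) for _ in range(20000)]  # scaled down from 10 million

# Both approaches agree with each other and with Counter.
assert count_with_list_count(small) == count_single_pass(small) == dict(Counter(small))
assert count_with_list_count(large) == count_single_pass(large)

print(estimated_element_visits(small))  # {'list.count': 95, 'single pass': 19}
print(estimated_element_visits(large))
```

<p>With only 5 distinct values the extra scans are cheap, and each one runs inside <code>list.count</code> instead of a Python-level loop, which is consistent with the small-list timings. With roughly 1,500 distinct values the number of visits grows by about that factor, which matches the direction of the large-list result.</p>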