Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    text
    copied!<p>You should watch this <a href="https://www.youtube.com/watch?v=C4Kc8xzcA68" rel="noreferrer">video</a> (although it is CPython<sup>1</sup> specific and about dictionaries -- but I assume it applies to sets as well). </p> <p>Basically, python hashes the elements and takes the last N bits (where N is determined by the size of the set) and uses those bits as array indices to place the object in memory. The objects are then yielded in the order they exist in memory. Of course, the picture gets a little more complicated when you need to resolve collisions between hashes, but that's the gist of it.</p> <p>Also note that the order that they are printed out is determined by the order that you put them in (due to collisions). So, if you reorder the list you pass to <code>set_2</code>, you might get a different order out if there are key collisions.</p> <p>For example:</p> <pre><code>list1 = [8,16,24] set(list1) #set([8, 16, 24]) list2 = [24,16,8] set(list2) #set([24, 16, 8]) </code></pre> <p><em>Note</em> the fact that the order is preserved in these sets is "coincidence" and has to do with collision resolution (which I don't know anything about). The point is that the last 3 bits of <code>hash(8)</code>, <code>hash(16)</code> and <code>hash(24)</code> are the same. Because they are the same, collision resolution takes over and puts the elements in "backup" memory locations instead of the first (best) choice and so whether <code>8</code> occupies a location or <code>16</code> is determined by which one arrived at the party first and took the "best seat".</p> <p>If we repeat the example with <code>1</code>, <code>2</code> and <code>3</code>, you will get a consistent order no matter what order they have in the input list:</p> <pre><code>list1 = [1,2,3] set(list1) # set([1, 2, 3]) list2 = [3,2,1] set(list2) # set([1, 2, 3]) </code></pre> <p>since the last 3 bits of <code>hash(1)</code>, <code>hash(2)</code> and <code>hash(3)</code> are unique.</p> <hr> <p><sup>1</sup><strong>Note</strong> The implementation described here applies to CPython <code>dict</code> and <code>set</code>. I think that the general description is valid for all modern versions of CPython up to 3.6. However, starting with CPython3.6, there is an additional implementation detail that actually preserves the insertion order for iteration for <code>dict</code>. It appears that <code>set</code> still do not have this property. The data structure is described by <a href="https://morepypy.blogspot.com/2015/01/faster-more-memory-efficient-and-more.html" rel="noreferrer">this blog post</a> by the pypy folks (who started using this before the CPython folks). The original idea (at least for the python ecosystem) <a href="https://mail.python.org/pipermail/python-dev/2012-December/123028.html" rel="noreferrer">is archived on the python-dev mailing list</a>.</p>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload