<p>You asked a lot of questions. I'll do my best to answer some of them, and hopefully you'll be able to figure out the rest (ask if you need help).</p>
<h3>First question: explain behaviour of <code>id</code></h3>
<pre><code>&gt;&gt;&gt; n1 = N()
&gt;&gt;&gt; n2 = N()
&gt;&gt;&gt; id(n1) == id(n2)
False
</code></pre>
<p>This shows that Python creates a new object each time you call an object constructor. This makes sense, because <em>this is exactly what you asked for</em>! If you wanted to allocate only one object, but give it two names, then you could have written this:</p>
<pre><code>&gt;&gt;&gt; n1 = N()
&gt;&gt;&gt; n2 = n1
&gt;&gt;&gt; id(n1) == id(n2)
True
</code></pre>
<h3>Second question: why not copy-on-write?</h3>
<p>You go on to ask why Python doesn't implement a copy-on-write strategy for object allocation. Well, the current strategy, of constructing an object every time you call a constructor, is:</p>
<ol>
<li>simple to implement;</li>
<li>explicit (does exactly what you ask for);</li>
<li>easy to document and understand.</li>
</ol>
<p>Also, the use cases for copy-on-write are not compelling. It only saves storage if many identical objects get created and are never modified. But in that case, why create many identical objects? Why not use a single object?</p>
<h3>Third question: explain allocation behaviour</h3>
<p>In CPython, the <code>id</code> of an object is (secretly!) its address in memory. 
See the function <code>builtin_id</code> in <a href="http://hg.python.org/cpython/file/f8e7fe70c075/Python/bltinmodule.c#l907" rel="noreferrer"><code>bltinmodule.c</code>, line 907</a>.</p>
<p>You can investigate Python's memory allocation behaviour by making a class with <code>__init__</code> and <code>__del__</code> methods:</p>
<pre><code>class N:
    def __init__(self):
        print "Creating", id(self)
    def __del__(self):
        print "Destroying", id(self)

&gt;&gt;&gt; id(N())
Creating 4300023352
Destroying 4300023352
4300023352
</code></pre>
<p>You can see that Python was able to destroy the object immediately, which allows it to reclaim the space for re-use by the next allocation. Python uses <a href="http://en.wikipedia.org/wiki/Reference_counting" rel="noreferrer">reference counting</a> to keep track of how many references there are to each object, and when there are no more references to an object, it gets destroyed. Within the execution of the same statement, the same memory may get re-used several times. For example:</p>
<pre><code>&gt;&gt;&gt; id(N()), id(N()), id(N())
Creating 4300023352
Destroying 4300023352
Creating 4300023352
Destroying 4300023352
Creating 4300023352
Destroying 4300023352
(4300023352, 4300023352, 4300023352)
</code></pre>
<h3>Fourth question: explain the "juggling"</h3>
<p><s>I am afraid I cannot reproduce the "juggling" behaviour you exhibit (where alternately created objects get different addresses). Can you give more details, such as Python version and operating system? What results do you get if you use my class <code>N</code>?</s></p>
<p>OK, I can reproduce the juggling if I make my class <code>N</code> inherit from <code>object</code>.</p>
<p>I have a theory about why this happens, but I have not checked it in a debugger, so please take it with a pinch of salt.</p>
<p>First, you need to understand a bit about how Python's memory manager works. 
Go read through <a href="http://hg.python.org/cpython/file/f8e7fe70c075/Objects/obmalloc.c" rel="noreferrer"><code>obmalloc.c</code></a> and come back when you're done. I'll wait.</p>
<p>...</p>
<p>All understood? Good. So now you know that Python manages small objects by sorting them into pools by size: each 4 KiB pool contains objects in a small range of sizes, and there's a free list to help the allocator to quickly find a slot for the next object to be allocated.</p>
<p>Now, the Python interactive shell is also creating objects: the abstract syntax tree and the compiled byte code, for example. My theory is that when <code>N</code> is a new-style class, its size is such that it goes into the same pool as some other object that is allocated by the interactive shell. So the sequence of events looks something like this:</p>
<ol>
<li><p>User enters <code>id(N())</code></p></li>
<li><p>Python allocates a slot in pool <em>P</em> for the object just created (call this slot <em>A</em>).</p></li>
<li><p>Python destroys the object and returns its slot to the free list for pool <em>P</em>.</p></li>
<li><p>The interactive shell allocates some object, call it <em>O</em>. This happens to be the right size to go into pool <em>P</em>, so it gets slot <em>A</em> that was just freed.</p></li>
<li><p>User enters <code>id(N())</code> again.</p></li>
<li><p>Python allocates a slot in pool <em>P</em> for the object just created. Slot <em>A</em> is full (still contains object <em>O</em>), so it gets slot <em>B</em> instead.</p></li>
<li><p>The interactive shell forgets about object <em>O</em>, so it gets destroyed, and slot <em>A</em> is returned to the free list for pool <em>P</em>.</p></li>
</ol>
<p>You can see that this explains the alternating behaviour. 
In the case where the user types <code>id(N()),id(N())</code>, the interactive shell doesn't get a chance to stick its oar in between the two allocations, so they can both go in the same slot in the pool.</p>
<p>This also explains why it doesn't happen for old-style objects. Presumably the old-style objects are a different size, so they go in a different pool, and don't share slots with whatever objects the interactive shell is creating.</p>
<h3>Fifth question: what objects might the interactive shell be allocating?</h3>
<p>See <a href="http://hg.python.org/cpython/file/5c7520e02d5a/Python/pythonrun.c" rel="noreferrer"><code>pythonrun.c</code></a> for the details, but basically the interactive shell:</p>
<ol>
<li><p>Reads your input and allocates strings containing your code.</p></li>
<li><p>Calls the parser, which constructs an abstract syntax tree describing the code.</p></li>
<li><p>Calls the compiler, which constructs the compiled byte code.</p></li>
<li><p>Calls the evaluator, which allocates objects for stack frames, locals, globals etc.</p></li>
</ol>
<p>I don't know exactly which of these objects is to blame for the "juggling". Not the input strings (strings have their own specialized allocator); not the abstract syntax tree (it gets thrown away after it's been compiled). Maybe it's the byte code object.</p>
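<p>If you want to check the create/destroy ordering from a script rather than by reading shell output, here is a minimal sketch. It is not taken from the CPython sources: the <code>events</code> list and the <code>demo</code> function are names made up for illustration, and it is written for Python 3, where every class is new-style. It records calls in a list instead of printing them. The immediate destruction of temporaries relies on CPython's reference counting, so other implementations may behave differently.</p>

```python
events = []  # (kind, id) pairs, appended in the order the calls happen


class N(object):
    """Tracing class modelled on N above; it records events instead of printing."""

    def __init__(self):
        events.append(("create", id(self)))

    def __del__(self):
        events.append(("destroy", id(self)))


def demo():
    # Two objects that are alive at the same time always have distinct ids.
    n1 = N()
    n2 = N()
    distinct = id(n1) != id(n2)

    # A second name bound to the same object shares its id.
    alias = n1
    same = id(alias) == id(n1)
    return distinct, same


distinct, same = demo()

# Temporaries: on CPython each object is destroyed as soon as id() returns,
# before the next N() is constructed. Other implementations may differ.
events.clear()
ids = (id(N()), id(N()), id(N()))
kinds = [kind for kind, _ in events]

print(distinct, same)
print(kinds)
print(len(set(ids)), "distinct id(s) among the three temporaries")
```

<p>Whether the three temporaries end up with the same <code>id</code> depends on what else the allocator has handed out in the meantime, so the sketch prints the count rather than assuming a value.</p>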
 
