<p>Some misinterpretation of what the standard mandates here comes from the use of processes vs. threads, and what that means for the "handle" situation you're talking about. In particular, you missed this part:</p> <blockquote> <p>Handles can be created or destroyed by explicit user action, without affecting the underlying open file description. Some of the ways to <strong>create</strong> them include fcntl(), dup(), fdopen(), fileno(), and <strong><code>fork()</code></strong>. They can be destroyed by at least fclose(), close(), and the exec functions. [ ... ] Note that after a fork(), two handles exist where one existed before.</p> </blockquote> <p>from the POSIX spec section you quote above. The reference to "create [ handles using ] <code>fork</code>" isn't elaborated on further in this section, but the spec for <a href="http://pubs.opengroup.org/onlinepubs/9699919799/functions/fork.html" rel="noreferrer"><code>fork()</code></a> adds a little detail:</p> <blockquote> <p>The child process shall have <strong>its own copy</strong> of the parent's file descriptors. Each of the child's file descriptors shall <strong>refer to the same</strong> open file description with the corresponding file descriptor of the parent.</p> </blockquote> <p>The relevant bits here are:</p> <ul> <li>the child has <em>copies</em> of the parent's file descriptors</li> <li>the child's copies refer to the same "thing" that the parent can access via said fds</li> <li>file <em>descript</em><strong><em>ors</em></strong> and file <em>descript</em><strong><em>ions</em></strong> are <strong><em>not</em></strong> the same thing; in particular, a file descriptor is a <em>handle</em> in the above sense.</li> </ul> <p>This is what the first quote refers to when it says "<code>fork()</code> creates [ ... 
] handles" - they're created as <em>copies</em>, and therefore, from that point on, <em>detached</em>, and no longer updated in lockstep.</p> <p>In your example program, every child <em>process</em> gets its very own copy which starts at the same state, but after the act of copying, these file descriptors / handles have become <em>independent instances</em>, and therefore the writes race with each other. This is perfectly acceptable as far as the standard is concerned, because <a href="http://pubs.opengroup.org/onlinepubs/9699919799/functions/write.html" rel="noreferrer"><code>write()</code></a> only guarantees:</p> <blockquote> <p>On a regular file or other file capable of seeking, the actual writing of data shall proceed from the position in the file indicated by the file offset associated with fildes. Before successful return from write(), the file offset shall be incremented by the number of bytes actually written.</p> </blockquote> <p>This means that while they all start the write at the same offset (because the fd <em>copy</em> was initialized as such) they might, even if successful, all write different amounts (there's no guarantee by the standard that a write request of <code>N</code> bytes will write <em>exactly</em> <code>N</code> bytes; it can succeed for anything <code>0 &lt;=</code> actual <code>&lt;= N</code>), and because the ordering of the writes is unspecified, the whole example program above has unspecified results. Even if the total requested amount is written, all the standard above says is that the file offset is <em>incremented</em> - it does not say it's atomically (once only) incremented, nor does it say that the actual writing of data will happen in an atomic fashion.</p> <p>One thing is guaranteed though - you should never see anything in the file that was not either there before any of the writes or part of the data written by one of the writes. 
If you do, that'd be corruption, and a bug in the filesystem implementation. What you've observed above might well be that ... if the final results can't be explained by re-ordering of parts of the writes.</p> <p>The use of <code>O_APPEND</code> fixes this, because with that flag set, <a href="http://pubs.opengroup.org/onlinepubs/9699919799/functions/write.html" rel="noreferrer"><code>write()</code></a> (see the spec again) guarantees:</p> <blockquote> <p>If the O_APPEND flag of the file status flags is set, the file offset shall be set to the end of the file prior to each write and no intervening file modification operation shall occur between changing the file offset and the write operation.</p> </blockquote> <p>which is the "prior to" / "no intervening" serializing behaviour that you seek.</p> <p>The use of <em>threads</em> would change the behaviour partially - because threads, on creation, do not receive <em>copies</em> of the file descriptors / handles but operate on the actual (shared) one. Threads would not (necessarily) all start writing at the same offset. But the possibility of partial writes still means that you may see interleaving in ways you might not want. Yet it'd possibly still be fully standards-conformant.</p> <p><strong>Moral</strong>: Do not count on a POSIX/UNIX standard being <em>restrictive by default</em>. The specifications are deliberately relaxed in the common case, and require <em>you as the programmer</em> to be explicit about your intent.</p>
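<p>To make "be explicit about your intent" concrete, here is a minimal sketch of the two points above: looping over <code>write()</code> because a single call may transfer fewer bytes than requested, and opening the file with <code>O_APPEND</code> before <code>fork()</code> so that each write lands at the current end of the file. The names <code>write_all</code> and <code>append_from_children</code> are hypothetical helpers made up for this illustration, not part of POSIX or of the program in the question.</p>

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until all len bytes are out,
 * because one write() may legally transfer fewer bytes than requested.
 * Returns 0 on success, -1 on error (errno is left as set by write()). */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;            /* interrupted before any data: retry */
            return -1;
        }
        buf += n;                    /* skip what was actually written */
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical demo: fork nchildren processes that each append one record
 * to path through a descriptor opened with O_APPEND before the fork.
 * Returns the final file size in bytes, or -1 if anything failed. */
static long append_from_children(const char *path, int nchildren,
                                 const char *record)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_APPEND, 0644);
    if (fd < 0)
        return -1;

    int started = 0;
    for (int i = 0; i < nchildren; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;                   /* stop forking, still reap the rest */
        if (pid == 0) {
            /* Child: O_APPEND moves the offset to end-of-file before
             * each write(), so no child overwrites another's data. */
            int rc = write_all(fd, record, strlen(record));
            _exit(rc == 0 ? 0 : 1);
        }
        started++;
    }

    int failed = (started != nchildren);
    for (int i = 0; i < started; i++) {
        int status;
        if (wait(&status) < 0 || !WIFEXITED(status) ||
            WEXITSTATUS(status) != 0)
            failed = 1;
    }

    off_t size = lseek(fd, 0, SEEK_END);
    close(fd);
    return failed ? -1 : (long)size;
}
```

<p>Under these assumptions, <code>append_from_children("out.txt", 4, "record\n")</code> leaves a 28-byte file: no child's bytes are overwritten, whatever order the children run in. Note that <code>O_APPEND</code> protects each individual <code>write()</code> call only; if a record ever needed more than one call inside <code>write_all</code>, its pieces could still be interleaved with another child's.</p>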
 
