  1. Need for the last '\0' in fgets
    <p>I've seen several uses of <code>fgets</code> (for example, <a href="https://stackoverflow.com/a/7672607/1317875">here</a>) that go like this:</p>
    <pre><code>char buff[7]="";
</code></pre>
    <p>(...)</p>
    <pre><code>fgets(buff, sizeof(buff), stdin);
</code></pre>
    <p>The point being that, if I supply a long input like "aaaaaaaaaaa", <code>fgets</code> will truncate it to "aaaaaa" here, because the 7th character will be used to store <code>'\0'</code>.</p>
    <p>However, when doing this:</p>
    <pre><code>int i=0;
for (i=0;i&lt;7;i++)
{
    buff[i]='a';
}
printf("%s\n",buff);
</code></pre>
    <p>I will always get 7 <code>'a'</code>s, and the program will not crash. But if I try to write 8 <code>'a'</code>s, it will.</p>
    <p>As I found out later, the reason for this is that, at least on my system, when I allocate <code>char buff[7]</code> (with or without <code>=""</code>), the 8th byte (counting from 1, not from 0) gets set to 0. My guess is that things are done this way precisely so that a <code>for</code> loop with 7 writes, followed by a formatted string read, can succeed whether or not the last character written was <code>'\0'</code>, thus sparing the programmer the need to set the last <code>'\0'</code> himself when writing chars individually.</p>
    <p>From this, it follows that in the case of</p>
    <pre><code>fgets(buff, sizeof(buff), stdin);
</code></pre>
    <p>followed by an input that is too long, the resulting <code>buff</code> string will automatically have two <code>'\0'</code> characters: one inside the array, and one right after it that was written by the system.</p>
    <p>I have also observed that doing</p>
    <pre><code>fgets(buff,(sizeof(buff)+17),stdin);
</code></pre>
    <p>will still work, and output a very long string, without crashing. My guess is that this is because <code>fgets</code> will keep writing up to <code>sizeof(buff)+17</code>, and the last char to be written will be precisely a <code>'\0'</code>, ensuring that any subsequent string read terminates properly (although the memory is messed up anyway).</p>
    <p>But then, what about <code>fgets(buff, (sizeof(buff)+1),stdin);</code>? This would use up all the space that was rightfully allocated in <code>buff</code>, and then write a <code>'\0'</code> right after it, thus overwriting... the <code>'\0'</code> previously written by the system. In other words, yes, <code>fgets</code> would go out of bounds, but it can be proven that when adding only one to the length of the write, the program will never crash.</p>
    <p>So in the end, here comes the question: why does <code>fgets</code> always terminate its write with a <code>'\0'</code>, when another <code>'\0'</code>, placed by the system right after the array, already exists? Why not do as in the one-by-one <code>for</code>-loop based write, which can access the whole of the array and write anything the programmer wants, without endangering anything?</p>
    <p>Thank you very much for your answer!</p>
    <p>EDIT: indeed, no proof is possible as long as I do not know whether this 8th <code>'\0'</code>, which mysteriously appears upon allocation of <code>buff[7]</code>, is part of the C standard or not, specifically for string arrays. If not, then... it's just luck that it works :-)</p>
    1. Be careful about assuming that because something doesn't crash, it's actually correct; errors like this usually result in "undefined behavior". Sometimes you'll get a segfault, sometimes you won't. If you have `buff[7]`, there's no guarantee that the 8th byte will be a `\0`; it could be anything.
    2. "I will always get 7 'a's, and the program will not crash" - That you expect it *should/could* crash at least suggests you understand *undefined behavior*. Regarding your question: because that is how [`fgets`](http://en.cppreference.com/w/c/io/fgets) is required to behave. If you have `char a;` and pass `&a` with some arbitrary size greater than 1, would you *expect* anything *definitive*? It's a C API, and like most, either useful or undefined in its behavior, depending on how *you* call it.
    3. I understand that any testing on my single machine will never prove anything. I was just thinking of the string viewed as an array, where you write anything you want without thinking about what the last cell would contain (like you would in an `int[]`), and then thought of as a string, i.e. as a word, with the omnipresent fear of the missing terminating character. Because of that, I thought the standard might have included this 8th `'\0'` as a hard-set parameter. As I don't know the details of the standard, I was asking the question: is this eighth `'\0'` part of the C standard?