StackOverflow2013


need for the last '\0' in fgets
<p>I've seen several usages of <code>fgets</code> (for example, <a href="https://stackoverflow.com/a/7672607/1317875">here</a>) that go like this:</p>
<pre><code>char buff[7]="";
</code></pre>
<p>(...)</p>
<pre><code>fgets(buff, sizeof(buff), stdin);
</code></pre>
<p>The point being that, if I supply a long input like "aaaaaaaaaaa", <code>fgets</code> will truncate it to "aaaaaa" here, because the 7th byte is used to store <code>'\0'</code>.</p>
<p>However, when doing this:</p>
<pre><code>int i=0;
for (i=0;i<7;i++)
{
    buff[i]='a';
}
printf("%s\n",buff);
</code></pre>
<p>I always get 7 <code>'a'</code>s, and the program does not crash. But if I try to write 8 <code>'a'</code>s, it does.</p>
<p>As I saw later, the reason for this is that, at least on my system, when I allocate <code>char buff[7]</code> (with or without <code>=""</code>), the 8th byte (counting from 1, not from 0) gets set to 0. My guess is that things are done this way precisely so that a <code>for</code> loop with 7 writes, followed by a formatted string read, can succeed whether or not the last character written was <code>'\0'</code>, thus sparing the programmer from setting the last <code>'\0'</code> himself when writing chars individually.</p>
<p>From this, it follows that in the case of</p>
<pre><code>fgets(buff, sizeof(buff), stdin);
</code></pre>
<p>followed by an input that is too long, the resulting <code>buff</code> string will automatically have two <code>'\0'</code> characters: one inside the array, and one right after it that was written by the system.</p>
<p>I have also observed that doing</p>
<pre><code>fgets(buff,(sizeof(buff)+17),stdin);
</code></pre>
<p>still works, and outputs a very long string without crashing. My guess is that this is because <code>fgets</code> keeps writing up to <code>sizeof(buff)+17</code> bytes, and the last char written is precisely a <code>'\0'</code>, ensuring that any subsequent string read terminates properly (although the memory is messed up anyway).</p>
<p>But then, what about <code>fgets(buff, (sizeof(buff)+1),stdin);</code>? This would use up all the space that was rightfully allocated in <code>buff</code>, and then write a <code>'\0'</code> right after it, thus overwriting... the <code>'\0'</code> previously written by the system. In other words, yes, <code>fgets</code> would go out of bounds, but it can be proven that when adding only one to the length of the write, the program will never crash.</p>
<p>So in the end, here comes the question: why does <code>fgets</code> always terminate its write with a <code>'\0'</code>, when another <code>'\0'</code>, placed by the system right after the array, already exists? Why not do as in the one-by-one <code>for</code>-loop write, which can access the whole array and write anything the programmer wants without endangering anything?</p>
<p>Thank you very much for your answer!</p>
<p>EDIT: indeed, no proof is possible as long as I do not know whether this 8th <code>'\0'</code>, which mysteriously appears upon allocation of <code>buff[7]</code>, is part of the C standard or not, specifically for char arrays. If not, then... it's just luck that it works :-)</p>
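For comparison, here is a minimal sketch of the two kinds of write discussed in the question, done so that nothing depends on a byte outside the array. The helper names `fill_terminated` and `read_line` are hypothetical and chosen only for illustration. The sketch assumes that no zero byte is guaranteed to follow `buff`, which is the possibility raised in the EDIT, so the terminator is always stored inside the array.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: fill buf with (size - 1) copies of c and store
   the terminator explicitly in the last element. Nothing outside
   buf[0..size-1] is read or written. */
void fill_terminated(char *buf, size_t size, char c)
{
    assert(buf != NULL && size > 0);
    memset(buf, c, size - 1);
    buf[size - 1] = '\0';
}

/* Hypothetical helper: read one line with fgets, passing the true
   buffer size so that fgets stores at most size - 1 characters plus
   the terminating '\0', all inside buf.
   Returns  0 if a complete line was read (the newline is stripped),
            1 if no newline was seen (the line was truncated, or the
              input ended without a newline),
           -1 on end of file or read error. */
int read_line(FILE *in, char *buf, size_t size)
{
    assert(in != NULL && buf != NULL && size > 0);
    if (fgets(buf, (int)size, in) == NULL)
        return -1;
    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';
        return 0;
    }
    return 1;
}
```

With `char buff[7];`, calling `fill_terminated(buff, sizeof buff, 'a')` leaves six `'a'`s followed by `'\0'` in the seventh element, so `printf("%s\n", buff)` stops inside the array. Calling `read_line(stdin, buff, sizeof buff)` on a long input keeps the first six characters and returns 1, leaving the rest of the line in the stream for a later read.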
