StackOverflow2013

<p>With thanks to those who responded, and having now read the relevant portions of the C99 standard, I have come to agree with the somewhat surprising conclusion that storing an arbitrary non-EOF value returned by <code>fgetc()</code> as type <code>char</code> without loss of fidelity is not guaranteed to be possible. In large part, that arises from the possibility that <code>char</code> cannot represent as many distinct values as <code>unsigned char</code>.</p>
<p>For their part, the stdio functions guarantee that if data are written to a (binary) stream and subsequently read back, then the read-back data will compare equal to the original data. That turns out to have much narrower implications than I at first thought, but it does mean that <code>fputs()</code> must output a distinct value for each distinct <code>char</code> it successfully outputs, and that whatever conversion <code>fgets()</code> applies to store input bytes as type <code>char</code> must accurately reverse the conversion, if any, by which <code>fputs()</code> would produce the input byte as its output. As far as I can tell, however, <code>fputs()</code> and <code>fgets()</code> are permitted to fail on any input they don't like, so it is not certain that <code>fputs()</code> maps every possible <code>char</code> value to an <code>unsigned char</code>.</p>
<p>Moreover, although <code>fputs()</code> and <code>fgets()</code> operate as if by performing sequences of <code>fputc()</code> and <code>fgetc()</code> calls, respectively, it is not specified what conversions they might perform between <code>char</code> values in memory and the underlying <code>unsigned char</code> values on the stream. <strong>If</strong> a platform's <code>fputs()</code> uses standard integer conversion for that purpose, however, then the correct back-conversion is as I proposed:</p>
<pre><code>int c = fgetc(stream);
char buf;

if (c >= 0) buf = (char) ((c > CHAR_MAX) ?
        (c - (UCHAR_MAX + 1)) : c);
</code></pre>
<p>That arises directly from the integer conversion rules, which specify that integer values are converted to unsigned types by adding or subtracting the integer multiple of &lt;target type&gt;_MAX + 1 needed to bring the result into the range of the target type, supported by the constraints on representation of integer types. Its correctness for that purpose does not depend on the specific representation of <code>char</code> values or on whether <code>char</code> is treated as signed or unsigned.</p>
<p>However, if <code>char</code> cannot represent as many distinct values as <code>unsigned char</code>, or if there are <code>char</code> values that <code>fputs()</code> refuses to output (e.g. negative ones), then there are possible values of <code>c</code> that could not have resulted from a <code>char</code> conversion in the first place. No back-conversion argument is applicable to such bytes, and there may not even be a meaningful sense of <code>char</code> values corresponding to them. In any case, whether the given conversion is the correct reverse conversion for data written by <code>fputs()</code> seems to be implementation-defined. It is certainly implementation-defined whether <code>buf = (char) c</code> will have the same effect, though it does on very many systems.</p>
<p>Overall, I am struck by just how many details of C I/O behavior are implementation-defined. That was an eye-opener for me.</p>
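<p>To make the round-trip argument concrete, here is a minimal sketch that packages the back-conversion as a function and checks it against the <code>char</code> to <code>unsigned char</code> conversion for every <code>char</code> value. The names <code>byte_to_char</code> and <code>round_trip_ok</code> are made up for illustration, and the sketch assumes the case discussed above, namely that the writing side used the standard integer conversion and that <code>int</code> is wider than <code>char</code>. It says nothing about bytes that could not have come from a <code>char</code> in the first place.</p>

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: undo the char -> unsigned char integer conversion.
 * Assumes c is a non-EOF fgetc() result, i.e. in the range 0..UCHAR_MAX,
 * and that the byte was produced from a char by standard conversion. */
char byte_to_char(int c)
{
    assert(c >= 0 && c <= UCHAR_MAX);
    /* Values above CHAR_MAX can only have come from negative chars,
     * which were mapped upward by adding UCHAR_MAX + 1. */
    return (char) ((c > CHAR_MAX) ? (c - (UCHAR_MAX + 1)) : c);
}

/* Returns 1 if every char value survives char -> unsigned char -> char. */
int round_trip_ok(void)
{
    for (int v = CHAR_MIN; v <= CHAR_MAX; v++) {
        char original = (char) v;
        int byte = (unsigned char) original;  /* standard integer conversion */
        if (byte_to_char(byte) != original)
            return 0;
    }
    return 1;
}
```

<p>In a read loop this would be used as <code>if (c != EOF) buf = byte_to_char(c);</code>. When <code>char</code> is unsigned, <code>CHAR_MAX</code> equals <code>UCHAR_MAX</code> and the function reduces to a plain cast.</p>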
