<p>Several answers have addressed what you've done wrong and how to fix it, but you also said (emphasis mine):</p>
<blockquote> <p>can somebody explain why, and <strong>why this style of the coding is bad</strong> </p> </blockquote>
<p>I think <code>scanf</code> is a terrible way to read input. It's inconsistent with <code>printf</code>, makes it easy to forget to check for errors, makes it hard to recover from errors, and is incompatible with ordinary (and easier to do correctly) read operations (like <code>fgets</code> and company).</p>
<p>First, note that the <code>"%s"</code> format will read only until it sees whitespace. Why whitespace? Why does <code>"%s"</code> print out an entire string, but read in strings in such a limited capacity?</p>
<p>If you'd like to read in an entire line, as you may often be wont to do, <code>scanf</code> provides for that... with <code>"%[^\n]"</code>. What? What is that? When did this become Perl?</p>
<p>But the real problem is that neither of those is safe. They both freely overflow with no bounds checking. Want bounds checking? Okay, you got it: <code>"%10s"</code> (and <code>"%10[^\n]"</code> is starting to look even worse). That will read at most 10 characters and add a terminating nul character automatically, so the array needs room for 11. So that's good... for when our array size <em>never needs to change</em>.</p>
<p>What if we want to pass the size of our array as an argument to <code>scanf</code>? <code>printf</code> can do this:</p>
<pre><code>char string[] = "Hello, world!";
printf("%.*s\n", (int) sizeof string, string); // prints whole message
printf("%.*s\n", 6, string); // prints just "Hello,"
</code></pre>
<p>Want to do the same thing with <code>scanf</code>?
Here's how:</p>
<pre><code>static char tmp[/*bit twiddling to get the log10 of SIZE_MAX plus a few*/];
// if we did the math right we shouldn't need to use snprintf
snprintf(tmp, sizeof tmp, "%%%zus", bufsize - 1); // width excludes the nul
scanf(tmp, buffer);
</code></pre>
<p>That's right - <code>scanf</code> doesn't support the <code>"%.*s"</code> variable precision <code>printf</code> does, so to do dynamic bounds checking with <code>scanf</code> we have to <em>construct our own format string</em> in a temporary buffer. This is all kinds of bad, and even though it's actually safe here it will look like a really bad idea to anyone just dropping in.</p>
<p>Meanwhile, let's look at another world. Let's look at the world of <code>fgets</code>. Here's how we read in a line of data with <code>fgets</code>:</p>
<pre><code>fgets(buffer, bufsize, stdin);
</code></pre>
<p>Infinitely less headache, no wasted processor time converting an integer precision into a string that will only be reparsed by the library back into an integer, and all the relevant elements are sitting there on <em>one line</em> for us to see how they work together.</p>
<p>Granted, this may not read an entire line. It will only read an entire line if the line (not counting its newline) is shorter than <code>bufsize - 1</code> characters. Here's how we can read an entire line:</p>
<pre><code>#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;string.h&gt;

char *readline(FILE *file)
{
    size_t size = 80; // start off small
    size_t curr = 0;
    char *buffer = malloc(size);
    if(buffer == NULL) return NULL; // out of memory
    while(fgets(buffer + curr, size - curr, file))
    {
        if(strchr(buffer + curr, '\n')) return buffer; // success
        curr = size - 1; // the next read overwrites the nul fgets stored
        size *= 2;
        char *tmp = realloc(buffer, size);
        if(tmp == NULL) { free(buffer); return NULL; } // out of memory
        buffer = tmp;
    }
    free(buffer); // EOF or read error before a newline was seen
    return NULL;
}
</code></pre>
<p>The <code>curr</code> variable is an optimization to prevent us from rechecking data we've already read, and is not strictly necessary (although it gets more useful as we read more data).
We could even use the return value of <code>strchr</code> to strip off the ending <code>'\n'</code> character if you preferred.</p>
<p>Notice also that <code>size_t size = 80;</code> as a starting place is completely arbitrary. We could use 81, or 79, or 100, or add it as a user-supplied argument to the function. We could even add a <code>size_t (*inc)(size_t)</code> argument, and change <code>size *= 2;</code> to <code>size = inc(size);</code>, allowing the user to control how fast the array grows. These can be useful for efficiency, when reallocations get costly and boatloads of lines of data need to be read and processed.</p>
<p>We could write the same with <code>scanf</code>, but think of how many times we'd have to rewrite the format string. We could limit it to a constant increment, instead of the doubling (easily) implemented above, and never have to adjust the format string; we could give in and just store the number, do the math as above, and use <code>snprintf</code> to convert it to a format string <em>every time we reallocate</em> so that <code>scanf</code> can convert it back to the same number; we could limit our growth and starting position in such a way that we can manually adjust the format string (say, just increment the digits), but this could get hairy after a while and may require recursion (!) to work cleanly.</p>
<p>Furthermore, it's hard to mix reading with <code>scanf</code> with reading with other functions. Why? Say you want to read an integer from a line, then read a string from the next line. You try this:</p>
<pre><code>int i;
char buf[BUFSIZE];
scanf("%i", &amp;i);
fgets(buf, BUFSIZE, stdin);
</code></pre>
<p>Given a "2" on the first line, that will read the "2", but then <code>fgets</code> will read an empty line because <code>scanf</code> didn't read the newline! Okay, take two:</p>
<pre><code>...
scanf("%i\n", &amp;i);
...
</code></pre>
<p>You think this eats up the newline, and it does - but it also eats up leading whitespace on the next line, because <code>scanf</code> can't tell the difference between newlines and other forms of whitespace. (Also, turns out you're writing a Python parser, and leading whitespace in lines is important.) To make this work, you have to call <code>getchar</code> or something to read in the newline and throw it away:</p>
<pre><code>...
scanf("%i", &amp;i);
getchar();
...
</code></pre>
<p>Isn't that silly? What happens if you use <code>scanf</code> in a function, but don't call <code>getchar</code> because you don't know whether the next read is going to be <code>scanf</code> or something saner (or whether or not the next character is even going to be a newline)? Suddenly the best way to handle the situation seems to be to pick one or the other: do we use <code>scanf</code> exclusively and never have access to <code>fgets</code>-style full-control input, or do we use <code>fgets</code> exclusively and make it harder to perform complex parsing?</p>
<p>Actually, the answer is <em>we don't</em>. We use <code>fgets</code> (or non-<code>scanf</code> functions) exclusively, and when we need <code>scanf</code>-like functionality, <em>we just call <code>sscanf</code> on the strings!</em> We don't need to have <code>scanf</code> mucking up our file streams unnecessarily! We can have all the precise control over our input we want and <em>still</em> get all the functionality of <code>scanf</code> formatting. And even if we couldn't, many <code>scanf</code> format options have near-direct corresponding functions in the standard library, like the infinitely more flexible <code>strtol</code> and <code>strtod</code> functions (and friends).
Plus, <code>i = strtoumax(str, NULL, 10)</code> for C99 sized integer types is a lot cleaner looking than <code>scanf("%" SCNuMAX, &amp;i);</code>, and a lot safer (we can use that <code>strtoumax</code> line unchanged for smaller types and let the implicit conversion handle the extra bits, but with <code>scanf</code> we have to make a temporary <code>uintmax_t</code> to read into).</p>
<p>The moral of this story: avoid <code>scanf</code>. If you need the formatting it provides, and don't want to (or can't) do it (more efficiently) yourself, use <code>fgets</code> / <code>sscanf</code>.</p>
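<p>As a rough sketch of that advice, here is one way the "integer on one line, string on the next" case might look with <code>fgets</code> and <code>strtol</code>. The helper name <code>read_int_line</code> and the 64-byte line buffer are assumptions made for illustration, not part of the answer above, and the sketch assumes the number's line fits in that buffer.</p>

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read one whole line from `in` with fgets, then parse
 * it as a decimal int with strtol. Because fgets consumes the newline, the
 * next read starts cleanly at the beginning of the following line.
 * Returns 1 on success, 0 on EOF, read error, or malformed input. */
static int read_int_line(FILE *in, int *out)
{
    char line[64];   /* assumption: the number's line fits in 63 characters */
    char *end;
    long value;

    assert(in != NULL && out != NULL);

    if (fgets(line, sizeof line, in) == NULL)
        return 0;    /* EOF or read error */

    errno = 0;
    value = strtol(line, &end, 10);
    if (end == line)
        return 0;    /* no digits were found */
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return 0;    /* does not fit in an int */
    if (*end != '\n' && *end != '\0')
        return 0;    /* unexpected characters after the number */

    *out = (int)value;
    return 1;
}
```

<p>With this, <code>read_int_line(stdin, &amp;i)</code> followed by <code>fgets(buf, BUFSIZE, stdin)</code> reads the integer and then the next line with its leading whitespace intact, and no <code>getchar</code> call is needed in between.</p>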