<p>I'm still not sure why you would want to use <code>scanf()</code> in <code>main()</code>. It would presumably mean changing the interface of <code>stemfile()</code> (including the name, since it would no longer be processing a file) to take a character string as input. And <code>scanf()</code> is going to make life difficult; it will read strings separated by blanks, which may be part of its attraction, but it will include any punctuation that is part of the 'word'.</p>
<p>As Randall noted, the code in the existing function is a little obscure; I think it could be written more simply as follows:</p>
<pre><code>#include &lt;stdio.h&gt;
#include &lt;ctype.h&gt;

#define LETTER(x) isalpha(x)

extern int stem(char *s, int lo, int hi);

static void stemfile(FILE * f)
{
    int ch;
    while ((ch = getc(f)) != EOF)
    {
        if (LETTER(ch))
        {
            char s[1024];
            int i = 0;
            s[i++] = ch;
            while ((ch = getc(f)) != EOF &amp;&amp; LETTER(ch))
                s[i++] = ch;
            if (ch != EOF)
                ungetc(ch, f);
            s[i] = '\0';
            s[stem(s, 0, i-1)+1] = 0;
            /* the previous line calls the stemmer and uses its result to
               zero-terminate the string in s */
            printf("%s", s);
        }
        else
            putchar(ch);
    }
}
</code></pre>
<p>I've slightly simplified things by making <code>s</code> into a simple local variable (it appears to have been a global, as does <code>imax</code>), removing <code>imax</code> and the <code>increase_s()</code> function.
Those are largely incidental to the operation of the function.</p>
<p>If you want this to process a (null-terminated) string instead, then:</p>
<pre><code>static void stemstring(const char *src)
{
    char ch;
    while ((ch = *src++) != '\0')
    {
        if (LETTER(ch))
        {
            int i = 0;
            char s[1024];
            s[i++] = ch;
            while ((ch = *src++) != '\0' &amp;&amp; LETTER(ch))
                s[i++] = ch;
            if (ch != '\0')
                src--;
            s[i-1] = '\0';
            s[stem(s,0,i-1)+1] = 0;
            /* the previous line calls the stemmer and uses its result to
               zero-terminate the string in s */
            printf("%s",s);
        }
        else
            putchar(ch);
    }
}
</code></pre>
<p>This systematically changes <code>getc(f)</code> into <code>*src++</code>, <code>EOF</code> into <code>\0</code>, and <code>ungetc()</code> into <code>src--</code>. It also (safely) changes the type of <code>ch</code> from <code>int</code> (necessary for I/O) to <code>char</code>. If you are worried about buffer overflow, you have to work a bit harder in the function, but few words in practice will be even 1024 bytes (and you could use 4096 as easily as 1024, with correspondingly smaller - infinitesimal - chance of real data overflowing the buffer). You need to judge whether that is a 'real' risk for you.</p>
<p>The main program can become quite simply:</p>
<pre><code>int main(void)
{
    char string[1024];
    while (scanf("%1023s", string) == 1)
        stemstring(string);
    return(0);
}
</code></pre>
<p>Clearly, because of the '1023' in the format, this will never overflow the inner buffer. (NB: Removed the <code>.</code> from <s><code>"%.1023s"</code></s> in first version of this answer; <code>scanf()</code> is not the same as <code>printf()</code>!).</p>
<hr>
<p>Challenged: does this work?</p>
<p>Yes - the code below (adding a dummy <code>stem()</code> function and slightly modifying the printing) works reasonably well for me:</p>
<pre><code>#include &lt;stdio.h&gt;
#include &lt;ctype.h&gt;
#include &lt;assert.h&gt;

#define LETTER(x) isalpha(x)
#define MAX(x, y) (((x) &gt; (y)) ? (x) : (y))

static int stem(const char *s, int begin, int end)
{
    assert(s != 0);
    return MAX(end - begin - 3, 3);
}

static void stemstring(const char *src)
{
    char ch;
    while ((ch = *src++) != '\0')
    {
        if (LETTER(ch))
        {
            int i = 0;
            char s[1024];
            s[i++] = ch;
            while ((ch = *src++) != '\0' &amp;&amp; LETTER(ch))
                s[i++] = ch;
            if (ch != '\0')
                src--;
            s[i-1] = '\0';
            s[stem(s,0,i-1)+1] = 0;
            /* the previous line calls the stemmer and uses its result to
               zero-terminate the string in s */
            printf("&lt;&lt;%s&gt;&gt;\n",s);
        }
        else
            putchar(ch);
    }
    putchar('\n');
}

int main(void)
{
    char string[1024];
    while (scanf("%1023s", string) == 1)
        stemstring(string);
    return(0);
}
</code></pre>
<h3>Example dialogue</h3>
<pre><code>H: assda23
C: &lt;&lt;assd&gt;&gt;
C: 23
H: 3423///asdrrrf12312
C: 3423///&lt;&lt;asdr&gt;&gt;
C: 12312
H: 12//as//12
C: 12//&lt;&lt;a&gt;&gt;
C: //12
</code></pre>
<p>The lines marked <code>H:</code> are human input (the <code>H:</code> was not part of the input); the lines marked <code>C:</code> are computer output.</p>
<hr>
<h3>Next attempt</h3>
<p>The trouble with concentrating on grotesquely overlong words (1023 characters and more) is that you can overlook the simple. With <code>scanf()</code> reading data, you automatically get single 'words' with no spaces in them as input. Here's a debugged version of <code>stemstring()</code> with debugging printing code in place. The problem was two off-by-one errors. One was in the assignment <code>s[i-1] = '\0';</code> where the <code>-1</code> was not needed. The other was in the handling of the end of a string of letters; the <code>while ((ch = *src++) != '\0')</code> left <code>src</code> one place too far, which led to interesting effects with short words entered after long words (when the difference in length was 2 or more). There's a fairly detailed trace of the test case I devised, using words such as 'great' and 'book' which you diagnosed (correctly) as being mishandled. The <code>stem()</code> function here simply prints its inputs and outputs, and returns the full length of the string (so there is no stemming occurring).</p>
<pre><code>#include &lt;stdio.h&gt;
#include &lt;ctype.h&gt;
#include &lt;assert.h&gt;

#define LETTER(x) isalpha(x)
#define MAX(x, y) (((x) &gt; (y)) ? (x) : (y))

static int stem(const char *s, int begin, int end)
{
    int len = end - begin + 1;
    assert(s != 0);
    printf("ST (%d,%d) &lt;&lt;%*.*s&gt;&gt; RV %d\n", begin, end, len, len, s, len);
    // return MAX(end - begin - 3, 3);
    return len;
}

static void stemstring(const char *src)
{
    char ch;
    printf("--&gt;&gt; stemstring: &lt;&lt;%s&gt;&gt;\n", src);
    while ((ch = *src++) != '\0')
    {
        if (ch != '\0')
            printf("LP &lt;&lt;%c%s&gt;&gt;\n", ch, src);
        if (LETTER(ch))
        {
            int i = 0;
            char s[1024];
            s[i++] = ch;
            while ((ch = *src++) != '\0' &amp;&amp; LETTER(ch))
                s[i++] = ch;
            src--;
            s[i] = '\0';
            printf("RD (%d) &lt;&lt;%s&gt;&gt;\n", i, s);
            s[stem(s, 0, i-1)+1] = '\0';
            /* the previous line calls the stemmer and uses its result to
               zero-terminate the string in s */
            printf("RS &lt;&lt;%s&gt;&gt;\n", s);
        }
        else
            printf("NL &lt;&lt;%c&gt;&gt;\n", ch);
    }
    //putchar('\n');
    printf("&lt;&lt;-- stemstring\n");
}

int main(void)
{
    char string[1024];
    while (scanf("%1023s", string) == 1)
        stemstring(string);
    return(0);
}
</code></pre>
<p>The debug-laden output is shown (the first line is the typed input; the rest is the output from the program):</p>
<pre><code>what a great book this is! What.hast.thou.done?
--&gt;&gt; stemstring: &lt;&lt;what&gt;&gt;
LP &lt;&lt;what&gt;&gt;
RD (4) &lt;&lt;what&gt;&gt;
ST (0,3) &lt;&lt;what&gt;&gt; RV 4
RS &lt;&lt;what&gt;&gt;
&lt;&lt;-- stemstring
--&gt;&gt; stemstring: &lt;&lt;a&gt;&gt;
LP &lt;&lt;a&gt;&gt;
RD (1) &lt;&lt;a&gt;&gt;
ST (0,0) &lt;&lt;a&gt;&gt; RV 1
RS &lt;&lt;a&gt;&gt;
&lt;&lt;-- stemstring
--&gt;&gt; stemstring: &lt;&lt;great&gt;&gt;
LP &lt;&lt;great&gt;&gt;
RD (5) &lt;&lt;great&gt;&gt;
ST (0,4) &lt;&lt;great&gt;&gt; RV 5
RS &lt;&lt;great&gt;&gt;
&lt;&lt;-- stemstring
--&gt;&gt; stemstring: &lt;&lt;book&gt;&gt;
LP &lt;&lt;book&gt;&gt;
RD (4) &lt;&lt;book&gt;&gt;
ST (0,3) &lt;&lt;book&gt;&gt; RV 4
RS &lt;&lt;book&gt;&gt;
&lt;&lt;-- stemstring
--&gt;&gt; stemstring: &lt;&lt;this&gt;&gt;
LP &lt;&lt;this&gt;&gt;
RD (4) &lt;&lt;this&gt;&gt;
ST (0,3) &lt;&lt;this&gt;&gt; RV 4
RS &lt;&lt;this&gt;&gt;
&lt;&lt;-- stemstring
--&gt;&gt; stemstring: &lt;&lt;is!&gt;&gt;
LP &lt;&lt;is!&gt;&gt;
RD (2) &lt;&lt;is&gt;&gt;
ST (0,1) &lt;&lt;is&gt;&gt; RV 2
RS &lt;&lt;is&gt;&gt;
LP &lt;&lt;!&gt;&gt;
NL &lt;&lt;!&gt;&gt;
&lt;&lt;-- stemstring
--&gt;&gt; stemstring: &lt;&lt;What.hast.thou.done?&gt;&gt;
LP &lt;&lt;What.hast.thou.done?&gt;&gt;
RD (4) &lt;&lt;What&gt;&gt;
ST (0,3) &lt;&lt;What&gt;&gt; RV 4
RS &lt;&lt;What&gt;&gt;
LP &lt;&lt;.hast.thou.done?&gt;&gt;
NL &lt;&lt;.&gt;&gt;
LP &lt;&lt;hast.thou.done?&gt;&gt;
RD (4) &lt;&lt;hast&gt;&gt;
ST (0,3) &lt;&lt;hast&gt;&gt; RV 4
RS &lt;&lt;hast&gt;&gt;
LP &lt;&lt;.thou.done?&gt;&gt;
NL &lt;&lt;.&gt;&gt;
LP &lt;&lt;thou.done?&gt;&gt;
RD (4) &lt;&lt;thou&gt;&gt;
ST (0,3) &lt;&lt;thou&gt;&gt; RV 4
RS &lt;&lt;thou&gt;&gt;
LP &lt;&lt;.done?&gt;&gt;
NL &lt;&lt;.&gt;&gt;
LP &lt;&lt;done?&gt;&gt;
RD (4) &lt;&lt;done&gt;&gt;
ST (0,3) &lt;&lt;done&gt;&gt; RV 4
RS &lt;&lt;done&gt;&gt;
LP &lt;&lt;?&gt;&gt;
NL &lt;&lt;?&gt;&gt;
&lt;&lt;-- stemstring
</code></pre>
<p>The technique shown - printing diagnostic information at key points in the program - is one way of debugging a program such as this.
The alternative is stepping through the code with a source code debugger - <code>gdb</code> or its equivalent. I probably use print statements more often, but I'm an old fogey who finds IDEs too hard to use (because they don't behave like the command line I'm used to).</p>
<p>Granted, it isn't your code any more, but I do think you should have been able to do most of the debugging yourself. I'm grateful that you reported the trouble with my code. However, you also need to learn how to diagnose problems in other people's code; how to instrument it; how to characterize and locate the problems. You could then report the problem with precision - "you goofed with your end of word condition, and ...".</p>
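<p>The answer remarks that guarding against buffer overflow means working a bit harder in the function. One possible way to do that is sketched below. The helper <code>read_word()</code> is hypothetical (it is not part of the original code); it copies a run of letters into a caller-supplied buffer, drops any letters that do not fit, and leaves the source pointer on the first non-letter. It also casts to <code>unsigned char</code> before calling <code>isalpha()</code>, since passing a negative <code>char</code> value is undefined.</p>

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper (an assumption, not from the original answer).
 * Copies the run of letters starting at *psrc into buf, storing at most
 * bufsz-1 characters and null-terminating whenever bufsz > 0.
 * Letters that do not fit are skipped rather than stored.
 * On return, *psrc points at the first non-letter (or the final '\0').
 * Returns the number of characters stored in buf.
 */
static size_t read_word(const char **psrc, char *buf, size_t bufsz)
{
    const char *src;
    size_t i = 0;

    assert(psrc != NULL && *psrc != NULL && buf != NULL);
    src = *psrc;

    /* isalpha() needs a value representable as unsigned char (or EOF) */
    while (*src != '\0' && isalpha((unsigned char)*src))
    {
        if (i + 1 < bufsz)      /* keep room for the terminating null */
            buf[i++] = *src;
        src++;                  /* consume the letter even if not stored */
    }
    if (bufsz > 0)
        buf[i] = '\0';
    *psrc = src;                /* no src-- fix-up needed afterwards */
    return i;
}
```

<p>Inside <code>stemstring()</code>, a call such as <code>i = read_word(&amp;src, s, sizeof(s));</code> could stand in for the inner copying loop, after which <code>stem(s, 0, i-1)</code> would be called as before. Whether silently truncating an overlong word is acceptable is a judgement call; reporting an error is another option.</p>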