<p>Your code is slightly wrong, because it counts sub-words as occurrences of words. For example, <code>"This is a bad test".count('a')</code> gives 2 rather than 1.</p>
<p>Splitting words correctly is a bit tricky, but here's a simple start that breaks words at punctuation or whitespace.</p>
<pre><code>import re

def input_files(f, word):
    # Split each line at whitespace (including the trailing newline) and
    # common punctuation, then count exact matches of the word.
    print(sum(re.split(r'[\s.,;:"]', line).count(word) for line in f))
</code></pre>
<p>A good alternative would be to use a regular expression to find occurrences of the word (although I think this makes things slightly harder).</p>
<p>This code has a few other improvements over your version. If you use a file object as an iterator, you get lines without an explicit <code>readlines()</code> call (this avoids reading the entire file into RAM and representing it as a list). And when you write <code>if word in line: s += line.count(word)</code>, you're actually making things slower than if you just write <code>s += line.count(word)</code>, since it scans the line twice whenever the word is present rather than just once.</p>
<p>I've also passed the word you're scanning for into the function, because it makes the code more obvious (and you could even write unit tests for this version).</p>
<p>To continue: rather than printing the word count, you probably want to return it (since you want to find the files with the greatest word count). Then you can count the occurrences of the given word per file, and sort the results.</p>
<p>Here's a solution that uses command-line arguments and doesn't do any error checking. Usage: [program] word file1 file2...</p>
<pre><code>import re
import sys

def words_in_file(filename, word):
    # Iterate over the file lazily, one line at a time.
    with open(filename, 'r') as f:
        return sum(re.split(r'[\s.,;:"]', line).count(word) for line in f)

def files_by_wordcount(filenames, word):
    # Put the count first in each tuple so sorting orders by count.
    counts = [(words_in_file(filename, word), filename) for filename in filenames]
    return sorted(counts, reverse=True)

if __name__ == '__main__':
    for count, filename in files_by_wordcount(sys.argv[2:], sys.argv[1]):
        print(filename, count)
</code></pre>
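<p>For comparison, here is a minimal sketch of the regular-expression alternative mentioned above. The helper name <code>count_word</code> is made up for illustration, and it assumes the search word consists of ordinary word characters (letters, digits, underscore), because <code>\b</code> only marks boundaries between word and non-word characters.</p>

```python
import re

def count_word(lines, word):
    """Count whole-word occurrences of `word` across an iterable of lines."""
    # \b matches a word boundary, so 'a' is not counted inside 'bad'.
    # re.escape guards against regex metacharacters in the search word.
    pattern = re.compile(r'\b' + re.escape(word) + r'\b')
    return sum(len(pattern.findall(line)) for line in lines)

print("This is a bad test".count('a'))          # substring count: 2
print(count_word(["This is a bad test"], 'a'))  # whole-word count: 1
```

<p>Because it takes any iterable of lines, you can pass it an open file object directly, or a plain list of strings in a unit test.</p>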