Python, loop through lines in a file; if line equals line in another file, return original line
<p>Text file 1 has the following format:</p>
<pre><code>'WORD': 1
'MULTIPLE WORDS': 1
'WORD': 2
</code></pre>
<p>etc.</p>
<p>I.e., a word followed by a colon and a number.</p>
<p>Text file 2 has the following format:</p>
<pre><code>'WORD'
'WORD'
</code></pre>
<p>etc.</p>
<p>I need to extract single words (i.e., only WORD, not MULTIPLE WORDS) from File 1 and, if they match a word in File 2, return the word from File 1 along with its value.</p>
<p>I have some poorly functioning code:</p>
<pre><code>import re

def GetCounts(file1, file2):
    target_contents = open(file1).readlines()  # file 1 as list --&gt; 'WORD': n
    match_me_contents = open(file2).readlines()  # file 2 as list --&gt; 'WORD'
    ls_stripped = [x.strip('\n') for x in match_me_contents]  # get rid of newlines
    match_me_as_regex = re.compile("|".join(ls_stripped))
    for line in target_contents:
        first_column = line.split(':')[0]  # get the first item in line.split
        number = line.split(':')[1]  # get the number associated with the word
        if len(first_column.split()) == 1:  # get single word, no multiple words
            """ Does the word from target contents match the word from
            match_me contents? If so, return the line from target_contents"""
            if re.findall(match_me_as_regex, first_column):
                print first_column, number

# OUTPUT:
# WORD, n
# WORD, n
# etc.
</code></pre>
<p>Because of the use of regex, the output is shoddy. The code will return 'asset, 2', for example, since re.findall() will match 'set' from match_me. I need to match the target word against the entire word from match_me to block the bad output resulting from partial regex matches.</p>
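<p>One possible way to get whole-word matching is to drop the regex and test membership in a set, so that 'asset' can never match 'set'. The sketch below is only an illustration under stated assumptions: it is written for Python 3 (the code above uses the Python 2 print statement), it takes lists of lines instead of file names so it can run without files on disk, the name get_counts and the sample data are hypothetical, and it assumes both files write each word identically (same quoting, same case).</p>

```python
def get_counts(target_lines, match_lines):
    """Return (word, number) pairs for single-word entries in target_lines
    whose word appears, in full, among match_lines."""
    # Exact membership test: 'asset' is not in {'set'}, unlike a regex search.
    wanted = {line.strip() for line in match_lines if line.strip()}
    results = []
    for line in target_lines:
        # rpartition splits on the last colon, so the number is always the tail.
        word, sep, number = line.rpartition(':')
        if not sep:
            continue  # skip lines that have no colon
        word = word.strip()
        # Keep only single words that match an entire entry from match_lines.
        if len(word.split()) == 1 and word in wanted:
            results.append((word, number.strip()))
    return results


# Hypothetical sample data in the formats described above.
file1_lines = ["'asset': 2", "'set': 5", "'multiple words': 1", "'word': 3"]
file2_lines = ["'set'", "'word'"]

matches = get_counts(file1_lines, file2_lines)
for word, number in matches:
    print(word, number)
```

<p>With real files, the two lists could come from open(file1).readlines() and open(file2).readlines() as in the original function. If a regex is still preferred, anchoring the pattern (for example with re.fullmatch and re.escape on each word) should have a similar effect, but the set lookup is simpler and avoids problems with special characters in the words.</p>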
 
