Levenshtein distance of two words from a text file with Python
<p>I have a small 30-line text file with two similar words on each line. I need to calculate the <a href="http://en.wikipedia.org/wiki/Levenshtein_distance" rel="nofollow">Levenshtein distance</a> between the two words on each line. I also need to use a <a href="http://en.wikipedia.org/wiki/Memoization" rel="nofollow">memoize</a> function while calculating the distance. I am pretty new to Python and algorithms in general, so this is proving to be quite difficult for me. I have the file open and being read, but I cannot figure out how to assign each of the two words to variables <code>a</code> &amp; <code>b</code> to calculate the distance.</p>
<p>Here is my current script, which so far ONLY prints the file:</p>
<pre><code>txt_file = open('wordfile.txt', 'r')

def memoize(f):
    cache = {}
    def wrapper(*args, **kwargs):
        try:
            return cache[args]
        except KeyError:
            result = f(*args, **kwargs)
            cache[args] = result
            return result
    return wrapper

@memoize
def lev(a,b):
    if len(a) &gt; len(b):
        a,b = b,a
        b,a = a,b
    current = range(a+1)
    for i in range(1,b+1):
        previous, current = current, [i]+[0]*n
        for j in range(1,a+1):
            add, delete = previous[j]+1, current[j-1]+1
            change = previous[j-1]
            if a[j-1] != b[i-1]:
                change = change + 1
            current[j] = min(add, delete, change)
    return current[b]

if __name__=="__main__":
    with txt_file as f:
        for line in f:
            print line
</code></pre>
<p>Here are a few words from the text file so you all get an idea:</p>
<p>archtypes, archetypes</p>
<p>propietary, proprietary</p>
<p>recogize, recognize</p>
<p>exludes, excludes</p>
<p>tornadoe, tornado</p>
<p>happenned, happened</p>
<p>vacinity, vicinity</p>
<p><strong>HERE IS AN UPDATED VERSION OF THE SCRIPT, STILL NOT FUNCTIONAL BUT BETTER</strong>:</p>
<pre><code>class memoize:
    def __init__(self, function):
        self.function = function
        self.memoized = {}

    def __call__(self, *args):
        try:
            return self.memoized[args]
        except KeyError:
            self.memoized[args] = self.function(*args)
            return self.memoized[args]

@memoize
def lev(a,b):
    n, m = len(a), len(b)
    if n &gt; m:
        a, b = b, a
        n, m = m, n
    current = range(n + 1)
    for i in range(1, m + 1):
        previous, current = current, [i] + [0] * n
        for j in range(1, n + 1):
            add, delete = previous[j] + 1, current[j - 1] + 1
            change = previous[j - 1]
            if a[j - 1] != b[i - 1]:
                change = change + 1
            current[j] = min(add, delete, change)
    return current[n]

if __name__=="__main__":
    for pair in open("wordfile.txt", "r"):
        a,b = pair.split()
        lev(a, b)
</code></pre>
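<p>For reference, here is a minimal Python 3 sketch of the two pieces the question is about: splitting each line into <code>a</code> and <code>b</code>, and memoizing the distance function. It assumes each line looks like the samples above (two words separated by a comma). The helper names <code>parse_pair</code> and <code>distances</code> are hypothetical, and the recursive <code>lev</code> is a swapped-in formulation chosen because memoization actually pays off there; it is not the row-by-row version from the scripts above.</p>

```python
def memoize(f):
    """Cache results of f keyed by its positional arguments."""
    cache = {}

    def wrapper(*args):
        if args not in cache:
            cache[args] = f(*args)
        return cache[args]

    return wrapper


@memoize
def lev(a, b):
    """Levenshtein distance between strings a and b (recursive form)."""
    if not a:
        return len(b)  # insert every remaining character of b
    if not b:
        return len(a)  # delete every remaining character of a
    cost = 0 if a[-1] == b[-1] else 1
    return min(
        lev(a[:-1], b) + 1,          # deletion
        lev(a, b[:-1]) + 1,          # insertion
        lev(a[:-1], b[:-1]) + cost,  # substitution (or match)
    )


def parse_pair(line):
    """Split a line such as 'archtypes, archetypes' into two clean words."""
    a, b = (word.strip() for word in line.split(","))
    return a, b


def distances(lines):
    """Yield (a, b, distance) for every non-blank line."""
    for line in lines:
        if line.strip():
            a, b = parse_pair(line)
            yield a, b, lev(a, b)


if __name__ == "__main__":
    # Sample lines stand in for the file; with the real file, pass the
    # open file object instead, e.g. distances(open("wordfile.txt")).
    sample = ["archtypes, archetypes\n", "tornadoe, tornado\n", "vacinity, vicinity\n"]
    for a, b, d in distances(sample):
        print(a, b, d)
```

<p>Splitting on the comma and then stripping whitespace keeps the trailing newline and the comma itself out of the words, which a bare <code>pair.split()</code> would not do for lines in this format.</p>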
 
