Implementation of [Jeroen Coupé](https://stackoverflow.com/users/840372/jeroen-coupe)'s idea from [his answer](https://stackoverflow.com/a/8924228/1052325), based on letter counts:

```python
from collections import defaultdict, Counter


def find_longest(origin, known_words):
    return next(iter_longest(origin, known_words))


def iter_longest(origin, known_words, min_length=1):
    origin_map = Counter(origin)
    # Try the longest candidate lengths first.
    for i in range(len(origin) + 1, min_length - 1, -1):
        for word in known_words[i]:
            if check_same_letters(origin_map, word):
                yield word


def check_same_letters(origin_map, word):
    # A word fits if it needs no letter more often than origin provides it.
    new_map = Counter(word)
    return all(new_map[let] <= origin_map[let] for let in word)


def load_words_from(file_path):
    # Group dictionary words by length for fast lookup.
    known_words = defaultdict(list)
    with open(file_path) as f:
        for line in f:
            word = line.strip()
            known_words[len(word)].append(word)
    return known_words


if __name__ == '__main__':
    known_words = load_words_from('words_list.txt')
    origin = 'raepkwaen'
    big_origin = 'raepkwaenaqwertyuiopasdfghjklzxcvbnmqwertyuiopasdfghjklzxcvbnmqwertyuiopasdfghjklzxcvbnmqwertyuiopasdfghjklzxcvbnm'
    print(find_longest(big_origin, known_words))
    print(list(iter_longest(origin, known_words, 5)))
```

Output (for my small 58,000-word dictionary):

```
counterrevolutionaries
['reawaken', 'awaken', 'enwrap', 'weaken', 'weaker', 'apnea', 'arena', 'awake', 'aware', 'newer', 'paean', 'parka', 'pekan', 'prank', 'prawn', 'preen', 'renew', 'waken', 'wreak']
```

Notes:

- It's a simple implementation without optimizations (a possible one is sketched at the end of this answer).
- `words_list.txt` can be `/usr/share/dict/words` on Linux.

**UPDATE**

If we only need to find the word once, and we have a dictionary with the words sorted by length, e.g. produced by this script:

```python
with open('words_list.txt') as f:
    words = f.readlines()

with open('words_by_len.txt', 'w') as f:
    for word in sorted(words, key=lambda w: len(w), reverse=True):
        f.write(word)
```

then we can find the longest word without loading the full dictionary into memory:

```python
from collections import Counter
import sys


def check_same_letters(origin_map, word):
    new_map = Counter(word)
    return all(new_map[let] <= origin_map[let] for let in word)


def iter_longest_from_file(origin, file_path, min_length=1):
    origin_map = Counter(origin)
    origin_len = len(origin)
    with open(file_path) as f:
        for line in f:
            word = line.strip()
            if len(word) > origin_len:
                continue
            if len(word) < min_length:
                # Words are sorted by length, so nothing useful can follow.
                return
            if check_same_letters(origin_map, word):
                yield word


def find_longest_from_file(origin, file_path):
    return next(iter_longest_from_file(origin, file_path))


if __name__ == '__main__':
    origin = sys.argv[1] if len(sys.argv) > 1 else 'abcdefghijklmnopqrstuvwxyz'
    print(find_longest_from_file(origin, 'words_by_len.txt'))
```
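To make the core idea concrete, here is a minimal, standalone demonstration of the `Counter`-based letter test used above; the example words are illustrative only:

```python
from collections import Counter


def check_same_letters(origin_map, word):
    # The word fits if it needs no letter more often than the origin provides it.
    new_map = Counter(word)
    return all(new_map[let] <= origin_map[let] for let in word)


origin_map = Counter('raepkwaen')
print(check_same_letters(origin_map, 'reawaken'))  # True: every letter count fits
print(check_same_letters(origin_map, 'parkway'))   # False: needs a 'y' that is not available
```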
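One possible optimization, sketched here only and not part of the original answer (the helper names are just illustrative): precompute a `Counter` for every dictionary word at load time, so each query only compares counts instead of rebuilding a counter per word. It assumes the same `words_list.txt` file as above:

```python
from collections import Counter, defaultdict


def load_words_with_counts(file_path):
    # Store (word, Counter) pairs grouped by word length, so counters are built once.
    known = defaultdict(list)
    with open(file_path) as f:
        for line in f:
            word = line.strip()
            if word:
                known[len(word)].append((word, Counter(word)))
    return known


def iter_longest_precounted(origin, known, min_length=1):
    # Same search order as iter_longest above: longest candidate lengths first.
    origin_map = Counter(origin)
    for length in range(len(origin), min_length - 1, -1):
        for word, word_map in known[length]:
            if all(word_map[let] <= origin_map[let] for let in word_map):
                yield word


if __name__ == '__main__':
    # Hypothetical usage; None is returned when nothing matches.
    known = load_words_with_counts('words_list.txt')
    print(next(iter_longest_precounted('raepkwaen', known), None))
```

This trades some extra memory for skipping the repeated `Counter(word)` construction on every query.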