Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    text
    copied!<p>I guess the <a href="http://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance" rel="nofollow noreferrer">Levenshtein distance</a> is more useful here than the <a href="http://en.wikipedia.org/wiki/Hamming_distance" rel="nofollow noreferrer">Hamming distance</a>.</p> <p>Let's take an example: We take the word <code>example</code> and restrict ourselves to a Levenshtein distance of&nbsp;1. Then we can enumerate all possible misspellings that exist:</p> <ul> <li>1 insertion (208) <ul> <li>aexample</li> <li>bexample</li> <li>cexample</li> <li>...</li> <li>examplex</li> <li>exampley</li> <li>examplez</li> </ul></li> <li>1 deletion (7) <ul> <li>xample</li> <li>eample</li> <li>exmple</li> <li>...</li> <li>exampl</li> </ul></li> <li>1 substitution (182) <ul> <li>axample</li> <li>bxample</li> <li>cxample</li> <li>...</li> <li>examplz</li> </ul></li> </ul> <p>You could store each misspelling in the database, and link that to the correct spelling, <code>example</code>. That works and would be quite fast, but creates a huge database.</p> <p>Notice how most misspellings occur by doing the same operation with a different character:</p> <ul> <li>1 insertion (8) <ul> <li>?example</li> <li>e?xample</li> <li>ex?ample</li> <li>exa?mple</li> <li>exam?ple</li> <li>examp?le</li> <li>exampl?e</li> <li>example?</li> </ul></li> <li>1 deletion (7) <ul> <li>xample</li> <li>eample</li> <li>exmple</li> <li>exaple</li> <li>examle</li> <li>exampe</li> <li>exampl</li> </ul></li> <li>1 substitution (7) <ul> <li>?xample</li> <li>e?ample</li> <li>ex?mple</li> <li>exa?ple</li> <li>exam?le</li> <li>examp?e</li> <li>exampl?</li> </ul></li> </ul> <p>That looks quite manageable. You could generate all these "hints" for each word and store them in the database. When the user enters a word, generate all "hints" from that and query the database.</p> <p>Example: User enters <code>exaple</code> (notice missing <code>m</code>).</p> <pre><code>SELECT DISTINCT word FROM dictionary WHERE hint = '?exaple' OR hint = 'e?xaple' OR hint = 'ex?aple' OR hint = 'exa?ple' OR hint = 'exap?le' OR hint = 'exapl?e' OR hint = 'exaple?' OR hint = 'xaple' OR hint = 'eaple' OR hint = 'exple' OR hint = 'exale' OR hint = 'exape' OR hint = 'exapl' OR hint = '?xaple' OR hint = 'e?aple' OR hint = 'ex?ple' OR hint = 'exa?le' OR hint = 'exap?e' OR hint = 'exapl?' </code></pre> <p><code>exaple</code> with 1 insertion == <code>exa?ple</code> == <code>example</code> with 1 substitution</p> <p>See also: <a href="https://stackoverflow.com/questions/307291/how-does-the-google-did-you-mean-algorithm-work">How does the Google “Did you mean?” Algorithm work?</a></p>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload