There are several ways to make your computation faster:

- Instead of plain dynamic programming, you can use A* search, with the heuristic that a partial alignment up to (i, j) must necessarily contain |i - j| deletions or insertions.
- If you're comparing one sequence against a whole bunch of others, you can save time by computing the dynamic programming table (or the A* search state) for a shared prefix once and reusing that part of the computation. If you stick with the dynamic programming table, you could sort the library of strings in lexicographic order and then only throw away the part that changes (e.g. if you have 'banana' and want to compare it with 'panama' and 'panamericanism', you can reuse the part of the table that covers 'panam').
- If most of the strings are quite similar, you can save time by finding a common prefix and excluding it from the computation (e.g. the LCS of "panama" and "panamericanism" is the common prefix "panam" plus the LCS of "a" and "ericanism").
- If the strings are quite dissimilar, you can use symbol counts to get a lower bound on the number of edits (e.g. turning "AAAB" into "ABBB" needs at least 2 edits, because one string has 3 As and the other only 1). This can also feed the heuristic of an A* search; see the sketch after this list.

EDIT: for the comparing-to-the-same-set-of-strings case, one person suggested a BK-tree data structure in [Efficient way of calculating likeness scores of strings when sample size is large?](https://stackoverflow.com/questions/1609742/efficient-way-of-calculating-likeness-scores-of-strings-when-sample-size-is-large)
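As a rough illustration of the A*, common-prefix, and symbol-count ideas above, here is a minimal Python sketch. It assumes Levenshtein distance as the cost model, and the function names (`counting_lower_bound`, `levenshtein_astar`) are my own; treat it as a sketch rather than a tuned implementation.

```python
import heapq
from collections import Counter


def counting_lower_bound(a: str, b: str) -> int:
    """Lower bound on the edit distance from symbol counts alone.

    Each edit operation fixes at most one surplus symbol on each side,
    so the larger surplus is a valid lower bound (e.g. "AAAB" vs "ABBB"
    gives 2, as in the example above).
    """
    ca, cb = Counter(a), Counter(b)
    surplus_a = sum((ca - cb).values())   # symbols a has too many of
    surplus_b = sum((cb - ca).values())   # symbols b has too many of
    return max(surplus_a, surplus_b)


def levenshtein_astar(a: str, b: str) -> int:
    """Exact Levenshtein distance via A* over the DP grid.

    A state (i, j) means the prefixes a[:i] and b[:j] are aligned.
    Completing the alignment needs at least |(len(a)-i) - (len(b)-j)|
    insertions or deletions, which is an admissible and consistent
    heuristic, so A* returns the exact distance while usually
    expanding far fewer cells than the full DP table.
    """
    # Strip the common prefix and suffix first; this never changes
    # the distance and shrinks the search space.
    while a and b and a[0] == b[0]:
        a, b = a[1:], b[1:]
    while a and b and a[-1] == b[-1]:
        a, b = a[:-1], b[:-1]

    n, m = len(a), len(b)

    def h(i: int, j: int) -> int:
        return abs((n - i) - (m - j))

    best = {(0, 0): 0}
    heap = [(h(0, 0), 0, 0, 0)]               # (f = g + h, g, i, j)
    while heap:
        f, g, i, j = heapq.heappop(heap)
        if (i, j) == (n, m):
            return g
        if g > best[(i, j)]:
            continue                           # stale queue entry
        moves = []
        if i < n and j < m:                    # match or substitute
            moves.append((i + 1, j + 1, 0 if a[i] == b[j] else 1))
        if i < n:                              # delete a[i]
            moves.append((i + 1, j, 1))
        if j < m:                              # insert b[j]
            moves.append((i, j + 1, 1))
        for ni, nj, cost in moves:
            ng = g + cost
            if ng < best.get((ni, nj), float("inf")):
                best[(ni, nj)] = ng
                heapq.heappush(heap, (ng + h(ni, nj), ng, ni, nj))
    return n + m                               # not reached in practice
```

A cheap way to use the counting bound when comparing against a large library is as a pre-filter: if `counting_lower_bound(a, b)` already exceeds your acceptance threshold, skip the exact computation for that pair entirely.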
 
