StackOverflow2013

Note that there are some explanatory texts on larger screens.

plurals

PO
text
Body
copied!<p>This is <a href="http://en.wikipedia.org/wiki/Topological_sorting" rel="noreferrer">topological sorting</a> on a <a href="http://en.wikipedia.org/wiki/Directed_acyclic_graph" rel="noreferrer">directed acyclic graph</a>. You need to first build the graph: vertices are letters, and there's an edge if one is lexicographically less than the other. The topological order then gives you the answer.</p> <p>A contradiction is when the directed graph is not acyclic. Uniqueness is determined by whether or not a Hamiltonian path exists, which is testable in polynomial time.</p> <hr> <h3>Building the graph</h3> <p>You do this by comparing each two consecutive "words" from the dictionary. Let's say you have these two words appearing one after another:</p> <pre><code>x156@ x1$#2z </code></pre> <p>Then you find the longest common prefix, <code>x1</code> in this case, and check the immediately following characters after this prefix. In this case,, we have <code>5</code> and <code>$</code>. Since the words appear in this order in the dictionary, we can determine that <code>5</code> must be lexicographically smaller than <code>$</code>.</p> <p>Similarly, given the following words (appearing one after another in the dictionary)</p> <pre><code>jhdsgf 19846 19846adlk </code></pre> <p>We can tell that <code>'j' < '1'</code> from the first pair (where the longest common prefix is the empty string). The second pair doesn't tell us anything useful (since one is a prefix of another, so there are no characters to compare).</p> <p>Now suppose later we see the following:</p> <pre><code>oi1019823 oij(*#@&$ </code></pre> <p>Then we've found a contradiction, because this pair says that <code>'1' < 'j'</code>.</p> <hr> <h3>The topological sort</h3> <p>There are two traditional ways to do topological sorting. Algorithmically simpler is the <a href="http://en.wikipedia.org/wiki/Depth-first_search" rel="noreferrer">depth-first search</a> approach, where there's an edge from <code>x</code> to <code>y</code> if <code>y < x</code>.</p> <p>The pseudocode of the algorithm is given in Wikipedia:</p> <pre><code>L ← Empty list that will contain the sorted nodes S ← Set of all nodes with no incoming edges function visit(node n) if n has not been visited yet then mark n as visited for each node m with an edge from n to m do visit(m) add n to L for each node n in S do visit(n) </code></pre> <p>Upon conclusion of the above algorithm, the list <code>L</code> would contain the vertices in topological order.</p> <hr> <h3>Checking uniqueness</h3> <p>The following is a quote from Wikipedia:</p> <blockquote> <p>If a topological sort has the property that all pairs of consecutive vertices in the sorted order are connected by edges, then these edges form a directed Hamiltonian path in the DAG. If a Hamiltonian path exists, the topological sort order is unique; no other order respects the edges of the path. Conversely, if a topological sort does not form a Hamiltonian path, the DAG will have two or more valid topological orderings, for in this case it is always possible to form a second valid ordering by swapping two consecutive vertices that are not connected by an edge to each other. Therefore, it is possible to test in polynomial time whether a unique ordering exists, and whether a Hamiltonian path exists.</p> </blockquote> <p>Thus, to check if the order is unique or not, you simply check if all two consecutive vertices in <code>L</code> (from the above algorithm) are connected by direct edges. If they are, then the order is unique.</p> <hr> <h3>Complexity analysis</h3> <p>Once the graph is built, topological sort is <code>O(|V|+|E|)</code>. Uniqueness check is <code>O(|V| edgeTest)</code>, where <code>edgeTest</code> is the complexity of testing whether two vertices are connected by an edge. With an <a href="http://en.wikipedia.org/wiki/Adjacency_matrix" rel="noreferrer">adjacency matrix</a>, this is <code>O(1)</code>.</p> <p>Building the graph requires only a single linear scan of the dictionary. If there are <code>W</code> words, then it's <code>O(W cmp)</code>, where <code>cmp</code> is the complexity of comparing two words. You always compare two subsequent words, so you can do all sorts of optimizations if necessary, but otherwise a naive comparison is <code>O(L)</code> where <code>L</code> is the length of the words.</p> <p>You may also shortcircuit reading the dictionary once you've determined that you have enough information about the alphabet, etc, but even a naive building step would take <code>O(WL)</code>, which is the size of the dictionary.</p>

Querying!

Guidance

An individual column

Larger individual text columns get their own page to allow for proper reading.

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload