Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    primarykey
    data
    text
    <p><strong>Dynamic Programming solution:</strong></p> <p>Have a last position variable for each of the words you're looking for.</p> <p>Have a total count of distinct seen words you're looking for (will never decrease, max = count of words you're looking for).</p> <p>For each word position in the input:</p> <ul> <li>If the word exists in the list of words you're looking for, update the last position of that word.</li> <li>Increase the total count if the updated last position was not initialized.</li> <li>If the total count is equal to the max, loop through the last positions and find the smallest one. The distance between the current position and that value will be the length of the substring. Record these values and find the best one over all positions.</li> </ul> <p>An optimization is to have <a href="http://en.wikipedia.org/wiki/Heap_%28data_structure%29">a heap</a> for last positions to reduce the time taken to find the smallest one (should be used together with some structure (possibly a hash- or tree-map) that allows fast lookup of pointers into the heap given a word).</p> <p><strong>Example:</strong></p> <p>Input: <code>This is a test. This is a programming test. a programming test this is</code></p> <p>Looking for: <code>this, test, a, programming</code></p> <pre><code> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 This is a test. This is a programming test. a programming test this is this -1 1 1 1 1 5 5 5 5 5 5 5 5 13 13 test -1 -1 -1 -1 4 4 4 4 4 9 9 9 12 12 12 a -1 -1 -1 3 3 3 3 7 7 7 10 10 10 10 10 programming -1 -1 -1 -1 -1 -1 -1 -1 8 8 8 11 11 11 11 Count 0 1 1 2 3 3 3 3 4 4 4 4 4 4 4 Substr len NA NA NA NA NA NA NA NA 5 5 6 7 8 4 5 Shortest len NA NA NA NA NA NA NA NA 5 5 5 5 5 4 4 </code></pre> <p>Best result: <code>a programming test this</code>, length = 4.</p> <p><strong>Complexity Analysis:</strong></p> <p>Let <code>n</code> be the number of words in the input and <code>k</code> the number of words we're looking for.</p> <p>The algorithm only does one pass through the input and, at each step, does <code>O(log k)</code> work for the <code>getMin</code> operation (with the heap optimization).</p> <p>Thus the total time taken is <code>O(n log k)</code>.</p> <p><strong>Dealing with duplicates:</strong></p> <p>If duplicates are allowed in the words we're looking for (and the target sequence must match all occurrences), the algorithm above won't work as is, but a simple fix is to have each distinct word have its own heap of pointers into the original heap (the value in this heap being the same as the value in the original heap), with the maximum size of this heap being the number of occurrences of that word in the words we're looking for.</p>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. VO
      singulars
      1. This table or related slice is empty.
    2. VO
      singulars
      1. This table or related slice is empty.
    3. VO
      singulars
      1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload