Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    primarykey
    data
    text
    <p>I think you're looking for a solution to a special case of the <a href="http://en.wikipedia.org/wiki/Longest_common_substring_problem" rel="nofollow">longest common substring problem</a>. While that problem is solvable using suffix trees or dynamic programming, your special case of sorted "strings" is easier to solve.</p> <p>Here's code that I think will give you the values you want. It's single argument is a sequence of sorted sequences. Its return value is list containing a 2-tuple for each of the inner sequences. The tuple values are slice indexes to the longest common substring between the sequences. Note that if there is no common substring, the tuples will all be <code>(0,0)</code>, which will result in empty slices (which I think is correct, since the empty slices will all be equal to each other!).</p> <p>The code:</p> <pre><code>def longest_common_substring_sorted(sequences): l = len(sequences) current_indexes = [0]*l current_substring_length = 0 current_substring_starts = [0]*l longest_substring_length = 0 longest_substring_starts = current_substring_starts while all(index &lt; len(sequence) for index, sequence in zip(current_indexes, sequences)): m = min(sequence[index] for index, sequence in zip(current_indexes, sequences)) common = True for i in range(l): if sequences[i][current_indexes[i]] == m: current_indexes[i] += 1 else: common = False if common: current_substring_length += 1 else: if current_substring_length &gt; longest_substring_length: longest_substring_length = current_substring_length longest_substring_starts = current_substring_starts current_substring_length = 0 current_substring_starts = list(current_indexes) if current_substring_length &gt; longest_substring_length: longest_substring_length = current_substring_length longest_substring_starts = current_substring_starts return [(i, i+longest_substring_length) for i in longest_substring_starts] </code></pre> <p>Test output:</p> <pre><code>&gt;&gt;&gt; a=[1,2,3,4,5,6] &gt;&gt;&gt; b=[1,2,3,5,6,7] &gt;&gt;&gt; c=[3,4,5,6,7,8] &gt;&gt;&gt; longest_common_substring_sorted((a,b,c)) [(4, 6), (3, 5), (2, 4)] </code></pre> <p>I'm sorry about not having commented the code very well. The algorithm is somewhat similar to the merge step of a mergesort. Basically, it keeps track of an index into each of the sequences. As it iterates, it increments all of the indexes that correspond to values that are equal to the smallest value. If the current values in all of the lists are equal (to the smallest value, and thus to each other), it knows that it is within a substring common to all of them. When a substring ends, it is checked against the longest substring found so far.</p>
    singulars
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. VO
      singulars
      1. This table or related slice is empty.
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload