<p>Sort the strings. Keep the first X characters of each string. If this prefix is shared with the string before or after, then advance until characters unique to the string (compared to the string before and after) are found. (If no unique characters are found, the string has no unique part; see the end of this post.) Add ellipses before and after those unique characters.</p>
<p>Note that this still might look funny:</p>
<pre><code>Microsoft Office -&gt; Micro...ffice
Microsoft Outlook -&gt; Micro...utlook
</code></pre>
<p>I don't know what language you're looking to do this in, but here's a Python implementation.</p>
<pre><code>def unique_index(before, current, after, size):
    '''Returns the index in _current_ of the first part of length _size_
    that distinguishes it from both _before_ and _after_. If no such index
    is found, returns the index of the last _size_ letters of _current_.'''
    before_unique = False
    after_unique = False
    for i in range(len(current)-size):
        #this will be incorrect in the case mentioned below
        if i &gt; len(before)-1 or before[i] != current[i]:
            before_unique = True
        if i &gt; len(after)-1 or after[i] != current[i]:
            after_unique = True
        if before_unique and after_unique:
            return i

    return len(current)-size

def ellipsize(entries, prefix_size, max_string_length):
    non_prefix_size = max_string_length - prefix_size #-len("...")? Post isn't clear about this.

    #If you want to preserve order then make a copy and make a mapping from the copy to the original
    entries.sort()

    ellipsized = []

    # you could probably remove all this indexing with something out of itertools
    for i in range(len(entries)):
        current = entries[i]

        #entry is already short enough, don't need to truncate
        if len(current) &lt;= max_string_length:
            ellipsized.append(current)
            continue

        #grab empty strings if there's no string before/after
        if i == 0:
            before = ''
        else:
            before = entries[i-1]
        if i == len(entries)-1:
            after = ''
        else:
            after = entries[i+1]

        #Is the prefix unique? If so, we're done.
        current_prefix = entries[i][:prefix_size]
        if not before.startswith(current_prefix) and not after.startswith(current_prefix):
            ellipsized.append(current[:max_string_length] + '...') #again, possibly -3

        #Otherwise find the unique part after the prefix if it exists.
        else:
            index = prefix_size + unique_index(before[prefix_size:], current[prefix_size:], after[prefix_size:], non_prefix_size)
            if index == prefix_size:
                header = ''
            else:
                header = '...'
            if index + non_prefix_size == len(current):
                trailer = ''
            else:
                trailer = '...'
            ellipsized.append(entries[i][:prefix_size] + header + entries[i][index:index+non_prefix_size] + trailer)
    return ellipsized
</code></pre>
<p>Also, you mention the strings themselves are unique, but do they all have unique parts? For example, "Microsoft" and "Microsoft Internet Explorer 7" are two different strings, but the first has no part that distinguishes it from the second. If this can happen, then you'll have to add something to your spec saying how to make this case unambiguous. (If you add "Xicrosoft", "MXcrosoft", "MiXrosoft", etc. to the mix with these two strings, there is <em>no</em> string shorter than the original that uniquely represents "Microsoft".) (Another way to think about it: if you have all possible X-letter strings, you can't map them all to distinct strings of X-1 or fewer letters. This is essentially a compression method, and no compression method can compress <strong>all</strong> inputs.)</p>
<p>Results for the examples from the original post:</p>
<pre><code>&gt;&gt;&gt; for entry in ellipsize(["Microsoft Internet Explorer 6", "Microsoft Internet Explorer 7", "Microsoft Internet Explorer 8", "Mozilla Firefox 3", "Mozilla Firefox 4", "Google Chrome 14"], 7, 20): print(entry)

Google Chrome 14
Microso...et Explorer 6
Microso...et Explorer 7
Microso...et Explorer 8
Mozilla Firefox 3
Mozilla Firefox 4
&gt;&gt;&gt; for entry in ellipsize(["Minutes of Company Meeting, 5/25/2010 -- Internal use only", "Minutes of Company Meeting, 6/24/2010 -- Internal use only", "Minutes of Company Meeting, 7/23/2010 -- Internal use only"], 15, 40): print(entry)

Minutes of Comp...5/25/2010 -- Internal use...
Minutes of Comp...6/24/2010 -- Internal use...
Minutes of Comp...7/23/2010 -- Internal use...
</code></pre>
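<p>Two loose ends from the answer, the code comment about preserving the caller's order and the caveat about strings with no unique part, could be handled by small helpers along these lines. This is only a sketch: the names <code>shorten_in_original_order</code>, <code>ambiguous_labels</code> and <code>first_nine</code> are hypothetical, and <code>shorten</code> is assumed to be any callable that takes a sorted list of strings and returns one label per string in the same order.</p>

```python
from collections import Counter


def shorten_in_original_order(entries, shorten):
    """Apply `shorten` to a sorted copy of `entries` and return the
    resulting labels in the caller's original order.

    `shorten` (an assumption of this sketch) takes a sorted list of
    strings and returns one label per string, in that sorted order."""
    # Indices of `entries` in sorted order; `entries` itself is not modified.
    order = sorted(range(len(entries)), key=lambda i: entries[i])
    labels = shorten([entries[i] for i in order])
    restored = [None] * len(entries)
    for sorted_pos, original_pos in enumerate(order):
        restored[original_pos] = labels[sorted_pos]
    return restored


def ambiguous_labels(labels):
    """Return, sorted, every label that occurs more than once."""
    counts = Counter(labels)
    return sorted(label for label, n in counts.items() if n > 1)


if __name__ == "__main__":
    # Hypothetical stand-in for a real shortener: keep the first 9 characters.
    def first_nine(sorted_entries):
        return [s[:9] for s in sorted_entries]

    names = ["Mozilla Firefox 3", "Microsoft", "Google Chrome 14",
             "Microsoft Internet Explorer 7"]
    labels = shorten_in_original_order(names, first_nine)
    print(labels)
    print(ambiguous_labels(labels))
```

<p>With the function from the answer, the call would presumably look like <code>shorten_in_original_order(names, lambda s: ellipsize(s, 7, 20))</code>. Because the helper passes a copy, the in-place <code>entries.sort()</code> no longer reorders the caller's list, and a non-empty result from <code>ambiguous_labels</code> flags inputs where the shortened forms collide.</p>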
 
