<p>It's impossible to answer your question without more information. As you've stated it, you want to remove consecutive duplicates from an iterable. You can do that with <code>itertools.groupby</code>:</p>
<pre><code>&gt;&gt;&gt; from itertools import groupby
&gt;&gt;&gt; "".join(c for c, _ in groupby("yeeessssss"))
'yes'
</code></pre>
<p>Of course, that will collapse <strong>all</strong> runs of repeated letters, including the ones that belong in the word:</p>
<pre><code>&gt;&gt;&gt; dedupe = lambda s: "".join(c for c, _ in groupby(s))
&gt;&gt;&gt; dedupe("hello")
'helo'
&gt;&gt;&gt; dedupe("Mississippi")
'Misisipi'
</code></pre>
<p>I think your question is probably much more difficult; namely, how to normalise words which might have duplicate letters into <em>actual English words</em>. This is basically impossible to do precisely -- what would <code>beeeeeee</code> or <code>feeeed</code> become? -- but, with a lot of effort, you could probably approximate it with any of various heuristics.</p>
<p>One simple heuristic would be to check whether the word is in a dictionary and, if not, remove duplicate letters one at a time until it is. This will be very inefficient, but might work.</p>
<p>Another way would be to use a natural-language library to convert the word to some "normal form". This might be based on how it sounds, how it is spelled, or something else. You could then find the closest word to that normal form and use <em>it</em> as your de-duplicated word.</p>
<p>Yet another way would be to define some sort of "modification distance" between strings (usually called edit distance), where you assign a fixed cost to each of the operations "delete a character", "insert a character", and "modify a character". You could then compute the closest dictionary word to the input under this metric. This is a well-studied problem because it is used in bioinformatics, and there is an elegant dynamic programming approach to it. Unfortunately, it's also really quite challenging to work out (a related question was a several-week coursework project in my undergraduate degree).</p>
<hr>
<p>tl;dr</p>
<p>Just removing consecutive duplicates is easy. Finding the best approximation as an <em>English word</em> is Very Hard.</p>
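<p>As a rough illustration of the dictionary and distance heuristics described above, here is a minimal sketch. The names <code>WORDS</code>, <code>shrink_to_word</code>, <code>edit_distance</code> and <code>closest_word</code> are hypothetical, the word list is a tiny stand-in for a real dictionary, and every edit operation is assumed to cost 1 (which makes the metric Levenshtein distance). It is not tuned for speed or accuracy.</p>

```python
from collections import deque

# Hypothetical stand-in word list; a real one would be loaded from a dictionary file.
WORDS = {"yes", "hello", "feed", "fed", "bee", "be"}


def shrink_to_word(word, words):
    """Drop one repeated letter at a time (breadth-first) until a known word appears.

    Returns the first dictionary word reached with the fewest removals,
    or None if no word can be reached this way.
    """
    seen = {word}
    queue = deque([word])
    while queue:
        current = queue.popleft()
        if current in words:
            return current
        for i in range(1, len(current)):
            # Only a letter that repeats its left neighbour may be removed.
            if current[i] == current[i - 1]:
                shorter = current[:i] + current[i + 1:]
                if shorter not in seen:
                    seen.add(shorter)
                    queue.append(shorter)
    return None


def edit_distance(a, b):
    """Edit distance with unit cost for delete, insert and modify (assumed costs)."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # modify ca into cb (free if equal)
            ))
        previous = current
    return previous[-1]


def closest_word(word, words):
    """Return the dictionary word nearest to `word`; ties go to the alphabetically first."""
    return min(sorted(words), key=lambda candidate: edit_distance(word, candidate))
```

<p>With this word list, <code>shrink_to_word("yeeessssss", WORDS)</code> returns <code>'yes'</code> and <code>shrink_to_word("feeeed", WORDS)</code> returns <code>'feed'</code>, because the breadth-first search prefers the candidate that needs the fewest removals. Which answer is "right" for an input like <code>feeeed</code> still depends on the dictionary and the tie-breaking rule, which is exactly why the general problem is hard.</p>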