<p>You can see what it considers to be matching blocks:</p>
<pre><code>&gt;&gt;&gt; difflib.SequenceMatcher(isjunk=lambda x: x == " ", a="a b c", b="a bc").get_matching_blocks()
[Match(a=0, b=0, size=3), Match(a=4, b=3, size=1), Match(a=5, b=4, size=0)]
</code></pre>
<p>The first two tell you that it matches "a b" to "a b" and "c" to "c". (The last one is the zero-length sentinel that always ends the list.)</p>
<p>The question is why "a b" can be matched even though the space is junk. I found the answer in the code. First, the algorithm finds a bunch of matching blocks by repeatedly calling <code>find_longest_match</code>. What's notable about <code>find_longest_match</code> is that it allows junk characters at the ends of a block, as its docstring says:</p>
<pre><code>If isjunk is defined, first the longest matching block is
determined as above, but with the additional restriction that no
junk element appears in the block. Then that block is extended as
far as possible by matching (only) junk elements on both sides. So
the resulting block never matches on junk except as identical junk
happens to be adjacent to an "interesting" match.
</code></pre>
<p>This means that it first finds "a " and "b" as separate matches: the junk space after "a" is absorbed into the first block, because identical junk next to an interesting match is allowed to extend it.</p>
<p>Then, the interesting part: the code does one last check to see whether any of the blocks are adjacent, and merges them if they are. See this comment in the code:</p>
<pre><code># It's possible that we have adjacent equal blocks in the
# matching_blocks list now. Starting with 2.5, this code was added
# to collapse them.
</code></pre>
<p>So basically it matches "a " and "b", then merges those two adjacent blocks into "a b" and calls that a single match, despite the space character being junk.</p>
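<p>To see the blocks before they are merged, you can call <code>find_longest_match</code> yourself. This is only a rough sketch: it assumes that <code>get_matching_blocks</code> takes the longest block over the whole range first and then searches the region to its right, and the variable names (<code>first</code>, <code>second</code>, <code>adjacent</code>) are mine, not from difflib.</p>

```python
import difflib

a, b = "a b c", "a bc"
sm = difflib.SequenceMatcher(isjunk=lambda x: x == " ", a=a, b=b)

# Search the full ranges: the longest junk-free match is found first,
# then extended over identical junk that sits right next to it.
first = sm.find_longest_match(0, len(a), 0, len(b))

# Assumption: the next search covers the region to the right of that block.
second = sm.find_longest_match(first.a + first.size, len(a),
                               first.b + first.size, len(b))

for m in (first, second):
    print(m, repr(a[m.a:m.a + m.size]))

# Blocks are adjacent when one ends exactly where the next begins
# in both strings; get_matching_blocks() collapses such pairs.
adjacent = (first.a + first.size == second.a
            and first.b + first.size == second.b)
print(adjacent, sm.get_matching_blocks()[0])
```

<p>Here the first block covers "a " (size 2) and the second covers "b" (size 1). Because they touch in both strings, <code>get_matching_blocks</code> reports them as one block of size 3.</p>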
 
