Is there a diff algorithm that preserves line ownership?
<p>My goal is to write a script that tracks the point at which a line was added, even if the line is subsequently modified or moved around (both of which confuse traditional VCS 'blame' scripts). I've done some minor background research (see bottom) but didn't find anything useful. I have a concept for how to proceed, but the runtime would be atrocious (there's a factorial involved).</p>
<p>The two missing features are tracking lines edited in place separately from a deletion-and-addition of that line, and tracking entire functions that are moved around so that they end up in different hunks. For those experienced with diff but unfamiliar with the terminology, a subsequence is a contiguous group of <code>+</code> or <code>-</code> lines, with a type of either <code>delete</code> (all <code>-</code>), <code>add</code> (all <code>+</code>), or <code>replace</code> (a combination). I need more information on moves and <code>edit-in-place</code> lines, which are vaguely alluded to in an entry on <a href="http://c2.com/cgi/wiki?DiffAlgorithm" rel="nofollow noreferrer">c2: DiffAlgorithm</a> (the paragraph starts with "My favorite mode"). Does anyone know what that is? (It seems to be based on Tichy; see bottom.)</p>
<hr>
<p>Here's more info on the two missing features:</p>
<ol>
<li>There is no concept of a change on a line (a fourth type, something like <code>edit-in-place</code>). In this hunk, the parent of 'bc' is 'b', but 'd' is new and isn't a descendant of 'b':</li>
</ol>
<pre>
 a
-b
+bc
+d
</pre>
<p>The workaround for this isn't too complicated if the position of the edits is the same (just an expanded version of <a href="http://trac.edgewall.org/browser/trunk/trac/versioncontrol/diff.py?marks=149-162#L148" rel="nofollow noreferrer"><code>markup_intraline_changes</code></a>, but comparing edit distance on all equal-sized subsets of old and new lines).</p>
<ol start="2">
<li>There is no concept of "moving" code that preserves the ownership of the lines, e.g. this diff shouldn't alter the ownership of "line", although its position changes.</li>
</ol>
<pre>
 a
-line
 c
+line
</pre>
<p>This could be dealt with in the same way but with much worse runtime (instead of only checking single blocks marked 'replace', you'd need to check the Levenshtein distance between every added line and every removed line) and with likely false positives (some lines, like whitespace-only lines, aren't relevant to my problem).</p>
<p>Research I've done: reading about <a href="http://www.ddj.com/184407970?pgno=1" rel="nofollow noreferrer">gestalt pattern matching</a> (Ratcliff and Obershelp, used in Python's difflib) and <a href="http://www.xmailserver.org/diff2.pdf" rel="nofollow noreferrer">An O(ND) Difference Algorithm and its Variations</a> (E. W. Myers).</p>
<p>After posting the question, I found references to Tichy84, which appears to be <a href="http://portal.acm.org/citation.cfm?doid=357401.357404" rel="nofollow noreferrer">The string-to-string correction problem with block moves</a> (which I haven't read yet), according to Walter Tichy's paper a year later <a href="http://www.cs.purdue.edu/homes/trinkle/RCS/rcs.ps" rel="nofollow noreferrer">on RCS</a>.</p>
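<p>To make the "all added against all removed lines" idea concrete, here is a minimal sketch using Python's <code>difflib</code>. It is not the factorial approach mentioned above and not Tichy's block-move algorithm: it is a greedy pairing, and the function name <code>map_ownership</code> and the <code>threshold</code> value of 0.6 are assumptions made for illustration. It uses difflib's similarity ratio rather than a true Levenshtein distance, and it skips whitespace-only lines to reduce false positives.</p>

```python
import difflib


def map_ownership(old, new, threshold=0.6):
    """Return a list where entry j is the index in `old` that owns new[j],
    or None if new[j] is treated as a brand-new line.

    `threshold` is an assumed cut-off on difflib's similarity ratio.
    """
    owner = [None] * len(new)
    removed, added = [], []

    # Pass 1: an ordinary line diff. Unchanged lines keep their owner;
    # everything else is collected as removed or added.
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            for k in range(i2 - i1):
                owner[j1 + k] = i1 + k
        else:
            removed.extend(range(i1, i2))
            added.extend(range(j1, j2))

    # Pass 2: score every removed line against every added line.
    # This is the quadratic step; it covers both edits in place and moves.
    candidates = []
    for i in removed:
        for j in added:
            if not old[i].strip() or not new[j].strip():
                continue  # whitespace-only lines are too ambiguous to pair
            score = difflib.SequenceMatcher(
                None, old[i], new[j], autojunk=False
            ).ratio()
            if score >= threshold:
                candidates.append((score, i, j))

    # Greedy assignment: best score first, ties broken by smaller distance.
    candidates.sort(key=lambda c: (-c[0], abs(c[1] - c[2])))
    used_old, used_new = set(), set()
    for score, i, j in candidates:
        if i in used_old or j in used_new:
            continue
        owner[j] = i
        used_old.add(i)
        used_new.add(j)
    return owner


# The two hunks from the question:
print(map_ownership(["a", "b"], ["a", "bc", "d"]))          # [0, 1, None]
print(map_ownership(["a", "line", "c"], ["a", "c", "line"]))  # [0, 2, 1]
```

<p>In the first hunk 'bc' inherits from 'b' while 'd' stays unowned; in the second, "line" keeps its owner despite the move. Greedy pairing can pick a suboptimal match when several lines are similar, so an optimal assignment would be the next thing to try if false positives turn out to be a problem.</p>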