Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    primarykey
    data
    text
    <p>I'd see this as largely equivalent to the <em>diff</em> / <em>patch</em> problem, where the aim is to find the minimal number of changes between one text file and another and then to apply those changes. In that case, the implementation defines the operations:</p> <ul> <li>add or insert</li> <li>delete</li> <li>change</li> </ul> <p>... but not <em>move</em>. The reason for omitting <em>move</em> isn't immediately obvious to me but I'd strongly suspect that including <em>move</em> would require a very costly computation to find the optimal movement.</p> <p>So, if we restrict the operations to those listed above, the Hunt-McIlroy algorithm described in <a href="http://www.cs.dartmouth.edu/~doug/diff.pdf" rel="nofollow noreferrer">An Algorithm for Differential File Comparison</a>, or one of its <a href="http://en.wikipedia.org/wiki/Diff" rel="nofollow noreferrer">descendents</a>, would find a close-to-optimal set of changes. </p> <p>The difference between your problem and the classic <em>diff</em> / <em>patch</em> is that you have a two-dimensional table, while <em>diff</em> / <em>patch</em> deals with a one-dimensional set of items (lines of text). The best way to convert the 2-D problem into the 1-D problem would depend on the particular characteristics of the changes that tend to be made in your data table. </p> <p>For example, if the table is <em>n</em> rows by <em>m</em> columns and changes tend to be grouped in rows or rows are inserted or deleted as a whole, then you would likely be best to consider the table as if it was a text file and do the <em>diff</em> line by line. Alternatively, if changes tend to be grouped in columns or columns are inserted or deleted, you could do the <em>diff</em> column by column. If the alterations include inserting or deleting individual cells (which result in the subsequent cells shifting right or left as a result), you could treat the table as if each cell in the table was on a separate line of the text file, linearising the table in either row-first or column-first order. </p> <p>Without knowing the details of your problem in this respect, however, my inclination would be to avoid premature optimisation. Thus, I'd tend to start by implementing the Hunt-McIlroy algorithm row-by-row if <em>m</em> &lt; <em>n</em>, or column-by-column if <em>n</em> &lt; <em>m</em>, then profile the application in use for a while before deciding that either a more sophisticated version of the algorithm or an alternative mapping of your problem to the Hunt-McIlroy solution was warranted. </p> <p>A good discussion of a variety of <em>diff</em> algorithms can be found <a href="https://stackoverflow.com/questions/805626/diff-algorithm">here on stackoverflow</a>.</p>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. VO
      singulars
      1. This table or related slice is empty.
    2. VO
      singulars
      1. This table or related slice is empty.
    3. VO
      singulars
      1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload