Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    text
    copied!<p>For the purposes of the discussion below, I'll assume that you have some way of reading the CSV files into a class. I'll call that class <code>MyRecord</code>.</p> <p>Load the files into separate lists, call them <code>NewList</code> and <code>OldList</code>:</p> <pre><code>List&lt;MyRecord&gt; NewList = LoadFile("newFilename"); List&lt;MyRecord&gt; OldList = LoadFile("oldFilename"); </code></pre> <p>There's perhaps a more elegant way to do this with LINQ, but the idea is to do a straight merge. First you have to sort the two lists. Either your <code>MyRecord</code> class implements <code>IComparable</code>, or you supply your own comparison delegate:</p> <pre><code>NewList.Sort(/* delegate here */); OldList.Sort(/* delegate here */); </code></pre> <p>You can skip the delegate if <code>MyRecord</code> implements <code>IComparable</code>.</p> <p>Now it's a straight merge.</p> <pre><code>int ixNew = 0; int ixOld = 0; while (ixNew &lt; NewList.Count &amp;&amp; ixOld &lt; OldList.Count) { // Again with the comparison delegate. // I'll assume that MyRecord implements IComparable int cmpRslt = OldList[ixOld].CompareTo(NewList[ixNew]); if (cmpRslt == 0) { // records have the same customer id. // compare for changes. ++ixNew; ++ixOld; } else if (cmpRslt &lt; 0) { // this old record is not in the new file. It's been deleted. ++ixOld; } else { // this new record is not in the old file. It was added. ++ixNew; } } // At this point, one of the lists might still have items. while (ixNew &lt; NewList.Count) { // NewList[ixNew] is an added record ++ixNew; } while (ixOld &lt; OldList.Count) { // OldList[ixOld] is a deleted record } </code></pre> <p>With just 120,000 records, that should execute very quickly. I would be very surprised if doing the merge took as long as loading the data from disk.</p> <p><strong>EDIT: A LINQ solution</strong></p> <p>I pondered how one would do this with LINQ. I can't do exactly the same thing as the merge above, but I can get the added, removed, and changed items in separate collections.<br> For this to work, <code>MyRecord</code> will have to implement <code>IEquatable&lt;MyRecord&gt;</code> and also override <code>GetHashCode</code>.</p> <pre><code>var AddedItems = NewList.Except(OldList); var RemovedItems = OldList.Except(NewList); var OldListLookup = OldList.ToLookup(t =&gt; t.Id); var ItemsInBothLists = from newThing in NewList let oldThing = OldListLookup[newThing.Id].FirstOrDefault() where oldThing != null select new { oldThing = oldThing, newThing = newThing }; </code></pre> <p>In the above, I assume that <code>MyRecord</code> has an <code>Id</code> property that is unique.</p> <p>If you want just the changed items instead of all the items that are in both lists:</p> <pre><code>var ChangedItems = from newThing in NewList let oldThing = OldListLookup[newThing.Id].FirstOrDefault() where oldThing != null &amp;&amp; CompareItems(oldThing, newThing) != 0 select new { oldThing = oldThing, newThing = newThing }; </code></pre> <p>The assumption is that the <code>CompareItems</code> method will do a deep comparison of the two items and return 0 if they compare equal or non-zero if something has changed.</p>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload