Note that there are some explanatory texts on larger screens.

plurals
  1. POBest way to avoid adding duplicates in database
    text
    copied!<p>I have a SQL Server table with three columns: </p> <p><strong>Table1</strong></p> <pre><code>col1 int col2 int col3 string </code></pre> <p>I have a unique constraint defined for all three columns <code>(col1, col2, col3)</code></p> <p>Now, I have a .csv file from which I want to add records in this table and the *.csv file can have duplicate records.</p> <p>I have searched for various options for avoiding duplicates in above scenario. Below are the three options which are working well for me. Please have a look and throw some ideas on pros/cons of each method so I can choose the best one.</p> <p><strong>Option#1 :</strong> </p> <p>Avoiding duplicates in the first place i.e. while adding objects to the list from csv file. I have used <code>HashSet&lt;T&gt;</code> for this and overridden below methods for type T: </p> <pre><code>public override int GetHashCode() { return col1.GetHashCode() + col2.GetHashCode() + col3.GetHashCode(); } public override bool Equals(object obj) { var other = obj as T; if (other == null) { return false; } return col1 == other.col1 &amp;&amp; col2 == other.col2 &amp;&amp; col3 == other.col3; } </code></pre> <p><strong>option #2</strong> </p> <p>Having <code>List&lt;T&gt;</code> instead of <code>HashSet&lt;T&gt;</code>.</p> <p>Removing duplicates after all the objects are added to <code>List&lt;T&gt;</code></p> <pre><code> List&lt;T&gt; distinctObjects = allObjects .GroupBy(x =&gt; new {x.col1, x.col2, x.col3}) .Select(x =&gt; x.First()).ToList(); </code></pre> <p><strong>option #3</strong> </p> <p>Removing duplicates after all the objects are added to <code>DataTable</code>.</p> <pre><code>public static DataTable RemoveDuplicatesRows(DataTable dataTable) { IEnumerable&lt;DataRow&gt; uniqueRows = dataTable.AsEnumerable().Distinct(DataRowComparer.Default); DataTable dataTable2 = uniqueRows.CopyToDataTable(); return dataTable2; } </code></pre> <p>Although I have not compared their running time, but I prefer option#1 as I am removing duplicates as a first step - so moving ahead only with what is required.</p> <p>Please share your views so I can choose the best one.</p> <p>Thanks a lot!</p>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload