  1. How to remove duplicate lines from a text file that has more than 200K rows and is 1 GB in size?
    <p>I am currently using the following code. It works only for a text file of about 300 lines, and even then it takes 2 minutes to run. My text file has more than 200K rows (lines), so this code does not work for that file. Can anyone help me solve this problem? Thanks in advance.</p>
<pre><code>string[] source = System.IO.File.ReadAllLines(@"C:\Documents and Settings\finaloutput.txt");

// Project each CSV line into a record with named fields
var q1 = (from line in source
          let fields = line.Split(',')
          select new
          {
              autoid = fields[0],
              ATMID = fields[4],
              DATE = fields[2],
              TIME = fields[3],
              CARDNo = fields[5],
              TRANSId = fields[6],
              SEQNo = fields[7],
              TRANSIT = fields[8],
              CheckNo = fields[9],
              CATEGORY = fields[10],
              SCORE = fields[11],
              //THRESHOLD = fields[12]
          });

// For each group of non-"Accepted" rows sharing the same key, keep the smallest autoid
var ids = (from d in q1
           where d.CATEGORY != "Accepted"
           group d by new { d.ATMID, d.DATE, d.CARDNo, d.TRANSIT, d.CheckNo } into grp
           select grp.Min(x =&gt; x.autoid));

// Every other non-"Accepted" row is a duplicate
var toDelete = (from d in q1
                where !ids.Contains(d.autoid) &amp;&amp; d.CATEGORY != "Accepted"
                select d.autoid);

// source1.DeleteOnSubmit(toDelete);

var distinct = (from d in q1
                where !toDelete.Contains(d.autoid)
                select d);

// Makes a list of the DeletedFields
// var list_Of_CSV_ItemsDeleted = distinct.Select(x =&gt; string.Join(",", x.autoid));

// Makes a list of the distinct Fields
var list_Of_CSV_ItemsDistinct = distinct.Select(x =&gt; string.Join(",", x.autoid, x.ATMID, x.DATE, x.TIME, x.CARDNo, x.TRANSId, x.SEQNo, x.TRANSIT, x.CheckNo, x.CATEGORY, x.SCORE));

System.IO.File.WriteAllLines(@"C:\Documents and Settings\distict1.txt", list_Of_CSV_ItemsDistinct);
</code></pre>
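A likely reason the code above slows down so sharply is that `ids` and `toDelete` are deferred LINQ queries, so every call to `Contains` re-runs the grouping over the whole file, which makes the work grow far faster than the number of lines. One possible alternative, offered as a minimal sketch and not a definitive fix, is to stream the file once with `File.ReadLines` and remember the keys already seen in a `HashSet<string>`. The sketch rests on assumptions that are not stated in the question: `autoid` increases from top to bottom, so the first row of each group is the one with the smallest `autoid`; lines are written back unchanged, not re-projected into the 11 selected columns; and rows with fewer than 11 fields are passed through untouched. The class name `CsvDeduplicator` and the method `RemoveDuplicates` are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class CsvDeduplicator
{
    // Hypothetical helper: lazily yields the lines to keep, in their original order.
    public static IEnumerable<string> RemoveDuplicates(IEnumerable<string> lines)
    {
        // Keys of non-"Accepted" rows that have already been emitted
        var seen = new HashSet<string>();

        foreach (var line in lines)
        {
            var f = line.Split(',');

            // Assumption: short rows and "Accepted" rows are always kept
            if (f.Length <= 10 || f[10] == "Accepted")
            {
                yield return line;
                continue;
            }

            // Same key as the question: ATMID, DATE, CARDNo, TRANSIT, CheckNo.
            // '\u0001' is used as a separator that should not occur in the data.
            var key = string.Join("\u0001", f[4], f[2], f[5], f[8], f[9]);

            // HashSet.Add returns false when the key is already present,
            // so only the first row of each group is kept (assumed smallest autoid)
            if (seen.Add(key))
                yield return line;
        }
    }

    static void Main()
    {
        var input = @"C:\Documents and Settings\finaloutput.txt";
        var output = @"C:\Documents and Settings\distict1.txt";

        // ReadLines streams the input, so the 1 GB file is never held in memory at once
        File.WriteAllLines(output, RemoveDuplicates(File.ReadLines(input)));
    }
}
```

With this shape each line is split once and looked up once, so the running time should grow roughly linearly with the file size, and memory use is bounded by the number of distinct keys, not by the file itself. If `autoid` is not guaranteed to be ascending, a two-pass variant (first pass records the smallest `autoid` per key in a `Dictionary`, second pass writes the matching rows) would stay closer to the original query.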