Find duplicate columns and replace matches with count
    <p>I have a tab-delimited file which has duplicate-named headings:</p>
    <pre><code>[Column1] \t [Column2] \t [test] \t [test] \t [test] \t [test] \t [Column3] \t [Column4]
</code></pre>
    <p>What I want to do is rename the duplicated columns ([test]) by appending an integer, so the header would become something like:</p>
    <pre><code>[Column1] \t [Column2] \t [test1] \t [test2] \t [test3] \t [test4] \t [Column3] \t [Column4]
</code></pre>
    <p>So far, I can isolate the first row and count the matches I have found:</p>
    <pre><code>string destinationUnformattedFileName = @"C:\New\20130816_Opportunities_unFormatted.txt";
string destinationFormattedFileName = @"C:\New\20130816_Opportunities_Formatted.txt";

// Open the unformatted file for reading
var unformattedFileStream = File.Open(destinationUnformattedFileName, FileMode.Open, FileAccess.Read);
// Create the formatted file for writing
var formattedFileStream = File.Open(destinationFormattedFileName, FileMode.Create, FileAccess.Write);

StreamReader sr = new StreamReader(unformattedFileStream);
StreamWriter sw = new StreamWriter(formattedFileStream);

int rowCounter = 0;
string currentRow;

// Read each row in the unformatted file
while ((currentRow = sr.ReadLine()) != null)
{
    // First row: check for duplicate names
    if (rowCounter == 0)
    {
        // Write column names to an array
        string delimiter = "\t";
        string[] fieldNames = currentRow.Split(delimiter.ToCharArray());

        foreach (string fieldName in fieldNames)
        {
            // fieldName must be followed by a tab for it to be a duplicate

            // Original code - causing the issue
            //Regex rgx = new Regex("\\t(" + fieldName + ")\\t");

            // Edit - resolved the issue
            Regex rgx = new Regex("(?&lt;=\\t|^)(" + fieldName + ")(\\t)+");

            // Count how many occurrences of fieldName are in currentRow
            int count = rgx.Matches(currentRow).Count;
            //MessageBox.Show("Match Count = " + count.ToString());

            // If we have a duplicate field name
            if (count &gt; 1)
            {
                string newFieldName = "\t" + fieldName + count.ToString() + "\t";
                //MessageBox.Show(newFieldName);
                currentRow = rgx.Replace(currentRow, newFieldName, 1);
            }
        }
    }
    rowCounter++;
}
</code></pre>
    <p>I think I'm on the right track, but I don't think the regexes are working correctly.</p>
    <p>Edit: I think I have figured out how to find the pattern, using:</p>
    <pre><code>Regex rgx = new Regex("(?&lt;=\\t|^)(" + fieldName + ")(\\t)+");
</code></pre>
    <p>It's not a deal breaker, but the only problem now is that it labels the columns as:</p>
    <pre><code>[Column1] \t [Column2] \t [test4] \t [test3] \t [test2] \t [test] \t [Column3] \t [Column4]
</code></pre>
    <p>instead of:</p>
    <pre><code>[Column1] \t [Column2] \t [test1] \t [test2] \t [test3] \t [test4] \t [Column3] \t [Column4]
</code></pre>
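    <p>For comparison, here is a minimal sketch of one way to get the left-to-right numbering shown in the last example. It does not use a regex at all: it splits the header on the delimiter, counts how often each name occurs, then numbers the repeated names in order. The names <code>HeaderDeduplicator</code> and <code>NumberDuplicateHeaders</code> are hypothetical, and the sketch assumes the header names contain no surrounding spaces or brackets.</p>
    <pre><code>using System;
using System.Collections.Generic;

// Hypothetical helper class, not part of the original code.
static class HeaderDeduplicator
{
    // Appends 1, 2, 3... (left to right) to every name that occurs more than once.
    public static string NumberDuplicateHeaders(string headerRow, char delimiter = '\t')
    {
        string[] names = headerRow.Split(delimiter);

        // First pass: count how often each name occurs.
        var totals = new Dictionary&lt;string, int&gt;();
        foreach (string name in names)
        {
            int n;
            totals.TryGetValue(name, out n); // n stays 0 if the name is new
            totals[name] = n + 1;
        }

        // Second pass: number only the names that are repeated.
        var seen = new Dictionary&lt;string, int&gt;();
        for (int i = 0; i &lt; names.Length; i++)
        {
            string name = names[i];
            if (totals[name] &gt; 1)
            {
                int k;
                seen.TryGetValue(name, out k);
                seen[name] = k + 1;
                names[i] = name + (k + 1);
            }
        }

        return string.Join(delimiter.ToString(), names);
    }
}
</code></pre>
    <p>Called on the first row only, for example <code>currentRow = HeaderDeduplicator.NumberDuplicateHeaders(currentRow);</code>, this leaves unique names such as Column1 untouched. Because it compares whole field names rather than building a pattern from them, names containing regex metacharacters need no escaping.</p>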