linux awk comparing two csv files and creating a new file with a flag
    <p>I have two CSV files that I need to compare, writing the differences to a newly formatted file. Samples are given below.</p>
<p><strong>OLD file</strong></p>
<pre><code>DTL,11111111,1111111111111111,11111111111,Y,N,xx,xx
DTL,22222222,2222222222222222,22222222222,Y,Y,cc,cc
DTL,33333333,3333333333333333,33333333333,Y,Y,dd,dd
DTL,44444444,4444444444444444,44444444444,Y,Y,ss,ss
DTL,55555555,5555555555555555,55555555555,Y,Y,qq,qq
</code></pre>
<p><strong>NEW file</strong></p>
<pre><code>DTL,11111111,1111111111111111,11111111111,Y,Y,xx,xx
DTL,22222222,2222222222222222,22222222222,Y,N,cc,cc
DTL,44444444,4444444444444444,44444444444,Y,Y,ss,ss
DTL,55555555,5555555555555555,55555555555,Y,Y,qq,qq
DTL,77777777,7777777777777777,77777777777,N,N,ee,ee
</code></pre>
<p><strong>Output file</strong></p>
<p>I want to compare the old and new CSV files, find the changes that have taken effect in the new file, and add a flag to denote each change:</p>
<p>U - if the record in the new file is updated<br>D - if a record in the old file has been deleted from the new file<br>N - if a record in the new file does not exist in the old file</p>
<p>The sample output file looks like this.</p>
<pre><code>DTL,11111111,1111111111111111,11111111111,Y,Y,xx,xx U
DTL,22222222,2222222222222222,22222222222,Y,N,cc,cc U
DTL,33333333,3333333333333333,33333333333,Y,Y,dd,dd D
DTL,77777777,7777777777777777,77777777777,N,N,ee,ee N
</code></pre>
<p>I used the diff command, but it repeats the updated records too, which is not what I want.</p>
<pre><code>DTL,11111111,1111111111111111,11111111111,Y,N,xx,xx
DTL,22222222,2222222222222222,22222222222,Y,Y,cc,cc
DTL,33333333,3333333333333333,33333333333,Y,Y,dd,dd
---
DTL,11111111,1111111111111111,11111111111,Y,Y,xx,xx
DTL,22222222,2222222222222222,22222222222,Y,N,cc,cc
5a5
DTL,77777777,7777777777777777,77777777777,N,N,ee,ee
</code></pre>
<p>I also used an AWK one-liner to filter my records:</p>
<pre><code>awk 'NR==FNR{A[$1];next}!($1 in A)' FS=: old.csv new.csv
</code></pre>
<p>The problem with this is that it doesn't give me the records that belong only to the OLD file, which is</p>
<pre><code>DTL,33333333,3333333333333333,33333333333,Y,Y,dd,dd
</code></pre>
<p>I also started an AWK-driven bash script to achieve this, but didn't find much help in the way of a good example.</p>
<pre><code>myscript.awk
BEGIN {
    FS = ","   # input field separator
    OFS = ","  # output field separator
}
NR &gt; 1 {
    #flag
    # N - new record D- Deleted U - Updated
    id = $1
    name = $2
    flag = 'N'
    # This prints the columns in the new order. The commas tell Awk to use the character set in OFS
    print id,name,flag
}

&gt;&gt; awk -f myscript.awk old.csv new.csv &gt; formatted.csv
</code></pre>
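One possible way to produce the flagged output described above is sketched below. It is only a sketch under stated assumptions: the second comma-separated field is taken to be the unique record key (the question does not say which field is the key), the old file is assumed to be non-empty so that the `NR == FNR` idiom identifies it, and `flag_changes` is a hypothetical helper name chosen for illustration.

```shell
#!/bin/sh
# flag_changes OLD NEW   (hypothetical helper name)
# Prints every record that differs between the two CSV files, followed by a
# space and a flag: U = updated, D = deleted, N = new.
# Assumption: field 2 uniquely identifies a record in both files.
flag_changes() {
    awk -F, '
        # First file (old): remember each whole record under its key.
        NR == FNR { old[$2] = $0; next }

        # Second file (new): compare each record with the stored old one.
        {
            seen[$2] = 1
            if (!($2 in old))
                print $0 " N"          # key exists only in the new file
            else if (old[$2] != $0)
                print $0 " U"          # key exists in both, content differs
        }

        # Old keys that never appeared in the new file were deleted.
        END {
            for (k in old)
                if (!(k in seen))
                    print old[k] " D"
        }
    ' "$1" "$2" | LC_ALL=C sort -t, -k2,2   # order the result by the key field
}
```

It could be called as `flag_changes old.csv new.csv > formatted.csv`. With the sample files from the question this yields the four flagged lines shown in the sample output. Unlike the one-liner in the question, which sets `FS=:` and so treats each whole line as `$1`, this sketch splits on commas and keys on `$2`, which is what lets it tell an updated record apart from a deleted or new one.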