Using AWK, how do I remove these kinds of duplicates?
I am new to AWK and have only basic ideas about it. I want to remove duplicates in a file, for example:

    0008.ASIA. NS AS2.DNS.ASIA.CN.
    0008.ASIA. NS AS2.DNS.ASIA.CN.
    ns1.0008.asia. NS AS2.DNS.ASIA.CN.
    www.0008.asia. NS AS2.DNS.ASIA.CN.
    anish.asia NS AS2.DNS.ASIA.CN.
    ns2.anish.asia NS AS2.DNS.ASIA.CN
    ANISH.asia. NS AS2.DNS.ASIA.CN.

This is a sample file. Running this command on it:

    awk 'BEGIN{IGNORECASE=1}/^[^ ]+asia/ { gsub(/\.$/,"",$1);split($1,a,".")} length(a)==2{b[$1]++;}END{for (x in b)print x}'

I got output like this:

    0008.ASIA.
    anish.asia.
    ANISH.asia

But I want output like this:

    0008.ASIA
    anish.asia

or

    0008.ASIA
    ANISH.asia

How do I remove these kinds of duplicates?

Thanks in advance,
Anish Kumar.V

Update: Thanks for your immediate response. Actually I wrote the complete script in bash and am now in the final stage. How would I invoke Python in that? :-(

    #!/bin/bash
    current_date=`date +%d-%m-%Y_%H.%M.%S`
    today=`date +%d%m%Y`
    yesterday=`date -d 'yesterday' '+%d%m%Y'`
    RootPath=/var/domaincount/asia/
    MainPath=$RootPath${today}asia
    LOG=/var/tmp/log/asia/asiacount$current_date.log

    mkdir -p $MainPath
    echo Intelliscan Process started for Asia TLD $current_date

    exec 6>&1 >> $LOG
    #################################################################################################
    ## Using wget, download the zone file; it will try only once
    if ! wget --tries=1 --ftp-user=USERNAME --ftp-password=PASSWORD ftp://ftp.anish.com:21/zonefile/anish.zone.gz
    then
        echo Download Not Success Domain count Failed With Error
        exit 1
    fi

    ### The downloaded file is gzipped; unzip it and start the domain count process ###
    gunzip asia.zone.gz > $MainPath/$today.asia

    ###### It will start the count #####
    awk '/^[^ ]+ASIA/ && !_[$1]++{print $1; tot++}END{print "Total",tot,"Domains"}' $MainPath/$today.asia > $RootPath/zonefile/$today.asia
    awk '/Total/ {print $2}' $RootPath/zonefile/$today.asia > $RootPath/$today.count

    a=$(< $RootPath/$today.count)
    b=$(< $RootPath/$yesterday.count)
    c=$(awk 'NR==FNR{a[$0];next} $0 in a{tot++}END{print tot}' $RootPath/zonefile/$today.asia $RootPath/zonefile/$yesterday.asia)

    echo "$current_date Count For Asia TlD $a"
    echo "$current_date Overall Count For Asia TlD $c"
    echo "$current_date New Registration Domain Counts $((c - a))"
    echo "$current_date Deleted Domain Counts $((c - b))"

    exec >&6 6>&-
    cat $LOG | mail -s "Asia Tld Count log" 07anis@gmail.com

In that script, this is the relevant part:

    awk '/^[^ ]+ASIA/ && !_[$1]++{print $1; tot++}END{print "Total",tot,"Domains"}' $MainPath/$today.asia > $RootPath/zonefile/$today.asia

Only here am I still searching for how to get the distinct values, so any suggestion using AWK is better for me. Thanks again for your immediate response.
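One way to get the case-insensitive de-duplication the question asks for is to lowercase each name into a comparison key, use that key as the array index, and print the original spelling the first time each key is seen. The sketch below assumes this approach; `zone.sample` is a hypothetical stand-in for the real zone file:

```shell
# Sketch: case-insensitive de-duplication of two-label .asia names with awk.
# "zone.sample" stands in for the real zone-file path.
cat > zone.sample <<'EOF'
0008.ASIA. NS AS2.DNS.ASIA.CN.
0008.ASIA. NS AS2.DNS.ASIA.CN.
ns1.0008.asia. NS AS2.DNS.ASIA.CN.
www.0008.asia. NS AS2.DNS.ASIA.CN.
anish.asia NS AS2.DNS.ASIA.CN.
ns2.anish.asia NS AS2.DNS.ASIA.CN
ANISH.asia. NS AS2.DNS.ASIA.CN.
EOF

awk '{
    key = tolower($1)                # case-insensitive comparison key
    sub(/\.$/, "", key)              # strip the trailing dot from the key
    if (split(key, p, ".") == 2 && !seen[key]++) {
        name = $1
        sub(/\.$/, "", name)         # print the original spelling, dot-free
        print name
    }
}' zone.sample
# prints:
# 0008.ASIA
# anish.asia
```

Unlike `IGNORECASE=1`, which only has an effect in GNU awk, `tolower()` is POSIX, so this also works under mawk or BSD awk.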