Appending Files using AWK or shell!

Status
Not open for further replies.

james777

Programmer
Jul 9, 2000
41
US
Hi,
Can any of you tell me how to write a Unix shell script, using AWK or general scripting, to append 3 files horizontally into a single file? Two fields are common to each file, and the common fields shouldn't repeat in the final file.

Example data:

file1

1234,abc,a12,zx,10
1235,acd,b35,dc,10
1456,cds,c25,fr,10
1290,cds,r45,fx,50

file2

1456,678,rrr,4679,10
1234,234,qqq,3456,10
1235,478,ddd,7890,10
1290,456,ttt,5867,50

file3

1290,ddd,345,50
1235,aaa,110,10
1234,ttt,100,10
1456,fff,234,10

The first and last fields are the same in all three files, but the order in which the lines appear in each file might be different. The script should match lines on those same first and last fields and write them to the final file like below.

The final file should be like this:

1234,abc,a12,zx,234,qqq,3456,ttt,100,10
1235,acd,b35,dc,478,ddd,7890,aaa,110,10
1456,cds,c25,fr,678,rrr,4679,fff,234,10
1290,cds,r45,fx,456,ttt,5867,ddd,345,50

Any suggestions? Thanks,
james.
 
#! /bin/sh
#
#     File:  appnd
#
#     Purpose:  Concatenates selected fields of three files
#               by deleting the last field from first file,
#               deleting first and last field from second
#               file and removing the first comma that is
#               left and concatenating the second line to the
#               first line, then, deleting the first field
#               and comma from the third line and concatenating
#               it to the previously concatenated line.
#
#     Usage:  appnd file1 file2 file3 outfile
#
#     Notes:  This is a nawk program in a Bourne shell wrapper
#             in order to pass in file names.
#
#             The files are sorted prior to processing to make
#             it easier to key off the first field.
#
#             If original line order is required, another
#             program that indexes off the first field of
#             file1 and the first field of outfile is left
#             as an exercise.

sort "$1" > sfile1
sort "$2" > sfile2
sort "$3" > sfile3

fileone="sfile1"
filetwo="sfile2"
filethree="sfile3"

nawk 'BEGIN {

    FS = OFS = ","

    # File one: remember each key and keep the line minus its last field.
    while ( (getline < "'$fileone'") > 0 )
        if($0 !~ /^$/) {
            key1[++i] = $1
            $NF=""
            one[i] = $0
        }

    # File two: drop the first and last fields, then append to the same row.
    # Blank lines are skipped so the row counter stays in step with file one.
    while ( (getline < "'$filetwo'") > 0 )
        if($0 !~ /^$/ && $1 == key1[++j]) {
            $1=""
            $NF=""
            sub(",","")
            one[j] = one[j]$0
        }

    # File three: drop only the first field, append, and print the result.
    while ( (getline < "'$filethree'") > 0 )
        if($0 !~ /^$/ && $1 == key1[++k]) {
            $1=""
            sub(",","")
            one[k] = one[k]$0
            print one[k]
        }

}' > "$4"

# This gets you 90% of the way there.  Hope this helps some!

flogrr
flogr@yahoo.com
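The header comment leaves restoring file1's line order as an exercise. Here is one possible sketch of it, not a tested part of the script above. The function name reorder_like is made up for illustration, and it assumes every key is unique and that the merged file is not empty.

```shell
#!/bin/sh
# reorder_like is a hypothetical helper name.
# Usage: reorder_like file1 outfile
# Prints the lines of outfile in the order their first field appears in file1.
reorder_like() {
    awk -F',' '
        # First file on the command line (the merged output): index rows by key.
        NR == FNR { row[$1] = $0; next }
        # Second file (file1, original order): emit the stored row for each key.
        ($1 in row) { print row[$1] }
    ' "$2" "$1"
}
```

Something like `reorder_like file1 outfile > outfile.ordered` would then give the merged lines in file1's order. Keys that appear in file1 but not in outfile are simply skipped.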
 
# I assume that file 3 always has
# only 4 fields.


awk -F"," '{
    if( length( arr[ $1, 1]) == 0 ) {
        # First time this key is seen: remember its order and start the row.
        ii++
        arar[ ii] = $1
        arr[ $1, 1] = $1
        arr[ $1, 2] = $5
        arr[ $1, 3] = $2 "," $3 "," $4
    }
    else arr[ $1, 3] = arr[ $1, 3] "," $2 "," $3 "," $4
}
END{
    # Print the rows in the order the keys first appeared.
    for( x = 1; x <= ii; x++)
        print arr[ arar[ x], 1] "," arr[ arar[ x], 3]
} ' f1.txt f2.txt f3.txt
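The script above depends on file 3 having exactly four fields, so that its last field lands at the end of the output line. If that can't be guaranteed, a variant along these lines might work: it strips the first and last field from every line, whatever the field count, and adds the last field back once at the end. This is only a sketch. The name merge_by_key is made up, and it assumes the last field really is identical across files for a given key.

```shell
#!/bin/sh
# merge_by_key is a hypothetical helper name.
# Usage: merge_by_key file1 file2 file3 ...
merge_by_key() {
    awk -F',' '
        NF < 2 { next }                  # skip blank or key-only lines
        {
            mid = ""                     # fields between the key and the last field
            for (i = 2; i < NF; i++)
                mid = mid "," $i
            if (!($1 in body)) {         # first time this key is seen
                order[++n] = $1          # remember first-seen order
                last[$1] = $NF           # keep the common last field once
                body[$1] = ""
            }
            body[$1] = body[$1] mid
        }
        END {
            for (x = 1; x <= n; x++)
                print order[x] body[order[x]] "," last[order[x]]
        }
    ' "$@"
}
```

Called as `merge_by_key f1.txt f2.txt f3.txt`, it prints the keys in the order they first appear in f1.txt, which is the order shown in the question.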
 
If the key field is the first then I can do this:
[tt]
sort -b -t"," -k1,1 file1 > file1.s
sort -b -t"," -k1,1 file2 > file2.s
sort -b -t"," -k1,1 file3 > file3.s
join -o 0,1.2,1.3,1.4,2.2,2.3,2.4,1.5 -t"," -1 1 -2 1 file1.s file2.s > joined1
join -o 0,1.2,1.3,1.4,1.5,1.6,1.7,2.2,2.3,2.4 -t"," -1 1 -2 1 joined1 file3.s > joined
[/tt]

First, join needs its input files sorted on the join field, so sort each one on the first field (-k1,1) with the -b option, and tell sort the field separator is a comma.
Next, join the first two sorted files using the first field in the first file (-1 1) and the first field in the second file (-2 1). The separator is a comma here too, and the generated file has the key column; the second, third and fourth columns of the first file; the second, third and fourth columns of the second file; and the fifth column of the first file. Put all that in the "joined1" file.
Then join the result file with the third file, again with a comma as the separator, and put that in "joined". The last column now comes from the third file (2.4), so the common last field appears only once.

I guess it is easier to type and modify what I wrote than some awkward AWK routine... NO FLAMES please.

I hope it works...
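One thing to watch with this approach: by default join only prints lines whose key is found in both files, so a key missing from one file just disappears from the result. A quick check before joining could look like the sketch below. The name unpaired_keys is made up, mktemp is assumed to be available, and LC_ALL=C is set so that sort and join agree on ordering.

```shell
#!/bin/sh
# unpaired_keys is a hypothetical helper name.
# Usage: unpaired_keys fileA fileB
# Prints the first-field keys that occur in only one of the two files.
unpaired_keys() {
    s1=$(mktemp) s2=$(mktemp)
    LC_ALL=C sort -t',' -k1,1 "$1" > "$s1"
    LC_ALL=C sort -t',' -k1,1 "$2" > "$s2"
    # -v N prints only the lines of file N that have no partner in the other file.
    LC_ALL=C join -t',' -v 1 "$s1" "$s2" | cut -d',' -f1
    LC_ALL=C join -t',' -v 2 "$s1" "$s2" | cut -d',' -f1
    rm -f "$s1" "$s2"
}
```

If `unpaired_keys file1 file2` and `unpaired_keys file1 file3` both print nothing, every key has a partner and the two join commands should not lose any lines.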
 
 