
building from multiple files

Status
Not open for further replies.

sbgirl54

Programmer
Aug 16, 2012
FR
Hi! Basically, I have 2 text files and need to combine them into one while making a few changes. The commands I came up with were too long/slow, so I figured someone more familiar with awk could help me.

I have these 2 files, where N~50000:

file1.res

1 a1 a2 ... a13
2 b1 b2 ... b13
.
.
.
N x1 x2 ... x13

and file2.res

1 aa1 aa2 ... aa13
2 bb1 bb2 ... bb13
.
.
.
N xx1 xx2 ... xx13

I need an output file of the form:

x1 a1 aa1 b1 bb1 ... x1 xx1
x2 a2 aa2 b2 bb2 ... x2 xx2
.
.
x13 a13 aa13 b13 bb13 ... x13 xx13


I managed to do it using intermediate files, but for really big N that's most inconvenient, so can anyone see a one-liner for this one?
 
I forgot to say x1 to x13 in the last table are headers, they're not part of any previous file.
 
Something like this?

Code:
awk '
        NR==FNR {
                if (NF>maxnf) maxnf=NF
                for (i=1;i<=NF;i++) f1[FNR,i]=$i
                file1nr=FNR
                next
        }
        {
                if (NF>maxnf) maxnf=NF
                for (i=1;i<=NF;i++) f2[FNR,i]=$i
        }
        END {
                for (i=2;i<=maxnf;i++) {
                        printf "%s ",f1[file1nr,i]
                        for (j=1; j<=file1nr; j++)
                                printf "%s %s ",f1[j,i],f2[j,i]
                        printf "\n"
                }
        }
' file1.res file2.res

Not sure it caters for your second comment about headers correctly... but should point you in the right direction.
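
One possible way to handle the headers (not part of the original reply, just a sketch): since the labels are not in either file, they could be passed in with -v and split into an array. The variable name hdr, the labels 'x1 x2' and the tiny 3-row sample files are made up for illustration. This version also builds each line in a string, so there is no trailing space.

```shell
# Sample data standing in for file1.res and file2.res (3 rows, 2 value columns).
tmp=$(mktemp -d)
printf '1 a1 a2\n2 b1 b2\n3 c1 c2\n' > "$tmp/file1.res"
printf '1 aa1 aa2\n2 bb1 bb2\n3 cc1 cc2\n' > "$tmp/file2.res"

# hdr is a hypothetical variable: one label per value column, space-separated.
result=$(awk -v hdr='x1 x2' '
    BEGIN { split(hdr, h, " ") }
    NR == FNR {                       # first file: store every field
        if (NF > maxnf) maxnf = NF
        for (i = 1; i <= NF; i++) f1[FNR, i] = $i
        file1nr = FNR
        next
    }
    {                                 # second file: same layout
        if (NF > maxnf) maxnf = NF
        for (i = 1; i <= NF; i++) f2[FNR, i] = $i
    }
    END {
        for (i = 2; i <= maxnf; i++) {    # column 1 is the row number, skip it
            line = h[i - 1]               # label for this output row
            for (j = 1; j <= file1nr; j++)
                line = line " " f1[j, i] " " f2[j, i]
            print line
        }
    }
' "$tmp/file1.res" "$tmp/file2.res")

printf '%s\n' "$result"
rm -rf "$tmp"
```

With the sample data this prints one row per value column, each starting with its label and followed by the interleaved values from both files.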

Annihilannic
tgmlify - code syntax highlighting for your tek-tips posts
 
Yo, thank you :) That's very helpful. It works, and I also sorted out the headers. I really ought to learn this awk thing better, it's quite impressive what it can achieve.
 
Thanks again! Could you just explain a detail to me please?

My file ended up with too many lines, so I added this command:

awk -v n=13 'NR>n{print a[NR%n]}{a[NR%n]=$0}'

to remove the last 13 lines. It works, but I would rather use

awk -v n=NR/2 'NR>n{print a[NR%n]}{a[NR%n]=$0}'

to remove half of the lines (might not always be 13 in the future). However an error message says it's trying to divide by 0... What's going on?
 
"-v n=NR/2" is processed only once, before the main script is run, and it isn't interpreted as awk code, so you just end up with the text "NR/2" in the variable. Used as a number, that text evaluates to 0, which is why NR%n fails with a division by zero.

Also NR is a dynamic value; it increases as awk processes each line of input, so NR/2 would be a moving target.
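
To see this concretely, here is a small sketch. The first command just prints what n holds. The second shows one possible workaround, assuming the file can be read twice: count the lines in the shell first and pass a real number to awk. The names literal, half, kept and out.txt are made up for the example, and the file is assumed to have at least 2 lines (otherwise n would be 0 again).

```shell
# -v stores the literal text "NR/2"; awk never evaluates it as an expression.
literal=$(printf 'a\nb\nc\nd\n' | awk -v n=NR/2 'NR == 1 { print n; print n + 0 }')
printf '%s\n' "$literal"

# Workaround sketch: compute the number of lines to drop before awk starts.
tmp=$(mktemp -d)
printf '1\n2\n3\n4\n5\n6\n' > "$tmp/out.txt"      # hypothetical 6-line file
half=$(( $(wc -l < "$tmp/out.txt") / 2 ))          # integer division in the shell
kept=$(awk -v n="$half" 'NR > n { print a[NR % n] } { a[NR % n] = $0 }' "$tmp/out.txt")
printf '%s\n' "$kept"
rm -rf "$tmp"
```

The first command prints the text NR/2 followed by 0. The second keeps only the first half of the sample file.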

If you are processing the file in 2 phases (like my example), then you'll need to calculate your equivalent to NR/2 in the END { ... } clause of the script, something like this:

Code:
                        # ...
                        for (j=1; j<=file1nr/2; j++)
                                printf "%s %s ",f1[j,i],f2[j,i]
                        # ...
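
The same idea (do the halving in END, where the final count is known) could also be applied to the standalone "drop the last half of the lines" filter. This is only a sketch, not the code from the thread: it buffers the whole input in memory, and the names line, half and kept_end are made up for the example.

```shell
kept_end=$(printf '1\n2\n3\n4\n5\n6\n' | awk '
    { line[NR] = $0 }                  # buffer every input line
    END {
        half = int(NR / 2)             # in END, NR is the total line count
        for (k = 1; k <= half; k++)    # print only the first half
            print line[k]
    }')
printf '%s\n' "$kept_end"
```

For very large inputs the buffering may be a concern, in which case counting the lines beforehand and streaming is probably the better trade-off.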

Annihilannic
 