Combine two simple steps into one. Please help. Thanks

dynax · Nov 11, 2006

Guys and Gals

I need help big time with some simple Perl scripts. Let me explain.

I have two txt files with this type of format:

Text file 1
aaa 234 bbb ccc ddd eee fff
bbb 456 asd fsd fsdf fds fdsf
cdc 999 dfs dfl asl asd asd
aaa 999 bbb ccc ddd eee fff
bbb 578 asd fsd fsdf fds fdsf
cdc 346 dfsd dflkj aslk asdlkj asdi

Text file 2
aaa 234 bbb ccc ddd eee fff
bbb 456 asd fsd fsdf fds fdsf
cdc 888 dfs dfl asl asd asd
aaa 888 bbb ccc ddd eee fff
bbb 578 asd fsd fsdf fds fdsf
cdc 346 dfsd dflkj aslk asdlkj asdi

I want the output to be:

cdc 999 dfs dfl asl asd asd cdc 888 dfs dfl asl asd asd
aaa 999 bbb ccc ddd eee fff aaa 888 bbb ccc ddd eee fff

Basically, I want to extract some lines from text file 1 and some other lines from text file 2, then combine them into a third text file.

I've been doing this using two separate scripts:

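use strict;
use warnings;

my $number1 = 999;    # or 888

# Read the files named on the command line, or STDIN if none are given.
while (my $line = <>) {
    chomp $line;
    $line =~ s/^\s+//;    # strip leading whitespace
    my ($col1, $col2, $col3, @rest) = split ' ', $line;
    next unless defined $col2;    # skip blank or one-column lines

    # Column 2 must be all digits and equal the wanted value exactly.
    if ($col2 =~ /^\d+$/ && $col2 == $number1) {
        print "$line\n";    # keep the entire line
    }
}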

to extract the 999 and 888 lines from the two text files and redirect them into out1.txt and out2.txt.

Then I use

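use strict;
use warnings;

my $tab = "\t";

open(my $file1, '<', $ARGV[0]) || die "Cannot open $ARGV[0]: $!\n";
open(my $file2, '<', $ARGV[1]) || die "Cannot open $ARGV[1]: $!\n";

# Join line N of the first file with line N of the second, tab-separated.
while (my $line = <$file1>) {
    chomp $line;
    my $other = <$file2>;
    $other = "\n" unless defined $other;    # second file is shorter
    print STDOUT $line . $tab . $other;
}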

and redirect that to the final text file.

Please show me a better way to do this in one step, similar to

perl script input_file1 input_file2 > output.txt

I hope I have explained myself clearly. Thank you.

audiopro · Nov 12, 2006

What are the criteria for choosing which lines to extract and combine?
From your example, it appears to be the corresponding lines that do not match.

Keith

http://www.studiosoft.co.uk

PaulTEG · Nov 12, 2006

Code:

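$dir = ".";   # directory holding the input files; adjust as needed
$sep = " ";   # separator placed between the two joined lines
open FH1, "<$dir/$ARGV[0]" or die $!;
open FH2, "<$dir/$ARGV[1]" or die $!;
while (<FH1>) {
  chomp;
  @cols=split ' ', $_;
  if ($cols[1] eq "999") {
    # key on the line without column 2, so 999 and 888 lines can pair up
    $hash{$cols[0].$cols[2].$cols[3].$cols[4]}{0}=$_;
  }
}
while (<FH2>) {
  chomp;
  @cols=split ' ', $_;
  if ($cols[1] eq "888") {
    $hash{$cols[0].$cols[2].$cols[3].$cols[4]}{1}=$_;
  }
}
open OF, ">$ARGV[2]" or die $!;
foreach $key (sort keys %hash) {
  # skip keys that were seen in only one of the two files
  next unless defined $hash{$key}{0} and defined $hash{$key}{1};
  print OF $hash{$key}{0}.$sep.$hash{$key}{1}."\n";
}
close FH1;
close FH2;
close OF;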

Not tested btw, and this will de-dupe on matching the rest of the line apart from the 888 or 999. This is probably wrong, but should give you enough to play with
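
If the wanted lines sit in the same relative order in both files, as they do in the sample data, the hash may not be needed at all. Here is a minimal sketch under that assumption: it pairs the Nth 999 line of the first file with the Nth 888 line of the second. The sub names select_lines and pair_lines are made up for illustration, and the values 999 and 888 and the single-space separator are taken from the example in the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the lines whose second whitespace-separated column equals $want.
sub select_lines {
    my ($want, @lines) = @_;
    my @hits;
    for my $line (@lines) {
        chomp(my $copy = $line);
        my @cols = split ' ', $copy;
        push @hits, $copy if @cols > 1 && $cols[1] eq $want;
    }
    return @hits;
}

# Pair the Nth element of one list with the Nth element of the other.
sub pair_lines {
    my ($left, $right, $sep) = @_;
    my $n = @$left < @$right ? @$left : @$right;    # stop at the shorter list
    return map { $left->[$_] . $sep . $right->[$_] } 0 .. $n - 1;
}

# Run only when called as: perl script input_file1 input_file2 > output.txt
if (@ARGV == 2) {
    open my $fh1, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!\n";
    open my $fh2, '<', $ARGV[1] or die "Cannot open $ARGV[1]: $!\n";
    my @from1 = select_lines('999', <$fh1>);
    my @from2 = select_lines('888', <$fh2>);
    print "$_\n" for pair_lines(\@from1, \@from2, ' ');
}
```

Unlike the hash version, this keeps the lines in file order and does not care whether the rest of the line matches, so it only fits when position alone decides the pairing.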

Paul
------------------------------------
Spend an hour a week on CPAN, helps cure all known programming ailments ;-)

dynax · Nov 12, 2006

Thank you Paul and audiopro.

Paul, I'll try the script you created to see what's going on.

audiopro, I just want to extract the entire line based on the col2 value.
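
Selecting on the col2 value is safest with an exact comparison, since a pattern match such as $col2 =~ /999/ would also accept 1999 or 9990. A small sketch of that check follows; col2_is is a made-up name, and passing the wanted value as the first command-line argument is an assumption, not something from the scripts above.

```perl
use strict;
use warnings;

# True when the second whitespace-separated column of $line is exactly $want.
sub col2_is {
    my ($line, $want) = @_;
    my @cols = split ' ', $line;
    return @cols > 1 && $cols[1] eq $want ? 1 : 0;
}

# Print whole matching lines from the files named after the value, e.g.
#   perl script 999 input_file1 > out1.txt
if (@ARGV) {
    my $want = shift @ARGV;
    while (my $line = <>) {
        print $line if col2_is($line, $want);
    }
}
```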

Thank you.
NK
