Hi all,
I'm new to Perl. I've got a comma-delimited CSV file: each line has 10 fields separated by commas, and every field is a string. I need to:
1) remove the first line of the input file, and drop the 2nd and 5th fields from every record
2) for the 3rd and 8th fields, take a substring of the field. The substring starts right after the text TAX: and ends before the first occurrence of an asterisk, or at the end of the field (the next comma) if there is no asterisk
I searched this forum and have started the code, but couldn't get very far with it. I think I'll run into a problem later on: once I've split the line and gotten rid of the comma delimiters, I'll have to concatenate the fields again with commas for the output. Any ideas? I hope you can help me. Thanks in advance!
Example:
input file
-----------
Heading 1,2,3,4,5,6,7,8,9,10
aaa,bbbb,ccTAX:code*cccc,dddd,eee,fff,ggggg,hhhhTAX:code2,iii,jjj
Aaaa,Bbb,CcccTAX:code3,Dd,Eeee,Fffff,Gggg,hTAX:code4*hh,Iiii,Jjjjjjj
Output file
-------------
aaa,code,dddd,fff,ggggg,code2,iii,jjj
Aaaa,code3,Dd,Fffff,Gggg,code4,Iiii,Jjjjjjj
Code:
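#!/usr/local/bin/perl
use strict;
use warnings;

open(my $intax,  '<', 'taxcoding.csv') or die "Cannot open taxcoding.csv: $!";
open(my $outtax, '>', 'taxout.csv')    or die "Cannot open taxout.csv: $!";

my $header = <$intax>;    # read and discard the heading line

while (my $input = <$intax>) {
    chomp $input;
    my @fields = parse_csv($input);

    # 3rd and 8th fields (indexes 2 and 7): keep only the text after
    # "TAX:" up to the first asterisk or the end of the field
    for my $i (2, 7) {
        $fields[$i] = $1 if defined $fields[$i] && $fields[$i] =~ /TAX:([^*]*)/;
    }

    # drop the 5th field first, then the 2nd, so the indexes don't shift
    splice(@fields, 4, 1);
    splice(@fields, 1, 1);

    # put the commas back when writing the record out
    print {$outtax} join(',', map { defined $_ ? $_ : '' } @fields), "\n";
}

close($intax);
close($outtax) or die "Cannot close taxout.csv: $!";

# Split one CSV line into fields; also handles double-quoted fields
sub parse_csv {
    my $text = shift;
    my @new  = ();
    push(@new, $+) while $text =~ m{
        "([^\"\\]*(?:\\.[^\"\\]*)*)",?
      | ([^,]+),?
      | ,
    }gx;
    push(@new, undef) if substr($text, -1, 1) eq ',';
    return @new;
}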
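On the worry about losing the comma delimiters: join can put them back on output. Below is a minimal sketch of the two rules using plain split and join. The names extract_tax_code and convert_line are hypothetical helpers made up for this illustration. It assumes every line has exactly 10 fields with no quoted fields or embedded commas, and that a field without TAX: should be left unchanged (the post does not say what should happen in that case).

```perl
use strict;
use warnings;

# Hypothetical helper: return the text after "TAX:" up to the first
# asterisk or the end of the field. If "TAX:" is absent, return the
# field unchanged (an assumption; the post does not cover that case).
sub extract_tax_code {
    my ($field) = @_;
    return $field =~ /TAX:([^*]*)/ ? $1 : $field;
}

# Hypothetical helper: convert one data line as described in the post.
# Assumes 10 fields and no quoted fields, so split on "," is enough.
sub convert_line {
    my ($line) = @_;
    chomp $line;
    my @fields = split /,/, $line, -1;    # -1 keeps trailing empty fields

    # 3rd and 8th fields are indexes 2 and 7; the loop variable aliases
    # the array elements, so assigning to $_ updates @fields in place
    $_ = extract_tax_code($_) for @fields[2, 7];

    # keep every field except the 2nd and 5th (indexes 1 and 4),
    # and rejoin with commas
    return join ',', @fields[0, 2, 3, 5 .. 9];
}

# First data line from the example input
print convert_line("aaa,bbbb,ccTAX:code*cccc,dddd,eee,fff,ggggg,hhhhTAX:code2,iii,jjj"), "\n";
# prints: aaa,code,dddd,fff,ggggg,code2,iii,jjj
```

To process the whole file, skip the first line with one extra read of the input handle and then print convert_line($_) for each remaining line. If the real data can contain quoted fields, a proper CSV parser would be a safer choice than split.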