Extracting numbers from a file

Status
Not open for further replies.

Captainrave

Technical User
Nov 16, 2007
97
GB
So this is kind of related to my previous thread, where I needed to input two files with lists of numbers. However, one of the files is presented like this:

FT repeat_unit 48..57
FT /label= 1
FT repeat_unit 188..194
FT /label= 1
FT repeat_unit 353..357
FT /label= 1
FT repeat_unit 511..516
FT /label= 1
FT repeat_unit 526..530
FT /label= 1
FT repeat_unit 535..539
FT /label= 1
FT repeat_unit 579..584
FT /label= 1
FT repeat_unit 604..608
FT /label= 1

All of this information ends up in one cell in Excel. However, I just want the numbers, like this (and in two separate cells):

48 57
188 194
353 357
511 516

I have tried playing with the split function, but can't seem to get it to output the information in the required format. I realise the code is probably pretty simple, but I am obviously missing something.

As ever your help is very much appreciated!
 
Code:
for my $line (@file) {
  my @data1 = split /\s+/, $line;
  #If we have .. in $data[2] which the label lines won't have
  if ($data[2] =~ /\.\./) {
   my @data2 = split /\.\./, $data[2];
   print "$data2[0] $data2[1]\n";
 }
}

For troubleshooting, you might want to print $data[2] to make sure I picked the right field to split on initially. Sometimes there is leading whitespace that throws the field number off.
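To illustrate the leading-whitespace point, here is a minimal sketch. The sample line is hypothetical (indented on purpose, not taken from the poster's file), and the variable names are for illustration only. It compares split /\s+/ with the special-case split ' ', which skips leading whitespace.

```perl
use strict;
use warnings;

# Hypothetical sample line with two leading spaces (an assumption for illustration).
my $line = "  FT repeat_unit 48..57";

# split /\s+/ keeps an empty first field when the line starts with whitespace,
# so the range ends up at index 3 instead of index 2.
my @by_regex = split /\s+/, $line;

# split ' ' is Perl's special case: leading whitespace is skipped,
# so the range stays at index 2.
my @by_space = split ' ', $line;

print "split /\\s+/ : ", join('|', @by_regex), "\n";
print "split ' '   : ", join('|', @by_space), "\n";
```

If the real file has no leading whitespace, both forms put the range at index 2, so this only matters when the field number looks off while printing.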

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[noevil]
Travis - Those who say it cannot be done are usually interrupted by someone else doing it; Give the wrong symptoms, get the wrong solutions;
 
Keep all your info related to the same question in one thread. It will make helping you an easier process.

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
Ok will do. Wasn't sure with this one though because I'll probably run it as a separate script...at least until I get more confident with Perl.
 
So I read in the file in the format I showed you, but my output file is blank. Why is this? Perhaps I am reading the file in wrong?

My script:

#!/usr/bin/perl

print "please type the filename of the repeatFT file:";
$repeat_filename = <STDIN>;
chomp $repeat_filename;

print "please type the filename to save the results to (.xls format):";
$outfile = <STDIN>;
chomp $outfile;

open(REPEATFILE, $repeat_filename);

open(OUTFILE, ">$outfile");

#read the repeats from file and store them
@repeat = <REPEATFILE>;

#close repeat file
close REPEATFILE;

for my $line (@repeat) {
my @data1 = split /\s+/, $line;
#If we have .. in $data[2] which the label lines won't have
if ($data[2] =~ /\.\./) {
my @data2 = split /\.\./, $data[2];
print "$data2[0] $data2[1]\n";
}
}

exit;
 
This is the error I get:

Use of uninitialized value in pattern match (m//) at repeatftmodify2.pl line 43
 
Code:
#!/usr/bin/perl

print "please type the filename of the repeatFT file:";
$repeat_filename = <STDIN>;
chomp $repeat_filename;

print "please type the filename to save the results to (.xls format):";
$outfile = <STDIN>;
chomp $outfile;

open(REPEATFILE, $repeat_filename) or die "Cannot open $repeat_filename: $!";

open(OUTFILE, ">$outfile") or die "Cannot open $outfile: $!";

#read the repeats from file and store them
@repeat = <REPEATFILE>;
chomp @repeat;
close REPEATFILE;

for my $line (@repeat) {
    my @data1 = split /\s+/, $line;
    # If we have .. in $data1[2], which the label lines won't have
    if ($data1[2] =~ /\.\./) {
        my @data2 = split /\.\./, $data1[2];
        print OUTFILE "$data2[0] $data2[1]\n";
    }
}

exit;
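Eh.. I had some mistakes :) I used $data[2] where it should have been $data1[2], and I printed to the screen instead of to OUTFILE. Not sure what you expect the .xls to look like, but it gets there.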

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[noevil]
Travis - Those who say it cannot be done are usually interrupted by someone else doing it; Give the wrong symptoms, get the wrong solutions;
 
WOW :). That code is impressive. Any ideas how to force each number into a separate Excel cell? It won't make a difference for the other script I am working on, but I may need them in separate cells later on.

I would have thought I would need some kind of symbol between $data2[0] and $data2[1] to split them...

print OUTFILE "$data2[0] $data2[1]\n";

I have tried "," which I thought would split them, but it seems to have the opposite effect. Any ideas?

Thanks.
 
Also, could you explain how these work:

/\s+/

/\.\./

Or maybe there is a list of these somewhere where I can look them up. I have a number of books, but no definitive list and it makes forming regular expressions difficult!
 
If you comma-separate the data, you can name the file .csv and still open it with Excel, and it will put each value in its own cell. You can also use the Excel module.
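A minimal sketch of the comma-separated approach, assuming input lines in the format shown at the top of the thread. The hard-coded sample lines and the @rows array are for illustration only; in the real script the rows would be printed to the output filehandle, with the output file given a .csv name.

```perl
use strict;
use warnings;

# Sample lines in the format from the first post (hard-coded for illustration).
my @repeat = (
    "FT repeat_unit 48..57",
    "FT /label= 1",
    "FT repeat_unit 188..194",
    "FT /label= 1",
);

my @rows;
for my $line (@repeat) {
    my @fields = split ' ', $line;

    # Skip lines whose third field is missing or has no ".." range.
    next unless defined $fields[2] && $fields[2] =~ /\.\./;

    my ($start, $end) = split /\.\./, $fields[2];

    # A comma between the two numbers puts them in separate cells
    # when the file is opened as .csv.
    push @rows, "$start,$end";
}

print "$_\n" for @rows;
```

The defined check is there so that blank or short lines do not trigger the "uninitialized value" warning seen earlier in the thread.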



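\s+ matches one or more whitespace characters (spaces, tabs and so on).
\.\. matches the two literal dots that are already in the file. The backslashes are needed because an unescaped . matches any character (except a newline).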
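As a rough illustration of the two patterns, the sketch below runs them on small strings. The sample strings and variable names are made up for the example. It also shows what happens if the dots are left unescaped.

```perl
use strict;
use warnings;

# \s+ treats any run of spaces or tabs as a single separator.
my @words = split /\s+/, "FT   repeat_unit\t48..57";

my $range = "48..57";

# Escaped dots: only the two literal dots act as the separator.
my @escaped = split /\.\./, $range;

# Unescaped dots: each . matches any character, so every pair of characters
# is treated as a separator and only empty fields remain, which split drops.
my @unescaped = split /../, $range;

print "words:     ", join('|', @words), "\n";
print "escaped:   ", join('|', @escaped), "\n";
print "unescaped: ", scalar(@unescaped), " fields\n";
```

The perlre and perlrequick pages that ship with Perl (perldoc perlre) list the available regular expression symbols.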



~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[noevil]
Travis - Those who say it cannot be done are usually interrupted by someone else doing it; Give the wrong symptoms, get the wrong solutions;
 