
Need help with Perl matching

Status
Not open for further replies.

JMeng (Programmer; joined Sep 25, 2007; messages: 2; location: US)
Hi, I have the following text, from which I would like to extract the numerical data:

Statistics for all 1D-facet surfaces:
Surface area (input units ^1) = 1.1111111111111111
Surface area (lambda^1) = 1.1111111111111111
Facets/lambda**1:
Min = 123.1111111111111
Max = 111.1111111111111
Ave = 111.1111111111111
Edges/lambda:
Min = 4.111111111111111
Max = 11.11111111111111
Ave = 1.111111111111111
Facet aspect ratios:
Min = 5.111111111111111
Max = 1.111111111111111
Ave = 1.111111111111111


And this is the part of my code that doesn't do what I intended, which is to extract the numbers from Min, Max, and Ave:

if (my ($FLMin) = /\s*Edges\/lambda:\s*Min =\n\s*(\d+[.]\d+)/s) {print "$FLMin\n";}

I believe it's just a simple syntax issue, but I couldn't figure out where it goes wrong. Is there anyone who can help?
 
First glance:

You need to make it non-greedy.

Your \s* matches are by default greedy, and will match the largest match they can.

So your original will match to 'Edges/lambda' as expected, but the 'Min =' part will match to the last 'Min = ' (the one in facet aspect ratios), and the number will match to the very last number (the ave in facet aspect ratios).

A '?' makes a match non-greedy.

Code:
 /\s*Edges\/lambda:\s*?Min =\n\s*?(\d+[.]\d+)/s


 
Oh, sorry, I put the \n in the wrong place.

So the correct code is:

/\s*Edges\/lambda:\n\s*?Min =\s*?(\d+[.]\d+)/s

But what about the first \s*? Why doesn't it need a "?" too?
 
Quote:
"Your \s* matches are by default greedy, and will match the largest match they can."

\s* is greedy, but that is not the problem here: it can only consume whitespace, so it cannot run past "Min" to a later line. ".*" is where greediness bites, because it matches as much as possible of anything.
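
The difference between \s* and .* can be checked with a short script. This is only a sketch: the sample string is a shortened stand-in for the statistics block in the question, and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Shortened sample with two "Min =" lines (illustrative data, not the real file).
my $text = "Edges/lambda:\nMin = 4.1\nMax = 11.1\nFacet aspect ratios:\nMin = 5.1\n";

# \s* can only consume whitespace, so it stops right before "Min".
my ($with_s) = $text =~ /Edges\/lambda:\s*Min =\s*(\d+\.\d+)/;

# .* under /s consumes everything, then backtracks to the LAST "Min =".
my ($with_dot) = $text =~ /Edges\/lambda:.*Min =\s*(\d+\.\d+)/s;

# .*? is the non-greedy form and stops at the FIRST "Min =".
my ($with_lazy) = $text =~ /Edges\/lambda:.*?Min =\s*(\d+\.\d+)/s;

print "$with_s $with_dot $with_lazy\n";
```

Only the .* version picks up the value from the "Facet aspect ratios" block; adding "?" matters for .* but makes no difference to where \s* can stop in this text.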

This line might not be correct:

Code:
if (my ($FLMin) = /\s*Edges\/lambda:\s*Min =\n\s*(\d+[.]\d+)/s) {print "$FLMin\n";}

You have used "=", the assignment operator, without the regexp binding operator "=~". That works only in one circumstance: when the text is in the default scalar $_, in which case the above line is the same as:

Code:
if (my ($FLMin) = $_ =~ /\s*Edges\/lambda:\s*Min =\n\s*(\d+[.]\d+)/s) {print "$FLMin\n";}
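
To see the two forms side by side, here is a minimal sketch. The variable names ($line, $explicit, $implicit) are invented for the example, and it assumes the text sits in a named variable rather than in $_.

```perl
use strict;
use warnings;

my $line = "Min = 4.111111111111111";

# Explicit binding: =~ matches the pattern against a named variable.
my ($explicit) = $line =~ /Min =\s*(\d+\.\d+)/;

# Implicit form: with no =~, the pattern is matched against $_.
my $implicit;
for ($line) {    # "for" aliases $_ to $line inside the block
    ($implicit) = /Min =\s*(\d+\.\d+)/;
}

print "$explicit\n$implicit\n";
```

Both assignments capture the same number. If the data is not in $_, the implicit form matches against whatever $_ happens to hold instead.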

Either way, the real problem is the \n after the "=" in the search pattern: "Min =" and its number are on the same line, so no newline can match there. It should be:

Code:
if (my ($FLMin) = /\s*Edges\/lambda:\s*Min =\s*(\d+\.\d+)/s) {print "$FLMin\n";}
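
Since the original goal was to pull out Min, Max, and Ave, the same idea can be extended to all three values at once. The following is a rough sketch, assuming the whole report is held in one string ($text here, a made-up name) and that the three lines always appear in this order.

```perl
use strict;
use warnings;

# Part of the sample text from the question, held in a single string.
my $text = <<'END';
Facets/lambda**1:
Min = 123.1111111111111
Max = 111.1111111111111
Ave = 111.1111111111111
Edges/lambda:
Min = 4.111111111111111
Max = 11.11111111111111
Ave = 1.111111111111111
Facet aspect ratios:
Min = 5.111111111111111
Max = 1.111111111111111
Ave = 1.111111111111111
END

# One capturing group for a decimal number, reused three times.
my $num = qr/(\d+\.\d+)/;

# \s* bridges the newlines between lines; it cannot skip over other text.
my ($min, $max, $ave) =
    $text =~ /Edges\/lambda:\s*Min =\s*$num\s*Max =\s*$num\s*Ave =\s*$num/;

print "Min=$min Max=$max Ave=$ave\n" if defined $min;
```

Anchoring on "Edges/lambda:" is what selects the right block; swapping in another heading should select that block's values instead, as long as the layout is the same.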



------------------------------------------
- Kevin, perl coder unexceptional!
 