Evaluation problem with regex loaded in array 2

Status
Not open for further replies.

Tve

Programmer
May 22, 2000
166
FR
Hi,

I'm processing a few hundred text files every night to do some cleanup. Basically, I loop over the files and process every line...easy. The thing is, I want to read the regular expressions from a configuration file. So I basically load the regexes into a @regex array and loop over this array. It all works fine, except when I want to use $1 in the replacement.

I've created a sample that produces the same as my script (see below). This is the input:

This is line 1
This is line 21

I am expecting this:

This is line one
This is dummy val: 21

but I get this (notice the $1):

This is line one
This is dummy val: $1


This is the script:
[tt]
my @values = ( 'This is line 1',
'This is line 21',
);

# Initialise regex array (normally read from file)
my @regex = ( [ qr/line 1/i , 'line one' ],
[ qr/line ([0-9]+)/i , 'dummy val: $1' ],
);

# Print out before running
print "Data before processing:\n\t", join("\n\t",@values),"\n";

# Loop on all lines...
foreach my $line (@values) {

# Looping on all regex
foreach (@regex) { $line =~ s/$$_[0]/$$_[1]/ }

}

# Print out after running
print "Data after processing:\n\t", join("\n\t",@values),"\n";
[/tt]

I must be missing a small thing here...Any suggestions?



AD AUGUSTA PER ANGUSTA

Thierry
 
Try changing just the line after "# Looping on all regex" to this:

Code:
foreach (@regex) { $line =~ s/$$_[0]/"\"$$_[1]\""/ee }

--jim
 
Jim,

Works great!

Would you mind just giving a small explanation?

Thanks,

Thierry


AD AUGUSTA PER ANGUSTA

Thierry
 
Jim,

I did some reading and I'm starting to understand the solution.

I'm not sure why you have to use /ee and not /e, but I will probably get it one day %-)

I wonder if this has a high impact on performance...

Anyway, thanks again.


AD AUGUSTA PER ANGUSTA

Thierry
 
Sure, no prob. The letters on the end of a regex are called regex modifiers. This is because they modify the way the regex behaves. You are already familiar with the 'i' modifier, which tells the regex to ignore case.

The 'e' modifier tells the regex to treat the replacement side of the substitution (what the match is to be replaced with) as perl code. That is, code to be evaluated, and its result (return value) to be used as the replacement.

An example might be in order to help explain:

Code:
sub get_name { return "Bob" };
my $sentence = "My name is NAME";
$sentence =~ s/NAME/get_name()/e;
In the code above, we have a normal subroutine that returns a string. If we didn't use the 'e' modifier, then the substitution would have treated the replacement side as just a string, and we would have had a sentence that looked like:
"My name is get_name()"
This is not what we wanted. We want perl to treat the replacement side as code, so we add the 'e' modifier, and as a result, it actually evaluates the replacement side and uses the return value instead.
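To see the difference side by side, here is a minimal sketch that runs the same substitution with and without 'e'. The variable names $plain and $evald are just for illustration.

```perl
use strict;
use warnings;

sub get_name { return "Bob" }

my $plain = "My name is NAME";
my $evald = "My name is NAME";

# Without /e the replacement is an ordinary string, so the text
# "get_name()" is inserted literally.
$plain =~ s/NAME/get_name()/;

# With /e the replacement is Perl code, and its return value is inserted.
$evald =~ s/NAME/get_name()/e;

print "$plain\n";   # My name is get_name()
print "$evald\n";   # My name is Bob
```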

Now for the fun: you can stack the 'e' modifiers. So by using two, you are simply telling it to evaluate twice:

1) eval the replacement string, and replace the replacement string with the result of the eval
2) eval that result as perl code again, and use the result of this second eval as the final replacement

So to recap, the first 'e' evals the original replacement string, the second 'e' evals what the FIRST eval returned, and the third one (if present) would eval what the SECOND one returned, and so on.
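Applied to the original problem, a small sketch like the following shows why one 'e' is not enough. The names $template, $none, $one and $two are made up for the demonstration; the pattern and replacement text come from the script in the question.

```perl
use strict;
use warnings;

# Single quotes: at this point '$1' is just two literal characters.
my $template = 'dummy val: $1';

my ($none, $one, $two) = ('line 21') x 3;

# No /e: $template is interpolated once and its contents are inserted
# as they are. The '$1' inside it is not interpolated again.
$none =~ s/line ([0-9]+)/$template/;

# One /e: the code `$template` is evaluated, which simply returns the
# same literal string.
$one =~ s/line ([0-9]+)/$template/e;

# Two /e: the first eval builds the text "dummy val: $1" including the
# double quotes; the second eval runs that text as Perl code, and only
# then is $1 interpolated.
$two =~ s/line ([0-9]+)/"\"$template\""/ee;

print "$none\n";   # dummy val: $1
print "$one\n";    # dummy val: $1
print "$two\n";    # dummy val: 21
```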

Hope that made sense.
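On the performance question: the second 'e' is a string eval, so perl has to compile the replacement text on every substitution, and whatever is in the configuration file gets run as code. If that is a concern, one possible alternative is to fill in the $1, $2, ... placeholders by hand. This is only a sketch; expand_template and apply_rule are hypothetical helper names, not part of the original script, and it only understands plain $N placeholders.

```perl
use strict;
use warnings;

# Hypothetical helper: replace $1, $2, ... in a template with the
# captured values, without evaluating the template as Perl code.
sub expand_template {
    my ($template, @captures) = @_;
    $template =~ s{\$(\d+)}{
        my $v = $captures[$1 - 1];
        ($1 >= 1 && defined $v) ? $v : '';
    }ge;
    return $template;
}

# Hypothetical helper: apply one [regex, template] rule to a line,
# replacing the first match only (like s/// without /g).
sub apply_rule {
    my ($line, $re, $template) = @_;
    return $line unless $line =~ $re;

    # @- and @+ hold the start and end offsets of the match and its groups.
    my ($start, $len) = ($-[0], $+[0] - $-[0]);
    my @caps = map {
        defined $-[$_] ? substr($line, $-[$_], $+[$_] - $-[$_]) : undef
    } 1 .. $#+;

    substr($line, $start, $len, expand_template($template, @caps));
    return $line;
}

my @regex = ( [ qr/line 1/i,          'line one'      ],
              [ qr/line ([0-9]+)/i,   'dummy val: $1' ],
            );

my @values = ( 'This is line 1', 'This is line 21' );

foreach my $line (@values) {
    foreach my $rule (@regex) {
        $line = apply_rule($line, @$rule);
    }
}

print join("\n", @values), "\n";
# This is line one
# This is dummy val: 21
```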

--jim
 