Perl replacement for sed with command file?

Status
Not open for further replies.

derekpc

Technical User
Jul 3, 2001
44
GB
I am trying to create a Perl script that reads its arguments from a pipe-delimited file, with each line of the file containing a search term and a replace term separated by a pipe. I want it to work somewhat as sed does with the -f switch and a command file.
What I have works well if I put only literal strings in my searchterms file, but if I try it with Perl regular expressions I am completely unsuccessful. I am not sure just how to process the terms so that Perl will handle them as regexes. Here is what I am trying to do:

Code:
open TERMS, "$searchterms" or die "Can't open the replacement terms file because: $!\n";
while (<TERMS>) {
    ($srch,$repl) = split(/\|/);
    chomp($srch);
    chomp($repl);
    $file =~ s!$srch!$repl!ig;
}
close TERMS;

Here is a sample searchterms file
Code:
http://www.thisdomain.com|http://www.that.domain.com
(<a href ? = ?"?http://www\.)thisdomain(\.[a-d]{3,})|$1thatdomain$2

When run with terms as in the first line of the searchterms file, I get the substitution I expect. But when run with the regex, the text that ends up in my output is literally
Code:
$1thatdomain$2

I have been hunting the newsgroups and a fairly extensive perl library but am obviously missing the cogent point. Can anyone point me in the right direction?

Thanks
Derek
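
The behaviour described above can be reproduced in a few lines. This is only a minimal sketch with made-up sample data (the text in $file and the simplified pattern are assumptions, not Derek's real input). It suggests the cause: the replacement side of s/// is interpolated once, so the contents of $repl are inserted as plain text and never re-scanned for $1 or $2.

```perl
use strict;
use warnings;

# Hypothetical sample data, standing in for one line of the searchterms file.
my $file = 'see http://www.thisdomain.com for details';
my ($srch, $repl) = ('(http://www\.)thisdomain(\.com)', '$1thatdomain$2');

# $srch is compiled as a regex, so the pattern side works as hoped.
# $repl is interpolated once, giving the text '$1thatdomain$2';
# Perl does not interpolate that text a second time.
(my $copy = $file) =~ s!$srch!$repl!ig;

print "$copy\n";    # prints: see $1thatdomain$2 for details
```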
 
Hi Derek,

Don't know if this will work, but it's something to try:

$file =~ s!"$srch"!"$repl"!ig;
Mike

"Experience is the comb that Nature gives us, after we are bald."

Is that a haiku?
I never could get the hang
of writing those things.
 
Mike,
I should have mentioned that I did try that, but no luck. Also tried incorporating qr{}, but no luck with that either. Obviously, I am learning as I go.

Thanks for the input.
Derek
 
I have found a method that seems to work. I can't test thoroughly until I get back to work, but I have tried this with a limited set of data and get the substitution I expect:
Code:
while (<TERMS>) {
    my $srch = $_;
    chomp($srch);
    eval ("\$file =~ $srch")
}
close TERMS;
and put the full substitution in my searchterms file as follows:
Code:
s!(<a href ?= ?"?)(http://www.)this(.domain/docs)!$1$2that$3!ig

As I say, it seems to do what I want, but I am still interested in knowing how to properly handle the code with the split as I attempted first, if anyone would like to comment.
Derek
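
For anyone trying the same approach, here is a hedged sketch of the eval loop with error checking added. The @commands list and the sample text are hypothetical stand-ins for the searchterms file and the real input. The only point is that checking the result of eval (and $@) stops a mistyped command from failing silently.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the searchterms file and the text being edited.
my @commands = (
    's!(http://www\.)this(domain\.com)!$1that$2!ig',
    's!(unclosed!x!',    # deliberately broken pattern
);
my $file = 'visit http://www.thisdomain.com today';

for my $cmd (@commands) {
    # String eval compiles and runs the whole s/// command.
    # Everything in the command file runs as Perl code, so it must be trusted.
    my $ok = eval "\$file =~ $cmd; 1";
    warn "Skipping bad command '$cmd': $@" unless $ok;
}

print "$file\n";    # prints: visit http://www.thatdomain.com today
```

The trailing "; 1" makes eval return a true value whenever the command compiled and ran, even if the substitution itself matched nothing.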
 
I came across this in the perldoc FAQ:

Passing Regexes

To pass regexes around, you'll need to be using a release of Perl sufficiently recent as to support the qr// construct, pass around strings and use an exception-trapping eval, or else be very, very clever.

Here's an example of how to pass in a string to be regex compared using qr//:

sub compare($$) {
    my ($val1, $regex) = @_;
    my $retval = $val1 =~ /$regex/;
    return $retval;
}
$match = compare("old McDonald", qr/d.*D/i);

Notice how qr// allows flags at the end. That pattern was compiled at compile time, although it was executed later. The nifty qr// notation wasn't introduced until the 5.005 release. Before that, you had to approach this problem much less intuitively. For example, here it is again if you don't have qr//:

sub compare($$) {
    my ($val1, $regex) = @_;
    my $retval = eval { $val1 =~ /$regex/ };
    die if $@;
    return $retval;
}

$match = compare("old McDonald", q/(?i)d.*D/);

Make sure you never say something like this:

return eval "\$val =~ /$regex/"; # WRONG

or someone can sneak shell escapes into the regex due to the double interpolation of the eval and the double-quoted string. For example:

$pattern_of_evil = 'danger ${ system("rm -rf * &") } danger';

eval "\$string =~ /$pattern_of_evil/";

Those preferring to be very, very clever might see the O'Reilly book, Mastering Regular Expressions, by Jeffrey Friedl. Page 273's Build_MatchMany_Function() is particularly interesting. A complete citation of this book is given in perlfaq2.
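
Applied to the original split-based script, the qr// advice might look something like the sketch below. It is not from the FAQ: apply_term and captures are hypothetical helper names, and expanding $1, $2, ... by hand is just one possible way to avoid running the replacement text as code. It assumes the search term itself contains no literal pipe.

```perl
use strict;
use warnings;

# Return the capture groups ($1, $2, ...) of the most recent match as a list.
sub captures {
    no strict 'refs';
    return map { ${$_} } 1 .. $#+;
}

# Hypothetical helper: apply one "search|replace" line to $text without string eval.
sub apply_term {
    my ($text, $line) = @_;
    chomp $line;
    my ($srch, $repl) = split /\|/, $line, 2;

    # Block eval traps a malformed pattern instead of killing the script.
    my $re = eval { qr/$srch/i };
    die "Bad pattern '$srch': $@" unless defined $re;

    # For each match, save the groups, then expand $N in a copy of $repl by hand.
    $text =~ s{$re}{
        my @cap = ('', captures());
        (my $out = $repl) =~ s/\$(\d+)/defined $cap[$1] ? $cap[$1] : ''/ge;
        $out;
    }ge;
    return $text;
}

my $line = '(<a href ?= ?"?http://www\.)thisdomain(\.[a-z]{3,})|$1thatdomain$2';
print apply_term('<a href="http://www.thisdomain.com">', $line), "\n";
# prints: <a href="http://www.thatdomain.com">
```

Because the replacement text is only ever treated as data, a line in the terms file cannot run arbitrary Perl this way. The cost is that only $N-style references are supported in the replace term.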
 
There would be more to it to make it 100% fool-proof, but try a combination of mikelacey's example and a doubled 'e' modifier:

$file =~ s!$srch!'"' . $repl . '"'!ieeg;

The first e causes the right side of the substitution to be evaluated as code, which here builds the text "$1thatdomain$2", quotes included. The second e evaluates that text as code in turn, returning the string between the quotes, with all the wonderful interpolation you could ever dream of. With a single e you only get the contents of $repl back unchanged.

If including $variables in the $srch string, you may need to do a string eval on that as well (backslashes in $srch would then need doubling):

$srch = eval qq{"$srch"};
$file =~ s!$srch!'"' . $repl . '"'!ieeg;

Make sense?

--jim
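
A small test of the e modifier on the replacement side, as a sketch with made-up sample data. It compares one e against two. The quoting used in the second substitution is one way to arrange the double evaluation, not necessarily the only one.

```perl
use strict;
use warnings;

# Hypothetical sample data.
my $file = 'see http://www.thisdomain.com for details';
my ($srch, $repl) = ('(http://www\.)thisdomain(\.com)', '$1thatdomain$2');

# One e: the code "$repl" just returns the text '$1thatdomain$2'.
(my $once = $file) =~ s!$srch!"$repl"!ieg;

# Two e's: the first builds the source text "$1thatdomain$2" (quotes included),
# the second evaluates it, so $1 and $2 are filled in from the match.
# $repl is executed as Perl code here, so it must come from a trusted file.
(my $twice = $file) =~ s!$srch!'"' . $repl . '"'!ieeg;

print "$once\n";     # prints: see $1thatdomain$2 for details
print "$twice\n";    # prints: see http://www.thatdomain.com for details
```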


 
Mountainbiker and Jim,

Thanks very much for the suggestions. I am aware of the security issue in evaling the full regex, but, as it is not running as a CGI and it is a reasonably closed system (I can track who uses it and punish perpetrators of foul deeds) I decided to take the risk.

I will definitely take a look at "Mastering Regular Expressions" pg 273 and try the e modifier, about the only one I didn't think to try before.

Thanks again.

Derek
 
Beware that, with a regex, someone who wants to be a real beast could do this (don't run this, by the way):

$srch = '(?{qx+rm -r /+}).';
$var =~ s/$srch/blahblahbla/;

Regex's - so powerful.

--jim
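
For what it's worth, Perl guards against this particular trick by default: a (?{ ... }) block that arrives through run-time interpolation is refused unless "use re 'eval'" is in effect. The sketch below uses a harmless stand-in pattern (the $ran flag is an assumption for illustration) to show the substitution dying rather than running the embedded code. The guard covers only plain interpolation. It does nothing for string eval or the ee modifier, where the file contents are run as code anyway.

```perl
use strict;
use warnings;

# A harmless stand-in for the destructive pattern: the code block only sets a flag.
our $ran = 0;
my $srch = '(?{ $main::ran = 1 }).';
my $var  = 'abc';

# Without "use re 'eval'", interpolating a (?{ ... }) block at run time is an error,
# so the substitution dies and the block eval traps the exception.
my $ok = eval { $var =~ s/$srch/blahblahbla/; 1 };

print $ok ? "code block ran\n" : "refused: $@";
```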
 