Need help on regular expressions in improving code

Status
Not open for further replies.

vpalag

Programmer
Apr 28, 2002
11
IN
Hi All,
I am a beginner with regular expressions.
My requirement is to replace / with + in the URLs in my HTML.
I don't need to do this for http:// URLs, though.
I wrote a piece of code, but it looks to me as if there should
be a more efficient way to do this. Please suggest one, I will be grateful. I still have to do the appending, though.

If my original HTML has:
</P><HTML><B>TEST</B><P> </P><ahref="/planning/pandu/index.html">Planning</a></P><HTML><B>TEST</B><P> </P><ahref="/Home/Bank/Aboutus.html">Aboutus</a>

I need to change it to

</P><HTML><B>TEST</B><P> </P><ahref="/some/custom/appending?planning+pandu+index.html">Planning</a></P><HTML><B>TEST</B><P> </P><ahref="Home+Bank+Aboutus.html">Aboutus</a>

Thanks,


**************************************************

my $orightml = ' </P><HTML><B>TEST</B><P> </P><ahref="/planning/pandu/index.html">Planning</a></P><HTML><B>TEST</B><P> </P><ahref="/Home/Bank/Aboutus.html">Aboutus</a>';

$orightml =~ s/\n//g;    # strip newlines (the original \/n matched a literal "/n")
my $changedaref = '';
my $count = 0;

# Every chunk after the first is the text that followed a "<ahref".
my @list = split(/<ahref/, $orightml);
#print @list;

foreach (@list)
{
    $count++;
    if ($count != 1)
    {
        my $origahref = "<ahref" . $_;
        # Non-greedy (.*?) stops at the first closing quote.
        if ($origahref =~ m|(<ahref=")(.*?)(">)(.*)|)
        {
            my ($ahrefbegin, $url, $ahrefend, $restofit) = ($1, $2, $3, $4);
            $url =~ s|/|+|g;     # every / becomes +
            $url =~ s|^\+||;     # drop the + left by the leading /
            $changedaref .= $ahrefbegin . $url . $ahrefend . $restofit;
        }
        else
        {
            $changedaref .= $origahref;    # no URL found, keep the chunk as is
        }
    }
    else
    {
        $changedaref = $_;    # text before the first link
    }
}

print $changedaref;

*******************************************

 
vpalag,

Here's a simple way to make the match and substitution:

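$ahref = "/planning/pandu/index.html";
# qr{} keeps \. as a literal dot (inside "..." Perl drops the backslash);
# /i also lets the pattern match mixed-case names such as /Home/Bank/Aboutus.html
$match = qr{/([a-z]*)/([a-z]*)/([a-z]*\.html)}i;

$ahref =~ s/$match/$1+$2+$3/;

print($ahref);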


Does this fit your needs?

Mark Banker
 
Hi Mark,

That works fine. Thanks for the shortcut.
One thing: I might have many <a href>s in the HTML paragraph, and I need all of them changed in this fashion. As of now it changes only the first ahref.

Please let me know.
Thanks again,
V
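
One possible way to cover every link in a single pass is a global substitution with the /e modifier, sketched below. This is only an illustration under assumptions: the sub name plus_links is hypothetical, the tag is spelled <ahref="..."> as in the sample in this thread (adjust the pattern for <a href="...">), the host example.com is a placeholder, and the custom prefix mentioned in the question is not added.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite the URL inside every <ahref="..."> in $html.
sub plus_links {
    my ($html) = @_;
    $html =~ s{(<ahref=")([^"]*)(")}{
        # Copy the captures first; the inner s{}{} below resets $1..$3.
        my ($open, $url, $close) = ($1, $2, $3);
        if ($url !~ m{^http://}i) {    # leave absolute http URLs untouched
            $url =~ s{^/}{};           # drop the leading slash
            $url =~ tr{/}{+};          # every remaining slash becomes +
        }
        $open . $url . $close;
    }ge;
    return $html;
}

my $html = '<P><ahref="/planning/pandu/index.html">Planning</a></P>'
         . '<ahref="http://example.com/a/b.html">External</a>'
         . '<ahref="/Home/Bank/Aboutus.html">Aboutus</a>';

print plus_links($html), "\n";
```

Because the pattern only touches the text between the quotes, closing tags such as </a> keep their slashes, and the /g modifier applies the change to every link rather than just the first.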
 
My preferred way of doing this may sound silly, but I think it's simpler and more robust. First I'd change any http:// to http:!!, then I'd change all (remaining) / to +, and then I'd change http:!! back to http://.

The regular expressions involved in each case are MUCH simpler, so I suspect it might actually be faster.
Code:
$str =~ s[http://][http:!!]g;
$str =~ s[/][+]g;
$str =~ s[http:!!][http://]g;
Comments?

Tracy Dryden
tracy@bydisn.com

Meddle not in the affairs of dragons,
For you are crunchy, and good with mustard. [dragon]
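
For anyone trying the three substitutions above, here is a small sketch that wraps them in a sub and runs them on a few strings. The sub name slashes_to_plus and the host example.com are assumptions made for illustration; the three substitutions themselves are unchanged.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the three substitutions from the post above.
sub slashes_to_plus {
    my ($str) = @_;
    $str =~ s[http://][http:!!]g;    # hide the slashes of the scheme
    $str =~ s[/][+]g;                # convert every remaining slash
    $str =~ s[http:!!][http://]g;    # restore the scheme
    return $str;
}

print slashes_to_plus('/planning/pandu/index.html'), "\n";   # +planning+pandu+index.html
print slashes_to_plus('http://example.com/a/b.html'), "\n";  # http://example.com+a+b.html
print slashes_to_plus('</a>'), "\n";                         # <+a>
```

The output shows three things to watch for: a leading / leaves a leading +, only the // after http: is protected (slashes later in an http URL are still converted), and closing tags are rewritten if the whole HTML string is passed in. It may therefore be safer to apply the substitutions to the extracted URL strings rather than to the full page.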
 
Yeah, I like that - I can read it.

Mike
______________________________________________________________________
"Experience is the comb that Nature gives us after we are bald."

Is that a haiku?
I never could get the hang
of writing those things.
 