substitute with regular expression

Status
Not open for further replies.

gorgor

Programmer
Aug 15, 2002
164
US
It seems like every time I write a regular expression it's a new case, and I
realize how much I don't know about them.

$rawPage contains the header and html code collected from an HTTP request.

I need to split this variable into two variables.

$header will contain all the information BEFORE <html>
$page will contain all the information including and following <html>

The white spaces and new lines should all stay intact.

I would like it to be in this type of format. I can't figure out the part
between brackets:

my $header = $rawPage =~ s|(<html>[through the end of the string contained
in $rawPage variable])||i;
my $page = $1;

Thanks for your help!
 
Hello,
I'm not sure, but I have a feeling you could try something like this:

$rawPage =~ /(.*?)(<html>)(.+)/is;   # /i also covers <HTML>; /s lets '.' cross newlines
my $header = $1;
my $page = $2 . $3;                  # keep the <html> tag itself in $page
 
gorgor,

To use s/// you need parentheses around the assignment to $header. Otherwise $header gets the return value of s///, which is not the remaining string but the number of substitutions made. Also, since you have the entire HTML file in one string, you need the /s modifier to make '.' match a newline. So this should work:
Code:
(my $header = $rawPage) =~ s/(<html>.*)$//is;
my $page = $1 || 'No Match';
Or using m//
Code:
my ($header, $page) = $rawPage =~ /^(.*?)(<html>.*)$/is;
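
To make the two points above concrete, here is a small sketch. The sample response in $rawPage is made up for illustration, and $copy, $count and $no_s are hypothetical names that do not appear in the original question.

```perl
use strict;
use warnings;

# Hypothetical sample response; not taken from the original post.
my $rawPage = "HTTP/1.1 200 OK\nContent-type: text/html\n\n"
            . "<HTML>\n<body>hi</body>\n</html>\n";

# Without parentheses, =~ binds tighter than =, so $count receives the
# return value of s/// (the number of substitutions made).
my $copy  = $rawPage;
my $count = $copy =~ s/(<html>.*)$//is;

# With parentheses, $header is assigned a copy first and s/// then edits it.
(my $header = $rawPage) =~ s/(<html>.*)$//is;
my $page = $1;

# Without /s, '.' cannot cross a newline, so the match never reaches the
# end of the string and fails.
my $no_s = $rawPage =~ /(<html>.*)$/i ? $1 : undef;

print "count: $count\n";
print "header and page rebuild the input\n" if $header . $page eq $rawPage;
print "no match without /s\n" unless defined $no_s;
```

Since $header and $page concatenate back to the original string, the whitespace and newlines are untouched.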

jaa
 
Or, playing with 'split',...
Code:
#!/usr/local/bin/perl -w
use strict;
my $page = qq(
Content-type: text/html

<html some_tag_attributes:>
  <head>
    <title>Some Title Text Here</title>
  </head>
  <body>
    <p>Some text in the body</p>
    <p>Some more text in the body</p>
    <img src="http://www.someserver.com/img.jpg">
  </body>
</html>
);

my ($head, $tag, $html) = split /(<html.*?>)/, $page;
print qq(HEAD: $head\n);
print qq(HTML: \n $tag $html\n);

TMTOWTDI ;-) 'hope this helps
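
A variation on the split idea, sketched under a few assumptions: the sample response is invented, the tag pattern <html[^>]*> is my own choice so that attributes spanning several lines still match, and the /i flag and the LIMIT argument are additions that the code above does not use.

```perl
use strict;
use warnings;

# Hypothetical sample response; the header line and markup are made up.
my $rawPage = "Content-type: text/html\n\n"
            . "<HTML lang=\"en\">\n<body>\n  <p>hi</p>\n</body>\n</html>\n";

# The capturing group makes split return the delimiter as well; LIMIT 2
# stops after the first opening tag, so the rest is never split again.
my ($header, $tag, $rest) = split /(<html[^>]*>)/i, $rawPage, 2;

# If no opening tag is found, split returns a single field and $tag is undef.
my $page = defined $tag ? $tag . $rest : '';

print "HEADER:\n$header";
print "PAGE:\n$page";

# Nothing was lost or reformatted if the two pieces rebuild the input.
print "round trip ok\n" if $header . $page eq $rawPage;
```

Gluing $tag back onto $rest gives the $page the original question asked for, with the <html> tag included.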

 