Pattern Matching

Status
Not open for further replies.

inforeqd

Technical User
Jan 8, 2001
95
US
Here's the problem. I have looked through a couple of Perl-by-example books and all over the net and I still can't find a way to do this.

I have a file, and I need to take a chunk of text from the middle of it and place that chunk in another file. I have all the code except for the part that parses the text. I know this can be done in awk; I asked the question in that forum the other day. But how can I do this in Perl?

Here's what I used previously, but it won't parse just the block of text that I want. This segment starts the parse at "Text to match" and does not stop until the EOF of the input being read in.

while (<IN>) {
    chomp;
    if (/Text to match/ .. /Text to stop matching/) {
        print OUT;
    }
}

I know this isn't the way to do it, so any help is greatly appreciated.

Thanks
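
For reference, here is a minimal sketch of the range operator working line by line on in-memory data. The sample lines and the sub name extract_block are made up for illustration; only the two marker strings come from the post. If the right-hand pattern never matches any line, the range stays true to the end of the input, which is one possible cause of the "runs to EOF" symptom described above.

```perl
use strict;
use warnings;

# Hypothetical sample input; only the two marker strings come from the post.
my @lines = (
    "header line\n",
    "Text to match\n",
    "wanted line 1\n",
    "wanted line 2\n",
    "Text to stop matching\n",
    "trailer line\n",
);

# extract_block is an illustrative name, not part of the original code.
sub extract_block {
    my @in = @_;
    my @out;
    for (@in) {
        # The range operator is false until the left pattern matches,
        # then stays true through the line where the right pattern matches.
        push @out, $_ if /Text to match/ .. /Text to stop matching/;
    }
    return @out;
}

my @block = extract_block(@lines);
print @block;
```

Both marker lines are included in the result. Note that the sketch does not chomp, so each line keeps its newline when printed.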
 
Hello inforeqd,

Your syntax is a little off....

open(IN,"<someFile") or die "$!\n";
while (<IN>) { $buffer .= $_; } # read the entire file into $buffer
close IN;

$buffer =~ /pattern to match/;
$matched_text = $&; # Perl catches the matched text in a special var, $&


OR,

if you have a string that identifies the start of the chunk and another for the end of the chunk....

$buffer =~ /start_pattern(.*?)end_pattern/is;
$matched_text = $1; # the paren's catch what matches inside them in $1.
# .*? - means any number of any chars up to end_pattern





Then, do whatever you like with that chunk of text, $matched_text.
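
A minimal sketch of the slurp-and-capture approach, using an in-memory string instead of a file so it runs as is. START_MARK, END_MARK and the sample text are placeholders, not anything from the original file.

```perl
use strict;
use warnings;

# Placeholder data standing in for the slurped file contents.
my $buffer = "header\nSTART_MARK\nline one\nline two\nEND_MARK\ntrailer\n";

my $matched_text;
# /i ignores case; /s lets '.' match newlines so the capture can span lines.
# .*? is the minimal match, so it stops at the first END_MARK.
if ($buffer =~ /START_MARK(.*?)END_MARK/is) {
    $matched_text = $1;
}

print defined $matched_text ? $matched_text : "no match\n";
```

The captured chunk excludes the two markers themselves but keeps the newlines around the inner lines. Checking the result of the match before reading $1 avoids picking up a stale value from an earlier successful match.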

'hope this helps....



keep the rudder amid ship and beware the odd typo
 
gB, I think the ? after the * is redundant. * means any number of, including zero. ? means zero or one of. So, *? would mean zero or one of any number of. Right?

Also, in this line: while (<IN>) {$buffer .= $_;}, are the newlines included in $_ or is it delimited by newlines? In other words, does $buffer contain the newlines of the file, or just a list of lines undelimited by anything?
Sincerely,

Tom Anderson
CEO, Order amid Chaos, Inc.
 
I don't think so, Tom. In this context, the ? causes a minimal match. From O'Reilly's "Programming Perl", bottom of page 63, "By placing a question mark after any of the greedy quantifiers, they can be made to choose the smallest quantity for the first try."

Given this string, 'start some text stop some more text stop again',
this regex,
/start(.*)stop/
would catch 'some text stop some more text', since the * is greedy.
this regex,
/start(.*?)stop/
would catch 'some text' as a minimal match.
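
A quick sketch to see the difference, using a sample string modeled on the one above. The variable names are just for illustration. The captures are printed in brackets so the surrounding spaces are visible.

```perl
use strict;
use warnings;

my $str = 'start some text stop some more text stop again';

# Greedy: .* grabs as much as it can, so the match ends at the LAST 'stop'.
my ($greedy)  = $str =~ /start(.*)stop/;

# Minimal: .*? grabs as little as it can, so the match ends at the FIRST 'stop'.
my ($minimal) = $str =~ /start(.*?)stop/;

print "greedy:  [$greedy]\n";   # prints: greedy:  [ some text stop some more text ]
print "minimal: [$minimal]\n";  # prints: minimal: [ some text ]
```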


Thanks for checking me, though. I think this time, I'm OK.



keep the rudder amid ship and beware the odd typo
 
Tom,
I just read the second paragraph in your previous post.

About the newlines in $buffer, they will be there. So, the pattern match will need an 's' modifier on the end, which lets '.' match newline characters so that the regex engine treats the contents of $buffer as a single line. I was more in philosophy mode in my first response to this thread. More correctly, that section should read,


$buffer =~ /pattern to match/s;
$matched_text = $&; # Perl catches the matched text in a special var, $&

Thanks again.
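
A small sketch of what the 's' modifier changes, assuming a made-up three-line buffer. It only matters when the pattern contains a '.', since that is the character whose behavior the modifier alters.

```perl
use strict;
use warnings;

# Made-up buffer with embedded newlines, as a slurped file would have.
my $buffer = "start\nmiddle\nend\n";

# Without /s, '.' does not match a newline, so .* cannot cross lines.
my $without_s = ($buffer =~ /start.*end/)  ? 1 : 0;

# With /s, '.' matches any character including newline.
my $with_s    = ($buffer =~ /start.*end/s) ? 1 : 0;

print "without /s: $without_s\n";  # prints: without /s: 0
print "with /s:    $with_s\n";     # prints: with /s:    1
```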





keep the rudder amid ship and beware the odd typo
 