regexp question

Status
Not open for further replies.

Ardeaem

Programmer
Joined
Feb 8, 2002
Messages
4
Location
US
I've searched several places for an answer to this, but haven't found one, so here goes:

I have a string that looks something like this:

abcd('blah','blahblah','blee')

and I want a regexp to match and return everything between each pair of single quotes: in this case blah, blahblah, and blee. Two cases are giving me problems. The first is that I don't want an escaped quote to trip it up, so I used this:
/[^\\]'(.*?[^\\])'/

which works beautifully. The only problem is EMPTY strings. Say I have

abcd('','blah','blahblah','blee')

Well, the previous regexp blows up, because the capture has to contain at least one non-backslash character, so with an empty string it swallows the closing quote and carries on into the next field.
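
To make the failure concrete, here is a minimal sketch that runs the pattern above against both strings. The helper name quoted_fields is made up for illustration; only the regexp comes from the post.

```perl
#!perl
use strict;
use warnings;

# Hypothetical helper: collect every $1 captured by the original pattern.
sub quoted_fields {
    my ($str) = @_;
    my @found;
    while ($str =~ /[^\\]'(.*?[^\\])'/g) {
        push @found, $1;
    }
    return @found;
}

# No empty fields: the pattern behaves as intended.
print join('|', quoted_fields("abcd('blah','blahblah','blee')")), "\n";

# Leading empty field: [^\\] inside the capture consumes the closing quote,
# so the captures no longer line up with the quoted fields.
print join('|', quoted_fields("abcd('','blah','blahblah','blee')")), "\n";
```

The first line prints blah|blahblah|blee. The second prints fragments made of quotes and commas instead of the four fields, which is the blow-up described above.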

Any ideas on how I can solve this?
 
Code:
#!perl

$str = "abcd('','blah','blahblah','blee')";

while ($str =~ /('.*?')/gs) { print "Matched: $1\n"; }

Hope this helps.

If you are new to Tek-Tips, please use descriptive titles, check the FAQs, and beware the evil typo.
 
but let's say

$str="abcd('','\\'Hello'\\','don\\'t','blah')";

I want it to return four things: "", "\\'Hello\\'", "don't", and "blah".

I don't think the above regexp would do that, but thanks for the help.
 
I think that your test case is broken!

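$str="abcd('','\\'Hello'\\','don\\'t','blah')";

The quote right after Hello is unescaped: the backslashes come after it instead of before it! Presumably that field was meant to be '\\'Hello\\''.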

However, I think what you want for your re is
Code:
while ($str =~ /
    (              # collect into capture group 1
        '          # single opening quote
            .*?    # any character, non-greedy
        (?<!\\)'   # quote not preceded by a backslash
    )              # stop collecting
/gsx) { print "Matched: $1\n"; }

It uses the zero-width negative look-behind construction (?<!pattern), meaning a quote that is not preceded by pattern.
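
As a rough check, the look-behind can be run against a repaired version of the test case. This is only a sketch: the helper name fields_lookbehind is hypothetical, and the test string assumes the intended field was '\\'Hello\\'' with every inner quote escaped.

```perl
#!perl
use strict;
use warnings;

# Hypothetical helper: return each single-quoted field, quotes included,
# ending a field only at a quote that is not preceded by a backslash.
sub fields_lookbehind {
    my ($str) = @_;
    my @found;
    while ($str =~ /('.*?(?<!\\)')/gs) {
        push @found, $1;
    }
    return @found;
}

# Assumed corrected test case. Inside a double-quoted Perl string,
# \\ produces one literal backslash, so \\' is a backslash-escaped quote.
my $str = "abcd('','\\'Hello\\'','don\\'t','blah')";

print "Matched: $_\n" for fields_lookbehind($str);
```

This prints four lines: '', '\'Hello\'', 'don\'t' and 'blah'. Note that the look-behind only inspects the single character before the quote, so a field that ends in an escaped backslash (such as 'a\\') would not be terminated where you expect.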

All the fancy stuff in Perl's re extensions starts with (?, and < (an arrow pointing left) is a reasonable mnemonic for look-behind. The ! makes it a negative assertion; an = would have made it positive. Look-ahead simply drops the <: (?=pattern) is a positive look-ahead assertion and (?!pattern) is a negative look-ahead.
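
The four assertions can be compared side by side on one small string. This is an illustrative sketch only; match_offset is a made-up helper that reports where a pattern first matches, using Perl's @- array of match start offsets.

```perl
#!perl
use strict;
use warnings;

# Hypothetical helper: offset of the first match, or -1 if there is none.
sub match_offset {
    my ($str, $re) = @_;
    return $str =~ $re ? $-[0] : -1;
}

# The six characters  a \ ' b ' c  (quotes at offsets 2 and 4).
my $s = "a\\'b'c";

printf "%-10s %d\n", '(?<!\\\\)\'', match_offset($s, qr/(?<!\\)'/);  # quote not preceded by a backslash: 4
printf "%-10s %d\n", '(?<=\\\\)\'', match_offset($s, qr/(?<=\\)'/);  # quote preceded by a backslash: 2
printf "%-10s %d\n", '\'(?=c)',     match_offset($s, qr/'(?=c)/);    # quote followed by c: 4
printf "%-10s %d\n", '\'(?!c)',     match_offset($s, qr/'(?!c)/);    # quote not followed by c: 2
```

All four are zero-width: the reported offset is always that of the quote itself, because the asserted character is checked but never consumed.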




"As soon as we started programming, we found to our surprise that it wasn't as easy to get programs right as we had thought. Debugging had to be discovered. I can remember the exact instant when I realized that a large part of my life from then on was going to be spent in finding mistakes in my own programs."
--Maurice Wilkes
 