regexp question

Status
Not open for further replies.

Ardeaem

Programmer
Feb 8, 2002
4
US
I've searched several places for an answer to this, but haven't found one, so here goes:

I have a string that looks something like this:

abcd('blah','blahblah','blee')

and I want a regexp to match and return everything inside each pair of single quotes. In this case, blah, blahblah, and blee. There are two cases that are giving me problems. The first is that I don't want an escaped quotation mark to trip it up. So, I used this:
/[^\\]'(.*?[^\\])'/

which works beautifully. The only problem is EMPTY strings. Say I have

abcd('','blah','blahblah','blee')

Well, that previous regexp blows up because it requires at least one character before the closing quotation mark, and for an empty string that character is the opening quotation mark.

Any ideas on how I can solve this?
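
To make the failure concrete, here is a minimal sketch that runs the pattern above against both strings. The helper name first_field is hypothetical and only exists for this illustration.

```perl
#!perl
use strict;
use warnings;

# Hypothetical helper: return the first capture of the pattern from the
# question, or undef if the pattern does not match at all.
sub first_field {
    my ($str) = @_;
    return $str =~ /[^\\]'(.*?[^\\])'/ ? $1 : undef;
}

# No empty field: the capture is the first quoted value.
print "[", first_field("abcd('blah','blahblah','blee')"), "]\n";

# Leading empty field: the capture must contain at least one character,
# so it swallows the closing quote and the comma instead.
print "[", first_field("abcd('','blah','blahblah','blee')"), "]\n";
```

With the empty field present, the capture comes back as a quote followed by a comma rather than an empty string.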
 
Code:
#!perl

$str = "abcd('','blah','blahblah','blee')";

while ($str =~ /('.*?')/gs) { print "Matched: $1\n"; }

'hope this helps

If you are new to Tek-Tips, please use descriptive titles, check the FAQs, and beware the evil typo.
 
but let's say

$str="abcd('','\\'Hello'\\','don\\'t','blah')";

I want it to return four things: "","\\'hello\\'","don't", and "blah".

I don't think the above regexp would do that, but thanks for the help.
 
I think that your test case is broken!

[tt]$str="abcd('','\\'Hello'\\','don\\'t','blah')";[/tt]

The quote immediately after Hello is unescaped!

However, I think what you want for your re is
Code:
while ($str =~ /
    (              # collect into capture group 1
        '          # single opening quote
            .*?    # any character, non-greedy
        (?<!\\)'   # quote not preceded by a backslash
    )              # stop collecting
/gsx) { print "Matched: $1\n"; }

It uses the zero-width negative look-behind construction [tt](?<!pattern)[/tt], meaning a quote that is not preceded by [tt]pattern[/tt].
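
As a minimal sketch of that look-behind in use, the snippet below wraps the pattern in a hypothetical helper, quoted_fields (the name is an assumption, not something from this thread), and runs it on the test case with the quote after Hello escaped. It captures the text between the quotes rather than the quotes themselves.

```perl
#!perl
use strict;
use warnings;

# Hypothetical helper: return the contents of every single-quoted field
# in $str, treating a backslash followed by a quote as an escaped quote.
sub quoted_fields {
    my ($str) = @_;
    my @fields;
    while ($str =~ /
        '              # opening quote
        (.*?)          # field contents, non-greedy (capture group 1)
        (?<!\\)'       # closing quote not preceded by a backslash
    /gsx) {
        push @fields, $1;
    }
    return @fields;
}

# Test case with the quote after Hello escaped.
my $str = "abcd('','\\'Hello\\'','don\\'t','blah')";
print "Matched: [$_]\n" for quoted_fields($str);
```

This returns four fields, the first of them empty, and leaves the backslashes in place. One caveat: a field whose last character is itself an escaped backslash would be misread, because the look-behind only checks the single character before the quote.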

All the fancy constructs in perl's re extensions start with (? and the < (arrow pointing left) is a reasonably good mnemonic for look-behind. The ! makes it a negative assertion; an = would have made it positive, so (?<= is a positive look-behind. The look-ahead forms have no arrow at all: (?= is a positive look-ahead assertion and (?! is a negative look-ahead.
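
The four assertion forms can be compared side by side with a small sketch. The helper count_matches is hypothetical, and the input is the test case from this thread with the quote after Hello escaped.

```perl
#!perl
use strict;
use warnings;

# Hypothetical helper: count how many times $re matches in $str.
sub count_matches {
    my ($str, $re) = @_;
    my $n = () = $str =~ /$re/g;   # list assignment in scalar context counts matches
    return $n;
}

my $str = "abcd('','\\'Hello\\'','don\\'t','blah')";

my $unescaped    = count_matches($str, qr/(?<!\\)'/);  # negative look-behind
my $escaped      = count_matches($str, qr/(?<=\\)'/);  # positive look-behind
my $before_comma = count_matches($str, qr/'(?=,)/);    # positive look-ahead
my $not_before   = count_matches($str, qr/'(?!,)/);    # negative look-ahead

print "unescaped=$unescaped escaped=$escaped ",
      "before-comma=$before_comma not-before-comma=$not_before\n";
```

Because the assertions are zero-width, each pattern consumes only the quote itself; the backslash or comma is inspected but never becomes part of the match.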




"As soon as we started programming, we found to our surprise that it wasn't as easy to get programs right as we had thought. Debugging had to be discovered. I can remember the exact instant when I realized that a large part of my life from then on was going to be spent in finding mistakes in my own programs."
--Maurice Wilkes
 