Regexp: Not at the beginning.

awolff01 · Sep 5, 2003

I want to test for the pattern "MU" anywhere except at the beginning of a line:

Would this work :

if ($test =~ m/^^MU/)

Thanks.

It seems as though "^" is used both to mark the beginning of a line and to do inverse operations.

raklet · Sep 5, 2003

^ is used both to indicate the beginning of a line and to negate; however, it only negates when it is the first character inside []. This creates a problem for your regex, though. If you specify:

if ($test =~ m/^[^MU]/)

then the regex matches any line whose first character is neither M nor U. It tests a single character, not the two-character sequence MU. If you specify:

if ($test =~ m/^[^M]U/)

then the regex matches any line whose first character is not M and whose second character is U.
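To make the two roles of ^ concrete, here is a small sketch. The sample strings and the sub names are made up for illustration; the second pattern is the ^^MU from the original question.

```perl
use strict;
use warnings;

# Hypothetical sample strings, chosen only for illustration.
my @samples = ('MUd', 'Mary', 'Umbrella', 'Dan', 'aMU');

# Outside [] the caret anchors to the start; as the first character
# inside [] it negates the character class.
sub starts_without_M_or_U { return $_[0] =~ /^[^MU]/ ? 1 : 0 }

# Two carets in a row outside [] are just two start anchors,
# so this behaves like /^MU/.
sub double_anchor { return $_[0] =~ /^^MU/ ? 1 : 0 }

for my $s (@samples) {
    printf "%-10s ^[^MU]=%d  ^^MU=%d\n",
        $s, starts_without_M_or_U($s), double_anchor($s);
}
```

Only "Dan" and "aMU" pass the character-class test, and only "MUd" passes the double-anchor test, so neither pattern answers the original question.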

So, what you have to do is use the ! (logical not) operator to negate the whole match, so that the test is true for any line that does not start with MU. Here is the code I used to test and verify.

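my @array = qw(MUd Mary Moses Mully Dan);
foreach (@array) {
    chomp;    # no-op for qw() items; matters when the lines come from a file
    # Print every item that does not start with "MU" (the match is case-sensitive).
    print "$_\n" if (!/^MU/);
}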

The line you are interested in is this:

if (!/^MU/);

Cheers,

awolff01 · Sep 5, 2003

Actually I decided to do it a little differently.

I want to require something in front of my pattern: =~ m/[\s|\w]+MU/

I believe that using your method I would match anything that does not begin with MU. What I want is to match MU only when it does not begin the line. So if it does not begin the line, it should have at least one alphanumeric character or whitespace character preceding it.

Your post was very helpful anyway. Thanks.

raklet · Sep 5, 2003

Yes, I understand what you are after now. Your solution will work just fine.

=~ m/[\s|\w]+MU/

Another way that you could write it is:

=~ m/.+?MU/

The . matches any character (except a newline), the + means one or more of them, and the ? tells the + not to be greedy, so the match stops at the first instance of MU. Otherwise the match will run through every instance of MU up to the last one it finds in the line (if MU occurs more than once).
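A quick way to see the difference is to print what each version actually matches. This is only a sketch; the helper matched_text and the input string are invented for the example.

```perl
use strict;
use warnings;

# Return the text matched by the pattern, or undef when there is no match.
sub matched_text {
    my ($re, $str) = @_;
    return $str =~ /($re)/ ? $1 : undef;
}

my $line = 'xxMUyyMUzz';   # hypothetical input with two occurrences of MU

print "greedy:     ", matched_text(qr/.+MU/,  $line), "\n";   # xxMUyyMU
print "non-greedy: ", matched_text(qr/.+?MU/, $line), "\n";   # xxMU
```

As a plain yes/no test the two behave the same; greediness only changes how much text the match covers, which matters once you capture or substitute.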

icrf · Sep 5, 2003

You could use a negative lookbehind assertion for the beginning of line anchor.

Code:

/(?<!^)MU/

That matches any MU that is not preceded by the beginning of a line. It is related to raklet's !/^MU/, but it negates only the position, not the entire regex, so you can put other things into the regex if you need to. Note that the two differ on lines with no MU at all: !/^MU/ is true for them, while the lookbehind version does not match. If all you need is to reject lines that start with MU, I'd suggest his method.
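Running the lookbehind and the negated match side by side shows where they agree and where they part ways. This is a sketch with made-up sample strings and sub names, not anything from the original posts.

```perl
use strict;
use warnings;

# True when the string contains MU somewhere other than its very start.
sub mu_not_at_start { return $_[0] =~ /(?<!^)MU/ ? 1 : 0 }

# True when the string simply does not start with MU.
sub not_starting_with_mu { return $_[0] !~ /^MU/ ? 1 : 0 }

for my $s ('MUd', 'aMU', 'Mary', '$MU', 'MUxMU') {
    printf "%-6s lookbehind=%d  negated=%d\n",
        $s, mu_not_at_start($s), not_starting_with_mu($s);
}
```

"Mary" (no MU at all) passes only the negated test, and "MUxMU" (a second MU later in the string) passes only the lookbehind test. Without the /m modifier, ^ means the start of the string; add /m if the input holds several lines.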

The [\s|\w]+MU pattern (the +, or any quantifier, is really redundant, since all you need is a single character before MU to ensure it's not at the beginning) wouldn't catch things like "$MU", because $ is neither whitespace nor a word character. Moreover, if you're using multi-line input, where there could be a newline right before MU, \s includes the newline (hey, it's whitespace), so the pattern would match when you don't want it to. Using . is better (by default it matches everything except a newline), but I think the cleanest way is to use a short, simple regex and negate it.
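Both pitfalls are easy to reproduce. In this sketch the test strings, labels, and sub names are all invented for the example.

```perl
use strict;
use warnings;

# The pattern from the thread: needs whitespace, a word character,
# or a literal "|" somewhere before MU.
sub class_before_mu { return $_[0] =~ /[\s|\w]+MU/ ? 1 : 0 }

# Simpler alternative: any single character except newline before MU.
sub dot_before_mu { return $_[0] =~ /.MU/ ? 1 : 0 }

# Hypothetical inputs as [label, string] pairs.
my @cases = (
    [ 'aMU'        => 'aMU'  ],
    [ '$MU'        => '$MU'  ],
    [ 'newline+MU' => "\nMU" ],
    [ 'MU'         => 'MU'   ],
);

for my $case (@cases) {
    my ($label, $str) = @$case;
    printf "%-11s class=%d  dot=%d\n",
        $label, class_before_mu($str), dot_before_mu($str);
}
```

The class version misses '$MU' and wrongly accepts an MU that follows a newline; the dot version gets both of those cases right.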

Also, you don't need the OR operator | inside a [character class]. In fact, | is taken literally there, so the class now contains whitespace, word characters, and a pipe, which is probably not what was intended.
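A one-character membership test makes the stray pipe visible. The sub names below are hypothetical, just for this sketch.

```perl
use strict;
use warnings;

# Is the single character a member of the class as written in the thread?
sub in_class_with_pipe { return $_[0] =~ /\A[\s|\w]\z/ ? 1 : 0 }

# The same class without the pipe.
sub in_class_without_pipe { return $_[0] =~ /\A[\s\w]\z/ ? 1 : 0 }

for my $ch ('a', ' ', '|', '$') {
    printf "'%s' with-pipe=%d  without-pipe=%d\n",
        $ch, in_class_with_pipe($ch), in_class_without_pipe($ch);
}
```

Only the '|' row differs between the two columns: [\s|\w] accepts a literal pipe, [\s\w] does not.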

Ugh, it's 2am. I think I get cranky in the wee hours.


raklet · Sep 6, 2003

Excellent comments. I doubt I could find a much better discussion of the subject in any book.

MikeLacey · Sep 7, 2003

I've never seen a "negative lookbehind assertion" even described anywhere except p5p, let alone used.... Wow, basically.

Mike

