^ is used both to mark the beginning of a line and to negate; however, it only negates when it is the first character inside []. This creates a problem for your regex, though. If you specify:
if ($test =~ m/^[^MU]/)
then the regex matches any line whose first character is neither M nor U. It does not test for the two letters MU together. If you specify:
if ($test =~ m/^[^M]U/)
then the regex matches any line whose first character is not M and whose second character is U.
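The two roles of ^ can be checked with a short sketch. The sample words and the helper name first_char_not_m_or_u are made up for illustration; the point is only that ^[^MU] inspects a single character, so it also rejects words such as Mary that do not start with MU.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the first character is neither M nor U.
# The outer ^ anchors at the start; the ^ inside [] negates the class.
sub first_char_not_m_or_u {
    my ($s) = @_;
    return $s =~ /^[^MU]/ ? 1 : 0;
}

for my $word (qw(MUd Mary Ursula Dan)) {
    my $result = first_char_not_m_or_u($word) ? "matches" : "no match";
    print "$word: $result\n";
}
```

Only Dan matches here, even though Mary and Ursula do not begin with MU either.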
So, what you have to do is use the ! (logical not) operator to negate the whole match, so that you keep every line that does not start with MU. Here is the code I used to test and verify.
@array = qw(MUd Mary Moses Mully Dan);
foreach (@array) {
    chomp;
    print "$_\n" if (!/^MU/);
}
I believe that using your method I would match anything that did not begin with MU. What I want is to match MU only when it does not begin the line. So if it does not begin the line, it should have at least one alphanumeric character or whitespace character preceding it.
Yes, I understand what you are after now. Your solution will work just fine.
=~ m/[\s|\w]+MU/
Another way that you could write it is:
=~ m/.+?MU/
The . matches any character, the + means one or more of them, and the ? tells it not to be greedy, so it stops at the first instance of MU it can reach. Otherwise the match will run through every instance of MU up to the last one it finds in the line (if MU occurs more than once).
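The greedy versus non-greedy difference is easiest to see by capturing what each pattern actually consumes. This is a minimal sketch; the sample string with two occurrences of MU is made up.

```perl
use strict;
use warnings;

my $line = "aMUbMUc";   # made-up sample containing MU twice

# Greedy: .+ takes as much as it can, then backs off to the last MU.
my ($greedy) = $line =~ /(.+MU)/;

# Non-greedy: .+? takes as little as it can, stopping at the first MU.
my ($lazy) = $line =~ /(.+?MU)/;

print "greedy: $greedy\n";   # aMUbMU
print "lazy:   $lazy\n";     # aMU
```

For a plain yes/no test both patterns succeed on the same lines; the difference only matters when you use the matched text.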
You could use a negative lookbehind assertion for the beginning of line anchor.
Code:
/(?<!^)MU/
That matches any MU that is not preceded by the beginning of a line. That essentially does the same thing as racklet's !/^MU/ but lets you put other things into the regex if you need to (it negates only the position, not the entire regex). If you don't need anything else in the regex, I'd suggest his method.
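A quick side-by-side sketch of the two approaches, using made-up sample strings and hypothetical helper names. It suggests they agree on the simple cases but can differ: the negated match also accepts lines that contain no MU at all, and the lookbehind accepts a line that starts with MU as long as another MU appears later.

```perl
use strict;
use warnings;

# Hypothetical helpers wrapping the two patterns from the thread.
sub mu_not_at_start   { my ($s) = @_; return $s =~ /(?<!^)MU/ ? 1 : 0; }
sub not_starting_mu   { my ($s) = @_; return $s !~ /^MU/      ? 1 : 0; }

for my $s (qw(MUd xMUd MUxMU Dan)) {
    printf "%-6s lookbehind=%d negated=%d\n",
        $s, mu_not_at_start($s), not_starting_mu($s);
}
```

Which behaviour is wanted depends on whether lines without any MU should count as matches.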
The [\s|\w]+MU pattern (the + or any quantifier is really redundant, since all you need is a single character before MU to ensure it's not at the beginning) wouldn't catch things like "$MU" because $ isn't whitespace or a word character. Moreover, if you're using multi-line input, where there could be a newline before MU, \s includes the newline (hey, it's whitespace) and would match when you don't want it to. Using the . is better (by default it doesn't match newlines, just everything else), but I think the cleanest way is to use a short, simple regex and negate it.
Also, you don't need the OR operator | inside a [character class]. In fact, inside a class it is treated as a literal character, so the class now contains word characters, whitespace, and a pipe, which is probably not what was intended.
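The edge cases mentioned above can be checked with a small sketch. The sample strings are made up, and single quotes are used so that Perl does not try to interpolate $MU as a variable.

```perl
use strict;
use warnings;

# Made-up samples: a $ prefix, a pipe prefix, a newline prefix, a letter prefix.
my @samples = ('$MU', '|MU', "\nMU", 'aMU');

for my $s (@samples) {
    (my $shown = $s) =~ s/\n/\\n/g;                # make the newline visible
    my $with_pipe = $s =~ /[\s|\w]MU/ ? 1 : 0;     # class includes a literal |
    my $no_pipe   = $s =~ /[\s\w]MU/  ? 1 : 0;     # whitespace or word char only
    my $dot       = $s =~ /.MU/       ? 1 : 0;     # any character except newline
    print "$shown: [\\s|\\w]=$with_pipe [\\s\\w]=$no_pipe .=$dot\n";
}
```

The class with the pipe accepts '|MU' while the one without it does not, neither class accepts '$MU', and only the dot rejects the newline case.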
Ugh, it's 2am. I think I get cranky in the wee hours.
----------------------------------------------------------------------------------
...but I'm just a C man trying to see the light