Regular expression character classes vs. shorthands

Status
Not open for further replies.

Zhris

Programmer
Aug 5, 2008
254
GB
Hi,

I have two alternative regexes to check for a valid UK postcode. I know they probably do not validate accurately; however, I believed the two regexes were identical, yet they do not produce the same output.

Code:
Regex 1 (classes) (works as expected):
if($_=~s/^([A-Za-z]{1,3}[0-9]{1}|[A-Za-z]{1,2}[0-9]{2})([0-9]{1}[A-Za-z]{2})$/$1 $2/){print "$_ > Passed<br>";}

Regex 2 (shorthands) (does not work as expected):
if($_=~s/^(\w{1,3}\d{1}|\w{1,2}\d{2})(\d{1}\w{2})$/$1 $2/){print "$_ > Passed<br>";}

Regex 2 passes postcodes such as 'AA1 11A' and 'AA1 1A1', which should fail.

What have I done wrong / why do they pass?

Thanks,

Chris
 
It's because \w includes the digits 0-9.

\w = [a-zA-Z0-9_]
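
To see the difference side by side, here is a minimal sketch that runs both patterns from the question over a few sample inputs. The variable names ($classes, $shorthands) and the sample strings are made up for illustration; the inputs are written without a space, since the substitution in the question is what inserts it. Note that on Unicode strings \w can match more than the ASCII set shown above.

```perl
use strict;
use warnings;

# The two patterns from the question, compiled once so they can be compared.
my $classes    = qr/^([A-Za-z]{1,3}[0-9]{1}|[A-Za-z]{1,2}[0-9]{2})([0-9]{1}[A-Za-z]{2})$/;
my $shorthands = qr/^(\w{1,3}\d{1}|\w{1,2}\d{2})(\d{1}\w{2})$/;

# Hypothetical sample inputs: one plausible postcode and two that should fail.
for my $code (qw(AA11AA AA111A AA11A1)) {
    my $c = $code =~ $classes    ? 'pass' : 'fail';
    my $s = $code =~ $shorthands ? 'pass' : 'fail';
    print "$code: classes=$c shorthands=$s\n";
}
```

With the shorthand version, the trailing \w{2} is free to match digits, which is why the last two inputs get through.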

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
Ahh, thanks Kevin. I misread my quick reference. Therefore there is no shorthand and I have to use the character class [A-Za-z]?

Chris
 
Right, there is no shorthand for A-Za-z, so you have to use the character class.
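
As a rough sketch of what that looks like in practice: the letters are spelled out as [A-Za-z], while \d can stay because it really does mean "digit". The second pattern is an alternative spelling that is not from this thread: it uses the POSIX bracket class [[:alpha:]] together with the /a modifier (which, as far as I know, needs Perl 5.14 or later) to keep it to ASCII letters. The names $explicit and $posix and the sample inputs are assumptions for illustration only.

```perl
use strict;
use warnings;

# Letters written as an explicit class; \d kept for the digits.
my $explicit = qr/^([A-Za-z]{1,3}\d|[A-Za-z]{1,2}\d{2})(\d[A-Za-z]{2})$/;

# Alternative: POSIX class [[:alpha:]], restricted to ASCII by the /a modifier.
my $posix = qr/^([[:alpha:]]{1,3}\d|[[:alpha:]]{1,2}\d{2})(\d[[:alpha:]]{2})$/a;

# Hypothetical sample inputs, as in the question but without the space.
for my $code (qw(AA11AA AA111A AA11A1)) {
    my $e = $code =~ $explicit ? 'pass' : 'fail';
    my $p = $code =~ $posix    ? 'pass' : 'fail';
    print "$code: explicit=$e posix=$p\n";
}
```

Either way, neither pattern is a full UK postcode validator; they only fix the letters-versus-digits mix-up discussed here.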

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 