Quick Phone Number RegEx Question 2

Alphabin · Apr 15, 2004

I have a phone number in the following format:
(xxx) xxx-xxxx ext. xxxxx

I need a RegEx so I can extract the 4 parts.

I have this right now

Code:

m|[(]?(\d{3})[)][ ]?(\d{3}\d?)[-](\d{4})[ ][ext.][ ]?(\d{5})|g

Nebbish · Apr 15, 2004

(xxx) xxx-xxxx ext. xxxxx

/\((\d{3})\) (\d{3})\-(\d{4}) ext\. (\d{5})/;

If you wanted to get fancy, you could use the split function as so:

my $string = "(123) 123-4567 ext. 12345";
my @temp;
@temp = split(/(?:\(|\) |-| ext\. |\n)/, $string);  # $temp[0] is empty because the string starts with "("
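
A minimal sketch of both approaches from this post, run against the same sample string. The variable names are made up for illustration, and the `grep` is only one possible way to drop the empty leading field that `split` returns when the string starts with a delimiter.

```perl
use strict;
use warnings;

my $string = "(123) 123-4567 ext. 12345";

# A match in list context returns the captured groups directly.
my @captured = $string =~ /\((\d{3})\) (\d{3})-(\d{4}) ext\. (\d{5})/;

# split on the separators; the leading "(" yields an empty first field,
# so empty fields are filtered out with grep.
my @fields = grep { length } split /(?:\(|\) |-| ext\. |\n)/, $string;

print "captured: @captured\n";
print "split:    @fields\n";
```

Both arrays should end up holding the same four digit groups.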

HTH,

Nick

Coderifous · Apr 15, 2004

bling it up a notch:

Code:

my $num = '(123) 234-3456 ext. 45678';
my @parts = $num =~ /(\d+)/g;

print "$_ " for @parts;  # just proof that it worked

--jim

Coderifous · Apr 15, 2004

Silly me, you don't need the parens there... just forgot to take them out after tinkering around.

Code:

my @parts = $num =~ /\d+/g;
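
For anyone wondering why the parens make no difference here, a small sketch (array names are illustrative only): with /g in list context, a pattern without capture groups returns every whole match, and a pattern with one group returns that group for every match. When the group wraps the whole pattern, the two are the same.

```perl
use strict;
use warnings;

my $num = '(123) 234-3456 ext. 45678';

# No capture groups: each whole match is returned.
my @without_parens = $num =~ /\d+/g;

# One capture group around the whole pattern: same values come back.
my @with_parens = $num =~ /(\d+)/g;

print join(',', @without_parens), "\n";
print join(',', @with_parens), "\n";
```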

K.I.S.S.

--jim

Alphabin · Apr 15, 2004

Thanks guys for your help....

I really need to capture each group individually as $1, $2, $3 and $4

Right now that's what I have

Code:

$ExampleNumber = '(123) 123-1234 ext. 12345';

if ($ExampleNumber =~ m|[(]?(\d{3})[)][ ]?(\d{3}\d?)[-](\d{4})[ ][ext.][ ](\d{5})|g) {

$ExampleNumber1PFix1 = $1;
$ExampleNumber1PFix2 = $2;
$ExampleNumber1PFix3 = $3;
$ExampleNumber1PFix4 = $4;
}

print "Just to try \n";
print "$ExampleNumber1PFix1, $ExampleNumber1PFix2 , $ExampleNumber1PFix3 , $ExampleNumber1PFix4 \n";

Nebbish. I've tried your RegEx and it's not working, but syntax seems ok... weird..

Thanks Guys

Coderifous · Apr 15, 2004

Call me crazy, but I believe that either of the solutions presented to you solved your problem... are you still having issues?

And you are mistaken... you do not need to capture them in $n variables. That is an imagined constraint.
If you are concerned about seeing if the regex succeeded, you can just check what was returned... in this case, checking for the four tokens.

Code:

my @parts = $num =~ /\d+/g;
print( (@parts == 4) ? 'hoookayyy!' : 'yayyaaaa!' );  # outer parens needed, or print only sees (@parts == 4)

print "\nhawhhaaaat!\n";

# it's my code, and I can reference The Chapelle show in it if I want...
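
A hedged sketch of the count check wrapped in a helper. The name `phone_parts` is hypothetical, not something from this thread, and note that counting digit runs says nothing about how long each run is.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the four parts, or an empty list
# when the number of digit runs is not exactly four.
sub phone_parts {
    my ($num) = @_;
    my @parts = $num =~ /\d+/g;
    return @parts == 4 ? @parts : ();
}

my @ok  = phone_parts('(123) 234-3456 ext. 45678');
my @bad = phone_parts('(123) 234-3456');   # no extension: only 3 digit runs

print scalar(@ok),  " parts\n";
print scalar(@bad), " parts\n";
```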

--jim

Nebbish · Apr 15, 2004

Double checked the regexp I presented and it still seems ok. Don't know what the problem there is, but I'd say Coderifous's solution is significantly more elegant anyway.

Coderifous · Apr 15, 2004

Thanks Nebster. BTW: Your solution worked fine on my system. So I am in the same boat as you, and am not sure what his problem was... certainly it was not your code.

--jim

duncdude · Apr 15, 2004

everyone gives me grief for regex's being used when not needed - for speed issues

$longPhone = "(123) 456-7890 ext. 12345";

$phoneNum = substr($longPhone, 1,3)." ".substr($longPhone,6,3)." ".substr($longPhone,10,4)." ".substr($longPhone,20,5);

print "phone number is : $phoneNum\n";

Kind Regards
Duncan

PaulTEG · Apr 15, 2004

A time and a place for everything
--Paul

Coderifous · Apr 15, 2004

Duncan, Duncan, Duncan ... you KNOW this solution calls for a regex. Simply because it's risky to depend on the data to come in EXACTLY the same format every time. If it were computer generated data... then maybe, but how often are phone numbers computer generated? Maybe at the phone company's office and that's it. I have seen how bright and talented your solutions normally are and this my friend falls far from the bunch. I feel confident that you would not use this type of solution in your own system. If I ever get a job where I see this implemented in some code somewhere, I will know it was you, and I will track you down... mark my words Duncmeister, I will find you and make you fix it.

I should make you write a 1000 word essay on why this is a ridiculous suggestion, but I think you've learned your lesson here. If you do fall again though, I will have no choice but to write to Dave of the Tecumseh group and request that you be removed from the MVP list... I'm sorry Duncan, it's just a lot of responsibility being number one man. People expect a lot of you. I was number one for a while... but I couldn't take the pressure... I cracked man... don't be like me, you go be a hero...

Now hold your head high, get out there and dazzle some dorks with your perl skills!!

(And now you have officially been given grief for NOT using a regex.)

--jim

duncdude · Apr 15, 2004

thanks for the grief jim!

but he did say...

"I have a phone number in the following format:
(xxx) xxx-xxxx ext. xxxxx"

...didn't he!

(rhetorical question) ;-)

Kind Regards
Duncan

Alphabin · Apr 15, 2004

No fight guys. He He !

I have to agree that the solution proposed by Dunc is really good. But like Jim said, I think a Regex is more suited for this since the data is not filtered to be 100% consistent.
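
To make the consistency point concrete, here is a rough sketch comparing fixed offsets with digit runs on two spellings of the same number. The second input is an invented variant (the space after the closing paren is missing), not data from this thread.

```perl
use strict;
use warnings;

my @inputs = (
    '(123) 456-7890 ext. 12345',   # the documented format
    '(123)456-7890 ext. 12345',    # same number, space after ")" missing
);

my (@by_offset, @by_regex);
for my $in (@inputs) {
    # Fixed offsets, as in the substr approach.
    push @by_offset, join ' ',
        substr($in, 1, 3), substr($in, 6, 3),
        substr($in, 10, 4), substr($in, 20, 5);

    # Digit runs, as in the /\d+/g approach.
    push @by_regex, join ' ', $in =~ /\d+/g;
}

for my $i (0 .. $#inputs) {
    print "input:  $inputs[$i]\n";
    print "offset: $by_offset[$i]\n";
    print "regex:  $by_regex[$i]\n";
}
```

The offsets only line up for the first input; the digit-run version gives the same four parts for both.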

Just one last question because I'm really curious to find out the error in my initial Regex...

Code:

m|[(]?(\d{3})[)][ ]?(\d{3}\d?)[-](\d{4})[ ][a-z.][ ](\d{5})|g

I know it's caused by the [a-z.] but I can't figure out why. The first part works properly: I can extract $1, $2 and $3, but not $4. I'm simply curious, even if I won't use this Regex.

Thanks Jim, Dunc & Nick... I really appreciate the help from you guys.

duncdude · Apr 16, 2004

Code:

m|

[(]?         -> optional open brace ... \(?

(\d{3})      -> 3 digits

[)]          -> close brace / why not optional as earlier? ... \)?

[ ]?         -> optional space ... just (space - not in a class)?

(\d{3}\d?)   -> 3 digits then digit / optional ... (\d{3,4})

[-]          -> hyphen ... - (not in class)

(\d{4})      -> 4 digits

[ ]          -> space ... (space - not in a class)

[a-z.]       -> ONE character: lowercase 'a' through 'z' or a full stop (guess you need a +)

[ ]          -> space ... (space - not in a class)

(\d{5})      -> 5 digits

|g

Kind Regards
Duncan

duncdude · Apr 16, 2004

Hi Alphabin

This one works:-

Code:

m|
\(?         # (escaped) open paren - optional
(\d{3})     # 3 digits
\)?         # (escaped) close paren - optional
[ ]?        # space - optional
(\d{3,4})   # min of 3, max of 4 digits
-           # hyphen
(\d{4})     # 4 digits
[ ]         # space
[a-z.]+     # one or more of: 'a' through 'z' or full stop
[ ]         # space
(\d{5})     # 5 digits
|gx;        # x flag: whitespace and comments in the pattern are ignored
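
A short sketch of why the + matters, using only the tail of the pattern. A character class matches exactly one character, so without a quantifier it cannot cover the four characters of "ext." (the variable names are illustrative):

```perl
use strict;
use warnings;

my $num = '(123) 123-1234 ext. 12345';

# [a-z.] matches exactly ONE character, so this pattern needs a single
# letter or dot between the two spaces; "ext." is four characters.
my $one_char = $num =~ m|(\d{4})[ ][a-z.][ ](\d{5})| ? 'match' : 'no match';

# Adding + lets the class consume all of "ext."
my $one_or_more = $num =~ m|(\d{4})[ ][a-z.]+[ ](\d{5})| ? 'match' : 'no match';

print "[a-z.]  : $one_char\n";
print "[a-z.]+ : $one_or_more\n";
```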

Kind Regards
Duncan

duncdude · Apr 16, 2004

Hi Alphabin

this regex is very logical as you are only interested in the digits!

Code:

$phone =~ m|(\d{3})[^\d]+(\d{3,4})[^\d]+(\d{4})[^\d]+(\d{5})|g;

(\d{3})   -> 3 digits - fixed
[^\d]+    -> catch anything that isn't a digit
(\d{3,4}) -> min of 3 digits - max of 4
[^\d]+    -> catch anything that isn't a digit
(\d{4})   -> 4 digits - fixed
[^\d]+    -> catch anything that isn't a digit
(\d{5})   -> 5 digits - fixed
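
One possible way to use a pattern like this, sketched with list assignment so each group gets a name instead of $1 to $4. The variable names are assumptions for illustration, and \D is used as shorthand for [^\d].

```perl
use strict;
use warnings;

my $phone = '(123) 456-7890 ext. 12345';

# A match in list context returns the four captured groups;
# on failure the list is empty and all four variables stay undef.
my ($area, $exchange, $line, $ext) =
    $phone =~ /(\d{3})\D+(\d{3,4})\D+(\d{4})\D+(\d{5})/;

if (defined $ext) {
    print "area=$area exchange=$exchange line=$line ext=$ext\n";
} else {
    print "no match\n";
}
```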

Kind Regards
Duncan

Alphabin · Apr 16, 2004

Thank you very much Dunc ! It is really appreciated. You get a big *

Again, thank you to all of you

duncdude · Apr 16, 2004

cheers Alphabin!

Kind Regards
Duncan

duncdude · Apr 16, 2004

a bit better...

Code:

$line1 = "193    03989337  000060003+00060009.69003858";
$line2 = "195      989337  000060003+00060009.69003858";

@line1split = split (//, $line1);
@line2split = split (//, $line2);

$line1Length = @line1split;

print "number of characters: $line1Length\n\n";

$valuePerChar = 100 / $line1Length;

print "score per character match: $valuePerChar %\n\n";

for ($x=0; $x<=$#line1split; $x++) {
  
  if ($line1split[$x] eq $line2split[$x]) {
    $matched = $matched + $valuePerChar;
    print "$line1split[$x] | $line2split[$x] ... (match) $matched %\n"
  } else {
    print "$line1split[$x] | $line2split[$x] ...         $matched %\n"
  }
  
}

output...

Code:

number of characters: 44

score per character match: 2.27272727272727 %

1 | 1 ... (match) 2.27272727272727 %
9 | 9 ... (match) 4.54545454545455 %
3 | 5 ...         4.54545454545455 %
  |   ... (match) 6.81818181818182 %
  |   ... (match) 9.09090909090909 %
  |   ... (match) 11.3636363636364 %
  |   ...         11.3636363636364 %
0 |   ...         11.3636363636364 %
3 |   ...         11.3636363636364 %
9 | 9 ... (match) 13.6363636363636 %
8 | 8 ... (match) 15.9090909090909 %
9 | 9 ... (match) 18.1818181818182 %
3 | 3 ... (match) 20.4545454545455 %
3 | 3 ... (match) 22.7272727272727 %
7 | 7 ... (match) 25 %
  |   ... (match) 27.2727272727273 %
  |   ... (match) 29.5454545454546 %
0 | 0 ... (match) 31.8181818181818 %
0 | 0 ... (match) 34.0909090909091 %
0 | 0 ... (match) 36.3636363636364 %
0 | 0 ... (match) 38.6363636363636 %
6 | 6 ... (match) 40.9090909090909 %
0 | 0 ... (match) 43.1818181818182 %
0 | 0 ... (match) 45.4545454545455 %
0 | 0 ... (match) 47.7272727272727 %
3 | 3 ... (match) 50 %
+ | + ... (match) 52.2727272727273 %
0 | 0 ... (match) 54.5454545454546 %
0 | 0 ... (match) 56.8181818181818 %
0 | 0 ... (match) 59.0909090909091 %
6 | 6 ... (match) 61.3636363636364 %
0 | 0 ... (match) 63.6363636363636 %
0 | 0 ... (match) 65.9090909090909 %
0 | 0 ... (match) 68.1818181818182 %
9 | 9 ... (match) 70.4545454545455 %
. | . ... (match) 72.7272727272727 %
6 | 6 ... (match) 75 %
9 | 9 ... (match) 77.2727272727273 %
0 | 0 ... (match) 79.5454545454545 %
0 | 0 ... (match) 81.8181818181818 %
3 | 3 ... (match) 84.0909090909091 %
8 | 8 ... (match) 86.3636363636363 %
5 | 5 ... (match) 88.6363636363636 %
8 | 8 ... (match) 90.9090909090908 %

Kind Regards
Duncan

nix45 · Apr 19, 2004

duncdude, I think you meant to post that last one in the thread titled "File Comparison Algorithm".
