Quick Phone Number RegEx Question 2

Status
Not open for further replies.

Alphabin

Programmer
Dec 8, 2002
119
CA
I have phone number in the following format:
(xxx) xxx-xxxx ext. xxxxx

I need a RegEx so I can extract the 4 parts.

I have this right now

Code:
m|[(]?(\d{3})[)][ ]?(\d{3}\d?)[-](\d{4})[ ][ext.][ ]?(\d{5})|g
 
(xxx) xxx-xxxx ext. xxxxx

/\((\d{3})\) (\d{3})-(\d{4}) ext\. (\d{5})/;

If you wanted to get fancy, you could use the split function as so:

my $string = "(123) 123-4567 ext. 12345";
my @temp;
# the string starts with a delimiter, so $temp[0] is an empty field
@temp = split(/(?:\(|\) |-| ext\. |\n)/, $string);

HTH,

Nick
 
bling it up a notch:

Code:
my $num = '(123) 234-3456 ext. 45678';
my @parts = $num =~ /(\d+)/g;

print "$_ " for @parts;  # just proof that it worked

--jim
 
Silly me, you don't need the parens there... just forgot to take them out after tinkering around.

Code:
my @parts = $num =~ /\d+/g;

K.I.S.S.

--jim
 
Thanks guys for your help....

I really need to capture the groups individually as $1, $2, $3 and $4.

Right now this is what I have:

Code:
$ExampleNumber = '(123) 123-1234 ext. 12345';

if ($ExampleNumber =~ m|[(]?(\d{3})[)][ ]?(\d{3}\d?)[-](\d{4})[ ][ext.][ ](\d{5})|g) {

$ExampleNumber1PFix1 = $1;
$ExampleNumber1PFix2 = $2;
$ExampleNumber1PFix3 = $3;
$ExampleNumber1PFix4 = $4;
}

print "Just to try \n";
print "$ExampleNumber1PFix1, $ExampleNumber1PFix2 , $ExampleNumber1PFix3 , $ExampleNumber1PFix4 \n";

Nebbish. I've tried your RegEx and it's not working, but syntax seems ok... weird..

Thanks Guys
 
Call me crazy, but I believe that either of the solutions presented to you solves your problem... are you still having issues?

And you are mistaken... you do not need to capture them in $n variables. That is an imagined constraint.
If you are concerned about seeing if the regex succeeded, you can just check what was returned... in this case, checking for the four tokens.
Code:
my @parts = $num =~ /\d+/g;
print( (@parts == 4) ? 'hoookayyy!' : 'yayyaaaa!' );  # outer parens so print gets the whole ?: expression

print "\nhawhhaaaat!\n";

# it's my code, and I can reference The Chapelle show in it if I want...

--jim
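For what it's worth, a match in list context returns the captured groups, so the four parts can go straight into named variables without touching $1 through $4. A minimal sketch, assuming the exact format from the original post; the variable names are made up for illustration:

```perl
use strict;
use warnings;

my $num = '(123) 234-3456 ext. 45678';

# In list context a successful match returns the captures.
# On failure it returns an empty list, so the "if" doubles as a check.
my ($area, $prefix, $line, $ext);
if (($area, $prefix, $line, $ext) =
        $num =~ /\((\d{3})\) (\d{3})-(\d{4}) ext\. (\d{5})/) {
    print "$area $prefix $line $ext\n";
}
else {
    print "no match\n";
}
```

The list assignment is evaluated in scalar context by the if, which yields the number of values returned by the match: four on success, zero on failure.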
 
Double checked the regexp I presented and it still seems ok. Don't know what the problem there is, but I'd say Coderifous's solution is significantly more elegant anyway.
 
Thanks Nebster. BTW: Your solution worked fine on my system. So I am in the same boat as you, and am not sure what his problem was... certainly it was not your code.

--jim
 
everyone gives me grief for regex's being used when not needed - for speed issues

$longPhone = "(123) 456-7890 ext. 12345";

$phoneNum = substr($longPhone, 1,3)." ".substr($longPhone,6,3)." ".substr($longPhone,10,4)." ".substr($longPhone,20,5);

print "phone number is : $phoneNum\n";



Kind Regards
Duncan
 
Duncan, Duncan, Duncan ... you KNOW this solution calls for a regex. Simply because it's risky to depend on the data to come in EXACTLY the same format every time. If it were computer generated data... then maybe, but how often are phone numbers computer generated? Maybe at the phone company's office and that's it. I have seen how bright and talented your solutions normally are and this my friend falls far from the bunch. I feel confident that you would not use this type of solution in your own system. If I ever get a job where I see this implemented in some code somewhere, I will know it was you, and I will track you down... mark my words Duncmeister, I will find you and make you fix it.

I should make you write a 1000 word essay on why this is a ridiculous suggestion, but I think you've learned your lesson here. If you do fall again though, I will have no choice but to write to Dave of the Tecumseh group and request that you be removed from the MVP list... I'm sorry Duncan, it's just a lot of responsibility being number one man. People expect a lot of you. I was number one for a while... but I couldn't take the pressure... I cracked man... don't be like me, you go be a hero...

Now hold your head high, get out there and dazzle some dorks with your perl skills!!

(And now you have officially been given grief for NOT using a regex ;)

--jim
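To make the point concrete, here's a rough sketch comparing the two approaches on a number that is off by a single space. The sub names and the second sample string are invented for illustration; the offsets are the ones from Duncan's post.

```perl
use strict;
use warnings;

# Fixed-offset extraction, same offsets as the substr() post.
sub by_substr {
    my ($p) = @_;
    return join ' ', substr($p, 1, 3), substr($p, 6, 3),
                     substr($p, 10, 4), substr($p, 20, 5);
}

# Digit-run extraction, same idea as the /\d+/g posts.
sub by_regex {
    my ($p) = @_;
    return join ' ', $p =~ /\d+/g;
}

my $clean  = '(123) 456-7890 ext. 12345';
my $sloppy = '(123)456-7890 ext. 12345';   # hypothetical: space after ")" missing

print by_substr($clean),  "\n";
print by_regex($clean),   "\n";
print by_substr($sloppy), "\n";   # offsets no longer line up
print by_regex($sloppy),  "\n";
```

Both agree on the clean input; only the regex version still returns the four parts intact once the spacing drifts.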
 
thanks for the grief jim!

but he did say...

"I have phone number in the following format:
(xxx) xxx-xxxx ext. xxxxx"


...didn't he!

(rhetorical question) ;-)


Kind Regards
Duncan
 
No fight guys. He He !

I have to agree that the solution proposed by Dunc is really good. But like Jim said, I think a Regex is more suited for this since the data is not filtered to be 100% consistent.

Just one last question because I'm really curious to find out the error in my initial Regex...
Code:
m|[(]?(\d{3})[)][ ]?(\d{3}\d?)[-](\d{4})[ ][a-z.][ ](\d{5})|g

I know it's caused by the [a-z.] but I can't figure out why. The first part is working properly: I can extract $1, $2 and $3 but not $4. I'm simply curious, even if I won't use this Regex.

Thanks Jim, Dunc & Nick... I really appreciate the help from you guys.


 
Code:
m|

[(]?         -> optional open brace ... \(?

(\d{3})      -> 3 digits

[)]          -> close brace / why not optional as earlier? ... \)?

[ ]?         -> optional space ... just (space - not in a class)?

(\d{3}\d?)   -> 3 digits then digit / optional ... (\d{3,4})

[-]          -> hyphen ... - (not in class)

(\d{4})      -> 4 digits

[ ]          -> space ... (space - not in a class)

[a-z.]       -> lowercase 'a' through 'z' AND full-stop, but only ONE such character - guess you need a +

[ ]          -> space ... (space - not in a class)

(\d{5})      -> 5 digits

|g


Kind Regards
Duncan
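A quick way to see it: a bracketed class stands for exactly one character, so [a-z.] on its own can only eat the 'e' of 'ext.' and the space that must follow never arrives. A small sketch using just the tail end of the pattern (the variable names are placeholders):

```perl
use strict;
use warnings;

my $num = '(123) 123-1234 ext. 12345';

# [a-z.] matches ONE character, so after "e" the pattern wants a space
# but finds "x": the match fails.
my $one  = $num =~ m|(\d{4})[ ][a-z.][ ](\d{5})|  ? 'match' : 'no match';

# [a-z.]+ matches one or more, which covers all of "ext.".
my $many = $num =~ m|(\d{4})[ ][a-z.]+[ ](\d{5})| ? 'match' : 'no match';

print "single class: $one\n";
print "with +:       $many\n";
```

The same reasoning applies to the [ext.] in the first post: it is a class of four possible characters, not the literal string "ext.".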
 
Hi Alphabin

This one works:-

Code:
m|
\(?        # (escaped) open paren - optional
(\d{3})    # 3 digits
\)?        # (escaped) close paren - optional
[ ]?       # space - optional (in a class so /x keeps it)
(\d{3,4})  # min of 3 / max of 4 digits
-          # hyphen
(\d{4})    # 4 digits
[ ]        # space
[a-z.]+    # 'a' through 'z' AND full-stop / one or more of these
[ ]        # space
(\d{5})    # 5 digits
|gx;


Kind Regards
Duncan
 
Hi Alphabin

this regex is very logical as you are only interested in the digits!

Code:
$phone =~ m|(\d{3})[^\d]+(\d{3,4})[^\d]+(\d{4})[^\d]+(\d{5})|g;

(\d{3})   -> 3 digits - fixed
[^\d]+    -> catch anything that isn't a digit
(\d{3,4}) -> min of 3 digits - max of 4
[^\d]+    -> catch anything that isn't a digit
(\d{4})   -> 4 digits - fixed
[^\d]+    -> catch anything that isn't a digit
(\d{5})   -> 5 digits - fixed


Kind Regards
Duncan
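Here's a small sketch of how that digits-only pattern copes when the punctuation changes. Only the first sample comes from this thread; the other two are hypothetical variants, and \D is used as shorthand for [^\d].

```perl
use strict;
use warnings;

my @samples = (
    '(123) 456-7890 ext. 12345',
    '123-456-7890 x12345',       # hypothetical variant
    '123.4567.7890 ext 12345',   # hypothetical variant, 4-digit middle part
);

my @results;
for my $phone (@samples) {
    # Same pattern as above with \D in place of [^\d].
    if (my @p = $phone =~ m|(\d{3})\D+(\d{3,4})\D+(\d{4})\D+(\d{5})|) {
        push @results, join('-', @p);
    }
    else {
        push @results, 'no match';
    }
}
print "$_\n" for @results;
```

One caveat: the pattern is not anchored, so it will also pick a number out of the middle of a longer digit run. If that matters, adding ^ and $ or \b around it is probably worth considering.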
 
Thank you very much Dunc ! It is really appreciated. You get a big *


Again, thank you to all of you
 
a bit better...

Code:
$line1 = "193    03989337  000060003+00060009.69003858";
$line2 = "195      989337  000060003+00060009.69003858";

@line1split = split (//, $line1);
@line2split = split (//, $line2);

$line1Length = @line1split;

print "number of characters: $line1Length\n\n";

$valuePerChar = 100 / $line1Length;

print "score per character match: $valuePerChar %\n\n";

for ($x=0; $x<=$#line1split; $x++) {
  
  if ($line1split[$x] eq $line2split[$x]) {
    $matched = $matched + $valuePerChar;
    print "$line1split[$x] | $line2split[$x] ... (match) $matched %\n"
  } else {
    print "$line1split[$x] | $line2split[$x] ...         $matched %\n"
  }
  
}

output...

Code:
number of characters: 44

score per character match: 2.27272727272727 %

1 | 1 ... (match) 2.27272727272727 %
9 | 9 ... (match) 4.54545454545455 %
3 | 5 ...         4.54545454545455 %
  |   ... (match) 6.81818181818182 %
  |   ... (match) 9.09090909090909 %
  |   ... (match) 11.3636363636364 %
  |   ...         11.3636363636364 %
0 |   ...         11.3636363636364 %
3 |   ...         11.3636363636364 %
9 | 9 ... (match) 13.6363636363636 %
8 | 8 ... (match) 15.9090909090909 %
9 | 9 ... (match) 18.1818181818182 %
3 | 3 ... (match) 20.4545454545455 %
3 | 3 ... (match) 22.7272727272727 %
7 | 7 ... (match) 25 %
  |   ... (match) 27.2727272727273 %
  |   ... (match) 29.5454545454546 %
0 | 0 ... (match) 31.8181818181818 %
0 | 0 ... (match) 34.0909090909091 %
0 | 0 ... (match) 36.3636363636364 %
0 | 0 ... (match) 38.6363636363636 %
6 | 6 ... (match) 40.9090909090909 %
0 | 0 ... (match) 43.1818181818182 %
0 | 0 ... (match) 45.4545454545455 %
0 | 0 ... (match) 47.7272727272727 %
3 | 3 ... (match) 50 %
+ | + ... (match) 52.2727272727273 %
0 | 0 ... (match) 54.5454545454546 %
0 | 0 ... (match) 56.8181818181818 %
0 | 0 ... (match) 59.0909090909091 %
6 | 6 ... (match) 61.3636363636364 %
0 | 0 ... (match) 63.6363636363636 %
0 | 0 ... (match) 65.9090909090909 %
0 | 0 ... (match) 68.1818181818182 %
9 | 9 ... (match) 70.4545454545455 %
. | . ... (match) 72.7272727272727 %
6 | 6 ... (match) 75 %
9 | 9 ... (match) 77.2727272727273 %
0 | 0 ... (match) 79.5454545454545 %
0 | 0 ... (match) 81.8181818181818 %
3 | 3 ... (match) 84.0909090909091 %
8 | 8 ... (match) 86.3636363636363 %
5 | 5 ... (match) 88.6363636363636 %
8 | 8 ... (match) 90.9090909090908 %


Kind Regards
Duncan
 
duncdude, I think you meant to post that last one in the thread titled "File Comparison Algorithm".
 