Tek-Tips is the largest IT community on the Internet today!

Need help with regex

Status
Not open for further replies.

StickyBit

Technical User
Jan 4, 2002
264
CA
Folks,

I need to filter the following string from a file:

P9PF-JKUK-KLJI-KLNH-BHFT-MKNB-P (the string appears by itself on a line)

I'm using the following regex: if (/^\w{4}\-/). It does work, but it needs fine-tuning because I sometimes pick up garbage like the following:

font-family:Arial'><o:p>&nbsp;</o:p></span></font></p>

MIME-Version: 1.09133

I’m trying to use a regex that will find a string containing 4 alphanumeric characters followed by a dash '-' then 4 more alphanumeric characters etc. Can anyone help me?

thx.


 
How well defined is your string? Something like
Code:
/^\w{4}\-\w{4}\-\w{4}\-\w{4}\-\w{4}\-\w{4}\-\w$/
closely defines your pattern (and will therefore reject your examples of false positives).
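To see the difference on the sample lines from the question, here is a minimal sketch. The array names and the in-line sample list are only for illustration, and the dashes are left unescaped, which is equivalent to `\-` in this position.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample lines taken from the question: one real key, two false positives.
my @lines = (
    "P9PF-JKUK-KLJI-KLNH-BHFT-MKNB-P",
    "font-family:Arial'><o:p>&nbsp;</o:p></span></font></p>",
    "MIME-Version: 1.09133",
);

# Original test: only looks at the first five characters of the line.
my @loose_hits = grep { /^\w{4}-/ } @lines;

# Anchored test: the whole line has to fit the key layout.
my @exact_hits = grep { /^\w{4}-\w{4}-\w{4}-\w{4}-\w{4}-\w{4}-\w$/ } @lines;

print "loose pattern matched ", scalar(@loose_hits), " line(s)\n";
print "exact pattern matched ", scalar(@exact_hits), " line(s)\n";
```

The loose pattern accepts "font-" and "MIME-" because nothing after the first dash is checked. The trailing `$` in the anchored version is what rules them out.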
 
Code:
#!/usr/bin/perl

# Slurp mode: read the whole DATA section as one string.
undef $/;

$_ = <DATA>;

# Remove the key line; /m lets ^ match at the start of any line.
s|^[A-Z0-9]{3}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{1}\n||m;

print;

__DATA__
blah
blah blah
9PF-JKUK-KLJI-KLNH-BHFT-MKNB-P
blah blah
blah


Kind Regards
Duncan
 
Another suggestion:
Code:
#!perl
use strict;
use warnings;

/^([A-Z0-9]{4}-){6}[A-Z0-9]$/ && print while <DATA>;

__DATA__
blah
blah blah
P9PF-JKUK-KLJI-KLNH-BHFT-MKNB-P
blah blah
blah
The regex reads as "4 alphanumeric chars followed by a dash, repeated 6 times, followed by one alphanumeric char".
Dunc, you changed line 3 of the data slightly.
(You left off the first char, so your regex is a little different.)
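One difference between the suggestions so far is the character class: `\w` also accepts lower-case letters and the underscore, while `[A-Z0-9]` does not. The thread does not say whether such keys can occur, so the two extra candidates in this sketch are hypothetical and only there to show the difference.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $word_re  = qr/^(\w{4}-){6}\w$/;              # \w: letters, digits, underscore
my $upper_re = qr/^([A-Z0-9]{4}-){6}[A-Z0-9]$/;  # upper-case letters and digits only

my @candidates = (
    "P9PF-JKUK-KLJI-KLNH-BHFT-MKNB-P",   # the key from the question
    "p9pf-jkuk-klji-klnh-bhft-mknb-p",   # hypothetical lower-case variant
    "P9_F-JKUK-KLJI-KLNH-BHFT-MKNB-P",   # hypothetical variant with an underscore
);

my @word_ok  = grep { $_ =~ $word_re }  @candidates;
my @upper_ok = grep { $_ =~ $upper_re } @candidates;

print "word-class version accepts:  ", scalar(@word_ok),  " of ", scalar(@candidates), "\n";
print "upper-case version accepts:  ", scalar(@upper_ok), " of ", scalar(@candidates), "\n";
```

If the real keys are always upper-case letters and digits, the stricter class is the safer choice.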
 
Mike

You got me again!

Lucky someone's got their eye on me... (someone who knows their onions, fortunately!)

Thanks dude!

You've also - by omission - kindly pointed out that my regex could have been reduced to:-

s|[A-Z0-9]{3}-([A-Z0-9]{4}-){5}[A-Z0-9]\n||;


... but now it does not work with the beginning-of-line caret? Weird!

Cheers


Kind Regards
Duncan
 
Dunc, you need the /m modifier on the end of your re for it to work. With /m, ^ matches at the beginning of a line, anywhere within the multi-line string. Without /m, ^ matches only at the beginning of the entire string. Your modified re works if you use /m.
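A small sketch of that difference, using the shortened regex with the caret put back. The variable names and the three-line sample string are made up for the demonstration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A multi-line string, as you get after slurping a file with "undef $/".
my $text = "blah\n9PF-JKUK-KLJI-KLNH-BHFT-MKNB-P\nblah\n";

# Without /m, ^ can only match at the very start of $text.
(my $without_m = $text) =~ s|^[A-Z0-9]{3}-([A-Z0-9]{4}-){5}[A-Z0-9]\n||;

# With /m, ^ can also match right after any embedded newline.
(my $with_m = $text) =~ s|^[A-Z0-9]{3}-([A-Z0-9]{4}-){5}[A-Z0-9]\n||m;

print "without /m: ", ($without_m eq $text ? "unchanged" : "key line removed"), "\n";
print "with /m:    ", ($with_m    eq $text ? "unchanged" : "key line removed"), "\n";
```

Without /m the string starts with "blah", so the anchored pattern never gets a chance to match the key on the second line.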

Trying out your code, I see we had different interpretations of StickyBit's request to "filter the following string from a file": your idea of "filtering" was to remove the string; my idea was to return it. Both valid interpretations, given what info we had.
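For comparison, both readings of "filter" can be written with the same regex. This is only a sketch: the in-line list stands in for the file, and the names `$key_re`, `@extracted` and `@remaining` are invented here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $key_re = qr/^([A-Z0-9]{4}-){6}[A-Z0-9]$/;

# Stand-in for the lines of the file.
my @lines = ("blah", "blah blah", "P9PF-JKUK-KLJI-KLNH-BHFT-MKNB-P", "blah");

# "Filter" as extract: keep only the lines that look like a key.
my @extracted = grep { $_ =~ $key_re } @lines;

# "Filter" as remove: keep everything except those lines.
my @remaining = grep { $_ !~ $key_re } @lines;

print "extracted: @extracted\n";
print "remaining: ", scalar(@remaining), " line(s)\n";
```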
 
Thank you - it's always good to learn another trick! Gotta keep learning those regexes - an awesomely powerful tool, I'm sure you'll agree.

And I agree - filter is a bit ambiguous - but again I think you're being modest... he mentioned that he was finding other rubbish - so I guess your regex that returned the match is correct :)


Kind Regards
Duncan
 
I guess filter was a little ambiguous; I should have used the word extract.

Thanks again.

StickyBit.
 