Perl Pattern Matching 1

Status
Not open for further replies.

Guest_imported

New member
Joined
Jan 1, 1970
Messages
0
Hi!

Could someone tell me how to match a pattern of either "," or an 8-digit number, e.g. "12345678"? The lines of the file I am trying to match start with either of these, and I want to be able to extract the data when either of the two is encountered as the first field of a line in that file.
Please advise. Thanks!

Regards,
Shen
 
$file =~ /^(\d{8})|^,/;

Pretty sure that'll do the trick.

If I'm correct, it will match the beginning of the string followed by 8 digits OR the beginning of the string followed by a comma.

Celia
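For what it's worth, here is a minimal sketch of how that pattern might be used to filter lines. The sub name `starts_with_key` and the sample lines are made up for illustration; in a real script the lines would come from the file.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the line starts with 8 digits or with a comma.
sub starts_with_key {
    my ($line) = @_;
    return $line =~ /^(\d{8})|^,/ ? 1 : 0;
}

# Made-up sample lines standing in for the contents of the file.
my @lines = ( ",foo,bar", "12345678,foo", "abc,12345678", "1234567,short" );

# Keep only the lines whose first characters are 8 digits or a comma.
my @kept = grep { starts_with_key($_) } @lines;
print "$_\n" for @kept;
```

Only the first two sample lines should be kept: the third has its digits in the wrong place and the fourth has only 7 digits before the comma.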
 
Barbie,

"The lines of the file I am trying to match *start* with either of these"

<grin>

Mike
"Experience is the comb that Nature gives us after we are bald."

Is that a haiku?
I never could get the hang
of writing those things.
 
Mike,

That's what

$file =~ /^(\d{8}|,)/;

does.

Try:

my @array = ( ",line1",
              "12345678line2",
              "line3",
            );

foreach (@array) {
    print "matched - $_\n" if (/^(\d{8}|,)/);
}


;)

Barbie. Leader of Birmingham Perl Mongers
 
Aren't the parens superfluous?

/^\d{8}|,/

;-) 'hope this helps

If you are new to Tek-Tips, please use descriptive titles, check the FAQs, and beware the evil typo.
 
Yup, in this particular case the parens ARE superfluous. The | will or the previous "thing", and since in this case you only want to or one "thing", you don't really need the parens. However, it doesn't hurt, and makes things look clearer.

Tracy Dryden
tracy@bydisn.com

Meddle not in the affairs of dragons,
For you are crunchy, and good with mustard. [dragon]
 
Ok....

I see that /^\d{8}|,/ will match a line starting with 8 digits or a line containing a comma.

But I thought the requirement was to match a line starting with 8 digits or a line *starting* with a comma.

Am I reading the OP wrong?

I would have written /^\d{8}|^,/

Mike
 
I just tested it, and Mike is correct! Without the ^ in front of the comma, it will match a comma ANYWHERE in the string. Another way to make it work correctly is to put everything after the ^ in parens (also tested):
Code:
/^(\d{8}|,)/
Mike gets a star!
Tracy Dryden
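A quick way to see the difference for yourself is to run the three patterns from this thread against a string that has a comma in the middle. This is only a sketch: the hash keys and the test string are made up for illustration.

```perl
use strict;
use warnings;

# The three patterns discussed in the thread.
my %patterns = (
    no_group   => qr/^\d{8}|,/,     # (8 digits at the start) OR (a comma anywhere)
    two_carets => qr/^\d{8}|^,/,    # anchored on both sides of the |
    grouped    => qr/^(\d{8}|,)/,   # one anchor applied to the whole group
);

# Made-up test string: a comma in the middle, nothing special at the start.
my $line = "abc,def";

for my $name (sort keys %patterns) {
    my $result = $line =~ $patterns{$name} ? "matches" : "no match";
    print "$name: $result\n";
}
```

Only no_group should report a match, because the | has lower precedence than the ^ anchor and so the anchor applies to the left-hand alternative only.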
 
<puzzled> of course I'm correct...

Mind you, I didn't know about the \d{8} trick; I would actually have written /^(\d\d\d\d\d\d\d\d|,)/ which would have been a bit naff really.

Mike
 
Tracy: /^(\d{8}|,)/

Which is what I said in the first place. I should have made it clearer why the parentheses were there, but when you've been doing regexes for so long you forget to explain the reasoning sometimes.

The parentheses are there to group sets of items; they don't just match a string and store it in $1, $2 etc., which is a nice byproduct sometimes.

If it was based on single characters you could have written:

/^[\d,]/

which will match a digit or a comma at the beginning of a line, but the [] only matches a single character.

HTH,
Barbie. Leader of Birmingham Perl Mongers
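To illustrate the grouping point, here is a small sketch with two made-up helpers: one uses plain parentheses and reads the capture from $1, the other uses (?:...), which groups in the same way but does not capture anything.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the leading key (8 digits or a comma), or undef.
sub leading_key {
    my ($line) = @_;
    return $line =~ /^(\d{8}|,)/ ? $1 : undef;
}

# Same test with (?:...), which groups the alternatives without setting $1.
sub has_leading_key {
    my ($line) = @_;
    return $line =~ /^(?:\d{8}|,)/ ? 1 : 0;
}

print leading_key("12345678line2"), "\n";   # prints the captured digits
print has_leading_key("line3"), "\n";       # prints 0, no key at the start
```

If you only need a yes/no answer, the non-capturing form says so more plainly; if you want the matched key itself, keep the ordinary parentheses.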
 
Mike, I'm surprised you didn't know about the repetition operator {n,m}. It's very useful. n is the minimum number of times, m is the maximum. {n} means exactly that number. {n,} means at least that many. {,m} means at most that many (Perl only accepts this form from version 5.34 on; in older versions write {0,m}). You can use it on multiple "things" by putting the group of things in parens. Here's a regex for an IP address:
Code:
/^(\d{1,3}\.){3}\d{1,3}$/
matches 1 to 3 digits and a period, three times, and then 1 to 3 more digits. It only checks the shape, not that each number is 255 or less.

Tracy Dryden
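Here is a rough sketch of the different quantifier forms side by side. The hash keys and test strings are made up for illustration, and "at most" is written as {0,3} on the assumption that the script may run on an older Perl.

```perl
use strict;
use warnings;

# Each pattern must match the whole string, so the counts are easy to compare.
my %digits = (
    exactly_3  => qr/^\d{3}$/,
    at_least_3 => qr/^\d{3,}$/,
    two_to_4   => qr/^\d{2,4}$/,
    at_most_3  => qr/^\d{0,3}$/,   # {0,3} is accepted by every Perl 5 version
);

# Made-up test strings of 2, 3 and 5 digits.
for my $s ("12", "123", "12345") {
    my @hits = grep { $s =~ $digits{$_} } sort keys %digits;
    print "$s: @hits\n";
}
```

Each output line lists which of the four patterns accepted that string, which makes it easy to check your understanding of the minimum and maximum counts.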
 
Yep, so you did, Barbie, my mistake.

Mike
 
Yep, Barbie had it correct.

For my approach to have worked, it needed another ^ before the comma.

As always, TMTOWTDI ;-)

(I love the conversations on this site.)
 