Help with expressions

Status
Not open for further replies.

Supra

Programmer
Dec 6, 2000
422
US
This should be simple for the more advanced Perl users, but I need to write an expression that removes all characters from a string except letters, numbers and underscores. Could someone give an example of how this would be done? Also, could you explain exactly what's going on so I can learn from it? Thanks in advance to whoever has the patience to explain pattern matching to me :)
 
Code:
my $foo = q~this is !@#$ what's left over_ *&^~;
$foo =~ tr/a-zA-Z0-9_//cd;
print $foo;


The translation operator (tr///) is used to change individual characters in a string/variable.

Options for the Translation Operator

c This option complements the match character list. In other words, the translation is done for every character that does not match the character list.

d This option deletes any character in the match list that does not have a corresponding character in the replacement list.

s This option reduces repeated instances of matched characters to a single instance of that character.
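
To see how the options combine, here is a minimal sketch using the same sample string. The variable names are only illustrative, and the dash replacement is just one choice of replacement character.

```perl
use strict;
use warnings;

my $text = q~this is !@#$ what's left over_ *&^~;

# c + d: complement the list, then delete everything in the complement
# (there is no replacement list, so every non-listed character is dropped).
(my $kept = $text) =~ tr/a-zA-Z0-9_//cd;

# c + s: complement the list, turn each non-listed character into '-',
# then squeeze runs of '-' down to a single one.
(my $dashed = $text) =~ tr/a-zA-Z0-9_/-/cs;

# No options and an empty replacement list: nothing is changed,
# but tr/// still returns how many characters matched.
my $count = ($text =~ tr/a-zA-Z0-9_//);

print "$kept\n";    # thisiswhatsleftover_
print "$dashed\n";  # this-is-what-s-left-over_-
print "$count\n";   # 20
```

The (my $copy = $original) =~ tr/.../.../ form works on a copy, so $text itself is left untouched.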
 
and another way:

Code:
my $foo = q~this is !@#$ what's left over_ *&^~;
$foo =~ s/[^\w]//g;
print $foo;

s/// is the substitution operator.

\w is the predefined word-character class; for ASCII text it matches a-zA-Z0-9_

[^\w] is a negated character class, meaning anything that does not match a-zA-Z0-9_

g This option replaces all occurrences of the pattern in the string, not just the first one.

Since the replacement is empty: s/[^\w]/[no replacement]/g; all non-word characters are removed.
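
A small sketch of what the g option changes, again with the sample string from above (the variable names are made up for the example):

```perl
use strict;
use warnings;

my $original = q~this is !@#$ what's left over_ *&^~;

# Without g, only the first non-word character (the first space) is removed.
(my $first_only = $original) =~ s/[^\w]//;

# With g, every non-word character is removed.
(my $all = $original) =~ s/[^\w]//g;

print "$first_only\n";  # thisis !@#$ what's left over_ *&^
print "$all\n";         # thisiswhatsleftover_
```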
 
Kevin, are there many character classes, and where do you obtain a list of them?

I take it ^ = not matching and \w = the class?

cheers 1DMF

"In complete darkness we are all the same, only our knowledge and wisdom separates us, don't let your eyes deceive you."
 
There's a section in perlretut called "Using character classes"; the predefined character classes are all described there.
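
As a quick taste of the three most common ones (\d for digits, \w for word characters, \s for whitespace), here is a rough sketch. The sample string is invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data, not from any real system.
my $sample = "Order 66, shipped_on\t2006-07-04!";

my @digits = $sample =~ /\d+/g;  # runs of digits
my @words  = $sample =~ /\w+/g;  # runs of letters, digits and underscores
my @gaps   = $sample =~ /\s+/g;  # runs of whitespace (spaces, tabs, newlines)

print "@digits\n";            # 66 2006 07 04
print scalar(@words), "\n";   # 6
print scalar(@gaps), "\n";    # 3
```

Each of these has an uppercase counterpart (\D, \W, \S) that matches the opposite set.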

Incidentally, it's easier to use \W as the negation of \w rather than [^\w]
 
Where do I get perlretut? Is there an online version? Remember, I don't have Perl installed ;-)
 
Thanks, is there a big deal using ^ (not) rather than the negated (UPPERCASE) class?

Does it make any difference which you use? Or is it just good practice to use character classes and negated character classes?
 
That was EXTREMELY helpful Kevin!! Thanks a lot for your help - I just wish I could give more than 1 star!!
 
That was EXTREMELY helpful Kevin!! Thanks a lot for your help - I just wish I could give more than 1 star!!

You're welcome, thanks for the star, and thanks to whoever else added the second star. :)
 
thanks, is there a big deal using ^ (not) rather than the negated (UPPERCASE) class.
Readability. That's it really. They both do the same thing, but when there is a predefined character class for what I'm trying to do, I prefer to use it. It prevents people looking at your code from having to take the time to figure out what you mean by [^\w] when you could've just written \W.
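
A short sketch to back that up, using the sample string from earlier in the thread (variable names are illustrative). The last substitution shows one case where the bracketed form still earns its keep: when you want to negate more than one class at once.

```perl
use strict;
use warnings;

my $input = q~this is !@#$ what's left over_ *&^~;

# These two produce the same result.
(my $with_bracket = $input) =~ s/[^\w]//g;
(my $with_upper   = $input) =~ s/\W//g;

# Brackets are still needed to combine classes:
# remove everything that is neither a word character nor whitespace.
(my $keep_spaces = $input) =~ s/[^\w\s]//g;

print "same\n" if $with_bracket eq $with_upper;  # same
print "[$keep_spaces]\n";  # [this is  whats left over_ ]
```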
 
gotcha, cheers guys, stars all round :)
 