Reg Exp - legal filenames

Status
Not open for further replies.

adelante

Programmer
Joined
May 26, 2005
Messages
82
Location
DK
How do I either check for illegal characters in a string, or remove them? (By legal I mean characters that can be used in filenames on both Unix and Windows, and in MySQL.)

I have made a script to upload files and create folders, and people should be able to name and rename the files and folders.

However if they write something like this:
"my new/seminew folder"
"pictures/images from the weekend"

An error should be returned, so as not to mess up the sync between MySQL and the actual content of their folder.

I don't know how to check for illegal characters. Does anyone have this line of code?
 
Allowing visitors to create folders and specify file names sounds like a security nightmare.
I would have thought that specifying the characters which are allowed would be better than trying to find all the dangerous ones.

Keith
 
A simple solution is to remove everything that isn't a letter, number, dot, or a space.

Code:
$var =~ s/[^A-Za-z0-9\. ]//g;

Or if you just want to see whether it contains any character outside that set, so you can throw an error message:

Code:
if ($var =~ /[^A-Za-z0-9\. ]/) {
   die "invalid file name";
}
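
For an upload script it may be more convenient to wrap the check in a small function than to die inline. The following is only a sketch: the name is_legal_name is made up for illustration, and the whitelist (letters, digits, dot, space, underscore, dash) is an assumption that is slightly wider than the class above. It also rejects the reserved names "." and "..", which pass a pure character check.

```perl
use strict;
use warnings;

# Hypothetical helper: true only if every character is on the whitelist.
sub is_legal_name {
    my ($name) = @_;
    return 0 unless defined $name && length $name;   # empty names are rejected
    return 0 if $name eq '.' || $name eq '..';       # reserved directory names
    # \A and \z anchor the match to the whole string; the dash is last so it is literal.
    return $name =~ /\A[A-Za-z0-9. _-]+\z/ ? 1 : 0;
}

for my $name ('holiday pictures', 'my new/seminew folder', '..') {
    printf "%-25s %s\n", $name, is_legal_name($name) ? 'ok' : 'invalid file name';
}
```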
 
Almost the same as Kirsle's suggestion, but using the tr/// operator instead:

Code:
$var =~ tr/A-Za-z0-9._-//cd;

If you need to use variables for the search or replacement lists, go with s/// instead of tr///, because tr/// does not interpolate variables.
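
A minimal sketch of the variable case, assuming the allowed set is kept in a scalar (the names $allowed and strip_disallowed are made up for illustration). The variable is interpolated into the character class of s///, which tr/// cannot do because its lists are built at compile time.

```perl
use strict;
use warnings;

# The allowed set lives in a variable; the dash is last so it stays literal.
my $allowed = 'A-Za-z0-9._-';

# Hypothetical helper: returns a copy with every disallowed character removed.
sub strip_disallowed {
    my ($name) = @_;
    (my $clean = $name) =~ s/[^${allowed}]//g;
    return $clean;
}

print strip_disallowed('pictures/images from the weekend'), "\n";
# prints: picturesimagesfromtheweekend
```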


 
$var =~ tr/A-Za-z0-9._-//cd;
$var =~ /[^A-Za-z0-9\. ]/

*Crying!!* I don't understand the last part of either of yours. However, while waiting for suggestions, I tried something very clumsy myself that seems to work, though.

Code:
$filename =~ s/\w//ig;
$filename =~ s/[@|™|®|©|·|&|+|-| ]//ig;

if($filename){
  die "invalid file name";
  }
... I remove all the characters that I know are legal (or think I know are legal?), and if something is left in the string, it must be an illegal name...

E.g. the letter Þ is left over... so now Icelandic people can't rename their folders. Hmm? Oh well, I will try both of yours.
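
One possible way to let letters such as Þ through is to whitelist Unicode properties instead of A-Za-z. This is only a sketch with several assumptions: the name is_legal_unicode_name is made up, the input must already be decoded into Perl characters (not raw bytes), and whether the filesystem and MySQL column actually store such names correctly is a separate question that this check does not answer.

```perl
use strict;
use warnings;
use utf8;   # this source file contains a literal Þ

# Hypothetical helper: letters and digits from any script, plus dot, space, underscore, dash.
sub is_legal_unicode_name {
    my ($name) = @_;
    # \p{L} matches any Unicode letter, \p{N} any Unicode number.
    return $name =~ /\A[\p{L}\p{N}. _-]+\z/ ? 1 : 0;
}

print is_legal_unicode_name('Þingvellir 2005') ? "ok\n" : "invalid file name\n";
print is_legal_unicode_name('a/b')             ? "ok\n" : "invalid file name\n";
```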

I just don't understand regular expressions! :( Is there a really good tutorial on them somewhere? I get the basics, but not more complicated stuff like yours... I have read probably 20 tutorials explaining the basics.

Thanks a lot for your attention. Case closed, more or less.
 
These both do the same thing:

$var =~ tr/A-Za-z0-9._-//cd;
$var =~ s/[^A-Za-z0-9\. ]//g;

They remove any characters that are not in the search pattern:

A-Z (A thru Z all upper-case)
a-z (a thru z all lower-case)
0-9 (zero thru nine)
. (dot)
_ (underscore)
- (dash)

Mine uses the tr/// (transliteration) operator with the 'c' and 'd' options on the end.

'c' means all characters NOT in the search pattern.

'd' means delete every matched character that has no counterpart in the replacement list. There is no replacement list in my expression, so every character not found in the search list is deleted from the string.
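
To see the two options separately, here is a small sketch (the variable names are made up for illustration). With 'c' alone and an empty replacement list, tr/// leaves the string untouched and only returns how many characters fall outside the search list, which is handy for a check. Adding 'd' deletes those same characters.

```perl
use strict;
use warnings;

my $name = 'my new/seminew folder';

# 'c' only: nothing is changed, the return value is the count of
# characters that are NOT in the search list.
my $bad = ($name =~ tr/A-Za-z0-9._-//c);

# 'c' and 'd': the same characters are deleted, here from a copy.
(my $clean = $name) =~ tr/A-Za-z0-9._-//cd;

print "$bad disallowed character(s)\n";   # 3 (two spaces and a slash)
print "$clean\n";                         # mynewseminewfolder
```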

Kirsle's does the same thing but just a little bit differently. These really are not complex regexps. They are maybe a little past beginner level, but not very much. If you have read 20 regexp tutorials you should understand them.

 
Actually, they don't do quite the same thing:

$var =~ tr/A-Za-z0-9._-//cd;
$var =~ s/[^A-Za-z0-9\. ]//g;


The second one keeps spaces in the variable but removes dashes and underscores; the first one (mine) removes spaces but keeps dashes and underscores.
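
A quick way to see the difference is to run both on the same string. The sample name below is made up for illustration.

```perl
use strict;
use warnings;

my $input = 'my new-folder_v2.txt';

# tr/// version: keeps dot, underscore and dash, drops the space.
(my $by_tr = $input) =~ tr/A-Za-z0-9._-//cd;

# s/// version: keeps dot and space, drops the dash and underscore.
(my $by_s = $input) =~ s/[^A-Za-z0-9\. ]//g;

print "tr version: $by_tr\n";   # mynew-folder_v2.txt
print "s version:  $by_s\n";    # my newfolderv2.txt
```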
 
Quick note on this:
Code:
$filename =~ s/[@|™|®|©|·|&|+|-| ]//ig;

When you're using square brackets to define a character class, the pipe ("|") character is just an ordinary character, so you don't put it in to mean OR. The character class will already match any of the characters contained in it.
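
A small sketch that shows the pipe being matched as a plain character. The two helper names are made up for illustration.

```perl
use strict;
use warnings;

# Inside [...] the pipe is just one more member of the class.
sub in_class_with_pipe { return $_[0] =~ /[a|b]/ ? 1 : 0 }
sub in_class_plain     { return $_[0] =~ /[ab]/  ? 1 : 0 }

for my $char ('a', 'b', '|', 'c') {
    printf "'%s'  [a|b]: %s   [ab]: %s\n",
        $char,
        in_class_with_pipe($char) ? 'match' : 'no match',
        in_class_plain($char)     ? 'match' : 'no match';
}
# Only the '|' row differs: [a|b] matches it, [ab] does not.
```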
 
So this would do the same, with the dash moved to the end so that it is not read as a range:

$filename =~ s/[@™®©·&+ -]//ig;
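
One more detail about character classes: a dash is only a literal dash when it comes first or last in the class (or is escaped). Between two characters it defines a range, and a range that runs backwards, such as a plus sign followed by a dash and a space, is rejected by Perl as invalid. A minimal sketch, with made-up helper names:

```perl
use strict;
use warnings;

# Dash between '+' and '9': a range covering '+', ',', '-', '.', '/', and '0' to '9'.
sub in_range_class   { return $_[0] =~ /[+-9]/ ? 1 : 0 }

# Dash last: a literal dash, so only '+', '9' and '-' match.
sub in_literal_class { return $_[0] =~ /[+9-]/ ? 1 : 0 }

for my $char ('5', '-', ',') {
    printf "'%s'  [+-9]: %s   [+9-]: %s\n",
        $char,
        in_range_class($char)   ? 'match' : 'no match',
        in_literal_class($char) ? 'match' : 'no match';
}
```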

... I think regexps are so confusing. Oh well...

Thanks a lot, everyone! :)
 