Stripping invalid characters from filenames

(OP)
Some years ago, I wrote a little function that strips invalid characters from filenames. Why would you need to do that? You might need the user to specify the name of an output file. Or you might want to generate a filename from some other string, such as a customer or product name. In both those cases, you have to be sure that the name doesn't contain question marks, colons, backslashes, or other characters that aren't allowed in filenames.

So here is my function. As you can see, it simply replaces each instance of an invalid character with an underscore.

CODE -->

FUNCTION StripInvalidChars

* Removes all invalid characters from a filename (excluding extension)
* and replaces them with underscores.

LPARAMETERS tcIn

LOCAL lcBadChars

lcBadChars = [<>:"/\|?*]

RETURN CHRTRAN(tcIn, lcBadChars, REPLICATE("_", LEN(lcBadChars))) 

Most of the time, this works well enough. But it's not perfect. One issue is that it doesn't distinguish plain filenames (stem plus extension) from full path designations (including drive and/or directories). It would be easy to modify it to retain colons and backslashes, but it would be harder to deal with those characters in the "wrong" place (e.g. more than one consecutive colon or backslash). There are a few other similar minor issues.
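As a rough illustration of the "retain colons and backslashes" idea, here is a minimal sketch. The function name StripInvalidCharsEx and the tlKeepPath parameter are made up for this example and are not part of the original function. It only keeps those two characters; it does nothing about colons or backslashes in the wrong place.

```foxpro
FUNCTION StripInvalidCharsEx

* Hypothetical variant of StripInvalidChars (not from the original post).
* If tlKeepPath is .T., colons and backslashes are left untouched;
* otherwise it behaves like the original function.

LPARAMETERS tcIn, tlKeepPath

LOCAL lcBadChars

* Characters that are never allowed, even in a full path
lcBadChars = [<>"/|?*]

IF NOT m.tlKeepPath
  * Plain filename: colon and backslash are invalid too
  lcBadChars = m.lcBadChars + [:\]
ENDIF

RETURN CHRTRAN(m.tcIn, m.lcBadChars, REPLICATE("_", LEN(m.lcBadChars)))

ENDFUNC
```

Called with .T. as the second argument, an input such as c::\\abc.dbf would pass through unchanged, which is exactly the "wrong place" problem described above.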

Then I discovered what looks like a much easier solution: a FoxTools function named CleanPath() which appears to do just what its name suggests.

To use it, you must first open the FoxTools library, like this:

SET LIBRARY TO (HOME(1) + "foxtools")


after which you can call CleanPath() just like any other function. You pass it the "raw" input string, and it returns the cleaned-up version. In this case, the invalid characters are completely removed (in my function, they are replaced by underscores).

However, this too is not perfect. Its biggest problem is that it doesn't recognise embedded spaces in filenames (probably because it was written in MS-DOS days). If a filename contains spaces, it simply removes them. On the other hand, it does seem to handle drives and paths correctly - including, for example, removing duplicate backslashes - at least in most cases.

I hope the above information will be useful for anyone who has this requirement. To help you decide between the two methods, here are the results of a quick comparison test of the two functions:

CODE

Input               Own function           CleanPath()       Comment (CP = CleanPath)

abc.dbf             abc.dbf                ABC.DBF           As expected
abc def.dbf         abc def.dbf            ABCDEF.DBF        CP removes space
abc?def.dbf         abc_def.dbf            ABCDEF.DBF        Both correctly remove ?
c:\abc.dbf          c__abc.dbf             C:\ABC.DBF        CP handles full path OK
c:\data\abc.dbf     c__data_abc.dbf        C:\DATA\ABC.DBF   ditto
c:\data\\abc.dbf    c__data__abc.dbf       C:\DATA\ABC.DBF   CP removes double backslash
c::\\abc.dbf        c____abc.dbf           ABC.DBF           CP loses (invalid) drive
(empty string)      (empty string)         (empty string)    As expected
?//*|               _____ (5 underscores)  (empty string)

I'd welcome your comments or suggestions re the above.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: Stripping invalid characters from filenames

The first special case that I can think of:
What about UNC paths starting with \\?

To answer myself

CODE

? CLEANPATH("\\server\share") && returns "\server\share" 

So the FoxTools implementation isn't perfect with paths either, although I'd take it as good enough for many cases. When asking the user for a result file name, I'd perhaps set a standard output directory under the documents folder of the user's profile, or let the user pick one via GETDIR(). That solves the path issue, so the function only has to deal with the file stem, and a simple CHRTRAN is good enough for that case.

As in FORCEEXT(ADDBS(basedir)+StripInvalidChars(stemname),"extension"), where you are already sure basedir is a valid output directory and "extension" is the required extension. CLEANPATH(FORCEEXT(ADDBS(basedir)+stemname,"extension")) would work too, although its imperfection regarding UNC paths means basedir shouldn't be a UNC path.
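Spelled out as a few lines of code, the base directory idea might look like the sketch below. The variable names, the sample stem and the "txt" extension are made up for illustration; StripInvalidChars() is Mike's function from the first post.

```foxpro
LOCAL lcBaseDir, lcStem, lcFile

* Let the user pick an existing folder; GETDIR() returns "" on cancel
lcBaseDir = GETDIR()

* Made-up raw stem containing characters that are invalid in filenames
lcStem = "report: Q1/Q2?"

IF NOT EMPTY(m.lcBaseDir)
  * Only the stem needs cleaning, because the folder came from GETDIR()
  lcFile = FORCEEXT(ADDBS(m.lcBaseDir) + StripInvalidChars(m.lcStem), "txt")
  ? m.lcFile
ENDIF
```

One thing to watch: if the stem itself contains a dot, FORCEEXT() treats everything after the last dot as an extension and replaces it.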

So all in all I don't see much benefit of using CLEANPATH.

Chriss

RE: Stripping invalid characters from filenames

I'd use JustStem() to pull out the filestem, clean that with your existing function, and then put the whole filepath back together using ForcePath() and ForceExt():

CODE -->

* Assuming original is in cFileWithPath
LOCAL cCleanPath, cExt, cPath, cStem
cExt = JustExt(m.cFileWithPath)
cPath = JustPath(m.cFileWithPath)
cStem = JustStem(m.cFileWithPath)

cCleanPath = ForceExt(ForcePath(StripInvalidChars(m.cStem), m.cPath), m.cExt)

Tamar

RE: Stripping invalid characters from filenames

Nice one Mike, Chris and Tamar!

Best Regards,
Scott
MSc ISM, MIET, MASHRAE, CDCAP, CDCP, CDCS, CDCE, CTDC, CTIA, ATS, ATD

"I try to be nice, but sometimes my mouth doesn't cooperate."

RE: Stripping invalid characters from filenames

(OP)
Chris and Tamar,

Thank you both for your comments. All good points.

When I wrote StripInvalidChars(), I only intended it to be used with the filename stem, so I had no reason to think about UNC paths or any other issues concerning drive and path designations. It was only when I started experimenting with CleanPath() that I thought about those cases. I hope that my little comparison chart will be of interest to anyone thinking of using either function.

My original function was part of an error-logging routine. This generated a text file containing certain error information. The filename stem was a concatenation of the user name, date and time. I had no control over the characters the user name could contain, hence the need to strip invalid characters.
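For anyone who wants to try the same thing, a log filename along those lines could be built roughly as follows. This is only a sketch and not the original routine: using SYS(0) as the source of the user name, the underscore separator and the ".log" extension are all assumptions made for the example.

```foxpro
LOCAL lcUser, lcStem, lcLogFile

* SYS(0) returns the machine name and the user name in one string
* (an assumption here; the original routine may have used another source)
lcUser = SYS(0)

* TTOC(..., 1) gives date and time as 14 digits (yyyymmddhhmmss)
lcStem = StripInvalidChars(m.lcUser + "_" + TTOC(DATETIME(), 1))

* Append the extension directly rather than with FORCEEXT(),
* because a user name containing a dot would confuse FORCEEXT()
lcLogFile = m.lcStem + ".log"

? m.lcLogFile
```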

Mike


RE: Stripping invalid characters from filenames

Tamar,

JustPath("\\server\\share\\filename.dbf") will still be an erroneous path.

Another validity check could be determining whether DIRECTORY(JUSTPATH(m.cCleanpath),1) is .t. before using it.
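In code, that check could look something like this. It is a sketch only: the sample path is made up, and lcCleanPath stands for whatever the cleaning step produced.

```foxpro
LOCAL lcCleanPath

* Made-up example value; in practice this is the cleaned-up full name
lcCleanPath = "c:\data\abc_def.dbf"

* The second argument 1 makes DIRECTORY() report hidden folders too
IF DIRECTORY(JUSTPATH(m.lcCleanPath), 1)
  ? "Folder exists: " + JUSTPATH(m.lcCleanPath)
ELSE
  ? "Folder missing or invalid: " + JUSTPATH(m.lcCleanPath)
ENDIF
```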

As far as I can see, neither CLEANPATH nor JUSTPATH nor a simple STRTRAN that changes double backslashes to single ones handles the start of a path correctly when UNC is involved. So if you want to get to the bottom of everything, I think you still need a function that takes care of UNC paths, and that perhaps also changes slashes to backslashes or makes similar corrections where you can assume that is what was meant.
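A function of that kind might start out like the sketch below. The name CollapseBackslashes is hypothetical, and the rules it applies (turn slashes into backslashes, keep exactly two leading backslashes if the path starts with two or more, collapse every other run to one) are assumptions about what is "meant", not anything built into VFP or FoxTools.

```foxpro
FUNCTION CollapseBackslashes

* Hypothetical helper: collapses repeated backslashes but
* preserves a leading "\\" so that UNC paths survive.

LPARAMETERS tcPath

LOCAL lcPath, lcPrefix

* Treat forward slashes as backslashes
lcPath = CHRTRAN(m.tcPath, "/", "\")
lcPrefix = ""

IF LEFT(m.lcPath, 2) == "\\"
  * UNC path: remember the prefix, then strip all leading backslashes
  lcPrefix = "\\"
  DO WHILE LEFT(m.lcPath, 1) == "\"
    lcPath = SUBSTR(m.lcPath, 2)
  ENDDO
ENDIF

* Collapse any remaining runs of backslashes to a single one
DO WHILE "\\" $ m.lcPath
  lcPath = STRTRAN(m.lcPath, "\\", "\")
ENDDO

RETURN m.lcPrefix + m.lcPath

ENDFUNC
```

With this, CollapseBackslashes("\\server\\share\\filename.dbf") should give \\server\share\filename.dbf, and the stem can then be cleaned separately as discussed above.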

There's JUSTDRIVE, which again doesn't care for UNC paths, but would strip off drive letter and colon, so it can be used to detect the normal drive letter paths.

I suggested taking a basedir from configuration, which you can trust because it has already been checked as an existing path. But I see why you suggest your solution: if the only erroneous part is expected in the stem name, then decomposing the full name, correcting the stem part and putting it all back together works fine. So does using the basedir.

Chriss
