regex to find invalid characters in filename

(OP)
I am trying to move files that have invalid characters out
of a directory, but the regex I am using is still moving
the good files that I want to keep in log_dir.

Good files can look like this:
bill-0001.log
BILL-0120-.log
Bill-A-1234-Nov.log

The problem is that those files are still being moved.
Can someone tell me what I am doing wrong with my regex?

Thanks in advance!

for FILENAMES in `ls log_dir`
do
          if [[ "$FILENAMES" == ^[a-zA-Z0-9.-_]+$ ]] ; then
                #do nothing file is good
                  :
          else
                #badfile name
                print "Found invalid file ${FILENAMES}"
                mv "${FILENAMES}" /tmp/
          fi
done
 

RE: regex to find invalid characters in filename

Hi

[a-zA-Z0-9.-_] means a single character that falls in one of these intervals:
  • a-z ( any of a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y and z )
  • A-Z ( any of A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y and Z )
  • 0-9 ( any of 0, 1, 2, 3, 4, 5, 6, 7, 8 and 9 )
  • .-_ ( any of ., /, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, :, ;, <, =, >, ?, @, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z, [, \, ], ^ and _ )
Are you sure this is what you want?

Feherke.
feherke.ga
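
To see the effect of that last interval, here is a minimal bash sketch. It assumes the C locale, where ranges follow ASCII order, and the helper name in_range is made up for illustration. The point it makes: the dash sorts just before the dot in ASCII, so it falls outside .-_ and names such as bill-0001.log cannot match, while characters like / and : are accepted.

```shell
#!/usr/bin/env bash
# Assumption: C locale, so bracket ranges follow ASCII byte order.
LC_ALL=C

# in_range is a hypothetical helper: succeeds if the single character $1
# matches the interval .-_ taken from the original character class.
in_range() {
    local re='^[.-_]$'
    [[ $1 =~ $re ]]
}

# Probe a few characters on both sides of the interval.
for ch in '.' '/' ':' '@' '_' '-' '#'; do
    if in_range "$ch"; then
        echo "'$ch' is inside .-_"
    else
        echo "'$ch' is outside .-_"
    fi
done
```

Keeping the regex in a variable avoids any surprises from the shell's own quoting rules on the right-hand side of =~.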

RE: regex to find invalid characters in filename

(OP)
Hello Feherke,

I want to evaluate the filename and, if it contains anything in any position
other than "-", ".", a digit 0-9, or a letter A-Z, mark the filename as invalid.

I thought the syntax for my regex would work, but I get mixed results.
I also tried escaping the "-":

if [[ "$NAMES" =~ ^[a-zA-Z0-9.\-_]+$ ]] ; then

So when my script runs, it should catch this as invalid because of the "#"
in the name:

Found invalid file bill01#pp.txt (which works)

But it fails here:

Found invalid file bill583-20151008104804-RETURN-GA1.txt (which should be a valid filename)

RE: regex to find invalid characters in filename

(OP)
Basically, I want filenames to follow the POSIX "Fully portable filenames" standard,
which lists these as acceptable in filenames: A–Z a–z 0–9 . _ -
Everything else should be invalid.
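
One way to express that rule without a regex at all is a negated bracket expression in a case pattern, which should work in any POSIX shell. This is only a sketch: the function name is_portable_name is invented for illustration, the C locale is assumed so that A-Z and a-z mean the ASCII letters only, and the dash is placed last in the class so that it is taken literally.

```shell
# Assumption: C locale, so A-Z and a-z cover the ASCII letters only.
LC_ALL=C

# is_portable_name is a hypothetical helper: succeeds only if $1 is
# non-empty and consists solely of A-Z a-z 0-9 . _ -
is_portable_name() {
    case $1 in
        '')                 return 1 ;;  # an empty name is rejected
        *[!A-Za-z0-9._-]*)  return 1 ;;  # some character is outside the set
        *)                  return 0 ;;
    esac
}

for name in bill-0001.log BILL-0120-.log 'bill01#pp.txt'; do
    if is_portable_name "$name"; then
        echo "ok:      $name"
    else
        echo "invalid: $name"
    fi
done
```

The check looks for a single bad character anywhere in the name, which is the same test as "every character is in the set" turned around.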

RE: regex to find invalid characters in filename

Hi

Well, escaping with a backslash ( \ ) does not work as usual in the shell's regular expressions.

However, the simple old trick works: move the dash ( - ) to the end of the character class so that it is not interpreted as part of an interval:

CODE

if [[ "$NAMES" =~ ^[a-zA-Z0-9._-]+$ ]] ; then 

Feherke.
feherke.ga
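
Putting that fix back into the original loop might look like the sketch below. It assumes bash and the C locale; the function name move_bad_names and its two directory arguments are made up for illustration. Two other things differ from the first post: it uses =~ rather than ==, because == inside [[ ]] does glob matching and not regex matching, and it globs the directory instead of parsing ls, so that mv receives a full path to each file.

```shell
#!/usr/bin/env bash
# Assumption: C locale, so a-z, A-Z and 0-9 are the ASCII ranges only.
LC_ALL=C

# move_bad_names SRC_DIR DEST_DIR  (hypothetical helper)
# Moves every entry in SRC_DIR whose name contains a character outside
# A-Z a-z 0-9 . _ - into DEST_DIR.
move_bad_names() {
    local src=$1 dest=$2 path name
    local re='^[a-zA-Z0-9._-]+$'      # dash last, so it is a literal

    for path in "$src"/*; do
        [[ -e $path ]] || continue    # glob did not expand: directory is empty
        name=${path##*/}              # strip the directory part
        if [[ $name =~ $re ]]; then
            :                         # good name, leave it alone
        else
            echo "Found invalid file $name"
            mv -- "$path" "$dest"/
        fi
    done
}

# Example call with the directories from the thread:
# move_bad_names log_dir /tmp
```

Like the ls version, the glob skips names that start with a dot.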
