regex to find invalid characters in filename

(OP)
I am trying to move files that have invalid characters out
of a directory, but the regex I am using is still moving
the good files that I want to keep in log_dir.

Valid files can be named like this:
bill-0001.log
BILL-0120-.log
Bill-A-1234-Nov.log

The problem is that those files are still being moved.
Can someone tell me what I am doing wrong with my regex?

Thanks in advance!

for FILENAMES in `ls log_dir`
do
    if [[ "$FILENAMES" == ^[a-zA-Z0-9.-_]+$ ]] ; then
        #do nothing file is good
        :
    else
        #badfile name
        print "Found invalid file ${FILENAMES}"
        mv "${FILENAMES}" /tmp/
    fi
done
 

RE: regex to find invalid characters in filename

Hi

[a-zA-Z0-9.-_] means a character that falls in one of these intervals:
  • a-z ( any of a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y and z )
  • A-Z ( any of A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y and Z )
  • 0-9 ( any of 0, 1, 2, 3, 4, 5, 6, 7, 8 and 9 )
  • .-_ ( any of ., /, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, :, ;, <, =, >, ?, @, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z, [, \, ], ^ and _ )
Are you sure this is what you want? Note that the last interval does not include the dash ( - ) itself.
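
To see the effect of that last interval, the class can be probed directly. This is only a sketch: it uses grep -E as a stand-in for the shell's own regex matching, forces LC_ALL=C so the range follows ASCII order, and the function name is_accepted and the sample names are made up for illustration.

```shell
#!/bin/sh
# Sketch: probe which names the original character class accepts.
# LC_ALL=C makes the range .-_ follow ASCII order (0x2E through 0x5F).
is_accepted() {
    printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-zA-Z0-9.-_]+$'
}

for name in 'bill0001.log' 'bill-0001.log' 'bill:0001.log' 'bill#01.log'; do
    if is_accepted "$name"; then
        echo "accepted: $name"
    else
        echo "rejected: $name"
    fi
done
```

Under these assumptions the hyphenated name should be rejected while the name containing a colon is accepted, because - (0x2D) sits just below the range that starts at . (0x2E).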

Feherke.
feherke.ga

RE: regex to find invalid characters in filename

(OP)
Hello Feherke,

I want to evaluate the filename and, if it contains anything in any position
other than "-", ".", a digit 0-9, or a letter A-Z, mark the filename as invalid.

I thought the syntax for my regex would work, but I get mixed results.
I also tried escaping the "-":

if [[ "$NAMES" =~ ^[a-zA-Z0-9.\-_]+$ ]] ; then

So when my script runs, it should catch this as invalid because of the "#"
in the name:

Found invalid file bill01#pp.txt (which works)

But it fails here:

Found invalid file bill583-20151008104804-RETURN-GA1.txt (which should be a valid filename)

RE: regex to find invalid characters in filename

(OP)
Basically, I want filenames to conform to the POSIX "fully portable filenames" standard,
which lists A–Z a–z 0–9 . _ - as acceptable in filenames. Everything else should be invalid.

RE: regex to find invalid characters in filename

Hi

Well, escaping with a backslash ( \ ) does not work as usual in the shell's regular expressions.
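
A likely reason, sketched here with grep -E rather than the shell's own matcher (which may treat the backslash differently): in a POSIX bracket expression a backslash is an ordinary character, so \-_ reads as the interval from \ to _. The function name and sample names are invented for illustration, and LC_ALL=C is assumed so the interval follows ASCII order.

```shell
#!/bin/sh
# Sketch: in a POSIX bracket expression a backslash is an ordinary
# character, so "\-_" is the range from "\" (0x5C) to "_" (0x5F).
matches_escaped_class() {
    printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-zA-Z0-9.\-_]+$'
}

for name in 'bill-GA1.txt' 'bill^GA1.txt' 'bill_GA1.txt'; do
    if matches_escaped_class "$name"; then
        echo "accepted: $name"
    else
        echo "rejected: $name"
    fi
done
```

With that reading, the dash is still missing from the class, which would explain why a name like bill583-20151008104804-RETURN-GA1.txt is reported as invalid.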

However, the old simple trick works: move the dash ( - ) to the end of the character class so that it is not interpreted as an interval:

CODE

if [[ "$NAMES" =~ ^[a-zA-Z0-9._-]+$ ]] ; then 
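
For reference, the corrected class can be dropped into a complete loop. What follows is a sketch under several assumptions, not the original poster's script: it checks names with grep -E so that it runs in any POSIX shell, it globs the directory instead of parsing ls output, and it moves files by their full path. The function names, the quarantine_dir directory, and the scratch directory created with mktemp -d are all hypothetical.

```shell
#!/bin/sh
# Sketch: move files whose names fall outside the POSIX portable
# filename character set (A-Z a-z 0-9 . _ -) into a quarantine directory.

# Returns 0 when the name uses only portable characters.
# The hyphen is last in the class, so it is a literal, not a range.
is_portable_name() {
    printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[A-Za-z0-9._-]+$'
}

# Hypothetical helper: move every badly named file from $1 to $2.
quarantine_bad_names() {
    src_dir=$1
    dest_dir=$2
    for path in "$src_dir"/*; do
        [ -f "$path" ] || continue      # skip non-files and unmatched globs
        name=${path##*/}                # strip the directory part
        if ! is_portable_name "$name"; then
            echo "Found invalid file $name"
            mv -- "$path" "$dest_dir"/
        fi
    done
}

# Demo in a scratch directory so nothing real is touched.
work=$(mktemp -d)
mkdir "$work/log_dir" "$work/quarantine_dir"
touch "$work/log_dir/bill-0001.log" \
      "$work/log_dir/bill583-20151008104804-RETURN-GA1.txt" \
      "$work/log_dir/bill01#pp.txt"
quarantine_bad_names "$work/log_dir" "$work/quarantine_dir"
```

Moving "$path" rather than the bare name matters when the script is not run from inside the log directory, since a bare name would be looked up in the current directory.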

Feherke.
feherke.ga
