File Sorting Script

Status
Not open for further replies.

SegFaultAX

Programmer
Oct 12, 2007
2
US
Problem: I have a bunch of files with assorted extensions in a "dump" folder. I want to sort the files into directories according to their extensions. Assume that the extension will ALWAYS follow the last period. If there is no period present, the file can either be ignored or moved into some other directory. Each extension directory will be a direct child of the dump dir. If a file is found that has an extension with no corresponding directory, make the directory, then move the file. Under some circumstances there may be duplicate file names, in which case something like _1 or _2 can be appended to the name.

Question: How difficult would a script like this be to write? Are there any limitations in Perl that would make some aspect of this impossible? Can anyone provide examples that would at least give a skeleton for creating a script like this? Any other suggestions or comments?

Thanks in advance!
Cheers, MK
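
As a rough skeleton for the problem described above, the following is a minimal sketch rather than a finished script. The helper name sort_by_extension and the "No_Ext" directory name are assumptions made for illustration, and files whose name is already taken in the target directory are simply skipped instead of being renamed.

```perl
use strict;
use warnings;
use File::Copy qw(move);

# Hypothetical helper: sort the plain files in $dump into child directories
# named after their extensions. Returns the number of files moved.
sub sort_by_extension {
    my ($dump) = @_;
    opendir(my $dh, $dump) or die "Cannot open $dump: $!";
    # Keep plain files only; this also drops ".", ".." and subdirectories.
    my @entries = grep { -f "$dump/$_" } readdir($dh);
    closedir($dh);

    my $moved = 0;
    for my $file (@entries) {
        # The extension is whatever follows the last period; names with no
        # period, or a trailing one, go to the assumed "No_Ext" directory.
        my $ext = $file =~ /\.([^.]+)$/ ? $1 : 'No_Ext';
        my $target_dir = "$dump/$ext";
        unless (-d $target_dir) {
            mkdir($target_dir) or die "Cannot create $target_dir: $!";
        }
        # Skip rather than overwrite when the name is already taken.
        next if -e "$target_dir/$file";
        move("$dump/$file", "$target_dir/$file")
            or die "Cannot move $file: $!";
        $moved++;
    }
    return $moved;
}
```

Calling sort_by_extension on the dump folder path and printing the return value gives the number of files sorted. Note that under this rule a name such as ".File" is treated as having the extension "File".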
 
Works for me, Chris. It is not overwriting the files. If you are checking via FTP, you may need to refresh the view to get a new listing from the server after running the script.

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
Hmm, that's strange. I'm not checking via FTP at the moment; I'm using a webspace explorer system (because my FTP client is old and doesn't let me change permissions).

Nothing is getting overwritten. The script runs through but only processes some of the files. I check the webspace explorer, and only those files have been processed. I then have to run the script again to process another chunk of files, check the webspace explorer again, and so on. I have to keep running the script until it says 0 files have been processed to be sure every file has been handled. No idea why it's doing this for me.

I am talking about my script, yeah? Your script works beautifully.

Chris
 
Your script does nothing when I try it. The files sit in the "dump" folder, no new folders are created and no files are moved.

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
Okay, maybe I hadn't posted a corrected version :s...

Code:
#############################################################################################
## Tek-tips.com Topic Response                                                             ##
## http://tek-tips.com/viewthread.cfm?qid=1416792&page=1                                   ##
##                                                                                         ##
## - Sort files with different extensions into corresponding directories.                  ##
## - The corresponding directories are children of the files folder.                       ##
## - If a file has no extension (i.e. file OR file.) then place into the No_Ext directory. ##
## - Create the extension directories automatically, unless one already exists.            ##
## - If a file name already exists, then rename it with a unique new name.                 ##
#############################################################################################

#############################################################################################
## Please Note:                                                                            ##
##                                                                                         ##
## If your file has no extension but contains .'s (i.e. "My.File") then the string after   ##
## the last . will be counted as the extension (i.e. the file will be called "My" and its  ##
## extension will be "File" (or ".File")).                                                 ##
#############################################################################################

#############################################################################################
## Problems:                                                                               ##
##                                                                                         ##
## - Not all files are processed together, instead one or a few at a time. Need to find    ##
##   out why this is happening. For now, you must keep refreshing until message "0 files   ##
##   have been sorted" appears.                                                            ##
## - Haven't accounted for files i.e. ".File", therefore it's moved into an extension      ##
##   folder named "File". If the file exists then it is renamed i.e. _001122001122.File.   ##
## - I could use a number to rename files i.e. "File_n.txt" It takes the _n from the file  ##
##   that already exists and adds 1 to it, to create the next number                       ##
## - If I processed 2 files with the same name during one session, then it is likely the   ##
##   script will take less than a second to process the files, therefore they will be      ##
##   given the same time/date stamp.                                                       ##
#############################################################################################

###############
#! /usr/bin/perl
use strict;
use CGI ':standard';
###############

###############
##### DECLARE VARIABLES
my ($time_date_stamp, $path, $no_ext_dump, $fullstop, $underslash, @directory_contents, $counter, $file_name, @split_string, $last_fullstop, @split_name_ext, $name, $ext);
##### GET TIME/DATE STAMP
my ($sec, $min, $hour, $mday, $mon, $year)=gmtime;
##### gmtime RETURNS THE MONTH AS 0-11 AND THE YEAR AS YEARS SINCE 1900, SO ADJUST BOTH
$time_date_stamp = sprintf ('%02d%02d%02d%02d%02d%02d', $hour, $min, $sec, $mday, $mon + 1, $year % 100);
##### DECLARE CONTENT TYPE
print "Content-type: text/html\n\n";
##### INITIAL VALUES
$path = "/ChrisMassey.co.uk/Perl/Scripts/CleanDirectory/Files/";
$no_ext_dump = "No_Ext";
$fullstop = ".";
$underslash = "_";
##### START PROCESS COUNTER ON 0
$counter = 0;
##### OPEN PATH DIRECTORY AND GET CONTENTS
opendir (FILE, "$path") || die "Cannot Open: $path";
@directory_contents = grep {!/^\.{1,2}$/} readdir (FILE);
closedir (FILE);
###############

###############
##### MAIN LOOP
foreach (@directory_contents) {
     if (($_ !~ /\./) || ($_ =~ /\.$/) || (($_ =~ /\.$/) && ($_ =~ /\.+/))) {
          unless (-d "$path$_") {
               unless (-e "$path$no_ext_dump") {
                    mkdir ("$path$no_ext_dump", 0777) || die "Cannot Open: $path$no_ext_dump";
               }
               $file_name = $_;
               if (-e "$path$no_ext_dump/$_") {
                    @split_string = split(//, $_);
                    if ($split_string[-1] eq ".") {
                         ##### pop REMOVES THE TRAILING "." ELEMENT (chop ON A LIST WOULD EMPTY EVERY ELEMENT)
                         pop (@split_string);
                         $_ = join ("", @split_string);
                         $last_fullstop = ".";
                    }
                    else {
                         $last_fullstop = "";
                    }
                    $_ = $_ . $underslash . $time_date_stamp . $last_fullstop;
               }
               rename ("$path$file_name", "$path$no_ext_dump/$_") || die "Cannot Open: $path$_ - $path$no_ext_dump/$_";
               print "<p>$_ has been moved to $path$no_ext_dump/$_";
               $counter++
          }
     }
     else {
          @split_name_ext = split (/\./, $_);
          $ext = $split_name_ext[-1];
          pop (@split_name_ext);
          $name = join (".", @split_name_ext);
          unless (-e "$path$ext") {
               mkdir ("$path$ext", 0777) || die "Cannot Open: $path$ext";
          }
     }
     unless (-d "$path$_") {
          $file_name = $_;
          if (-e "$path$ext/$_") {
               $_ = $name . $underslash . $time_date_stamp . $fullstop . $ext;
          }
          rename ("$path$file_name", "$path$ext/$_") || die "Cannot Open: $path$_ - $path$ext/$_";
          print "<p>$_ has been moved to $path$ext/$_";
          $counter++
     }
}
##### MAIN LOOP ENDED
###############

###############
##### FINAL PRINT STATEMENT
print "<p>$counter files have been sorted";
###############

One more thing to note with your script. I'm not being picky, hehe, but obviously these things can only be discovered during testing, and your script works nicely, so I'm testing it. If a file exists then you add the indicator (e.g. _1) to the end of the file name, after the extension (e.g. File.txt_1). I was doing some excessive testing and tried processing a file named File.txt_1 to see what would happen, and it created a new extension folder to suit the new extension (txt_1). Obviously this isn't hard to fix: simply move the _1 to before the last period (File_1.txt).

Chris
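
The suggestion above (File_1.txt rather than File.txt_1) could be sketched roughly as follows. The helper name unique_name is hypothetical and not taken from either script in this thread; it assumes the counter should keep increasing until a free name is found.

```perl
use strict;
use warnings;

# Hypothetical helper: return a name that does not yet exist in $dir,
# inserting _1, _2, ... before the last period so the extension survives.
sub unique_name {
    my ($dir, $file) = @_;
    return $file unless -e "$dir/$file";

    # Split on the last period. A name that only has a leading period
    # (".File") is treated here as having no extension.
    my ($base, $ext) = $file =~ /^(.+)\.([^.]*)$/ ? ($1, ".$2") : ($file, '');

    # Count upwards until the candidate name is free.
    my $n = 1;
    $n++ while -e "$dir/${base}_$n$ext";
    return "${base}_$n$ext";
}
```

Because the counter is checked against the directory each time, two files with the same name processed in the same second still get different names, which a time/date stamp alone does not guarantee.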
 
OK, got it going, I didn't put a '/' at the end of $path.

Now it does like you said, processes one file at a time, but they are all going into the no extension folder.

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
Yeah sorry, try the script I just posted.

It's because I had included a condition that allowed every file to pass through the first if statement. It's now:

if (($_ !~ /\./) || ($_ =~ /\.$/) || (($_ =~ /\.$/) && ($_ =~ /\.+/))) {

Chris
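
To see what that condition classifies, here is a small sketch. The predicate name lacks_extension is made up for illustration. It keeps only the first two clauses, because the third clause (name ends in a period AND contains a period) can only be true when the second clause is already true.

```perl
use strict;
use warnings;

# Hypothetical predicate mirroring the first two clauses of the condition:
# true when the name has no period at all, or ends in a period.
sub lacks_extension {
    my ($name) = @_;
    return ($name !~ /\./ || $name =~ /\.$/) ? 1 : 0;
}

# Print which folder each sample name would be routed to.
for my $name ('File', 'File.', 'File.txt', 'My.File') {
    printf "%-8s => %s\n", $name,
        lacks_extension($name) ? 'No_Ext' : 'extension folder';
}
```

With this test, "File" and "File." are routed to the No_Ext folder, while "File.txt" and "My.File" are treated as having an extension.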
 
Sorry, I forgot to mention that I believe this problem only occurs if there are mixed file types (e.g. a mix of File.txt, File., File and My.File).

Chris
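
From reading the posted script, a likely cause (an inference, not something confirmed in this thread) is that the second unless (-d "$path$_") block at the bottom of the loop also runs for files the no-extension branch has already moved. The second rename then fails because the source no longer exists, and the || die ends the run at that file. The sketch below reproduces just that effect in a temporary directory; the file and directory names are assumptions for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Build a throwaway "dump" folder with one extensionless file and two
# target directories, standing in for No_Ext and a stale extension folder.
my $dir = tempdir(CLEANUP => 1);
open(my $fh, '>', "$dir/File") or die "Cannot create test file: $!";
close($fh);
mkdir("$dir/No_Ext") or die "Cannot create No_Ext: $!";
mkdir("$dir/txt")    or die "Cannot create txt: $!";

# First move, as in the no-extension branch: succeeds.
my $first = rename("$dir/File", "$dir/No_Ext/File");

# Second move of the same source, as in the block after the if/else:
# fails because the file has already been moved away.
my $second = rename("$dir/File", "$dir/txt/File");

print "first rename:  ", ($first  ? "ok" : "failed: $!"), "\n";
print "second rename: ", ($second ? "ok" : "failed"), "\n";
```

If this is indeed the cause, moving the second unless block inside the else branch, so that each file is renamed exactly once, should let a single run process every file.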
 