
File Sorting Script

Status
Not open for further replies.

SegFaultAX

Programmer
Oct 12, 2007
2
US
Problem: I have a bunch of files with assorted extensions in a "dump" folder. I want to sort the files into directories according to their extensions. Assume that the extension is ALWAYS whatever follows the last period. If there is no period, the file can either be ignored or moved into some other directory. Each extension directory will be a direct child of the dump dir. If a file has an extension with no corresponding directory, make the directory, then move the file. Under some circumstances there may be duplicate file names, in which case something like _1 or _2 can be appended to the name.

Question: How difficult would a script like this be to write? Are there any limitations in Perl that would make some aspect of this impossible? Can anyone provide examples that would at least give a skeleton for creating a script like this? Any other suggestions or comments?

Thanks in advance!
Cheers, MK
 
Question:
How difficult would a script like this be to write?
Pretty easy.
Are there any limitations in perl that would make some aspect of this impossible?
No
Can anyone provide examples that would at least give a skeleton for creating a script like this?
google
perl split
perl opendir
perl File::Copy
perl -e file test

Any other suggestions or comments?
None of us have any problems helping you out but very few of us will do all the work for you. Feel free to get something going, post your code, and ask some questions.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[noevil]
Travis - Those who say it cannot be done are usually interrupted by someone else doing it; Give the wrong symptoms, get the wrong solutions;
 
Sounds like a job for File::Find

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
You think? I found File::Find a little tough, even for me. I wanted to use it because everyone likes it, where normally I would have just looped through the directories on my own, but it felt like I had to do some funky stuff to get it to play nice with me.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[noevil]
Travis - Those who say it cannot be done are usually interrupted by someone else doing it; Give the wrong symptoms, get the wrong solutions;
 
Hi SegFaultAX,

One question:

Do you know in advance which mime types (e.g. .txt) will exist? I ask because if you want to sort the different types into different folders and you already know which mime types exist, you can pre-create the folders. Otherwise the folders will have to be created automatically based on the mime type.

I am going to have to implement a similar system in the next few days, therefore I am quite interested in your method.

Chris
 
File::Find is a little tricky at first, and I know the documentation is a little hard to understand, but with a little experimentation you will find it very useful.

Once you get the hang of it, it's just too easy to use. Writing your own directory recursion function will be a thing of the past.
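
For readers who have not used File::Find, here is a minimal sketch of the callback style it uses. The sub name list_files and the temporary directory layout are made up for illustration; find, $File::Find::name and the -f file test are standard Perl.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Hypothetical helper: return every plain file under $top, recursively.
sub list_files {
    my ($top) = @_;
    my @found;
    # find() calls the anonymous sub once per entry. Inside it, $_ is
    # the entry's name and $File::Find::name is its path including $top.
    find(sub { push @found, $File::Find::name if -f $_ }, $top);
    my @sorted = sort @found;
    return @sorted;
}

# Demo on a throwaway directory so nothing real is touched.
my $dump = tempdir(CLEANUP => 1);
mkdir "$dump/sub" or die "Cannot create $dump/sub: $!";
for my $name ("a.txt", "sub/b.jpg") {
    open my $fh, '>', "$dump/$name" or die "Cannot write $dump/$name: $!";
    close $fh;
}
print "$_\n" for list_files($dump);
```

The recursion into sub/ happens inside find, so the caller never writes a directory-walking loop of its own.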

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
Yeah.. I used it and it did everything I wanted it to. Just took me a while to get there.

I need to go back and add some logic to find empty directories and delete them.



~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[noevil]
Travis - Those who say it cannot be done are usually interrupted by someone else doing it; Give the wrong symptoms, get the wrong solutions;
 
I couldn't resist. I wrote a basic script to complete this task. I haven't tested it, so it may be a failure, and I also HAVEN'T TAKEN THE MIME TYPE FROM EACH FILE (I haven't looked at how to do this yet). So the script is partly incomplete (I have noted where the mime type needs to be found).

But here's a skeleton script anyway. Forgive me for my extensive methods of completing this task...

Code:
#! /usr/bin/perl
use strict;
use CGI ':standard';

##### Declare Variables
my ($path, @directory_contents, @mime_type, $number_of_files, $counter, $move_to_folder);

##### Path To Directory Containing All Files And Folders
$path = "/Directory/";

##### Open Directory Containing All Files
opendir (LOGDIR, "$path") || die "Cannot Open: $path";
@directory_contents = readdir (LOGDIR);
closedir (LOGDIR);

##### Remove "." And ".." From Directory Contents
shift (@directory_contents);
shift (@directory_contents);

##### Get Mime Type Of Each File
foreach (@directory_contents) {
############################################### GET MIME TYPE

##### Push Mime Type Into A Second Array
push @mime_type, $_;

##### End Get Mime Type Of Each File
}

##### Check If Folder Already Exists For Each Mime Type
foreach (@mime_type) {
unless  (-e "$path$_") {

##### Create A New Folder
mkdir ("$path$_", 0777) || die "Cannot Open: $path$_";

##### End Unless
}

##### End Check If Folder Already Exists For Each Mime Type
}

##### Check How Many Files There Are To Move
$number_of_files = @directory_contents;

##### Content Type Declare
print "Content-type: text/html\n\n";

##### Move Each File Into Its Corresponding Folder
$counter = 0;
while ($counter < $number_of_files) {
foreach (@directory_contents) {
$move_to_folder = @mime_type[$counter];
rename ("$path$_", "$path$move_to_folder") || die "Cannot Open: $path$_ - $path$move_to_folder";

##### Print Event
print "$_ has been moved to $path$move_to_folder";

##### End Foreach
}

##### Add 1 To The Counter
$counter = $counter + 1;

##### End Move Each File Into its Corresponding Folder
}

Chris
 
One mistake I can see...
Code:
rename ("$path$_", "$path$move_to_folder") || die "Cannot Open: $path$_ - $path$move_to_folder";
Meant to be...
Code:
rename ("$path$_", "$path$move_to_folder$_") || die "Cannot Open: $path$_ - $path$move_to_folder";
 
Okay, I still haven't tested it... But here's my COMPLETE method, which is probably inefficient (I use no modules, and I get the mime type by splitting on the .)

Code:
#! /usr/bin/perl
use strict;
use CGI ':standard';

my ($path, @directory_contents, @mime_type_split, $mime_type_split_scaler, @mime_type, $number_of_files, $counter, $move_to_folder);
$path = "/Directory/";
opendir (LOGDIR, "$path") || die "Cannot Open: $path";
@directory_contents = readdir (LOGDIR);
closedir (LOGDIR);
shift (@directory_contents);
shift (@directory_contents);
foreach (@directory_contents) {
@mime_type_split = split (/\./, $_);
shift (@mime_type_split);
$mime_type_split_scaler = "@mime_type_split";
push @mime_type, $mime_type_split_scaler;
}
foreach (@mime_type) {
unless (-e "$path$_") {
mkdir ("$path$_", 0777) || die "Cannot Open: $path$_";
}
}
$number_of_files = @directory_contents;
print "Content-type: text/html\n\n";
$counter = 0;
while ($counter < $number_of_files) {
foreach (@directory_contents) {
$move_to_folder = @mime_type[$counter];
rename ("$path$_", "$path$move_to_folder$_") || die "Cannot Open: $path$_ - $path$move_to_folder";
print "$_ has been moved to $path$move_to_folder";
}
$counter = $counter + 1;
}
 
Code:
##### Remove "." And ".." From Directory Contents
shift (@directory_contents);
shift (@directory_contents);

This is not good.. you're just assuming . and .. are at the beginning of the array, and that is not true in all cases.

You can do something like
next if $file =~ /^\.\.?$/;
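
A minimal sketch of the same test applied to a whole readdir list at once. The sub name visible_entries is hypothetical; grep with that regular expression drops . and .. wherever they happen to appear in the list.

```perl
use strict;
use warnings;

# Hypothetical helper: list a directory without the "." and ".." entries.
sub visible_entries {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    # readdir makes no promise about order, so filter by name
    # instead of shifting the first two elements off the list.
    my @entries = grep { !/^\.\.?$/ } readdir($dh);
    closedir($dh);
    return @entries;
}

print "$_\n" for visible_entries(".");
```

Note that other dot files such as .hidden are kept; only the two special entries are removed.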

This
Code:
##### Content Type Declare
print "Content-type: text/html\n\n";
 is not needed nor is
use CGI ':standard';

##### Move Each File Into Its Corresponding Folder
$counter = 0;
while ($counter < $number_of_files) {
foreach (@directory_contents) {
$move_to_folder = @mime_type[$counter];
rename ("$path$_", "$path$move_to_folder") || die "Cannot Open: $path$_ - $path$move_to_folder";

##### Print Event
print "$_ has been moved to $path$move_to_folder";

##### End Foreach
}

##### Add 1 To The Counter
$counter = $counter + 1;

##### End Move Each File Into its Corresponding Folder
}
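seems all messed up: the foreach runs over every file on each pass of the while loop, so $counter never lines up with the file being moved.

You have a few logic errors and Perl errors in your script..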



~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[noevil]
Travis - Those who say it cannot be done are usually interrupted by someone else doing it; Give the wrong symptoms, get the wrong solutions;
 
hey travs,

I'm still learning :).

In order to use a print statement surely I must declare using:
print "Content-type: text/html\n\n";

I always get told that I need to remove:
use CGI ':standard';
However, if I remove it my script will produce an Error 500. I think this is something to do with the server I run my scripts from...

I'm going to test my script to see what I have done wrong.

Thank you
 
The CGI module and the header are only needed if the script is web based. He didn't mention needing it to be web based, so we can assume it won't be.

In my opinion, I would change the logic of your script to

readdir into @dir
for $file (@dir) {
    get rid of . and ..
    if $file !~ /\./, move it into a special dump dir and next
    split $file on . into @tmp
    check if $path/$tmp[-1] exists; if not, create it
    move $file into $path/$tmp[-1]
}

It only needs one loop, with no counters or anything, as you're just processing one file at a time.
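
The outline above can be sketched in Perl roughly as follows. This is an untested-in-production sketch under stated assumptions: the sub name sort_by_extension and the fallback directory name no_extension are made up, it takes the text after the last period as the extension, and it does not handle duplicate names (move from File::Copy would overwrite a file of the same name already in the target directory).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Copy qw(move);

# Hypothetical helper: sort the plain files in $path into one
# subdirectory per extension. Files without a usable extension go
# into $path/$no_ext_dir (an assumed name, not from the thread).
sub sort_by_extension {
    my ($path, $no_ext_dir) = @_;
    $no_ext_dir = 'no_extension' unless defined $no_ext_dir;

    opendir(my $dh, $path) or die "Cannot open $path: $!";
    my @entries = grep { !/^\.\.?$/ } readdir($dh);
    closedir($dh);

    for my $file (@entries) {
        # Skip directories, including ones created on an earlier run.
        next unless -f "$path/$file";

        # Text after the last period; no period, or nothing after it,
        # sends the file to the fallback directory.
        my $ext = $file =~ /\.([^.]+)$/ ? $1 : $no_ext_dir;

        my $target = "$path/$ext";
        unless (-d $target) {
            mkdir $target or die "Cannot create $target: $!";
        }
        move("$path/$file", "$target/$file")
            or die "Cannot move $file to $target: $!";
    }
    return;
}

# Only runs when a directory is given on the command line.
sub_main: sort_by_extension($ARGV[0]) if @ARGV;
```

Run it as, for example, perl sort_files.pl /path/to/dump (the script name is arbitrary). Because each file is handled on its own, there is no second array to keep in step with the file list.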

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[noevil]
Travis - Those who say it cannot be done are usually interrupted by someone else doing it; Give the wrong symptoms, get the wrong solutions;
 
Thank you travs and Kevin,

I will change my script in accordance to those guidelines once I have this script working.

Right, I have tested my script, I had to make a couple of minor changes (include a forward slash after the new folder name AND one of my foreach loops logic was slightly wrong)...

The demo is...

Since the files are moved, though, only whoever goes to the demo first can see it properly (I could however return the files again at the end of the script).

As you can see, every file is moved into the txt folder, and I can't see what I'm doing wrong...

I haven't made your suggested changes yet, but here's the script...

Code:
#! /usr/bin/perl
use strict;
use CGI ':standard';

my ($forward_slash, $path, @directory_contents, @mime_type_split, $mime_type_split_scaler, @mime_type, $number_of_files, $counter, $move_to_folder);
$forward_slash = "/";
$path = "/PATH/CleanDirectory/Files/";
opendir (LOGDIR, "$path") || die "Cannot Open: $path";
@directory_contents = readdir (LOGDIR);
closedir (LOGDIR);
shift (@directory_contents);
shift (@directory_contents);
foreach (@directory_contents) {
@mime_type_split = split (/\./, $_);
shift (@mime_type_split);
$mime_type_split_scaler = "@mime_type_split";
push @mime_type, $mime_type_split_scaler;
}
foreach (@directory_contents) {
unless (-e "$path$_") {
foreach (@mime_type) {
mkdir ("$path$_", 0777) || die "Cannot Open: $path$_";
}
}
}
$number_of_files = @directory_contents;
print "Content-type: text/html\n\n";
$counter = 0;
while ($counter < $number_of_files) {
foreach (@directory_contents) {
$move_to_folder = @mime_type[$counter];
rename ("$path$_", "$path$move_to_folder$forward_slash$_") || die "Cannot Open: $path$_ - $path$move_to_folder";
print "<p>$_ has been moved to $path$move_to_folder";
}
$counter = $counter + 1;
}
 
You move every file to the first mime_type regardless of what mime type it is.
When you loop through the directory contents, you don't check what the file ext is; you just move it to the folder.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[noevil]
Travis - Those who say it cannot be done are usually interrupted by someone else doing it; Give the wrong symptoms, get the wrong solutions;
 
Thanks, I see. I didn't put the counter + 1 in the last foreach loop.

This was also a mistake
Code:
foreach (@directory_contents) {
unless (-e "$path$_") {
foreach (@mime_type) {
mkdir ("$path$_", 0777) || die "Cannot Open: $path$_";
}
}
}

And should have been

Code:
foreach (@mime_type) {
unless (-e "$path$_") {
mkdir ("$path$_", 0777) || die "Cannot Open: $path$_";
}
}

It's working fine now, but it's a poor method. Sorry for posting lots of irrelevant scripts. I'm going to focus on his question now, and use your methods.
 
There was a lot more to the task than I thought.

In the end I used a time and date stamp for files that already existed...

Once the directories for each mime type had been created then they were also carried into the directory contents array, so I had to check if $_ was a directory or not...

And I also forgot that if the file already existed then before I could change its name I had to split off the mime type again i.e. ('file', '.txt') and add the date stamp to the end of the first element.
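
A minimal sketch of that split-and-rename step, using the _1, _2 counter from the original question rather than a date stamp. The sub name unique_target is hypothetical, and treating a leading period (as in .bashrc) as part of the name rather than an extension is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: return a path in $dir for $file that does not
# exist yet, inserting _1, _2, ... before the extension when needed.
sub unique_target {
    my ($dir, $file) = @_;
    return "$dir/$file" unless -e "$dir/$file";

    # Split "report.txt" into "report" and ".txt" at the last period.
    # A name with no period (or only a leading one) has no extension.
    my $dot  = rindex($file, '.');
    my $base = $dot > 0 ? substr($file, 0, $dot) : $file;
    my $ext  = $dot > 0 ? substr($file, $dot)    : '';

    # Count upwards until an unused name is found.
    my $n = 1;
    $n++ while -e "$dir/${base}_$n$ext";
    return "$dir/${base}_$n$ext";
}

print unique_target(".", "example.txt"), "\n";
```

For a date stamp instead of a counter, the number could be swapped for the output of strftime from the POSIX module, at the cost of possible clashes when two files are moved within the same second.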

I dumped files with no extension (or just a .) into a separate folder.

Finally I created another script which recursively goes through a directory and performs the same task (so that you can just dump the directory into the Files directory rather than each individual file).

Works perfectly :-D

 
Cool.. as long as you're learning.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[noevil]
Travis - Those who say it cannot be done are usually interrupted by someone else doing it; Give the wrong symptoms, get the wrong solutions;
 
First of all, I want to say thank you to everyone for your quick and very helpful responses. Next, I want to say this isn't a project for work, it's just something I have come up with to help me learn perl. I figure the best way to learn is to come up with a problem, then work to solve it. Finally, this is not for a web application, just a script I want to run to clean up some folders in the aforementioned manner on my machine.

Now then, maybe I should start with the first bit of my task, as the whole thing seems a little too complex for now. chrismassey (or other), could you trim your script to just the portion where it discovers the mime types present in the dump, then generates the directory(ies)? That's the bit I am most interested in studying. After I understand that, I will work on moving the files from the dump to the target directory. I think that should give me at least a basic understanding of Perl arrays and parsing.

I hope I'm not asking too much. I just find I learn best by studying an example, then looking up each directive and its usage individually. If this is too bothersome a task, I completely understand. Thank you all again for your help thus far.

SegFault[AX] - this.forwardTo("/dev/null");
 