
Creating files of data for website-visitors to download


emilybartholomew

Technical User
Aug 3, 2001
Hello-

I am creating a web interface for a database in which I store lots and lots of data. The idea is that people can come to the website and select what kind of data they want to see, such as "Energy Prices between May 2001 and July 2001" or "Load between January 2000 and December 2001" or whatever they wish.

I can display the data fine on the page, but I am also supposed to allow users to download a .txt/.csv/.zip (doesn't matter) file of this data to their own computers.

My thinking so far is this: when they submit their query, my program should create a file, put it in a directory, and then link to that file from the page that displays the data. Once they click away from the page (whether or not they downloaded the file), the file should disappear from my server forever, so I can avoid an enormous build-up of useless files. I haven't been able to implement this yet. I can probably create the file, but how would you delete it? I could just delete once a day (as a cron job or something - I'm running Linux), but on the off chance someone is downloading at the moment the job runs, I'd rather not do that.

Has anyone ever done anything like this?
Thanks-
emily
 
You're on the right track. But put PHP in the way of the download instead of having a plain HTML link pointing directly at the file.

Create the files in one directory, and have a PHP app do the downloading to the browser. The PHP app moves the file from a scratch directory to a second directory, then streams it from there. That way the files users actually find interesting are separated from the weeds. You could even have the download app update the file's time-date stamp every time it is downloaded.

Your cron job could then delete everything in the "write-to" directory on a short cycle (say, files older than 3 hours), and everything in the "read-from" directory on a longer cycle (say, files older than a week).

A user who regularly downloads a file (say, every 6 days) could keep it around indefinitely without a bunch of junk accumulating on the filesystem.

You're still going to have the occasional case where a file that should have been there was deleted at an inopportune time. That is pretty much life with HTTP.
 
Oops. I forgot what forum I was in.

perform "s/ PHP / perl /gi" on the above post.
 
That's a good technique. I'll try it and see how it goes. Thanks!
 
I do something similar, except I don't use two dirs. I just fire a child process after the query results are presented to the user. The child process deletes anything older than an hour. That way, the application cleans up behind itself. This can have some performance cost compared to a cron job..... depends on load, horsepower, etc....

#!/usr/local/bin/perl
$| = 1;
use CGI;
my $cgi = new CGI;
# run query and output html
# then
# run clean-up routine for stuff older than some time.
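Fleshed out a bit (untested, and the directory is just a placeholder), the idea is roughly:

Code:
#!/usr/local/bin/perl
$| = 1;
use strict;
use CGI;

my $cgi = new CGI;
my $dir = "/var/www/data/keep";    # placeholder -- wherever the .csv files land

# ... run the query and print the HTML results here ...

# Then fork, so the user gets the page while the child does the housekeeping.
if (fork() == 0) {
    opendir(DIR, $dir) or exit;
    foreach my $name (readdir(DIR)) {
        my $path = "$dir/$name";
        next unless -f $path;
        unlink($path) if -M $path > 1/24;    # modified more than an hour ago
    }
    closedir(DIR);
    exit;
}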

TMTOWTDI ;-) 'hope this helps

If you are new to Tek-Tips, please use descriptive titles, check the FAQs, and beware the evil typo.
 
That's a really good idea, about the self-cleaning perl script.

The problem now is that I'm having trouble getting unlink to work. I loop through the list of file names, check each one's modification date, and if it's earlier than the current date I try to erase it. But even though I get a valid modification date, unlink returns an error that says 'No such file or directory.' Could this be a permissions problem? (I'm running Linux) Or a path problem?
 
More likely a path problem. Are you using absolute paths or relative paths in your code?
 
Now that I've asked that question, I see that the two are inter-related.

Could it be that the app that deletes the files does not have permissions to a directory in the document path?
 
Well, I'm using absolute paths. I'm running the program as root, so it should have enough permissions to delete. Just to check, I changed the ownership of the program file itself to root, but it makes no difference.
 
It's gotta' be something simple... bad path, ....inadequate permissions (except you say you are root).

Usually, when I have a problem like that, I am trying to 'unlink' something other than what I think I'm trying to unlink.....[hammer]

I usually have to do some really basic idiot checking, like printing the args I'm handing to 'unlink'..... making sure I'm trying to do what I think I'm trying to do.

Hopefully, you are better at this than I am and don't have those kinds of problems. ;-) 'hope this helps

 
I highly doubt you are as bad as you say, goBoating.

The latest discovery is this: after I run the program, the file that I'm trying to unlink (which had lots of stuff in it before) is empty. And I'm still getting a "No such file or directory" error.
 
Could the file still be open by something else? Mike
________________________________________________________________

"Experience is the comb that Nature gives us, after we are bald."

Is that a haiku?
I never could get the hang
of writing those things.
 
alright, time to see some code, cough it up..... show us what you got..... ;-) 'hope this helps

 
You asked for it... (I actually thought I already put some up - ah my faulty memory)


Code:
#First, get today's date
($d,$d,$d,$d,$m,$y) = localtime();
$m++;
$m = "0" . $m if ($m < 10);
$d = "0" . $d if ($d < 10);
$y = $y + 1900;
my $cur_date = $y . $m . $d;

#Second, loop through the directory /var/ and grab all file names. Put these in an array.

@files = qx{find /var/ -name "*.csv"};

print "@files\n";
#Loop through all the files and analyze the date each was modified. If it was before the current date, delete it
foreach my $file_name ( @files ){

    open (INFILE, ">" . $file_name );
    my ($date_read, $date_mod ) = (stat(INFILE))[8,9];

    ($d,$d,$d,$d,$m,$y) = localtime($date_mod);
    $m++;
    $m = "0" . $m if ($m < 10);
    $d = "0" . $d if ($d < 10);
    $y = $y + 1900;
    close (INFILE);

    if($y.$m.$d < $cur_date ){
        print "Should be deleting $file_name....\n";
        my $count = unlink($file_name) or die "Can't delete file $file_name: $!\n";
        print "Number of files deleted: $count\n";
    }

}
 
I know the answer to the ancillary question, at least: Why are the files in the directory 0 length?

You'll probably find that every file in your downloads directory is 0 length. That's because of this line:

open (INFILE, ">" . $file_name );

Just performing an overwrite open will set the file to zero length. Don't open every file to use stat on it. You can name the file in a string instead. [stat ("/var/www/dl/foo.csv")]
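For example, something along these lines (the path is only an illustration):

Code:
my $path  = "/var/www/dl/foo.csv";
my $mtime = (stat($path))[9];                    # no filehandle needed
unlink($path) if $mtime < time() - 24*60*60;     # older than a day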

Perfection in engineering does not happen when there is nothing more to add. Rather it happens when there is nothing more to take away.
 
Eureka!

I think I have solved the problem. It's the use of the "find" command that prevents the script from deleting the files. I have no idea why, but I can reproduce your error on files I copied to a directory from other parts of the filesystem.

Instead of using "find" try this:

opendir (DIR, $download_dir) or die ("Could not open dir");
@foo = readdir(DIR);
closedir (DIR);

I'm able to delete the files when I get their names this way.

Remember, though, that the filenames returned will not include paths like find does.
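So before unlinking, glue the directory back on - roughly:

Code:
foreach my $name (@foo) {
    next if $name eq '.' or $name eq '..';
    next unless $name =~ /\.csv$/;               # only touch the data files
    unlink("$download_dir/$name") or warn "Can't delete $name: $!\n";
}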

 
The file names coming from qx(find....) have newlines on the end. You need to chomp them and then unlink them.

A slightly simplified treatment.....
Code:
@files = qx{find ./ -name "*.html"};
foreach my $file_name ( @files )
    {
    chomp($file_name);
    print "FILE: $file_name\n";
    unlink($file_name) or die "FAILED: $file_name: $!\n";
    }
'hope this helps

 
You guys are good. I couldn't get the first technique to work (the opendir....) but chomp works great. Thank you so much!!
 
.... - ah my faulty memory)

Memory Fault, huh? Must be running a Win OS...

[lol] LOL

Sorry, I couldn't resist.....
I'll try to stay off that soap box before I get a flame war started. 'hope this helps

 