Counting files by date - filling unused dates with 0's


(OP)
Hello,

I am trying to count the number of files created per day with a given extension. I am searching a folder recursively.

What I have at the moment counts the number of files per day, but does not include those days where there are no files.

CODE

#!/usr/bin/env python

import os, glob, time
print '-'*60
root = raw_input("Folder to search:\n") + '/*'                  
#print root
ext = '/*.' + raw_input("File type to filter:\n")  
file_out = open(raw_input('Fileout name : \n'),"w")   
t_start = time.time()
print '-'*26 + 'Running' + '-'*27 # approx 20 seconds / 1000 files
date_file_list = []                                             
for folder in glob.glob(root):                                  
    for file in glob.glob(folder + ext):                        
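        stats = os.stat(file)
        # stats[8] is st_mtime (last modification time); stats[9] would be st_ctime
        date_file_tuple = time.localtime(stats[8]), file
        date_file_list.append(date_file_tuple)
# Build the list of day strings once, after all files have been collected
daylist = []
for entry in date_file_list:
    days = time.strftime("%d/%m/%y", entry[0])
    daylist.append(days)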
d = {}                              
from sets import Set                
for i in Set(daylist):              
    d[i] = daylist.count(i)                       
    file_out.write('%s \t%s \n' % (i,d[i]))      
print 'Total no. of %s files = %d\n%.2f seconds runtime\n ' % (ext, len(daylist),time.time()-t_start)
print '='*60

It is probably not as efficient as it should be - any tips on how to improve the loops would be very helpful :D

I import the output file into Excel to create a chart of the number of files created per day.

Does anyone know a good way of including those days where there were no files created?

Any help greatly appreciated!
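
On the efficiency question, one possible tidy-up of the loops is sketched here. It assumes Python 3 (the script above is Python 2), and the function name `count_files_per_day` is made up for illustration. It uses `os.walk` so that every subfolder level is visited, and `collections.Counter` so that the tally is built in a single pass instead of calling `daylist.count()` once per day. Like the script above, it uses the modification time.

```python
import os
from collections import Counter
from datetime import date


def count_files_per_day(root, extension):
    """Walk root recursively and tally matching files by modification date.

    Hypothetical helper: the name and signature are illustrative only.
    """
    suffix = "." + extension.lstrip(".")
    tally = Counter()
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            if name.endswith(suffix):
                mtime = os.path.getmtime(os.path.join(folder, name))
                # One increment per file, keyed by calendar date
                tally[date.fromtimestamp(mtime)] += 1
    return tally
```

The result maps each `datetime.date` that has at least one file to its count. Days with no files are simply absent, so they still have to be filled in separately.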

RE: Counting files by date - filling unused dates with 0's

Here's how I would go about it:
I'd glob (as you do) a list of the files.
Then I'd use getctime to build a second list holding the creation time of each of those files.
Then I'd make a dictionary of {ctime: filename} pairs.
Then I'd sort the list of ctimes.

CODE

import glob, os, time
startd='e:/python/test/'
dlst=glob.glob(startd+'*.*')
tlst=[os.path.getctime(f) for f in dlst]
d=dict(zip(tlst,dlst))
tlst.sort()
day0=time.strftime('%d',time.localtime(tlst[0]))
dayf=time.strftime('%d',time.localtime(tlst[-1]))

Now I'd create another list of the days I wanted (the simplest way is to work in POSIX time as returned by getctime, remembering that 1 day = 86400 seconds).
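
A minimal sketch of that step, under some assumptions: the helper name `day_labels` is made up for illustration, and it works in UTC days (via `time.gmtime`) so that every day really is 86400 seconds long. With local time, a daylight-saving change makes some days an hour longer or shorter, so stepping by 86400 would need more care.

```python
import time

SECONDS_PER_DAY = 86400


def day_labels(first_ts, last_ts):
    """Return one 'dd/mm/yy' label per UTC day from first_ts to last_ts inclusive.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Snap the first timestamp back to UTC midnight of its day
    t = int(first_ts) - int(first_ts) % SECONDS_PER_DAY
    labels = []
    while t <= last_ts:
        labels.append(time.strftime("%d/%m/%y", time.gmtime(t)))
        t += SECONDS_PER_DAY
    return labels
```

Called as `day_labels(tlst[0], tlst[-1])` on the sorted list above, this would give every day in the range, including days on which no file was created.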

_________________
Bob Rashkin

RE: Counting files by date - filling unused dates with 0's

(OP)
Thanks a lot for your help, it has taught me a lot! By the way, I am also a geophysicist, with just over a year of experience.

I have got the script doing what I want, using your method to create a list. I didn't need to know each file, just the frequency per day so I just made a list of the dates.

CODE

#!/usr/bin/env python

import glob, os, datetime
startd='C:\\Computer\\MyPython\\'              
ext = '/*.*'
file_out = open(raw_input('Fileout name : \n'),"w")

flst=glob.glob(startd+'*.*')
dlst=[datetime.date.fromtimestamp(os.path.getctime(f)) for f in flst]
dlst.sort()
start_date=dlst[0]
end_date=dlst[-1]
print start_date, end_date
from datetime import timedelta
# +1 so that the last day (end_date) is included in the range
daylst=[start_date+timedelta(n) for n in range((end_date - start_date).days + 1)]
print 'start=%s, end=%s' %(daylst[0], daylst[-1])
d={}
for i in set(daylst):
    d[i]=dlst.count(i)
    print i, d[i]
    file_out.write('%s \t%d \n' % (i,d[i]))
file_out.close()

Many thanks!
Now I can't decide whether I should sort the list by date in the script or simply in Excel, where I will make a histogram of it...
Is there an easy way of sorting the daylst by date?

 

RE: Counting files by date - filling unused dates with 0's

daylst.sort() will sort the list in place.
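
One caveat: the script above loops over `set(daylst)`, and a set has no guaranteed iteration order, so sorting `daylst` alone may not change the order of the lines written out. Looping over `sorted(set(daylst))`, or over the list itself, would. A minimal sketch of the whole zero-filled count in date order follows, assuming Python 3. The function name `counts_per_day` is made up for illustration.

```python
from collections import Counter
from datetime import date, timedelta


def counts_per_day(file_dates):
    """Return (day, count) pairs for every day from the earliest to the latest date.

    Days with no files get a count of 0. Hypothetical helper for illustration.
    """
    if not file_dates:
        return []
    tally = Counter(file_dates)  # missing days look up as 0
    first, last = min(file_dates), max(file_dates)
    n_days = (last - first).days + 1  # +1 to include the last day
    days = [first + timedelta(days=n) for n in range(n_days)]
    return [(day, tally[day]) for day in days]
```

Because the days are generated in ascending order, the result is already sorted and can be written to the output file line by line.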

_________________
Bob Rashkin
