loop through subdirectories

Status
Not open for further replies.

lespaul

Programmer
Feb 4, 2002
7,083
US
I need to throw together a quick program to extract information from some text files. I've done the part to extract the information, but I'm having some difficulties with the file directory.

There is a specific folder that contains a folder for each day, and within each day's folder are log files (.txt). I need to search each text file in the folders:

Main Folder
  2006-01-01
    Log00
    Log01
    Log02
  2006-01-02
    Log00
    Log01
    Log02

What is the easiest way for me to loop through all the files in all the folders in this specific directory?

Thanks!

Leslie
 
You are showing us a flat directory structure (i.e., no directories nested more than one level deep), which is a special (and the easiest) case.

Anyway, I'm proposing the general algorithm, which works on any valid directory structure. It is a recursive sweep; if you start it in the root of a 400 GB HD it will traverse every directory on the disk, which will take a while :).

Code:
{
- Dir:
  Starting directory; say "C:\" to traverse the whole disk
  or "C:\MyDir\" to traverse MyDir and all its subdirs.
  The ending backslash ("\") is mandatory.
}
procedure SweepDir(Dir : AnsiString);
  var
    SR  : TSearchRec;
  begin
    {Using the Delphi attributes (faDirectory and the like)
    as a search filter is inefficient and obscure due to the
    inclusion logic Delphi uses. Let's search with faAnyFile
    and test the attribute bits ourselves.}
    if FindFirst(Dir + 'LOG*.*', faAnyFile, SR) = 0
      then
        begin
          repeat
            {We need to filter out the "." and ".." entries,
            otherwise we'll enter an infinite recursion.}
            if (SR.Name <> '.') and (SR.Name <> '..')
              then if (SR.Attr and faDirectory) <> 0
                then {Subdir found. Recursive call to traverse it.}
                  begin
                    {Add the ending backslash.}
                    SweepDir(Dir + SR.Name + '\');
                  end
                else {File found. Process it.}
                  begin
                    ProcessFile(Dir, SR.Name);
                  end;
          until (FindNext(SR) <> 0) or FClosed;
          FindClose(SR);
        end;
  end;

Code written on the fly; check it carefully.

buho (A).
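
TSearchRec.Attr is a bit mask, so the usual way to detect a directory is to test the faDirectory bit rather than compare the whole value for equality. Below is a minimal sketch of the same recursive sweep written that way. It assumes a search mask of '*' so that subdirectories whose names do not start with LOG are returned too. ProcessFile and FileCount here are hypothetical stand-ins that only print and count the paths found.

```pascal
program SweepDemo;
{$IFDEF FPC}{$MODE DELPHI}{$H+}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils;

var
  FileCount: Integer = 0;  { hypothetical counter, for illustration only }

{ Hypothetical stand-in for the thread's ProcessFile. }
procedure ProcessFile(const Dir, FileName: string);
begin
  Inc(FileCount);
  WriteLn(Dir + FileName);
end;

{ Dir must end with a path delimiter, as in the original routine. }
procedure SweepDir(const Dir: string);
var
  SR: TSearchRec;
begin
  if FindFirst(Dir + '*', faAnyFile, SR) = 0 then
  try
    repeat
      { Skip the "." and ".." entries to avoid endless recursion. }
      if (SR.Name <> '.') and (SR.Name <> '..') then
        { Test only the directory bit; other bits may be set as well. }
        if (SR.Attr and faDirectory) <> 0 then
          SweepDir(Dir + SR.Name + PathDelim)
        else
          ProcessFile(Dir, SR.Name);
    until FindNext(SR) <> 0;
  finally
    FindClose(SR);
  end;
end;

begin
  { Sweep the directory given on the command line, else the current one. }
  if ParamCount > 0 then
    SweepDir(IncludeTrailingPathDelimiter(ParamStr(1)))
  else
    SweepDir(IncludeTrailingPathDelimiter(GetCurrentDir));
  WriteLn(FileCount, ' file(s) found');
end.
```

On file systems with symbolic links or junctions that point back up the tree, a sweep like this can recurse indefinitely; that case is not handled here.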

 
thanks! I'll give this a go!

Can the starting directory be a network address?

\\Asdfa-xfwiw5emc\logs

les
 
until (FindNext(SR) <> 0) or FClosed;

What's this supposed to be?
 
FClosed: a "cancel" flag. If FClosed becomes True, the sweep is cancelled.

Net address: yes, it can sweep a shared folder.

Note: the comment "Add the ending backslash" is wrong; the correct one is "Add the NEW SUBDIR and the ending backslash".

buho (A);

 
Is FClosed a flag you defined somewhere? I can't find anything that will let me use it.

The rest of the function is working well, except that without the FClosed flag it turns into an infinite loop.

Thanks for the help!

leslie
 
I think I've made a mistake in the code. Your files have no extension, so maybe the line

"if FindFirst(Dir + 'LOG*.*', ...)"

is wrong and the correct one is

"if FindFirst(Dir + 'LOG*', ...)".

I'm not sure about this; I never use files without an extension and I forget exactly how the OS interprets the wildcard in this case.

buho (A).
 
I got the correct files and they loop fine, it's just ending the loop that I'm having problems with now!

Thanks!
les
 
Actually, the FClosed flag is supposed to be defined somewhere and driven by the interface (like the user clicking a button) or by the ProcessFile function detecting an unrecoverable error. It is not directly related to the sweeping algorithm.

To have it work with the interface you need to do one of the following:

a) Run the sweeper in a worker thread, so the interface stays active while sweeping.

b) If the sweeper runs in the main thread, add an Application.ProcessMessages call before the "until" line.

To have it work with the ProcessFile function (error handling), set it to True inside that function.

Anyway, don't pay too much attention to it if you feel you don't need it.

Changing the "until" line to "until FindNext(SR) <> 0;" should change nothing if you are not actually using the flag.

buho (A).
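
One way such a flag could be defined is sketched below as a console program, so that it runs without a form. The names Cancelled, FilesSeen and MaxFiles are assumptions made up for this illustration: Cancelled plays the role of FClosed, and the file limit exists only so that something triggers the cancel. In a form-based program the flag would more likely be a Boolean field of the form, set from a button's OnClick handler.

```pascal
program CancelDemo;
{$IFDEF FPC}{$MODE DELPHI}{$H+}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils;

const
  MaxFiles = 5;  { hypothetical limit, only here to trigger the cancel }

var
  Cancelled: Boolean = False;  { plays the role of FClosed }
  FilesSeen: Integer = 0;

{ Hypothetical ProcessFile: raises the flag once the limit is reached,
  much as a real one might on an unrecoverable error. }
procedure ProcessFile(const Dir, FileName: string);
begin
  Inc(FilesSeen);
  WriteLn(Dir + FileName);
  if FilesSeen >= MaxFiles then
    Cancelled := True;
end;

{ Dir must end with a path delimiter. }
procedure SweepDir(const Dir: string);
var
  SR: TSearchRec;
begin
  if Cancelled then
    Exit;
  if FindFirst(Dir + '*', faAnyFile, SR) = 0 then
  try
    repeat
      if (SR.Name <> '.') and (SR.Name <> '..') then
        if (SR.Attr and faDirectory) <> 0 then
          SweepDir(Dir + SR.Name + PathDelim)
        else
          ProcessFile(Dir, SR.Name);
      { The flag is checked at every level, so the recursion unwinds. }
    until Cancelled or (FindNext(SR) <> 0);
  finally
    FindClose(SR);
  end;
end;

begin
  SweepDir(IncludeTrailingPathDelimiter(GetCurrentDir));
  if Cancelled then
    WriteLn('Sweep cancelled after ', FilesSeen, ' file(s)');
end.
```

Because the loop ends as soon as FindNext returns a non-zero value, the flag is not needed for the sweep to terminate; it only allows an early exit.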


 
so how do I define the flag?

All I have left to do is stop the process once it reaches the last subdirectory folder. The rest is awesome!

Thanks again for your help.

leslie
 
ok here's what I've got:

1 form - Form1
1 button - Button1

when form is created TStringList initialized:
Code:
procedure TForm1.FormCreate(Sender: TObject);
begin
  Form1.IPList:= TStringList.Create;
  Form1.IPList.Sorted := True;
  Form1.IPList.Duplicates := dupIgnore;
end;

user selects button
Code:
procedure TForm1.Button1Click(Sender: TObject);
begin
  SweepDir('\\Asdfa-xfwiw5emc\logs\2006-01-07\');
  Form1.IPList.SaveToFile('\\Asdfa-xfwiw5emc\logs\IPAddressList.txt');
end;

processes:
Code:
procedure TForm1.SweepDir(Dir : AnsiString);
var
  SR  : TSearchRec;
begin
  if FindFirst(Dir + '*.*', faAnyFile, SR) = 0 then
  begin
  repeat
      if (SR.Name <> '.') and (SR.Name <> '..') then
        if SR.Attr = faDirectory then 
        begin
          SweepDir(Dir + SR.Name + '\');
        end
        else 
        begin
          ProcessFile(Dir, SR.Name);
        end;
        FindNext(SR);
    until FindNext(SR) <> 0; //or FClosed;
    FindClose(SR);
  end;
end;


procedure TForm1.ProcessFile(Dir : AnsiString; SRName : TFileName);
var
  F: TextFile;
  S: string;
begin
  AssignFile(F, Dir + SRName);
  Reset(F);
  While not EOF(F) do
  begin
    Readln(F, S);
    if pos('164.64.140.2/23', S) > 0 then
    begin
      Form1.IPList.Add(GetIPAddress(copy(S, pos('for outside:', S) + 12, 25)));
    end;
  end;
end;

function TForm1.GetIPAddress(IPString : string): string;
begin
  GetIPAddress := copy(IPString, 1,pos('/', IPString) - 1);
end;

This works great as long as I manually change the button to each subdirectory, ie:
Code:
SweepDir('\\Asdfa-xfwiw5emc\logs\2006-01-07\');
SweepDir('\\Asdfa-xfwiw5emc\logs\2006-01-08\');
SweepDir('\\Asdfa-xfwiw5emc\logs\2006-01-09\');

Which I have done; this is a quick and dirty little program. But I'd like to modify it for future use so that I don't have to list each subdirectory. Can anyone see where I've gone wrong, or explain Buho's process for creating the flag?

thanks!

leslie
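
A side note on ProcessFile as posted: it opens each log with Reset but never calls CloseFile, so every processed file keeps its handle open until the program exits. The sketch below shows one way to close the file reliably with try/finally. It keeps the search strings from the post; the name ExtractOutsideIP, the guard for a missing marker and the sample log line are assumptions made for this illustration, not the real log format.

```pascal
program LogScanDemo;
{$IFDEF FPC}{$MODE DELPHI}{$H+}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}
{$I+}

uses
  SysUtils, Classes;

const
  { Hypothetical log line, shaped only to match the strings in the post. }
  SampleLine =
    'Built TCP connection for outside:10.1.2.3/1025 to inside:164.64.140.2/23';

{ Returns the text between "for outside:" and the next "/", or '' if
  either is missing. }
function ExtractOutsideIP(const Line: string): string;
const
  Marker = 'for outside:';
var
  P, Slash: Integer;
  Tail: string;
begin
  Result := '';
  P := Pos(Marker, Line);
  if P = 0 then
    Exit;
  Tail := Copy(Line, P + Length(Marker), 25);
  Slash := Pos('/', Tail);
  if Slash > 1 then
    Result := Copy(Tail, 1, Slash - 1);
end;

procedure ProcessFile(const FileName: string; IPList: TStrings);
var
  F: TextFile;
  S, IP: string;
begin
  AssignFile(F, FileName);
  Reset(F);
  try
    while not Eof(F) do
    begin
      ReadLn(F, S);
      if Pos('164.64.140.2/23', S) > 0 then
      begin
        IP := ExtractOutsideIP(S);
        if IP <> '' then
          IPList.Add(IP);
      end;
    end;
  finally
    CloseFile(F);  { runs even if reading raises an exception }
  end;
end;

var
  IPs: TStringList;
begin
  IPs := TStringList.Create;
  try
    IPs.Sorted := True;
    IPs.Duplicates := dupIgnore;
    WriteLn(ExtractOutsideIP(SampleLine));
    { Optionally scan a real file named on the command line. }
    if ParamCount > 0 then
    begin
      ProcessFile(ParamStr(1), IPs);
      Write(IPs.Text);
    end;
  finally
    IPs.Free;
  end;
end.
```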
 
I think if you simply change your starting point from
SweepDir('\\Asdfa-xfwiw5emc\logs\2006-01-07\');
to
SweepDir('\\Asdfa-xfwiw5emc\logs\');
you will get all of the files you want through the magic of recursion.

As for the cancel flag: if the process runs long enough that you need to let the user cancel, you should probably revise the structure somewhat. Instead of processing the files inside the Sweep routine, simply gather a list of all the files that need to be processed (in a TStringList). Your process routine can then use that list to drive the actual work. Because you know how many files there are to process, you can put up a progress bar to let the user know how it's going, and then consider allowing the user to cancel if needed.

But of course in either case you need to be careful not to process the same files again on a restart. Perhaps part of the process should be to rename the files as you go. (e.g. Log00.txt could be renamed xLog00.txt and your process could ignore file names starting with "x".)
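
The gather-first approach described above might look roughly like the sketch below. CollectFiles is a hypothetical name, and the "[i/n]" line printed for each file merely stands in for a progress bar and for the real per-file work.

```pascal
program GatherDemo;
{$IFDEF FPC}{$MODE DELPHI}{$H+}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils, Classes;

{ Phase 1: only collect full file names. Dir must end with a delimiter. }
procedure CollectFiles(const Dir: string; Files: TStrings);
var
  SR: TSearchRec;
begin
  if FindFirst(Dir + '*', faAnyFile, SR) = 0 then
  try
    repeat
      if (SR.Name <> '.') and (SR.Name <> '..') then
        if (SR.Attr and faDirectory) <> 0 then
          CollectFiles(Dir + SR.Name + PathDelim, Files)
        else
          Files.Add(Dir + SR.Name);
    until FindNext(SR) <> 0;
  finally
    FindClose(SR);
  end;
end;

var
  Files: TStringList;
  I: Integer;
begin
  Files := TStringList.Create;
  try
    if ParamCount > 0 then
      CollectFiles(IncludeTrailingPathDelimiter(ParamStr(1)), Files)
    else
      CollectFiles(IncludeTrailingPathDelimiter(GetCurrentDir), Files);
    { Phase 2: the total is known up front, so progress can be reported. }
    for I := 0 to Files.Count - 1 do
      WriteLn(Format('[%d/%d] %s', [I + 1, Files.Count, Files[I]]));
  finally
    Files.Free;
  end;
end.
```

A cancel check would then sit in the second loop only, and the directory sweep itself stays simple.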



 
A useful function is IncludeTrailingPathDelimiter

From the D7 Help:
IncludeTrailingPathDelimiter ensures that a path name ends with a trailing path delimiter ('\' on Windows, '/' on Linux). If S already ends with a trailing delimiter character, it is returned unchanged; otherwise S with appended delimiter character is returned.

Note: IncludeTrailingPathDelimiter works with multibyte character sets.

I use this at the start of my equivalent of SweepDir, so it does not matter whether SweepDir is called by
Code:
  SweepDir( 'C:' );
or by
Code:
  SweepDir( 'C:\' );
So the revised SweepDir would look something like:
Code:
procedure TForm1.SweepDir(Dir : AnsiString);
var
  SR  : TSearchRec;
begin
  Dir := IncludeTrailingPathDelimiter( Dir );
  if FindFirst(Dir + '*.*', faAnyFile, SR) = 0 then
  begin
    repeat
      if (SR.Name <> '.') and (SR.Name <> '..') then
        {Test the directory bit; Attr may carry other bits too.}
        if (SR.Attr and faDirectory) <> 0 then
          SweepDir(Dir + SR.Name )
        else
          ProcessFile(Dir, SR.Name);
    until FindNext(SR) <> 0; //or FClosed;
    FindClose(SR);
  end;
end;
And in theory the code should work on Linux although I have never tried this.

Andrew
Hampshire, UK
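
A small check of the behaviour quoted from the help: the sketch below calls IncludeTrailingPathDelimiter on a path with and without the trailing delimiter and compares the results. The path 'logs' is just a placeholder, and PathDelim is the SysUtils constant for the platform's delimiter.

```pascal
program DelimDemo;
{$IFDEF FPC}{$MODE DELPHI}{$H+}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils;

var
  Bare, WithDelim: string;
begin
  Bare := 'logs';                 { placeholder path, no trailing delimiter }
  WithDelim := 'logs' + PathDelim;

  { A missing delimiter is appended... }
  WriteLn(IncludeTrailingPathDelimiter(Bare) = WithDelim);
  { ...and an existing one is left alone, so repeated calls are harmless. }
  WriteLn(IncludeTrailingPathDelimiter(WithDelim) = WithDelim);
end.
```

Both comparisons should print TRUE, which is why calling it at the top of a recursive routine on every level does no harm.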
 
You are calling FindNext twice in your code:
Code:
repeat
      if (SR.Name <> '.') and (SR.Name <> '..') then
        if SR.Attr = faDirectory then 
        begin
          SweepDir(Dir + SR.Name + '\');
        end
        else 
        begin
          ProcessFile(Dir, SR.Name);
        end;
        FindNext(SR); // <-- DELETE THIS LINE
    until FindNext(SR) <> 0; //or FClosed;

buho (A).
 
thanks! I'll correct that and try it on monday. Have a great rest of the weekend!

les
 