INTELLIGENT WORK FORUMS
FOR COMPUTER PROFESSIONALS

Log In

Come Join Us!

Are you a
Computer / IT professional?
Join Tek-Tips Forums!
  • Talk With Other Members
  • Be Notified Of Responses
    To Your Posts
  • Keyword Search
  • One-Click Access To Your
    Favorite Forums
  • Automated Signatures
    On Your Posts
  • Best Of All, It's Free!

*Tek-Tips's functionality depends on members receiving e-mail. By joining you are opting in to receive e-mail.

Posting Guidelines

Promoting, selling, recruiting, coursework and thesis posting is forbidden.

Jobs

Reference Material

Understanding FindFirst and FindNext by Glenn9999
Posted: 15 Dec 08 (Edited 15 Dec 08)

Understanding How FindFirst and FindNext Work

Reference Threads:
thread102-1418012: FindFirst/FindNext
thread102-1512569: findfirst - findnext

This document will be a discussion of how FindFirst/FindNext functions.  It will include a description of the behavior of the Sysutils variant and how it functions.
As well, it will describe some ideas to utilize what is going on under the hood to
provide some simplicity and some performance in the code.

The basics of FindFirst/FindNext

Most of us should know the basics of how FindFirst and FindNext work.  They are located
in the Sysutils unit.  What follows is an example that will form a basis for argument
in the reference thread that I started.  Since only minor changes will be required, I
will just post the program that I used.

CODE

{$APPTYPE CONSOLE}
program test1; uses sysutils;
  { test program #1 for FindFirst/FindNext FAQ by Glenn9999,
    demonstrates testing for attributes }

  procedure writeattr(inattr: longint);
    { tests file attributes and writes out a corresponding letter for it }
    begin
      { list of file attributes }
      if (inattr and faReadOnly) = faReadOnly then
         write('R');
      if (inattr and faHidden) = faHidden then
         write('H');
      if (inattr and faSysFile) = faSysFile then
         write('S');
      if (inattr and faVolumeID) = faVolumeID then
         write('V');
     if (inattr and faDirectory) = faDirectory then
         write('D');
      if (inattr and faArchive) = faArchive then
         write('A');
    end;

  var
    SR: TSearchRec;
    retcode: integer;
  begin
    retcode := FindFirst('C:\*.*', faAnyFile, SR);
    while retcode = 0 do
      begin
        write(SR.Name, '(':30-Length(SR.Name));
        writeattr(Sr.Attr);
        writeln(')');
        retcode := FindNext(SR);
      end;
    FindClose(SR);
    readln;
  end.

As we can see, this program lists all files in "C:\*.*" along with the attributes attached to them.  To distinguish between the attributes you test using bitmasks.
And is the operator which is used.  How this works in a technical level is
a subject for another FAQ, if necessary, but know that this is how it is done.

I selected C:\*.* because most of us can relate to what we will see (the contents of a relatively basic XP install).  The output of the above program on my system is as follows:

CODE

AUTOEXEC.BAT                 (A)
BJPrinter                    (HD)
boot.ini                     (RHSA)
CONFIG.SYS                   (A)
Documents and Settings       (D)
error.log                    (A)
FSC                          (D)
IO.SYS                       (RHSA)
MSDOS.SYS                    (RHSA)
NTDETECT.COM                 (RHSA)
ntldr                        (RHSA)
pagefile.sys                 (HSA)
Program Files                (RD)
RECYCLER                     (HSD)
sqmdata00.sqm                (HA)
sqmnoopt00.sqm               (HA)
System Volume Information    (HSD)
WINDOWS                      (D)

The issue of the first thread

To continue on to the idea that was mentioned in the first thread I started, I'll change the first program by putting faDirectory in the Attr spot of FindFirst.  We can then
observe the issue which prompted my post:

CODE

AUTOEXEC.BAT                 (A)
CONFIG.SYS                   (A)
Documents and Settings       (D)
error.log                    (A)
FSC                          (D)
Program Files                (RD)
WINDOWS                      (D)

We are told in the manual for FindFirst:

Quote:


FindFirst searches the directory specified by Path for the first file that matches the file name implied by Path and the attributes specified by the Attr parameter.
..
Attributes can be combined by adding their constants or values. For example, to search for read-only and hidden files in addition to normal files, pass (faReadOnly + faHidden) the Attr parameter.

I didn't look at the manual too closely until just now, but the bolded part is what puts some confusion into things - and why the question was asked about in the thread.  The specified attribute is included IN ADDITION to files with regular attributes.
Hence, there are 4 directories and 3 files specified.  

Admittedly it seems to work according to documentation, but it is still definitely broken in functionality, since it doesn't behave in what I would call a predictable way.  We have 3 "normal files" listed, which would be true, but only 4 of the 7 directories that we expected back returned. This is no doubt because the other directories had hidden attributes attached to them.

SFFFN
After I posted that, I experimented with writing something with a more predictable filter.  In having a look into it, I found that the filtering function related to
Attr is written in the Sysutils implementation of FindFirst as opposed to anything to
do with a Windows API call.  So I ended up writing a version which will do strict filtering, more out of interest.  I shared it because I thought it would be something
interesting to share.  You will see it at the end of that thread, but I will reproduce it here.

CODE

unit sfffn;
  { unit to encapsulate new FindFirst/FindNext/FindClose.  Certain
    optimizations were made compared to the sysutil versions - most
    important being the fact that the functions in this unit will
    *STRICTLY* filter attributes presented to it.  So don't expect this
    version to act like the Sysutils version.

    Example: faDirectory in Attr only returns directories.
             faHidden+faSysFile only returns files /directories
             with BOTH attributes

    This means there shouldn't be a need for post-filtering unless you
    want to "OR" the attributes - like if you want to return either
    faHidden or faSysFile.  To that end, this should present an
    improvement in that realm.

    The relevant definitions from Sysutils were copied into this source file
    so the added weight of Sysutils is not necessary if all you do is
    search files.  Certain helps were also gotten from the Sysutils source.

    Functions in this unit:
    SFindFirst; Like the FindFirst
    SFindNext; Like the FindNext
    SFindClose; Like the FindClose }

  interface
    uses windows;
    const
      faReadOnly  = $00000001;
      faHidden    = $00000002;
      faSysFile   = $00000004;
    { faVolumeID  = $00000008; faVolumeID seems not used in Windows anyway }
      faDirectory = $00000010;
      faArchive   = $00000020;
      faAnyFile   = $0000003F;
    type
      TFileName = string;
      LongRec = packed record
         Lo, Hi: Word;
      end;
      TSearchRec = record
        Time: Integer;
        Size: Integer;
        Attr: Integer;
        Name: TFileName;
        ExcludeAttr: Integer;
        FindHandle: THandle;
        FindData: TWin32FindData;
      end;

  function SFindFirst(const Path: string; Attr: Integer;
                      var F: TSearchRec): Integer;
  function SFindNext(var F: TSearchRec): Integer;
  procedure SFindClose(F: TSearchRec);

  implementation

procedure move_rec(var F: TSearchRec);
  var
    LocalFileTime: TFileTime;
  begin
    with F do
      begin
        Size := FindData.nFileSizeLow;
        Attr := FindData.dwFileAttributes;
        Name := FindData.cFileName;
        FileTimeToLocalFileTime(FindData.ftLastWriteTime, LocalFileTime);
        FileTimeToDosDateTime(LocalFileTime, LongRec(F.Time).Hi,
            LongRec(F.Time).Lo);
      end;
  end;

function SFindFirst(const Path: string; Attr: Integer;
                      var F: TSearchRec): Integer;
  const
    faSpecial = faHidden or faSysFile or faDirectory;
  var
    ret: boolean;
  begin
    if Attr = faAnyFile then
      F.ExcludeAttr := not Attr and faSpecial
    else
      F.ExcludeAttr := Attr;
    F.FindHandle := FindFirstFile(PChar(Path), F.FindData);
    if F.FindHandle <> INVALID_HANDLE_VALUE then
      begin
        while (F.FindData.dwFileAttributes and F.ExcludeAttr) <> F.ExcludeAttr do
          begin
            ret := FindNextFile(F.FindHandle, F.FindData);
            if (not ret) then
              begin
                Result := GetLastError;
                exit;
              end;
          end;
        move_rec(F);
        Result := 0;
      end
    else
      Result := GetLastError;
  end;

  function SFindNext(var F: TSearchRec): Integer;
    var
      ret: boolean;
    begin
      repeat
        ret := FindNextFile(F.FindHandle, F.FindData);
        if (not ret) then break;
      until (F.FindData.dwFileAttributes and F.ExcludeAttr) = F.ExcludeAttr;
      move_rec(F);
      if ret then
        Result := 0
      else
        Result := GetLastError;
    end;

  procedure SFindClose(F: TSearchRec);
    { copied from SysUtils source - nothing changes for this }
    begin
      if F.FindHandle <> INVALID_HANDLE_VALUE then
        Windows.FindClose(F.FindHandle);
    end;

end.

We can see that the FindFirst and FindNext call a Windows API function which does not
do any filtering.  You will also see that this is true of the Sysutils FindFirst and
FindNext if you load up the copy of SYSUTILS.PAS that is likely with your Delphi install
(not reproducing it here for the obvious copyright concerns).

The main difference between this version and the Sysutils version is that it does not
return files with the faArchive attribute, and will return anything with an attribute (the two main logic changes).  We will see that in recoding the program that was given
to use SFFFN.PAS

CODE

BJPrinter                    (HD)
Documents and Settings       (D)
FSC                          (D)
Program Files                (RD)
RECYCLER                     (HSD)
System Volume Information    (HSD)
WINDOWS                      (D)

We now only have directories and we have all of them.

Post-Filtering

Now as it was pointed out, it's pretty hard to come up with a filter that would suit all circumstances.  As was pointed out in the second thread, the only way to return things that are NOT faDirectory is to test for that (again an issue of bitmasks and how they work).  So we're back to doing the post-filtering, as most people seem to generally do, and what seems best.

Now the question out of this comes whether we can improve upon FindFirst/FindNext in this respect and come up with something that is at the very least a little more functionally useful, and most certainly efficient.  An attempt at that is posted
below:

CODE

unit bfffn;
  { unit to encapsulate new FindFirst/FindNext/FindClose.

    This is a bare version which does not take attribute in any form and will
    return all files - ideal for post-filtering algorithms.  Performance
    improvements can be made which departs from the TSearchRec structure to
    TWin32FindData, but those will not be made within this unit.

    The relevant definitions from Sysutils were copied into this source file
    so the added weight of Sysutils is not necessary if all you do is
    search files.  Certain helps were also gotten from the Sysutils source.

    Functions in this unit:
    BFindFirst; Like the FindFirst
    BFindNext; Like the FindNext
    BFindClose; Like the FindClose }

  interface
    uses windows;
    const
      faReadOnly  = $00000001;
      faHidden    = $00000002;
      faSysFile   = $00000004;
    { faVolumeID  = $00000008; faVolumeID seems not used in Windows anyway }
      faDirectory = $00000010;
      faArchive   = $00000020;
      faAnyFile   = $0000003F;
    type
      TFileName = string;
      LongRec = packed record
         Lo, Hi: Word;
      end;
      TSearchRec = record
        Time: Integer;
        Size: Integer;
        Attr: Integer;
        Name: TFileName;
        ExcludeAttr: Integer;
        FindHandle: THandle;
        FindData: TWin32FindData;
      end;

  { attribute not specified in BFindFirst! }
  function BFindFirst(const Path: string; var F: TSearchRec): Integer;
  function BFindNext(var F: TSearchRec): Integer;
  procedure BFindClose(F: TSearchRec);

  implementation

  procedure move_rec(var F: TSearchRec);
    var
      LocalFileTime: TFileTime;
    begin
      with F do
        begin
          Size := FindData.nFileSizeLow;
          Attr := FindData.dwFileAttributes;
          Name := FindData.cFileName;
          FileTimeToLocalFileTime(FindData.ftLastWriteTime, LocalFileTime);
          FileTimeToDosDateTime(LocalFileTime, LongRec(F.Time).Hi,
              LongRec(F.Time).Lo);
        end;
    end;

  function BFindFirst(const Path: string; var F: TSearchRec): Integer;
    begin
      F.FindHandle := Windows.FindFirstFile(PChar(Path), F.FindData);
      if F.FindHandle <> INVALID_HANDLE_VALUE then
        begin
          move_rec(F);
          Result := 0;
        end
      else
        Result := GetLastError;
    end;

  function BFindNext(var F: TSearchRec): Integer;
    var
      ret: boolean;
    begin
      ret := Windows.FindNextFile(F.FindHandle, F.FindData);
      if ret then
        begin
          move_rec(F);
          Result := 0;
        end
      else
        Result := GetLastError;
    end;

  procedure BFindClose(F: TSearchRec);
    { copied from SysUtils source - nothing changes for this }
    begin
      if F.FindHandle <> INVALID_HANDLE_VALUE then
        Windows.FindClose(F.FindHandle);
    end;

end.

As you can see a lot of code was removed, which relates to finding files by a specific
bitmask specified within Attr.  This also affords some control over the code, and
actually allows us to be specific about what is moved or not moved.

Some Testing Programs

I ran some trials against a testing program using Sysutils and this BFFFN unit, using this program.

CODE

{$APPTYPE CONSOLE}
{$DEFINE NODEBUG}
program test4; uses sysutils;
  { test program #4 for FindFirst/FindNext FAQ by Glenn9999,
    allows for benchmarking of file seeking performance.
    parses through all directories, optionally writing all the results}

  type
    DWord = Longint;

  { a timing routine }
  function TimeMS: DWord;
          stdcall; external 'winmm.dll' name 'timeGetTime';

  procedure listdirectory(indir: string);
    var
      SR: TSearchRec;
      retcode: integer;
    begin
      {$IFDEF DEBUG }
      writeln(indir);
      {$ENDIF}
      retcode := FindFirst(indir + '\*.*', faAnyFile, SR);
      while retcode = 0 do
        begin
          {$IFDEF DEBUG }
          writeln(SR.Name);
          {$ENDIF}
          if (SR.Attr and FaDirectory) = faDirectory then
            begin
              if (SR.Name <> '.') and (SR.Name <> '..') then
                ListDirectory(indir + '\' + SR.Name);
            end;
          retcode := FindNext(SR);
        end;
      Findclose(SR);
    end;

  var
    STime: Longint;
    numtimes, i: longint;
  begin
    write('Number of times to run: ');
    readln(numtimes);
    STime := TimeMS;
    for i := 1 to numtimes do
      listdirectory('C:');
    STime := TimeMS - STime;
    write('Program completed in ', STime, ' ms.  Press Enter to Exit.');
    readln;
  end.

All were run against the drive five times(51,040 files; 6,555 folders).

Using Sysutils: 3485ms
Using BFFN as posted: 3469ms
Using BFFN moving only file name and attr: 3250ms
Using the Windows variants in the main program: 3157ms

Another benchmark - five times (15300 files, 1670 folders on WinME)
Using Sysutils: 1737ms
Using BFFN as posted: 1798ms
Using BFFN moving only file name and attr: 1560ms
Using the Windows variants in the main program: 1509ms

In profiling these programs, the Windows calls, specifically NTDLL.DLL and KERNEL32.DLL take up 98% of the execution time, there is not incredibly much to tune (of this time, FindClose calls took 11%, FindNext took 3%, and FindFirst 0.2%).  Though, as you will notice, there is some room for improvement, especially if you scan a large number of files.  This difference is from the existence of the code layer on top of the Windows calls in Sysutils.

For those interested in what the program looked like with the Windows variants in it,
here that is:

CODE

{$APPTYPE CONSOLE}
{$DEFINE NODEBUG}
program test7; uses windows;
  { test program #7 for FindFirst/FindNext FAQ by Glenn9999,
    allows for benchmarking of file seeking performance.
    parses through all directories, optionally writing all the results

    change define to DEBUG in order to write the directory paths and names

    TWin32FindDataA = record
      dwFileAttributes: DWORD;
      ftCreationTime: TFileTime;
      ftLastAccessTime: TFileTime;
      ftLastWriteTime: TFileTime;
      nFileSizeHigh: DWORD;
      nFileSizeLow: DWORD;
      dwReserved0: DWORD;
      dwReserved1: DWORD;
      cFileName: array[0..MAX_PATH - 1] of AnsiChar;
      cAlternateFileName: array[0..13] of AnsiChar;
    end;
    }

  const
    faDirectory = $00000010;
  type
    DWord = Longint;

  { a timing routine }
  function TimeMS: DWord;
          stdcall; external 'winmm.dll' name 'timeGetTime';

  procedure listdirectory(indir: string);
    var
      SR: TWin32FindData;
      retcode, FindHandle: integer;
    begin
      {$IFDEF DEBUG }
      writeln(indir);
      {$ENDIF}
      FindHandle := Windows.FindFirstFile(PChar(Indir + '\*.*'), SR);
      if FindHandle <> INVALID_HANDLE_VALUE then
        Retcode := 0
      else
        Retcode := GetLastError;
      while retcode = 0 do
        begin
          {$IFDEF DEBUG }
          writeln(SR.cFileName);
          {$ENDIF}
          if (SR.dwFileAttributes and FaDirectory) = faDirectory then
            begin
              if (String(SR.cFileName) <> '.') and (String(SR.cFileName) <> '..') then
                ListDirectory(indir + '\' + SR.cFileName);
            end;
          if Windows.FindNextFile(FindHandle, SR) then
            Retcode := 0
          else
            Retcode := GetLastError;
        end;
        if FindHandle <> INVALID_HANDLE_VALUE then
           Windows.FindClose(FindHandle);
    end;

  var
    STime: Longint;
    numtimes, i: longint;
  begin
    write('How many times to run: ');
    readln(numtimes);
    STime := TimeMS;
    for i := 1 to numtimes do
      listdirectory('C:');
    STime := TimeMS - STime;
    write('Program completed in ', STime, ' ms.  Press Enter to Exit.');
    readln;
  end.

Back to Embarcadero: Delphi FAQ Index
Back to Embarcadero: Delphi Forum

My Archive

Resources

Close Box

Join Tek-Tips® Today!

Join your peers on the Internet's largest technical computer professional community.
It's easy to join and it's free.

Here's Why Members Love Tek-Tips Forums:

Register now while it's still free!

Already a member? Close this window and log in.

Join Us             Close