Tek-Tips is the largest IT community on the Internet today!

Perl grep

Status
Not open for further replies.

pmcmicha

Technical User
May 25, 2000
353
I have a job which runs with the following code:


if($rdir eq "test1") {
    if(scalar(@ex_test1) != 0) {
        foreach $file (@file_list0) {
            chomp($file);
            if(grep(/$file/, @ex_test1) == 0) {
                MTIME(); ## print current time
                print LOG "New file.\n";
                push @file_list1, "$file";
            }
        }
    }
    print LOG "Done.\n";
}


My script connects to various servers, lists the contents of specified directories, and compares them against my previous pulls to remove any duplicates. I only need the new files. The files are generated by users, and I have no control over how they name them. This is the problem: one of the files has (text) as part of its name, and it keeps getting pulled each time. How can I prevent this?

Thanks in advance.
 
Could be your grep statement.
Code:
            if(grep(/$file/, @ex_test1) == 0) {
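treats the contents of $file as a regular expression and tries to match it anywhere in each of the elements of @ex_test1. Metacharacters in a file name, such as the parentheses in (text), are therefore read as regex syntax rather than as literal characters. Try changing to this: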
Code:
            if(!grep {$_ eq "$file"} @ex_test1) {
This compares the file name to each element in @ex_test1 for exact string equality, instead of doing a pattern match.
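To see the difference in isolation, here is a minimal sketch. The file names are hypothetical, made up for illustration, and the \Q...\E variant is an alternative not mentioned in the posts above: it escapes the metacharacters so a pattern match can still be used if one is wanted.

```perl
use strict;
use warnings;

# Hypothetical file names, for illustration only.
my @ex_test1 = ('report(text).csv', 'notes.txt');
my $file     = 'report(text).csv';

# Unescaped: the parentheses form a capture group, so the pattern looks for
# "reporttext", any one character, then "csv". The literal name never matches.
my $regex_hits = grep { /$file/ } @ex_test1;

# \Q...\E escapes the metacharacters; ^ and \z force a whole-string match.
my $quoted_hits = grep { /^\Q$file\E\z/ } @ex_test1;

# Plain string equality, as in the suggested fix.
my $eq_hits = grep { $_ eq $file } @ex_test1;

print "regex: $regex_hits, quoted: $quoted_hits, eq: $eq_hits\n";
# prints "regex: 0, quoted: 1, eq: 1"
```

Because the unescaped pattern finds zero matches, the original test treats the file as new on every run, which matches the symptom described.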

HTH

 
mikevh,

Works like a champ, thanks.
 
Thanks. This is probably better though, without grep:
Code:
        foreach $file (@file_list0) {
            chomp($file);
            my $found = 0;
            for (my $i=0; $i<@ex_test1 && !$found; $i++) {
                $found = ($ex_test1[$i] eq $file);
            }
            if (!$found) {
                MTIME();  ## print current time
                print LOG "New file.\n";
                push @file_list1, "$file";
            }
        }
grep tests every element of its list (it returns the matches in list context, or their count in scalar context), so it searches all the way to the end of @ex_test1 for each $file, even if it finds a match early. Unless @ex_test1 is short, this becomes very inefficient. You're better off with the above, which quits as soon as a match is found.
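The same early exit can be had without a hand-written loop. This is a sketch, not something from the thread: it assumes a List::Util recent enough to export any (version 1.33 or later), and the data and the $comparisons counter are hypothetical, added only to show how many elements actually get tested.

```perl
use strict;
use warnings;
use List::Util qw(any);

# Hypothetical data, for illustration only.
my @ex_test1   = ('a.txt', 'b(text).txt', 'c.txt');
my @file_list0 = ("a.txt\n", "b(text).txt\n", "d.txt\n");

my @file_list1;
my $comparisons = 0;    # counts how many elements were actually tested

foreach my $file (@file_list0) {
    chomp $file;
    # any() stops at the first element for which the block returns true.
    next if any { $comparisons++; $_ eq $file } @ex_test1;
    push @file_list1, $file;
}

print "new: @file_list1; comparisons: $comparisons\n";
# prints "new: d.txt; comparisons: 6"
```

With grep in place of any, the same data would need nine comparisons, since every element is tested for every file.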

HTH

 
Or, another way:
Code:
my %seen;
$seen{$_}++ for @ex_test1;
foreach my $file (@file_list0) {
    chomp $file;
    unless ($seen{$file}++) {
         MTIME();  ## print current time
         print LOG "New file.\n";
         push @file_list1, $file;
    }
}
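A self-contained version of the hash approach, with hypothetical data and the MTIME() and LOG calls left out so it runs on its own. One thing worth noticing: because $seen{$file} is incremented on every lookup, a name repeated within @file_list0 is also only kept once.

```perl
use strict;
use warnings;

# Hypothetical data, for illustration only.
my @ex_test1   = ('a.txt', 'b(text).txt');
my @file_list0 = ("b(text).txt\n", "c.txt\n", "c.txt\n");

my %seen;
$seen{$_}++ for @ex_test1;    # mark every previously pulled file

my @file_list1;
foreach my $file (@file_list0) {
    chomp $file;
    # Post-increment returns the old count, which is false only on first sight.
    push @file_list1, $file unless $seen{$file}++;
}

print "new: @file_list1\n";
# prints "new: c.txt"
```

Each lookup is a single hash access, so the cost no longer grows with the length of @ex_test1 for every file checked.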
 