Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Wanet Telecoms Ltd on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Need less expensive way of doing this . . .

Status
Not open for further replies.

youthman

Programmer
Apr 28, 2004
49
US
Good day folks!

I have a problem that I am hopeing some of you will be able to help with. I am setting up the server to run a CRON job on a series of files, and it is taking it forever to run, and is pulling a TON of system resources while it is working. Here is what I am doing. I have to download a series of files and make a database based on what is in those files. Here are the specifics:

1. The first thing that the system does is opens two files and combines them based on their contents. The first is a file of office names and codes the second is a file of agent names and codes. The agents are matched by a single office code that is found in a column in both files. This file is now saved to the server.
2. Now the script has to download a file that contains the directory contents on another server. This is loaded into memory and compare against a forth file. The forth file is the heart of the system. It is a 27 Meg file that is combined with the rest of the files created. One column is compared against the resulting file in step one for information, and another column is compared against the information in memory. A new file is created with all the remaining information. This file ends up being close to 30 meg when it is done.

This entire process is taking about 4 hours to complete, and my server host is not happy about it. Is there a way to make this less taxing on the server?

Here is the heart of the code!
Code:
print $foo->header;


$counter = 0;
$line_to_write = "";

 $database_path = blah.../httpdocs/ResData.txt";
 $database_path2 = blah.../httpdocs/ResData-processed2.txt";
 $members = "blah.../httpdocs/Member.txt";
 $omembers = "blah.../httpdocs/Office.txt";
 $agents="blah.../httpdocs/Agents.txt";
 $image_file="[URL unfurl="true"]http://blah.../images/imagefiles.txt";[/URL]

 unlink("$database_path2");
 unlink("$agents");

 $list = get("$image_file") || die "Can't get image file list:  $!";
 @files = split (/\n/, $list);

$dbcounter = 0;

# ---------------------------------------------- Find Member Info

 open DATAB, "< $members" or die $foo->header, "Can't open Members database: $!\n";
 while (<DATAB>) {
   if($dbcounter == 0){
    $dbcounter++;
     next;
   }
   ($FirstName, $LastName, $PubID, $AgentPhone, $AgentFax, $UIDOffice, $AgentEmail, $AgentWeb, $MobilePhone) = split /	/;
      open DATAO, "< $omembers" or die $foo->header, "Can't open office database: $!\n";
     while (<DATAO>) {
    ($OfficeName, $OfficeID, $Address1, $Address2, $City, $State, $Zip, $OfficePhone, $OfficeFax, $OfficeWeb) = split /	/;
        if (($UIDOffice ne "") and ($UIDOffice eq $OfficeID))  {
           last;
        }
     }
     close DATAO;
     chop $MobilePhone;
     if($MobilePhone eq "") {
       $MobilePhone = "None";
     }
     $agent_name = "$FirstName $LastName";
     $agent_phone = "$AgentPhone  Cell:  $MobilePhone";
     $line = "$agent_name	$PubID	$agent_phone	$OfficePhone	$OfficeFax	$OfficeName	$AgentEmail\n";
      open (OUTDB3, ">>$agents") or die $foo->header, "Can't write to agents2 FILE: $!";
       print OUTDB3 "$line";
      close OUTDB3;
  $dbcounter++;

 }

 close DATAB;

open DATA, "< $database_path" or die $foo->header, "Can't open Homes database: $!\n";

while (<DATA>) {
  if($counter == 0){
    $counter++;
    next;
  }

    ($Acres, $AcresDesc, $AgentCoList, $AgentList, $AgentSell, $Amenities, $Appliances, $Architecture, $Area, $Areas, $AuctionDate, $AuctionLicense, $AuctionManager, $AuctionMgrLicense, $AuctionTime, $AuctionType, $BaseDesc, $BasementYN, $BathsArea, $BathsFull, $BathsFullBasement, $BathsFullMain, $BathsFullUpper, $BathsHalf, $BathsHalfBasement, $BathsHalfMain, $BathsHalfUpper, $BathsTotal, $Bed2Dim, $Bed2Floors, $Bed2Level, $Bed2Windows, $Bed3Dim, $Bed3Floors, $Bed3Level, $Bed3Windows, $Bed4Dim, $Bed4Floors, $Bed4Level, $Bed4Windows, $Bed5Dim, $Bed5Floors, $Bed5Level, $Bed5Windows, $Beds, $BldgNum, $BoardLoad, $BuildersName, $CapRate, $City, $ClosedDate, $ClosingCosts, $CoListOtherPhone1, $CoListOtherPhone1Type, $CoListOtherPhone2, $CoListOtherPhone2Type, $CoListOtherPhone3, $CoListOtherPhone3Type, $CommissionDisc, $CompBuy, $CompletionDate, $CompSell, $CompVariable, $ConstructionStatus, $ContactName, $ContactName2, $ContactPhone, $ContactPhone2, $ContactType, $ContactType2, $Contingency, $CoolSystem, $County, $CountyTax, $DaysOnMarket, $Directions, $Disclosures, $Documents, $DupFlag, $EconomicOccupancy, $Editor, $Equipment, $ExpenseActual, $ExpenseActualCap, $ExpenseCAM, $ExpenseCAMSF, $ExpenseInsurance, $ExpenseInsuranceSF, $ExpenseMisc, $ExpenseMiscSF, $ExpenseStop, $ExpireDate, $FireplaceDesc, $Fireplaces, $Foundation, $Garage, $GarageCap, $GarageDesc, $geoCodeDate, $geoResult, $GrossRentMultiplier, $HeatSrc, $HeatSystem, $ImprovementsAllowed, $IncomeGrossPotential, $IncomeNetOperating, $Inspection, $Interior, $InternetAddress, $InternetList, $InternetRemarks, $Interstate, $Investment, $Latitude, $LeaseAmountSF, $LeaseExisting, $LeasePriceSF, $LeaseProposed, $LeaseProposedTerm, $LeaseRemainTerm, $LeaseTerms, $LeaseType, $LeaseTypeSold, $Legal, $LegalTax, $Levels, $Lifestyles, $ListDate, $ListOtherPhone1, $ListOtherPhone1Type, $ListOtherPhone2, $ListOtherPhone2Type, $ListOtherPhone3, $ListOtherPhone3Type, $ListPrice, $ListPriceChange, $ListPriceOrig, $ListStatus, $ListType, $LoanInterestRate, $LoanMandatoryFee, $LoanMortgageAmount, $LoanYears, $Location, $Longitude, $LotInfo, $LotNum, $LotSize, $LotSizeTax, $MandFeeInclude, $MandFeePaid, $MapE, $MapN, $MapNumEW, $MapNumNS, $MLSNUM, $Modified, $NewConstruction, $NewConstructionDesc, $OfficeExtension, $OfficeList, $OfficeSell, $OpenHouseDate, $OpenHouseRemarks, $OpenHouseTime, $OwnershipInterest, $PendingDate, $PhotoCount, $PhotoDate, $PhotoInstr, $PhysicalOccupancy, $PointsBuyer, $PointsSeller, $Porch, $Possession, $PossessionCom, $PossibleFinance, $PresentUse, $ProposedUse, $PropStatus, $PropType, $Remarks, $RemarksAgent, $RemarksCommission, $Renovated, $Repairs, $ReportDetails, $ResCondo, $RoadAccess, $RoadFrontageCom, $RoadSurface, $RoomBreakfastDim, $RoomBreakfastFloors, $RoomBreakfastLevel, $RoomBreakfastWindows, $RoomDiningDesc, $RoomDiningDim, $RoomDiningFloors, $RoomDiningLevel, $RoomDiningWindows, $RoomFamilyDim, $RoomFamilyFloors, $RoomFamilyLevel, $RoomFamilyWindows, $RoomGreatDim, $RoomGreatFloors, $RoomGreatLevel, $RoomGreatWindows, $RoomKitchenDim, $RoomKitchenFloors, $RoomKitchenLevel, $RoomKitchenWindows, $RoomLivingDim, $RoomLivingFloors, $RoomLivingLevel, $RoomLivingWindows, $RoomMasterBedDesc, $RoomMasterBedDim, $RoomMasterBedFloors, $RoomMasterBedLevel, $RoomMasterBedWindow, $RoomOther1, $RoomOther1Dim, $RoomOther1Floors, $RoomOther1Level, $RoomOther1Windows, $RoomOther2, $RoomOther2Dim, $RoomOther2Floors, $RoomOther2Level, $RoomOther2Windows, $RoomOther3, $RoomOther3Dim, $RoomOther3Floors, $RoomOther3Level, $RoomOther3Windows, $Rooms, $SalesPrice, $SaleTerms, $SchoolDistrict, $Section, $SellerMotivation, $Sewage, $Siding, $SoldTerms, $SolidWaste, $SolidWasteTax, $Special, $SqFtBasement, $SqFtBasementLabel, $SqFtBasementTax, $SqFtMain, $SqFtMainUpper, $SqFtOptions, $SqFtOptionsAmt, $SqFtPercent, $SqFtPriceMin, $SqFtSource, $SqFtTotal, $SqFtTotalBase, $SqFtTotalCalc, $SqFtTotalTax, $SqFtUpper, $State, $StreetDir, $StreetDirTax, $StreetName, $StreetNameTax, $StreetNum, $StreetNumDisplay, $StreetNumTax, $StreetType, $StreetTypeTax, $TaxAmount, $TaxAmountSemi, $TaxAmountSemiTax, $TaxAmountTax, $TaxExemption, $TaxExemptionTax, $TaxID, $TaxIDMultiple, $TaxSqFt, $TaxYear, $TaxYearPaid, $Township, $TownshipTax, $TownTax, $TransType, $TransTypeSold, $UID, $UIDPrp, $UtilityOptions, $VacancyLoss, $WaterHeater, $WaterSrc, $WithdrawnDate, $YearBuilt, $YearBuiltTax, $Zip4, $Zip4Tax, $ZipCode, $ZipCodeTax, $ZoningCom) = split /	/;


# ---------------------------------------------- Find Member Info

   open DATAB, "< $agents" or die $foo->header, "Can't open agents3 database: $!\n";
   while (<DATAB>) {
    ($agent_name, $PubID, $agent_phone, $OfficePhone, $OfficeFax, $OfficeName, $AgentEmail) = split /	/;
        if (($AgentList ne "") and ($AgentList eq $PubID))  {
           last;
         }
    }
    close DATAB;

# -----------------------------------------------------------  Check images
$im1="$MLSNUM\_1.jpg";
$im2="$MLSNUM\_2.jpg";
$im3="$MLSNUM\_3.jpg";
$im4="$MLSNUM\_4.jpg";
$im5="$MLSNUM\_5.jpg";
$im6="$MLSNUM\_6.jpg";
$im7="$MLSNUM\_7.jpg";
$im8="$MLSNUM\_8.jpg";
$im9="$MLSNUM\_9.jpg";
$im0="$MLSNUM\_0.jpg";

if ($PhotoCount > 0) {
 foreach $item (@files){
         $item =~ s/\s+//g;
 chomp $item;

    if ($item eq "$im0"){
      $image0="[URL unfurl="true"]http://blah...//images/PicturesAll/$im0";[/URL]
     if ($PhotoCount == 1) {
       last;
     }
    }
    if ($item eq $im1){
      $image1="[URL unfurl="true"]http://blah...//images/PicturesAll/$im1";[/URL]
     if ($PhotoCount == 2) {
       last;
     }
    }
    if ($item eq $im2){
      $image2="[URL unfurl="true"]http://blah.../images/PicturesAll/$im2";[/URL]
     if ($PhotoCount == 3) {
       last;
     }
    }
    if ($item eq $im3){
      $image3="[URL unfurl="true"]http://blah...//images/PicturesAll/$im3";[/URL]
     if ($PhotoCount == 4) {
       last;
     }
    }
    if ($item eq $im4){
      $image4="[URL unfurl="true"]http://blah...//images/PicturesAll/$im4";[/URL]
     if ($PhotoCount == 5) {
       last;
     }
    }
    if ($item eq $im5){
      $image5="[URL unfurl="true"]http://blah...//images/PicturesAll/$im5";[/URL]
     if ($PhotoCount == 6) {
       last;
     }
    }
    if ($item eq $im6){
      $image6="[URL unfurl="true"]http://blah...//images/PicturesAll/$im6";[/URL]
     if ($PhotoCount == 7) {
       last;
     }
    }
    if ($item eq $im7){
      $image7="[URL unfurl="true"]http://blah...//images/PicturesAll/$im7";[/URL]
     if ($PhotoCount == 8) {
       last;
     }
    }
    if ($item eq $im8){
      $image8="[URL unfurl="true"]http://blah...//images/PicturesAll/$im8";[/URL]
     if ($PhotoCount == 9) {
       last;
     }
    }
    if ($item eq $im9){
      $image9="[URL unfurl="true"]http://blah...//images/PicturesAll/$im9";[/URL]
      last;
    }
 }
}

$line_to_write = "$Acres	$AcresDesc	$Amenities	$Appliances	$Architecture	$Area	$Areas	$BaseDesc	$BasementYN	$BathsArea	$BathsFull	$BathsFullBasement	$BathsFullMain	$BathsFullUpper	$BathsHalf	$BathsHalfBasement	$BathsHalfMain	$BathsHalfUpper	$BathsTotal	$Bed2Dim	$Bed2Floors	$Bed2Level	$Bed2Windows	$Bed3Dim	$Bed3Floors	$Bed3Level	$Bed3Windows	$Bed4Dim	$Bed4Floors	$Bed4Level	$Bed4Windows	$Bed5Dim	$Bed5Floors	$Bed5Level	$Bed5Windows	$Beds	$BuildersName	$City	$CompletionDate	$ConstructionStatus	$Contingency	$CoolSystem	$County	$Directions	$Disclosures	$Equipment	$ExpireDate	$FireplaceDesc	$Fireplaces	$Foundation	$Garage	$GarageDesc	$HeatSrc	$HeatSystem	$Inspection	$Interior	$InternetList	$InternetRemarks	$Legal	$Levels	$Lifestyles	$ListDate	$ListPrice	$ListType	$Location	$LotInfo	$LotNum	$LotSize	$MandFeeInclude	$MandFeePaid	$MapE	$MapN	$MapNumEW	$MapNumNS	$MLSNUM	$NewConstruction	$NewConstructionDesc	$OfficeList	$OwnershipInterest	$PhotoCount	$PhysicalOccupancy	$Porch	$Possession	$PossessionCom	$PossibleFinance	$Remarks	$RemarksAgent	$RemarksCommission	$Repairs	$RoomBreakfastDim	$RoomBreakfastFloors	$RoomBreakfastLevel	$RoomBreakfastWindows	$RoomDiningDesc	$RoomDiningDim	$RoomDiningFloors	$RoomDiningLevel	$RoomDiningWindows	$RoomFamilyDim	$RoomFamilyFloors	$RoomFamilyLevel	$RoomFamilyWindows	$RoomGreatDim	$RoomGreatFloors	$RoomGreatLevel	$RoomGreatWindows	$RoomKitchenDim	$RoomKitchenFloors	$RoomKitchenLevel	$RoomKitchenWindows	$RoomLivingDim	$RoomLivingFloors	$RoomLivingLevel	$RoomLivingWindows	$RoomMasterBedDesc	$RoomMasterBedDim	$RoomMasterBedFloors	$RoomMasterBedLevel	$RoomMasterBedWindow	$RoomOther1	$RoomOther1Dim	$RoomOther1Floors	$RoomOther1Level	$RoomOther1Windows	$RoomOther2	$RoomOther2Dim	$RoomOther2Floors	$RoomOther2Level	$RoomOther2Windows	$RoomOther3	$RoomOther3Dim	$RoomOther3Floors	$RoomOther3Level	$RoomOther3Windows	$Rooms	$SchoolDistrict	$Section	$Sewage	$Siding	$SolidWaste	$Special	$SqFtBasement	$SqFtBasementLabel	$SqFtMain	$SqFtMainUpper	$SqFtOptions	$SqFtPercent	$SqFtPriceMin	$SqFtSource	$SqFtTotal	$SqFtTotalBase	$SqFtTotalCalc	$SqFtUpper	$State	$StreetDir	$StreetName	$StreetNum	$StreetNumDisplay	$StreetType	$TaxAmount	$TaxAmountSemi	$TaxExemption	$TaxYear	$TaxYearPaid	$Township	$UtilityOptions	$WaterHeater	$WaterSrc	$YearBuilt	$Zip4	$ZipCode	$agent_name	$agent_phone	$OfficePhone	$OfficeFax	$OfficeName	$image0	$image1	$image2	$image3	$image4	$image5	$image6	$image7	$image8	$image9	$AgentEmail	$ZoningCom\n";

$counter++;

open (OUTDB2, ">>$database_path2") or die "Can't write to DB2 FILE: $!";
  print OUTDB2 "$line_to_write";
close OUTDB2;


$image0="";
$image1="";
$image2="";
$image3="";
$image4="";
$image5="";
$image6="";
$image7="";
$image8="";
$image9="";

}

$dbh = DBI->connect('DBI:mysql:redatabase;host=localhost', 'un', 'pass') || die "Database connection not made: $DBI::errstr";

$sth = $dbh->prepare("DELETE FROM `hometest`");
 
$sth->execute;
$sth->finish();

$sth = $dbh->prepare("LOAD DATA LOCAL INFILE '$database_path2' REPLACE INTO TABLE hometest");

$sth->execute;
$sth->finish();
$dbh->disconnect();



    # -------------------------------------------- Find the time
     ($sec, $min, $hour, $mday, $mon, $year, $wday, $yday, $isdst) = localtime(time);
     $year = ($year + 1900);
     $wday = (Sun, Mon, Tue, Wed, Thu, Fri, Sat)[(localtime)[6]];
     $mon = (Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec)[(localtime)[4]];
     if($hour < 13){
       $side = "AM";
     } 
     if($hour > 12){
       $hour = $hour - 12;
       $side = "PM"; 
     }
     if($sec < 10){
       $sec = "0$sec"; 
     }
     if($hour < 10){
       $hour = "0$hour";
     }

$actual_time = "$wday,$mon $mday at $hour:$min:$sec $side";

print "Database update completed on $actual_time was successful!!!!! \n \n Inserted $counter homes into SQL database! \n \n";

unlink("$database_path2");
unlink("$agents");
print "Done !<BR><BR><BR>";
 
If your host is getting teed about it, why not process all the files on a machine in your own network, and just put the results back up to the server.

You should think about Databases for this kind of stuff especially with the volume of data, which is by no means massive, but run on a shared server it's gonna p!ss other users off. The benfits of a database are that you only need to call the data your're interested in

Regards
--Paul

It's important in life to always strike a happy medium, so if you see someone with a crystal ball, and a smile on their face ...
 
Thanks for the fast response Paul. The problem I have is that I need this to run automatically. The files themselves are a flat-file database format that is tab delimited. They are being processed into a SQL database for the application that the information is used in. Any other ideas?
 
Correct me if I'm wrong but it appears that you're processing the complete list of data every time you run this script. Surely the entire list doesn't change every day, does it? Is there a way to just update what's changed rather than rehash the entire list every day? This will certainly be much faster and more economical than the way you're currently doing it.

There's always a better way. The fun is trying to find it!
 
Unfortunatly, no. I've pursued that idea as well. The data is a list of homes that are for sale in the area. There are over 300 columns in the original database and over 17,800 lines. The second file is member offices, and that changes rarely, but the third is member agents and that changes about every two days. So of the three files, one changes about once a month, one changes about twice a week and the third changes daily.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top