Need Urgent Help with Perl Lookup & Substitution Routine 1

Status
Not open for further replies.

EricSilver

Technical User
Mar 18, 2008
19
US
Hello,

I am a new user having a problem getting what should be a simple routine to work.

What I am doing is opening an address file and a state name file, so I can change state abbreviations, e.g. “AZ”, to full state names, e.g. “Arizona.”

After the address file is opened, the state name file is opened. This file consists of three fields: 1.) Unique identifier; 2.) State abbreviation; 3.) Full state name.

The routine compares the address file state abbreviation data value to the abbreviation field value in the state name file. If it matches, the Address File abbreviation data element is changed to the State Name File full name data element.

If the Address File abbreviation is not in the state name file, I want the routine to print “error” in place of the full state name. Unfortunately, that part is not working. Instead of printing "ERROR" for the one Address File record that has no match, it prints "ERROR" for all of them. Any assistance would be appreciated. Here is what I have:


## FILE LOCATIONS
$file='/File.txt'; ## THE ADDRESS FILE
$maplocation='/Map.txt'; ## THE STATE NAME LOOKUP FILE
$file2='File2.txt'; ## THE MODIFIED ADDRESS FILE


## OPEN ADDRESS FILE AND ADD CONTENTS TO AN ARRAY
open(FILE,"<$file")||die "Could not open $file";
@file=<FILE>;
close FILE;

## FOR EACH RECORD IN THE ADDRESS FILE, DO THE FOLLOWING
foreach $line (@file) {
@data=split(/\t/,$line);

## CREATE VARIABLES CORRESPONDING TO ADDRESS FILE DATA
## (This step is not really necessary)
$d0=$data[0];
$d1=$data[1];
$d2=$data[2];
$d3=$data[3];
$d4=$data[4];
$d5=$data[5];
$d6=$data[6];
$d7=$data[7];
$d8=$data[8];
$d9=$data[9];
$d10=$data[10];

## OPEN STATE NAMES FILE AND ADD CONTENTS TO AN ARRAY
open(MAP,"<$maplocation");
@entries = <MAP>;
close MAP;

## FOR EACH RECORD IN THE STATE NAMES FILE, DO THE FOLLOWING
foreach $line2 (@entries) {
@fields=split(/,/,$line2);

## COMPARE ADDRESS FILE STATE ABBREVIATION DATA TO STATE FILE ABBREVIATION DATA. (THIS CODE WORKS PERFECTLY)

if ($d8 eq $fields[1]) {$d8=$fields[2]};

## $d8 is the address file abbreviation value; $fields[1]
## is the State File abbreviation value; and $fields[2] is
## the state file full name value.

## IF ADDRESS FILE STATE ABBREVIATION IS NOT PRESENT IN STATE NAME FILE, PRINT ERROR (THIS CODE FAILS):

if ($d8 eq $fields[1]) {$d8=$fields[2]} else {$d8="error"};
}

## INSTEAD OF PRINTING "ERROR" FOR ONE RECORD, IT PRINTS ERROR FOR ALL OF THEM.

## WRITE OUTPUT TO FILE
$line= "$d0"."$d1"."$d2"."$d3"."$d4"."$d5"."$d6"."$d7"."$d8"."$d9"."$d10"."\n";

};

open(DATA,">$file2");
print DATA (@file);
close DATA;
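
A likely reading of the failure in the routine above: the `else` sits inside the inner loop, so it runs once for every state row that does not match. Even when one row matches and `$d8` becomes the full name, the next row no longer matches and `$d8` is overwritten with "error". A minimal sketch of one way around this, assuming the three comma-separated fields described in the post, is to load the lookup into a hash once and decide the fallback once per address record. The names `load_states` and `expand_state` and the sample rows are hypothetical, not part of the original routine.

```perl
use strict;
use warnings;

# Build an abbreviation => full-name hash from lookup rows of the form
# "id,abbreviation,full name" (the three fields described in the post).
sub load_states {
    my @rows = @_;
    my %states;
    for my $row (@rows) {
        chomp(my $copy = $row);            # drop the trailing newline
        my (undef, $abbrev, $name) = split /,/, $copy;
        next unless defined $abbrev && defined $name;
        $states{$abbrev} = $name;
    }
    return %states;
}

# Return the full name for an abbreviation, or "error" when it is unknown.
# The fallback is decided once per address record, not once per lookup row.
sub expand_state {
    my ($abbrev, $states) = @_;
    return exists $states->{$abbrev} ? $states->{$abbrev} : 'error';
}

# Hypothetical sample rows standing in for the contents of Map.txt.
my %states = load_states("1,AZ,Arizona\n", "2,NY,New York\n");

print expand_state('AZ', \%states), "\n";   # Arizona
print expand_state('ZZ', \%states), "\n";   # error
```

Inside the address loop this would replace the inner `foreach`: `$d8 = expand_state($d8, \%states);`. Reading the state name file once, before the address loop, also avoids reopening it for every address record.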
 
Things start to become complex.
You now have the possibility of multiple matches: taking your example of NY and NYC, you'll likely need to decide that you are looking for the longest possible match (with a minimum of two characters). In that case I would do it this way (as a replacement for the last two lines above):
Code:
  $d8=uc($data[8]);
  while(length($d8)>1){
    if(exists$states{$d8}){
      $data[8]=$states{$d8};
      last;
    }
    $d8=substr($d8,0,-1);
  }
  $data[8]='Error'if length($d8)<2;


Franco
: Online engineering calculations
: Magnetic brakes for fun rides
: Air bearing pads
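
For anyone who wants to try the longest-match loop above in isolation, here is a minimal self-contained sketch of the same idea. The contents of `%states`, the full names in it, and the sub name `longest_prefix_name` are assumptions made for illustration; in the thread, `%states` is filled from the state name file.

```perl
use strict;
use warnings;

# Hypothetical lookup table; the real one would come from the state name file.
my %states = (NY => 'New York', NYC => 'New York City', AZ => 'Arizona');

# Try the whole value first, then drop one trailing character at a time
# until a key is found or fewer than two characters remain.
sub longest_prefix_name {
    my ($value, $lookup) = @_;
    my $key = uc $value;
    while (length($key) > 1) {
        return $lookup->{$key} if exists $lookup->{$key};
        $key = substr($key, 0, -1);        # remove the last character
    }
    return 'Error';
}

print longest_prefix_name('NYC', \%states), "\n";   # New York City
print longest_prefix_name('nyx', \%states), "\n";   # New York
print longest_prefix_name('Q',   \%states), "\n";   # Error
```

Because the loop starts from the full value and only shortens it, a longer key such as NYC is always found before a shorter one such as NY.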
 
Thanks Franco,

In some cases (with fields other than the state abbreviations we have been discussing here) an exact length will be required, i.e., where NYC and NYCVB are unique values and a precise match is required.

NYCCVB could, for instance, match NYC; however, NYCVB, as a unique value, must match only NYCVB, and not NYC.

You’re right, it is complex (and confusing) but if I can dynamically change the size of the substring length - substr($abbrev,0,2) - before doing the lookup, things become easy.

I already know I can make the substring length variable, i.e. substr($abbrev,0,$var). I know I can identify the specific lookup value that corresponds to a submitted value, and, by extension, get the length() of that lookup value.

My task, therefore, is to preload that length() value as substr (,,$var) just before the lookup occurs, throughout the loop cycle. If that works, I will be very happy.
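
One possible reading of the approach described above, sketched under assumptions: for each lookup value, take its `length()`, cut the submitted value down to that length with `substr`, and compare. Trying the longest lookup values first keeps NYCVB from being claimed by NYC. The table `%codes`, its placeholder values, and the sub name `match_by_key_length` are hypothetical and are not the code that was actually used in the thread.

```perl
use strict;
use warnings;

# Hypothetical lookup values of different lengths, as in the NYC / NYCVB example.
my %codes = (NYC => 'entry for NYC', NYCVB => 'entry for NYCVB');

# For each lookup key, longest first, cut the submitted value down to that
# key's length and compare. Returns the matching key, or nothing if none fits.
sub match_by_key_length {
    my ($value, $lookup) = @_;
    for my $key (sort { length($b) <=> length($a) || $a cmp $b } keys %$lookup) {
        my $len = length $key;             # the substr length, set per key
        return $key if substr($value, 0, $len) eq $key;
    }
    return;
}

for my $value (qw(NYCVB NYCCVB LAX)) {
    my $key = match_by_key_length($value, \%codes);
    print "$value => ", (defined $key ? $key : 'no match'), "\n";
}
```

With these sample keys, NYCVB matches only NYCVB, NYCCVB falls through to NYC, and LAX matches nothing.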
 
In saying that things become complex I was referring to the fact that you must be able to define exactly the rules to follow before writing any code.
Anyway, as you seem to be looking for the longest possible match, look closely at my code above: it does that (and I hope this thread will stop here).

Franco
 
The "on the fly" solution I described has been done, and works as required, so yes, the thread can end.
 