
Pattern Matching road block

Status
Not open for further replies.

cbh35711

Programmer
I'm trying to get a report I run from our system into a tab-delimited format. I've pretty much got the pattern down, save one part. The system puts tabs after fields that aren't too long, but after the long ones it puts spaces.
MIDWEST BENEFIT PHARMACY\t\t\t\t then spaces
CORNERSTONE PHARMACY SERVICES LLC-INDIANAPOLIS ***CORP.*** no \t just spaces
At the end of both of those lines there's another field containing one of three numbers. Since not every field ends in a tab, and the names I want to capture in one field contain spaces, I figured I could end the field at one of those three numbers in the next field. This is where I'm stuck; I can't figure out how :)
So basically m%(.* ending with) 58914, 0088, or 0066%
I'm just not sure how to write that last part.
Also, those numbers won't appear anywhere else on the line.
I know I could use $, but I'm not sure how to group those numbers together so it wouldn't just stop on the first single digit it matched.

Thank you for any help you may be able to offer,

Chris
 
Work out how many spaces a tab represents. Use a regex to replace each tab with n spaces. By expanding the line, all the fields should now start in fixed columns. Then just substring the line up into variables. You might have to strip off trailing blanks from each of the variables to clean them up before you use them for further processing.

Not the snazzy regex solution you were hoping for, I expect, but without a reliable delimiter between fields there isn't much else you can do...

Steve

"Every program can be reduced by one instruction, and every program has at least one bug. Therefore, any program can be reduced to one instruction which doesn't work." (Object::PerlDesignPatterns)
 
Groovy idea. Here's the kicker: in all this system's infinite wisdom, it seems to vary the width of the tabs... or something like that, because that doesn't seem to work.
When I look at this file in Notepad it looks great; if I could draw the Excel "delimit by line" column breaks in it, things would work out great. Does that give you any idea of something else I can try?

Thanks all,

Chris
 
It doesn't work because tabs don't equate to a fixed number of spaces; they move the cursor to the next 'tab stop' in the application you're viewing the file in. In Notepad, I believe the tab stops are every 8 characters.
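To see the tab-stop behaviour for yourself, here is a minimal sketch using the core Text::Tabs module, which expands each tab to however many spaces are needed to reach the next stop. The sample lines are made up for illustration, and the width of 8 is only the Notepad figure mentioned above, so treat both as assumptions.

```perl
use strict;
use warnings;
use Text::Tabs;    # core module; exports expand() and $tabstop

$tabstop = 8;      # assumed tab-stop width (the Notepad figure above)

# Hypothetical sample lines, not taken from the real report.
my @lines = ("MIDWEST BENEFIT PHARMACY\t\t58914", "AB\tC");

for my $line (expand(@lines)) {
    # Each tab has become between 1 and 8 spaces, whatever it took
    # to reach the next multiple of 8 columns.
    printf "%3d chars: %s\n", length $line, $line;
}
```

A blanket s/\t/        /g gives different columns than this, which is why replacing every tab with the same number of spaces doesn't line the fields up.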

Code:
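use strict;
use warnings;

while (<DATA>) {
  # Capture everything up to the first of the three known codes;
  # \s* keeps the trailing tabs/spaces out of the captured field.
  my ($field) = m%(.*?)\s*(?:58914|0088|0066)%
    or next;    # skip lines that contain none of the codes
  print "$field\n";
}
__DATA__
MIDWEST BENEFIT PHARMACY\t\t\t\t\t58914
CORNERSTONE PHARMACY SERVICES LLC-INDIANAPOLIS ***CORP.***    0088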

Replace those \t's with actual tabs in your editor...
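If the goal is tab-delimited output, one way to take this a step further is to capture the code as well and join the two pieces with a single tab. This is only a sketch: split_line is a made-up helper name, and it assumes the code is the last thing on the line, as described in the original question.

```perl
use strict;
use warnings;

# Hypothetical helper: split one report line into (name, code).
# Returns an empty list if the line doesn't end in one of the codes.
sub split_line {
    my ($line) = @_;
    # Name = shortest prefix that is followed by optional whitespace
    # and one of the three known codes at the end of the line.
    return $line =~ m%^\s*(.*?)\s*(58914|0088|0066)\s*$% ? ($1, $2) : ();
}

my @samples = (
    "MIDWEST BENEFIT PHARMACY\t\t\t\t\t58914",
    "CORNERSTONE PHARMACY SERVICES LLC-INDIANAPOLIS ***CORP.***    0088",
);

for my $line (@samples) {
    my ($name, $code) = split_line($line) or next;   # skip non-matching lines
    print join("\t", $name, $code), "\n";
}
```

The double-quoted strings here contain real tabs, so nothing needs replacing by hand. Spaces inside the name are left alone; only the padding between the name and the code is dropped.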
 