Environment is AIX v5.2 with a standard installation of Perl. System engineers and corporate security refuse to allow me to add any modules beyond the basic, standard installation we currently have.
With that out of the way, I was handed - literally on my way out the door today - a 'request' to provide a Perl script by COB tomorrow which will edit an extremely large delimited file. Breaking it apart into segments is no problem and the specs were just emailed to me here at home. However, on looking those specs over I see what I think is a need for some regex ... which means I'm in trouble. Every time in the past that I thought I understood regex, I was proven wrong ... and with the looming deadline, I don't have time to experiment.
Among the requirements for editing are these 4 - the only ones that I don't know how to do:
Determine if a field is all numeric
Determine if a field contains only numbers and spaces (in any order)
Determine if a field contains letters, spaces, periods and commas only (in any order)
Determine if a field contains only numbers and dashes (in a fixed pattern, e.g., SSN, phone number, etc.)
These are demographic fields on company employees, fields such as telephone number (xxx-xxx-xxxx), SSN (xxx-xx-xxxx), name (with Jr., Dr., etc.) and so forth.
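For reference, here is a minimal sketch of how those four checks might look using nothing beyond core Perl. The sub names are hypothetical, and it assumes an empty field should fail every check and that names contain only unaccented ASCII letters; the real specs may differ. The pieces: `\A` and `\z` pin the pattern to the very start and end of the field, `[...]` lists the characters allowed, `+` means "one or more of those", and `{3}` means "exactly three".

```perl
use strict;
use warnings;

# Hypothetical helper names. Each returns 1 if the whole field matches, else 0.
# Note: an empty field fails every check because + requires at least one character.

# 1. All numeric: one or more digits and nothing else.
sub is_all_digits    { return $_[0] =~ /\A[0-9]+\z/ ? 1 : 0 }

# 2. Digits and spaces only, in any order (a field of only spaces also passes).
sub is_digits_spaces { return $_[0] =~ /\A[0-9 ]+\z/ ? 1 : 0 }

# 3. Letters, spaces, periods and commas only, in any order.
sub is_name_chars    { return $_[0] =~ /\A[A-Za-z .,]+\z/ ? 1 : 0 }

# 4. Fixed digit-and-dash patterns: xxx-xx-xxxx and xxx-xxx-xxxx.
sub is_ssn           { return $_[0] =~ /\A[0-9]{3}-[0-9]{2}-[0-9]{4}\z/ ? 1 : 0 }
sub is_phone         { return $_[0] =~ /\A[0-9]{3}-[0-9]{3}-[0-9]{4}\z/ ? 1 : 0 }

# Made-up sample fields, just to show the calls.
for my $field ('12345', '12 34', 'Smith, John Jr.', '123-45-6789', '555-123-4567') {
    printf "%-16s digits=%d digits_spaces=%d name=%d ssn=%d phone=%d\n",
        $field,
        is_all_digits($field), is_digits_spaces($field), is_name_chars($field),
        is_ssn($field), is_phone($field);
}
```

`\z` is used instead of `$` because `$` would also accept a field that ends in a newline, which matters if a line is read without `chomp`.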
I can certainly brute force this by checking each character of each string and looking at patterns, etc. but I'm sure there is a much faster way to do it using regex ... and that's what I need.
However, rather than just having someone provide the code, I'd really like to learn as I go, so I would greatly appreciate an explanation of what's happening within each statement so I can add it to my knowledge base.
Any assistance is greatly appreciated and, as always, thanks in advance.
Best,
Tom
"My mind is like a steel whatchamacallit ...