Skip over variables when importing flat file into sas
Skip over variables when importing flat file into sas
(OP)
I wrote the following code to extract a flat file into sas
data lib.test2;
length Rank 3 Code1 $3 Drug $25 Manufacturer $25 Sales 8 PerChg 8;
infile 'c:\Test2.txt' DLM=' ()';
input Rank Code1 Drug Manufacturer Sales PerChg ;
run;
Here is a sample of what the data looks like in the flat file
3 Plavix Bristol-Myers Squibb Company 4,223,124 11.2%
4 Advair Diskus GlaxoSmithKline 3,653,410 2.3%
5 (1) Seroquel AstraZeneca Pharmaceuticals 3,117,591 7.2%
6 (6) Abilify Bristol-Myers Squibb Company 3,083,351 30.0%
7 Singulair Merck & Co., Inc. 3,027,378 4.5%
The reason I have a variable called Code1 is because some of the entries have a number with parenthesis. As an example two of the entries have such a sequence. I have about 200 records to import. Is there a way to skip the numbers with a parenthesis. All I really want is
Rank
Drug
Manufacturer
Sales
Percentage
data lib.test2;
length Rank 3 Code1 $3 Drug $25 Manufacturer $25 Sales 8 PerChg 8;
infile 'c:\Test2.txt' DLM=' ()';
input Rank Code1 Drug Manufacturer Sales PerChg ;
run;
Here is a sample of what the data looks like in the flat file
3 Plavix Bristol-Myers Squibb Company 4,223,124 11.2%
4 Advair Diskus GlaxoSmithKline 3,653,410 2.3%
5 (1) Seroquel AstraZeneca Pharmaceuticals 3,117,591 7.2%
6 (6) Abilify Bristol-Myers Squibb Company 3,083,351 30.0%
7 Singulair Merck & Co., Inc. 3,027,378 4.5%
The reason I have a variable called Code1 is because some of the entries have a number with parenthesis. As an example two of the entries have such a sequence. I have about 200 records to import. Is there a way to skip the numbers with a parenthesis. All I really want is
Rank
Drug
Manufacturer
Sales
Percentage
RE: Skip over variables when importing flat file into sas
You could do sequenced inputs with reading the line bit by bit, here I prefer sucking the whole line in and cutting everything around the manufacturer name. Being lazy I coded the cutting as macro to be able to reuse it on different variables.
CODE
%macro mCutStr (in, out, cutFrom=L, dlm=" ");
%if %upcase("&CUTFROM.") eq "L" %then %do;
&OUT. = scan (&IN., 1, &DLM.);
&IN. = left (substr (&IN., length(&OUT.)+1));
%end; %else %do;
&OUT. = scan (&IN., -1, &DLM.);
&IN. = left (substr (&IN., 1, length(&IN.)-length(&OUT.)-1));
%end;
%mend;
%*-- cut number OUT from side CUTFROM of string IN with delimiter DLM. ;
%*-- use TEMP as a buffer variable. ;
%macro mCutNum (in, out, temp, cutFrom=L, dlm=" ");
%if %upcase("&CUTFROM.") eq "L" %then %do;
&TEMP. = scan (&IN., 1, &DLM.);
&IN. = left (substr (&IN., length(&TEMP.)+1));
&OUT. = input (&TEMP., BEST.);
%end; %else %do;
&TEMP. = scan (&IN., -1, &DLM.);
&IN. = left (substr (&IN., 1, length(&IN.)-length(&TEMP.)-1));
&TEMP. = compress (&TEMP., ',');
&OUT. = input (&TEMP., BEST.);
%end;
%mend;
data WORK.test (keep = rank drug manufacturer sales percentage);
infile 'c:\temp\test.txt'
length=linelen;
length invar $500
dummyStr $10
rank 8
drug $20
manufacturer $50
sales 8
percentage 8;
/*-- read the whole line into one string. */
input invar $varying. linelen;
/*-- read the rank (first word on string) */
%mCutNum (invar, rank, dummyStr);
/*-- drop an optional parenthesis number */
if invar =: '(' then
invar = left (substr (invar, index(invar, ')')+1));
/*-- get the drug (one (!) word on the left) */
%mCutStr (invar, drug);
/*-- drop pct sign on the right */
invar = substr (invar, 1, length (trim(invar))-1);
/*-- cut percentage and sales from the right */
%mCutNum (invar, percentage, dummyStr, cutfrom=R);
%mCutNum (invar, sales, dummyStr, cutfrom=R);
/*-- what is left: manufacturer incl. opt. spaces. */
manufacturer = invar;
run;
RE: Skip over variables when importing flat file into sas
If you are using vi editor then in your flatfile type the following command in command mode
:%s/ (.*)//g
This would take out all the characters and make your input file clean and ready to read into SAS.
Regards,
Swetha