×
INTELLIGENT WORK FORUMS
FOR COMPUTER PROFESSIONALS

Contact US

Log In

Come Join Us!

Are you a
Computer / IT professional?
Join Tek-Tips Forums!
  • Talk With Other Members
  • Be Notified Of Responses
    To Your Posts
  • Keyword Search
  • One-Click Access To Your
    Favorite Forums
  • Automated Signatures
    On Your Posts
  • Best Of All, It's Free!

*Tek-Tips's functionality depends on members receiving e-mail. By joining you are opting in to receive e-mail.

Posting Guidelines

Promoting, selling, recruiting, coursework and thesis posting is forbidden.

Students Click Here

Skip over variables when importing flat file into sas

Skip over variables when importing flat file into sas

Skip over variables when importing flat file into sas

(OP)
I wrote the following code to extract a flat file into sas
data lib.test2;
length Rank 3 Code1 $3 Drug $25 Manufacturer $25 Sales 8 PerChg 8;
infile 'c:\Test2.txt' DLM=' ()';
input Rank Code1 Drug Manufacturer Sales PerChg ;
run;

Here is a sample of what the data looks like in the flat file

3 Plavix Bristol-Myers Squibb Company 4,223,124 11.2%
4 Advair Diskus GlaxoSmithKline 3,653,410 2.3%
5 (1) Seroquel AstraZeneca Pharmaceuticals 3,117,591 7.2%
6 (6) Abilify Bristol-Myers Squibb Company 3,083,351 30.0%
7 Singulair Merck & Co., Inc. 3,027,378 4.5%
The reason I have a variable called Code1 is because some of the entries have a number with parenthesis. As an example two of the entries have such a sequence. I have about 200 records to import. Is there a way to skip the numbers with a parenthesis. All I really want is
Rank
Drug
Manufacturer
Sales
Percentage

 

RE: Skip over variables when importing flat file into sas

I tend to provide lengthy solutions, yet this makes sure you can handle these optional cases. Principle problems with you data are an optional variable placed on the data line, plus you have no delimiter between drug and manufacturer; I assume the drug is a single word, but that seems to not be the case - happy fixing that one smile

You could do sequenced inputs with reading the line bit by bit, here I prefer sucking the whole line in and cutting everything around the manufacturer name. Being lazy I coded the cutting as macro to be able to reuse it on different variables.

CODE

%*--  cut string OUT from side CUTFROM of string IN with delimiter DLM.  ;
%macro mCutStr (in, out, cutFrom=L, dlm=" ");
  %if %upcase("&CUTFROM.") eq "L" %then %do;
    &OUT. = scan (&IN., 1, &DLM.);
    &IN. = left (substr (&IN., length(&OUT.)+1));
  %end; %else %do;
    &OUT. = scan (&IN., -1, &DLM.);
    &IN. = left (substr (&IN., 1, length(&IN.)-length(&OUT.)-1));
  %end;
%mend;

%*--  cut number OUT from side CUTFROM of string IN with delimiter DLM.  ;
%*--  use TEMP as a buffer variable.                                     ;
%macro mCutNum (in, out, temp, cutFrom=L, dlm=" ");
  %if %upcase("&CUTFROM.") eq "L" %then %do;
    &TEMP. = scan (&IN., 1, &DLM.);
    &IN. = left (substr (&IN., length(&TEMP.)+1));
    &OUT. = input (&TEMP., BEST.);
  %end; %else %do;
    &TEMP. = scan (&IN., -1, &DLM.);
    &IN. = left (substr (&IN., 1, length(&IN.)-length(&TEMP.)-1));
    &TEMP. = compress (&TEMP., ',');
    &OUT. = input (&TEMP., BEST.);
  %end;
%mend;

data WORK.test (keep = rank drug manufacturer sales percentage);
  infile 'c:\temp\test.txt'
         length=linelen;
  length invar        $500
         dummyStr     $10
         rank         8
         drug         $20
         manufacturer $50
         sales        8
         percentage   8;
  /*--  read the whole line into one string.  */
  input invar $varying. linelen;
  /*--  read the rank (first word on string)  */
  %mCutNum (invar, rank, dummyStr);
  /*--  drop an optional parenthesis number  */
  if invar =: '(' then
    invar = left (substr (invar, index(invar, ')')+1));
  /*--  get the drug (one (!) word on the left)  */
  %mCutStr (invar, drug);
  /*--  drop pct sign on the right  */
  invar = substr (invar, 1, length (trim(invar))-1);
  /*--  cut percentage and sales from the right  */
  %mCutNum (invar, percentage, dummyStr, cutfrom=R);
  %mCutNum (invar, sales, dummyStr, cutfrom=R);
  /*--  what is left: manufacturer incl. opt. spaces.  */
  manufacturer = invar;
run;

RE: Skip over variables when importing flat file into sas

This solution is good if your flat file is under unix. You can remove all the characters with the parenteses as follows:

If you are using vi editor then in your flatfile type the following command in command mode  

:%s/ (.*)//g

This would take out all the characters and make your input file clean and ready to read into SAS.

Regards,
Swetha

Red Flag This Post

Please let us know here why this post is inappropriate. Reasons such as off-topic, duplicates, flames, illegal, vulgar, or students posting their homework.

Red Flag Submitted

Thank you for helping keep Tek-Tips Forums free from inappropriate posts.
The Tek-Tips staff will check this out and take appropriate action.

Reply To This Thread

Posting in the Tek-Tips forums is a member-only feature.

Click Here to join Tek-Tips and talk with other members! Already a Member? Login

Close Box

Join Tek-Tips® Today!

Join your peers on the Internet's largest technical computer professional community.
It's easy to join and it's free.

Here's Why Members Love Tek-Tips Forums:

Register now while it's still free!

Already a member? Close this window and log in.

Join Us             Close