mostly formatted data?


Hi everyone. I'm trying to write an awk script to read a "mostly" formatted file. A chunk of the input file is shown below:

Ca A 4861A -0.113 0.6733 He 2 1025A -2.425 0.0033 S II 1.037m -2.475 0.0029 Ar 3 517.5A -2.592 0.0022
Ca A 4340A -0.438 0.3190 He 2 992.4A -2.635 0.0020 S II 1.034m -2.178 0.0058 Ar 3 387.3A -2.867 0.0012
Ca A 4102A -0.692 0.1779 He 2 972.2A -2.810 0.0014 S 2 1256A -2.824 0.0013 Ar 3 517.4A -2.937 0.0010
Ca A 3970A -0.900 0.1100 He 2 4686A -2.088 0.0071 S 3 18.67m -0.371 0.3721 Ar 4 444.7A -2.124 0.0066
Ca A 3889A -1.078 0.0730 He 2 3203A -2.501 0.0028 S 3 33.47m -1.122 0.0661 Ar 4 392.0A -2.777 0.0015
Ca A 3835A -1.234 0.0510 He 2 2733A -2.791 0.0014 S 3 1720A -2.430 0.0033 Ar 4 859.3A -2.567 0.0024

I want a script that, given a name like "He 2 4686A" (second column, fourth row), will give the two numbers that follow it, here -2.088 and 0.0071. The problem is that I never know which column the name will appear in, so I don't know the field positions of the two numbers.
I have no previous awk experience, but I think now is a great time to learn it.

RE: mostly formatted data?


Given your sample data, how do you know that you want those two numbers?


RE: mostly formatted data?

Because the output format is
"emission line name" "wavelength" "flux" and "line ratio"
He 2 4686A flux lineratio

What happens is that, depending on my model, I get a different number of lines, so a particular line appears in a different place every time. But I always want the two numbers following the line name and the wavelength, that is, the two fields after "He 2 4686A".

RE: mostly formatted data?

Here is a link to what the output looks like. It was made to be readable by humans, not machines =/

RE: mostly formatted data?


Should I understand that you need the values from the two columns immediately following the ones that contain the name you searched for?

A not so elegant solution, supposing the column separators are spaces:

CODE --> command-line

awk -v search='He 2 4686A' '{ for (i = 1; i < NF; i++) if ($i " " $(i+1) " " $(i+2) == search) print $(i+3), $(i+4) }' /input/file
Tested with gawk and mawk.
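
For anyone who wants the same sliding-window idea in a reusable form, here is a rough sketch wrapped in a shell function. It is not from the posts above: the function name lookup and the variables want and found are made up for illustration, and it assumes fields are separated by whitespace and that the two values directly follow the name. Unlike the one-liner, it does not assume the name is exactly three words, and it returns a non-zero exit status when the name is never found.

```shell
# lookup NAME [FILE...]: print the two fields that follow NAME wherever it occurs.
# (hypothetical helper; reads standard input when no FILE is given)
lookup() {
    name=$1; shift
    awk -v search="$name" '
    BEGIN { n = split(search, want, " ") }    # n = number of words in the name
    {
        # Slide over the fields, leaving room for the two values after the name.
        for (i = 1; i + n + 1 <= NF; i++) {
            for (j = 1; j <= n; j++)
                if ($(i + j - 1) != want[j]) break
            if (j > n) {                      # every word of the name matched
                print $(i + n), $(i + n + 1)
                found = 1
            }
        }
    }
    END { exit !found }                       # status 1 if the name never appeared
    ' "$@"
}

# Example with the fourth row of the sample data; prints: -2.088 0.0071
lookup 'He 2 4686A' <<'EOF'
Ca A 3970A -0.900 0.1100 He 2 4686A -2.088 0.0071 S 3 18.67m -0.371 0.3721 Ar 4 444.7A -2.124 0.0066
EOF
```

Comparing the name word by word rather than as one joined string means the search does not depend on how many spaces separate the columns in the file.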


RE: mostly formatted data?

Ah, I see what you did. Thanks, that's pretty straightforward.
