Tabular DATA manipulation

I'm producing a long tabular text file by extracting information from a set of log files. I want to run some operations on the resulting table and write the results to a new tabular text file.

My tabular data looks like this:

CODE --> awk

Compound	State		Method		Approach	S^2		Energy			Path
C(CCH)2        	singlet   	CC        	TO   		ERROR   ->	input issue or ?	3-1/C-CCH-2/C-CCH-2-CC-s.out
C(CCH)2        	singlet   	CC        	TO   		1.108791	-191.426232325854 	3-1/C-CCH-2/C-CCH-2-s.out
C(CCH)2        	triplet   	CC        	TO   		2.235993	-191.434509836762 	3-1/C-CCH-2/C-CCH-2-t.out
C(NH2)2        	triplet   	DFT       	TO   		ERROR   ->	input issue or ?	3-1/C-NH2-2/C-NH2-2-t.out
C(NMe2)2       	triplet   	DFT       	TO   		ERROR   ->	input issue or ?	3-1/C-NMe2-2/C-NMe2-2-t.out
C(SH)2         	singlet   	CC        	TO   		ERROR   ->	input issue or ?	3-1/C-SH-2/C-SH-2-CC-s.out
C(SH)2         	singlet   	DFT       	TO   		0.000006	-835.261598037781 	3-1/C-SH-2/C-SH-2-s.out
C(SH)2         	triplet   	DFT       	TO   		2.034097	-835.190581480918 	3-1/C-SH-2/C-SH-2-t.out
C(SiH3)2       	singlet   	CC        	TO   		ERROR   ->	SCF NOT CONVERGED	3-1/C-SiH3-2/C-SiH3-2-CC-s.out
C(SiH3)2       	triplet   	CC        	TO   		ERROR   ->	input issue or ?	3-1/C-SiH3-2/C-SiH3-2-CC-t.out
C(SiH3)2       	singlet   	DFT       	TO   		0.000224	-620.339326760127! 	3-1/C-SiH3-2/C-SiH3-2-s.out
C(SiH3)2       	triplet   	DFT       	TO   		2.013503	-620.379515709604 	3-1/C-SiH3-2/C-SiH3-2-t.out
CF2            	singlet   	CC        	TO   		0.000000	-237.419131945340 	3-1/CF2/CF2-CC-s.out
CF2            	singlet   	DFT       	TO   		-0.000000	-237.686609290184 	3-1/CF2/CF2-s.out 

The code producing it is below:

CODE --> bash

# Print the header once. A BEGIN-only awk program reads no input.
awk '
BEGIN           {print "Compound\tState\t\tMethod\t\tApproach\tS^2\t\tEnergy\t\t\tPath"}'

# NOTE: OPS, SS, CONV and PROB are never assigned in this excerpt, and no rule
# here prints the energy column; the rules that do so appear to be missing from the post.
find . -name '*.out' | while IFS= read -r FILENAME
do
awk '
# First line of each file: derive compound, state, method and approach from the path.
FNR==1          {if (FILENAME ~ /-/) 
                  { sub(/^\.\//,"", FILENAME);m=split(FILENAME, Ti, "/") 
                                         n=split(Ti[m], T, "-")
                                         if (length(T[1]) < 2 ) {T[1]=T[1]"("T[2]")"substr(T[3],1,1)}
                                         printf("%-15.10s\t%-10s\t%-10s\t%-5s\t\t",  T[1], substr(T[n],1,1)=="t"?"triplet":"singlet", FILENAME~"-CC"?"CC":"DFT",FILENAME~"3-1"?"TO":"NONE");
                  }
                 else
                  { sub(/^\.\//,"", FILENAME);m=split(FILENAME, Ti, "/") 
                                         n=split(Ti[m], T, ".")
                                         if (length(T[1]) < 2 ) {T[1]=T[1]"("T[2]")"T[3]}
                                         printf ("%-15.10s\t%-10s\t%-10s\t%-5s\t\t", T[1] , "Singlet", "DFT",FILENAME~"3-1"?"TO":"NONE   ");
                  }
                }

# Multiplicity markers from the input echo (MULT==1 was a comparison, not an assignment).
!OPS &&
/xyz 0 1/ {MULT=1}

!OPS &&
/xyzfile 0 1/ {MULT=1}

/The optimization did not converge but reached the maximum number of/ { OPT=1 }
/An error has occured in the MDCI module/ { MDCI=1 }   
/HURRAY/        {FOUND=1}
FOUND && !OPS &&
/THE OPTIMIZATION HAS CONVERGED/ {printf "%s\t","Restricted"}

/SCF NOT CONVERGED AFTER/ {printf "%s\t","SCF Crash!"}

/Expectation value of/ { printf ("%s\t",$6)
    if (!PROB) { print $NF " \t"  FILENAME }
    else       { print $NF "! \t" FILENAME }
}
END             {if (!CONV && !SS){printf "%s\t","ERROR   ->"}
    if (!CONV && OPT==1) {print "NOT OPTIMIZED\t\t" FILENAME}
else if(!CONV && PROB==1) {print "SCF NOT CONVERGED\t" FILENAME}
else if(!CONV && MDCI==1) {print "MDCI MODULE ERROR\t" FILENAME}
else if(!CONV && !PROB && CONV!=1 && MDCI!=1){print "input issue or ?\t" FILENAME} 
}
' OFS="\t" "$FILENAME"
done

Whenever two rows match on the Compound, Method and Approach columns, I want their energy values subtracted from each other, exactly as singlet minus triplet, and the results collected into a new table. For example, these two rows:

CODE --> awk

C(CCH)2        	singlet   	CC        	TO   		1.108791	-191.426232325854 	3-1/C-CCH-2/C-CCH-2-s.out
C(CCH)2        	triplet   	CC        	TO   		2.235993	-191.434509836762 	3-1/C-CCH-2/C-CCH-2-t.out 

would form one row:

CODE --> awk

C(CCH)2   	CC        	TO	0.008277510908 

Of course, if no match is found, a simple error such as "match not found" or "data not available" is enough. Your help is appreciated, and thanks in advance.
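
One possible way to do the pairing described above is sketched below in awk, wrapped in a shell function. It is only a sketch under several assumptions: the function name `st_gap` is made up for illustration, columns are taken to be separated by runs of tabs with space padding as in the sample table, rows whose S^2 column starts with `ERROR` are treated as having no energy, a trailing `!` on an energy is stripped before subtracting, and when a group has several valid rows for the same state the first one is kept. Groups that lack a valid singlet or a valid triplet get the text "match not found".

```shell
#!/bin/sh
# st_gap: hypothetical helper name. Prints one singlet-minus-triplet row per
# (Compound, Method, Approach) group found in the table file(s) given as arguments.
st_gap() {
  awk '
  BEGIN { FS = "\t+"; OFS = "\t" }          # runs of tabs separate the columns

  # Strip the space padding around a field.
  function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }

  {
    state = tolower(trim($2))
    if (state != "singlet" && state != "triplet") next   # header or stray line
    key = trim($1) OFS trim($3) OFS trim($4)
    if (!(key in seen)) { seen[key] = 1; order[++n] = key }  # keep input order
    if ($5 ~ /^ERROR/) next                  # failed job: no energy to record
    energy = trim($6)
    sub(/!$/, "", energy)                    # drop the "!" warning flag
    if (!((key, state) in E)) E[key, state] = energy     # first valid value wins
  }

  END {
    for (i = 1; i <= n; i++) {
      key = order[i]
      if (((key, "singlet") in E) && ((key, "triplet") in E))
        printf "%s\t%.12f\n", key, E[key, "singlet"] - E[key, "triplet"]
      else
        print key, "match not found"
    }
  }' "$@"
}
```

With the table saved as, say, `table.txt` (a placeholder name), `st_gap table.txt > gaps.txt` would write the new table. On the two C(CCH)2 rows quoted above this gives 0.008277510908, the same value as the example row. Energies are subtracted as double-precision numbers, so the twelve printed decimals are about as many as this input can support.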
