
SED / AWK

Status
Not open for further replies.

abovebrd

IS-IT--Management
May 9, 2000
690
US
I am in need of some help. On a routine basis I have large database files that I need to extract data from. I then need to put the extracted data into new files.

The data is delimited.

Example:

****
Joe Smith
408-123-4567
Sample Co Inc.

(followed by several more lines)

****

The common delimiter between each set of data is ****
Each set is then moved to a new file. I repeat this 100 or so more times until the source file is empty. I have the ability to change the delimiter. The new files can be named anything. The only requirement is that they have a .tag extension.

My question: Does anyone know how I can automate this process with a script? I think SED would accomplish this, but I am not familiar enough with SED to write it.

Any thoughts on automating this task would be helpful.

Danny.
dannyd@aboveboardelectronics.com
 
Danny,

It sounds like awk might be the tool for the job. You'll have to describe the output format you need more explicitly for a more detailed answer though... do you need it to be comma-delimited rows? SQL insertion statements?

Annihilannic.
 
Here is a sample of my source file:

cvr=none
tfn=408-573-5542
fll=<<EOF

ABC Company

00402932 IN 05/26/00     16.00      0.00     16.00      0.00      0.00      0.00
00403326 IN 05/31/00    197.32      0.00    197.32      0.00      0.00      0.00
                    ------------------------------------------------------------


                        213.32      0.00    213.32      0.00      0.00      0.00
EOF



cvr=none
tfn=408-573-5542
fll=<<EOF

XYZ Company

00405461 IN 06/13/00    144.29      0.00    144.29      0.00      0.00      0.00
                    ------------------------------------------------------------


                        144.29      0.00    144.29      0.00      0.00      0.00
EOF

I need to grab all of the information between cvr=none and EOF. I then need to move that information to a new file. I would then need to repeat the process on the next occurrence, and so on, until all occurrences are moved to new files.
My source file has around 30,000 lines. There would also be around 500 files generated from this process.

If you have any ideas I would love to hear them.

Danny Daniels
dannyd@aboveboardelectronics.com
 
Is perl available on this server? If it is, I have something I've used in the past that would probably work for you as well.

--
0 1 - Just my two bits
 
This awk script does the trick:
-------------------- 8< -----------------------
#!/usr/bin/awk -f
BEGIN {
        # Anything before the first 'cvr=' line goes to record.0
        FILEINDEX=0
        OUTFILENAME="record." FILEINDEX
}
/^cvr=/ {
        # Start of a new record: close the previous file, move to the next name
        close(OUTFILENAME)
        FILEINDEX++
        OUTFILENAME="record." FILEINDEX
}
{
        # Every line is written to whichever file is current
        print $0 > OUTFILENAME
}
-------------------- 8< -----------------------

Basically, every time it encounters a line beginning with 'cvr=' it switches to a new output file.

I tried to do it using '\nEOF\n' as a record separator, but it didn't seem to work... if anyone can tell me why I'd appreciate it!

Annihilannic.
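
A possible variation on the script above, sketched under assumptions taken from earlier in the thread: each record starts at a line beginning with cvr= and ends at a line containing only EOF, and the output files need a .tag extension. The names source.txt and record.N.tag, and the shortened two-record sample, are illustrative and not from the original data.

```shell
#!/bin/sh
# Work in a scratch directory so the sketch does not touch real files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical two-record sample in the format shown in the thread.
printf 'cvr=none\ntfn=408-573-5542\nfll=<<EOF\n\nABC Company\n\nEOF\n\n\n\ncvr=none\ntfn=408-573-5542\nfll=<<EOF\n\nXYZ Company\n\nEOF\n\n' > source.txt

awk '
/^cvr=/ { n++; out = "record." n ".tag"; inrec = 1 }  # a record starts here
inrec   { print > out }                                # copy lines only while inside a record
/^EOF$/ { if (inrec) close(out); inrec = 0 }           # a record ends here
' source.txt
```

Lines between one EOF and the next cvr= are skipped rather than appended to the previous file. Closing each file as soon as its record ends should also keep the number of simultaneously open files at one, which may matter with around 500 outputs.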
 
Annihilannic,

I can change the syntax of the record separator if that helps.

AndyBo,

Perl is not loaded on this server, but installing Perl is not out of the question.

Danny Daniels
dannyd@aboveboardelectronics.com
 
Annihilannic: "\nEOF\n" probably wouldn't work because what you've actually got is "^EOF\n", i.e., from the beginning of the line ("^") look for "EOF" followed by a newline ("\n").

Danny: Installing a full-blown perl might be overkill to solve a single problem. I'd go with the "awk" solution first, as awk will already be installed. If you're having problems getting the awk to run, post back and we'll get into some perl hacking :)

--
0 1 - Just my two bits
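
A side note on the record-separator question, offered as a sketch rather than a diagnosis of what happened on the original server. POSIX leaves the behaviour unspecified when RS holds more than one character, and traditional awk implementations use only its first character, so "\nEOF\n" may simply have acted like a plain newline. gawk and mawk treat a multi-character RS as a regular expression, and the example below assumes one of those is what `awk` runs. The names source.txt and rs.N.tag and the shortened sample are hypothetical.

```shell
#!/bin/sh
# Assumes an awk that treats multi-character RS as a regex (gawk, mawk).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical two-record sample in the format shown in the thread.
printf 'cvr=none\ntfn=408-573-5542\nfll=<<EOF\n\nABC Company\nEOF\n\n\ncvr=none\ntfn=408-573-5542\nfll=<<EOF\n\nXYZ Company\nEOF\n' > source.txt

awk 'BEGIN { RS = "\nEOF\n" }        # one input record per cvr= ... EOF block
{
    sub(/^\n+/, "")                  # drop blank lines left between blocks
    if ($0 == "") next               # ignore trailing whitespace-only input
    n++
    out = "rs." n ".tag"
    print $0 "\nEOF" > out           # put back the EOF line that RS consumed
    close(out)
}' source.txt
```

The "fll=<<EOF" line is not mistaken for a separator here because the pattern requires a newline immediately before EOF.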
 
Thanks AndyBo, Annihilannic:

I ran the awk script earlier today and it seemed to do the trick. In fact it worked great!

Thanks guys,

Just one more question:
Can you recommend a good source for learning AWK?
Maybe a good book?

Danny Daniels
dannyd@aboveboardelectronics.com
 
I would say the "classic" Sed & Awk reference is "Sed & Awk" from O'Reilly. There are condensed highlights taken from this book in the "Unix in a Nutshell" and "Unix Power Tools" books, also from O'Reilly. It's probably better to go for the original, though, if you want to really get down and learn some Awk.

--
0 1 - Just my two bits
 