using an external script in awk

(OP)
Hello to all,

I cannot find a solution to this:

I have a file, file1.txt, that looks like
4567;9803298ß;38840 (thousands and thousands of lines)

And an external script, "dosomething".
Now I need to do the following steps:
1. Open the file
2. Cut the first 4 digits
3. Use them as a parameter to call an external script like: dosomething xxxx
4. Use the result of "dosomething xxxx" to replace the first 4 digits of this line
5. Print the line in the output.

I am working with awk and I cannot find a way to use this external script in it...

Can someone PLEASE help?

Thanks!

RE: using an external script in awk

Hi

Sorry, this is an off-topic answer, but my curiosity has reached its limits.

Why do you need an Awk solution for this ?

Do not take it personally, this is not strictly related to you or your question. Over the years I have seen people come and ask for Awk solutions in cases where I cannot imagine why.

For example, I would solve your problem like this, definitely without involving Awk :

CODE --> command-line

while IFS=';' read -r begin end; do
  echo "$( dosomething "$begin" );$end"
done < file1.txt 
The above works in Bash, Dash, MKsh.

Feherke.
http://feherke.github.com/

RE: using an external script in awk

(OP)
Hi feherke,

I think you are right... I am a newbie, so I have a lot to learn!!
And I am glad every time I get an idea from people who have experience like you!!

Thank you very much for the answer. I used it and it is exactly what I need...

Theo

RE: using an external script in awk

(OP)
Hmm, I realise now that it is very slow...
My file is 80 MB and the script has been running for 3 hours...

I tested it with a small file and it worked fine, but I did not realise that it would take so long with my original file...

RE: using an external script in awk

Hi

I am afraid there is not much to optimize in that code. But some strategies may help.

Maybe caching ? Previously you wrote :

Quote (Theo)

4567;9803298ß;38840 (thousands and thousands of lines)

(...)

2. Cut the first 4 digits

Given the huge number of lines and the shortness of the codes, is it possible that the 4 digit codes are not unique ? In that case we could run dosomething only once for a given code and save its output, then reuse that saved output later without running dosomething again.

Maybe parallelising ? Some versions of xargs and make are able to execute tasks in parallel. This is especially useful if dosomething has idle times during its run or you have a multicore processor. But even if not, running multiple dosomething processes at the same time should help. Of course, if the order of the output matters, this becomes a bit more complicated, but bearable.
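
A minimal sketch of the parallel idea, assuming an xargs that supports the -P option ( GNU and BSD xargs do, POSIX does not require it ). The dosomething used here is only a stand-in script so that the example runs on its own, and the file names and sample lines are made up. Running the unique codes first and applying the results afterwards keeps the output in the original order :

```shell
#!/bin/sh
# Sketch: run the external command in parallel over the unique codes,
# then apply the resulting lookup table to the file in its original order.
work=$(mktemp -d)

# Stand-in for the real dosomething binary (assumption: prints one number).
cat > "$work/dosomething" <<'EOF'
#!/bin/sh
echo $(( $1 + 1 ))
EOF
chmod +x "$work/dosomething"

# Made-up sample input in the format described in the thread.
printf '%s\n' '4567;aaa;1' '1234;bbb;2' '4567;ccc;3' > "$work/file1.txt"

# Unique codes -> up to 4 parallel runs, each printing "code;result".
cut -d';' -f1 "$work/file1.txt" | sort -u |
  xargs -P 4 -I{} sh -c 'printf "%s;%s\n" "$1" "$("$2" "$1")"' sh {} "$work/dosomething" \
  > "$work/map.txt"

# Replace the first field using the lookup table; input order is kept.
result=$(awk -F';' -v OFS=';' '
  NR == FNR { map[$1] = $2; next }   # first file: load code -> result
  { $1 = map[$1]; print }            # second file: rewrite first field
' "$work/map.txt" "$work/file1.txt")
printf '%s\n' "$result"
rm -rf "$work"
```

The lines in map.txt may arrive in any order, which does not matter because they are only used as a lookup table.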

So give us some details on those codes and dosomething's activity.

Feherke.
http://feherke.github.com/

RE: using an external script in awk

(OP)
Hi Feherke and thank you so much for your help!

The first field (the first 4 digits) is not unique.
"dosomething" is a binary; it takes this number and calculates a new one. The new number always depends only on the input. That means, for example, "dosomething 4567" always gives 9878 as output.

Is this what you needed to know?

RE: using an external script in awk

Hi

The simplest version :

CODE --> (Ba|K)Sh

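cache='/tmp/dosomething.cache'
mkdir -p "$cache"   # the cache is a directory holding one file per code

while IFS=';' read -r begin end; do
  # run dosomething only the first time a code is seen
  [[ -f "$cache/$begin" ]] || dosomething "$begin" > "$cache/$begin"
  echo "$(< "$cache/$begin" );$end"
done < file1.txt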
This creates a separate file for each code. Fast to write, fast to read, maybe slow to search, but that probably depends on the filesystem used too.

Regarding that search slowness, I would just start the script, wait until there are a few thousand files in the cache directory, then do a [[ -f '/tmp/dosomething.cache/4567' ]] ( or any other code ) from the command line and see whether it takes whole seconds. If it does, tell us. Then we will look for other storage tricks ( for example separate subdirectories based on the first character ) or alternatives ( for example an SQLite database ).

One thing to note :

Quote (man bash)

BUGS
It's too big and too slow.

If you have Ksh, use that instead. ( On Linux you will probably find the public domain ( pdksh ) or MirOS ( mksh ) implementation. They are also faster. )

If you have Dash, use that instead. But Dash has only what POSIX specifies, so the above code will need a minor rewrite.
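
For Dash, the rewrite could look like the sketch below : [[ ]] becomes [ ], and $(< file) becomes $(cat file). Here dosomething is a stand-in shell function so that the example runs alone, and the cache directory and sample lines are made up :

```shell
#!/bin/sh
# POSIX sh variant of the file cache loop (sketch, should run in Dash).
cache=$(mktemp -d)        # hypothetical cache directory
input=$(mktemp)           # hypothetical sample input file

# Stand-in for the real dosomething (assumption: same input, same output).
dosomething() { echo "$(( $1 * 2 ))"; }

printf '%s\n' '1000;x;1' '2000;y;2' '1000;z;3' > "$input"

result=$(
  while IFS=';' read -r begin end; do
    # [ ] instead of [[ ]]: run dosomething only for codes not cached yet.
    [ -f "$cache/$begin" ] || dosomething "$begin" > "$cache/$begin"
    # $(cat file) instead of the Bash/Ksh-only $(< file).
    echo "$(cat "$cache/$begin");$end"
  done < "$input"
)
printf '%s\n' "$result"

cached=$(ls "$cache" | wc -l)   # number of distinct codes actually computed
rm -rf "$cache" "$input"
```

With three input lines but only two distinct codes, the cache ends up holding two files, so dosomething runs twice instead of three times.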

Feherke.
http://feherke.github.com/

RE: using an external script in awk

Hi

Thinking again, my concern was exaggerated. Even if there are thousands of lines, there will be no more than 10000 distinct codes, because a code has only 4 digits. So search speed cannot be an issue.

What is more, storage cannot be an issue either. I mean, once the number of actual dosomething runs is reduced to a minimum, PHV's Awk code should also be fast. ( With one minor glitch : a close() after the getline would avoid running out of available file handles. )
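
PHV's code is not quoted in this thread, so the following is only a sketch of the general Awk technique being discussed, not his actual script : read the command's output with getline, call close() on the command afterwards, and keep the results in an array so that each code is computed only once. The dosomething here is a stand-in script and the sample lines are made up :

```shell
#!/bin/sh
# Sketch: call an external command from Awk once per distinct code.
work=$(mktemp -d)

# Stand-in for the real dosomething binary (assumption: prints one number).
cat > "$work/dosomething" <<'EOF'
#!/bin/sh
echo $(( $1 + 1 ))
EOF
chmod +x "$work/dosomething"
PATH="$work:$PATH"        # so that Awk finds the stand-in by name

printf '%s\n' '4567;aaa;1' '1234;bbb;2' '4567;ccc;3' > "$work/file1.txt"

result=$(awk -F';' -v OFS=';' '
  !($1 in cache) {
    # Assumes the first field holds only digits; it is pasted into a
    # shell command line as-is.
    cmd = "dosomething " $1
    if ((cmd | getline out) <= 0) out = ""   # empty result on failure
    close(cmd)             # release the pipe so handles do not pile up
    cache[$1] = out
  }
  { $1 = cache[$1]; print }
' "$work/file1.txt")
printf '%s\n' "$result"
rm -rf "$work"
```

Without the close(), every distinct command string would keep its own pipe open until Awk exits.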

Feherke.
http://feherke.github.com/

RE: using an external script in awk

(OP)
@PHV
As output I get my input file without the first "column" :)

RE: using an external script in awk

(OP)
Oooh sorry... I should refresh the site before posting :)

RE: using an external script in awk

(OP)
Hi PHV,
using your code, I stopped the script after a few minutes and opened the output file. I see the whole line, but without the first "column". That is the position where the 4 digits should be...

RE: using an external script in awk

(OP)
Hi feherke,

I ran your 2nd version for a few seconds and then stopped it. Should I now type only this on the command line:

[[ -f '/tmp/dosomething.cache/4567' ]]

???

RE: using an external script in awk

Hi

Quote (Theo)

Should I now type only this on the command line:

[[ -f '/tmp/dosomething.cache/4567' ]]
Yes. You will see no output, only the exit code will be set. ( echo $? to see the exit code of the previous command. But that is irrelevant now. ) The key point was to see whether a simple check for the file is affected by the huge number of filesystem entries in that directory.
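
A small illustration of that exit code, as a sketch only : it uses a throwaway directory instead of the real cache, and the portable [ -f ... ] test, which behaves the same as [[ -f ... ]] here. The file names are made up :

```shell
#!/bin/sh
dir=$(mktemp -d)          # throwaway stand-in for /tmp/dosomething.cache
touch "$dir/4567"

# The test prints nothing; only its exit code tells the result.
found=0;   [ -f "$dir/4567" ] || found=$?     # file exists
missing=0; [ -f "$dir/0000" ] || missing=$?   # file does not exist
echo "found=$found missing=$missing"
rm -rf "$dir"
```

An exit code of 0 means the file is there, a non-zero one means it is not.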

But as I mentioned in the next post, given that the cache directory will never have more than 10000 files, my concern was exaggerated.

Feherke.
http://feherke.github.com/

RE: using an external script in awk

(OP)
@PHV WOW, 12 seconds!!!!! And the file was ready!!

A big THANKS to all for your help!!!!!!! Those are the moments when I realise all the things I can NOT do ;)

RE: using an external script in awk

(OP)
PHV, is it possible to explain your code to me a little bit?
I am not sure about it...
awk -F';' means the field separator is ; (so far ok)
But for the rest I can only guess what it "could" mean...

RE: using an external script in awk

What is it in the code that you don't understand ?
You're supposed to at least have read the man page (as suggested 25 Jul 12 5:14 )

RE: using an external script in awk

(OP)
Ok, I was just not so clear about the cmd structure...

But I tried it with other external programs and I see that it works fine with every one of them.

THANKS a lot again!!!!
