how do I append a string to the output file name? many thanks!

(OP)
Hi guys,

I'm hoping someone here has good Unix shell programming skills. I've written code, using AWK and some shell scripting, that parses and processes hundreds of input files one at a time; each file can be as large as a gigabyte. It runs very fast for the amount of data it goes through. At the end of processing, the code produces an output file in the same directory as its input file, which is standard enough, but the problem is that these hundreds of output files are strewn across hundreds of directories.

Based on the filtering criteria, many of these files will be empty and others will have data, but I don't know which is which, and I can't really open up every directory to see whether its outputFile has content.

So what I want to do is to send the output file from each run to a common directory. Trouble is the output files all have the exact same name, so they would just overwrite each other. So what I'd like to do is to append a unique string to each of the outputFiles, then they can all be sent to a common directory, and I can easily see which ones have data, and which ones don't. The unique string that I would like to use is the immediate directory that the input file is in. So by example, here's what I mean:

Here's what three of the input directory structures and file might look like, but there's really hundreds of them:
/home/tabitha/my_data/S-T-3-001-F_2012_08_16/inputFile.txt
/home/tabitha/my_data/W-B-7-011-3_2012_08_15/inputFile.txt
/home/tabitha/my_data/BA-Z-Y-011-A_081512/inputFile.txt

Here's what the current output looks like, note that they always have the same outputFile.txt name:
/home/tabitha/my_data/S-T-3-001-F_2012_08_16/outputFile.txt
/home/tabitha/my_data/W-B-7-011-3_2012_08_15/outputFile.txt
/home/tabitha/my_data/BA-Z-Y-011-A_081512/outputFile.txt

Here's what I need the output to look like:
/home/tabitha/my_data/Common_Directory/S-T-3-001-F_2012_08_16_outputFile.txt
/home/tabitha/my_data/Common_Directory/W-B-7-011-3_2012_08_15_outputFile.txt
/home/tabitha/my_data/Common_Directory/BA-Z-Y-011-A_081512_outputFile.txt

So the name of the input file's directory precedes the outputFile name; hopefully this is clear.

I tried different combinations of getline, cat, and find, but I think I keep getting stuck because I don't know how to capture that last directory name in a variable that I could then append in the print statement.

what do you think? many thanks for whoever is able to help me :)

RE: how do I append a string to the output file name? many thanks!

It's not straightforward to answer without seeing the code that it needs to fit into, but something along these lines could work:

CODE

inputfile=/home/tabitha/my_data/S-T-3-001-F_2012_08_16/inputFile.txt
# Fold the file name into the directory name, then insert the common directory.
outputfile=$(echo "$inputfile" | sed 's#/inputFile#_outputFile#;s#my_data#my_data/Common_Directory#')
echo "inputfile is $inputfile"
echo "outputfile is $outputfile"

CODE --> output

inputfile is /home/tabitha/my_data/S-T-3-001-F_2012_08_16/inputFile.txt
outputfile is /home/tabitha/my_data/Common_Directory/S-T-3-001-F_2012_08_16_outputFile.txt 
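
The same result can be reached without sed by taking the path apart with dirname and basename. This is only a sketch under the thread's assumptions: the function name build_output_path is hypothetical, the common directory is passed in as an argument, and the output name is always the parent directory name followed by _outputFile.txt.

```shell
#!/bin/sh
# Build the common-directory output path for one input file.
# build_output_path is a hypothetical helper, not part of the original script.
build_output_path() {
    inputfile=$1
    common_dir=$2
    # dirname drops the file name; basename then keeps only the last directory.
    parent=$(basename "$(dirname "$inputfile")")
    printf '%s/%s_outputFile.txt\n' "$common_dir" "$parent"
}

build_output_path /home/tabitha/my_data/S-T-3-001-F_2012_08_16/inputFile.txt \
    /home/tabitha/my_data/Common_Directory
```

Unlike the sed version, this does not depend on the string my_data appearing in the path, only on the input file sitting directly inside the directory whose name should be used.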



Annihilannic
tgmlify - code syntax highlighting for your tek-tips posts

RE: how do I append a string to the output file name? many thanks!

(OP)
Thanks Annihilannic!

Thanks so much, it works great!

Except I need the script to loop over a text file that contains several hundred paths to inputFile.txt files, like:

/home/tabitha/my_data/S-T-3-001-F_2012_08_16/inputFile.txt
/home/tabitha/my_data/W-B-7-011-3_2012_08_15/inputFile.txt
/home/tabitha/my_data/BA-Z-Y-011-A_081512/inputFile.txt
.
.
.

that's only three, I need to use the script on several hundred. The several hundred directory paths are listed out in a text file that I create using:

find /home/tabitha/my_data/ -name inputFile.txt > lists_of_paths_and_names_to_the_inputFiles.txt

Do you know how to read in the listing text file, assign each line to the inputfile variable of your script, and loop over every line?

thanks so much for helping me!

RE: how do I append a string to the output file name? many thanks!

You said "I've written a code that parses/processes hundreds of input files one at a time", so I assumed you had already done that part.

Something like this?

CODE

# Read one path per line; -r keeps any backslashes in the path literal.
while IFS= read -r inputfile
do
    outputfile=$(echo "$inputfile" | sed 's#/inputFile#_outputFile#;s#my_data#my_data/Common_Directory#')
    echo "inputfile is $inputfile"
    echo "outputfile is $outputfile"
done <lists_of_paths_and_names_to_the_inputFiles.txt

Annihilannic

RE: how do I append a string to the output file name? many thanks!

(OP)
Uh oh, something's not right???

when I echo $inputfile I get back what I expect:

/home/tabitha/my_data/S-T-3-001-F_2012_08_16/inputFile.txt

/home/tabitha/my_data/W-B-7-011-3_2012_08_15/inputFile.txt

/home/tabitha/my_data/BA-Z-Y-011-A_081512/inputFile.txt



when I echo $outputfile I get back a list of the files with their new names:

/home/tabitha/my_data/Common_Directory/S-T-3-001-F_2012_08_16_outputFile.txt

/home/tabitha/my_data/Common_Directory/W-B-7-011-3_2012_08_15_outputFile.txt

/home/tabitha/my_data/Common_Directory/BA-Z-Y-011-A_081512_outputFile.txt



but when I go to the Common_Directory, there are no files in there??? Maybe I didn't say something right, sorry, I'm really new at this, but I need the output files with the new names to be in the Common_Directory. I tried adding a mv command in the do loop but couldn't get that to work :(

RE: how do I append a string to the output file name? many thanks!

It would help if you posted your actual code so we can see where this stuff needs to fit. Presumably you just need to send the output of your processing to the output file, using your_processing_code_here >$outputfile, but as it is I can only guess.
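
As a rough sketch of what that could look like, the loop below writes each result straight into the common directory instead of only echoing the name. The names process_one and process_list are hypothetical, and process_one is just a stand-in (it keeps lines containing 8899) for whatever the real processing is.

```shell
#!/bin/sh
# Stand-in for the real processing step (hypothetical): keep lines
# containing 8899 so that the sketch can run on its own.
process_one() {
    awk '/8899/' "$1"
}

# process_list LIST COMMON_DIR
# Read one input path per line from LIST and write each result to
# COMMON_DIR/<parent-directory>_outputFile.txt.
process_list() {
    list=$1
    common_dir=$2
    mkdir -p "$common_dir"    # the redirect below fails if this is missing
    while IFS= read -r inputfile
    do
        parent=$(basename "$(dirname "$inputfile")")
        process_one "$inputfile" > "$common_dir/${parent}_outputFile.txt"
    done < "$list"
}
```

It would be called as process_list lists_of_paths_and_names_to_the_inputFiles.txt /home/tabitha/my_data/Common_Directory. Afterwards, find /home/tabitha/my_data/Common_Directory -type f -size +0 should list only the output files that actually contain data.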

Annihilannic

RE: how do I append a string to the output file name? many thanks!

(OP)
so there are a couple of steps that I do, first I run a find command:

find /svr_ardvark/home/tabitha/my_data/08042012/ -name MRAC.txt > list_of_paths_to_MRAC_files.txt

and this gives me:

/svr_ardvark/home/tabitha/my_data/08042012/S-T-3-001-F_2012/MRAC.txt
/svr_ardvark/home/tabitha/my_data/08042012/W-B-7-011-3_2012/MRAC.txt
/svr_ardvark/home/tabitha/my_data/08042012/BA-Z-Y-011-A_081512/MRAC.txt

then I need a tool (the one you've been helping me with and I greatly appreciate) that loops over the results of the find command and creates a text file that is a list that will be used for batch processing. Each entry in the list has four parts:

1) calling an awk script
2) the current position of the MRAC files
3) renaming the MRAC files with the directory name preceding the MRAC.txt file
4) redirection of the processed MRAC files into a common directory.

this output file is a batch_processing_list.txt file and should look like this:

awk -f parsing_tool.awk /svr_ardvark/home/tabitha/my_data/08042012/S-T-3-001-F_2012/MRAC.txt > /svr_ardvark/home/tabitha/my_data/08042012/common_directory/S-T-3-001-F_2012_MRAC.txt

awk -f parsing_tool.awk /svr_ardvark/home/tabitha/my_data/08042012/W-B-7-011-3_2012/MRAC.txt > /svr_ardvark/home/tabitha/my_data/08042012/common_directory/W-B-7-011-3_2012_MRAC.txt

awk -f parsing_tool.awk /svr_ardvark/home/tabitha/my_data/08042012/BA-Z-Y-011-A_081512/MRAC.txt > /svr_ardvark/home/tabitha/my_data/08042012/common_directory/BA-Z-Y-011-A_081512_MRAC.txt

the parsing_tool.awk that’s being called by batch_processing_list.txt is:

BEGIN {
    FS = " "
}
{
    if ($11 == 8899)
        printf("%s %d %d %s %d %d\n", $1, $4, $5, $11, $17, $21);
}
END {}

It all seems to work if I create the batch_processing_list.txt by hand with a text editor and Excel, which is OK if I’m just trying to process under a hundred files, but to do this on several hundred files I will need more automated scripts.
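
A list of that shape could probably be generated rather than typed. The sketch below is only an illustration: make_batch_list is a hypothetical name, the awk script name and common directory are passed in as arguments, and it assumes none of the paths contain spaces, since the generated lines are meant to be run by a shell later.

```shell
#!/bin/sh
# make_batch_list AWK_SCRIPT COMMON_DIR
# Read input paths on standard input and print one command line per path,
# in the same form as the hand-built batch list.
make_batch_list() {
    awk_script=$1
    common_dir=$2
    while IFS= read -r inputfile
    do
        parent=$(basename "$(dirname "$inputfile")")   # e.g. S-T-3-001-F_2012
        name=$(basename "$inputfile")                  # e.g. MRAC.txt
        printf 'awk -f %s %s > %s/%s_%s\n' \
            "$awk_script" "$inputfile" "$common_dir" "$parent" "$name"
    done
}
```

For example: make_batch_list parsing_tool.awk /svr_ardvark/home/tabitha/my_data/08042012/common_directory < list_of_paths_to_MRAC_files.txt > batch_processing_list.txt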

Thanks again, so much for all your help!!!

RE: how do I append a string to the output file name? many thanks!

You can pipe the results of the find command directly into the loop that does the processing (unless you specifically need the intermediate file for some other purpose) as follows.

CODE

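# Create the shared output directory once (assumed to sit next to the data directories).
mkdir -p /svr_ardvark/home/tabitha/my_data/08042012/common_directory
find /svr_ardvark/home/tabitha/my_data/08042012/ -name MRAC.txt | while IFS= read -r inputfile
do
    # .../<dir>/MRAC.txt becomes .../common_directory/<dir>_MRAC.txt
    outputfile=$(echo "$inputfile" | sed 's#/\([^/]*\)/MRAC\.txt$#/common_directory/\1_MRAC.txt#')
    awk -F " " '$11==8899 { printf("%s %d %d %s %d %d\n", $1, $4, $5, $11, $17, $21); }' "$inputfile" >"$outputfile"
done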

I have also abbreviated the awk script somewhat and put it on the command line rather than in a separate parsing_tool.awk file, since it is quite short. I specified the field separator on the command line (probably unnecessary, because awk's default separator is white space) and removed the BEGIN and END clauses, which are both now empty. The if statement is not required either, because every awk rule has the form pattern { action } and the action runs only for lines that match the pattern.
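
To illustrate that last point, here is a small sketch on two made-up records (not real MRAC data) showing that an if inside the action and a pattern in front of the action select the same lines.

```shell
#!/bin/sh
# Two made-up records; only the first has 8899 in field 11.
sample='a b c 4 5 f g h i j 8899 l m n o p 17 r s t 21
a b c 4 5 f g h i j 1234 l m n o p 17 r s t 21'

# Explicit if inside an action that runs for every line.
with_if=$(printf '%s\n' "$sample" | awk '{ if ($11 == 8899) print $1, $4, $11 }')

# The same test written as the rule's pattern.
with_pattern=$(printf '%s\n' "$sample" | awk '$11 == 8899 { print $1, $4, $11 }')

echo "$with_if"
echo "$with_pattern"
```

Both commands print a 4 8899 for the first record and nothing for the second.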

Annihilannic

RE: how do I append a string to the output file name? many thanks!

(OP)
It works perfectly! Thanks sooo much for all your help!

Next, I'm going to try to use data from a different file (called L102.txt), which contains two columns of data, as filtering criteria for the MRAC.txt file. In the first column of L102.txt I know which values I want; in the second column (correlated with the first) I don't know the values in advance, and it's these second-column values that I need to use for filtering the MRAC.txt file.

I want to give this a try on my own, but I'll probably need a little help.....
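
One common awk idiom for this kind of two-file lookup is to read the lookup file first and remember the values of interest in an array. The sketch below is only a guess at the shape of the problem: filter_by_lookup is a hypothetical name, it takes a single wanted first-column value, and it assumes the second-column values should be compared against field 11 of the data file, which the thread does not confirm.

```shell
#!/bin/sh
# filter_by_lookup WANTED LOOKUP DATA
#   WANTED: a known first-column value (a single value, for simplicity)
#   LOOKUP: two-column file such as L102.txt
#   DATA:   file to filter, such as MRAC.txt
# Assumption: second-column values are matched against field 11 of DATA.
filter_by_lookup() {
    awk -v wanted="$1" '
        # First file: remember column 2 wherever column 1 is the wanted value.
        NR == FNR { if ($1 == wanted) keep[$2]; next }
        # Second file: print lines whose field 11 was remembered.
        $11 in keep
    ' "$2" "$3"
}
```

NR == FNR is true only while awk is reading the first file named on the command line, which is what lets one script treat the two files differently.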

Tabby
