Accurate AWK array searching
(OP)
Can anybody offer some help getting this AWK to search correctly?

I need to search the "sample.txt" file for each of the 6 entries in the "combinations" file. However, the search must start from every single character, unlike an ordinary text-editor search box, which resumes after the end of each occurrence and so never counts overlapping matches. I need to count every occurrence, overlaps included. For example, inside the string "AAAAA" the combination "AAA" should be found 3 times, not 1 time. See my previous post about this: https://stackoverflow.com/questions/50053094/bash-...

The sample.txt file is:

CODE --> bash

AAAAAHHHAAHH 
The combinations file is:

CODE --> bash

AA  
HH  
AAA  
HHH  
AAH  
HHA 
How do I get the script

CODE --> bash

#!/bin/bash
awk 'NR==FNR {data=$0; next} {printf "%s %d \n",$1,gsub($1,$1,data)}' 'sample.txt' combinations > searchoutput 
to output the desired output:

CODE --> bash

AA 5
HH 3
AAA 3
HHH 1
AAH 2
HHA 1 
instead of what it is currently outputting:

CODE --> bash

AA 3 
HH 2 
AAA 1 
HHH 1 
AAH 2 
HHA 1 
?

As we can see, the script only finds the combinations the way a text editor would. I need it to test for each combination starting at every character, so that overlapping occurrences are counted and the desired output is produced.

How do I get the AWK script to produce the desired output instead? Can't thank you enough.
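
The gap between the two outputs can be reproduced in isolation. The sketch below is only an illustration, and `compare_counts` is a hypothetical helper name, not part of the original script: it counts one pattern in one string twice, once with `gsub` (which resumes scanning after the end of each match) and once with a window tested at every character position. It assumes the pattern contains no regex metacharacters or backslashes.

```shell
# Hypothetical helper: prints "<pattern> <gsub count> <window count>".
compare_counts() {
  awk -v pat="$1" -v data="$2" 'BEGIN {
    copy = data
    # gsub resumes after the end of each match, so matches never overlap.
    by_gsub = gsub(pat, "&", copy)
    # Test a window of length(pat) starting at every character instead.
    by_window = 0
    len = length(pat)
    for (i = 1; i + len - 1 <= length(data); i++)
      if (substr(data, i, len) == pat) by_window++
    print pat, by_gsub, by_window
  }'
}

compare_counts AAA AAAAA          # prints: AAA 1 3
compare_counts AA AAAAAHHHAAHH    # prints: AA 3 5
```

The second call shows the same 3 versus 5 for "AA" that appears in the current and desired outputs above.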

RE: Accurate AWK array searching

Try something like this:

CODE

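# Run:
#   awk -f starrysky1.awk sample.txt combinations.txt

# Applies to both files: strip spaces (e.g. trailing blanks)
{
  gsub(/[ ]+/, "", $0)
}

# First file (sample.txt): remember the text to search
NR==FNR {
  data = $0
  data_len = length(data)
  next
}

# Second file (combinations.txt): one pattern per line
{
  pattern = $0
  pattern_len = length(pattern)
  pattern_count = 0
  # slide a window of pattern_len over data, one character at a time
  for (j=1; j+pattern_len-1 <= data_len; j++) {
    # == compares literally; ~ would treat the pattern as a regex
    if (substr(data, j, pattern_len) == pattern) {
      pattern_count++
    }
  }
  printf("%s\t%d\n", pattern, pattern_count)
}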

Output:

CODE

$ awk -f starrysky1.awk sample.txt combinations.txt
AA      5
HH      3
AAA     3
HHH     1
AAH     2
HHA     1 
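
A possible variation, which is not mikrom's method, is to let awk's `index()` find each occurrence and then restart the search one character after the start of the hit, so overlapping matches are still counted. This is a minimal sketch under a few assumptions: `count_overlaps` is a hypothetical wrapper name, the sample file holds a single line, and the combinations are plain text rather than regular expressions (`index()` always searches literally).

```shell
# Hypothetical wrapper: count_overlaps SAMPLE_FILE COMBINATIONS_FILE
count_overlaps() {
  awk '
    { gsub(/[ \t]+/, "") }           # strip stray blanks from every line
    NR == FNR { data = $0; next }    # first file: the text to search
    $0 != "" {                       # second file: skip empty lines
      n = 0
      rest = data
      # index() returns the 1-based position of the first literal match
      while ((pos = index(rest, $0)) > 0) {
        n++
        rest = substr(rest, pos + 1) # resume one character after the hit
      }
      printf "%s\t%d\n", $0, n
    }
  ' "$1" "$2"
}

# Usage, with the files from the question:
#   count_overlaps sample.txt combinations
```

Because each restart skips straight to the next hit, this does fewer comparisons than testing every position when matches are sparse. The guard on empty lines matters: some awk implementations return 1 for `index(s, "")`, which would keep the loop from terminating.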

RE: Accurate AWK array searching

(OP)
Thanks !!!!! Got it working thanks beyond reality

RE: Accurate AWK array searching

Hi

Here on Tek-Tips we thank members for their help by giving stars. Please click the "Great post!" link at the bottom of mikrom's post (then confirm in the pop-up window). That way you both show your gratitude and mark this thread as helpful.

Feherke.
feherke.github.io
