Print key-value pairs


(OP)
Hello techies

I am trying to print specific key-value pairs:
the date and time, and the values for HOST and USER.

This is the sample input text.

 <txt>20-JUL-2015 07:58:22 * (CONNECT_DATA=(SID=ORADB3)(CID=(PROGRAM=perl)(HOST=winserver5)(USER=oem))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.10.10.10)(PORT=12345)) * establish * ORADB3 * 0
 <txt>20-JUL-2015 07:58:38 * (CONNECT_DATA=(SID=ORADB4)(CID=(PROGRAM=)(HOST=__jdbc__)(USER=))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.10.10.10)(PORT=12345)) * establish * ORADB4 * 0
 <txt>20-JUL-2015 08:01:09 * (CONNECT_DATA=(CID=(PROGRAM=JDBC Thin Client)(HOST=__jdbc__)(USER=ZAHIER))(SERVICE_NAME=ORADB6)(CID=(PROGRAM=JDBC Thin Client)(HOST=__jdbc__)(USER=ZAHIER))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.10.10.10)(PORT=12345)) * establish * ORADB6 * 0

I am working on Solaris 10 and tried using grep, sed and tr (see below) to strip out the text that I do NOT want, but this is clearly inefficient.

CODE

grep HOST log.xml | sed 's/\*/ /g' | sed 's/(/ /g' | sed 's/)/ /g' | sed 's/<txt>/ /g' | sed 's/CONNECT_DATA= SERVICE_NAME=/ /g' | sed 's/CONNECT_DATA= SID=/ /g'  | sed 's/CONNECT_DATA=/ /g' | sed 's/CID=/ /g' | sed 's/ADDRESS=/ /g' | sed 's/PROTOCOL=tcp/ /g' | sed 's/PORT=.*/ /g' | tr -s ' ' 

How do I extract just the fields that I need?

RE: Print key-value pairs

Quote (Zahier)


How do I extract just the fields that I need?
You can use the function match() like this:
zahier.awk

CODE

{
  # chomp every line
  chomp()
  # match TIME (the third argument to match() is a gawk extension)
  match($0, /([0-9]+:[0-9]+:[0-9]+)/, time)
  printf "TIME = %s", time[1]
  #
  str = $0
  # match HOST-USER pairs
  while (match(str, /\(HOST=([^)]*)\)\(USER=([^)]*)\)/, hu) > 0) {
    printf ", HOST='%s', USER = '%s'", hu[1], hu[2]
    str = substr(str, RSTART + RLENGTH)
  }
  # match HOST-PORT pairs
  while (match(str, /\(HOST=([^)]*)\)\(PORT=([^)]*)\)/, hp) > 0) {  
    printf ", HOST='%s', PORT = '%s'", hp[1], hp[2]
    str = substr(str, RSTART + RLENGTH)
  }
  printf "\n"
}

#
function chomp() {
  # strip out the carriage return or line feed at the end of current line
  # the function modifies global variable $0 (current line)
  sub(/\r$/, "", $0)
  sub(/\n$/, "", $0)
} 

When I run the script on the given file I get this output:

CODE

$ gawk -f zahier.awk zahier.txt

TIME = 07:58:22, HOST='winserver5', USER = 'oem', HOST='10.10.10.10', PORT = '12345'
TIME = 07:58:38, HOST='__jdbc__', USER = '', HOST='10.10.10.10', PORT = '12345'
TIME = 08:01:09, HOST='__jdbc__', USER = 'ZAHIER', HOST='__jdbc__', USER = 'ZAHIER', HOST='10.10.10.10', PORT = '12345' 
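
The original question also asked for the date, not only the time. A minimal sketch of how the whole timestamp could be picked up in one go is shown here. It uses only the two-argument match() with RSTART and RLENGTH, so it should not depend on gawk. The function name stamp_of and the shortened sample line are made up for illustration, and the pattern assumes the DD-MON-YYYY HH:MM:SS layout seen in the sample input.

```shell
# stamp_of is a hypothetical helper name, not part of the thread's script.
stamp_of() {
  awk '{
    # DD-MON-YYYY HH:MM:SS, as in the sample lines
    if (match($0, /[0-9]+-[A-Z]+-[0-9]+ [0-9]+:[0-9]+:[0-9]+/))
      print "DATETIME=" substr($0, RSTART, RLENGTH)
  }'
}

# Shortened sample line, for illustration only
printf '%s\n' ' <txt>20-JUL-2015 07:58:22 * (CONNECT_DATA=(SID=ORADB3))' | stamp_of
```

Lines without a timestamp print nothing, which may or may not be what you want for a real log.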

RE: Print key-value pairs

(OP)
Hello mikrom

With awk on Solaris I get...
awk: syntax error near line 3
awk: illegal statement near line 3

and if I use /usr/xpg4/bin/awk I get...
line 5 (NR=1): wrong number of arguments to function ""

Unfortunately I do not have gawk. Solaris has awk and nawk, which do not seem to support the match function as used here.
BUT this is still a great script for when I do port to Linux!!



RE: Print key-value pairs

I have tried it with GNU Awk 3.1.7 which comes with MinGW/MSYS on Windows.
... maybe it would be possible to install gawk on Solaris too...
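
Before installing anything, it may help to probe which of the awks already on the machine handle match(), and in which form. The following is only a rough sketch: probe_match is a made-up name, the list of binaries is a guess based on the ones mentioned in this thread, and it assumes a POSIX shell that provides command -v (the old Solaris /bin/sh may not).

```shell
# probe_match is a hypothetical helper; $1 is the awk binary to test.
probe_match() {
  awk_bin=$1
  command -v "$awk_bin" >/dev/null 2>&1 || { echo "$awk_bin: not installed"; return 0; }
  two=no
  three=no
  # exit status 0 means the program parsed and match() found the digit
  echo 'a1b' | "$awk_bin" '{ exit !match($0, /[0-9]/) }' 2>/dev/null && two=yes
  echo 'a1b' | "$awk_bin" '{ exit !match($0, /([0-9])/, m) }' 2>/dev/null && three=yes
  echo "$awk_bin: match(str, re)=$two match(str, re, arr)=$three"
}

for a in awk nawk /usr/xpg4/bin/awk gawk; do
  probe_match "$a"
done
```

An awk that reports "no" for the three-argument form can still run a script that sticks to the two-argument form plus RSTART and RLENGTH.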

RE: Print key-value pairs

It seems there are big differences between awk versions.
I tried my script on IBM iSeries, where I have two awk versions, and got these errors:

1. From the native awk (probably an old version, since it doesn't have the --version switch)

CODE

> awk -f zahier.awk zahier.txt                                                   
   Syntax Error The source line is 9.                                            
   The error context is                                                          
                    match($0, >>>  /([0-9]+:[0-9]+:[0-9]+)/, <<<                 
   awk: 0602-502 The statement cannot be correctly parsed. The source line is 9. 
   Syntax Error The source line is 14.                                           
          awk: 0602-543 There are 2 extra ) characters. 

2. from GNU Awk 3.0.3

CODE

> gawk -f zahier.awk zahier.txt                              
  gawk: zahier.awk:9: fatal: match() cannot have 3 arguments 

... then I redesigned the script so that it calls match() with only 2 arguments (not 3).

Now the script works on my IBM iSeries too:

CODE

> awk -f zahier.awk zahier.txt                                                                       
  TIME=07:58:22, HOST=winserver5, USER=oem, HOST=10.10.10.10, PORT=12345                             
  TIME=07:58:38, HOST=__jdbc__, USER=, HOST=10.10.10.10, PORT=12345                                  
  TIME=08:01:09, HOST=__jdbc__, USER=ZAHIER, HOST=__jdbc__, USER=ZAHIER, HOST=10.10.10.10, PORT=12345 

Maybe your awk problem is similar.
Here is the modified script - you can try it:

CODE

{
  # chomp every line
  chomp()
  str = $0
  # match TIME
  if (match(str, /[0-9]+:[0-9]+:[0-9]+/)) {
    line = "TIME=" substr(str, RSTART, RLENGTH)
    str = substr(str, RSTART + RLENGTH)
  }
  # match HOST-USER pairs
  while (match(str, /\(HOST=[^)]*\)\(USER=[^)]*\)/) > 0) {
    hu = substr(str, RSTART, RLENGTH)
    line = line hu
    str = substr(str, RSTART + RLENGTH)
  }
  # match HOST-PORT pairs
  while (match(str, /\(HOST=([^)]*)\)\(PORT=([^)]*)\)/) > 0) {  
    hp = substr(str, RSTART, RLENGTH)
    line = line hp
    str = substr(str, RSTART + RLENGTH)
  }
  # replace parentheses with commas
  gsub(/\(/, ", ", line)
  gsub(/\)/, "", line)
  # print resulting line
  printf "%s\n", line
}


#
function chomp() {
  # strip out the carriage return or line feed at the end of current line
  # the function modifies global variable $0 (current line)
  sub(/\r$/, "", $0)
  sub(/\n$/, "", $0)
} 
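
If you want only the value of a key, without the KEY= part, the same two-argument match() idea can stand in for the gawk capture groups. This is just a sketch under assumptions: value_of and extract_first are hypothetical names, the function returns only the first occurrence of a key on the line, and it needs an awk that supports user-defined functions (nawk or /usr/xpg4/bin/awk rather than the old Solaris awk).

```shell
# Sample line taken from the thread's input, without the <txt> prefix.
line='20-JUL-2015 07:58:22 * (CONNECT_DATA=(SID=ORADB3)(CID=(PROGRAM=perl)(HOST=winserver5)(USER=oem))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.10.10.10)(PORT=12345)) * establish * ORADB3 * 0'

# extract_first and value_of are hypothetical names used for illustration.
extract_first() {
  awk '
  # Return the value of the first "(KEY=value)" group in str, or "" if absent.
  # Uses only the two-argument match() plus RSTART and RLENGTH.
  function value_of(str, key,    re) {
    re = "\\(" key "=[^)]*\\)"
    if (match(str, re) == 0) return ""
    # skip the leading "(" + key + "=" and drop the trailing ")"
    return substr(str, RSTART + length(key) + 2, RLENGTH - length(key) - 3)
  }
  { printf "HOST=%s USER=%s\n", value_of($0, "HOST"), value_of($0, "USER") }
  '
}

printf '%s\n' "$line" | extract_first
```

For the sample line this should print HOST=winserver5 USER=oem. To collect every HOST/USER pair on a line, the while loop from the script above is still needed.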

RE: Print key-value pairs

Hi

Quote (mikrom)

It seems there are big differences between awk versions
They warned you.

Quote (man gawk)

GNU EXTENSIONS

(...)

       The following features of gawk are not available in POSIX awk.

(...)

       · The optional third argument to the match() function. 

Feherke.
feherke.ga

RE: Print key-value pairs

(OP)
mikrom
Great! It works with...

/usr/xpg4/bin/awk

So feherke, in other words you're saying these are variants of the same command that comply with different standards.

Thanks again, I really appreciate the help.

RE: Print key-value pairs

Quote (feherke)


They warned you.

Quote (man gawk)


GNU EXTENSIONS
(...)
The following features of gawk are not available in POSIX awk.
(...)
· The optional third argument to the match() function.
feherke, thanks for the explanation.
Now I know that in gawk 3.0.3 the function match() accepted only 2 arguments, while in version 3.1.7 the third argument is available.

Quote (Zahier)


Great! It works with...
OK, but while I was working on that example, I kept wanting to ask you: don't you have Perl on your Solaris machine?

RE: Print key-value pairs

(OP)
I do have Perl, but I have not dabbled in Perl scripting... yet.

perl, v5.8.4 built for sun4-solaris-64int

RE: Print key-value pairs

IMO ver. 5.8.x should be OK.
Maybe next time it would be worth trying.
