Print out first 3 strings in a specified column

(OP)
I'm separating the columns on runs of two or more whitespace characters:

awk -F '[[:space:]][[:space:]]+' '{ print $1, $7"/"$6, $12 }'

Each column can hold several items separated by single spaces.

I'm not sure how to print only the first 3 items in $12.

Any ideas?

Thanks....

Joe Despres

RE: Print out first 3 strings in a specified column

Hi

So the fields are separated by two or more whitespace characters, and the sub-fields within a field are separated by single whitespace characters.

The big question here is whether you need to preserve the original whitespace. I mean, to keep a space in the output where the input had a space and a tab where it had a tab.

If the whitespace does not need to be preserved as it was, it is simple :

CODE --> any Awk

awk -F '[[:space:]][[:space:]]+' '{ split($12, m, /[[:space:]]/); print $1, $7"/"$6, m[1], m[2], m[3] }' 

If the whitespace needs to be preserved, it gets a bit more complicated, but it is still reasonably simple if it only has to work in GNU Awk :

CODE --> GNU Awk

awk -F '\\s\\s+' '{ print $1, $7"/"$6, gensub(/(\S+\s\S+\s\S+).*/, "\\1", "", $12) }'

# or

awk -F '\\s\\s+' '{ match($12, /\S+\s\S+\s\S+/, m); print $1, $7"/"$6, m[0] }' 

If the whitespace needs to be preserved and the code has to be portable ( or at least work with something other than GNU Awk ) :

CODE --> any Awk

awk -F '[[:space:]][[:space:]]+' '{ match($12, /[^[:space:]]+[[:space:]][^[:space:]]+[[:space:]][^[:space:]]+/); print $1, $7"/"$6, substr($12, RSTART, RLENGTH) }' 

( As you can see, in GNU Awk you can use \s for [[:space:]] and \S for [^[:space:]]. That also works in original-awk ( available on Ubuntu, not sure about its origin ), but not in Mawk. There the closest alternative would be [ \t] for [[:space:]] and [^ \t] for [^[:space:]]. )
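
For anyone who wants to try the portable variant without the real data, here is a minimal sketch wrapped in a shell function. The function name first3 and the sample line are made up for illustration, and the column number is passed as an argument instead of being fixed at 12. It sticks to [ \t] bracket expressions, so it should behave the same in GNU Awk, Mawk and other awks.

```shell
# Hypothetical helper: print the first three single-space-separated items
# of one column. Columns are separated by two or more spaces or tabs.
# Usage: first3 COLUMN_NUMBER < input
first3() {
    awk -F '[ \t][ \t]+' -v col="$1" '{
        # match() sets RSTART and RLENGTH in every POSIX awk
        if (match($col, /[^ \t]+[ \t][^ \t]+[ \t][^ \t]+/))
            print substr($col, RSTART, RLENGTH)
        else
            print $col    # fewer than three items: print the column unchanged
    }'
}

# Made-up sample line with three columns
printf 'id1  Windows File System Extra  tail\n' | first3 2
# prints: Windows File System
```

The else branch is a design choice of this sketch: a column with fewer than three items is printed as it is instead of producing an empty line.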


Feherke.
feherke.ga

RE: Print out first 3 strings in a specified column

(OP)
didn't work....

I do like the ::---> '\\s\\s+'

Thanks!

Joe Despres

RE: Print out first 3 strings in a specified column

Hi

Quote (Joe)

didn't work....

Sorry to hear that. Could you post some sample input and the expected output ? And please specify which Awk implementation / version you are using.

Feherke.
feherke.ga

RE: Print out first 3 strings in a specified column

(OP)
awk -W version
GNU Awk 3.1.8

Using awk on an Avamar system :)

#### Here's the raw output from the mccli command ::--->

CODE -->

9145091880251509 Completed w/Exception(s) 10010      2015-12-23 20:00 EST 00h:59m:07s 2015-12-23 20:59 EST Scheduled Backup   6.2 TB         0.1%      yyy.com /xxxx Windows Server 2008 R2 Enterprise Server Edition (No Service Pack) 64-bit 7.0.102-47     2015-12-23 20:00 EST 2015-12-24 08:00 EST 00h:00m:36s  /xxxx/Windows 2008                Windows File System Retention_xxxx   D         xxxx Windows /xxxx/Windows_2008                       xxxx Windows-Windows 2008-1450918802270                        Avamar N/A
9145091880251709 Completed w/Exception(s) 10010      2015-12-23 20:59 EST 00h:05m:19s 2015-12-23 21:05 EST Scheduled Backup   42.8 GB        0.8%      yyyy.com /xxxx  Windows Server 2008 R2 Enterprise Server Edition (No Service Pack) 64-bit 7.0.102-47     2015-12-23 20:00 EST 2015-12-24 08:00 EST 00h:59m:45s  /xxxx/Windows 2008                Windows VSS         Retention_xxxx   D         xxxx Windows /xxxx/Windows_2008                       xxxx Windows-Windows 2008-1450918802270                        Avamar N/A
9145083240268209 Completed w/Exception(s) 10010      2015-12-22 22:11 EST 00h:48m:34s 2015-12-22 23:00 EST Scheduled Backup   6.2 TB         0.1%      yyyy.com /xxxx  Windows Server 2008 R2 Enterprise Server Edition (No Service Pack) 64-bit 7.0.102-47     2015-12-22 20:00 EST 2015-12-23 08:00 EST 02h:11m:46s  /xxxx/Windows 2008                Windows File System Retention_xxxx   D         xxxx Windows /xxxx/Windows_2008                       xxxx Windows-Windows 2008-1450832402416                        Avamar N/A 

#### Output desired ::--->
9145091880251509 Completed w/Exception(s) /xxxx/yyy.com Windows File System
9145091880251709 Completed w/Exception(s) /xxxx/yyy.com Windows VSS
9145083240268209 Completed w/Exception(s) /xxxx/yyy.com Windows File System

Basically I want to check for exceptions in yesterday's backup results... I will apply the same approach to the failures as well..

Thanks....

Joe Despres

RE: Print out first 3 strings in a specified column

Hi

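Then the field separator theory is not good enough. In the first line only a single space separates the domain from the OS column, while the other lines have two :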

CODE --> fragment

... yyy.com /xxxx Windows Server 2008 ...
... yyyy.com /xxxx  Windows Server 2008 ...
... yyyy.com /xxxx  Windows Server 2008 ... 
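
A quick way to see this kind of problem is to count the fields per line: if the separator theory held, every line would report the same number. This is only a sketch, the function name count_fields is made up, and the two sample lines are shortened from the fragment above.

```shell
# Hypothetical helper: print the number of fields awk finds on each line
# when fields are separated by two or more spaces or tabs.
count_fields() {
    awk -F '[ \t][ \t]+' '{ print NF }'
}

# Shortened lines in the style of the fragment above
printf 'yyy.com /xxxx Windows  next\nyyyy.com /xxxx  Windows  next\n' | count_fields
# prints 2 for the first line and 3 for the second
```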

As you have GNU Awk, I would say we had better use the match() function to collect the needed pieces. ( match()'s 3rd parameter is a GNU extension. )

But with only limited information about the input ( I assume those "xxxx" are placeholders for sensitive data ), the regular expression would get quite long. So I would suggest an off-topic solution : Perl, because its regular expressions support non-greedy quantifiers.

CODE --> Perl

perl -ne 'print"$1 $3/$2 $4\n"if/^(.+?)\s+\d+\s+\d{4}-\d{2}-\d{2}.+?\s(\w+\.\w+)\s+(\/\w+).+\s{2,}\/\w+\/.+?\s{2,}(.+?)\s+Retention/'

Actually the emphasis is on the non-greedy quantifiers, so any tool/language with PCRE would do.

Feherke.
feherke.ga

RE: Print out first 3 strings in a specified column

(OP)
Hey Feherke.....

That didn't work :(

Thanks! You shouldn't work on this any more...

Joe Despres

RE: Print out first 3 strings in a specified column

Hi

Well, it works for the sample input... I suppose the issue is with those "xxxx", which I try to match with \w+. If they contain non-word characters, those will break the match.

Feherke.
feherke.ga

RE: Print out first 3 strings in a specified column

(OP)
Yeah, xxxx is just alphabetic characters

Thanks

Joe Despres

RE: Print out first 3 strings in a specified column

(OP)
I totally forgot! mccli command can output xml!

CODE -->

<Row>
      <ID>9145117800006709</ID>
      <Status>Completed</Status>
      <ErrorCode>0</ErrorCode>
      <StartTime>2015-12-26 20:11 EST</StartTime>
      <Elapsed>00h:07m:05s</Elapsed>
      <EndTime>2015-12-26 20:18 EST</EndTime>
      <Type>Scheduled Backup</Type>
      <ProgressBytes>22.7 GB</ProgressBytes>
      <NewBytes>0.9%</NewBytes>
      <Client>mickey.mouse.com</Client>
      <Domain>/Unrestrictive/Infrastructure</Domain>
      <OS>Windows Server 2008 R2 Enterprise Server Edition Service Pack 1 64-bit</OS>
      <ClientRelease>7.1.101-145</ClientRelease>
      <Sched.StartTime>2015-12-26 20:00 EST</Sched.StartTime>
      <Sched.EndTime>2015-12-27 08:00 EST</Sched.EndTime>
      <ElapsedWait>00h:11m:21s</ElapsedWait>
      <Group>/Infrastructure-ServerFile-S20-RD30</Group>
      <Plug-In>Windows VSS</Plug-In>
      <RetentionPolicy>RD30</RetentionPolicy>
      <Retention>D</Retention>
      <Schedule>S20</Schedule>
      <Dataset>/ServerFile</Dataset>
      <WID>S20-Infrastructure-ServerFile-S20-RD30-1451178000029</WID>
      <Server>Avamar</Server>
      <Container>N/A</Container>
    </Row> 

Each backup generates one set of this...

All I really need is to strip out all the tags and put the data on one line separated by a comma

Joe Despres

RE: Print out first 3 strings in a specified column

Hi

Quote (Joe)

All I really need is to strip out all the tags and put the data on one line separated by a comma

May I suggest another off-topic solution for that ? XMLStarlet :

CODE

xmlstarlet sel -t -m //Row -v ID -o , -v Status -o , -v ErrorCode -o , -v Domain -o / -v Client -o , -v Plug-In -n

( Although I am not sure where the commas come into the picture, as until now the separators were spaces. )
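
Where XMLStarlet is not available, roughly the same selection can be sketched in plain awk. This is not an XML parser: it assumes every element sits on its own line, as in the sample above, and that the values contain no markup. The function name rows_pick is made up for illustration.

```shell
# Hypothetical helper: pick a few named elements out of each <Row> and
# print them as ID,Status,ErrorCode,Domain/Client,Plug-In.
rows_pick() {
    awk '
        /<\/Row>/ {
            # one output line per record, then forget the stored values
            print f["ID"] "," f["Status"] "," f["ErrorCode"] "," f["Domain"] "/" f["Client"] "," f["Plug-In"]
            split("", f)
            next
        }
        match($0, /<[^\/>]+>/) {
            tag  = substr($0, RSTART + 1, RLENGTH - 2)   # element name without < and >
            rest = substr($0, RSTART + RLENGTH)          # text after the opening tag
            sub(/<\/.*$/, "", rest)                      # cut from the closing tag onward
            f[tag] = rest
        }
    '
}
```

It reads the XML on standard input, for example from the mccli command with its --xml option.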

Feherke.
feherke.ga

RE: Print out first 3 strings in a specified column

(OP)
Bummer...... I don't have "xmlstarlet" installed :(

#### This seems to work ::--->

CODE -->

raw-quickc () {
export MCCLI=/usr/local/avamar/bin/mccli
export BIN=/home/admin/bin
echo "ID,Status,ErrorCode,StartTime,Elapsed,EndTime,Type,ProgressBytes,NewBytes,Client,Domain,OS,ClientRelease,Sched.StartTime,Sched.EndTime,ElapsedWait,Group,Plug-In,RetentionPolicy,Retention,Schedule,Dataset,WID,Server,Container"
$MCCLI activity show --completed=true --verbose --xml | sed -n '/<Row/,/<\/Row/p'| sed 's/<\/\?[^>]\+>//g'|awk '{$1=$1}1'|awk -f $BIN/ONE_Line.awk|sed 's/\&\;lt\;//g'
} 

ugly enough to back a buzzard off a gut wagon!

#### ONE_Line.awk ::--->

CODE -->

BEGIN { RS = ""; FS = "\n"; ORS = "" }
{
        x=1
        while ( x<NF ) {
                print $x ","
                x++
        }
        print $NF "\n"
} 
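
The chain of sed and awk calls above can probably be collapsed into a single awk call. The following is only a sketch under a few assumptions: every element sits on its own line as in the sample, no value contains a comma, and the function name rows_to_csv is made up. It does not print the header line and does not touch XML entities.

```shell
# Hypothetical helper (not part of mccli): read the XML on stdin and
# print one comma-separated line per <Row> element.
rows_to_csv() {
    awk '
        /<Row>/   { line = ""; n = 0; in_row = 1; next }        # a record starts
        /<\/Row>/ { if (in_row) print line; in_row = 0; next }  # a record ends
        in_row {
            value = $0
            sub(/^[ \t]*<[^>]+>/, "", value)     # drop indentation and the opening tag
            sub(/<\/[^>]+>[ \t]*$/, "", value)   # drop the closing tag
            line = (n ? line "," : "") value     # join the values with commas
            n++
        }
    '
}
```

Tracking in_row explicitly means that anything outside the Row elements is ignored, which is what the first sed in the pipeline was doing.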

My next goal is to grep out part of a column :)

Thanks....

Joe Despres
