Reading characters from within a line of text

Status
Not open for further replies.

Loon

Programmer
May 24, 2000
100
GB
Hi,

I have to extract a substring from a line of text. I have the line of text in $_ and need to extract a numerical value that follows a constant string in the line (hence I know where it will be).

I have an idea of how to do this with a substr command; however, is there a way of moving the current string matching position to the start of the substring I want?

e.g.

To extract the number 32 from the following:

    Preparing the Exercise Activity Log for CPU on path 32 file

I wouldn't be able to use the normal substr with numeric offsets, as the CPU bit could be something like MEMORY, or another keyword...

Any ideas greatly appreciated.
Cheers
Loon
 
You could use pattern matching...

If the digits you are looking for will be the only digits in the string, you could

    $string =~ /(\d+)/;
    $number = $1;

Which says find the first run of one or more digits, then catch the contents of the parens in $number. Don't wrap it in a greedy .* on either side: in /.*(\d+).*/ the leading .* swallows as much as it can, so for the line above $1 would be just 2, not 32.

If there might be more than one set of digits, that won't work. So...
If 'CPU' will change to MEMORY to DISK to something else, then morph your match pattern and then do the match.

keep the rudder amid ship and beware the odd typo
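One way to "morph" the pattern is to build it from a variable holding the keyword. The following is only a sketch: the helper name number_for_keyword is made up for illustration, and it assumes every line has the same "for KEYWORD on path NUMBER" shape as the CPU example (the MEMORY line in the usage note is an assumed format, not taken from a real log).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the number that follows
# "for <keyword> on path", or undef when the line does not match.
sub number_for_keyword {
    my ($line, $keyword) = @_;
    # \Q...\E quotes any regex metacharacters that the keyword may contain.
    my $re = qr/for \Q$keyword\E on path (\d+)/;
    return $line =~ $re ? $1 : undef;
}

my $line   = 'Preparing the Exercise Activity Log for CPU on path 32 file';
my $number = number_for_keyword($line, 'CPU');
print "number is $number\n" if defined $number;
```

Calling number_for_keyword($line, 'MEMORY') on a MEMORY line would work the same way, and a keyword that does not appear in the line gives undef rather than a stale $1.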
 
That's cool goBoating, thanks!
The CPU and MEMORY lines can use the above; however, the others require something a bit more complicated, as their 'numbers' (actually hardware paths on a Unix server) will have . and / characters, as they will be SCSI paths.

So can I use the same regexp but with an extra catch for . and / characters (until the next whitespace)? I know very little about regexps... could someone show me how?

Many Thanks
Loon
 
OK, another shot

    #!perl
    $str = 'Preparing the Exercise Activity Log for SCSI on path /dev/dsk/c0t0d0s0 file';

    # this would match a CPU line (digits only after "path") but not the current string
    if ($str =~ /on path (\d+) file/) { $number = $1; }

    # the previous match failed to set $number; this one should match.
    # \S+ takes everything up to the next whitespace, so no trailing space is captured.
    if ($str =~ /Preparing the Exercise Activity Log for SCSI on path (\/\S+) file/)
        { $number = $1; }

    print "number is $number\n";

hope this helps...

keep the rudder amid ship and beware the odd typo
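For the "until the next whitespace" part of the question, \S+ (one or more non-whitespace characters) may be all that is needed. This is a rough sketch only: parse_log_line is a hypothetical helper, and it assumes the value always follows the literal text "on path" and that the keyword sits between "for" and "on path".

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): capture the keyword and whatever
# follows "on path" up to the next whitespace character.
# Returns an empty list when the line does not match.
sub parse_log_line {
    my ($line) = @_;
    if ($line =~ /for (.+?) on path (\S+)/) {
        return ($1, $2);
    }
    return;
}

for my $line (
    'Preparing the Exercise Activity Log for CPU on path 32 file',
    '-- Exercise Activity Log for SCSI Disk on path 10/0.6.0 --',
) {
    my ($keyword, $path) = parse_log_line($line);
    print "$keyword: $path\n" if defined $path;
}
```

Because \S+ does not care what the characters are, the same pattern picks up a plain number, a path such as 10/0.6.0, or a device name such as /dev/dsk/c0t0d0s0.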
 
What about something along the following lines (I am still learning this regexp stuff so I don't think this will work):

    $line = "-- Exercise Activity Log for SCSI Disk on path 10/0.6.0 --";
    $line =~ tr/a-zA-z -//;

Would that transliterate all alphabetical characters (and space and -) to nothing? Leaving the string:

    10/0.6.0

Or, would this catch the right string?:

    if (/.*[\d+]\/[\d+][\d+.]*.*/)
         ^   ^   ^  ^     ^  ^ ^
         |   |   |  |     |  | |___ Any number of any char
         |   |   |  |     |  |_____ 0+ times
         |   |   |  |     |________ 1+ digits followed by .
         |   |   |  |______________ 1+ digits
         |   |   |_________________ followed by a /
         |   |_____________________ 1+ digits
         |_________________________ Any number of any char

I am presuming that the '/' needs the backslash to escape it and that the . inside the character class [] does not need escaping.

However, this does not seem to work.
So can I just use a declaration like the following?:

    $scsi_path = /.*[\d+]\/[\d+][\d+.]*.*/;

or should I be re-assigning $& or something? e.g.

    $scsi_path = $&;

I'm a bit stuck here! Any help greatly appreciated!

Cheers,
Loon
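On the two questions above, here is a small sketch of how tr and match assignment behave in Perl. The variable names $stripped and $scsi_path are just for illustration, and the capture pattern assumes the path always looks like digits, a slash, then digits and dots.

```perl
use strict;
use warnings;

my $line = '-- Exercise Activity Log for SCSI Disk on path 10/0.6.0 --';

# tr/// only deletes characters when given the /d modifier; with an empty
# replacement list and no /d it leaves the string unchanged and just counts.
# The range here is A-Z: A-z would also cover [ \ ] ^ _ and the backtick.
(my $stripped = $line) =~ tr/a-zA-Z -//d;
print "stripped: $stripped\n";

# A match in scalar context returns true or false, not the matched text.
# In list context it returns the captured groups, so parentheses on the
# left-hand side put the first capture straight into the variable.
my ($scsi_path) = $line =~ m{(\d+/[\d.]+)};
print "captured: $scsi_path\n" if defined $scsi_path;
```

Deleting characters with tr works for this one line, but it would also glue together any other digits in the line, so capturing the path with a match is probably the safer of the two. $& would hold the whole matched text, which for a pattern wrapped in .* is the entire line.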
 
You're real close... regexes take a little playing with. If your string is fairly consistent and your example is truly representative, I don't think you need to get rid of any text, just catch the scsiPath pattern:
a space, some digits, a slash, more digits with dots, a space

    #!/usr/local/bin/perl
    $str = '-- Exercise Activity Log for SCSI Disk on path 10/0.6.0 --';
    # catch scsiPath pattern in $1
    $str =~ / (\d+\/[\d.]+) /;
    #        | |   |  |    |
    #        | |   |  |    space
    #        | |   |  [\d.]+ - one or more digits or dots
    #        | |   \/ - escaped slash
    #        | \d+ - one or more digits
    #        space
    $scsiPath = $1;
    print "matched $scsiPath\n";

Note that inside a character class + is a literal plus sign, not a quantifier, so [\d+] matches exactly one character; the quantifier goes after the closing bracket.

hope this helps

keep the rudder amid ship and beware the odd typo
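Since some lines carry a plain number and others a hardware path, the two patterns can be tried in turn. This is a sketch under assumptions: classify_path is a hypothetical helper, both patterns are anchored on the literal text " on path ", and the value is assumed to be followed by a space.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): classify the value that follows
# "on path". Returns (kind, value), or an empty list when nothing matches.
sub classify_path {
    my ($line) = @_;
    # Try the stricter hardware-path form first: digits, slash, digits/dots.
    return ('hardware path', $1) if $line =~ m{ on path (\d+/[\d.]+) };
    # Fall back to a plain number.
    return ('number', $1)        if $line =~ m{ on path (\d+) };
    return;
}

for my $line (
    '-- Exercise Activity Log for SCSI Disk on path 10/0.6.0 --',
    'Preparing the Exercise Activity Log for CPU on path 32 file',
) {
    my ($kind, $value) = classify_path($line);
    print "$kind: $value\n" if defined $kind;
}
```

Anchoring on " on path " means stray digits elsewhere in the line are less likely to be picked up by mistake.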
 