Regex Text File and format output

Status
Not open for further replies.

stevio

Vendor
Jul 24, 2002
78
AU
Hi all,

I'm trying to search a text file for two values using a regex and format the output so that the values sit side by side in tab-delimited form.

The text file looks like this. 'Val' is constant, but the number after 'Val' differs from block to block.
----------------------------
Value Val 1
.
.
more text here
.
.
Phone: 124566
.
.
Value Val 2
.
.
Phone: 345672
----------------------------

So the output should look something like this (no headings required, only the raw data with a tab or space in between):

Val 1 124566
Val 2 345672
I can find the first value, but I don't know how to capture the second value and format the output. Can someone please help?

So far, the code looks like this:
Code:
Const ForReading = 1
Set objRegEx = CreateObject("VBScript.RegExp")
objRegEx.Global = True
objRegEx.Pattern = "Val\s+\d+"
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFile = objFSO.OpenTextFile("C:\data.txt", ForReading)
strSearchString = objFile.ReadAll
objFile.Close

Set colMatches = objRegEx.Execute(strSearchString)
If colMatches.Count > 0 Then
   strMessage = "The following values were found:" & vbCrlf
   For Each strMatch in colMatches
       strMessage = strMessage &  strMatch.Value & vbcrlf
   Next
End If

Wscript.Echo strMessage
Current output looks like:

Val 1
Val 2
Val 3
etc
 
>objRegEx.Pattern = "Val\s+\d+"
[tt]objRegEx.Pattern = "(Val[ ]+\d+)(\s|.)*?Phone:[ ]+(\d+)"[/tt]

>strMessage = strMessage & strMatch.Value & vbcrlf
[tt]strMessage = strMessage & strMatch.SubMatches(0) & vbTab & strMatch.SubMatches(2) & vbCrLf[/tt]

PS: Do you know what the name strMatch conventionally implies? The str prefix says the variable holds a string, at least in the scripter's mind. Each item in the collection returned by Execute is a Match object, not a string, so why go to the trouble of typing the three letters s-t-r just to mislead yourself?
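
Putting the two changed lines into a whole script, a minimal sketch might look like the following. It assumes the sample text from the question is held in a string (strText, built inline here) rather than read from C:\data.txt, and it renames the loop variable to objMatch; both choices are illustrative, not part of the original reply.

```vbscript
Option Explicit

Dim objRegEx, colMatches, objMatch, strText, strMessage

' Assumption: inline sample mirroring the layout shown in the question.
strText = "Value Val 1" & vbCrLf & "." & vbCrLf & "more text here" & vbCrLf & _
          "Phone: 124566" & vbCrLf & "." & vbCrLf & _
          "Value Val 2" & vbCrLf & "." & vbCrLf & "Phone: 345672" & vbCrLf

Set objRegEx = CreateObject("VBScript.RegExp")
objRegEx.Global = True
' SubMatches(0) = "Val <n>", SubMatches(1) = last filler character,
' SubMatches(2) = the digits after "Phone:"
objRegEx.Pattern = "(Val[ ]+\d+)(\s|.)*?Phone:[ ]+(\d+)"

Set colMatches = objRegEx.Execute(strText)
strMessage = ""
For Each objMatch In colMatches
    ' Join the two captured values with a tab, one pair per line
    strMessage = strMessage & objMatch.SubMatches(0) & vbTab & _
                 objMatch.SubMatches(2) & vbCrLf
Next

WScript.Echo strMessage
```

Run with cscript, this should print one line per Val/Phone pair, each holding the Val text and the phone number separated by a tab. To work on the real file, replace the inline strText with the ReadAll call from the original script.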
 
tsuji,

Can you please explain the regex? When I run it, it hangs.

objRegEx.Pattern = "(Val[ ]+\d+)(\s|.)*?Phone:[ ]+(\d+)"

My guess:

Search for Val, one or more space characters ([ ]+), then one or more digits. Then search for any whitespace zero or more times with a non-greedy search, then search for Phone:, one or more spaces, and then one or more digits.

Thanks
 
> When I run it, it hangs.
It will not hang as long as your original script did not hang; those two changed lines cannot cause a hang on their own. What does your test script look like, then?
 
The problem was in the file. I copied the contents into another file and it ran fine.

tsuji, can you explain the use of (\s|.) in the regex, especially the | ?

BTW, have a star!
 
stevio, (\s|.)*? means this: match either a whitespace character (the "\s" part, which includes space, tab, line feed, carriage return, ...) or (the "|") any character other than a line break (the "." part). Hence (\s|.) matches practically anything. You need \s because the string contains many line breaks, which the dot alone will not cross. If each match sat on a single line, you would not need the \s part of (\s|.)*? and would more often see the plain non-greedy .*? instead.
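
To see the difference in practice, here is a small sketch that tests three filler patterns against a two-line string. HasMatch and strSample are hypothetical names introduced for illustration. The third pattern, [\s\S]*?, is an alternative not mentioned in the thread: a character class that also matches any character, without the overlap between the two alternatives of (\s|.).

```vbscript
Option Explicit

Dim strSample
' Assumption: a two-line sample where "Phone" sits on the line after "Val 1"
strSample = "Val 1" & vbCrLf & "Phone: 124566"

' Hypothetical helper: True if strPattern matches anywhere in strText
Function HasMatch(strPattern, strText)
    Dim objRe
    Set objRe = CreateObject("VBScript.RegExp")
    objRe.Pattern = strPattern
    HasMatch = objRe.Test(strText)
End Function

' Hypothetical helper: print the pattern and whether it matched
Sub Report(strPattern)
    If HasMatch(strPattern, strSample) Then
        WScript.Echo strPattern & " -> match"
    Else
        WScript.Echo strPattern & " -> no match"
    End If
End Sub

Report "Val 1.*?Phone"        ' dot alone cannot cross the line break
Report "Val 1(\s|.)*?Phone"   ' \s picks up the carriage return and line feed
Report "Val 1[\s\S]*?Phone"   ' class of whitespace plus non-whitespace
```

Only the first pattern should fail to match, because the dot stops at the line break between the two lines.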
 