
Regular Expression Search Pattern 2

Status
Not open for further replies.

helpeachother

Programmer
Joined
Jun 1, 2005
Messages
13
Location
CA
Hi, Everyone:

Thank you so much for your help with my debug question posted yesterday. I am sorry that I need your help again.

The following is a function I wrote in VB.NET that uses regular expressions to pull a few numbers out of a block of text I grabbed from a table on our intranet. It is supposed to grab the 1st, 4th, 8th and 9th numbers on the table row that matches searchDate (see the code below).

However, when I test the code, it never enters the "If num.Count > 1 Then" block. It seems that the numPattern regular expression does not get a match, but it should. Can anyone help me with this?

Thanks,

KHWright


Function getMemoryData(ByVal block As String, ByVal searchDate As String) As String()

    Dim matches, num As System.Text.RegularExpressions.MatchCollection
    Dim regExp As Regex
    Dim datePattern, numPattern, temp, mData(3) As String

    datePattern = "<td[^>]*>(\d\d\d\d)(\d\d)(\d\d) \d\d:\d\d:\d\d</td>"
    numPattern = "<td align=middle>(\d+)\s*</td>\s*<td align=middle>\d+\s*</td>\s*<td align=middle>\d+\s*</td>" + _
        "\s*<td align=middle>(\d+)\s*</td>\s*<td align=middle>\d+\s*</td>\s*<td align=middle>\d+\s+</td>" + _
        "\s*<td align=middle>\d+\s*</td>\s*<td align=middle>(\d+)\s*</td>\s*<td align=middle>(\d+)\s*</td>"

    mData(0) = "" : mData(1) = "" : mData(2) = "" : mData(3) = ""

    regExp = New Regex(datePattern)
    matches = regExp.Matches(block)
    regExp = Nothing

    'Only do this if there was a result from MTX Memory page
    For Each m As Match In matches 'process matches
        temp = m.Value

        If InStr(1, temp, searchDate, CompareMethod.Text) > 0 Then

            regExp = New Regex(numPattern) 'establish a regular expression
            num = regExp.Matches(temp) 'get your matches

            'Grab the last two numbers. these should be your Max & Avg
            If num.Count > 1 Then
                'DS = 1 PS = 4 : Spare = 7 : Provision = 8
                mData(0) = num(num.Count - 3).Value
                mData(1) = num(num.Count - 2).Value
                mData(2) = num(num.Count - 1).Value
                mData(3) = num(num.Count - 0).Value

            End If

        End If
    Next

    Return (mData)

End Function
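
One thing the function above does regardless of how the patterns are written: it runs numPattern against temp = m.Value, and m.Value holds only the text that datePattern matched, which is a single date cell. The sketch below illustrates that behaviour. The row layout (date and time in one cell), the shortened cell pattern and the names MatchValueDemo and CountCellsInsideDateMatch are assumptions made for illustration, not the real intranet page.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module MatchValueDemo
    ' Counts how many number cells can be found INSIDE each date match.
    Function CountCellsInsideDateMatch(ByVal block As String) As Integer
        Dim datePattern As String = "<td[^>]*>(\d\d\d\d)(\d\d)(\d\d) \d\d:\d\d:\d\d</td>"
        Dim cellPattern As String = "<td align=middle>\d+\s*</td>"
        Dim total As Integer = 0

        For Each m As Match In Regex.Matches(block, datePattern)
            ' m.Value is only the matched date cell, not the rest of the row
            total += Regex.Matches(m.Value, cellPattern).Count
        Next

        Return total
    End Function

    Sub Main()
        ' Hypothetical row: one date/time cell followed by two number cells
        Dim block As String = "<tr><td align=middle>20050530 00:00:00</td>" & _
            "<td align=middle>1065069</td><td align=middle>57106</td></tr>"

        Console.WriteLine(CountCellsInsideDateMatch(block))   ' prints 0
    End Sub
End Module
```

The number cells sit after the date cell in the row, so a search limited to m.Value never sees them.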

 
Hello,

I'm not sure if this is the reason, but you set regExp = Nothing before you loop through the matches. Check out a fantastic and very inexpensive tool. Also check out MatchEvaluators on MSDN. Basically, they use AddressOf to let you define a function that is called for every match, so the body of your loop above could be moved into a function. Good luck, and I hope this helps a bit!

Have a great day!

j2consulting@yahoo.com
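
For readers who have not used a MatchEvaluator, here is a minimal sketch of the idea mentioned above: Regex.Replace calls the supplied function once per match and substitutes whatever it returns. The names EvaluatorDemo, Bracket and BracketNumbers are made up for this example.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module EvaluatorDemo
    ' Called once per match; the return value replaces the matched text.
    Function Bracket(ByVal m As Match) As String
        Return "[" & m.Value & "]"
    End Function

    ' Wraps every run of digits in square brackets.
    Function BracketNumbers(ByVal input As String) As String
        Return Regex.Replace(input, "\d+", New MatchEvaluator(AddressOf Bracket))
    End Function

    Sub Main()
        Console.WriteLine(BracketNumbers("88173 10130"))   ' prints [88173] [10130]
    End Sub
End Module
```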
 
James:

Yes. Here it is:

20050530 00:00:00 1065069 57106 1122175 88173 10130 98303 1220478 163840 1384318
20050531 00:00:00 1065147 57028 1122175 88173 10130 98303 1220478 163840 1384318
20050601 00:00:00 1064855 57320 1122175 88173 10130 98303 1220478 163840 1384318
20050602 00:00:00 1064959 57216 1122175 88173 10130 98303 1220478 163840 1384318


Thanks,

Kathy
 
Ok, first time through, here is what I came up with. I took the sample data you gave me and wrapped it in HTML, so my patterns were working against this sample data:

<TR>
<TD align=middle>20050530</TD align=middle>
<TD align=middle>00:00:00</TD align=middle>
<TD align=middle>1065069</TD align=middle>
<TD align=middle>57106</TD align=middle>
<TD align=middle>1122175 </TD align=middle>
<TD align=middle>88173</TD align=middle>
<TD align=middle>10130 </TD align=middle>
<TD align=middle>98303</TD align=middle>
<TD align=middle>1220478</TD align=middle>
<TD align=middle>163840</TD align=middle>
<TD align=middle>1384318</TD align=middle>
</TR>
<TR>
<TD align=middle>20050531 </TD align=middle>
<TD align=middle>00:00:00</TD align=middle>
<TD align=middle>1065147 </TD align=middle>
<TD align=middle>57028</TD align=middle>
<TD align=middle>1122175</TD align=middle>
<TD align=middle>88173</TD align=middle>
<TD align=middle>10130</TD align=middle>
<TD align=middle>98303</TD align=middle>
<TD align=middle>1220478</TD align=middle>
<TD align=middle>163840</TD align=middle>
<TD align=middle>1384318</TD align=middle>
</TR>

Then, I made a pattern that separates each block of HTML that is between <tr> and </tr>. That way I knew that each match was a row in the HTML table. That pattern is:

(<TR>)[<>/=:\d\sa-zA-Z]+?(</TR>)

After I had that, I could iterate through each row of the html table and grab JUST the data with this pattern:

(?<=<TD align=middle>)([:\d\s]+)(?=</td)

Then, when you're iterating through that list of matches (and you will have 11), simply capture the ones you want... e.g. 1st, 4th, 8th and 9th number.

So you'll have two patterns.. one to get each row in your table, then a second to grab just the data from each row. Make sense?

--
James
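
The two-pattern approach described above could be sketched roughly as follows. It uses the two patterns exactly as posted, with RegexOptions.IgnoreCase so that </td matches </TD. The sample rows are rebuilt from the data in this thread with plain </TD> closing tags. The names RowCellDemo, MakeRow, SampleBlock and GetRowCells are assumptions for illustration only.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text
Imports System.Text.RegularExpressions

Module RowCellDemo
    ' Builds one table row in the same shape as the sample HTML.
    Function MakeRow(ByVal ParamArray values() As String) As String
        Dim sb As New StringBuilder("<TR>" & Environment.NewLine)
        For Each v As String In values
            sb.Append("<TD align=middle>" & v & "</TD>" & Environment.NewLine)
        Next
        sb.Append("</TR>" & Environment.NewLine)
        Return sb.ToString()
    End Function

    ' Two sample rows taken from the data posted in the thread.
    Function SampleBlock() As String
        Return MakeRow("20050530", "00:00:00", "1065069", "57106", "1122175", _
                       "88173", "10130", "98303", "1220478", "163840", "1384318") & _
               MakeRow("20050531", "00:00:00", "1065147", "57028", "1122175", _
                       "88173", "10130", "98303", "1220478", "163840", "1384318")
    End Function

    ' Returns the trimmed cell texts of the first row that contains searchDate,
    ' or an empty list when no row contains it.
    Function GetRowCells(ByVal block As String, ByVal searchDate As String) As List(Of String)
        Dim rowPattern As String = "(<TR>)[<>/=:\d\sa-zA-Z]+?(</TR>)"
        Dim cellPattern As String = "(?<=<TD align=middle>)([:\d\s]+)(?=</td)"
        Dim cells As New List(Of String)

        ' First pattern: one match per table row
        For Each row As Match In Regex.Matches(block, rowPattern, RegexOptions.IgnoreCase)
            If row.Value.Contains(searchDate) Then
                ' Second pattern: one match per cell inside that row
                For Each cell As Match In Regex.Matches(row.Value, cellPattern, RegexOptions.IgnoreCase)
                    cells.Add(cell.Value.Trim())
                Next
                Exit For
            End If
        Next

        Return cells
    End Function

    Sub Main()
        Dim cells As List(Of String) = GetRowCells(SampleBlock(), "20050531")
        Console.WriteLine(cells.Count)   ' prints 11
        Console.WriteLine(cells(2))      ' prints 1065147
    End Sub
End Module
```

Indexes 0 and 1 are the date and the time, so the first number in the row is at index 2. Which indexes correspond to the values you want depends on how you count the columns.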
 
Oops... take out the " align=middle" in the closing /TD tags... that's what I get for using find/replace. :) It'll still work for ya! :)

--
James
 
James:

Thank you SO MUCH for spending time to work this out for me. I will try it this morning and let you know the results.

I really appreciate your help, James!!!-:)

KHWright



 
James:

I rewrote the function, but the program stops at the following line:

regExp = New Regex(datePattern)

It seems that it did not like my datePattern. The following is my code for the function. Will you have a look at the code to help me find the error? Thanks.

As I understand it, regular expressions work a little differently in Perl than in VB.NET. I am using regular expressions in VB.NET.


Thanks,

Kathy



Function getMemoryData2(ByVal block As String, ByVal searchDate As String) As String()

    Dim matches, num As System.Text.RegularExpressions.MatchCollection
    Dim regExp As Regex
    Dim datePattern, numPattern, temp, mData(3), test1, test2 As String

    datePattern = "(<TR>)[<>/=:\d\sa-a-zA-Z}+?(/TR>)"
    numPattern = "(?<=<TD align=middle>)([:\d\s]+)(?=</td)"

    mData(0) = "" : mData(1) = "" : mData(2) = "" : mData(3) = ""

    regExp = New Regex(datePattern)
    matches = regExp.Matches(block)
    regExp = Nothing

    'Only do this if there was a result from MTX Memory page
    For Each m As Match In matches 'process matches
        temp = m.Value
        'If searchDate.ToString = Microsoft.VisualBasic.Left(temp, 8) Then
        If InStr(1, temp, searchDate, CompareMethod.Text) > 0 Then

            regExp = New Regex(numPattern) 'establish a regular expression
            num = regExp.Matches(temp) 'get your matches

            'Grab the last two numbers. these should be your Max & Avg
            If num.Count > 1 Then

                'DS = 0 PS = 4 : Spare = 8 : Provision = 9
                mData(0) = num.Item(0).ToString
                mData(1) = num.Item(4).ToString
                mData(2) = num.Item(8).ToString
                mData(3) = num.Item(9).ToString

                test1 = mData(0)
                test2 = mData(4)

            End If

        End If
    Next

    Return (mData)

End Function
 
Set it to ignore case...

Dim selectedRegexOptions As RegexOptions
selectedRegexOptions = RegexOptions.IgnoreCase

Then

regExp = New Regex(numPattern, selectedRegexOptions)
regExp = New Regex(datePattern, selectedRegexOptions)


--
James
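
A small sketch of why the option matters here: numPattern ends in a lowercase </td, while the sample HTML uses uppercase </TD>. The module and function names (IgnoreCaseDemo, CountCells) are made up for the example.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module IgnoreCaseDemo
    ' Counts the cells numPattern finds in html under the given options.
    Function CountCells(ByVal html As String, ByVal opts As RegexOptions) As Integer
        Dim numPattern As String = "(?<=<TD align=middle>)([:\d\s]+)(?=</td)"
        Dim re As New Regex(numPattern, opts)
        Return re.Matches(html).Count
    End Function

    Sub Main()
        Dim html As String = "<TD align=middle>57106</TD>"

        ' Case-sensitive: "</td" in the pattern does not match "</TD" in the data
        Console.WriteLine(CountCells(html, RegexOptions.None))         ' prints 0

        ' Case-insensitive: the same pattern now finds the cell
        Console.WriteLine(CountCells(html, RegexOptions.IgnoreCase))   ' prints 1
    End Sub
End Module
```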
 
James:

VB.NET still does not like the line:

regExp = New Regex(datePattern, selectedRegexOptions)


I will try to tweak the pattern a little bit to see if it will work.


Thanks again for taking the time to help me!!

KHWright
 
Your pattern is "(<TR>)[<>/=:\d\sa-a-zA-Z}+?(/TR>)"

It should be "(<TR>)[<>/=:\d\sa-a-zA-Z]+?(/TR>)"

you've got a } in place of ]

:)

--
James
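
This also explains why the program stopped on the New Regex(...) line itself: with } in place of ], the character class is never closed, and the Regex constructor rejects the pattern with an ArgumentException before any matching happens. A minimal sketch for checking a pattern up front (the names PatternCheckDemo and IsValidPattern are made up for this example):

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module PatternCheckDemo
    ' Returns True when the Regex constructor accepts the pattern.
    Function IsValidPattern(ByVal pattern As String) As Boolean
        Try
            Dim re As New Regex(pattern)
            Return True
        Catch ex As ArgumentException
            ' Raised for malformed patterns, e.g. an unterminated [ ] set
            Return False
        End Try
    End Function

    Sub Main()
        Console.WriteLine(IsValidPattern("(<TR>)[<>/=:\d\sa-a-zA-Z}+?(/TR>)"))   ' prints False
        Console.WriteLine(IsValidPattern("(<TR>)[<>/=:\d\sa-a-zA-Z]+?(/TR>)"))   ' prints True
    End Sub
End Module
```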
 
Hi, thanks, James.

You are really helpful!!!! I hope I will be able to provide help to someone else in the future.

KWRIght
 
James:

I have not tried yet. I do hope so. I will let you know.

Thanks:-)

KWright
 
James:

Sorry to bother you again. The numPattern does not match anything.

(?<=<td align=middle>)([:\d\s]+)(?<=</td>)

I do not quite understand ([:\d\s]+).

Thanks,

KWright
 
Change your

(?<=<td align=middle>)([:\d\s]+)(?<=</td>)

to

(?<=<td align=middle>)([:\d\s]+)(?=</td)

You've got an extra '<' in there. :)

--
James
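
To spell out the difference: [:\d\s]+ means "one or more characters that are each a colon, a digit or whitespace", which covers values such as 00:00:00 and 57106. (?<=...) is a lookbehind that checks the text before the current position, while (?=...) is a lookahead that checks the text after it. Placed at the end of the pattern, the lookbehind would require the digits just matched to end in </td>, which can never be true. A rough sketch, with made-up names LookaroundDemo and CountMatches:

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module LookaroundDemo
    ' Counts case-insensitive matches of pattern in html.
    Function CountMatches(ByVal html As String, ByVal pattern As String) As Integer
        Return Regex.Matches(html, pattern, RegexOptions.IgnoreCase).Count
    End Function

    Sub Main()
        Dim html As String = "<td align=middle>57106</td><td align=middle>88173 </td>"

        ' Trailing lookbehind: the matched text itself would have to end in </td>
        Console.WriteLine(CountMatches(html, "(?<=<td align=middle>)([:\d\s]+)(?<=</td>)"))   ' prints 0

        ' Trailing lookahead: </td must come next, but is not part of the match
        Console.WriteLine(CountMatches(html, "(?<=<td align=middle>)([:\d\s]+)(?=</td)"))     ' prints 2
    End Sub
End Module
```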
 
James:

I have just found out that the numPattern (?<=<td align=middle>)([:\d\s]+)(?=</td) works.

The datePattern "(<TR>)[<>/=:\d\sa-a-zA-Z]+?(/TR>)" skips every other row, e.g. it matches the 2nd row, the 4th row, etc. That is why I did not get the num data for the row I was trying to match.

Do you know what I should do to fix this problem? Thanks.

KWright
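
The thread ends without a resolution, and the real page source is not shown, so the actual cause is unknown. One thing worth ruling out is that alternate rows open with attributes (for example a background colour), since the literal <TR> in the pattern only matches a tag with no attributes. The sketch below is purely hypothetical: the bgcolor markup, the more tolerant pattern <TR[^>]*>.*?</TR> and the names RowPatternDemo, SampleRows and CountRows are all assumptions, not something taken from the intranet page.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module RowPatternDemo
    ' Hypothetical markup: every other row carries an attribute.
    Function SampleRows() As String
        Dim nl As String = Environment.NewLine
        Return "<TR><TD align=middle>20050530</TD></TR>" & nl & _
               "<TR bgcolor=silver><TD align=middle>20050531</TD></TR>" & nl & _
               "<TR><TD align=middle>20050601</TD></TR>" & nl & _
               "<TR bgcolor=silver><TD align=middle>20050602</TD></TR>"
    End Function

    ' Counts row matches; Singleline lets "." cross line breaks.
    Function CountRows(ByVal html As String, ByVal pattern As String) As Integer
        Dim opts As RegexOptions = RegexOptions.IgnoreCase Or RegexOptions.Singleline
        Return Regex.Matches(html, pattern, opts).Count
    End Function

    Sub Main()
        ' Pattern from the thread: only the rows that open with a bare <TR>
        Console.WriteLine(CountRows(SampleRows(), "(<TR>)[<>/=:\d\sa-a-zA-Z]+?(/TR>)"))   ' prints 2

        ' Tolerant pattern: attributes allowed inside the opening tag
        Console.WriteLine(CountRows(SampleRows(), "<TR[^>]*>.*?</TR>"))                  ' prints 4
    End Sub
End Module
```

If the real rows have no attributes, the next thing to check would be characters inside the rows that the [...] class does not list, since any such character stops a match from reaching /TR>.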
 