
Parsing for *any* image URLs in an HTML file

Status
Not open for further replies.

paradoxdetected

Programmer
Aug 2, 2009
4
GB
Hi guys!

New guy here: long-time lurker, first-time joiner and poster.

I'm having a particularly fun time in VB 2008.

I'm trying to return an array (that I can For/Each through) of URLs in an HTML file.

To clarify, I want (preferably using regular expressions) to download all the images from a 'Google Images' search page.

I've been fighting the following code for over a week. Can anyone tell me where I'm going wrong?

-----------------------------

Dim myRegex As New System.Text.RegularExpressions.Regex("[\w-]+\.)+[\w-]+(/[\w- ./]*)+\.(?:gif|jpg|jpeg|png|bmp|GIF|JPEG|JPG|PNG|BMP|Gif|Jpg|Jpeg|Png|Bmp)$")
Dim Valid As System.Text.RegularExpressions.MatchCollection
Dim sourcetext As String = wbGoog.DocumentText

MsgBox("source: " + sourcetext)

Valid = System.Text.RegularExpressions.Regex.Matches(sourcetext, "[\w-]+\.)+[\w-]+(/[\w- ./]*)+\.(?:gif|jpg|jpeg|png|bmp|GIF|JPEG|JPG|PNG|BMP|Gif|Jpg|Jpeg|Png|Bmp)$")

For Each match As System.Text.RegularExpressions.Match In Valid
MsgBox(match)
MsgBox("for each initiated!")
Next

-----------------------------

Thank you all for your time. I'm sure I'm doing something stupid here, so any - ANY - help would be appreciated :)

Thanks all!

ParadoxDetected.
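
For anyone reading along, two things in the pattern above stand out. The first group is closed with ")" but never opened, so the Regex constructor throws before any matching happens. The trailing "$" also only matches at the very end of the whole string unless RegexOptions.Multiline is set. Below is a rough sketch of one way to repair it, not a definitive fix. The module name, function name, and the exact pattern are my own assumptions. Like the original, it matches from the host name onward and leaves out the "http://" part. It also drops the space from the path character class.

```vbnet
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Public Module ImageUrlRegex
    ' Hypothetical pattern (an assumption, not the poster's final code):
    ' dotted host, optional /path segments, then an image extension.
    ' IgnoreCase replaces the hand-written GIF/Gif/gif variants.
    Private ReadOnly ImagePattern As New Regex( _
        "(?:[\w-]+\.)+[\w-]+(?:/[\w\-.]*)*\.(?:gif|jpe?g|png|bmp)\b", _
        RegexOptions.IgnoreCase)

    ' Returns every substring of html that looks like an image URL.
    Public Function FindImageUrls(ByVal html As String) As List(Of String)
        Dim urls As New List(Of String)
        For Each m As Match In ImagePattern.Matches(html)
            ' m.Value is the matched text; passing the Match object
            ' itself to MsgBox does not display it.
            urls.Add(m.Value)
        Next
        Return urls
    End Function
End Module
```

With that in place, something like `For Each url As String In FindImageUrls(wbGoog.DocumentText)` should give a list you can loop over. Regular expressions are fragile against real-world HTML, so treat the results as candidates rather than a guaranteed list.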
 
Look at using a WebBrowser and an HtmlDocument. Then it is as simple as reading the .Images property, which contains all of the images on a web page.

Where web1 is a WebBrowser control.
Code:
    Private Sub web1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles web1.DocumentCompleted
        Dim wx As HtmlDocument = web1.Document
        Dim picLnks As HtmlElementCollection = wx.Images

        'play with pictures.
    End Sub
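
To flesh out the "play with pictures" step, here is a minimal sketch of pulling the URL out of each image element. The module and function names are hypothetical. It assumes the document has finished loading, which is the case inside DocumentCompleted.

```vbnet
Imports System.Collections.Generic
Imports System.Windows.Forms

Public Module ImageUrlHelper
    ' Collects the src attribute of every <img> element in the document.
    ' GetImageUrls is an illustrative helper name, not a framework method.
    Public Function GetImageUrls(ByVal doc As HtmlDocument) As List(Of String)
        Dim urls As New List(Of String)
        If doc Is Nothing Then Return urls

        For Each img As HtmlElement In doc.Images
            Dim src As String = img.GetAttribute("src")
            ' Skip images that have no src attribute set.
            If Not String.IsNullOrEmpty(src) Then urls.Add(src)
        Next
        Return urls
    End Function
End Module
```

Inside the handler above you could then call `GetImageUrls(web1.Document)` and loop over the result with For Each. This only sees images the browser control has actually parsed into the page, so anything added later by script may be missing.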

-I hate Microsoft!
-Forever and always forward.
-My kingdom for a edit button!
 