Download text only from web into Excel 3

Status
Not open for further replies.

Aubs010

Technical User
Apr 4, 2003
306
GB
I'm not a newbie with VBA; I have created a fair few Access databases, Excel workbooks and Word documents with vast amounts of VB coding.

What I want to achieve:
I want to be able to gather information from a website into an Excel worksheet.
The URLs are in the format me.ur.com/id=X
where X is 1 to 10000.

The problem I'm having:
It's taking way too long to get the information from the website.

The question:
Is there any way to download ONLY TEXT, and not graphics? - I think this is what is taking so long to download.


Any help greatly appreciated.




Aubs
 
Not sure. How are you getting the download?

WinHTTP will bring in WinHTTP.ResponseBody, or WinHTTP.ResponseText.

Unfortunately, ResponseText brings in the HTML, as text, itself. Yes, the displayed text is there, but it is included with everything else.

Please post further if you find a way to extract ONLY the displayed text.
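
One possible way to get only the displayed text is to hand the HTML string to the MSHTML parser and read back innerText. This is a rough sketch, not a tested solution: HtmlToText is a made-up helper name, and it assumes the "htmlfile" object is available on the machine (it ships with Internet Explorer). It only helps with cleaning up the text; the full HTML is still downloaded.

```vba
' Sketch only: HtmlToText is a hypothetical helper, not part of WinHTTP.
' Late binding is used, so no library reference has to be set.
Function HtmlToText(ByVal html As String) As String
    Dim doc As Object
    Set doc = CreateObject("htmlfile")   ' MSHTML document object
    doc.Open
    doc.Write html                       ' let MSHTML parse the markup
    doc.Close
    HtmlToText = doc.body.innerText      ' rendered text without the tags
End Function
```

You would call it on the response, e.g. HtmlToText(WinHTTP.ResponseText), and write the result to a cell.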

Gerry
See my Paintings and Sculpture
 
Are you using something like this?


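Sub Read_URL()

    ' The page address that followed "URL;" is missing from the post;
    ' it goes inside the Connection string, e.g. "URL;<address>".
    With ActiveSheet.QueryTables.Add(Connection:="URL;", Destination:=Range("a1")) 'write web-page to sheet
        .BackgroundQuery = True
        .TablesOnlyFromHTML = False
        .Refresh BackgroundQuery:=False   ' wait until the page has loaded
        .SaveData = True
    End With

End Sub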
 
Guys,

Sorry for the delay in getting back to you...

I was using something very similar to the example given by ETID. However, I have looked into the information I will be downloading, and all I need now is the actual HTML text, if possible, i.e. download all of this:

Code:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML>
  <HEAD>
    <TITLE>This is the title</TITLE>
  </HEAD>
  <BODY>
    The Body Goes Here
  </BODY>
</HTML>

if this makes any sense to anyone!

Thanks in advance for your help :)



Aubs
 
Fumei,

I have found that XML (MSXML2.XMLHTTP) does the same once you turn on the reference to it:
Code:
    Dim text As String
    Dim XML As MSXML2.XMLHTTP
    Set XML = New MSXML2.XMLHTTP
    Dim URL
    URL = "http://website.com/search=" & ItemNumber & Chr(34)

    XML.Open "GET", URL, False
    XML.send

    text = XML.responseText

    Set XML = Nothing
text returns the html of the page.
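
For the 1 to 10000 range from the first post, the same request can be put in a loop that reuses one XMLHTTP object. This is only a sketch under assumptions: BuildUrl and FetchPages are made-up names, the base address is taken from the URL pattern in the first post with "http://" added, late binding is used instead of the reference, and there is no error handling for an unreachable host.

```vba
' Hypothetical helper: builds the address for one id.
Function BuildUrl(ByVal baseUrl As String, ByVal id As Long) As String
    BuildUrl = baseUrl & CStr(id)
End Function

Sub FetchPages()
    Const BASE_URL As String = "http://me.ur.com/id="   ' assumed from the first post
    Const LAST_ID As Long = 10                           ' raise once the loop behaves

    Dim http As Object
    Dim id As Long

    Set http = CreateObject("MSXML2.XMLHTTP")            ' one object for all requests
    For id = 1 To LAST_ID
        http.Open "GET", BuildUrl(BASE_URL, id), False   ' False = synchronous
        http.send
        ActiveSheet.Cells(id, 1).Value = id
        If http.Status = 200 Then
            ' A cell holds at most 32,767 characters, so longer pages are cut off.
            ActiveSheet.Cells(id, 2).Value = Left$(http.responseText, 32767)
        Else
            ActiveSheet.Cells(id, 2).Value = "HTTP " & http.Status
        End If
    Next id
    Set http = Nothing
End Sub
```

Column A gets the id and column B the raw HTML (or the HTTP status if the request did not return 200).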

However, do you think your way might be quicker?

Thanks for your help, I'll give you a star for your efforts :)



Aubs
 
Quicker? Gee, I don't know. I would have to test XML vs WinHTTP. If there is a difference, I doubt it would be very much either way. Both use, I would think, the HTTP protocol. XML may, repeat may, parse the response into text faster as it is newer. There is a new WinHTTP (version 5.2) now available from Microsoft.
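
A rough way to test it would be to time one synchronous GET with each object. This is a sketch only: TimeGet and CompareClients are made-up names, the test address is assumed from the URL pattern in the first post, and a single request is not a fair benchmark. Keep in mind that XMLHTTP may serve repeat requests from the Internet Explorer cache, which can make it look faster than it is.

```vba
' Times one synchronous GET with the given ProgID; returns elapsed seconds.
Function TimeGet(ByVal progId As String, ByVal url As String) As Double
    Dim req As Object
    Dim t0 As Double

    Set req = CreateObject(progId)     ' late binding, no reference needed
    t0 = Timer                         ' seconds since midnight
    req.Open "GET", url, False         ' False = wait for the response
    req.send
    TimeGet = Timer - t0
End Function

Sub CompareClients()
    Const TEST_URL As String = "http://me.ur.com/id=1"   ' assumed test address
    Debug.Print "XMLHTTP: "; TimeGet("MSXML2.XMLHTTP", TEST_URL)
    Debug.Print "WinHTTP: "; TimeGet("WinHttp.WinHttpRequest.5.1", TEST_URL)
End Sub
```

The results show up in the Immediate window; running it several times with different ids gives a better picture than one run.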

Gerry
See my Paintings and Sculpture
 
I tried both and WinHTTP took considerably longer to process than XML...

Think I'll use XML for now! :)

Thanks all the same for your help, I wouldn't have got underway without it - much appreciated.



Aubs
 