Extracting HMTL data into Foxpro Form

(OP)
Hi experts,

I used the code below in a form command button to display a web page's content and then copy that content into an edit box. When the URL is www.google.lk this works, but for some other web sites it says "member BODY does not evaluate to an object" (especially when the URL is a PDF file).

Please help me to solve this problem.

*****************************
SET TALK OFF

LOCAL oInet

mc=ALLTRIM(thisform.url.Value)
lcURL = [&mc]

oInet=CREATEOBJECT("InternetExplorer.Application")
oInet.Navigate([&lcURL])

DO WHILE oInet.busy
wait "busy" window nowait timeout 1
ENDDO


thisform.edit2.value=oinet.document.body.InnerText
release oInet

**************************************************

RE: Extracting HMTL data into Foxpro Form

Hi

You could test for the document having a body of type object before you try and take the InnerText.

CODE

if type("oinet.document.body") = "O"
  thisform.edit2.value=oinet.document.body.InnerText
else
  thisform.edit2.value="Unreadable"
endif 

Why are you doing all that macro substitution?

Would this not work? oInet.Navigate(alltrim(thisform.url.value))
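
Putting the two suggestions together, a minimal sketch of the button code might look like the following. It assumes the control names from the original post (url, edit2); the Quit() call is an addition so that the hidden Internet Explorer instance does not stay running, and the TYPE() check only guards against a missing body, it does not make a PDF readable.

CODE --> vfp

LOCAL loInet, lcURL

* No macro substitution needed: pass the trimmed string directly
lcURL = ALLTRIM(Thisform.url.Value)

loInet = CREATEOBJECT("InternetExplorer.Application")
loInet.Navigate(lcURL)

* Same wait loop as in the original post
DO WHILE loInet.Busy
   WAIT WINDOW "busy" NOWAIT
ENDDO

* Only read InnerText when the document really has a body object
IF TYPE("loInet.Document.Body") = "O"
   Thisform.edit2.Value = loInet.Document.Body.InnerText
ELSE
   Thisform.edit2.Value = "Unreadable"
ENDIF

* Close the hidden browser instance, then drop the reference
loInet.Quit()
RELEASE loInet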

Regards

Griff
Keep Smileing

There are 10 kinds of people in the world, those who understand binary and those who don't.

I'm trying to cut down on the use of shrieks (exclamation marks), I'm told they are !good for you.

RE: Extracting HMTL data into Foxpro Form

(OP)
Hi Griff,

I tried your code but I am still getting the error. For www.google.lk it works, but for www.yahoo.com it throws the error message "member BODY does not..."
For online PDF files it says "Unreadable". Is there any way I can read the contents of an online PDF file? (My main objective is to read a list of online PDF files and copy their text into the edit box.)

I used the macro because the form refused to navigate to the URL typed in the text box.

Regards MSiddeek.

RE: Extracting HMTL data into Foxpro Form

Quote:

specially when the url is a pdf file

Well, if the document is a PDF, it won't have a <body> tag, hence the error.

If your aim is to extract the text from a PDF, you will need to find some other way of doing it. Internet Explorer can display PDF text, but it doesn't know what the text contains.

Mike




__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: Extracting HMTL data into Foxpro Form

Just dabbling with IntelliSense, I can see the following:

oInet=CREATEOBJECT("InternetExplorer.Application")
oInet.Navigate(" < some PDF file >")
oPDF = oInet.Document


oPDF has PEMs (properties, events and methods) that will allow you to navigate the pages of the PDF, print it, change things like the zoom factor and the number of pages to view, etc. But nothing that will let you get at the contents of the PDF.

Mike

RE: Extracting HMTL data into Foxpro Form

You are waiting for Busy to become false. You should wait for ReadyState to become 4 (DONE) instead: https://developer.mozilla.org/en-US/docs/Web/API/X...
Besides, your cascades of macro substitution make no sense, though they introduce no problem.

Let's look at it:

CODE

SET TALK OFF

LOCAL loInet, lcURL

lcURL = "www.yahoo.com"  && ALLTRIM(thisform.url.Value)

oInet=CREATEOBJECT("InternetExplorer.Application")
oInet.Navigate(lcURL)

DO WHILE loInet.readystate<>4
? loInet.busy, loInet.reqadystate
ENDDO
? Left(loInet.document.body.innerText,160)+"..." 

Works for me, and busy gets .T. after readystate becomes 4, so that alone also doesn't explain what you experience. I wouldn't guarantee checking busy is sufficient. The readystate speaks of the document you load and is the more relevant status.

If that doesn't work for you for some sites, are they perhaps blocked for you?
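
If a site is blocked or never finishes loading, a loop that waits only on ReadyState will spin forever. A minimal sketch of the same wait with a timeout follows; the 30-second limit and the variable names are arbitrary choices for illustration, and SECONDS() restarts at midnight, so this simple comparison is not safe across midnight.

CODE --> vfp

LOCAL loInet, lnStart, lnTimeout
lnTimeout = 30   && seconds; an arbitrary value chosen for this sketch

loInet = CREATEOBJECT("InternetExplorer.Application")
loInet.Navigate("www.yahoo.com")

* Wait for ReadyState 4, but give up after lnTimeout seconds
lnStart = SECONDS()
DO WHILE loInet.ReadyState <> 4 AND SECONDS() - lnStart < lnTimeout
   DOEVENTS
   =INKEY(0.1)   && short pause so the loop does not spin at full speed
ENDDO

IF loInet.ReadyState = 4
   ? LEFT(loInet.Document.Body.InnerText, 160) + "..."
ELSE
   ? "Page did not finish loading within", lnTimeout, "seconds"
ENDIF

loInet.Quit()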

Bye, Olaf.

Olaf Doschke Software Engineering
https://www.doschke.name

RE: Extracting HMTL data into Foxpro Form

Just to step back a bit.

Am I right in saying that there are two separate problems here:

1. How to prevent the "not an object" error; and

2. How to extract the contents of a PDF.

I can't reproduce the first problem. Olaf has given you some advice that might be useful.

As far as the contents of the PDF are concerned, you won't be able to display its contents even when you have solved the first problem, for the reasons I have stated. Instead, you could try this:

1. Drop a Microsoft Web Browser OLE control onto your form. Name it, say, oBrowser.

2. At the point at which you want to display the PDF: oBrowser.Navigate2(" < url of your pdf > ")

The PDF should now appear within the control.

Whether this works or not will depend on a setting within Internet Explorer that determines whether the browser itself displays PDFs or whether it opens them in the default PDF viewer. I haven't used IE for years, so I can't say where that setting is; you will have to dig around.

NOTE: The above remark re Internet Explorer will apply even if you are using a different browser, or even if you have a version of Windows in which IE has been replaced by Edge. The Microsoft Web Browser control is a wrapper for IE, which is always present, even in Windows 10.
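
As a sketch, the call in step 2 could sit in the Click event of a command button. The name oBrowser comes from step 1 and the url text box from the original post; the button itself is hypothetical.

CODE --> vfp

* Click event of a hypothetical "Get PDF" command button
LOCAL lcURL
lcURL = ALLTRIM(Thisform.url.Value)

* Only navigate when the user has actually entered something
IF NOT EMPTY(lcURL)
   Thisform.oBrowser.Navigate2(lcURL)
ENDIF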

Mike

RE: Extracting HMTL data into Foxpro Form

(OP)
Mike / Olaf

As Mike mentioned in his last post, my main problem is copying the content from the PDF. I am in the process of extracting about 102,000 records from online PDF result files stored on a web site. Each PDF is one record, so I need to open all those files and copy their contents into a table using a FoxPro form.

Yes Mike, I did drop the Microsoft Web Browser OLE control onto a form and navigate to the PDF, which now opens within the form control.
But my problem is getting the content from the web control into a text / edit box.

MSiddeek

RE: Extracting HMTL data into Foxpro Form

Olaf,

this works better:

CODE --> vfp

SET TALK OFF

LOCAL loInet, lcURL

lcURL = "www.yahoo.com"  && ALLTRIM(thisform.url.Value)

loInet = CREATEOBJECT("InternetExplorer.Application")
loInet.Navigate(lcURL)

* Wait until the document has finished loading (ReadyState = 4)
DO WHILE loInet.ReadyState <> 4
   ? loInet.Busy, loInet.ReadyState
ENDDO
* Clear the progress output before showing the result
CLEAR
? LEFT(loInet.Document.Body.InnerText, 160) + "..."

Regards,
Koen

RE: Extracting HMTL data into Foxpro Form

Correct, I forgot to put the l everywhere.

MSiddeek, Mike is right about PDFs: when a browser displays a PDF, it's not within the HTML DOM. You can forget your idea of getting the PDF content from the DOM.
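
One way to see this without a browser at all is to download the file and look at its first bytes. The sketch below uses the Windows API function URLDownloadToFile from urlmon.dll; the URL and the local file name are placeholders, not real addresses. It only confirms that what arrives is a PDF file rather than HTML, it does not extract any text.

CODE --> vfp

DECLARE INTEGER URLDownloadToFile IN urlmon.dll ;
   INTEGER pCaller, STRING szURL, STRING szFileName, ;
   INTEGER dwReserved, INTEGER lpfnCB

LOCAL lcURL, lcFile, lnResult
lcURL  = "http://www.example.com/some.pdf"    && placeholder URL
lcFile = ADDBS(SYS(2023)) + "sample.pdf"      && SYS(2023) = temp folder

* A return value of 0 means the download succeeded
lnResult = URLDownloadToFile(0, lcURL, lcFile, 0, 0)
IF lnResult = 0 AND FILE(lcFile)
   * A PDF file starts with the signature "%PDF", not with HTML tags
   ? LEFT(FILETOSTR(lcFile), 4) == "%PDF"
ELSE
   ? "Download failed, result code:", lnResult
ENDIF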

Bye, Olaf.

RE: Extracting HMTL data into Foxpro Form

Quote:

my problem is getting the content from the web control to a text / edit box.

Why do you want to do that? Is it because you want the user to be able to edit the text? And then save it back to the PDF? If so, you would do better to look for a dedicated PDF-editing tool.

But if you simply want to display the text, the web browser control already does that for you. It is different from an edit box in as much as it also displays all the original formatting, which an edit box does not. It also lets you follow hyperlinks, which might or might not be desirable. But the main thing is that, when displaying a PDF, the web browser control is to all intents and purposes read-only.

Mike

RE: Extracting HMTL data into Foxpro Form

Besides all that, if the PDF texts come from data, then why do you need that? Because it's not your data?

Bye, Olaf.

RE: Extracting HMTL data into Foxpro Form

(OP)
Olaf

Yes, these PDF test results belong to one of my clients, for whom I am developing a new SQL back-end database application. He needs his old data to be incorporated into this new application. The main problem is that my client has access to the PDF files but not to the cloud database. The previous developer, who has the password for the cloud database, cannot be traced at all.

Mike

At present I have a table containing the web URLs of these PDF files, e.g. www.myweb.com/reports/123565.pdf. I developed a small project and created a form to fetch the records from the PDF files it navigates to.
The form I created is attached below.

I treat each CR-terminated line in the edit box as one field. Pressing the Get Record button fills the respective fields. The Save button then writes the fields to the table, advances to the next record, and places its URL in the URL field. The Get PDF button loads the PDF file from the web, after which I manually copy and paste the content from the web control into the edit box.

MSiddeek.

RE: Extracting HMTL data into Foxpro Form

MSiddeek, seeing your form doesn't really help. The issue is NOT the user interface; it's finding some way of extracting the underlying text.
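
One possible route, offered as a sketch and not something tested against these files: download each PDF to disk and run it through an external command-line converter such as pdftotext (shipped with Xpdf and Poppler), then read the resulting text file. The code assumes pdftotext.exe is installed and on the PATH, and that the URL table is called pdflist with a character field url and a memo field content; those names are hypothetical. It will only help if the PDFs contain real text, not scanned images.

CODE --> vfp

* Assumptions: pdftotext.exe is on the PATH; the table pdflist has a
* character field url and a memo field content (hypothetical names).
DECLARE INTEGER URLDownloadToFile IN urlmon.dll ;
   INTEGER pCaller, STRING szURL, STRING szFileName, ;
   INTEGER dwReserved, INTEGER lpfnCB

LOCAL lcPdf, lcTxt, lcURL, loShell
lcPdf = ADDBS(SYS(2023)) + "current.pdf"   && work files in the temp folder
lcTxt = ADDBS(SYS(2023)) + "current.txt"
loShell = CREATEOBJECT("WScript.Shell")

USE pdflist IN 0
SELECT pdflist
SCAN FOR EMPTY(content)
   lcURL = ALLTRIM(url)
   * URLDownloadToFile needs a scheme such as http://
   IF NOT "://" $ lcURL
      lcURL = "http://" + lcURL
   ENDIF

   * Remove leftovers from the previous record
   IF FILE(lcPdf)
      ERASE (lcPdf)
   ENDIF
   IF FILE(lcTxt)
      ERASE (lcTxt)
   ENDIF

   IF URLDownloadToFile(0, lcURL, lcPdf, 0, 0) = 0 AND FILE(lcPdf)
      * Run hidden (0) and wait until the converter has finished (.T.)
      loShell.Run('pdftotext -layout "' + lcPdf + '" "' + lcTxt + '"', 0, .T.)
      IF FILE(lcTxt)
         REPLACE content WITH FILETOSTR(lcTxt)
      ENDIF
   ENDIF
ENDSCAN
USE IN pdflist

Records whose content field is still empty afterwards are the ones where the download or the conversion failed, so the loop can simply be run again for those. Splitting the extracted text into individual fields would still follow the same line-by-line logic as the existing form.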

Mike
