link checker

Status
Not open for further replies.

lizok

Programmer
Jan 11, 2001
82
US
Can anyone tell me how to develop a link checker that runs across a site? Maybe point me in the right direction?

thank you
liz
 
Is this across your site or across third-party sites that a user submits?

ALFII.com
---------------------
If this post answered or helped to answer your question, please reply with such so that forum members with a similar question will know to use this advice.
 
This would be across the whole site. I'd like to specify a root dir; the link checker should then scan all files and all links and report on broken links, something like Xenu does.

thanx
liz
 
What specifically are you looking for that Xenu doesn't do? It is by far the best program of its type that I have found.

Hope this helps

Wullie

Fresh Look - Quality Coldfusion 7/Windows Hosting

The pessimist complains about the wind. The optimist expects it to change. The leader adjusts the sails. - John Maxwell
 
I don't want to run Xenu as a separate program. I'd like to have the same functionality built into my CF application, where I put in a URL, click the button, and get the same report that Xenu produces.
 
In that case, I would say that your question is far too general.

The only real way that anyone can respond is either to give you the full code (which is not the Tek-Tips way) or to tell you how a link checker works, which you already seem to know.

It takes the URL, fetches the page, parses it for links, and then repeats the process for each link it finds, logging all of the errors along the way.

Start working on your application and when you hit specific problems that you cannot solve yourself, then come back here with those, posting the exact code that you are having issues with.

Hope this helps

Wullie

 
I have the code...

but all you need to do is give the link checker script a root URL, and it does the following:

* crawls each page, gets links that it doesn't have already and saves them to a var.
* marks each link in the links var that it has crawled so it does not crawl it twice.
* marks each link in the links var that it could not crawl so you know which are bad.

I wrote the script to automatically update my google sitemaps files for my sites.
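
The three steps listed above could be sketched in tag-based CFML roughly as follows. This is not the script mentioned in this post (that code was never shown); it is a minimal illustration under some loud assumptions: `startUrl`, `maxPages`, `queue`, `seen`, and `bad` are made-up names, `http://www.example.com/` is a placeholder, only absolute `href="http..."` links in double quotes are picked up (relative links are ignored), and any status outside 200-399 is treated as broken.

```cfml
<!--- Sketch only. startUrl, maxPages, queue, seen and bad are hypothetical names. --->
<cfset startUrl = "http://www.example.com/">
<cfset maxPages = 200>
<cfset queue = arrayNew(1)>   <!--- links still to be crawled --->
<cfset seen = structNew()>    <!--- every link found so far, so nothing is crawled twice --->
<cfset bad = structNew()>     <!--- links that could not be fetched, with their status --->
<cfset arrayAppend(queue, startUrl)>
<cfset seen[startUrl] = true>
<cfset checked = 0>

<cfloop condition="arrayLen(queue) GT 0 AND checked LT maxPages">
    <cfset pageUrl = queue[1]>
    <cfset arrayDeleteAt(queue, 1)>
    <cfset checked = checked + 1>
    <cfhttp url="#pageUrl#" method="GET" timeout="4" throwonerror="no" result="page"></cfhttp>
    <!--- statusCode is a string such as "200 OK"; val() keeps the leading number, or gives 0 --->
    <cfset statusNumber = val(page.statusCode)>
    <cfif statusNumber LT 200 OR statusNumber GTE 400>
        <cfset bad[pageUrl] = page.statusCode>
    <cfelseif left(pageUrl, len(startUrl)) EQ startUrl AND isSimpleValue(page.fileContent)>
        <!--- Only text pages under the root URL are parsed for more links. --->
        <cfset pos = 1>
        <cfloop condition="pos GT 0">
            <cfset m = reFindNoCase('href\s*=\s*"(https?://[^"##]+)', page.fileContent, pos, true)>
            <cfif m.pos[1] EQ 0>
                <cfset pos = 0>
            <cfelse>
                <cfset link = mid(page.fileContent, m.pos[2], m.len[2])>
                <cfset pos = m.pos[1] + m.len[1]>
                <cfif NOT structKeyExists(seen, link)>
                    <cfset seen[link] = true>
                    <cfset arrayAppend(queue, link)>
                </cfif>
            </cfif>
        </cfloop>
    </cfif>
</cfloop>
<cfdump var="#bad#" label="Broken links">
```

External links are still requested once, so dead ones get reported, but they are not parsed, which keeps the crawl on the site. The `maxPages` cap is only there so a large site cannot keep the request running indefinitely.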

 
I guess what I was looking for is a tag that validates the URL. Parsing is easy. What I think I don't understand is HOW the URL is checked. Is there a CF tag that returns some kind of code that would indicate that the LINK is dead?

imstillatwork: are you sharing the code? :) or selling?

liz

 
cfhttp will get (or send) the HTTP information. You can use it to get a URL's contents and its HTTP status code. If the status code is 200 OK, then the link works fine.

Quote (liz): imstillatwork: are you sharing the code? :) or selling?

neither, but I am willing to help.

Make sure you have the cfml reference. It should be your best friend while coding if you're new, and always be close no matter how long you've been doing this.


Here is a good start for your link check problem. Remember, a link is really just a URL.
Code:
<!--- Fetch the page, then dump everything cfhttp returned (status code, headers, content). --->
<cfhttp url="http://www.google.com" method="GET" timeout="4" useragent="Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"></cfhttp>
<cfdump var="#cfhttp#">

Now that you've checked the URL 'http://www.google.com', you can see all sorts of fun stuff that cfhttp returns. Read through the reference and see what each item is. Now you can tell if the link was good or not.
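
To turn the dump into a yes/no answer, something like the following might do. It is only a sketch: `linkUrl`, `linkResult`, and `statusNumber` are hypothetical names, and counting anything from 200 to 399 as "good" is an assumption rather than a rule from this thread. It relies on `statusCode` being a string such as "200 OK", so `val()` pulls out the leading number and gives 0 when there is no number at all (for example on a connection failure).

```cfml
<!--- Sketch only: linkUrl, linkResult and statusNumber are hypothetical names. --->
<cfset linkUrl = "http://www.google.com">
<cfhttp url="#linkUrl#" method="GET" timeout="4" throwonerror="no" result="linkResult"></cfhttp>

<!--- "200 OK" becomes 200; text with no leading number becomes 0 --->
<cfset statusNumber = val(linkResult.statusCode)>

<cfif statusNumber GTE 200 AND statusNumber LT 400>
    <cfoutput>#linkUrl# looks fine (#linkResult.statusCode#)</cfoutput>
<cfelse>
    <cfoutput>#linkUrl# looks broken (#linkResult.statusCode#)</cfoutput>
</cfif>
```

Since cfhttp follows redirects by default, a redirected link will normally report the status of the final page rather than the 3xx response itself.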
 