URL freezing

Status
Not open for further replies.

brinker

Programmer
May 31, 2001
48
CA
Hi,

I have written a program that loops around grabbing information from one URL, but with different search queries. The program sometimes runs for days until completion as expected. However, other times, it freezes for unknown reasons. I am catching all errors, but there are none to be caught.

I have already accounted for the fact that the page may be temporarily unavailable, that there may be bind or other exceptions, and so on. Is there something that I am missing when dealing with URLs, or is there a way to escape from these deadlock conditions?

thanks

Narjit
 
>> is there a way to escape from these deadlock conditions.

If you mean a synchronization deadlock, no. A deadlock situation is a bug and you must fix it to correct the application's behavior.

If you mean a blocking socket call that never returns, you can use a TimerTask to time out on your end and then close the Socket, or even stop the Thread if you need to.
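
A minimal sketch of the TimerTask idea, assuming the blocked call is a read on a plain java.net.Socket. The class name WatchdogRead and the method readWithWatchdog are made up for illustration and do not come from the original program. Closing a socket from another thread makes a read that is blocked on it fail with a SocketException, which the caller can then catch like any other error.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.Socket;
import java.util.Timer;
import java.util.TimerTask;

// Hypothetical helper: names are illustrative, not from the original program.
public class WatchdogRead {

    /**
     * Reads one byte from the socket, but gives up after timeoutMillis.
     * Returns the byte (or -1 at end of stream); throws IOException if the
     * watchdog closed the socket while the read was still blocked.
     */
    public static int readWithWatchdog(final Socket socket, long timeoutMillis)
            throws IOException {
        Timer timer = new Timer(true); // daemon timer thread, will not keep the JVM alive
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                try {
                    socket.close(); // a read blocked on this socket fails with SocketException
                } catch (IOException ignored) {
                    // nothing useful to do if close itself fails
                }
            }
        }, timeoutMillis);
        try {
            InputStream in = socket.getInputStream();
            return in.read(); // without the watchdog this could block indefinitely
        } finally {
            timer.cancel(); // the read finished or failed, so the watchdog is no longer needed
        }
    }
}
```

If you control the Socket or URLConnection yourself, Socket.setSoTimeout, or URLConnection.setConnectTimeout and setReadTimeout, may be simpler than a hand-rolled watchdog.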


-pete
I just can't seem to get back my IntelliSense
 
Thanks Pete,

Is there a way to use a timer task without explicitly using threads?

I am accessing the URL using a simple function call.


Narjit
 
Narjit,

Actually, the TimerTask runs in a thread. The key is that you need to be able to unblock the thread you're calling your function in, so depending on the API you're using, that may or may not be possible. If it is not possible, then you want to run the call in its own thread so that you can stop the thread that is blocked on the API call.

Does that make sense?
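
One way to sketch the separate-thread approach is shown here. It swaps in an ExecutorService and Future.get with a timeout instead of stopping a thread directly, since Thread.stop is deprecated. The class TimedCall, the method callWithTimeout, and the fallback parameter are all hypothetical names chosen for this example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: names are illustrative, not from the original program.
public class TimedCall {

    /**
     * Runs the task on a worker thread and waits at most timeoutMillis for it.
     * Returns the task's result, or fallback if the task did not finish in time.
     */
    public static <T> T callWithTimeout(Callable<T> task, long timeoutMillis, T fallback)
            throws InterruptedException, ExecutionException {
        // Daemon worker, so a call that never returns cannot keep the JVM alive.
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r);
            t.setDaemon(true);
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Interrupts the worker. Blocking socket I/O may ignore the interrupt,
            // in which case the worker is simply abandoned.
            future.cancel(true);
            return fallback;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

The calling loop stays single-threaded from its own point of view: it hands each fetch to callWithTimeout and moves on to the next query when the fallback comes back.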


-pete
I just can't seem to get back my IntelliSense
 
I'll try that and let you know if this solves the problem.

Thanks

Narjit
 
Something else to keep in mind as a possible cause: some sites institute spider management logic.

The concept is that spider programs need to be well behaved. This means not submitting one request after another continuously forever. They should not pound a web server with requests, but rather divide the work up for execution at different times, ideally during low-load periods.

If a system identifies you as a spider (quite easy, actually) and the server load climbs above a threshold set by the administrator, then the system could strand your requests.
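
A rough sketch of one way to space requests out, assuming a fixed minimum delay between requests is acceptable. The class PoliteFetcher and its methods are hypothetical, and the right delay depends entirely on the site being queried.

```java
// Hypothetical helper: names are illustrative, not from the original program.
public class PoliteFetcher {
    private final long minDelayMillis;
    private long lastRequestAt = -1; // -1 means no request has been made yet

    public PoliteFetcher(long minDelayMillis) {
        this.minDelayMillis = minDelayMillis;
    }

    /** How long the caller should still wait before the next request. */
    public synchronized long millisUntilNextRequest(long nowMillis) {
        if (lastRequestAt < 0) {
            return 0; // first request may go out immediately
        }
        long elapsed = nowMillis - lastRequestAt;
        return Math.max(0, minDelayMillis - elapsed);
    }

    /** Records that a request was sent at the given time. */
    public synchronized void recordRequest(long nowMillis) {
        lastRequestAt = nowMillis;
    }

    /** Sleeps until the minimum delay has passed, then marks a request as sent. */
    public void awaitTurn() throws InterruptedException {
        long wait = millisUntilNextRequest(System.currentTimeMillis());
        if (wait > 0) {
            Thread.sleep(wait);
        }
        recordRequest(System.currentTimeMillis());
    }
}
```

Calling awaitTurn() before each query in the loop keeps the program from sending requests back to back.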


-pete
I just can't seem to get back my IntelliSense
 
With the threaded version, it appears that I need to specify a protocol. This is the error I am getting:

java.net.MalformedURLException: no protocol

How does one specify a protocol in Java?
 
>> java.net.MalformedURLException: no protocol

is referring to the protocol at the start of the URL: http, ftp, etc.

Code:
http://somedomain.com/someresource
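
A small sketch of the difference, using the same placeholder address. The class ProtocolCheck and the method isWellFormed are hypothetical names. Constructing a java.net.URL only parses the string and does not contact the server, so this runs without network access.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical helper: names are illustrative, not from the original program.
public class ProtocolCheck {

    /** Returns true if the string parses as a URL with a protocol the JVM recognizes. */
    public static boolean isWellFormed(String spec) {
        try {
            new URL(spec); // throws MalformedURLException when the protocol is missing
            return true;
        } catch (MalformedURLException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("somedomain.com/someresource"));        // prints false
        System.out.println(isWellFormed("http://somedomain.com/someresource")); // prints true
    }
}
```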


-pete
I just can't seem to get back my IntelliSense
 