I am working on a research project that involves tracking and analyzing how a research panel interacts on the web. As I am pretty unfamiliar with browser technology, I was hoping someone could point me in the right direction.
1) What language(s) are Alexa's and Compete's toolbar development...
I manage 8 different websites. Some are tagged with Google Analytics, some Omniture, some WebTrends, some DoubleClick Dart Spotlight, and some have multiple tags. I would like to centralize the configuration and management of all tags on a central server.
Thus, when a visitor lands on one of...
Does anyone have an idea why the timeout is not working on some domain names that I ping? Certain inactive domain names take 20 seconds to time out, even though I have the timeout set to 1 or 2 seconds (I tried setting it 2 different ways).
Is it the number of Pings it is sending...
I am using the following methods to collect IP and server data for a large number (500k+) of URLs, which is then inserted into a MySQL database.
I need:
- IP
- DNS Server Name
- Server Info (Type and OS)
Is there a more efficient method to collect this data? Is it redundant to use 3...
I want to use HTTP::Headers to pull down server-side data (see bottom of page) for the sites I am contacting with HTTP::Request GET.
My 'Request' variables are returning the right values, but my 'Headers' variables are null. I am pretty sure it's something simple in my syntax, but can't...
Is it possible to get server-side variables for URLs that I am spidering? I am using Net::Ping to get IP addresses, but I would also like to get other environmental data such as server details, server type (Apache, etc.), server operating system, and so on.
Can I use Net::Ping or 'Get' to...
I was wondering if anyone has any suggestions on extracting contact information from web sites that I spider? There is no set format for the HTML content, so it has to be a flexible search.
Right now, all I can think of is having the spider look for "Phone", "Phone Number", "Telephone", etc...
I have developed a spider to collect information about a predetermined set of URLs. One thing that I would like to add is the ability to record an IP address.
Does anyone know the code (or a site) that will allow me to get IPs while spidering homepage content?
Thanks
My program was just transferred from a shared server to a dedicated one. The default timeout on the dedicated server is 3 minutes, so when my spider comes across a URL that is not active or doesn't return code, it tries for exactly 3 minutes, when I need it to be 30 seconds.
Here's the code...
Ok, so the getstore function of my Perl spider is not working when a web site is set for 'no-cache' status. Basically it's coming up with nothing.
Does anyone know how to get around this? How do I get the source code into an array?
I've written a Perl spider to do some analysis work on the web. The program I've written successfully pulls the source code for 95% of the 10k+ URLs, but there are a select few that I can't get the 'getstore' function to work on.
These web sites aren't anything special at all, so I am not sure why it...
This is a simple issue, but I can't seem to figure out what I am doing wrong. I built an administration page that lets my client add new Loan Officer contact information. The admin program automatically generates an updated 'contact_us.html' page.
My problem is that I can only get the program...
I am writing data to a file using the following:
open (DATA, ">>../cgi-bin/data/marketinfo.dat");
print DATA "$variable1\~$variable2 \n";
close (DATA);
It worked just fine on my server, but when I uploaded it to my client's server the \n doesn't work.
Rather than starting a new line for...
A variable in my program is an AE Code ($aenum) that contains both numbers and letters, which are assigned via a .dat file. Example $aenum values are c15, c16, c17.
However, I can't get my if statement to work and raise an error when a newly input AE code is the same as an existing one.
If ($aenum eq $aenum2) {...
I am attempting to create a spider to do some data mining on the internet. I want my .pl program to open a page on the web, put the text into an array, and write it to a text file.
Using the code below, I can't seem to get any data into the array, but if I change $url to...