
data analyser program

Status
Not open for further replies.

spookie

Programmer
May 30, 2001
655
IN
Hello,

I want to write a data-analyser script in PHP, something like Webstat.
I think I have to read the log file and store it in a database (MySQL in my case), but I'm not sure where to start.

Any thoughts are welcome.
Thanks in advance.

spookie

--------------------------------------------------------------------------
I never set a goal because you never know what's going to happen tomorrow.
 
I did a similar thing with the logs from squid in our office.

Read the file and process it line by line, adding a new row to your MySQL table for each line. Then empty the file so that you don't duplicate the data.

Then, with the mighty power of SQL, you can do whatever you want with the data!

--BB
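
A minimal sketch of the read-and-insert step, in Python for illustration (the thread itself targets PHP and MySQL). It assumes the log is in Apache Common Log Format, which the thread never confirms, and it uses sqlite3 as a stand-in for MySQL so that it runs without a server. The names `parse_line`, `load_lines` and the `hits` table are hypothetical.

```python
import re
import sqlite3

# Assumed layout: Apache Common Log Format. Adjust the pattern to the real log.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    m = LOG_LINE.match(line)
    return m.groupdict() if m else None


def load_lines(conn, lines):
    """Insert one row per parseable log line; return the number inserted."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hits "
        "(ip TEXT, ts TEXT, method TEXT, path TEXT, status INTEGER, size TEXT)"
    )
    # Lines that do not match the assumed format are skipped silently.
    rows = [p for p in map(parse_line, lines) if p is not None]
    conn.executemany(
        "INSERT INTO hits VALUES (:ip, :ts, :method, :path, :status, :size)",
        rows,
    )
    conn.commit()
    return len(rows)
```

Typical use would be `load_lines(conn, open(log_path))` against a real connection. Batching the rows into a single `executemany` call and one commit keeps the cost of the insert step low, which matters if the code runs inside a page request.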
 
Thanks for the reply.

The problem is that I don't have write permission on the log file, and that cannot be changed. :(
So I have to make sure that each time the script reads the file, it picks up only the records added since the last run.
There is also no cron facility. :( The code will be embedded in a frequently accessed page, so I am concerned about how long it takes to execute: it must not slow the page down noticeably, even though it will run only once a day.
How can I efficiently filter the log records so that I only get those from the previous day onwards? Any other suggestions?

thanks in advance

spookie
 
If each line of the logfile has a timestamp that includes seconds, you could save a copy of the last logfile line you processed. On the next run, you could then search through the file programmatically, ignoring all input until you hit that line again.
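
The "remember the last line" idea can be sketched as follows, in Python for illustration. The function names and the marker file are hypothetical. Treating every line as new when the saved line is no longer found (for example after the log has been rotated) is an assumption of this sketch, not something stated in the thread.

```python
import os


def new_lines(lines, last_seen):
    """Return the lines that follow `last_seen` in `lines`.

    If `last_seen` is None or cannot be found, all lines are returned.
    """
    lines = list(lines)
    if last_seen is None:
        return lines
    try:
        start = lines.index(last_seen) + 1  # first match wins
    except ValueError:
        start = 0  # saved line is gone: treat everything as new (assumption)
    return lines[start:]


def read_new(log_path, marker_path):
    """Read the log (read-only) and return only lines not seen on earlier runs.

    The last processed line is kept in a separate marker file, so the log
    itself never has to be written to.
    """
    last_seen = None
    if os.path.exists(marker_path):
        with open(marker_path, encoding="utf-8") as f:
            last_seen = f.read()
    with open(log_path, encoding="utf-8") as f:
        lines = f.read().splitlines()
    fresh = new_lines(lines, last_seen)
    if lines:
        with open(marker_path, "w", encoding="utf-8") as f:
            f.write(lines[-1])
    return fresh
```

Note that this reads the whole file on every run and that two byte-identical lines would confuse the match, which is why the timestamp with seconds matters. Storing a byte offset instead of the line text would avoid rescanning, at the cost of handling rotation separately.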

You haven't specified what is generating the logfile, but is it possible for that app to start a new file on a regular basis? For example, most Linux distributions come with an app called logrotate, which copies a logfile to a new name at a configurable interval. It can then restart the app, causing it to create a new logfile.

Want the best answers? Ask the best questions: TANSTAAFL!
 
Thanks sleipnir214!
I am storing the log in MySQL. On each run I read a text file that holds the timestamp of the last inserted record, and the application resumes from the line after that one.

Now I want to use this data to show stats such as hits per page and hits per country. (It should be as close as possible to a standard log-analyser package.)

Any suggestions on how to obtain the hits per country?
How should I go about this?

Any help or link will be greatly appreciated.

Thanks in advance

spookie
 
Hits by country are difficult.

I can think of three possible ways to get that information.

One would be to perform a reverse DNS lookup on the IP address to get a domain name. From there you could perform a whois query to find out the country in which that domain name resides. Parsing the data could get tricky as not all whois servers return data in the same format.

Another would be to perform a whois query against the ARIN whois server to find out in what country that IP is used. You might have to follow a referral to RIPE or another internet numbering authority.

A third would be to find a prebuilt database that matches IP addresses to countries and load it into your own database for lookups.
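
For the third option, a lookup against an IP-range table might look like the sketch below, in Python for illustration. The sample rows are hypothetical: the addresses are reserved documentation ranges and the country codes are placeholders, since no actual database is named in this thread. The layout (first address, last address, country code) is likewise an assumption about how such a database would be organised.

```python
import bisect
import ipaddress

# Hypothetical sample rows: (first_ip, last_ip, country_code).
# Placeholder data only; a real table would come from a prebuilt database.
SAMPLE_RANGES = [
    ("192.0.2.0", "192.0.2.255", "XA"),
    ("198.51.100.0", "198.51.100.255", "XB"),
    ("203.0.113.0", "203.0.113.255", "XC"),
]


def build_index(ranges):
    """Convert dotted addresses to integers and sort by range start."""
    rows = sorted(
        (int(ipaddress.IPv4Address(lo)), int(ipaddress.IPv4Address(hi)), cc)
        for lo, hi, cc in ranges
    )
    starts = [r[0] for r in rows]
    return starts, rows


def country_for(ip, index):
    """Return the country code for `ip`, or None if no range contains it."""
    starts, rows = index
    n = int(ipaddress.IPv4Address(ip))
    # Find the last range whose start is <= n, then check its upper bound.
    i = bisect.bisect_right(starts, n) - 1
    if i >= 0 and rows[i][0] <= n <= rows[i][1]:
        return rows[i][2]
    return None
```

Inside MySQL the same idea would be a table with integer start and end columns and a range comparison on the converted address (MySQL's `INET_ATON()` does the conversion); the exact schema would depend on the database you find.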

If you use either of the first two, remember to dump any results you get into a table, so that you won't have to look up that address again.
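
The caching advice could be sketched like this, in Python for illustration, with sqlite3 standing in for MySQL. `make_cached_lookup`, the `ip_country` table and the `lookup` callable (whatever slow DNS or whois routine you end up writing) are all hypothetical names.

```python
import sqlite3


def make_cached_lookup(conn, lookup):
    """Wrap a slow `lookup(ip)` function with a database-backed cache."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ip_country "
        "(ip TEXT PRIMARY KEY, country TEXT)"
    )

    def cached(ip):
        row = conn.execute(
            "SELECT country FROM ip_country WHERE ip = ?", (ip,)
        ).fetchone()
        if row is not None:
            return row[0]  # already looked up on an earlier hit
        country = lookup(ip)  # the slow DNS/whois step runs only once per IP
        conn.execute(
            "INSERT INTO ip_country (ip, country) VALUES (?, ?)",
            (ip, country),
        )
        conn.commit()
        return country

    return cached
```

Failed lookups (a `None` result) are stored too, so an address that cannot be resolved is not retried on every hit. Whether that is desirable is a design choice; you may prefer to expire such rows after a while.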
 
Thanks again!!

If I choose the third option, where could I find a prebuilt database? (I assume it's the simplest of the options.)

Is there any tutorial on this?

I also want to check something: when a particular page is requested, the log gets a number of related entries, including requests for every image on that page. So all of those entries are counted individually.
Is that right?

TIA

spookie
 
I don't know of any prebuilt databases -- but then, I've never looked. Nor do I know of any tutorials.

Whether you count each graphic retrieval as a separate hit is up to you. I don't -- in fact, on my PHP-based e-commerce sites, I don't even log graphic retrievals.
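
If you decide not to count graphics, a per-page hit count that skips image requests might look like this sketch, in Python for illustration with sqlite3 standing in for MySQL. The `hits` table with a `path` column and the list of image suffixes are assumptions; adapt both to your own schema and site.

```python
import sqlite3

# Assumed list of image suffixes; extend as needed.
IMAGE_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png")


def page_hits(conn):
    """Return (path, total) rows, most requested first, skipping images."""
    where = " AND ".join("lower(path) NOT LIKE ?" for _ in IMAGE_SUFFIXES)
    params = ["%" + suffix for suffix in IMAGE_SUFFIXES]
    sql = (
        "SELECT path, COUNT(*) AS total FROM hits "
        f"WHERE {where} GROUP BY path ORDER BY total DESC, path"
    )
    return conn.execute(sql, params).fetchall()
```

The suffix test only looks at the end of the stored path, so an image requested with a query string would still be counted. Filtering the images out before they are inserted would keep the table smaller.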
 
OK, thanks sleip!

spookie
 