Get unique info from a file

Status
Not open for further replies.

4x4uk

Technical User
Apr 30, 2002
381
GB
I am in the process of creating a logging script which stores info in a flat text file in the following format:

Date|IP|BrowserInfo|

I can count all the records in the file and output the info in readable form with the following script

Code:
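<?php
$total = 0;
// file() takes flags, not a mode string: drop newlines and skip blank lines.
$fcontents = file("logfile.txt", FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
// foreach replaces each(), which was removed in PHP 8.
foreach ($fcontents as $line) {
    $total = $total + 1;
    // Records look like Date|IP|BrowserInfo| (the trailing "|" adds an empty 4th field).
    // Variable names are case-sensitive, so $string3 must match the name echoed below.
    list($string1, $string2, $string3) = explode("|", $line);
    echo
    "<b>Date/Time: </b>", $string1, "<br>",
    "<b>Remote Address: </b>", $string2, "<br>",
    "<b>Browser/OS: </b>", $string3, "<p>";
}
echo "<p><b>Total Visitors = ", $total, "</b><br>\n";
?>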

I was wondering how to count only the unique values, using the IP address for example. I would also like to count browser types.

Anyone got any ideas?
Thanks
Ian
It's not a lie if you believe it!
 
Populate arrays, one for IP addresses, one for browsers.

For each entry, search your IP array for that address (array_search() might be useful). If the address is not there, push it onto the end of the array. Do the same for browsers.

Keep in mind that array_search() scans the array from the start, so every address you add to the list makes each later search take longer. Given a sufficient number of addresses, your script will take longer to run than your PHP settings (max_execution_time) will allow.
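
The search-and-push idea above could be sketched roughly like this. It is only a minimal sketch: the function name collect_unique_by_search() and the sample log lines are made up for illustration, and the record layout is assumed to be the Date|IP|BrowserInfo| format from the original post.

```php
<?php
// Hypothetical helper (not from the thread): collects the unique IPs and
// browser strings from log lines in the Date|IP|BrowserInfo| format.
function collect_unique_by_search(array $lines): array
{
    $ips = [];
    $browsers = [];
    foreach ($lines as $line) {
        $fields = explode("|", trim($line));
        if (count($fields) < 3) {
            continue; // skip blank or malformed lines
        }
        list(, $ip, $browser) = $fields; // the date field is not needed here
        // array_search() returns the matching key, or false if nothing matched.
        // Compare with === because a match at key 0 is also "falsy".
        if (array_search($ip, $ips, true) === false) {
            $ips[] = $ip;
        }
        if (array_search($browser, $browsers, true) === false) {
            $browsers[] = $browser;
        }
    }
    return ['ips' => $ips, 'browsers' => $browsers];
}

// Made-up sample data for illustration.
$sample = [
    "2002-04-30 10:00|1.2.3.4|Mozilla/4.0 (MSIE 6.0)|",
    "2002-04-30 10:05|5.6.7.8|Opera/6.0|",
    "2002-04-30 10:09|1.2.3.4|Mozilla/4.0 (MSIE 6.0)|",
];
$unique = collect_unique_by_search($sample);
echo "Unique IPs: ", count($unique['ips']), "\n";
echo "Unique browsers: ", count($unique['browsers']), "\n";
```

Each lookup walks the whole list, so the total work grows roughly with the square of the number of distinct addresses, which is the slowdown described above.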

One way to speed things up, if you have a large number of addresses, is to create two arrays for the IP addresses: a 4-dimensional lookup array (a 4d array) and a single-dimensional list. The 4d array uses each octet of the IP address as one axis, so to store a record for 1.2.3.4 you would set $ipsearch[1][2][3][4] = 1 and $iplist[] = "1.2.3.4". To check for a repeat of a particular address, you can then test each octet in turn, which makes it faster to determine that an address has not already been added: if the first octet of your address does not appear in the first axis of the lookup array, you know you've never seen the address before.

Once you've finished the list, you can then pull your unique addresses out of $iplist.
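
As a rough illustration of the two-array idea, here is a minimal sketch. The function name remember_ip() and the sample addresses are made up; $ipsearch and $iplist are the names used above.

```php
<?php
// Hypothetical helper (not from the thread): records an IPv4 address in a
// nested 4-level lookup array and in a flat list.
// Returns true if the address was new, false if it was already recorded.
function remember_ip(string $ip, array &$ipsearch, array &$iplist): bool
{
    $octets = explode(".", $ip);
    if (count($octets) !== 4) {
        return false; // not a dotted-quad address; ignored in this sketch
    }
    list($a, $b, $c, $d) = $octets;
    // isset() on a nested index is false if any level is missing,
    // so an unseen first octet is enough to know the address is new.
    if (isset($ipsearch[$a][$b][$c][$d])) {
        return false;
    }
    $ipsearch[$a][$b][$c][$d] = 1; // missing intermediate levels are created on demand
    $iplist[] = $ip;
    return true;
}

// Made-up sample addresses for illustration.
$ipsearch = [];
$iplist = [];
foreach (["1.2.3.4", "1.2.3.5", "1.2.3.4", "10.0.0.1"] as $ip) {
    remember_ip($ip, $ipsearch, $iplist);
}
echo "Unique addresses: ", count($iplist), "\n";
```

PHP arrays are hash tables, so a single flat array keyed by the whole address string (for example $seen[$ip] = 1, tested with isset($seen[$ip])) should give a similar speed-up with less code. That is a different technique from the 4d array described above, offered only as an alternative.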


Another way might be to make use of a database server, such as MySQL or PostgreSQL. Database servers specialize in searches, so if your list is large, you could get a speed boost.

Want the best answers? Ask the best questions: TANSTAAFL!
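
To give a feel for the database route, here is a minimal sketch using PDO. Everything named here is an assumption for illustration: the table "visits", its columns "visited", "ip" and "browser", and both function names. The demo uses an in-memory SQLite database as a stand-in so it runs without a server; for MySQL or PostgreSQL only the PDO connection string would differ.

```php
<?php
// Hypothetical helper (not from the thread). Assumes a table named "visits"
// with "ip" and "browser" columns; both names are made up for this sketch.
function count_distinct_visits(PDO $db): array
{
    // COUNT(DISTINCT ...) lets the database do the de-duplication.
    $row = $db->query(
        "SELECT COUNT(*) AS total, COUNT(DISTINCT ip) AS ips, "
        . "COUNT(DISTINCT browser) AS browsers FROM visits"
    )->fetch(PDO::FETCH_ASSOC);
    return array_map('intval', $row);
}

// Hypothetical helper: hits per browser string, most common first.
function count_by_browser(PDO $db): array
{
    $stmt = $db->query(
        "SELECT browser, COUNT(*) AS hits FROM visits "
        . "GROUP BY browser ORDER BY hits DESC, browser"
    );
    return $stmt->fetchAll(PDO::FETCH_KEY_PAIR); // [browser => hits]
}

// Demo against in-memory SQLite (a stand-in for a real server), run only
// if the pdo_sqlite extension is available.
if (extension_loaded('pdo_sqlite')) {
    $db = new PDO('sqlite::memory:');
    $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $db->exec("CREATE TABLE visits (visited TEXT, ip TEXT, browser TEXT)");
    $insert = $db->prepare("INSERT INTO visits (visited, ip, browser) VALUES (?, ?, ?)");
    // Made-up sample rows for illustration.
    foreach ([
        ["2002-04-30 10:00", "1.2.3.4", "MSIE 6.0"],
        ["2002-04-30 10:05", "5.6.7.8", "Opera 6.0"],
        ["2002-04-30 10:09", "1.2.3.4", "MSIE 6.0"],
    ] as $record) {
        $insert->execute($record);
    }
    print_r(count_distinct_visits($db));
    print_r(count_by_browser($db));
}
```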
 
array_search() is great, and by far the most robust solution, but if you simply want a yes or no, in_array() is the easiest to toss into an if statement:

if (!in_array($ip_found, $ip_array)) {
$ip_array[] = $ip_found;
}
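
One practical difference between the two functions, shown as a small sketch with made-up sample addresses: array_search() returns the key of the match, and the first key is 0, so its result has to be compared against false with !== rather than used directly as a condition.

```php
<?php
// Made-up sample list for illustration.
$ip_array = ["1.2.3.4", "5.6.7.8"];

// array_search() returns the key of the match. Here that key is 0,
// which a loose truth test would treat the same as "not found".
$key = array_search("1.2.3.4", $ip_array);
var_dump($key);            // int(0)
var_dump($key !== false);  // bool(true): the reliable "found" test

// in_array() returns a plain boolean, so it drops straight into an if.
// The third argument requests strict (===) comparison.
$ip_found = "9.9.9.9";
if (!in_array($ip_found, $ip_array, true)) {
    $ip_array[] = $ip_found;
}
echo count($ip_array), "\n"; // 3
```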

-Rob
 
Thanks for the info.
I had thought about MySQL, but I have never used it directly myself, which is why I went for the text file in the first place. I'm all for making life easier in the long run, though, so I guess it's back to the books and the manual again to learn how to use MySQL.

Regards
Ian
It's not a lie if you believe it!

 
Another approach would be to read your file daily and strip out the day's records to a new file, ccyymmdd.txt, which need contain only the IP addresses and the browser info. This file could then be read line by line into an associative array and sorted and processed, first by IP address and then by browser details, using the ksort() and/or asort() functions with the SORT_STRING flag.

This approach would stop your main file from growing and would provide a back-up of each day's results.
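
A minimal sketch of the tally-and-sort step might look like this. The function name tally_daily(), the IP|BrowserInfo| layout of the daily file and the sample lines are all assumptions for illustration; the sketch works on an array of lines so it does not depend on any file being present.

```php
<?php
// Hypothetical helper (not from the thread): tallies hits per IP and per
// browser from daily records in an assumed IP|BrowserInfo| layout.
function tally_daily(array $lines): array
{
    $byIp = [];
    $byBrowser = [];
    foreach ($lines as $line) {
        $fields = explode("|", trim($line));
        if (count($fields) < 2) {
            continue; // skip blank or malformed lines
        }
        list($ip, $browser) = $fields;
        // Associative arrays keyed by the value make duplicates collapse.
        $byIp[$ip] = ($byIp[$ip] ?? 0) + 1;
        $byBrowser[$browser] = ($byBrowser[$browser] ?? 0) + 1;
    }
    // ksort() orders by key; SORT_STRING compares the keys as strings.
    ksort($byIp, SORT_STRING);
    ksort($byBrowser, SORT_STRING);
    return ['ips' => $byIp, 'browsers' => $byBrowser];
}

// date("Ymd") gives the ccyymmdd part of the daily file name.
$dailyFile = date("Ymd") . ".txt";

// Made-up sample lines standing in for the contents of $dailyFile.
$tally = tally_daily([
    "5.6.7.8|Opera/6.0|",
    "1.2.3.4|Mozilla/4.0 (MSIE 6.0)|",
    "5.6.7.8|Opera/6.0|",
]);
echo $dailyFile, ": ", count($tally['ips']), " unique IPs, ",
    count($tally['browsers']), " browser types\n";
```

Calling asort() on either tally instead would order it by hit count rather than by key.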

Clive
 