Regular Expressions

(OP)
I'm trying to parse the apache.log file and have a working function that does so, but I know little about regex and could not get the pattern to parse each line directly. I reviewed and tried just about every posting I could find here and on other sites, but nothing worked. Please see $pattern below: can anyone tell me how to parse each section without the ugly workarounds? This is for the local development copy of the log, which is truncated (it lacks the last two columns, referer and user agent), but I would like to use the same code to parse the live NCSA combined log format as well. Any help is appreciated.

CODE

function ParseLocalToScreen($path) {
	global $output;
	// Parses the local Windows Apache development Log Format lines:
	// REF RAW LOG: 127.0.0.1 - - [27/Apr/2014:15:00:24 -0700] "GET / HTTP/1.1" 200 3051
	// REF PARSED OUTPUT: 127.0.0.1, -, -, 2014-04-27 15:00:24, GET, /, HTTP/1.1, 200, 3051
	$pattern = '/^(\S+) '; // Remote Host (one space only; "\s " demanded two and never matched)
	$pattern .= '([^\s]+) '; // Log Name
	$pattern .= '([^\s]+) '; // User
	$pattern .= '\[(\d+)\/(\w+)\/(\d+):(\d{1,2}:\d{1,2}:\d{1,2} '; // Datetime WORKS SO-SO
	$pattern .= '?[\+\-]?\d*)\] "(.*)/'; // Remainder
	
	if (is_readable($path)) :
		// $php_errormsg no longer exists in PHP 8, so use a fixed message
		$fh = fopen($path,'r') or die("Cannot open log file!");
		while (($s = fgets($fh)) !== false) : // fgets() returns false at end of file
			if (preg_match($pattern,$s,$matches)) :
				list($whole_match, $remote_host, $logname, $user, $day, $month, $year, $time, $remainder) = $matches;
				$month = date('m', strtotime($month)); // Converts short month to numeric
				$time = trim(substr($time,0,-6)); // Removes the UTC offset, e.g. " -0700"
				$replacements = array(' ', '"');
				$remainder = str_replace($replacements, ', ', $remainder); // Replaces spaces and quotes with separators
				// REGEX NOT WORKING: remove extra field, build datetime and other output for MySQL
				// Built inside the if so that unmatched lines do not reuse stale values
				$output .= str_replace(", , ", ", ", "$remote_host, $logname, $user, $year-$month-$day $time, $remainder<br>\n");
			endif;
		endwhile;
		fclose($fh);
		echo $output;
	else : 
		echo "Cannot access log file!";
	endif;
} 

RE: Regular Expressions

Hi

Sorry, no time for detailed analysis now, just a quick question: do you know that strtotime() is able to parse that date format?

CODE --> php -a

Interactive mode enabled

php > var_dump(date('c', strtotime('27/Apr/2014:15:00:24 -0700')));
string(25) "2014-04-28T01:00:24+03:00" 
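
Note that date() above reports the time in the server's own time zone, hence the +03:00 and the change of day. If the goal is a MySQL-style value that keeps the clock time written in the log, a minimal sketch using DateTime::createFromFormat() could look like the following. The function name clfToMysql is made up for illustration, and the sketch assumes PHP 7.1 or later for the type declarations.

```php
<?php
// Hypothetical helper (not from the thread): converts an Apache log
// timestamp to a MySQL DATETIME string without shifting the time zone.
function clfToMysql(string $clf): ?string
{
    // Format 'd/M/Y:H:i:s O' corresponds to e.g. 27/Apr/2014:15:00:24 -0700
    $dt = DateTime::createFromFormat('d/M/Y:H:i:s O', $clf);

    // createFromFormat() returns false when the input does not fit the format
    return $dt === false ? null : $dt->format('Y-m-d H:i:s');
}

echo clfToMysql('27/Apr/2014:15:00:24 -0700'), "\n"; // 2014-04-27 15:00:24
```

Because the parsed offset stays attached to the DateTime object, format() prints the original local time instead of converting it.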

Feherke.
feherke.ga

RE: Regular Expressions

(OP)
Yes, thank you, and no real rush. I did know that, but I was expecting the regex to provide the date as a single variable through list(); instead it gives it in bits and pieces, which is the main issue here. When I first wrote the function, it used a pattern something like:

CODE --> PHP

$pattern = '/^([^ ]+) ([^ ]+) ([^ ]+) (\[[^\]]+\]) "(.*) (.*) (.*)" ([0-9\-]+) ([0-9\-]+) "(.*)" "(.*)"$/'; 

and to assign variables, it was using:

CODE --> PHP

list($whole_match, $remote_host, $logname, $user, $date_time, $method, $request, $protocol, $status, $bytes, $referer, $user_agent) = $matches; 

... where $date_time provided what was needed for formatting. Somehow I broke it, but I finally realized that each part of the date and time is now captured into its own variable. To get it working I simply put the parts back together in the needed order, but it's inelegant and ugly that way!
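
One way to get the timestamp back as a single variable is to capture everything between the square brackets in one group and to make the last two quoted fields optional, so that the same pattern covers both the truncated local log and the combined format. The sketch below is an illustration rather than a tested drop-in: the function name parseLogLine and the group names are made up here, it assumes PHP 7.1 or later, and it assumes the quoted fields contain no embedded double quotes.

```php
<?php
// Hypothetical helper (not from the thread): parses one access-log line
// in either the common or the combined format.
function parseLogLine(string $line): ?array
{
    $pattern = '/^(?<host>\S+) (?<logname>\S+) (?<user>\S+) '
        . '\[(?<datetime>[^\]]+)\] '                            // whole timestamp, one group
        . '"(?<method>\S+) (?<request>\S+) (?<protocol>[^"]+)" '
        . '(?<status>\d{3}|-) (?<bytes>\d+|-)'
        . '(?: "(?<referer>[^"]*)" "(?<agent>[^"]*)")?\s*$/';   // optional combined fields

    if (!preg_match($pattern, $line, $m)) {
        return null; // line does not fit the assumed layout
    }

    // Keep only the named groups, dropping the numeric duplicates
    $fields = array_filter($m, 'is_string', ARRAY_FILTER_USE_KEY);

    // Default the combined-only fields when they are absent
    return $fields + ['referer' => '', 'agent' => ''];
}

$fields = parseLogLine('127.0.0.1 - - [27/Apr/2014:15:00:24 -0700] "GET / HTTP/1.1" 200 3051');
echo $fields['datetime'], "\n"; // 27/Apr/2014:15:00:24 -0700
```

Named groups replace the long list() assignment, so adding or removing a field does not shift the positions of the others. Lines whose request field is not a three-part "method request protocol" string return null under this pattern and would need separate handling.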
