How to slow program as workload grows and grows

(OP)
I have a small program that scans Apache log files for certain "evil" patterns and then blocks those IP addresses from the whole server (not just Apache).
But as the log files grow, the program's CPU usage keeps going up.
When I clear the log files and restart Apache, the problem goes away until the files grow again.

I don't want the scanning to wait too long since many pages are produced with mod_perl and PostgreSQL values.

Should I be changing how it gets its data or doing something inside of the program?

Any suggestions?

RE: How to slow program as workload grows and grows

Hi

It would really help to know how the log files are processed, and how many of them there are.

I suppose the script always reads the entire file, processing the already-seen lines again and again. In that case, after processing all lines I would save the current position returned by tell into a file. On the next run I would read that value back, move to that position using seek, and process only the lines that arrived since the previous run.
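The tell / seek idea above can be sketched roughly as follows. This is a Python illustration rather than the OP's Perl program; the function name `read_new_lines` and the idea of keeping the offset in a separate state file are assumptions made for the example. It also assumes that a log shorter than the saved offset means the log was cleared or rotated.

```python
import os

def read_new_lines(log_path, state_path):
    """Return the complete lines appended to log_path since the previous call."""
    # Load the offset saved by the previous run; start at 0 on the first run.
    try:
        with open(state_path) as state:
            offset = int(state.read().strip() or 0)
    except (FileNotFoundError, ValueError):
        offset = 0

    # Assumption: a file shorter than the saved offset was cleared or rotated.
    if offset > os.path.getsize(log_path):
        offset = 0

    lines = []
    # Binary mode keeps tell() and seek() as plain byte offsets.
    with open(log_path, "rb") as log:
        log.seek(offset)
        for raw in iter(log.readline, b""):
            if not raw.endswith(b"\n"):
                break  # partial line still being written; pick it up next time
            lines.append(raw.decode("utf-8", errors="replace").rstrip("\n"))
            offset = log.tell()

    # Remember where this run stopped.
    with open(state_path, "w") as state:
        state.write(str(offset))
    return lines
```

Each call then costs time proportional to the new lines only, not to the size of the whole log.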

If you always have to process the entire log file, I would use the same tell / seek approach to insert only the new lines into a database, and then run the intrusion detection queries in the database.
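A minimal sketch of the database variant, using Python's built-in sqlite3 module. The table name `hits`, its two columns, and the simplification that the IP address is the first space-separated field of each line are all assumptions for illustration; a real setup would parse the log format properly and might well use PostgreSQL, which the OP already runs.

```python
import sqlite3

def store_lines(conn, lines):
    """Insert newly arrived log lines into a hypothetical 'hits' table."""
    conn.execute("CREATE TABLE IF NOT EXISTS hits (ip TEXT, request TEXT)")
    rows = []
    for line in lines:
        # Assumption: the client IP is the first space-separated field.
        ip, _, rest = line.partition(" ")
        rows.append((ip, rest))
    conn.executemany("INSERT INTO hits VALUES (?, ?)", rows)
    conn.commit()

def suspicious_ips(conn, pattern, min_hits=1):
    """Return IPs with at least min_hits requests containing pattern."""
    cur = conn.execute(
        "SELECT ip FROM hits WHERE request LIKE ? "
        "GROUP BY ip HAVING COUNT(*) >= ? ORDER BY ip",
        (f"%{pattern}%", min_hits),
    )
    return [row[0] for row in cur]
```

Only the new lines are parsed on each run, while the queries can still look at the full history.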

But of course, these are just theories. They may or may not match your task.

Feherke.
feherke.github.io

RE: How to slow program as workload grows and grows

(OP)
Just the Apache error and access logs. No need to scan anything but the latest entries as they come off.

Your idea seems like a good one, I will try it.
I thought I might get a better answer by asking here :)

RE: How to slow program as workload grows and grows

Hi

Quote (MrCBofBCinTX)

Just the Apache error and access logs.

Obviously my question was too brief. By "how many" I was thinking of rotated logs.

Quote (MrCBofBCinTX)

No need to scan anything but the latest entries as they come off.

Another way to achieve that is to use two sets of log files:
  • the original one in common log format, which you keep untouched
  • another one formatted for easier parsing (*), which you process and then truncate to empty after each pass
(*) Why log the date as "[27/Jul/2013:20:53:00 +0000]" when "1374958380" is faster to parse?
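To make the footnote concrete, here is a small Python sketch comparing the two timestamp formats. The function names are made up for the example. The bracketed form needs a full date parse (month name, time zone offset), while the epoch form is a single integer conversion.

```python
from datetime import datetime

def apache_time_to_epoch(stamp):
    """Parse a common-log-format timestamp such as '[27/Jul/2013:20:53:00 +0000]'."""
    parsed = datetime.strptime(stamp.strip("[]"), "%d/%b/%Y:%H:%M:%S %z")
    return int(parsed.timestamp())

def epoch_field_to_epoch(field):
    """Parse a timestamp that was already logged as seconds since the epoch."""
    return int(field)

# Both calls give the same integer for the same moment in time.
print(apache_time_to_epoch("[27/Jul/2013:20:53:00 +0000]"))  # 1374958380
print(epoch_field_to_epoch("1374958380"))                    # 1374958380
```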

Feherke.
feherke.github.io

RE: How to slow program as workload grows and grows

(OP)
I decided to use File::Tail.
It has several different parameters, and it lets me keep a debugging mode I have for testing new patterns by reading in the whole file if I want to.
(And for reversing the blocked IPs if I screw something up.)
Luckily I never make mistakes :(
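File::Tail is a Perl module, and its parameters are not reproduced here. The following is only a Python sketch of the same general idea: poll the file for appended lines, with a hypothetical `from_start` switch that mimics the "read the whole file" debugging mode described above. The names `follow`, `from_start`, `poll_interval`, and `max_idle_polls` are assumptions for the example, and the sketch does not handle log rotation or truncation.

```python
import os
import time

def follow(path, from_start=False, poll_interval=1.0, max_idle_polls=None):
    """Yield complete lines as they are appended to path.

    from_start=True replays the whole file first (debugging mode).
    max_idle_polls=None follows forever; a number stops after that many
    consecutive polls that found no new complete line.
    """
    with open(path, "rb") as log:
        if not from_start:
            log.seek(0, os.SEEK_END)  # skip everything already in the file
        idle_polls = 0
        while max_idle_polls is None or idle_polls < max_idle_polls:
            position = log.tell()
            raw = log.readline()
            if raw.endswith(b"\n"):
                idle_polls = 0
                yield raw.decode("utf-8", errors="replace").rstrip("\n")
            else:
                # Nothing new, or only a partial line: rewind and wait.
                log.seek(position)
                idle_polls += 1
                time.sleep(poll_interval)
```

For example, `for line in follow("/var/log/apache2/access.log"): ...` would process only new entries (the path is illustrative), while `from_start=True` would replay the existing file before following it.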

I like to restart the server fairly often and look through the error log to see if any new bad bots are showing up that should be added to the special friends list.
