regexp 1

Status
Not open for further replies.

nix45

I need a regular expression that will match a string similar to this...

me@foo.org_you@foo.com_This is the subject_00298284384.eml

I want to match the From, To, and Subject fields. I was using split and splitting on the _, but that doesn't work when you have email addresses with _'s.

($from, $to, $sub, $crap) = split(/_/, $str, 4);

I want to use a regexp to match the first 3 fields only. This doesn't work...

if ($str =~ /^(\w+@\w+)_(\w+@\w+)_(\.*?)_\.*/)

$from = $1
$to = $2
$subject =$3


Thanks,
Chris
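
A short sketch of why the pattern above cannot match, assuming the sample file name from this post. \w covers letters, digits and the underscore but not the dot, so \w+@\w+ stops at "foo" and the following _ never lines up with ".org". Also, \.* means "zero or more literal dots", not "any characters". The variable names here are only for illustration.

```perl
use strict;
use warnings;

my $str = 'me@foo.org_you@foo.com_This is the subject_00298284384.eml';

# \w does not match '.', so "\w+@\w+" followed by "_" fails at "foo.org".
my $original = ($str =~ /^(\w+@\w+)_(\w+@\w+)_(\.*?)_\.*/) ? 1 : 0;

# An escaped dot is a literal dot; an unescaped dot is "any character".
my $literal  = ('abc' =~ /^\.+$/) ? 1 : 0;
my $wildcard = ('abc' =~ /^.+$/)  ? 1 : 0;

print "original pattern matched: $original\n";
print "escaped dot matched abc: $literal\n";
print "plain dot matched abc: $wildcard\n";
```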
 
hi

This is one way...

$_ = 'me@foo.org_you@foo.com_This is the subject_00298284384.eml';

m/(.+@.+\..+)_(.+@.+\..+)_([a-zA-Z ]+)/;

print "from: $1\n";
print "to: $2\n";
print "subject: $3\n";


regards
Duncan
 
I forgot to mention...

it works fine even if there are underscores in the email addresses

Duncan
 
Thanks, that seems to work in most cases. I have this one example (a piece of spam) that for some strange reason won't match the subject...

str2 equals 1_nobody@yahoo.com_chris@foo.org_Best 0nline CASIN0 Awards _0DE18098.eml
from = 1_nobody@yahoo.com
to = chris@foo.org
sub = Best


Why is it only matching "Best" for the subject?


Thanks,
Chris

 
Also, is there any way around a string like this...

zp2tqwewq@yahoo.com_chris@foo.org_FWD_Via.gra Xanax.x Valium.m Vicodin.n ob bxrrmcdomxja_435F24CC.eml

It matches the second field as 'chris@foo.org_FWD', and the subject as "Via". It stops at any '.' (dot).

Thanks,
Chris
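
For anyone wanting to reproduce this, here is a minimal sketch that runs Duncan's first pattern against the file name above. The greedy .+ at the end of the second group backtracks only as far as it has to, so it gives up at the underscore before "Via" (the first place where letters follow an underscore) and keeps "_FWD". The subject class [a-zA-Z ] has no dot in it, so the third group ends at "Via".

```perl
use strict;
use warnings;

my $str = 'zp2tqwewq@yahoo.com_chris@foo.org_FWD_Via.gra Xanax.x Valium.m Vicodin.n ob bxrrmcdomxja_435F24CC.eml';

# Duncan's first pattern, unchanged.
my ($from, $to, $subject) = $str =~ m/(.+@.+\..+)_(.+@.+\..+)_([a-zA-Z ]+)/;

print "from: $from\n";
print "to: $to\n";
print "subject: $subject\n";
```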
 
I got it, for the most part (the $to still matches part of the $subject sometimes).

/(.+@.+\..+)_(.+@.+\..+?)_([a-zA-Z._ ]+)/

Chris
 
The underscore is the biggest worry. Can the string be compiled with "_|_" as the separator? That would make your life a lot easier.

--Paul
 
Unfortunately not. It's not a string that I'm creating; it's the name of files in a directory.

Chris
 
Thanks for the star!

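It's because of the zeros used instead of the letter 'O' ("0nline", "CASIN0"). [a-zA-Z ]+ does not match digits, so the subject stops at the first zero.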

This sorts it out...

$_ = '1_nobody@yahoo.com_chris@foo.org_Best 0nline CASIN0 Awards _0DE18098.eml';

m/(.+@.+\..+)_(.+@.+\..+)_([^_]+)_/;

print "from: $1\n";
print "to: $2\n";
print "subject: $3\n";


Duncan
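
A side-by-side sketch of the two subject classes on the same spam file name, in case it helps to see the difference. The variable names are made up for this example. Note that both results keep the space before the final underscore.

```perl
use strict;
use warnings;

my $str = '1_nobody@yahoo.com_chris@foo.org_Best 0nline CASIN0 Awards _0DE18098.eml';

# Letters and spaces only: the subject ends at the first digit.
my (undef, undef, $with_class) =
    $str =~ m/(.+@.+\..+)_(.+@.+\..+)_([a-zA-Z ]+)/;

# Anything except an underscore: the subject runs to the next underscore.
my (undef, undef, $with_negated) =
    $str =~ m/(.+@.+\..+)_(.+@.+\..+)_([^_]+)_/;

print "letters and spaces: <$with_class>\n";
print "no underscores: <$with_negated>\n";
```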
 
I can't use the [^_] because I need to match underscores in the subject. The following works for 95% of the strings. It has trouble with strings that have special characters in the subject, such as &'-$%. Is there a way to include those characters in the third match (subject)?

/(.+@.+\..+)_(.+@.+\..+?)_(.+)_/

Also, notice I added the ? above, so it stops at the first underscore, otherwise the second match will sometimes include everything.

Thanks,
Chris
 
It does work - I tried it...

the [^_] means match any character BUT an underscore - which is necessary at that point in the regex

Duncan
 
You're right, it does work with special characters. Must be something with the CGI script that's messing those up, but that's another story. I had to remove the [^_] though because I need underscores in the subject.

Thanks a lot for your help Duncan,
Chris
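
One possible way to keep underscores in the subject without the $to field swallowing part of it, offered as a sketch only. It rests on two assumptions that hold for the three file names in this thread but are not guaranteed in general: the domain part of each address contains no underscore, and every name ends in _<id>.eml where the id itself has no underscore. parse_eml_name is a hypothetical helper name, not something from the posts above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (from, to, subject), or an empty list.
sub parse_eml_name {
    my ($name) = @_;
    # Local part: lazy, may contain underscores. Domain: no '_' or '@'.
    # Subject: greedy, so it runs up to the LAST underscore before the id.
    if ($name =~ m/^(.+?\@[^_\@]+)_(.+?\@[^_\@]+)_(.*)_[^_]+\.eml\z/) {
        return ($1, $2, $3);
    }
    return;
}

my @names = (
    'me@foo.org_you@foo.com_This is the subject_00298284384.eml',
    '1_nobody@yahoo.com_chris@foo.org_Best 0nline CASIN0 Awards _0DE18098.eml',
    'zp2tqwewq@yahoo.com_chris@foo.org_FWD_Via.gra Xanax.x Valium.m Vicodin.n ob bxrrmcdomxja_435F24CC.eml',
);

for my $name (@names) {
    my @fields = parse_eml_name($name);
    next unless @fields;
    print join(' | ', @fields), "\n";
}
```

Because the subject group is .* rather than a character class, characters such as &'-$% need no special handling. If an id ever contains an underscore, part of it would end up in the subject, so the second assumption is worth checking against the real directory.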
 