Extracting emails using a regex

(OP)
Hello:

I have found out how to extract emails with some code from the Perl Cookbook, but I cannot get it to work.

What I would like to know is how to find all of the email addresses in a file and replace each one with a link like this:


<a href="mailto:email@emailaddress.com">email@emailaddress.com</a>


I know that this would be a pretty complicated regex, but any help is appreciated.

Thanks,

-Vic

vic cherubini
malice365@hotmail.com
www.epicsoftware.com
====
Knows: Perl, HTML, JavaScript, C/C++, PHP, Flash, Director
====

RE: Extracting emails using a regex

s/(\b.+?\@.+?\..+?\b)/<a href="$1">$1<\/a>/g;

will get the general format of email addresses.

adam@aauser.com

RE: Extracting emails using a regex

You forgot the mailto: ...

s/(\b.+?\@.+?\..+?\b)/<a href="mailto:$1">$1<\/a>/g;


Sincerely,
 
Tom Anderson
CEO, Order amid Chaos, Inc.
http://www.oac-design.com

RE: Extracting emails using a regex

(OP)
Yeah I saw that, and added it.

Thanks for all the help though.

-Vic

RE: Extracting emails using a regex

(OP)
One last question if you don't mind.

I modified the expression to say:


s/(\@.*?\..+?\b)/<a href="mailto:$1">$1<\/a>/g;


I don't think that I made myself clear in my first post.

Say I have a string that says:


$data = "If you want to email me, please do so at me\@myaddress.com";


What I want to do is extract me@myaddress.com and place it in <a href="mailto:me@myaddress.com">me@myaddress.com</a>.

With the modified expression, I am getting only the @myaddress.com part, and not the me part. My logic would be to find the @ sign and trace backwards until you find a space, then drop the space and keep all of the characters found before it. Is this at all possible or am I making it too hard?
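
The "trace back to the previous space" idea does not need any explicit backwards scan: a run of non-space characters on each side of the @ expresses the same thing. The following is only a minimal sketch, assuming addresses never contain whitespace. The `linkify` helper name is made up for illustration, and punctuation that directly follows an address (a closing period, say) would be swept into the match.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap anything shaped like nonspace@nonspace.nonspace
# in a mailto link. \S+ cannot cross whitespace, which stands in for
# "trace backwards from the @ until you hit a space".
sub linkify {
    my ($text) = @_;
    $text =~ s{(\S+\@\S+\.\S+)}{<a href="mailto:$1">$1</a>}g;
    return $text;
}

my $data = 'If you want to email me, please do so at me@myaddress.com';
print linkify($data), "\n";
```

Because the string is single-quoted, the @ does not need a backslash here; inside a double-quoted string it would.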

Thanks for all the help, though.


-Vic

RE: Extracting emails using a regex

    Look at the difference between their code and yours (theirs first):

s/(\b.+?\@.+?\..+?\b)/<a href="mailto:$1">$1<\/a>/g;
     s/(\@.*?\..+?\b)/<a href="mailto:$1">$1<\/a>/g;

 You made two changes.  One is you changed the first "+" after the @ into a "*".  That doesn't really matter.  Second, you left out the initial "\b.+?", which is the part that will match the "me" of "me@myaddress.com".


"If you think you're too small to make a difference, try spending a night in a closed tent with a mosquito."

RE: Extracting emails using a regex

(OP)
Stillflame:

I tried that at first, and if I have a string that says:


$string = "This is a string with an email in it me\@myaddress.com";
$string=~ s/(\b.+?\@.+?\..+?\b)/<a href="mailto:$1">$1<\/a>/g;
print "$string\n";


Perl will print:


<a href="mailto:This is a string with an email in it me@myaddress.com">This is a string with an email in it me@myaddress.com</a>


I see the logic of the regex: the first \b matches the first word boundary (the me part of the address), the \@ matches the @ in the email, and the last \b matches the last word boundary (the com part of the email address).

Could there possibly be something wrong with my version of Perl? I am running ActiveState's version on Windows ME. Have there been any reported problems with the regex engine in ActiveState's build?

I was sitting at my computer till 12 last night with Programming Perl in my lap and The Perl Cookbook on my desk and I couldn't, for the life of me, get the regex to work. It looks so logical, and seems that it would work, but doesn't.

Thank you for your time and dedication.


-Vic

P.S. And I have spent the night in a closed tent with a mosquito. I am in Boy Scouts. =)  

RE: Extracting emails using a regex

The regex should work... I haven't had any problems with ActiveState, but I haven't run it on ME.

adam@aauser.com

RE: Extracting emails using a regex

I've played with a lot of regex stuff with ActiveState's ports and on several UNIX platforms.  The regex engine is rock solid.  It is one of Perl's long demonstrated strengths.  .....Humans trying to figure out how to use regexes..... now that is another thing (including me) ;^)

Almost invariably, when I can't get them to work, it is some simple assumption that I have made without realizing that I've made an assumption.

I'm sure it is doing what you are asking it to do.


 
 
 keep the rudder amid ship and beware the odd typo

RE: Extracting emails using a regex

No, he's right, it does match that whole string.  The first "\b" hits right before the first character of the string, then the ".+?" matches the rest of the string up to the "@" symbol.  He's going to need to get the actual regex that matches valid email addresses.  I've only seen it on someone else's computer, and I don't know where he found it, but it was in a module.  The actual regex was approximately 30 lines of code, with really nasty zero-width lookaheads in it, but as I started to work out how to write the correct regex, I realised that they were needed.  It may be in CGI; I'll start looking, but if anybody knows exactly where it's at, it would be greatly appreciated.
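
The lazy ".+?" does not rescue the pattern because the engine always reports the leftmost match, and "." is free to cross spaces; laziness only shortens a match from a given starting point. One way around this, short of the full address regex mentioned above, is to swap "." for character classes that cannot match a space. The `$addr` pattern below is a simplified assumption for illustration, not a complete grammar for valid addresses.

```perl
use strict;
use warnings;

my $string = 'This is a string with an email in it me@myaddress.com.';

# Original pattern: "." may match spaces, so the match starts at the first \b.
my ($loose) = $string =~ /(\b.+?\@.+?\..+?\b)/;

# Simplified pattern (an assumption, not a full address grammar):
# local part of word characters, dots, plus and hyphen, then an @,
# then two or more dot-separated labels.
my $addr = qr/[\w.+-]+\@[\w-]+(?:\.[\w-]+)+/;
my ($tight) = $string =~ /($addr)/;

print "loose: $loose\n";
print "tight: $tight\n";
```

The first capture is the whole sentence up to "com", while the second is only me@myaddress.com; the sentence's closing period is left out because every dot in the domain part must be followed by at least one more label character.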


"If you think you're too small to make a difference, try spending a night in a closed tent with a mosquito."

RE: Extracting emails using a regex

(OP)
Thanks for all the help, guys!

I checked CPAN today for something that does this, and found something, but the regex was 100 lines of code (well 94 to be exact). How does one know how to write that?

Anyways, I have temporarily fixed it with the following code. It may not be the best way, in fact it's probably pretty cryptic, but it works:


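# The @ must be escaped inside a double-quoted string.
$data = "my name is vic and my email address is vikter\@epicsoftware.com and here is some more text";


# No-op: \B is zero-width, so this substitution removes nothing.
$data =~ s/\B//g;
# One array element per space-separated word.
@data = split(/ /,$data);

open(FILE,">>emails.txt") || die("failed to open file: $!");
foreach $var (@data) {
    # Anything, then @, then a domain name or a bracketed numeric host.
    if ($var =~ /^.+\@(\[?)[a-zA-Z0-9\-\.]+\.([a-zA-Z]{2,3}|[0-9]{1,3})(\]?)$/) {
        print FILE "<a href=\"mailto:$var\">$var</a>\n";
    } else {
        print "$var\n";
    }
}
close(FILE);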


The code splits the words in the $data variable into separate elements of the @data array. The foreach loop then goes through the array and checks each word against a regex. If the word looks like a valid email address, it is printed to the file as a mailto link; if not, it is printed to the screen instead.
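
Since the original goal was to replace each address inside the text, the same split-and-test approach can rebuild the sentence instead of writing the addresses to a separate file. This is a rough sketch that reuses the test from the loop above; the `link_words` helper is hypothetical. It assumes words are separated by whitespace (runs of whitespace collapse to single spaces), and an address with punctuation directly after it would fail the anchored test.

```perl
use strict;
use warnings;

# Same test as the loop above, stored as a compiled pattern.
my $email = qr/^.+\@(\[?)[a-zA-Z0-9\-\.]+\.([a-zA-Z]{2,3}|[0-9]{1,3})(\]?)$/;

# Hypothetical helper: link the words that pass the test, keep the rest,
# and glue the sentence back together with single spaces.
sub link_words {
    my ($text) = @_;
    my @out;
    foreach my $word (split ' ', $text) {
        push @out, $word =~ $email
            ? qq{<a href="mailto:$word">$word</a>}
            : $word;
    }
    return join ' ', @out;
}

my $data = 'my email address is vikter@epicsoftware.com and here is some more text';
print link_words($data), "\n";
```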


Thanks for all the help though, and if you find anything on how to do this better, I would love to know.

Thanks again,

-Vic
