Removing extraneous info from emails

(OP)
Hello. I have been given a Perl script that [supposedly] strips extraneous info from emails before they are inserted into a database as comments. The script works GREAT with mail sent from Lotus Notes, not at all with messages from Outlook 98, and barely with mail sent from Outlook Express.
What I am looking for is _any_ information that will explain how the extra info is being removed, so that I can customize the script to format all mail the same way. Included below is the section of script that removes the info from the message, as well as a copy of a 'processed' message sent via Outlook Express.
TIA for any help!!
Jim
-----------Begin Code---------------
# We need to remove the extraneous information from the
#  addressing fields
    @tmp=split('"',$message{'From'});
    $message{'From'}=$tmp[1];
    
    if ($message{'To'}=~m"<") {
         @tmp=split(',',$message{'To'});

        foreach(@tmp) {
             @tmp2=split('<',$_,2);
             @tmp3=split('>',$tmp2[1],2);
             $_=@tmp3[0];
        }
        $message{'To'}=join(',',@tmp);
    }
    if ($message{'cc'}) {
        @tmp=split(',',$message{'cc'});
        foreach(@tmp) {
            @tmp2=split('<',$_,2);
            @tmp3=split('>',$tmp2[1],2);
           $_=@tmp3[0];
        }
        $message{'cc'}=join(',',@tmp);
    } else {
        $message{'cc'}=" ";
    }
# Date needs to be corrected to the dd-Mon-yyyy format
    $result=($message{'Date'}=~s"([0-9]) (...) ([0-9])"$1-$2-$3");
    @tmp=split(' ',$message{'Date'});
    $message{'Date'}=$tmp[1];
# Contents and subject must be escaped
    $result=($message{'Subject'}=~s"'"''"g);
    $result=($message{'Subject'}=~s"`"``"g);
    $result=($message{'contents'}=~s"'"''"g);
    $result=($message{'contents'}=~s"`"``"g);
    $result=($message{'contents'}=~s"\\"\\\\"g);
    @tmp=split(' ',$message{'Subject'});
    $catno="";
    foreach(@tmp) {
        if ($_=~m"[0-9]-") {
            $catno=$_;
        }
    }
----End Code---------------------
Results from a processed Outlook Express message:
------Begin results--------------
multipart/alternative; boundary="----=_NextPart_000_0005_01BFECC9.1AF66440" X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 5.50.4133.2400 X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4133.2400

This is a multi-part message in MIME format.

------=_NextPart_000_0005_01BFECC9.1AF66440 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable

THis test message=20 sent via Outlook Express

------=_NextPart_000_0005_01BFECC9.1AF66440 Content-Type: text/html; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable


THis test message sent via Outlook = Express
------=_NextPart_000_0005_01BFECC9.1AF66440--
-----End results--------------

RE: Removing extraneous info from emails

# We need to remove the extraneous information from the
#  addressing fields
# looks like you have an associative array called %message already set up
# next two lines grab the first "From" address from the From list

    @tmp=split('"',$message{'From'});
    $message{'From'}=$tmp[1];

# this section splits up the "To" addresses
# if there's a < character in there
    if ($message{'To'}=~m"<") {
# make an array (@tmp) containing each address
         @tmp=split(',',$message{'To'});
# then - for each address
        foreach(@tmp) {
# chop off the < and > characters
             @tmp2=split('<',$_,2);
             @tmp3=split('>',$tmp2[1],2);
# and save the bare email address (without the < and > characters)
             $_=@tmp3[0];
        }
# save @tmp in the "to" element of %message
        $message{'To'}=join(',',@tmp);
    }
# same - but for the "cc" element (hmmm, should that be "Cc"?)
    if ($message{'cc'}) {
        @tmp=split(',',$message{'cc'});
        foreach(@tmp) {
            @tmp2=split('<',$_,2);
            @tmp3=split('>',$tmp2[1],2);
           $_=@tmp3[0];
        }
        $message{'cc'}=join(',',@tmp);
    } else {
        $message{'cc'}=" ";
    }
# Date needs to be corrected to the dd-Mon-yyyy format
# matches date fields and reformats
# not sure if that first match [0-9] will work as it should match two digits, not one
    $result=($message{'Date'}=~s"([0-9]) (...) ([0-9])"$1-$2-$3");
    @tmp=split(' ',$message{'Date'});
    $message{'Date'}=$tmp[1];
# Contents and subject must be escaped
    $result=($message{'Subject'}=~s"'"''"g);
    $result=($message{'Subject'}=~s"`"``"g);
    $result=($message{'contents'}=~s"'"''"g);
    $result=($message{'contents'}=~s"`"``"g);
    $result=($message{'contents'}=~s"\\"\\\\"g);
    @tmp=split(' ',$message{'Subject'});
    $catno="";
    foreach(@tmp) {
        if ($_=~m"[0-9]-") {
            $catno=$_;
        }
    }
----End Code---------------------
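
For anyone adapting this, here is a minimal sketch of the address and date handling above. It is not the original author's code, and the helper names (bare_addresses, from_name, short_date) are made up for illustration. Two things in the posted version are worth noting. First, in a list that gets processed, an address with no < > around it comes out empty because $tmp2[1] is undefined, and a From line with no quoted display name comes out empty because $tmp[1] is undefined. Second, the date substitution does work with a two-digit day, since the unanchored pattern only needs the last digit of the day and the first digit of the year, but taking $tmp[1] afterwards assumes the Date header starts with a weekday such as "Thu,".

```perl
use strict;
use warnings;

# Hypothetical helper: reduce a To/Cc header value to bare addresses.
# Entries without angle brackets are kept (trimmed) instead of dropped.
sub bare_addresses {
    my ($field) = @_;
    return '' unless defined $field;
    my @out;
    for my $entry (split /,/, $field) {
        if ($entry =~ /<([^>]*)>/) {
            push @out, $1;                 # "Name" <user@host> becomes user@host
        } else {
            $entry =~ s/^\s+|\s+$//g;      # no brackets: keep the trimmed entry
            push @out, $entry if length $entry;
        }
    }
    return join ',', @out;
}

# Hypothetical helper: prefer the quoted display name, then the bracketed
# address, then the whole (trimmed) field, so From is never left empty.
sub from_name {
    my ($field) = @_;
    return '' unless defined $field;
    return $1 if $field =~ /"([^"]*)"/;
    return $1 if $field =~ /<([^>]*)>/;
    $field =~ s/^\s+|\s+$//g;
    return $field;
}

# Hypothetical helper: pull dd-Mon-yyyy out of a Date header whether or
# not it starts with a weekday. Returns the input unchanged if no date
# of that shape is found.
sub short_date {
    my ($date) = @_;
    if ($date =~ /(\d{1,2})\s+([A-Za-z]{3})\s+(\d{4})/) {
        return sprintf('%02d-%s-%s', $1, $2, $3);
    }
    return $date;
}
```

Splitting on commas still mis-handles a display name that itself contains a comma (for example "Smith, Jim"), so a real address parser from CPAN such as Mail::Address would likely be the more robust choice if installing modules is an option.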

run out of time - sorry - baby needs to be fed!

Mike
michael.j.lacey@ntlworld.com
Cargill's Corporate Web Site

RE: Removing extraneous info from emails

(OP)
Thank you very much!!!
You are a great help! Hope the baby is full :)
Can you shed a little light on the following section as I'm almost sure this is where my problem is.
--------Begin Code--------------
# Contents and subject must be escaped
    $result=($message{'Subject'}=~s"'"''"g);
    $result=($message{'Subject'}=~s"`"``"g);
    $result=($message{'contents'}=~s"'"''"g);
    $result=($message{'contents'}=~s"`"``"g);
    $result=($message{'contents'}=~s"\\"\\\\"g);
    @tmp=split(' ',$message{'Subject'});
    $catno="";
-------End Code-----------------

TIA!!!
Jim

RE: Removing extraneous info from emails

baby's fine <grin>

let us know how you get on with the script, this kind of thing is a devil to test


# Contents and subject must be escaped
# this section "escapes" certain characters so that they will insert
# correctly into the database - you can't just insert a string
# like 'Michael's', it has to be 'Michael''s'
# use / rather than " as the match delimiter - much easier to read
# replace all ' with '' in the 'Subject' element of %message
$result=($message{'Subject'}=~s/'/''/g);
# replace all ` with `` in the 'Subject' element of %message
$result=($message{'Subject'}=~s/`/``/g);
# replace all ' with '' in the 'contents' element of %message
$result=($message{'contents'}=~s/'/''/g);
# replace all ` with `` in the 'contents' element of %message
$result=($message{'contents'}=~s/`/``/g);
# replace all \ with \\ in the 'contents' element of %message
$result=($message{'contents'}=~s/\\/\\\\/g);
# put each word in 'Subject' into the elements of the @tmp array
@tmp=split(' ',$message{'Subject'});
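
The same escaping can be wrapped in one routine so Subject and contents go through identical code. This is only a sketch: escape_text is a hypothetical name, and it assumes the database wants quotes and backticks doubled exactly as the posted script does. The second argument mirrors the script, which doubles backslashes in contents but not in Subject.

```perl
use strict;
use warnings;

# Hypothetical helper: double the characters the posted script doubles.
# Pass a true second argument to double backslashes as well.
sub escape_text {
    my ($text, $escape_backslash) = @_;
    return '' unless defined $text;
    $text =~ s/'/''/g;                          # ' becomes ''
    $text =~ s/`/``/g;                          # ` becomes ``
    $text =~ s/\\/\\\\/g if $escape_backslash;  # \ becomes \\
    return $text;
}

# Usage, following the script's treatment of the two fields:
# $message{'Subject'}  = escape_text($message{'Subject'});
# $message{'contents'} = escape_text($message{'contents'}, 1);
```

The thread doesn't say how the insert is done. If it goes through DBI, placeholders (or $dbh->quote) would usually be the safer route than hand-rolled escaping.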

Mike
michael.j.lacey@ntlworld.com
Cargill's Corporate Web Site

RE: Removing extraneous info from emails

(OP)
> let us know how you get on with the script, this kind of > thing is a devil to test
You're not kidding there! The main problem is that the script works great on messages sent via Lotus Notes (v4.6a), but doesn't strip out enough on messages sent via Outlook Express, and strips so much on messages sent via Outlook 98 that they never show up! This is a very difficult project. Your help has been invaluable. Examples or resources for this type of script are rare at best. Thanks again, and if you have any other thoughts please let me know, otherwise I'll let ya know how it turns out.
Thanks!
Jim
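
One observation on the Outlook Express results posted at the top of the thread: what is left over is MIME structure (the boundary lines, the per-part Content-Type headers, and quoted-printable markers such as =20 and a trailing =), and nothing in the posted section touches any of that. Below is a rough sketch of pulling out just the text/plain part and decoding it. The helper names boundary_of and plain_text_part are hypothetical, and it assumes the raw Content-Type header and the raw body are available as strings, which the thread doesn't confirm. decode_qp comes from the MIME::QuotedPrint module that ships with Perl.

```perl
use strict;
use warnings;
use MIME::QuotedPrint qw(decode_qp);

# Hypothetical helper: get the boundary string out of a Content-Type value.
sub boundary_of {
    my ($content_type) = @_;
    return $1 if defined $content_type
              && $content_type =~ /boundary="?([^";]+)"?/i;
    return;
}

# Hypothetical helper: return the decoded text/plain part of a multipart
# body, or undef if there isn't one.
sub plain_text_part {
    my ($body, $boundary) = @_;
    # Each part is delimited by a line of "--" followed by the boundary;
    # the final delimiter has a trailing "--" as well.
    for my $part (split /^--\Q$boundary\E(?:--)?[ \t]*\r?\n?/m, $body) {
        # Part headers and part content are separated by a blank line.
        my ($head, $text) = split /\r?\n\r?\n/, $part, 2;
        next unless defined $text;
        next unless $head =~ m{^Content-Type:\s*text/plain}mi;
        $text = decode_qp($text)
            if $head =~ /^Content-Transfer-Encoding:\s*quoted-printable/mi;
        $text =~ s/\s+\z//;   # drop trailing blank lines
        return $text;
    }
    return;
}
```

This only handles a single level of multipart with well-formed part headers. For nested or otherwise messy mail, a full parser from CPAN such as MIME::Parser would probably be less work in the long run.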
