Extracting text from emails

Status
Not open for further replies.

spinalwiz

Programmer
Feb 25, 2003
32
IT
I'm trying to extract text from a multipart MIME message. The message contains text/plain and text/html parts. Ideally I would like to remove the MIME information and the HTML tags, leaving just the text, but extracting just the text/plain parts would be OK. How do I go about doing this?

Thanks
 
[tt]MIME::Parser[/tt] (part of the MIME-tools distribution; note the capital P) does the first part of what you want:

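[tt]use strict;
use warnings;
use MIME::Parser;

my $parser = MIME::Parser->new;
my $entity = $parser->parse( \*STDIN );

# Loop through the parts of a multipart message.
if ( $entity->is_multipart() ) {
    foreach my $part ( $entity->parts() ) {
        # bodyhandle is undef for a nested multipart part, so skip those
        my $body = $part->bodyhandle or next;
        my $path = $body->path();    # file the decoded part was written to
        process_bit( $path );        # process_bit is your own routine
    }
} else {
    .....
[/tt]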

This is just a flavour of what the beast can do!
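
If only the text/plain parts are wanted, the loop above can be extended along these lines. This is a minimal sketch, not a definitive implementation: [tt]plain_text_parts[/tt] is a hypothetical helper name, the sample message is made up for illustration, and it assumes the decoded bodies are small enough to keep in memory rather than in temporary files.

```perl
use strict;
use warnings;
use MIME::Parser;

# Hypothetical helper: return the decoded body of every text/plain part,
# descending into nested multipart containers.
sub plain_text_parts {
    my ($entity) = @_;

    my @parts = $entity->parts;
    return map { plain_text_parts($_) } @parts if @parts;

    return () unless $entity->effective_type eq 'text/plain';
    my $body = $entity->bodyhandle or return ();
    return ( $body->as_string );
}

# Made-up sample message with one plain-text and one HTML alternative.
my $message = join "\n",
    'MIME-Version: 1.0',
    'Content-Type: multipart/alternative; boundary="XYZ"',
    '',
    '--XYZ',
    'Content-Type: text/plain',
    '',
    'Hello in plain text.',
    '--XYZ',
    'Content-Type: text/html',
    '',
    '<p>Hello in <b>HTML</b>.</p>',
    '--XYZ--',
    '';

my $parser = MIME::Parser->new;
$parser->output_to_core(1);    # keep decoded bodies in memory, not on disk

my $entity = $parser->parse_data($message);
my @texts  = plain_text_parts($entity);
print "$_\n" for @texts;
```

To read a real message instead, swap [tt]parse_data($message)[/tt] for [tt]parse( \*STDIN )[/tt] as in the snippet above.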

The html2text utility is the easiest way to strip HTML tags, which is non-trivial at the best of times.
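
If calling an external utility is awkward, the same job can be sketched in Perl itself. Note that this swaps html2text for the CPAN modules [tt]HTML::TreeBuilder[/tt] and [tt]HTML::FormatText[/tt], which is an assumption about what is installed; [tt]html_to_text[/tt] is a hypothetical helper name and the margins are arbitrary.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;
use HTML::FormatText;

# Hypothetical helper: render an HTML fragment as plain text.
sub html_to_text {
    my ($html) = @_;

    # Build a parse tree, which copes with markup a regex would trip over.
    my $tree = HTML::TreeBuilder->new_from_content($html);

    # Margins are arbitrary choices for this sketch.
    my $formatter = HTML::FormatText->new( leftmargin => 0, rightmargin => 72 );
    my $text = $formatter->format($tree);

    $tree->delete;    # free the parse tree explicitly
    return $text;
}

my $text = html_to_text('<p>Hello in <b>HTML</b>.</p>');
print $text;
```

The string passed in would normally be the decoded body of a text/html part rather than a literal.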

Good Luck ;-)
 