EmailReplyParser

EmailReplyParser is a PHP library for parsing plain text email content, based on GitHub's email_reply_parser library written in Ruby.

Installation

The recommended way to install EmailReplyParser is through Composer:

composer require willdurand/email-reply-parser

Usage

Instantiate an EmailParser object and parse your email:

<?php

use EmailReplyParser\Parser\EmailParser;

$email = (new EmailParser())->parse($emailContent);

You get an Email object that contains a set of Fragment objects. The Email class exposes two methods:

  • getFragments(): returns all fragments;
  • getVisibleText(): returns a string which represents the content considered as "visible".

The Fragment represents a part of the full email content, and has the following API:

<?php

$fragment = current($email->getFragments());

$fragment->getContent();
$fragment->isSignature();
$fragment->isQuoted();
$fragment->isHidden();
$fragment->isEmpty();
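As an illustration of how these flags can be combined, the following sketch collects the content of every fragment that is neither quoted nor a signature. The helper name `collectReplyText` is hypothetical and not part of the library; it relies only on the Fragment methods listed above.

```php
<?php

/**
 * Hypothetical helper (not part of EmailReplyParser): joins the content of
 * every fragment that is neither quoted text nor a signature.
 *
 * @param iterable $fragments objects exposing getContent(), isQuoted(), isSignature()
 */
function collectReplyText(iterable $fragments): string
{
    $parts = [];

    foreach ($fragments as $fragment) {
        // Skip quoted replies and signatures; keep everything else.
        if ($fragment->isQuoted() || $fragment->isSignature()) {
            continue;
        }

        // Drop trailing whitespace so the joined text has no stray blank lines.
        $parts[] = rtrim($fragment->getContent());
    }

    return implode("\n", $parts);
}
```

It would be called as `collectReplyText($email->getFragments())`. For the library's own notion of visible content, `getVisibleText()` remains the method to use.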

Alternatively, you can rely on the EmailReplyParser class to either parse an email or get its visible content in a single line of code:

$email = \EmailReplyParser\EmailReplyParser::read($emailContent);

$visibleText = \EmailReplyParser\EmailReplyParser::parseReply($emailContent);

Known Issues

Quoted Headers

Quoted headers aren't picked up if there's an extra line break:

On <date>, <author> wrote:

> blah

Also, they're not picked up if the email client breaks the header up into multiple lines. Gmail wraps any line longer than 80 characters for you.

On <date>,
<author> wrote:
> blah

The On ... wrote: header above can be cleaned up with the following regex:

$fragment_without_date_author = preg_replace(
    '/\nOn(.*?)wrote:(.*?)$/si',
    "",
    $fragment->getContent()
);

Note, though, that we're searching for "on" and "wrote". Therefore, it won't work with other languages.
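To see what that regex does on a concrete input, here is a minimal, self-contained sketch. The function name `stripQuoteHeader` and the sample text are made up for illustration; only the pattern comes from the snippet above.

```php
<?php

/**
 * Hypothetical wrapper around the regex shown above. It removes everything
 * from the first "\nOn" that is followed (anywhere later) by "wrote:" through
 * the end of the text.
 */
function stripQuoteHeader(string $content): string
{
    // "s" lets "." match newlines, so a header split across lines still matches;
    // "i" makes the match case-insensitive. Any line starting with "on"
    // (for example "Once") can trigger the match, so this is a blunt tool.
    $cleaned = preg_replace('/\nOn(.*?)wrote:(.*?)$/si', '', $content);

    // preg_replace() returns null on error; fall back to the original text.
    return $cleaned ?? $content;
}

$content = "Thanks, that works!\nOn Mon, Mar 14, 2011,\nBob wrote:\n> blah";

echo stripQuoteHeader($content); // prints "Thanks, that works!"
```

Because the pattern requires a newline before "On", text whose very first line is the header is left untouched.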

Possible solution: Remove "[email protected]" lines...

Weird Signatures

Lines starting with - or _ sometimes mark the beginning of signatures:

Hello

--
Rick
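The convention can be sketched as a simple line scan. This is not the library's implementation: the helper name `splitAtSignatureDelimiter` is hypothetical, and requiring two delimiter characters (`--` or `__`) is an assumption made here to avoid matching ordinary list items.

```php
<?php

/**
 * Hypothetical helper: splits a message at the first line that starts with
 * "--" or "__", treating that line and everything after it as the signature.
 *
 * @return array two elements: [body, signature]
 */
function splitAtSignatureDelimiter(string $text): array
{
    $lines = explode("\n", $text);

    foreach ($lines as $index => $line) {
        if (preg_match('/^(--|__)/', $line) === 1) {
            $body = implode("\n", array_slice($lines, 0, $index));
            $signature = implode("\n", array_slice($lines, $index));

            // Trim the blank lines that usually separate body and signature.
            return [rtrim($body), $signature];
        }
    }

    // No delimiter found: the whole text is treated as body.
    return [$text, ''];
}

[$body, $signature] = splitAtSignatureDelimiter("Hello\n\n-- \nRick");
// $body is "Hello", $signature is "-- \nRick"
```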

Not everyone follows this convention:

Hello

Mr Rick Olson
Galactic President Superstar Mc Awesomeville
GitHub

**********************DISCLAIMER***********************************
* Note: blah blah blah *
**********************DISCLAIMER***********************************

Strange Quoting

Apparently, prefixing lines with > isn't universal either:

Hello

--
Rick

________________________________________
From: Bob [[email protected]]
Sent: Monday, March 14, 2011 6:16 PM
To: Rick
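One possible workaround for this style of quoting is to cut the message at the separator line yourself before or after parsing. The sketch below is an assumption-laden illustration, not something the library provides: the helper name `stripForwardedHeaderBlock` is hypothetical, the separator is assumed to be at least five underscores, and the check for "From:" only works for English-language clients.

```php
<?php

/**
 * Hypothetical helper: cuts a message at a separator line made of underscores
 * that is immediately followed by a line starting with "From:".
 */
function stripForwardedHeaderBlock(string $text): string
{
    // "m" makes "^" match at the start of every line; "\R" matches any
    // newline sequence; the lookahead requires the next line to start
    // with "From:" without consuming it.
    $parts = preg_split('/^_{5,}[ \t]*\R(?=From:)/m', $text, 2);

    // preg_split() returns false on error; otherwise keep the text before
    // the separator (or the whole text if no separator was found), with
    // trailing whitespace removed.
    return $parts === false ? $text : rtrim($parts[0]);
}

$message = "Hello\n\n--\nRick\n\n"
    . "________________________________________\n"
    . "From: Bob\nSent: Monday, March 14, 2011 6:16 PM\nTo: Rick";

echo stripForwardedHeaderBlock($message);
```

Note that the signature is still part of the returned text; this sketch only removes the quoted header block and whatever follows it.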

Unit Tests

Set up the test suite using Composer:

$ composer install 

Run it using PHPUnit:

$ ./vendor/bin/simple-phpunit 

Contributing

See the CONTRIBUTING file.

Credits

  • GitHub
  • William Durand

License

EmailReplyParser is released under the MIT License. See the bundled LICENSE file for details.
