Double encoding headers causes weird errors

Created on 16 September 2022, over 2 years ago
Updated 9 June 2023, almost 2 years ago

I'm using the mimemail module together with the phpmailer_smtp module.
(I'm not sure whether phpmailer_smtp is involved; most likely it is not.)

When the site name contains non-ASCII (UTF-8) characters, the email's From field becomes something like this:

وبسایت <info@site.com>

The UTF-8 part of the From field is first encoded using Symfony\Component\Mime\Header\UnstructuredHeader
(see https://www.drupal.org/node/3207439).

Later, in the mimemail module, the already-encoded string is encoded again by Drupal\Component\Utility\Unicode::mimeHeaderEncode()
(see Drupal\mimemail\Utility\MimeMailFormatHelper::mimeMailHeaders()).

(Keep in mind that Unicode::mimeHeaderEncode() is now deprecated.)
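
To illustrate the effect described above, here is a minimal sketch in plain PHP. The qEncode() helper is hypothetical and far simpler than either Symfony's UnstructuredHeader or Drupal's Unicode::mimeHeaderEncode(); it only shows what happens when an already-encoded word is treated as raw text and Q-encoded a second time.

```php
<?php
// Hypothetical helper (an assumption for illustration, not the Symfony or
// Drupal implementation): Q-encode every byte outside a small safe set and
// wrap the whole value in a single RFC 2047 encoded word.
function qEncode(string $value): string
{
    $encoded = preg_replace_callback(
        '/[^A-Za-z0-9!*+\-\/]/',
        function (array $m): string {
            return sprintf('=%02X', ord($m[0]));
        },
        $value
    );
    return '=?utf-8?Q?' . $encoded . '?=';
}

// UTF-8 bytes of the Japanese word "サイト" ("site").
$siteName = "\xE3\x82\xB5\xE3\x82\xA4\xE3\x83\x88";

$once = qEncode($siteName);
$twice = qEncode($once);

echo $once, "\n";
// =?utf-8?Q?=E3=82=B5=E3=82=A4=E3=83=88?=
echo $twice, "\n";
// =?utf-8?Q?=3D=3Futf-8=3FQ=3F=3DE3=3D82=3DB5=3DE3=3D82=3DA4=3DE3=3D83=3D88=3F=3D?=
```

The second pass escapes the "=" and "?" delimiters of the first encoded word as =3D and =3F, so a mail client that decodes the header once shows the raw "=?utf-8?Q?...?=" text instead of the site name.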

This double encoding causes weird errors.
I have spent many hours debugging line by line, but I'm still not sure what is going on.

So I'm posting this issue here in the hope that someone with more knowledge can help.

BTW, as a temporary workaround, I have commented out the following code to get it working.

/src/Utility/MimeMailFormatHelper.php, lines 746-748

// foreach ($headers as $field_name => $field_body) {
//   $headers[$field_name] = Unicode::mimeHeaderEncode($field_body);
// }

Type: Bug report
Status: Postponed: needs info
Version: 1.0
Component: Code

Comments & Activities

  • Status changed to Postponed: needs info about 2 years ago
  • 🇺🇸United States tr Cascadia

    Mime Mail is not encoding this twice, but core may have changed to do encoding where it previously didn't. We have seen that on the "Return-Path" header but I have not seen a problem with the "From" header.

    Note Mime Mail is no longer using Unicode::mimeHeaderEncode(), which to be clear is perfectly valid for use in Drupal 9. "Deprecated" doesn't mean wrong, it means it will be removed in Drupal 10. Mime Mail does not yet have a Drupal 10 version.

    Please test using the latest version of Mime Mail, and if it isn't fixed please provide instructions to reproduce this error with the Mime Mail Example module (included as a submodule of Mime Mail). See https://www.drupal.org/docs/contributed-modules/mime-mail/testing-and-tr...

  • 🇺🇸United States tr Cascadia

    Oh, and for future reference, this issue had the wrong status. "Needs work" means there is a patch available but that patch is incomplete or has failing tests. The proper status for a new issue is usually "Active" unless there's an actual patch, in which case it should be "Needs review".

    This matters because the status influences what issues get looked at and worked on. As the only active maintainer of this module, when I see a "Needs work" issue in the queue that usually means I've flagged the issue as not ready to be committed, or that someone has proposed a patch that failed the automated testing, so I am unlikely to look at that issue again until the things that need work are addressed and the status set back to "Needs review".

  • 🇯🇵Japan iwahashi

    I created the following patch to fix this issue.

    --- src/Utility/MimeMailFormatHelper.php.orig 2023-03-24 06:16:53.000000000 +0900
    +++ src/Utility/MimeMailFormatHelper.php    2023-06-09 15:00:09.844483234 +0900
    @@ -792,6 +792,7 @@
         // Run all headers through mime_header_encode() to convert non-ASCII
         // characters to an RFC compliant string, similar to drupal_mail().
         foreach ($headers as $field_name => $field_body) {
    +      $field_body = str_replace("\r\n", '', $field_body);
           $headers[$field_name] = (new UnstructuredHeader($field_name, $field_body))->getBodyAsString();
         }
    
    

    Here is why this patch works.
    Before the foreach loop, $headers['From'] in my case is:

    =?utf-8?Q?Drupal9=E9=96=8B=E7=99=BA=E7=94=A8=E3=82=B5=E3=82=A4?=
     =?utf-8?Q?=E3=83=88?= <root@drupal9-devel.XXX>

    After the foreach loop, without the patch, it becomes:

    =?utf-8?Q?=3D=3Futf-8=3FQ=3FDrupal9=3DE9=3D96=3D8B=3DE7=3D99?=
     =?utf-8?Q?=3DBA=3DE7=3D94=3DA8=3DE3=3D82=3DB5=3DE3=3D82=3DA4=3F=3D?=
     =?utf-8?Q??= =?utf-8?Q?=E3=83=88?= <root@drupal9-devel.XXX>
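
    One way to confirm that this is the original header encoded a second time is to decode the payloads of the first two encoded words shown above by one level. This is only a sketch: it uses PHP's built-in quoted_printable_decode() on the text between "=?utf-8?Q?" and "?=", which is sufficient here because these payloads contain no "_" characters.

```php
<?php
// Payloads of the first two encoded words of the doubly-encoded From header,
// copied from the output above (the text between "=?utf-8?Q?" and "?=").
$payloads = [
    '=3D=3Futf-8=3FQ=3FDrupal9=3DE9=3D96=3D8B=3DE7=3D99',
    '=3DBA=3DE7=3D94=3DA8=3DE3=3D82=3DB5=3DE3=3D82=3DA4=3F=3D',
];

// Decode one level only and join the pieces back together.
$recovered = '';
foreach ($payloads as $payload) {
    $recovered .= quoted_printable_decode($payload);
}

echo $recovered, "\n";
// =?utf-8?Q?Drupal9=E9=96=8B=E7=99=BA=E7=94=A8=E3=82=B5=E3=82=A4?=
```

    The result is exactly the first line of the header as it was before the foreach loop, which shows that the first folded line was treated as raw text and encoded again.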

    The call flow of getBodyAsString() is as follows:
    vendor/symfony/mime/Header/UnstructuredHeader.php: getBodyAsString() → $this->encodeWords()
    vendor/symfony/mime/Header/AbstractHeader.php: encodeWords() → $this->tokenNeedsEncoding()
    vendor/symfony/mime/Header/AbstractHeader.php: tokenNeedsEncoding() → preg_match()

    protected function tokenNeedsEncoding(string $token): bool
    {
        return (bool) preg_match('~[\x00-\x08\x10-\x19\x7F-\xFF\r\n]~', $token);
    }

    If $field_body contains CR or LF, UnstructuredHeader::getBodyAsString() encodes $field_body again.
    So the CRLF line breaks that fold the header must be removed before calling getBodyAsString().
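
    The check can be reproduced outside Drupal with plain PHP. In this sketch the standalone tokenNeedsEncoding() function copies the character class quoted above; the header value is shortened and the address is a placeholder, both assumptions made for illustration.

```php
<?php
// Standalone copy of the character class quoted above from Symfony's
// AbstractHeader::tokenNeedsEncoding(), so it can run without Symfony.
function tokenNeedsEncoding(string $token): bool
{
    return (bool) preg_match('~[\x00-\x08\x10-\x19\x7F-\xFF\r\n]~', $token);
}

// A folded, already-encoded From header (shortened; placeholder address).
$folded = "=?utf-8?Q?Drupal9=E9=96=8B?=\r\n =?utf-8?Q?=E3=83=88?= <root@example.com>";

// Unfold it the same way the patch does.
$unfolded = str_replace("\r\n", '', $folded);

var_dump(tokenNeedsEncoding($folded));   // bool(true): CR/LF triggers re-encoding
var_dump(tokenNeedsEncoding($unfolded)); // bool(false): plain ASCII is left alone
```

    Symfony applies this check to individual tokens rather than to the whole header value, but the outcome is the same: any token that still contains the CRLF is flagged for encoding.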
