wordwrap() with $cut_long_words = TRUE could split multi-byte UTF-8 characters

Created on 3 June 2024, 8 months ago
Updated 4 June 2024, 8 months ago

Problem/Motivation

In 2014, @maximpodorov asked:

BTW, is it proper to use wordwrap(), strlen() and other byte level function for UTF-8 texts?

I took a look and no, I don't think it's proper for Drupal\Core\Mail\MailFormatHelper::wrapMailLine() to call wordwrap() with $cut_long_words = TRUE. wordwrap() counts bytes, not characters, so a forced cut can land in the middle of a multi-byte UTF-8 character.

Steps to reproduce

The MailFormatHelper class could produce invalid UTF-8 if a line is very long (greater than 996 bytes), has multi-byte characters, and has no spaces, thus triggering wrapping at non-space characters. This is especially likely to happen in languages that don't use spaces.
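A minimal standalone sketch of the failure, outside Drupal. It assumes the hard cut happens at 996 bytes with "\n" as the break string; the exact width and break string used by wrapMailLine() may differ. The leading ASCII "a" is deliberate: 996 is divisible by 2, 3 and 4, so a line made only of same-width multi-byte characters would happen to cut cleanly.

```php
<?php
// One ASCII byte followed by 400 three-byte characters: 1201 bytes, no spaces.
$line = 'a' . str_repeat('あ', 400);

// Assumed parameters: cut at 996 bytes, break with "\n", force-cut long words.
$wrapped = wordwrap($line, 996, "\n", TRUE);
$pieces = explode("\n", $wrapped);

foreach ($pieces as $i => $piece) {
  // preg_match() with the /u modifier fails on invalid UTF-8, which makes it
  // usable as a validity check without requiring the mbstring extension.
  $valid = preg_match('//u', $piece) === 1;
  printf("piece %d: %d bytes, valid UTF-8: %s\n", $i, strlen($piece), $valid ? 'yes' : 'no');
}
```

With this input the cut should land inside a character, so the pieces on both sides of the break are reported as invalid UTF-8.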

Proposed resolution

The second call to wordwrap() with $cut_long_words = TRUE should be replaced with something that can safely handle multi-byte characters.
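One possible shape for such a replacement, as a rough sketch only. cut_utf8_safe() is a hypothetical helper, not an existing Drupal or PHP function. It only does the hard cut (it does not prefer breaking at spaces the way wordwrap() does), and it assumes valid UTF-8 input and a byte limit of at least 4.

```php
<?php
/**
 * Hypothetical helper: hard-cuts $text into chunks of at most $max_bytes
 * bytes, breaking only between UTF-8 code points.
 */
function cut_utf8_safe(string $text, int $max_bytes, string $break = "\n"): string {
  // The /u modifier makes PCRE split between code points rather than bytes.
  $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
  if ($chars === FALSE) {
    // Input is not valid UTF-8: return it unchanged rather than guess.
    return $text;
  }
  $chunks = [];
  $current = '';
  foreach ($chars as $char) {
    // strlen() counts bytes, which is the unit the mail line limit is in.
    if ($current !== '' && strlen($current) + strlen($char) > $max_bytes) {
      $chunks[] = $current;
      $current = '';
    }
    $current .= $char;
  }
  if ($current !== '') {
    $chunks[] = $current;
  }
  return implode($break, $chunks);
}

// Usage: same kind of input as a long line with no spaces.
$line = 'a' . str_repeat('あ', 400);
$pieces = explode("\n", cut_utf8_safe($line, 996));
foreach ($pieces as $i => $piece) {
  $valid = preg_match('//u', $piece) === 1;
  printf("piece %d: %d bytes, valid UTF-8: %s\n", $i, strlen($piece), $valid ? 'yes' : 'no');
}
```

This works on code points, not grapheme clusters, so a base character and a following combining mark could still end up on different lines. The output stays valid UTF-8 either way, which is the problem described here.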

Remaining tasks

User interface changes

API changes

Data model changes

Release notes snippet

🐛 Bug report
Status

Closed: duplicate

Version

11.0 🔥

Component
Mail

Last updated 19 days ago

No maintainer
Created by

🇺🇸 United States mfb San Francisco
