MailFormatHelper::htmlToText() incorrect handling of newlines in anchor links

Created on 16 September 2008, over 16 years ago
Updated 1 May 2024, 12 months ago

Problem/Motivation

The function drupal_html_to_text() does not handle link text that contains a newline. For example, given the test sequence:

$html_text = "<a href=\"http://some.url\">Link \ntext</a>";
$plain_text = drupal_html_to_text($html_text);

We get:

Link text

when we should get:

Link text [1]

[1] http://some.url

The proposed fix is simple: add the 's' (PCRE_DOTALL) modifier to the relevant regular expression, so that '.' also matches newline characters and the link text can span lines. A patch is attached.
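To illustrate why the modifier matters, here is a minimal sketch. The anchor pattern below is a simplified, hypothetical stand-in for the one in core, not a copy of it, and the variable names are mine. It only shows that, without 's', the '.' in the link-text group stops at the newline, so the anchor is never matched and no footnote can be generated for it.

```php
<?php
// Same input as in the report: the link text contains a newline.
$html = "<a href=\"http://some.url\">Link \ntext</a>";

// Hypothetical, simplified anchor patterns (assumption: not the exact core regex).
// Without 's', '.' does not match "\n", so (.+?) cannot reach "</a>".
$without_s = '@<a[^>]+?href="([^"]*)"[^>]*?>(.+?)</a>@i';
// With 's' (PCRE_DOTALL), '.' also matches "\n".
$with_s = '@<a[^>]+?href="([^"]*)"[^>]*?>(.+?)</a>@is';

$matched_without = preg_match($without_s, $html, $m1);
$matched_with = preg_match($with_s, $html, $m2);

var_dump($matched_without); // int(0): the anchor is not recognised
var_dump($matched_with);    // int(1): the anchor is recognised
var_dump($m2[1]);           // string(15) "http://some.url"
```

With the 's' modifier the URL is captured even though the link text spans two lines, which is what the footnote step needs.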

For context, I encountered this problem while using the 6.4 simplenews module, which is effectively still in beta. I was passing it a long string of HTML (without any newlines) to send as HTML mail; in that case a plain-text part must also be included in the resulting multipart message, so a routine called simpletext_html_to_text calls drupal_html_to_text. Along the way, newlines are added and removed in a rather complex way that I don't understand, including in the middle of link text. If any are present when drupal_html_to_text is called, the footnotes go missing as described above.

Steps to reproduce

Proposed resolution

TBA

Remaining tasks

TBA

User interface changes

API changes

Data model changes

Release notes snippet

🐛 Bug report
Status

Needs work

Version

11.0 🔥

Component
Base →

Last updated about 1 hour ago

Created by

🇬🇧 United Kingdom Liberation

  • Needs issue summary update

    Issue summaries save everyone time if they are kept up-to-date. See Update issue summary task instructions.


