Support unicode characters at trim_suffix (ellipsis)

Created on 4 August 2020, over 4 years ago
Updated 1 December 2023, 12 months ago

Problem/Motivation

If the user enters a Unicode escape sequence in the "Suffix" field of the formatter configuration, the string is output as is rather than as the character it represents.

For example:

Suffix = \u2026 is represented as "\u2026"
Suffix = \u2026 should be represented as "…" (a single character)

Proposed resolution

Decode the escape sequence into the Unicode character it represents:

$unicodeChar = '\u1000';
echo json_decode('"'.$unicodeChar.'"');

Another option would be to use mb_convert_encoding() on the equivalent HTML entity:
echo mb_convert_encoding('&#x1000;', 'UTF-8', 'HTML-ENTITIES');
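
A minimal sketch of the json_decode() approach applied to an arbitrary suffix value, not the module's actual implementation. The function name decode_unicode_suffix() is hypothetical, and falling back to the raw input when decoding fails is an assumption made for this example.

```php
<?php

/**
 * Decodes \uXXXX escape sequences in a suffix string.
 *
 * Hypothetical helper for illustration only.
 */
function decode_unicode_suffix(string $suffix): string {
  // Wrap the raw value in double quotes so it parses as a JSON string literal.
  $decoded = json_decode('"' . $suffix . '"');
  // json_decode() returns NULL on invalid JSON, for example when the value
  // contains an unescaped double quote. Assumption: keep the input unchanged.
  return is_string($decoded) ? $decoded : $suffix;
}

// Single quotes keep the backslash literal, as it would arrive from a form.
echo decode_unicode_suffix('\u2026'), PHP_EOL;
// A plain suffix without escape sequences passes through untouched.
echo decode_unicode_suffix('...'), PHP_EOL;
```

The first call prints the single-character ellipsis, the second prints three periods.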

🐛 Bug report
Status

Fixed

Version

2.1

Component

Code

Created by

🇧🇴Bolivia vacho Cochabamba

Merge Requests

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • Merge request !5 "Support unicode for trim suffix" (Merged) created by jedihe
  • 🇺🇸United States ultimike Florida, USA

    Needs test(s) and guidance on whether this introduces a security issue.

  • 🇺🇸United States ultimike Florida, USA

    The patch and MR have different changes - someone needs to:

    • Determine which, if either, fixes the original issue (using a Unicode code as an ellipsis).
    • Ensure the MR is up-to-date against 2.1.x.
    • Add a test.

    -mike

  • Status changed to Needs review about 1 year ago
  • 🇺🇸United States ultimike Florida, USA

    I updated the current MR against 2.1.x, added some tests, and decided to use `json_decode()`. I didn't find anything that led me to believe that using `json_decode()` is a security issue, but I did wrap it in an `Html::escape()` to be safe. Tests are passing, needs a review or two.

    I also added a `#description` to the formatter config "Suffix" field reading, "Unicode character identifiers of the form \u2026 allowed.".

    -mike

  • Status changed to RTBC about 1 year ago
  • 🇺🇸United States markie Albuquerque, NM

    Tested locally and was able to convert \u2026 to "…".

    Anyone want to look at this before I merge?

  • First commit to issue fork.
  • 🇮🇪Ireland lostcarpark

    I have tested putting "\u2026" in the suffix on both the 2.1.x branch and the issue fork. Without this change, "\u2026" gets appended to the text as is. With the change, it is correctly converted to "…".

    I've reviewed the code and it looks good to me.

    The one thing that occurs to me is that if for some reason you actually want to have "\u" in the suffix, you should be able to do so. I have verified that if you put "\\u" in the suffix, it converts to "\u". I feel it's worth adding a test for that case, so I have added one to TruncateHtmlTest.php. Note that because \' is an escape sequence in single-quoted PHP strings, \\ converts to a single backslash, so you need \\\\ to represent a double backslash.

  • Status changed to Needs review about 1 year ago
  • 🇬🇧United Kingdom natts London

    *bump* :-)

  • Status changed to Fixed 12 months ago
  • 🇺🇸United States ultimike Florida, USA

    Thanks to all for helping us get this across the finish line!

    -mike

  • Automatically closed - issue fixed for 2 weeks with no activity.
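
The behaviour discussed in the comments above can be sketched as follows. This is an illustration under stated assumptions, not the merged code: render_suffix() is a hypothetical name, the fallback for undecodable input is an assumption, and htmlspecialchars() stands in for Drupal's Html::escape() so the snippet runs without Drupal.

```php
<?php

/**
 * Decodes \uXXXX sequences in a suffix, then escapes the result for HTML.
 *
 * Hypothetical helper for illustration only.
 */
function render_suffix(string $suffix): string {
  // Parse the value as a JSON string literal to resolve escape sequences.
  $decoded = json_decode('"' . $suffix . '"');
  if (!is_string($decoded)) {
    // Assumption: fall back to the raw value when it is not valid JSON.
    $decoded = $suffix;
  }
  // Stand-in for Html::escape(): encode HTML special characters.
  return htmlspecialchars($decoded, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// '\u2026' is six literal characters in a single-quoted PHP string.
var_dump(render_suffix('\u2026') === "\u{2026}");
// '\\\\u2026' is the text \\u2026 (two backslashes). JSON decoding reduces the
// escaped backslash to one, so the literal text \u2026 survives.
var_dump(render_suffix('\\\\u2026') === '\u2026');
// Markup in the suffix is escaped rather than rendered.
var_dump(render_suffix('<b>') === '&lt;b&gt;');
```

Each var_dump() call prints bool(true). The doubled backslashes in the second call show why a PHP test needs \\\\ to express the \\ a site builder would type into the form.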
