When format is set to NULL ampersands are double-escaped

Created on 23 January 2020, almost 5 years ago
Updated 26 February 2024, 9 months ago

Problem/Motivation

We noticed that ampersands were being double-escaped when smart_trim was involved with processing a field for display via EntityViewDisplay. This happened when the render array that was created had #format set to NULL.

The format is expected to be provided by $item in the viewElements() function of the SmartTrimFormatter.

See https://git.drupalcode.org/project/smart_trim/blob/8.x-1.2/src/Plugin/Fi...

Proposed resolution

At the moment I've written a workaround using hook_entity_display_build_alter to set the #format to 'basic_html' when #formatter is set to "smart_trim". But it feels like the format should be set in the configuration for the field or field formatter somewhere...

If someone could show me the correct way to set #format for the smart_trim field formatter that would be great, thanks!

Here's some code which seems to work for us at the moment:

/**
 * Fix double-escaped output.
 *
 * Set default #format to basic_html for fields processed by smart_trim.
 * (Only when the #format key exists and is set to NULL).
 *
 * Implements hook_entity_display_build_alter().
 *
 * @param array $build
 * @param array $context
 * @return void
 */
function mymodule_entity_display_build_alter(&$build, $context) {

  // Loop through fields being built for display.
  foreach (Element::children($build) as $field_name) {

    // Use local variable to work with field element.
    $element =& $build[$field_name];

    // Only consider fields being formatted by smart_trim.
    if (isset($element['#formatter']) && $element['#formatter'] == 'smart_trim') {

      // Loop over render arrays built by smart_trim.
      foreach (Element::children($element) as $item_key) {

        if (!isset($element[$item_key]['#type'])) {
          continue;
        }

        // Set default value to 'basic_html'.
        if (
          $element[$item_key]['#type'] == 'processed_text' &&
          array_key_exists('#format', $element[$item_key]) &&
          is_null($element[$item_key]['#format'])
        ) {
          $element[$item_key]['#format'] = 'basic_html';
        }

      }
    }
  }
}
🐛 Bug report
Status

Fixed

Version

2.1

Component

Miscellaneous

Created by

🇳🇿New Zealand code_brown

Live updates comments and jobs are added and updated live.
  • Needs tests

    The change is currently missing an automated test that fails when run with the original code, and succeeds when the bug has been fixed.

Sign in to follow issues

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • 🇩🇪Germany a.dmitriiev

    Re-roll the patch for 2.1.x branch

  • Open in Jenkins → Open on Drupal.org →
    Core: 10.0.7 + Environment: PHP 8.1 & MySQL 5.7
    last update over 1 year ago
    18 pass, 4 fail
  • 🇳🇱Netherlands dennis_meuwissen

    The patch from #11 contains a mistake in line 65, this is probably supposed to be $format ?: $item->format instead of $format ?? $item->format. Otherwise leaving the format selector set to Default will use an empty string as the text format, which does not exist and causes a "Missing text format ." in the Drupal logs. And the output of the formatter will be blank. Attached is a patch with that issue fixed.

  • 🇺🇸United States ultimike Florida, USA

    Thanks to everyone for their assistance with this one so far.

    Before we move forward with a fix, we really need to write a test for this issue. In order to do that, we need a set of clear instructions to reproduce the issue. Here's what I did (on the 2.1.x branch).

    1. Created a new "Text (plain, long)" field.
    2. Set that field to be trimmed using Smart Trim.
    3. Created a node with the field that starts with the phrase "Fire & ice".

    When viewing the Smart Trimmed field, the output is: "Fire & ice". I assume this is the issue.

    Looking at both the MR and recent patch, I'm not sure why a new "Filter" config is necessary (or wanted). Both the patch and the MR allow the site builder to select the text format to be used for the smart trimmed content. Why should this be an option? Why shouldn't the smart trimmed content use the same filter as field? What am I missing here?

    If the issue is just the fact that when a plain text field is smart trimmed, it outputs & instead of &, then that is the issue that needs to be fixed.

    Thoughts?

    -mike

  • 🇮🇪Ireland lostcarpark

    So what I think is happening:

    • When a format is specified, Smart Trim is encoding the & but the format rules aren't generally modifying this.
    • When no format is specified, plain text is used, which encodes &, but as Smart Trim already encoded it, the resultant double occurs.

    The Format selector in the patch feels like a hack to me. and it feels a bit of a blunt instrument to add a default format to the FieldWidget in case the format isn't specified.

    I'm left wondering should it be Smart Trim's job to encode the & at all? If the field format is going to take care of encoding, should we not just leave encoding for the format to encode if require?

    If encoding & is required when the field is formatted, then I would suggest adding a check of the format, and if not specified, convert the & back to & rather than add a new property to Smart Trim.

  • 🇺🇸United States markie Albuquerque, NM

    I do remember there was an issue a long time ago about the ampersand being borked after trimming, so I do want to make sure our output is encoded. I definitely don't want to add a new property or setting as we are trying to stream line the config... pondering...

    either way this needs tests and the MR needs updating to the latest branch (though I think I can do it)

  • @ultimike opened merge request.
  • 🇺🇸United States ultimike Florida, USA
  • Assigned to markie
  • Status changed to Needs review over 1 year ago
  • 🇺🇸United States ultimike Florida, USA

    Soooo - I learned a lot about issue forks on this task - hence all the branches and what not. Thanks to @xjm at Drupal Dev Days for showing me how to create a new issue fork branch and MR - this is a skill I'll be utilizing in the future.

    Regardless, I believe I found a much simpler solution for this issue, and have added tests to demonstrate.

    D10 tests are passing. D9 tests will pass once 🐛 Read more link always displayed in some settings Fixed is merged into 2.1.x

    -mike

  • 🇮🇪Ireland lostcarpark

    I agree with the approach here. The test looks good.

    The pipeline failed, but it seems to be a reflection error on TruncateHTMLTest, so I suspect the problem may be the branch you are merging against, and nothing to do with this change. Hopefully some tinkering with the repo can fix that.

  • Status changed to RTBC over 1 year ago
  • 🇺🇸United States ultimike Florida, USA

    Looks good to me!

    -mike

  • Status changed to Fixed over 1 year ago
  • 🇺🇸United States markie Albuquerque, NM

    Thanks for the help. Merged MR and added to changelog for next release.

  • 🇩🇪Germany a.dmitriiev

    Uploading the patch for version 2.1.0 (without test), because there is no new stable version that includes the patch yet.

  • 🇮🇪Ireland lostcarpark

    Just for future reference, you can get the patch file by appending .patch to the MR URL.

    For example, the patch link here is: https://git.drupalcode.org/project/smart_trim/-/merge_requests/62.patch

    There is no need to upload patch files when using merge requests.

  • 🇩🇪Germany a.dmitriiev

    The change to tests doesn't apply to 2.1.0, that is why the patch was uploaded with only change to formatter class. I am not trying to hunt for credits here.

  • Automatically closed - issue fixed for 2 weeks with no activity.

  • Status changed to Fixed 12 months ago
  • leymannx Berlin

    There is no need to upload patch files when using merge requests.

    There's one famous downside to using MR patches. Imagine someone else pushes new code to the MR. The patch will change. Now imagine someone pushes malicious code to the MR. And you pull it in without knowing.

    Static patches are still the preferred method until GitLab lets us pull in patches by commit reference.

  • 🇮🇪Ireland lostcarpark

    @leymannx, while I agree static patches are best, it's not necessary to upload the patch file just so it can be statically linked. I save all patches a directory in my project repository so Composer is only applying local patches. That prevents any danger from the patch (whether a file or auto-generated from MR) being modified, or just getting deleted from the issue thread, and ensures the patches being applied are what you think they are.

    I do take the point that patch files are still useful for applying the change to a different version branch.

  • 🇩🇪Germany yannickoo Berlin

    FYI it's also possible to open a specific commit in GitLab and then just append .patch which might be safer when downloading patches from remote.

Production build 0.71.5 2024