Issue with embedded entities and Html::decodeEntities

Created on 21 February 2025

Problem/Motivation

When using the entity_embed module alongside this module, WYSIWYG-embedded entities that are returned from Google Cloud Translate lose their data-entity-embed-display-settings attribute when the translation is rendered.

Directly after receiving the translation from Google:

<drupal-entity data-entity-type="media" data-entity-uuid="bdf34a77-8019-41f6-8234-9ed11370e3cf" data-embed-button="image_bundle" data-entity-embed-display="entity_reference:entity_reference_entity_view" data-entity-embed-display-settings="{&quot;view_mode&quot;:&quot;gallery_full_width_no_thumbnail&quot;}"></drupal-entity>

Directly after running the text through Html::decodeEntities, the HTML entities are decoded as expected:

<drupal-entity data-entity-type="media" data-entity-uuid="bdf34a77-8019-41f6-8234-9ed11370e3cf" data-embed-button="image_bundle" data-entity-embed-display="entity_reference:entity_reference_entity_view" data-entity-embed-display-settings="{"view_mode":"gallery_full_width_no_thumbnail"}"></drupal-entity>

When you then view the translation on the node page, FilterHtml strips the attribute with Xss::filter, even though drupal-entity is one of the allowed HTML tags. The end result is shown below: the display settings are stripped out, so the images render much smaller than in the default translation, which hasn't passed through Google Translate.

<drupal-entity data-entity-type="media" data-entity-uuid="bdf34a77-8019-41f6-8234-9ed11370e3cf" data-embed-button="image_bundle" data-entity-embed-display="entity_reference:entity_reference_entity_view"></drupal-entity>
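A likely explanation, sketched in plain PHP outside Drupal: in HTML, a double-quoted attribute value ends at the next literal double quote, so once the &quot; entities are decoded the attribute value is effectively just "{" and the remainder is malformed markup that a sanitizer may discard. The snippet assumes Html::decodeEntities behaves like html_entity_decode() with ENT_QUOTES | ENT_HTML5. The helper name attribute_value() is hypothetical, and the regex only illustrates the quoting rule; it is not how Xss::filter parses attributes.

```php
<?php

/**
 * Hypothetical helper: returns a double-quoted attribute's raw value as a
 * naive parser would see it, i.e. everything up to the next literal quote.
 */
function attribute_value(string $html, string $name): ?string {
  $pattern = '/\s' . preg_quote($name, '/') . '="([^"]*)"/';
  return preg_match($pattern, $html, $m) ? $m[1] : NULL;
}

$name = 'data-entity-embed-display-settings';
$encoded = '<drupal-entity data-entity-embed-display-settings="{&quot;view_mode&quot;:&quot;gallery_full_width_no_thumbnail&quot;}"></drupal-entity>';

// Assumption: this mirrors what Html::decodeEntities does to the markup.
$decoded = html_entity_decode($encoded, ENT_QUOTES | ENT_HTML5, 'UTF-8');

// Encoded markup: the attribute value is intact and decodes to valid JSON.
$before = attribute_value($encoded, $name);
$settings = json_decode(html_entity_decode($before, ENT_QUOTES | ENT_HTML5, 'UTF-8'), TRUE);
var_dump($settings);

// Decoded markup: the value is cut off at the first inner quote.
$after = attribute_value($decoded, $name);
var_dump($after); // string(1) "{"
```

This would be consistent with the stored, entity-encoded embed surviving the filter while the decoded one loses its settings.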

I'm currently working around this by re-encoding just the display-settings but I wonder if there is a fix we can apply in the module instead?
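A minimal sketch of that re-encoding step as a standalone PHP function, independent of Drupal and TMGMT. The function name reencode_display_settings() is hypothetical, and the pattern is borrowed from the hook posted later in this issue. Unlike that hook, this sketch escapes the original JSON text rather than re-encoding the decoded array, on the assumption that the exact original value should be preserved (for example, decoding "{}" to an array and re-encoding it would yield "[]").

```php
<?php

/**
 * Hypothetical helper: re-escapes the JSON inside
 * data-entity-embed-display-settings after entities were decoded.
 */
function reencode_display_settings(string $html): string {
  $result = preg_replace_callback(
    '/data-entity-embed-display-settings="(\{[^<&]*?\})"/',
    static function (array $matches): string {
      // Leave the markup alone unless the captured text is valid JSON.
      if (json_decode($matches[1], TRUE) === NULL) {
        return $matches[0];
      }
      // Escape the original JSON text so it is safe inside the attribute.
      $escaped = htmlspecialchars($matches[1], ENT_QUOTES, 'UTF-8');
      return 'data-entity-embed-display-settings="' . $escaped . '"';
    },
    $html
  );
  // preg_replace_callback() returns NULL on a regex error; fall back to input.
  return $result ?? $html;
}

$decoded = '<drupal-entity data-entity-embed-display-settings="{"view_mode":"gallery_full_width_no_thumbnail"}"></drupal-entity>';
echo reencode_display_settings($decoded), PHP_EOL;
```

Because the pattern excludes "&", markup whose settings are already entity-encoded does not match and passes through unchanged, so running the function twice should be harmless.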

Steps to reproduce

* Install entity_embed.
* Set some display settings I guess.
* Use this module to request translation of some content that has an embed inside a WYSIWYG field.
* Notice that the display-settings data attribute is stripped out.

Proposed resolution

Unsure of the best solution. Open to suggestions.

Remaining tasks

N/A

User interface changes

N/A

API changes

N/A

Data model changes

N/A

🐛 Bug report
Status

Active

Version

1.0

Component

Code

Created by

achap 🇦🇺

Comments & Activities

  • Issue created by @achap
  • 🇵🇹Portugal kallado

    @achap Let me try to find some time to fix this. We had already identified the problem in https://www.drupal.org/project/auto_node_translate_google/issues/3508218 (🐛 "Html should be decoded after translatio", Active) for the Auto Node Translate Google Provider. It seems to be the same issue with the same solution.

  • achap 🇦🇺

    I think Html::decodeEntities is already working. The problem is that the original node that has the embed on it (the one that hasn't been translated) actually stores the embed with the HTML entities intact, like this:

    <drupal-entity data-entity-type="media" data-entity-uuid="bdf34a77-8019-41f6-8234-9ed11370e3cf" data-embed-button="image_bundle" data-entity-embed-display="entity_reference:entity_reference_entity_view" data-entity-embed-display-settings="{&quot;view_mode&quot;:&quot;gallery_full_width_no_thumbnail&quot;}"></drupal-entity>
    

    That one doesn't get stripped by Xss::filter on output, but the decoded one does:

    <drupal-entity data-entity-type="media" data-entity-uuid="bdf34a77-8019-41f6-8234-9ed11370e3cf" data-embed-button="image_bundle" data-entity-embed-display="entity_reference:entity_reference_entity_view" data-entity-embed-display-settings="{"view_mode":"gallery_full_width_no_thumbnail"}"></drupal-entity>
    
  • 🇵🇹Portugal kallado

    Ok that will need further investigation.

  • achap 🇦🇺

    I couldn't think of a way to solve this without hardcoding a dependency on entity_embed, so what about dispatching a post-translation event directly after the translation is received and the response is decoded? This fixes my problem. I uploaded a patch based on the merge request.

  • achap 🇦🇺

    I should also add that my workaround of using a filter doesn't work, because the filter isn't applied when you save the translation in the UI, nor in the translation review section, so that solution was not working properly.

  • achap 🇦🇺

    I realized it's actually already possible to do this without an event, by using hook_tmgmt_job_after_request_translation(). However, it's not as easy and results in saving the job item and entity twice. Still, I want to have a backup in case the MR isn't committed.

    You can do it like this:

    /**
     * Implements hook_tmgmt_job_after_request_translation().
     */
    function my_module_tmgmt_job_after_request_translation(array $job_items) {
    
      foreach ($job_items as $job_item) {
        // Only start working if the previous job was accepted.
        if ($job_item->isAccepted()) {
          /** @var \Drupal\tmgmt\Data $data_service */
          $data_service = \Drupal::service('tmgmt.data');
          $unfiltered_data = $job_item->getData();
          $data_items = $data_service->filterTranslatable($unfiltered_data);
          $changes = FALSE;
          foreach ($data_items as $data_item_key => $data_item_value) {
            // Work around data-entity-embed-display-settings losing its entity
            // encoding after translation.
            // @see https://www.drupal.org/project/tmgmt_google_v3/issues/3508188
            $original = $data_items[$data_item_key]['#translation']['#text'];
            $result = preg_replace_callback(
              '/data-entity-embed-display-settings="(\{[^<&]*?\})"/',
              function ($matches) {
                $json = $matches[1];
    
                // Return the original markup if the value is not valid JSON.
                if (json_decode($json, TRUE) === NULL) {
                  return $matches[0];
                }
    
                // Escape the original JSON text for use in an HTML attribute.
                // Re-encoding the decoded array would turn "{}" into "[]".
                $encodedJson = htmlspecialchars($json, ENT_QUOTES, 'UTF-8');
                return 'data-entity-embed-display-settings="' . $encodedJson . '"';
              },
              $original
            );
            if ($result !== $original) {
              $changes = TRUE;
              $data_items[$data_item_key]['#text'] = $result;
            }
          }
    
          // The job item must be re-marked as active for the data to be propagated
          // to the node, otherwise it will only be saved on the job item (not the
          // corresponding entity) as the translator already marks it active. Only
          // mark it as active if something actually changed to prevent unnecessary
          // saves.
          if ($changes) {
            $job_item->active();
            $job_item->addTranslatedData($data_service->unflatten($data_items));
          }
        }
      }
    
    }
    

    The event would be nicer but up to you if you want to include it!
