Wrong entity encoding in story HTML

Created on 29 April 2024, about 2 months ago
Updated 6 May 2024, about 2 months ago

Problem/Motivation

We have a story with this parameter:

{"text":"Cat\u00e9gorie #2","url":"http:\/\/google.com"}

ServerController renders it like this:

<a href="http://google.com">Cat&Atilde;&copy;gorie #2</a>

Steps to reproduce

Here is our story:

{"title":"Components\/molecules\/navigation\/breadcrumb","parameters":{"server":{"url":"http:\/\/drupal.docksal.site\/storybook\/stories\/render"}},"stories":[{"args":{"breadcrumb":[{"text":"Homepage","url":"http:\/\/google.com"},{"text":"Cat\u00e9gorie #2","url":"http:\/\/google.com"},{"text":"Sous-cat\u00e9gorie avec un lien bcp plus long #3"}]},"parameters":{"server":{"id":"eyJwYXRoIjoidGhlbWVzXC9jdXN0b21cL2Zyb250XC9jb21wb25lbnRzXC9tb2xlY3VsZXNcL25hdmlnYXRpb25cL2JyZWFkY3J1bWJcL2JyZWFkY3J1bWIuc3Rvcmllcy50d2lnIiwiaWQiOiJkZWZhdWx0In0%3D"}},"name":"default"}]}

Proposed resolution

I think $dom->loadHTML() does not detect the encoding correctly.
Using Html::load() is a better practice and seems to fix the issue.
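
The garbled output above is consistent with that hypothesis: it is what you get when the UTF-8 bytes of "é" are read as ISO-8859-1 and the resulting characters are then written out as named HTML entities. The following is a minimal sketch of that mechanism in Python, not the module's PHP code. The helper names `misread_as_latin1` and `escape_non_ascii` are made up for this illustration, and the assumption that the parser falls back to ISO-8859-1 is a guess based on the symptoms.

```python
import json
from html.entities import codepoint2name

# Story argument exactly as given in the report (raw string keeps the JSON escapes).
arg = json.loads(r'{"text":"Cat\u00e9gorie #2","url":"http:\/\/google.com"}')


def misread_as_latin1(text: str) -> str:
    """Encode as UTF-8, then decode the bytes as ISO-8859-1 (the suspected mistake)."""
    return text.encode("utf-8").decode("iso-8859-1")


def escape_non_ascii(text: str) -> str:
    """Replace non-ASCII characters with named HTML entities where one exists."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)
        elif cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")
        else:
            # Hypothetical fallback: numeric reference when no named entity exists.
            out.append(f"&#{cp};")
    return "".join(out)


garbled = escape_non_ascii(misread_as_latin1(arg["text"]))
print(garbled)  # Cat&Atilde;&copy;gorie #2
```

The "é" (U+00E9) is the two bytes C3 A9 in UTF-8. Read as ISO-8859-1, they become "Ã" and "©", which matches the `&Atilde;&copy;` seen in the rendered link.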

πŸ› Bug report
Status

Needs review

Version

1.0

Component

Storybook

Created by

🇫🇷 France prudloff Lille

Comments & Activities

  • Issue created by @prudloff
  • Status changed to Needs review about 2 months ago
  • 🇫🇷 France prudloff Lille

    GitLab fails to create the issue fork for some reason so here is a patch.

  • 🇫🇷 France prudloff Lille

    It turns out that Html::load() removes everything outside the body, so it is not what we need here.
    The root problem seems to be that DOMDocument::loadHTML() does not detect the encoding correctly. Forcing it like this works but does not feel very clean:

        $success = @$dom->loadHTML('<?xml encoding="utf-8" ?>' . $html);

    Using the HTML5 library seems to work correctly (it is what Html::load() uses internally).
