- Issue created by @Mistrae
- 🇨🇭Switzerland Mistrae
I created a patch to revert the latest change, in case anyone needs it fixed before a better solution can be found.
- Status changed to Postponed: needs info
10 months ago 12:41pm 22 January 2024
- 🇬🇧United Kingdom longwave UK
é is normalised to é but should not be stripped:
> \Drupal\Component\Utility\Html::normalize('<a href="https://www.mywebsite.com/services/identite">Identité</a>');
= "<a href="https://www.mywebsite.com/services/identite">Identité</a>"
Can you provide an example similar to the above that fails?
- 🇨🇭Switzerland Mistrae
@longwave, the problem is with serialize, not normalize.
Example: take a DOMDocument with:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html>
<html>
<body>
<a href="https://www.mywebsite.com/services/identite">Identité</a>
</body>
</html>
Run Html::serialize and get:
<a href="https://www.mywebsite.com/services/identite">Identit</a>
- 🇬🇧United Kingdom longwave UK
Well, normalize() just calls load() then serialize(). Can you give a full code snippet that fails please?
- 🇨🇭Switzerland Mistrae
Here is the full code that can recreate the error:
$html_dom = \Drupal\Component\Utility\Html::load(\Drupal\Core\Render\Markup::create('Identité'));
$body = $html_dom->getElementsByTagName('body');
$node = $body->item(0);
$child = $node->childNodes->item(0);
$text = $child->textContent;
// Entity-encode the text before putting it back into the DOM.
$text = htmlentities($text, ENT_QUOTES, 'UTF-8');
$element = $html_dom->createElement('a', $text);
$node->replaceChild($element, $child);
\Drupal\Component\Utility\Html::serialize($html_dom);
- 🇬🇧United Kingdom longwave UK
- 🇨🇭Switzerland Mistrae
If I input the text directly, yes, it works. Maybe it's htmlentities that doesn't work with the new function.
- 🇬🇧United Kingdom longwave UK
$text = htmlentities($text, ENT_QUOTES, 'UTF-8');
This is the problem. If you remove this line, the issue goes away.
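To see what that line changes, here is a minimal sketch that uses only PHP's built-in DOM extension, outside Drupal. It relies on the documented behaviour of DOMDocument::createElement(), whose value argument does not escape "&", so an ampersand starts an entity reference. The helper name childNodeTypes() is hypothetical and exists only for this illustration.

```php
<?php
// Sketch only: plain PHP DOM, no Drupal. childNodeTypes() is a hypothetical helper.

/**
 * Returns the nodeType constants of an element's direct children.
 */
function childNodeTypes(DOMElement $element): array {
  $types = [];
  foreach ($element->childNodes as $child) {
    $types[] = $child->nodeType;
  }
  return $types;
}

$dom = new DOMDocument('1.0', 'UTF-8');

// htmlentities() turns the accented character into a named entity.
$encoded = htmlentities('Identité', ENT_QUOTES, 'UTF-8');

// createElement() does not escape "&", so the named entity is parsed as an
// entity reference rather than stored as literal text.
$fromEncoded = $dom->createElement('a', $encoded);

// Passing the raw UTF-8 string keeps the content as a single text node.
$fromRaw = $dom->createElement('a', 'Identité');

var_dump($encoded);
var_dump(childNodeTypes($fromEncoded));
var_dump(childNodeTypes($fromRaw));
```

Comparing the two lists of node types against the XML_TEXT_NODE and XML_ENTITY_REF_NODE constants shows whether the encoded string ended up as text or as an entity reference.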
- 🇬🇧United Kingdom longwave UK
This might be an upstream bug in \Masterminds\HTML5\Serializer\Traverser::node(). In this case what has happened is we have injected an entity reference directly into the DOM; $node->nodeType is XML_ENTITY_REF_NODE, but the switch statement does not handle this case.
- 🇨🇭Switzerland Mistrae
OK thanks. Just to be clear, does that mean that since 10.2 we cannot use htmlentities with serialize, and that this will be considered won't fix, or should something be done here?
- Status changed to Postponed
10 months ago 3:47pm 22 January 2024
- 🇬🇧United Kingdom longwave UK
Thanks for reporting! I have reported this upstream at https://github.com/Masterminds/html5-php/issues/244 with a slightly modified example. Let's wait to see what the maintainer there has to say. If they decline to fix it, we can still override in Drupal and serialize entity references correctly.
- 🇮🇳India gaurav.kapoor
On one of our websites, we are using Smart Trim and wrapping the generated summary in a link (linked to the respective node). Special characters such as the German umlauts 'ä, ö, ü' and 'ß' then do not show up in the generated trimmed text. The patch from #3 resolved the issue.
- 🇧🇪Belgium weseze
The patch from #3 can cause the contextual links placeholder to be rendered incorrectly, so that it replaces portions of your content instead of just the contextual placeholder div element.
You should not use this patch. Instead, modules should fix their implementations and not use HTML entity encoding/decoding.
Just encountered this issue using the linked_field module. See 🐛 Special characters are stripped (Needs review).
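The advice to avoid entity encoding can be sketched as follows. This is a minimal illustration with PHP's built-in DOM extension rather than Drupal's Html utility, and buildLink() is a hypothetical helper name, not something from Drupal or the affected modules. The raw UTF-8 label is attached with createTextNode(), which treats the string as literal text, so no htmlentities() call is involved.

```php
<?php
// Sketch only: buildLink() is a hypothetical helper, not part of Drupal.

/**
 * Builds an <a> element whose label is stored as a plain text node.
 */
function buildLink(DOMDocument $dom, string $href, string $label): DOMElement {
  $link = $dom->createElement('a');
  $link->setAttribute('href', $href);
  // createTextNode() stores the string as literal text; escaping happens
  // when the document is written out, so the label is passed in unencoded.
  $link->appendChild($dom->createTextNode($label));
  return $link;
}

$dom = new DOMDocument('1.0', 'UTF-8');
$link = buildLink($dom, 'https://www.mywebsite.com/services/identite', 'Identité');
$dom->appendChild($link);

var_dump($link->textContent);
var_dump($link->firstChild->nodeType === XML_TEXT_NODE);
```

A module that currently passes entity-encoded strings to createElement() could adopt the same pattern, assuming its input is already valid UTF-8.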