- Issue created by @prudloff
- Status changed to Needs review
8 months ago 1:39pm 29 April 2024 - 🇫🇷France prudloff Lille
GitLab fails to create the issue fork for some reason so here is a patch.
- 🇫🇷France prudloff Lille
Turns out Html::load() removes everything outside the body so this is not what we need here.
The root problem seems to be that DOMDocument::loadHTML() does not detect the encoding correctly. Forcing it like this works but does not feel very clean:
$success = @$dom->loadHTML('<?xml encoding="utf-8" ?>' . $html);
Using the HTML5 library seems to work correctly (it is what Html::load() uses internally).
- First commit to issue fork.
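For anyone who wants to try the encoding workaround from the comment above in isolation, here is a minimal sketch. It is not the committed fix. The sample string and the variable names other than $success and $html are made up for illustration, and it assumes the script file itself is saved as UTF-8. What the unhinted parse returns depends on the libxml2 version, so the sketch only prints that value for comparison.

```php
<?php
// Hypothetical input: a UTF-8 fragment with German umlauts and no charset declaration.
$html = '<p>Grüße aus Köln</p>';

// Parse without any encoding hint. The result depends on how the installed
// libxml2 guesses the charset, so it is only printed for comparison.
$plain = new DOMDocument();
@$plain->loadHTML($html);
$without_hint = $plain->getElementsByTagName('p')->item(0)->textContent;

// Parse again with the XML declaration prepended, as in the comment above,
// so that the parser treats the input as UTF-8.
$forced = new DOMDocument();
$success = @$forced->loadHTML('<?xml encoding="utf-8" ?>' . $html);
$with_hint = $forced->getElementsByTagName('p')->item(0)->textContent;

echo "without hint: ", $without_hint, PHP_EOL;
echo "with hint:    ", $with_hint, PHP_EOL;
```

If the two printed lines differ, the unhinted parse misread the encoding on that system, which is the symptom described in this issue.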
- Status changed to RTBC
5 months ago 10:09am 24 July 2024 - 🇩🇪Germany spuky
I can confirm that patch 3 solves the encoding issue (in my case with German umlauts).
I have provided the patch as an MR.
- Status changed to Fixed
3 months ago 9:33pm 9 September 2024 Automatically closed - issue fixed for 2 weeks with no activity.