- π§πͺBelgium wim leers Ghent π§πͺπͺπΊ
HTML5 has now been around for >15 years. More than half the lifetime of Drupal. We should get this fixed at some point π
This was last RTBC almost 2 years ago, in #68.
It was un-RTBC'd in #69 over concerns about XSS filtering.
Since then, there have been 3 releases: https://github.com/Masterminds/html5-php/releases. Nothing in them relating to what was surfaced in #69 AFAICT.
But they helpfully also provided this overview: https://github.com/Masterminds/html5-php#known-issues-or-things-we-desig.... And even more helpfully: https://github.com/Masterminds/html5-php#serializer-design. For us to be able to reuse 99% of this but provide our own handling of encoding in attribute values, we could:
- have
\Drupal\Component\Utility\Html::serialize()
not call\Masterminds\HTML5::saveHTML()
which calls\Masterminds\HTML5::save()
, which hardcodes the use of\Masterminds\HTML5\Serializer\OutputRules
β this would add ~10 LoC - instead provide
\Drupal\Component\Utility\HtmlSerializerRules extends \Masterminds\HTML5\Serializer\OutputRules
which handles attribute values differently by only overriding\Masterminds\HTML5\Serializer\OutputRules::attrs()
, which is ~30 out of the ~550, or 5%
Thoughts? π€
- have
- π«π·France andypost
FYI looking at PHP 8.3 changes I see many commits for
dom
extension
Moreover 2 commits already caused regressions and reverted, see
- https://github.com/php/php-src/issues/11469
- https://github.com/php/php-src/issues/11507So ++ to rely on external library
- πΊπΈUnited States effulgentsia
provide \Drupal\Component\Utility\HtmlSerializerRules extends \Masterminds\HTML5\Serializer\OutputRules which handles attribute values differently by only overriding \Masterminds\HTML5\Serializer\OutputRules::attrs()
+1. Or maybe rename this to
\Drupal\Component\Html5\Serializer\OutputRules
depending on the answer to below.have \Drupal\Component\Utility\Html::serialize() not call \Masterminds\HTML5::saveHTML() which calls \Masterminds\HTML5::save(), which hardcodes the use of \Masterminds\HTML5\Serializer\OutputRules
What would you do instead? Create a
\Drupal\Component\Html5\Html5
that extends\Masterminds\HTML5
and overrides thesave()
method to use ourOutputRules
instead? Or are you envisioning a different way of integrating our OutputRules overrides intoHtml::serialize()
? - last update
over 1 year ago 29,413 pass, 7 fail - @longwave opened merge request.
- π¬π§United Kingdom longwave UK
Opened MR!4274 from 11.x with the work from the 9.4.x branch merged in, no other changes yet - some of the conflicts in FilterKernelTest especially were tricky so let's see how this goes first.
- last update
over 1 year ago Custom Commands Failed - last update
over 1 year ago Custom Commands Failed - Status changed to Needs review
over 1 year ago 10:08pm 27 June 2023 - π¬π§United Kingdom longwave UK
Fixed some of the tests, but it looks like
HtmlTest::testTransformRootRelativeUrlsToAbsolute
was perhaps failing correctly? - last update
over 1 year ago Custom Commands Failed - last update
over 1 year ago 29,553 pass, 1 fail - Status changed to Needs work
over 1 year ago 3:13pm 28 June 2023 - πΊπΈUnited States smustgrave
Failure seems like it could be legit to issue.
- πΊπΈUnited States dww
Tagging to be smashed. Hope to get a chance to review + dig in soon.
- last update
over 1 year ago Custom Commands Failed - Status changed to Needs review
over 1 year ago 10:55am 20 July 2023 - π¬π§United Kingdom longwave UK
There is a typo in WildcardHtmlSupportTest -
class"
instead ofclass
- the HTML5 parser is stricter and rejectsclass"
as an invalid attribute.I am still not sure about the "fix" to ::testTransformRootRelativeUrlsToAbsolute(), will dig into that again.
- last update
over 1 year ago 29,826 pass - π¬π§United Kingdom longwave UK
Reverted the change to
HtmlTest::testTransformRootRelativeUrlsToAbsolute()
and solved the issue another way.This relates to #3311595: Html::transformRootRelativeUrlsToAbsolute() replaces "\r\n" with " \n" β . Code was added to Html::load() to pre-normalize newlines because of the way DOMDocument used to handle them. As we are no longer using DOMDocument we can remove the special cases in
Html::load()
and normalize newlines inHtml::serialize()
instead, I think - this makes HtmlTest pass again locally, and hopefully doesn't break anything else, - last update
over 1 year ago 29,758 pass, 1 fail - last update
over 1 year ago 29,827 pass - π¬π§United Kingdom longwave UK
Addressed #69 through #82 by adding a test as suggested in #69 and then implementing @Wim Leers' suggestion from #80. FilterKernelTest and XssTest pass locally, let's see what happens with the full suite.
Not sure this is worth introducing as an additional component as suggested in #82, but reviews and comments welcome.
- Status changed to RTBC
over 1 year ago 4:44pm 25 July 2023 - πΊπΈUnited States smustgrave
Regarding #82 maybe we could add a new component for our ruleset but is that needed right now or could be in a follow up?
From an agile standpoint I think this change makes things better right now and can be improved on.
- last update
over 1 year ago 29,882 pass - last update
over 1 year ago 29,880 pass - Status changed to Needs review
over 1 year ago 11:17pm 27 July 2023 - πΊπΈUnited States dww
Sorry to do this, but sounds like this still needs more review before truly RTBC.
- π¨π¦Canada joseph.olstad
Wow this is some great work @longwave, you've really been on a marathon run burning the midnight oil on this!
- last update
over 1 year ago 29,887 pass - Status changed to Needs work
over 1 year ago 7:37am 8 August 2023 The Needs Review Queue Bot β tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".
This does not mean that the patch needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.
Consult the Drupal Contributor Guide β to find step-by-step guides for working with issues.
- last update
over 1 year ago 29,962 pass - Status changed to Needs review
over 1 year ago 2:20pm 16 August 2023 - Status changed to Needs work
over 1 year ago 9:33am 21 August 2023 - π§πͺBelgium wim leers Ghent π§πͺπͺπΊ
Sorry for the long wait, @longwave π Here's a very thorough review!
I think this is very close to being ready, and mostly just needs some additional comments to ensure this remains maintainable in the future, as well as additions for the obscure edge cases that are changing, in both the issue summary and the change record.
Superb work, and wonderful to finally see this on the verge of being committable π
- last update
over 1 year ago 30,057 pass - last update
over 1 year ago 29,925 pass, 9 fail - Status changed to Needs review
over 1 year ago 10:36am 21 August 2023 - π¬π§United Kingdom longwave UK
Thanks for the in-depth review! Responded to all points: fixed some nits, added some docs to both the code and change record, rebased and pushed. Also reverted the BigPipe change but I have a feeling this will fail because the placeholders must match the exact string that is output by the normalizer, let's see though.
- Status changed to Needs work
over 1 year ago 9:02am 22 August 2023 - π«π·France andypost
There's new RFC to add HTML5 support to PHP (8.4 I bet) https://wiki.php.net/rfc/domdocument_html5_parser
- πΊπΈUnited States mfb San Francisco
The HTML5 RFC is good news and it's about time, although there will be more waiting to be able to use it.
- π¬π§United Kingdom longwave UK
Yeah, I feel like we should move forward with this anyway, then when Drupal increases the minimum version enough to use the new HTML5 DOM, we can migrate away from
masterminds/html5
. - π§πͺBelgium wim leers Ghent π§πͺπͺπΊ
Yeah, I feel like we should move forward with this anyway
π―
- last update
over 1 year ago 30,169 pass - Status changed to Needs review
over 1 year ago 10:07pm 20 September 2023 - π¬π§United Kingdom longwave UK
This should be ready for another round of review. Fixed the test fails - removed the big pipe placeholder
arguments
attribute when it is empty to hopefully avoid confusion, and fixed a bug in HtmlSerializerRules that was double-escaping some ampersands. - Status changed to RTBC
over 1 year ago 2:47pm 21 September 2023 - πΊπΈUnited States smustgrave
From what I can tell all threads have had a response at least.
CR reads well.
And after 8 years think moving to HTML5 makes sense :) also getting this in about a month out from 10.2 would give contrib modules good time to fix.
- last update
over 1 year ago 30,206 pass - last update
over 1 year ago 30,205 pass, 2 fail - last update
over 1 year ago 30,361 pass - last update
over 1 year ago Build Successful - last update
over 1 year ago 30,361 pass 20:51 17:17 Running- Status changed to Needs work
over 1 year ago 4:42pm 3 October 2023 - π¬π§United Kingdom alexpott πͺπΊπ
I think the changes to bigpipe mean that that module should have a post update to ensure the render cache is clearer so any cached placeholders have the expected HTML.
- π«π·France andypost
FYI this upgrade makes the related issue obsolete so removes last blocker for PHP 8.3 compatibility (and independent of libxml version) #3383577-20: TextSummaryTest:testLength() fails on some libxml versions β
- last update
over 1 year ago 30,372 pass - Status changed to Needs review
over 1 year ago 5:32pm 3 October 2023 - π«π·France andypost
Added
pib_pipe.post_update.php
withbig_pipe_post_update_html5_placeholders()
for #107 - last update
over 1 year ago 30,372 pass - Status changed to RTBC
over 1 year ago 6:55pm 3 October 2023 - Status changed to Fixed
over 1 year ago 8:39am 4 October 2023 -
alexpott β
committed 201ae2e3 on 11.x
Issue #2441811 by longwave, daffie, andypost, edurenye, lauriii, joseph....
-
alexpott β
committed 201ae2e3 on 11.x
- π§πͺBelgium wim leers Ghent π§πͺπͺπΊ
π₯³ Thanks everyone, and @longwave in particular! π 8.5 years in the making π
- π¨π¦Canada joseph.olstad
Nicely done folks! I wonder how this will affect contrib? We shall see! :)
- π«π·France andypost
Thank you all!
It was a blocker for PHP 8.3 compatibility!Meantime the parent issue needs some attention for follow-ups as mostly all
libxml
related issues could be closed π± [Meta] PHP DOM (libxml2) misinterprets HTML5 Active Automatically closed - issue fixed for 2 weeks with no activity.
- Status changed to Fixed
10 months ago 12:05am 9 May 2024 There is a report calling this a regression π Issue with HTML ` ` not being correctly filtered out from URLs Needs work .