\Drupal\filter\Plugin\Filter\FilterHtmlCorrector and Html::normalize() incorrectly "correct" <source> tags

Created on 2 March 2017, over 7 years ago
Updated 5 October 2023, about 1 year ago

Hi,

Whenever an image is embedded in CKEditor, an additional end tag is added to the source code. This leads to the following W3C validation error: Error: Stray end tag source.

Source code:
<picture><source srcset="photo4.JPG?itok=hocFQ_Zy 1x" media="screen and (min-width: 20em)" type="image/jpeg"></source><img srcset="/sites/default/files/styles/focal_point_640x800/public/field-media-image/foto4.JPG?itok=hocFQ_Zy" alt="Onderzoeken 3" title="Onderzoeken 3" typeof="foaf:Image" src="photo4.JPG?itok=hocFQ_Zy"></picture>

I've dug around in the code and determined that this is added in EntityEmbedFilter of this module (entity_embed). The process() method serializes the $dom object, which seems to be what adds an end tag to the child node. This happens around line 145: $result->setProcessedText(Html::serialize($dom));

I've come across a similar issue where the reporter suggested the following solution:

$altered_html = preg_replace('/<source(.*?)><\/source>/is', '<source$1 />', Html::serialize($dom));
return new FilterProcessResult($altered_html);

Is this, or another solution, something we should apply to this module?
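
The replacement suggested above can be tried in isolation, outside Drupal. The following is only a sketch: the function name collapse_empty_source_tags() is hypothetical, and the sample string is a trimmed version of the markup from this report, standing in for the output of Html::serialize($dom).

```php
<?php

/**
 * Hypothetical helper wrapping the regular expression quoted in the report.
 * It rewrites an empty <source ...></source> pair as a self-closing tag.
 */
function collapse_empty_source_tags(string $html): string {
  // (.*?) lazily captures the attributes between "<source" and the first
  // ">" that is immediately followed by "</source>".
  return preg_replace('/<source(.*?)><\/source>/is', '<source$1 />', $html);
}

// Assumed input: shortened markup in the shape shown in the report.
$serialized = '<picture><source srcset="photo4.JPG?itok=hocFQ_Zy 1x" type="image/jpeg"></source>'
  . '<img src="photo4.JPG?itok=hocFQ_Zy" alt="Onderzoeken 3"></picture>';

echo collapse_empty_source_tags($serialized), PHP_EOL;
// Prints:
// <picture><source srcset="photo4.JPG?itok=hocFQ_Zy 1x" type="image/jpeg" /><img src="photo4.JPG?itok=hocFQ_Zy" alt="Onderzoeken 3"></picture>
```

Note that the pattern assumes the <source> element is empty. Because the capture is lazy and the s flag lets it cross line breaks, a <source> that wraps other markup before its end tag would be rewritten incorrectly, so this is a workaround rather than a general fix.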

🐛 Bug report

Status: Needs work
Version: 11.0
Component: Filter (last updated 4 days ago, no maintainer)
Created by: 🇳🇱Netherlands LaravZ

Issue tags
  • Needs tests

    The change is currently missing an automated test that fails when run with the original code, and succeeds when the bug has been fixed.


Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • 🇩🇪Germany Anybody Porta Westfalica

    Just FYI, and as another example: in a block containing Shopify Buy Buttons (JavaScript), a text format with only the "Correct faulty and chopped off HTML" filter enabled removed the closing elements from the code: https://github.com/Shopify/buy-button-js/issues/806
    This was quite tricky ;D

    The expected and correct result would be to keep the closing brackets from the JavaScript. Removing them breaks the code here.
    I'm not sure whether this edge case is relevant, but I thought it would make sense to let you know, especially because the cause isn't easy to find.

  • 🇨🇦Canada joseph.olstad

    @Anybody, comment #31 is good to know; however, that's out of the scope of this patch.

  • 🇩🇪Germany Anybody Porta Westfalica

    @joseph.olstad: Sorry, reading my comment, I guess I commented on the wrong issue -.- SORRY!
    I'll see if I can find the correct one, and otherwise create one. It seems to be a similar issue with the "Correct faulty and chopped off HTML" filter.

  • 🇨🇦Canada joseph.olstad

    @Anybody, I did some related filter work last night and published a new module that cleans up some other, unrelated CKEditor mess.

    We're using a JavaScript library that provides charts; however, the charts crash if the expected empty table cell (<td>) element contains the annoying &nbsp; that CKEditor insists on inserting.

    For some reason the above patches no longer seem to work the way they once did. I'm now testing on Drupal 10.0.9, and I ended up using a text filter plugin instead.

    It's super easy to create a text filter plugin; I simply used drush gen:

    drush gen module
    (creates a new module; skip this if you already have one)
    drush gen plugin:filter
    (adds the filter to an existing module: the one you just created with drush gen module, or another existing module)

    The example plugin provides a configuration option. I've implemented this in a very simple module called wxt_chart_stability.

    Enable this new plugin (whatever you called it, it will have a checkbox; tick that).

    To ensure that this new plugin has the final word, I had to drag the plugin up to the top of the weight ordering and save.

    We need to redo the above patch, which cleans up the source and track elements, using regular expression code similar to the patch code, and turn it into a contrib module instead.

    That said, I don't know why a module as popular as CKEditor does such silly, annoying things with &nbsp;, and, as you say, other filters could be complicating matters.
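
As a rough illustration of the cleanup described in the comment above, here is a plain PHP sketch of what such a filter might delegate to from its process() method. The function name cleanup_editor_markup() and both patterns are assumptions made for illustration; they are not taken from wxt_chart_stability or from the patches in this issue.

```php
<?php

/**
 * Hypothetical helper, not part of Drupal core or any contrib module.
 * A custom filter plugin could call this on the text it receives.
 */
function cleanup_editor_markup(string $html): string {
  // Empty any table cell whose content is only whitespace and/or
  // non-breaking spaces (the &nbsp; entity or a literal U+00A0).
  // The u flag assumes the input is valid UTF-8.
  $html = preg_replace('/(<td\b[^>]*>)(?:\s|&nbsp;|\x{00A0})+(<\/td>)/iu', '$1$2', $html);

  // Drop the end tag that directly follows an empty <source> or <track>.
  // [^>]* assumes no attribute value contains a ">" character.
  $html = preg_replace('/<(source|track)\b([^>]*)><\/\1>/i', '<$1$2>', $html);

  return $html;
}

$input = '<td class="x">&nbsp;</td><video><track src="a.vtt"></track></video>';
echo cleanup_editor_markup($input), PHP_EOL;
// Prints:
// <td class="x"></td><video><track src="a.vtt"></video>
```

Unlike the snippet quoted in the issue summary, this sketch removes the end tag rather than emitting a self-closing tag; either form should avoid the stray end tag error, since source and track are void elements in HTML5. Cells with real content, and source or track elements that are not empty, are left untouched.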

  • 🇨🇦Canada joseph.olstad

    @Anybody, you might be overworked, lol, and in need of some extra sleep; check that link in your last comment. :)

  • 🇩🇪Germany Anybody Porta Westfalica

    @joseph.olstad that's for sure the case ;D Thanks, I corrected the link and should get some sleep now ....... ;)

  • 🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

    The issue "🐛 Upgrade filter system to HTML5" (status: Fixed) has landed. I'm hoping this will now just need test coverage to prove it works correctly?
