appendXML() does not allow   or double commas in comments

Created on 31 December 2024, about 1 month ago

Problem/Motivation

Migrating from ckeditor5_embedded_content module v1.0 to embedded_content module v2.0. During the migration process, I noticed that I was receiving errors related to unallowed content within XML.

  • When embedded content templates use template suggestions, that include double commas 'embedded-content--something' and twig debugging is turned on. I get this error and the content does not display: Warning: DOMDocumentFragment::appendXML(): Entity: line 5: parser error : Double hyphen within comment: in Drupal\embedded_content\Plugin\Filter\EmbeddedContent->process() (line 104 of modules/contrib/embedded_content/src/Plugin/Filter/EmbeddedContent.php).
  • When embedding content that contains text long formatted content and that content contains a   which is allowed in the editor I get this error: Warning: DOMDocumentFragment::appendXML(): Entity: line 13: parser error : Entity 'nbsp' not defined in Drupal\embedded_content\Plugin\Filter\EmbeddedContent->process() (line 104 of modules/contrib/embedded_content/src/Plugin/Filter/EmbeddedContent.php).

Steps to reproduce

Double commas error in XML

  1. Create a new embedded content example with a template suggestion
  2. Turn on twig debugging
  3. Create a new instance of it on a basic page
  4. Save the page and view

nbsp error in XML

  1. Create a new embedded content example with a WYSIWYG field.
  2. Create a new instance of it on a basic page adding a space before the text so a   is generated
  3. Save the page and view

Proposed resolution

Sanitize the rendered output for XML, or revert to a solution similar to ckeditor5_embedded_content v1.0.
https://git.drupalcode.org/project/ckeditor5_embedded_content/-/blob/1.0...

πŸ› Bug report
Status

Active

Version

2.0

Component

Code

Created by

πŸ‡ΊπŸ‡ΈUnited States vector_ray

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

  • Issue created by @vector_ray
  • πŸ‡ΊπŸ‡ΈUnited States vector_ray
  • πŸ‡ΊπŸ‡ΈUnited States vector_ray
  • πŸ‡ΊπŸ‡ΈUnited States mortona2k Seattle

    I'm seeing this as well.

  • πŸ‡ΊπŸ‡ΈUnited States jess_m

    I came across the same problem in the project I'm working on. For a quick fix solve until the solution is found for the module, in a custom module we built for templates, for the twig variable it turned into: *note, field_name is just an example

    {{ field_name|render|replace({' ':' '})|raw }}

    and it now prints it out correctly both on the page and in the WYSIWYG. It's an immediate quick fix.

  • Merge request !7Escape HTML characters for valid XML β†’ (Open) created by vector_ray
  • πŸ‡ΊπŸ‡ΈUnited States vector_ray

    I have created a merge request that escapes   and -- from the HTML before passing to XML.

  • Pipeline finished with Failed
    28 days ago
    Total: 283s
    #384370
  • πŸ‡ΊπŸ‡ΈUnited States mortona2k Seattle

    This didn't clear up my errors with appendXML:

    parser error : Specification mandates value for attribute async

    The offending line is:

    <script async src="https://public.codepenassets.com/embed/index.js"></script>
    

    Warning: DOMDocumentFragment::appendXML():

    Warning: DOMDocumentFragment::appendXML(): ^

    Warning: DOMDocumentFragment::appendXML(): Entity: line 12: parser error : attributes construct error in Drupal\embedded_content\Plugin\Filter\EmbeddedContent->process() (line 106 of modules/contrib/embedded_content/src/Plugin/Filter/EmbeddedContent.php).

    This is for a codepen oembed plugin, the errors seem to be coming from the template:

    <p class="codepen" data-height="300" data-default-tab="html,result" data-slug-hash="RwEgNJE" data-pen-title="Menu columns" data-user="mortona42" style="height: 300px; box-sizing: border-box; display: flex; align-items: center; justify-content: center; border: 2px solid; margin: 1em 0; padding: 1em;">
      <span>See the Pen <a href="https://codepen.io/mortona42/pen/RwEgNJE">
      Menu columns</a> by Andrew Morton (<a href="https://codepen.io/mortona42">@mortona42</a>)
      on <a href="https://codepen.io">CodePen</a>.</span>
    </p>
    <script async src="https://public.codepenassets.com/embed/index.js"></script>
    
  • πŸ‡ΊπŸ‡ΈUnited States vector_ray

    mortona2k async attribute needs to have a value to pass xml validation. Try async="true".

  • πŸ‡ΊπŸ‡ΈUnited States vector_ray
  • πŸ‡ΊπŸ‡ΈUnited States mortona2k Seattle

    That's a copy of the embed code from codepen. I could adjust it myself to get this working.

    The same code is used by the Codepen module, but there it is implemented in a template and is not embedded into wysiwyg, so it doesn't encounter this issue.

    This is a weird quirk, seems like we're using an XML parser to pass the HTML chunk, but it's still flagging things that are ok in html.

    If these are new/different issues, I'm more inclined to adjust the embed code an add async="true" so we can close this ticket. Doing that clears up the warnings.

  • πŸ‡ΊπŸ‡ΈUnited States vector_ray

    I think this call needs to be made by the maintainer. As I mentioned in the description we can either try to process the HTML before passing through the XML parser like the patch here, in which case we might find more and more cases where the solution might produce errors. Or revert to the previous version solution as seen here: https://git.drupalcode.org/project/ckeditor5_embedded_content/-/blob/1.0...

    @nuez

  • πŸ‡ΊπŸ‡ΈUnited States vector_ray
  • πŸ‡³πŸ‡±Netherlands groendijk

    Yeah, can confirm the CKEditor5 version of the function `process` does the trick. It's not only &nbsp; or --. Some other html might not be accepted as wel like <br> (without trailing slash).

  • πŸ‡©πŸ‡ͺGermany ckaotik Berlin

    I'm wondering, do we really need to use DOM parsing? Or couldn't we simply pass the generated markup as a text replacement for the <embedded-content> tag, using a simple str_replace (multibyte, probably).

  • πŸ‡ͺπŸ‡ΈSpain nuez Madrid, Spain

    I've revised this and reverted largely to the old way in the previous module, with the addition of the HTML::normalize from https://www.drupal.org/project/embedded_content/issues/3479770 πŸ› Warning: DOMDocumentFragment::appendXML() when input is not normalized Needs work . let me know if you find more issues

Production build 0.71.5 2024