CKE5 validator false negative on wildcard elements ("the following elements are missing")

Created on 25 November 2024, 2 months ago

Problem/Motivation

When adding the "elements" to a CKE5 plugin definition, you might want to use a wildcard to, for example, whitelist a set of classes on all HTML elements:

my_plugin:
  ckeditor5:
    plugins: [ htmlSupport.GeneralHtmlSupport ]
    config:
      htmlSupport:
        allow:
          - name:
              regexp:
                pattern: /.*/
            classes: [my, whitelisted, classes]
  drupal:
    label: Whitelisted elements
    library: core/ckeditor5.htmlSupport
    elements:
      - <* class="my whitelisted classes">

This is not such an unusual scenario: think, for example, of keeping backwards compatibility with CKE4 while retaining some control over the markup.

But this fails constraint validation. It seems that Drupal core's CKE5 plugins already whitelist two other attributes (lang and dir) on the <*> wildcard, and the validator appears to look for an exact match: it won't consider <* class="my whitelisted classes"> to be a valid subset of <* dir="ltr rtl" lang class="my whitelisted classes">.

The message:

The current CKEditor 5 build requires the following elements and attributes:
(...) <* dir="ltr rtl" lang class="my whitelisted classes"> (...)
The following elements are missing:
<* class="my whitelisted classes">

Steps to reproduce

- Create a text format with the "Limit allowed HTML tags and correct faulty HTML" filter and CKE5 enabled.
- Create a custom plugin that whitelists classes on all HTML elements by using the <* class="my class"> notation on the elements property.
- Try to save the format. It will fail as it won't pass constraint validation.

Proposed resolution

The validator should consider <* class="my whitelisted classes"> to be valid even if more attributes are whitelisted on the same wildcard definition, for example <* dir="ltr rtl" lang class="my whitelisted classes">.
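
A minimal sketch of the subset check described above, in PHP. The function name attributes_are_subset() and the array shape (attribute => TRUE for "any value", or attribute => [value => TRUE, ...] for specific values) are assumptions for illustration, loosely modeled on the restrictions array used by filter_html. This is not the code of core's validator.

```php
<?php

/**
 * Hypothetical helper: is every attribute/value in $needed also in $allowed?
 */
function attributes_are_subset(array $needed, array $allowed): bool {
  foreach ($needed as $attribute => $values) {
    // A missing or explicitly prohibited (FALSE) attribute is never allowed.
    $permitted = $allowed[$attribute] ?? FALSE;
    if ($permitted === FALSE) {
      return FALSE;
    }
    // TRUE means any value is allowed for this attribute.
    if ($permitted === TRUE) {
      continue;
    }
    // Only specific values are allowed: every needed value must be listed.
    if (!is_array($values) || array_diff_key($values, $permitted) !== []) {
      return FALSE;
    }
  }
  return TRUE;
}

// What the plugin declares: <* class="my whitelisted classes">.
$plugin = [
  'class' => ['my' => TRUE, 'whitelisted' => TRUE, 'classes' => TRUE],
];

// The combined wildcard: <* dir="ltr rtl" lang class="my whitelisted classes">.
$combined = [
  'dir' => ['ltr' => TRUE, 'rtl' => TRUE],
  'lang' => TRUE,
  'class' => ['my' => TRUE, 'whitelisted' => TRUE, 'classes' => TRUE],
];

var_dump(attributes_are_subset($plugin, $combined)); // bool(true)
```

With a check along these lines, the plugin's wildcard would be accepted as part of the larger definition, while the reverse comparison (the combined definition against the plugin's alone) would still fail.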

Remaining tasks

User interface changes

None

Introduced terminology

API changes

Data model changes

Release notes snippet

πŸ› Bug report
Status

Active

Version

10.5 ✨

Component

ckeditor5.module

Created by

🇪🇸 Spain idiaz.roncero Madrid


Comments & Activities

  • Issue created by @idiaz.roncero
  • 🇺🇸 United States justcaldwell Austin, Texas

    Just looking at the examples from the core ckeditor5 module, your example seems close, but maybe it should be something more like:

    my_plugin:
      ckeditor5:
        plugins: [ htmlSupport.GeneralHtmlSupport ]
        config:
          htmlSupport:
            allow:
              - name: ~
                classes: [my, whitelisted, classes]
      drupal:
        label: Whitelisted elements
        library: core/ckeditor5.htmlSupport
        elements:
          - <* class="my whitelisted classes">
    
  • 🇳🇿 New Zealand quietone

    Changes are made on 11.x (our main development branch) first, and are then backported as needed according to our policies.

  • 🇪🇸 Spain idiaz.roncero Madrid

    @justcaldwell yep, that might work for the CKEditor whitelisting part, but the problem I'm reporting is related to the Drupal-specific filter_html rules, that is, the elements: <* class="my whitelisted classes"> part.

  • I did some code diving to get some background information for this.

    TL;DR:
    "dir" and "lang" are allowed by the filter directly, in the same way that "style" and "on*" are prohibited. Other attributes, which are allowed on any tag via configuration or plugin definition, don't have that privilege.

    Relevant code:
    The validation message is thrown in:
    core/modules/ckeditor5/src/Plugin/Validation/Constraint/FundamentalCompatibilityConstraintValidator.php::checkHtmlRestrictionsMatch()
    The validation compares the provided HTML (tags, attributes and values from the configuration and plugin definitions) with the HTML that is allowed by the "filter_html" filter ("Limit allowed HTML tags and correct faulty HTML").
    core/modules/filter/src/Plugin/Filter/FilterHtml.php::getHTMLRestrictions()
    That method collects all the allowed (and prohibited) HTML. At the very end, the following is added in any case:

    $restrictions['allowed']['*'] = [
        'style' => FALSE,
        'on*' => FALSE,
        'lang' => TRUE,
        'dir' => ['ltr' => TRUE, 'rtl' => TRUE],
    ];
    

    At the moment, wildcard definitions like "<* class>" don't even arrive at the beginning of the method. If they did, the code above would overwrite any definition for the * tag.

    Further thoughts:
    As a workaround, someone could patch in additional lines modeled on the one for "lang" (to allow any class) or the one for "dir" (to allow specific classes). I wouldn't recommend this as a solution for this issue though, since it doesn't generalize to every use case.

    I think what needs to be done is to find out why the * tags are not arriving at the beginning of the getHTMLRestrictions() method.
    After that, it should just be a matter of making sure that they survive until the end of the method.
    At the end, the definition for $restrictions['allowed']['*'] should be merged instead of overwritten, to keep whatever was collected so far while still maintaining the prohibition of "style" and "on*" as well as the permission for "lang" and "dir".
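
The merge suggested in the last point could look roughly like the sketch below. It assumes the wildcard restrictions have already survived to the end of the method. The variable $forced and the sample 'class' and 'style' entries are made up for illustration, and this is not a tested patch against FilterHtml.

```php
<?php

// Assumption: wildcard restrictions collected earlier in the method.
$restrictions = [
  'allowed' => [
    '*' => [
      'class' => ['my' => TRUE, 'whitelisted' => TRUE, 'classes' => TRUE],
      // Deliberately unsafe entry, to show that it gets overruled below.
      'style' => TRUE,
    ],
  ],
];

// The values the filter always enforces on the wildcard.
$forced = [
  'style' => FALSE,
  'on*' => FALSE,
  'lang' => TRUE,
  'dir' => ['ltr' => TRUE, 'rtl' => TRUE],
];

// The array union operator keeps the left operand's value when a key exists
// on both sides, so the forced values win and everything else is preserved.
$restrictions['allowed']['*'] = $forced + ($restrictions['allowed']['*'] ?? []);
```

Putting $forced on the left of the union is what keeps "style" and "on*" prohibited even if a plugin tried to allow them, while the collected "class" definition is carried through untouched.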
