Data loss in old 3x version of module: everything before last colon in every footnote is lost when using <fn> instead of [fn] syntax

Created on 2 October 2023, about 1 year ago
Updated 13 September 2024, 3 months ago

Problem/Motivation

When using angular bracket notation (<fn value=โ€ฆ textโ€ฆ>), everything before the last colon is lost when:

  • viewing (the original is still stored in the database) if filter_html is enabled
  • editing (if you save then, it will be lost) if you are using a rich text editor such as CKEditor 4 or 5

Steps to reproduce

  1. Create this content:
    <p>
        Hello, world<fn value="1" text="My first footnote: does this work?">&nbsp;</fn>.
    </p>
    
  2. You will see:
  3. Go edit it, enable the Source view and you'll see:
    <p>
        Hello, world<fn value="1" text=" does this work?">&nbsp;</fn>.
    </p>
    

Proposed resolution

Root cause: \Drupal\Component\Utility\Xss::filter() is used to guarantee safety, and its helper \Drupal\Component\Utility\Xss::attributes() calls \Drupal\Component\Utility\UrlHelper::filterBadProtocol(), which will strip everything before the colon and including the colon. Except for a few specific attributes:

            // Values for attributes of type URI should be filtered for
            // potentially malicious protocols (for example, an href-attribute
            // starting with "javascript:"). However, for some non-URI
            // attributes performing this filtering causes valid and safe data
            // to be mangled. We prevent this by skipping protocol filtering on
            // such attributes.
            // @see \Drupal\Component\Utility\UrlHelper::filterBadProtocol()
            // @see http://www.w3.org/TR/html4/index/attributes.html
            $skip_protocol_filtering = substr($attribute_name, 0, 5) === 'data-' || in_array($attribute_name, [
              'title',
              'alt',
              'rel',
              'property',
              'class',
            ]);

Solution: either switch to a data-value attribute or the title attribute?

Remaining tasks

User interface changes

API changes

Data model changes

๐Ÿ› Bug report
Status

RTBC

Version

3.1

Component

Footnotes

Created by

๐Ÿ‡ง๐Ÿ‡ชBelgium wim leers Ghent ๐Ÿ‡ง๐Ÿ‡ช๐Ÿ‡ช๐Ÿ‡บ

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

  • Issue created by @wim leers
  • ๐Ÿ‡ง๐Ÿ‡ชBelgium wim leers Ghent ๐Ÿ‡ง๐Ÿ‡ช๐Ÿ‡ช๐Ÿ‡บ
  • ๐Ÿ‡ง๐Ÿ‡ชBelgium wim leers Ghent ๐Ÿ‡ง๐Ÿ‡ช๐Ÿ‡ช๐Ÿ‡บ

    This is the root cause: ๐Ÿ› XSS attribute filtering is inconsistent and strips valid attributes Needs work . Thanks for the link, @catch!

  • ๐Ÿ‡ง๐Ÿ‡ชBelgium wim leers Ghent ๐Ÿ‡ง๐Ÿ‡ช๐Ÿ‡ช๐Ÿ‡บ
  • ๐Ÿ‡ซ๐Ÿ‡ฎFinland miikamakarainen

    Changing that the "data-value" attribute is used instead of "text" resolves the issue with data loss associated with colons.

    The patch is only for Ckeditor 5 and does not migrate any old data to the new format.

  • ๐Ÿ‡จ๐Ÿ‡ญSwitzerland ayalon

    I am coming from the old format and have thousands of old footnotes. Is there a possible fix to not lose data and having the colon fix ?

    I have to use this path https://drupal.org/node/3381594 to support the legacy update.

  • Status changed to Needs review about 1 year ago
  • Open on Drupal.org โ†’
    Core: 9.5.5 + Environment: PHP 7.4 & MySQL 5.7
    last update about 1 year ago
    Waiting for branch to pass
  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch

    @ayalon if you're trying to update from pre-ckeditor5 to ckeditor5, it is possible that the patch above will work. By 'old format' I think @miikamakarainen means the 'old ckeditor5 format', not 'the pre-ckeditor5 format' - however I haven't tested this yet.

    Marking needs review since there's a patch here.

  • Status changed to Needs work about 1 year ago
  • ๐Ÿ‡จ๐Ÿ‡ญSwitzerland ayalon

    I did some further testing. The patch fixes the issue with the colon. But the same happens with apostroph.

    CKEditor replaces an apostrophe ' with &apos;. All data after d&apos; is lost for good.

    I think we need to extend this fix.

  • ๐Ÿ‡ฑ๐Ÿ‡งLebanon layalk

    Added the fix for the apos as well.. tested it with different variations and seems to be working well.
    Note you need to disable/enable the filter to get the new attribute in the allowed HTML tags if you have the "Limit allowed HTML tags and correct faulty HTML" filter enabled as well.
    Added also the check if the value isn't set(from https://www.drupal.org/node/3381366 โ†’ )

  • ๐Ÿ‡ฑ๐Ÿ‡งLebanon layalk

    Please ignore #9 and use this one instead. My bad.

  • Merge request !11Fixes #3391021 โ†’ (Open) created by layalk
  • Open on Drupal.org โ†’
    Core: 9.5.5 + Environment: PHP 7.4 & MySQL 5.7
    last update about 1 year ago
    Waiting for branch to pass
  • Open on Drupal.org โ†’
    Core: 9.5.5 + Environment: PHP 7.4 & MySQL 5.7
    last update about 1 year ago
    Waiting for branch to pass
  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom scott_euser

    Thanks for your work on this!

    I believe we also need an update hook to convert any stored data from <fn value="" text="footnote text here"></fn> to <fn value="" data-text="footnote text here"></fn> before this change could be committed.

    There is a bit of formatting/coding standards scope creep in there, but it looks trivial so that seems fine.

  • Open on Drupal.org โ†’
    Core: 9.5.5 + Environment: PHP 7.4 & MySQL 5.7
    last update 12 months ago
    Waiting for branch to pass
  • Status changed to Needs review 12 months ago
  • Open on Drupal.org โ†’
    Core: 9.5.5 + Environment: PHP 7.4 & MySQL 5.7
    last update 12 months ago
    Waiting for branch to pass
  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom scott_euser

    I believe that should do it, though I think it needs to be a batch update hook. I suppose as is if it fails a user will have to run it again and it will have less to do until it eventually passes... but not great.

    I tested this locally using the example in the issue summary and the update hook converted the text attribute correctly to data-text.

  • ๐Ÿ‡ง๐Ÿ‡ชBelgium wim leers Ghent ๐Ÿ‡ง๐Ÿ‡ช๐Ÿ‡ช๐Ÿ‡บ

    I doubt an update hook is viable. There could be millions of revisions.

    Itโ€™d be safer to continue to support the old format, but recommend and generate only the new format.

  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom scott_euser

    Fair point. Anyways working on 4x branch for eg https://www.drupal.org/project/footnotes/issues/3405986#comment-15385557 โœจ Switch to Modal window to allow more editor control Fixed and other breaking changes from the issue queue here, so perhaps a drush command instead of upgrade hook so someone can opt-in to a 'use at own risk' upgrade path and this can target 4x as well.

    Then backporting to 3x branch eg in a follow-up can do as you suggest: support old and new formats for those not wanting to upgrade to 4x.

    Ill revert the update hook from the merge request.and after doing so will target this to 4x so breaking change like this can be okay unless others disagree with this proposed route forward.

  • Assigned to scott_euser
  • Status changed to Needs work 12 months ago
  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom scott_euser
  • Open on Drupal.org โ†’
    Core: 9.5.5 + Environment: PHP 7.4 & MySQL 5.7
    last update 12 months ago
    Waiting for branch to pass
  • Issue was unassigned.
  • Status changed to RTBC 12 months ago
  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom scott_euser

    Okay I reverted my update hook and changed to RTBC since I've now not made any chance and have confirmed that the new functionality here works as expected with data-text. Given this is now targeting 4x branch, the breaking change is fine. I created a follow-up issue here ๐Ÿ› Backport switch from text to data-text attribute, maintaining support for text attribute Active to backport to 3x which does actually then need to avoid the breaking change and continue to support text attribute as per #14.

  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom scott_euser

    Okay switching back to 3.1.x-dev as target since 4.0.x branch already uses data-text/data-value and does not have this issue.

    You can see the Roadmap here for 4.0.x stable ๐ŸŒฑ Roadmap to stable 4x branch Active

  • ๐Ÿ‡บ๐Ÿ‡ธUnited States brendanthinkshout

    I'd like to make use of this, but the patch from #10 doesn't seem to apply anymore to the latest revision of 3.1.x-dev. Is it straightforward to reroll?

  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom scott_euser

    Other than some more support for auto-converting paste from Word (not a feature anyways in 3x) is there something blocking you from upgrading out of curiosity? Homepage has instructions. I'm supporting the 4x branch and none of the previous maintainers are involved any more so I don't expect you'll get any support on the 3x branch any more unfortunately.

  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom scott_euser
  • ๐Ÿ‡บ๐Ÿ‡ธUnited States brendanthinkshout

    Thanks for the update. I guess I don't understand why, if there's no development happening on the v3 branch, the code would have changed enough to make the patch invalid! I'll have to look into it myself.

    We do want to upgrade to v4 when possible, but the issues that arose from updating to CKE 5 were so time-consuming that the client in question just wants to get existing footnotes fixed first before assessing the time required for an upgrade. Not a lot of hours available to put into the work. I'm sure the upgrade path you pointed out has taken a lot of possible formats into consideration, but there are thousands of entities using footnotes on the site and it's hard to trust that we won't end up writing custom queries and upgrade hooks to fix the edge cases.

  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom scott_euser

    We've update many sites with tens of thousands of content items without issue and hundreds of other sites have upgraded according to the module stats so maybe give it a shot. It's just a matter of running the drush command in a local site.

    Sorry I can't support multiple versions but it's mostly in my free time before my daughter wakes up each morning and I'm supporting a fair number of other modules as well. Thanks for your understanding.

Production build 0.71.5 2024