Improve interaction with HTML filter

Created on 19 June 2025, 26 days ago

Problem/Motivation

As discussed in 📌 Support additional extensions Active , it can be difficult to configure the filter as tags missing from the HTML filter can be difficult to spot.

In that discussion, four solutions were suggested:

  1. Validate on save, and warn about any missing tags for the selected Markdown flavor.
  2. Autoconfigure allowed tags based on selected flavor.
  3. Make the HTML filter optional. This would be a potential security risk.
  4. Move the HTML filter before the Markdown Easy filter.

Please see that issue for full discussion and reasoning.

Steps to reproduce

Proposed resolution

I suggest continuing discussion on the above options, but to begin work on implementation of option 1, as this is useul, even if combined with one of the other options.

To do this, add additional validation to the _markdown_easy_filter_format_form_submit hook, to validate against list of tags for the selected flavor.

Remaining tasks

  • Continue discussion
  • Add validation on filter save

User interface changes

None.

API changes

None.

Data model changes

None.

📌 Task
Status

Active

Version

2.0

Component

Code

Created by

🇮🇪Ireland lostcarpark

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

  • Issue created by @lostcarpark
  • 🇺🇸United States ultimike Florida, USA

    My thoughts on the four options (largely based on the discussion in #3493938):

    1. Validate on save, and warn about any missing tags for the selected Markdown flavor. - if we were only considering options 1-3, I think this would be the easy choice.
    2. Autoconfigure allowed tags based on selected flavor. - I'm not a fan of modifying config for users. 👎🏼
    3. Make the HTML filter optional. This would be a potential security risk. - nope. 👎🏼
    4. Move the HTML filter before the Markdown Easy filter. - this is an interesting idea. One of the tenants of my initial vision for the module was to make it "secure by default" in a way so that folks new to Markdown didn't even have to think (or know or care) about it in order to be secure. One of the things this decision drove was to use the Markdown config 'html_input' => 'strip', - so, if "Limit HTML" is moved before "Markdown Easy", it doesn't really matter what HTML is allowed, as it will be neutralized by Markdown Easy (unless the user overrides 'html_input' => 'strip', via hook_markdown_easy_config_modify()).

    So, unless I'm thinking about this wrong (which is entirely possible), I think option 1 is the best way to go. If we do this, I think it should only be a warning, not a validation error, and should depend on the selected Markdown Easy flavor. For example, if GitHub or Smörgåsbord is selected, then the warning should look something like:

    Consider adding the following HTML tags to the "Limit allowed HTML" filter configuration to fully support the Markdown Easy Flavor you selected:

    • For Task Lists: <input type checked disabled>
    • For Tables: <table> <thead> <tbody> <tr> <th class> <td class>
    • For blah: <some> <other> <tag>
    • etc...

    Thoughts?

    -mike

  • 🇺🇸United States ultimike Florida, USA

    I just remembered another bit about my thinking that the "Limit allowed HTML tags" filter should remain after Markdown Easy...

    Since Markdown is configured with 'html_input' => 'strip' by default, the "Limit HTML tags" filter is really to prevent authors from adding wild-west HTML that the design doesn't support.

    For example, by stripping out <img> tags, it forces the author to use Markdown style images - which leads to more predictable HTML. But, if the site design doesn't allow inline HTML images, they can be removed by "Limit HTML tags" regardless of what Markdown syntax the author added. Same for things like table and other elements that Markdown will happily process, but may not be wanted/allowed by the site design.

    Then again, if the user chooses to change the 'html_input' => 'strip' setting, all bets are off and they're on their own.

    -mike

  • 🇧🇪Belgium Dries

    If 'html_input' => 'strip' is active, the Markdown library already removes raw HTML from the author's input. That raises the question: why do we still need a separate “Limit allowed HTML tags” filter to clean up the output?

    In practice, the HTML filter feels redundant. We strip all raw HTML, and then we have to manually re-allow every tag that the Markdown library generates. That seems to add friction without clearly improving safety.

    It would be good to document the reason as it is a bit confusing or unclear.

    One possible reason to do this, is stricter control. The HTML filter can block certain elements like <table> or <input>, even if the Markdown flavor supports them. That can be useful for enforcing layout or editorial rules. Disabling those Markdown features might be a clearer solution.

    Another reason could be caution. We may not fully trust the Markdown library, or want an extra layer of protection in case of future bugs or supply chain issues. I assume this is the main reason, but it is not 100% clear.

    That said, it does not hurt to keep the filter in place as long as we are comfortable making that tradeoff in user experience.

    In the end, I support whatever direction you choose, as long as expert users have a way to work around it. I am warming up to the idea of having a setting to disable the guardrails. It would be nice not to maintain a custom module just to turn this off.

  • 🇮🇪Ireland lostcarpark

    I hadn't realised that the HTML stripping was also in the CommonMark library. I had assumed the part of the reason for the HTML filter was the catch any HTML in the input.

    If we retain this going forward, we should probably better document that HTML isn't allowed in the input, as the Markdown spec does offer the option of including HTML tags in a number of cases.

    I'm happy to work on this, but I think we need a bit more clarity on the correct direction before committing development time to it.

  • 🇺🇸United States ultimike Florida, USA

    I really like this discussion.

    That can be useful for enforcing layout or editorial rules.

    This was my original reason. I was thinking that site admins might want to make Markdown available to authors, but also limit what they are allowed to do with it. I've been assuming this is the 80% use case. Perhaps I'm wrong?

    Is the 80% use case users who will want/need/understand the full power of the Markdown flavor they choose?

    Images are at the front of my mind here - allowing images added via Markdown seems problematic as they would bypass Drupal's image styles and potentially break layout.

    I think what we're discussing here is what is the best default config for this module. "Best" being defined as "secure" and "serving the highest percentage of users".

    As for:

    I am warming up to the idea of having a setting to disable the guardrails. It would be nice not to maintain a custom module just to turn this off.

    As am I. How about a hidden (no UI) config variable. Something like "disable-all-warnings" (or "power-user-mode") that can be set by flipping a boolean in config. I would include documentation for doing this in the "advanced" section of the docs. If this is agreeable, I'll open a new issue for this.

    -mike

  • 🇧🇪Belgium Dries

    I will try to respond to the rest of your questions later. I am going to think about them on today's bike ride.

    I am definitely in favor of the setting idea. I like a name like skip_filter_enforcement. It is clear and descriptive. It tells you exactly what the setting does. That makes it easy to understand, easy to document, and easy to find in code or configuration.

    Thanks for being open to this idea. A hidden config key with a short explanation in the advanced documentation sounds like a great solution.

  • 🇮🇪Ireland lostcarpark

    Okay, so we might have a "skip filter enforcement" mode, which does not require the HTML filter to be enabled, and does not warn about missing tags in the filter.

    There are several options for the actual check that spring to mind:

    • List the expected tags for the Markdown "flavor" on the form.
      • Ideally the recommended list would change when you select a new flavour, without requiring a page refresh.
      • A nice to have would be some JS that checked the list of recommended tags as you input and lists missing ones.
      • Another feature to consider would be a button labelled "Apply recommended tags". Clicking it would replace the current HTML filter list with the entire list of recommended filters.
    • On form save, if all recommended tags for the flavor are not in the filter list, display a warning listing missing tags.
    • On the site status page, if there are missing tags, display a warning listing them.

    One other thing to consider is allowing the use of Markdown without 'html_input' => 'strip', so that users would be allowed to input HTML, but the HTML filter would restrict the tags they could use. A hidden setting like above would make sense for this, but I could see two approaches both being useful:

    • Allowing HTML in the input, but relying on HTML filter to prevent any undesired tags.
    • Trusting that HTML is removed by the Markdown library, so we don't need to filter the generated HTML.
  • 🇺🇸United States ultimike Florida, USA

    I created a new issue for the creation of a new setting for disabling guardrails: Create new "power user" flag to disable warnings and validation Active

    For the rest of the issue, assuming we go with option 1 from above ("Validate on save, and warn about any missing tags for the selected Markdown flavor.") I'd like to focus on the MVP and then potentially open new issues for @lostcarpark's "nice to haves". With that in mind, as we already have a "Tips" section on the Markdown Easy config form, how about we add the list of tags that need to be added to the corresponding bullet. For example:

    • GitHub-flavored Markdown includes the following extensions: Autolinks, Disallowed Raw HTML, Strikethrough, Tables, and Task Lists. Add the following tags to the "Limit allowed HTML tags" filter for full support: <del> <table> <thead> <tbody> <tr> <th class> <td class> <input type checked disabled>
    • Markdown Smörgåsbord includes everything from GitHub-flavored Markdown plus the following extensions: Footnotes, Description lists. Add the following tags to the "Limit allowed HTML tags" filter for full support: <sup id> <dl> <dt> <dd>

    Then, upon the saving of the text format, display a non-blocking warning if any tags are missing for the selected Markdown Easy flavor.

    I don't think these warnings need to appear on the Site Status page. A missing "Limit allowed HTML tags" filter warning is fine, but I don't think we need to worry about missing tags on that page.

    Finally, I'd really like to keep 'html_input' => 'strip' as-is. I've worked on too many sites where folks have copy/pasted crazy HTML from another source into a body field not understanding what the consequences are. Besides, it is (and has been for some time) documented here how to change it. Perhaps this is a discussion in a new issue for another hidden config variable in the future.

    -mike

  • 🇮🇪Ireland lostcarpark

    That all sounds pretty reasonable.

    Keeping the 'html_input' => 'strip' sounds fine to me. If anyone really wants to live dangerously, they can override with the hook.

    I think what you propose for the MVP makes sense. We could possibly add enhancements in follow up issues.

  • 🇺🇸United States ultimike Florida, USA

    @lostcarpark - based on our Slack conversation, I've added the additional <a class role> and <li class role id> attributes to fully support Markdown footnotes.

    thanks!

    -mike

  • Merge request !22Resolve #3530906 "Improve interaction with" → (Open) created by ultimike
  • 🇺🇸United States ultimike Florida, USA

    Alrighty, I think this one is ready for review. It ended up being a bit more work that I originally planned, but I think it will be useful. When a user submits a new or edited text format, if Markdown Easy is enabled and the "Limit allowed HTML" filter is missing any tags or attributes that prevent full compatibility with the selected Markdown Easy "flavor", then a non-blocking warning message appears of the form:

    For full compatibility with the selected Markdown flavor, add the following tags and attributes to the "Limit allowed HTML tags and correct faulty HTML" filter: <a class href role title> <img alt src title> <li class id role> <del> <input checked disabled type> <sup id>

    I'm thinking about changing the first few words to "For (optional) full compatibility...", but I'd like some feedback on it first.

    This check is made as part of the text format validation, so if skip_filter_enforcement is enabled (from Create new "power user" flag to disable warnings and validation Active ) then this warning will not appear.

    -mike

  • 🇮🇪Ireland lostcarpark

    The same string was repeated 5 times in AddEditFormatTest.php, so I made a small change to add constants for the messages relating to expected tags in HTML filter. I think this makes it tidier, and hopefully clearer what each message relates to.

    The rest looks good to me, but I'll carry out some additional testing.

  • 🇺🇸United States ultimike Florida, USA

    @lostcarpark - your changes are fine with me and much appreciated.

    thanks,
    -mike

Production build 0.71.5 2024