Adding links around embedded media through CKEditor might lead to invalid/complex markup when rendered

Created on 16 August 2019, over 5 years ago
Updated 24 February 2023, over 1 year ago

Problem/Motivation

The Drupal link CKEditor button can be used to add links to embedded media items. However, we can't guarantee that the rendered representation of that media item will be a single HTML element, or one that can be safely wrapped with <a> tags. As a result, editors can create links that have arbitrary HTML elements inside them.

Steps to reproduce:

1) Go to simplytest.me and spin a vanilla install of Drupal 8.8.x. Log in with admin/admin, install the "Media" and "Media Library" modules.
2) Go to /admin/config/content/formats/manage/basic_html?destination=/admin/config/content/formats and drag the "Insert from Media Library" button into the CKEditor toolbar. Enable the "Embed media" filter, and click "Save configuration".
3) Go to /admin/structure/media/manage/image/display and drag the "Authored by" field up into the visible fields area. Click "Save" to save this view display settings.
4) Go to /node/add/article to open the article node form. In the wysiwyg, click the "Insert from Media Library" button. Upload a file into the upload zone, give it an ALT text. Save and insert. At this point you will have something like this:

5) Click the media item to select it, then click the "Link" button from the CKEditor toolbar. Give it a random URL (such as https://drupal.org)
6) At this point, if you view the source markup you will have something similar to this:

<p>text before media</p>
<a href="https://drupal.org">
<drupal-media data-align="center" data-entity-type="media" data-entity-uuid="94a3ff34-a958-427c-ac51-127259cbc1e2"></drupal-media>
</a>

<p>text after media</p>

7) Fill in the required fields and save the node.
8) Inspect the markup of the node body, you will find something like this:

Proposed resolution

TBD

Remaining tasks

- Discuss and agree on a reasonable approach
- Implement it
- Review & commit

User interface changes

TBD

API changes

TBD

Data model changes

TBD

Release notes snippet

TBD

Original report from @wrd :

With the progress being made on #2801307: [META] Support WYSIWYG embedding of media entities โ†’ , I've been playing with the new capabilities and have run into what seems like an important issue for typical users: after embedding some media entities, particularly images, a user will probably expect to be able to link the media to another page. At present, this doesn't really work.

  1. Perform a Standard install.
  2. Enable the Media and Media Library modules.
  3. Edit the Basic HTML editor. Enable the "Embed media" filter, remove the "Image" button, add the "Insert from Media Library" button. <drupal-media data-entity-type data-entity-uuid> is added to the allowed HTML tags automatically.
  4. Create a new Image media entity in the Media Library.
  5. Create a new Basic Page.
  6. Use the "Insert from Media Library" button to embed an image in the body field.

  7. Click on the embedded image to select it.
  8. Click the Link button to link the image to a destination. I'll link it to https://drupal.org.

  9. When done, note that there's no visual indication that the embedded image has been linked to anything.

  10. Save and publish the node, and view it.

You'll see that the image is, indeed, linked to https://drupal.org. However, if you inspect the element, the <a> tag has been wrapped around the entire media entity:

<article data-history-node-id="1" role="article" about="/node/1" typeof="schema:WebPage" class="node node--type-page node--view-mode-full clearfix">
  <header>
    <span property="schema:name" content="Test Page 1" class="rdf-meta hidden"></span>
  </header>
  <div class="node__content clearfix">
    <div property="schema:text" class="clearfix text-formatted field field--name-body field--type-text-with-summary field--label-hidden field__item">
      <p>This is a test. </p>
      <a href="https://drupal.org">
        <article class="media media--type-image media--view-mode-full">
          <div class="field field--name-field-media-image field--type-image field--label-visually_hidden">
            <div class="field__label visually-hidden">
              Image
            </div>
            <div class="field__item">
              <a href="https://drupal88.dd:8443/sites/default/files/2019-08/pears_1.jpg">
                <img src="/sites/default/files/2019-08/pears_1.jpg" width="640" height="480" alt="Pears" typeof="foaf:Image" />
              </a>
            </div>
          </div>
        </article>
      </a>
      <p>This is only a test. </p>
    </div>  
  </div>
</article>

While we can wrap block elements in <a> tags in HTML5, the result here is problematic. The image itself is (by default) linked to the raw image file, so we've got nested hyperlinks, which AFAIK isn't permitted in HTML5. And browsers (Safari, anyway) will ultimately render this as multiple separate <a> tags, most of them empty or wrapping whitespace.

It seems like the expected default behavior for a typical user would be that after using the Link button, there would be some kind of visual indication in CKEditor that the entity is linked, and the rendered page would simply have the image linked to the destination specified using the Link button.

However, to complicate matters, we have to anticipate the possibility that the media entity being embedded is more complex than just an image. For example, what if I have a custom media type that looks like the Image type, but has a few more fields:

  • Copyright (date field)
  • Location (geolocation field)
  • Photographer (entity reference field)

...and a default view mode that displays the image, the copyright date, the location, and .

If I embed this entity, and then use the link button, the desired behavior is for the image to link to the URL entered using the link button, while leaving the link to the Photographer node untouched so that it still works.

I'm not sure if this more complex case can be handled in any reasonably generalized fashion that'll work for most users. However, the more basic behavior of simply linking an embedded image entity should probably work out of the box, shouldn't it?

I hope this is useful! Let me know if I can clarify anything.

๐Ÿ› Bug report
Status

Active

Version

10.1 โœจ

Component
Filterย  โ†’

Last updated 5 days ago

No maintainer
Created by

๐Ÿ‡บ๐Ÿ‡ธUnited States wrd

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • ๐Ÿ‡ณ๐Ÿ‡ฎNicaragua edysmp Nicaragua

    #28 Fixes my issue. Now media images can be linkable. Thanks!

  • ๐Ÿ‡บ๐Ÿ‡ธUnited States bkosborne New Jersey, USA

    I wonder if anyone has tested how this works in CKEditor 5? Maybe it's easier to accomplish there?

  • ๐Ÿ‡บ๐Ÿ‡ธUnited States scotwith1t Birmingham, AL

    Still seeing problems with this in CKEditor 5 with D10.1.2. While adjusting the order of the filters does link the image (Yay!) the markup is still broken.

    This is compounded and becomes particularly visible when using ExtLink module, where the broken markup results in multiple external link icons being added.

  • ๐Ÿ‡บ๐Ÿ‡ธUnited States cewernlund

    Just wanted to say the comment in #28 worked for me. I was thinking something in the Text Editor format might be the culprit and this proves it. Many thanks for reporting it!

  • First commit to issue fork.
  • last update about 1 year ago
    30,358 pass, 2 fail
  • @jayhuskins opened merge request.
  • ๐Ÿ‡บ๐Ÿ‡ธUnited States jayhuskins

    Here is a patch for 10.1.5 that simply removes the ability to link media.

  • ๐Ÿ‡บ๐Ÿ‡ธUnited States scotwith1t Birmingham, AL

    I don't know about others but, to me, removing the ability to link media altogether is not a viable solution. Linking media is a critical function for many sites.

  • ๐Ÿ‡ฉ๐Ÿ‡ชGermany marcoka

    @scotwith1t yes, i agree, of course that is an important function. Linking media is used everywhere :)

  • ๐Ÿ‡บ๐Ÿ‡ธUnited States loze Los Angeles

    I think all the options available in ckeditor5 (edit, link, align, caption, etc.) should all be configurable on the media bundle, or view mode.

    Is that possible?

  • ๐Ÿ‡บ๐Ÿ‡ธUnited States scotwith1t Birmingham, AL

    @loze While that would definitely be a great feature, it doesn't really fit into the scope of what's being addressed in this issue. See https://www.drupal.org/project/media_embed_view_mode_restrictions โ†’ contrib module which was created out of #3308069: Provide per-bundle configuration for embeddable view modes โ†’ as a potential solution (tho it doesn't have a D10 release yet).

  • ๐Ÿ‡ฉ๐Ÿ‡ชGermany rgpublic Dรผsseldorf ๐Ÿ‡ฉ๐Ÿ‡ช ๐Ÿ‡ช๐Ÿ‡บ

    Well, the main problem seems to be (IMHO) that way to many field templates (e.g. core/themes/stable9/templates/field/field.html.twig and many more) in Drupal contain a div tag when they should better use a span tag, I guess. A span tag would be more correct because a whole field, for example, is an inline element. There can be multiple fields in a row. As long as they are all using div tags, they can never appear inside a p tag :-(

  • ๐Ÿ‡บ๐Ÿ‡ธUnited States millerrs

    I get the same broken markup in #43 with 10.2.1. My filter order is the same as suggested in #28.

    While I have limited knowledge of CKEditor's mechanics, the proposal in #18 to utilize an attribute for the link appears feasible and aligns with the methods used for alignment and other options.

  • ๐Ÿ‡บ๐Ÿ‡ธUnited States chuck_theobald

    On Drupal 10, CKEditor 5, fully updated Media tooling:

    FWIW, the issue I am seeing is a link with no content, followed by the image.

    Seen in CKEditor Source mode:

    test media image link

    ย  โ†’

    Delivered to the browser:

    test media image link

    โ†’




    Moving "Convert line breaks into HTML (i.e.
    and

    )" to last in the processing order fixed the linking issue. My filter processing order:

    Convert URLs into links
    Advanced Insert View
    Restrict images to this site
    Align images
    Caption images
    Embed media
    Display embedded entities
    Linkit URL converter
    Correct faulty and chopped off HTML
    Convert line breaks into HTML (i.e.
    and

    )

  • ๐Ÿ‡บ๐Ÿ‡ธUnited States mortona2k Seattle

    I ran into a similar issue as above.

    I have a media view mode that displays an image linked to the file.

    When it's embedded, the link is stripped.

    The issue was with the order of filters, after moving things around the link worked.

  • ๐Ÿ‡ฉ๐Ÿ‡ชGermany drubb Sindelfingen

    I did something similar like in #28 or #54. While the image is correctly displayed and linked, there are still additional empty links before and after the image. Something like

    <p><a></a></p>
    <p><a>Rendered image</a></p>
    <p><a></a></p>
    

    It's weird!

  • ๐Ÿ‡บ๐Ÿ‡ธUnited States caspervoogt

    I was running into issues with linked media images not rendering as linked images. Disabling the 'Convert line breaks into HTML' filter fixed that for me. It was not enough for me to move it to the end of the list of filters. I also tested putting it first. Only disabling that filter solved it for me.

  • ๐Ÿ‡บ๐Ÿ‡ธUnited States chuck_theobald

    #57: ++
    Finally getting back to this and can confirm that caspervoogt's solution works for my sites as well.
    Thank you!

  • Maybe no new information here, but I'm also having this issue and being driven crazy by it, so I'll include what I've seen just in case.

    My CKE5 content is a simple <drupal-media> element with a link added and the "Convert line breaks into HTML" text filter is near the top of the list. This is what the source looks like:

    <a href="/node/106"><drupal-media data-entity-type="media" data-entity-uuid="3b055276-978a-445d-83e5-5ac8c217030a">&nbsp;</drupal-media></a>
    

    This is the rendered output:

    <div class="c-basic-text o-wysiwyg">
      <p>
        <a href="/node/106"><br></a>
      </p>
      <article class="media media--type-image media--view-mode-default">
        <p>
          <img loading="lazy" src="/system/files/styles/large/private/2024-06/img_0243_0.jpeg?itok=eeSOaOLy" width="480" height="360" alt="Kittens">
        </p>
      </article>
      <p></p>
    </div>
    

    Note the <p> that precedes the <article>. It contains an <a> that contains only a <br> and which of course results in a duplicate link (which is bad practice). None of these elements were present in the CKE5 source.

    I figured out (as others have) that if I turned off the "Convert line breaks into HTML" text filter, the image link began working as expected and the rendered output looked like this:

    <div class="c-basic-text o-wysiwyg">
      <a href="/node/106">
        <article class="media media--type-image media--view-mode-default">
          <img loading="lazy" src="/system/files/styles/large/private/2024-06/img_0243_0.jpeg?itok=eeSOaOLy" width="480" height="360" alt="Kittens">
        </article>
      </a>
    </div>
    

    So that solves it! But I don't want to turn off the filter completely because that might mess up existing content that relies on the line breaks being converted, so instead I moved the "Convert line breaks into HTML" text filter to the bottom of the filter list. That results in this rendered output:

    <div class="c-basic-text o-wysiwyg">
      <p>
        <a href="/node/106"></a>
      </p>
      <a href="/node/106" data-once="lifespan_extlink">
        <article class="media media--type-image media--view-mode-default">
          <img loading="lazy" src="/system/files/styles/large/private/2024-06/img_0243_0.jpeg?itok=eeSOaOLy" width="480" height="360" alt="Kittens">
          <br>
        </article>
      </a>
    </div>
    

    If you're playing along, you can see that we have an image link that works, but now the duplicate link is back. This time it does not have a <br> in it. Instead, a <br> mysteriously appears after the <img>. Again, neither the extra link nor the <br> were present in the CKE5 source.

    Another wrinkle: I noticed that if I add an image caption to the linked <drupal-media> element, I get this rendered output:

    <div class="c-basic-text o-wysiwyg">
      <figure role="group" class="caption">
        <a href="/node/106">
          <p></p>
          <article class="o-img-wrapper">
            <img class="media media--type-image media--view-mode-default" src="/system/files/2024-06/img_0243_0.jpeg" alt="Kittens" loading="lazy">
            <br>
          </article>
        </a>
        <p>
          <a href="/node/106" data-once="lifespan_extlink"></a>
        </p>
        <figcaption>test caption</figcaption>
      </figure>
    </div>
    

    Of note here is that the image link is correctly formed and works, but we again have a duplicate link and a <br> following the <img>.

    My default display for images does not include any link. This is happening on 10.2.x.

    Again, as others have already noted, it seems like turning the filter off if is currently the best option for getting a correctly linked image and no problematic markup.

  • ๐Ÿ‡ฉ๐Ÿ‡ชGermany drubb Sindelfingen

    Did a quick test to look at this filter in isolation, by calling the _filter_autop() function directly. This is broken indeed:

    $output = _filter_autop('<p><a href="https://google.de"><drupal-media/></a></p>')
    

    Result:

    <p><a href="https://google.de"></p>\n
      <drupal-media/></a></p>
    
  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch

    Moving this to filter module. Should be possible to write a test case given the examples above to reproduce the duplicate link, then we can try to fix it in _filter_autop

  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch

    Re-titling.

  • ๐Ÿ‡บ๐Ÿ‡ธUnited States phenaproxima Massachusetts

    Glad we got this isolated, I think with that in mind it shouldn't be too hard to write a test case.

  • ๐Ÿ‡ฉ๐Ÿ‡ชGermany drubb Sindelfingen

    The question is, does this filter make any sense at all in conjunction with CKEditor fields? I think it's primarily meant for use with plain text fields, as CKEditor does this conversion already. I can't think of a use case for fields powered by CKE, so maybe we need a hint not to use it, or prevent usage at all for CKE fields. At least it should fail silently and not destroy the markup given by CKE.

  • ๐Ÿ‡ฉ๐Ÿ‡ชGermany rgpublic Dรผsseldorf ๐Ÿ‡ฉ๐Ÿ‡ช ๐Ÿ‡ช๐Ÿ‡บ

    * IMHO, I think this is correct. The filter is only for text. It shouldn't be used for HTML. I don't see any valid use-case either. Whether it really needs to check and silently fail or whether a description would enough is anyone's guess. There are other filters which just use the issue description to specifiy their restrictions in order to work correctly ("only use after/before this and that"). They don't check and silently fail in these cases either.
    * This issue morphed into sth. else though. The original issue as still seen in the issue description was "we can't guarantee that the rendered representation of that media item will be a single HTML element". That's still a valid concern and by far isn't solved by just disabling this filter. But perhaps a new issue could be filed for this. The far more complicated problem to solve is: If a media *entity(!)* is rendered it unsurprisingly consists of entity *fields(!)*. When rendering these fields, there are DIV tags introduced at the very least because the default field item template contains DIV tags. I know this affects issues with far reaching consequences and this is probably not easy to change, but my take on this is: This is actually wrong. A DIV cannot appear inside a SPAN tag. This causes invalid HTML. Whenever a field is rendered inline you will get these problems which is kind-of sad. It could perhaps be solved without turning the whole of Drupal on its head if we said that field elements should be rendered with a special SPAN tag template just for the media entity display mode or sth. along these lines.

  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch

    The filter is only for text. It shouldn't be used for HTML.

    The output of a text format is by definition HTML. Even if you're entering markdown or something, the end result is HTML, and the auto line break filter can theoretically be applied at any point in that process. You could say it should only be applied at the beginning, before any other HTML formatting is applied from different filters, and that editors shouldn't copy and paste paragraph breaks when using a format with the automatic line break filter, but the text format system doesn't enforce this at all, and I've had plenty of sites where a mixture of handwritten and copy and pasted content gets put into the same text format.

    The question is, does this filter make any sense at all in conjunction with CKEditor fields

    The media filter can be used without ckeditor5, so the line break filter ought to be able to work with it - and comments above suggest this is broken without any involvement from ckeditor5 at all.

    There might well be a case for adding some validation to prevent the auto-line break filter from being configured alongside ckeditor5 though but that feels like a new issue to me.

  • ๐Ÿ‡จ๐Ÿ‡ฆCanada ryrye

    When faced with this issue, caspervoogt's solution https://www.drupal.org/project/drupal/issues/3075527#comment-15604529 ๐Ÿ› Adding links around embedded media through CKEditor might lead to invalid/complex markup when rendered Active to disable to the "Convert line breaks into HTML" filter worked for me.

  • ๐Ÿ‡ซ๐Ÿ‡ทFrance johnatas

    Hey!

    I'm working on a project with Drupal version 10.3.6 and using CKE5. I'm experiencing the same issue when inserting a Drupal media with a link.

    In my case, disabling the "autop" filter isn't an option, as this filter is essential for my site.
    So, I decided to work on a custom patch to resolve the issues I was facing.
    Since it seems to be working, I'm sharing it with you.

    The attached patch has completely resolved all my problems. Note that it works perfectly when the filter order is set so that "autop" comes after "media embed".

    So far, I've seen no side effectsโ€”no empty

    tags or unwanted line breaks.
    If you notice any side effects with this approach, please let me know.

Production build 0.71.5 2024