Migration strips out all but one entity embed

Created on 4 December 2024, 7 months ago

Problem/Motivation

When a text area contains more than one entity embed, the migration removes all but one of them, leaving only a single migrated core media embed. This is a serious problem, because you cannot assume that every text area contains only a single entity embed.

Steps to reproduce

  1. Create new content with a text area that uses a WYSIWYG editor with an entity embed button, and embed several different images and videos.
  2. Enable the `convert_entity_media_embed` module.
  3. Keep running cron until all eligible entities have been converted.
  4. Check the content created in step 1 and notice that only a single entity embed has been migrated to a core media embed, while the others have been stripped out.

Before running the migration.

After running the migration. I also just noticed that some text was stripped out, so this really is a huge bug.

Proposed resolution

All entity embed markup should be migrated to core media embed markup without losing anything.

🐛 Bug report

Status: Active
Version: 1.0
Component: Code
Created by: 🇺🇸United States mcannon Philadelphia, PA


Comments & Activities

  • Issue created by @mcannon
  • 🇺🇸United States mcannon Philadelphia, PA

    I've been doing a lot of testing today, and most of the time the result has been the same. In my last attempt I did not switch the editor from CKEditor 4 to CKEditor 5 before running through all of the cron migrations, and that seemed to work. So I think a good troubleshooting note to include in the README file would be that the migrations should be run before changing the CKEditor version from 4 to 5.

  • 🇺🇸United States inversed

    Are you certain that the embed tags were removed from the content? Using something like the diff module can help you confirm this. I have an issue that looks the same as your screenshots, but where not all of the `<drupal-entity>` tags were converted to `<drupal-media>`. As a result, my CKEditor was breaking and content was hidden on the full content display wherever the opening and closing tags did not match.

  • Status changed to Postponed: needs info 5 months ago
  • 🇳🇱Netherlands eelkeblok Netherlands 🇳🇱

    If you have a suggestion for the wording of that part of the README, merge requests are welcome. Fundamentally, the WYSIWYG/RTE editor doesn't have much to do with what this module does. However, removing support for the drupal-entity tags from the input filter (and from the editor configuration, when one is linked to the input filter) means the editor no longer recognizes the tags and may filter them out as unknown. It would be critical not to save the resulting content, because that would wipe out the old tags before the module gets a chance to convert them. Also, it would be good to confirm at an HTML level (e.g. by looking in the database) whether the tags were removed or "just" not converted. You may have found another case of the module failing to recognize a certain flavour of entity embed tags.

    Disclaimer: this is just my theory of what might have happened.
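
The check suggested above (were the tags removed, or only left unconverted?) can be approximated by counting embed tags in the stored field value before and after the migration. The following is a minimal Python sketch, not part of the module; the function names `count_embed_tags` and `classify` and the sample markup are hypothetical, and the field values are assumed to have been copied out of the database by hand.

```python
import re


def count_embed_tags(html: str) -> dict:
    """Count opening and closing embed tags in one stored field value."""
    counts = {}
    for tag in ("drupal-entity", "drupal-media"):
        # An opening tag is "<tag" followed by whitespace (including a
        # newline) or ">", so "</tag>" is never counted here.
        counts[f"<{tag}>"] = len(re.findall(rf"<{tag}[\s>]", html))
        counts[f"</{tag}>"] = len(re.findall(rf"</{tag}\s*>", html))
    return counts


def classify(before: str, after: str) -> str:
    """Rough verdict on what happened to the embeds (hypothetical labels)."""
    b = count_embed_tags(before)
    a = count_embed_tags(after)
    total_before = b["<drupal-entity>"] + b["<drupal-media>"]
    total_after = a["<drupal-entity>"] + a["<drupal-media>"]
    if total_after < total_before:
        return "removed"  # some embeds are gone from the markup
    if a["<drupal-entity>"] > 0:
        return "not fully converted"  # old tags are still present
    return "converted"


# Hypothetical sample values, shortened for readability.
before = (
    '<p>A</p><drupal-entity data-entity-uuid="u1"></drupal-entity>'
    '<drupal-entity data-entity-uuid="u2"></drupal-entity>'
)
after = (
    '<p>A</p><drupal-media data-entity-uuid="u1"></drupal-entity>'
    '<drupal-entity data-entity-uuid="u2"></drupal-media>'
)
print(count_embed_tags(after))
print(classify(before, after))
```

A mismatch between the opening and closing counts for the same tag name points at the partially converted case described by @inversed rather than at removed content.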

  • 🇨🇦Canada teknocat

    I'm trying to use this module on a site with CKEditor 5. It appeared to have replaced only one of the embeds, so I took one specific field value and compared its HTML from before the migration with its HTML from after the migration.

    Here's a beautified version of the HTML BEFORE migration:

    ```

    The Mackenzie Valley Land and Water Board (MVLWB) is a regulatory
    authority that originates from Part 4 of the Mackenzie Valley
    Resource Management Act
     (MVRMA). The roles and responsibilities of
    the MVLWB include:

    • Reviewing and making decisions on transboundary projects;
    • Ensuring consistent application of the MVRMA up and down the Mackenzie
      Valley; and,
    • Reviewing and making decisions on applications filed in the regions where
      land claims have not been settled.

    The MVLWB meets at least once per year. The Section 103 Panel meets
    regularly to make decisions respecting applications in the regions where land
    claims have not been settled or for transboundary applications.

    The MVLWB consists of:

    • The MVLWB Chairperson, nominated by the majority of the members and
      appointed by the Minister of Indigenous and Northern Affairs Canada (INAC);
    • Five members of the Sahtu Land and Water Board;
    • Five members of the Gwich’in Land and Water Board;
    • Five members of the Wek'èezhìı Land and Water Board; and,
    • Four members appointed pursuant to Section 99 of the MVRMA.

    All members are appointed by the Minister of INAC, except for the Tłı̨chǫ
    nominees to the WLWB, who are appointed by the Tłı̨chǫ Government (TG). All
    members appointed are members of the MVLWB. 

     

    data-entity-type="media"
    data-entity-uuid="0d8862c2-8a52-428a-807b-aab378695683"
    data-embed-button="media_browser" data-entity-embed-display="media_image"
    data-entity-embed-display-settings="{"image_style":"medium","image_link":""}">
     

    Tanya MacIntosh - Chair

    Lesley Allen - Canada

    Debbie Watsyk - Dehcho First Nations

    Cathie Bolstad - Akaitcho Territory Government

     

    data-entity-type="media"
    data-entity-uuid="d01c95d9-2d9f-4556-8305-8473e7840361"
    data-embed-button="media_browser" data-entity-embed-display="media_image"
    data-entity-embed-display-settings="{"image_style":"medium","image_link":""}">
     

    Elizabeth Wright - Chair

    Deanna Smith – GTC

    William Koe – GNWT

    Roger Fraser – GTC

     

    data-entity-type="media"
    data-entity-uuid="ab51cb64-e63f-4adb-a006-109ec99ca354"
    data-embed-button="media_browser" data-entity-embed-display="media_image"
    data-entity-embed-display-settings="{"image_style":"medium","image_link":""}">
     

    Philippe di Pizzo – GNWT

    Violet Doolittle – Canada

    Gina Dolphus - SSI

     

    data-entity-type="media"
    data-entity-uuid="f6934bbf-4195-4124-9cf5-08e41969202c"
    data-embed-button="media_browser" data-entity-embed-display="media_image"
    data-entity-embed-display-settings="{"image_style":"large","image_link":""}">
     

    Mason Mantla – Chair 

    Mike Nitsiza – GNWT

    Rachel Crapeau – Canada

    Jocelyn Zoe – TG

    Regan Jeremick’ca – TG

    The MVLWB is supported by a Chairs Committee, an Executive Directors
    Committee, and staff from the MVLWB Yellowknife office and from the offices of
    the Regional Boards, as required.  The offices of the GLWB, SLWB,
    and WLWB are located in Inuvik, Fort Good Hope, and Wekweètì respectively. The
    WLWB has a second office in Yellowknife.

    Updated: 13/05/2025

    ```

    And here's the beautified version AFTER migration:

    ```

    The Mackenzie Valley Land and Water Board (MVLWB) is a regulatory
    authority that originates from Part 4 of the Mackenzie Valley
    Resource Management Act
     (MVRMA). The roles and responsibilities of
    the MVLWB include:

    • Reviewing and making decisions on transboundary projects;
    • Ensuring consistent application of the MVRMA up and down the Mackenzie
      Valley; and,
    • Reviewing and making decisions on applications filed in the regions where
      land claims have not been settled.

    The MVLWB meets at least once per year. The Section 103 Panel meets
    regularly to make decisions respecting applications in the regions where land
    claims have not been settled or for transboundary applications.

    The MVLWB consists of:

    • The MVLWB Chairperson, nominated by the majority of the members and
      appointed by the Minister of Indigenous and Northern Affairs Canada (INAC);
    • Five members of the Sahtu Land and Water Board;
    • Five members of the Gwich’in Land and Water Board;
    • Five members of the Wek'èezhìı Land and Water Board; and,
    • Four members appointed pursuant to Section 99 of the MVRMA.

    All members are appointed by the Minister of INAC, except for the Tłı̨chǫ
    nominees to the WLWB, who are appointed by the Tłı̨chǫ Government (TG). All
    members appointed are members of the MVLWB. 

     

    data-entity-uuid="0d8862c2-8a52-428a-807b-aab378695683"
    data-view-mode="media_image"> 

    Tanya MacIntosh - Chair

    Lesley Allen - Canada

    Debbie Watsyk - Dehcho First Nations

    Cathie Bolstad - Akaitcho Territory Government

     

    data-entity-uuid="d01c95d9-2d9f-4556-8305-8473e7840361"
    data-view-mode="media_image"> 

    Elizabeth Wright - Chair

    Deanna Smith – GTC

    William Koe – GNWT

    Roger Fraser – GTC

     

    data-entity-uuid="ab51cb64-e63f-4adb-a006-109ec99ca354"
    data-view-mode="media_image"> 

    Philippe di Pizzo – GNWT

    Violet Doolittle – Canada

    Gina Dolphus - SSI

     

    data-entity-uuid="f6934bbf-4195-4124-9cf5-08e41969202c"
    data-view-mode="media_image">

    Mason Mantla – Chair 

    Mike Nitsiza – GNWT

    Rachel Crapeau – Canada

    Jocelyn Zoe – TG

    Regan Jeremick’ca – TG

    The MVLWB is supported by a Chairs Committee, an Executive Directors
    Committee, and staff from the MVLWB Yellowknife office and from the offices of
    the Regional Boards, as required.  The offices of the GLWB, SLWB,
    and WLWB are located in Inuvik, Fort Good Hope, and Wekweètì respectively. The
    WLWB has a second office in Yellowknife.

    Updated: 13/05/2025

    ```

    As you can see, it converted the first opening tag and the last closing tag, but none of the opening or closing tags in between. The issue is with the `(.*)` parts of the regex that matches the drupal-entity tags: they are greedy and will match ANYTHING at all, so they do not stop at the '>' character that ends the opening tag.

    The solution is to change this regex:

    ```
    /( )?<\/drupal-entity>/
    ```

    To this:

    ```
    /(?: )?<\/drupal-entity>/
    ```

    The added question marks make those parts of the expression non-greedy. Here's the result of that same conversion after making that change to the regex:

    ```

    The Mackenzie Valley Land and Water Board (MVLWB) is a regulatory
    authority that originates from Part 4 of the Mackenzie Valley
    Resource Management Act
     (MVRMA). The roles and responsibilities of
    the MVLWB include:

    • Reviewing and making decisions on transboundary projects;
    • Ensuring consistent application of the MVRMA up and down the Mackenzie
      Valley; and,
    • Reviewing and making decisions on applications filed in the regions where
      land claims have not been settled.

    The MVLWB meets at least once per year. The Section 103 Panel meets
    regularly to make decisions respecting applications in the regions where land
    claims have not been settled or for transboundary applications.

    The MVLWB consists of:

    • The MVLWB Chairperson, nominated by the majority of the members and
      appointed by the Minister of Indigenous and Northern Affairs Canada (INAC);
    • Five members of the Sahtu Land and Water Board;
    • Five members of the Gwich’in Land and Water Board;
    • Five members of the Wek'èezhìı Land and Water Board; and,
    • Four members appointed pursuant to Section 99 of the MVRMA.

    All members are appointed by the Minister of INAC, except for the Tłı̨chǫ
    nominees to the WLWB, who are appointed by the Tłı̨chǫ Government (TG). All
    members appointed are members of the MVLWB. 

     

    data-entity-uuid="0d8862c2-8a52-428a-807b-aab378695683"
    data-view-mode="media_image">

    Tanya MacIntosh - Chair

    Lesley Allen - Canada

    Debbie Watsyk - Dehcho First Nations

    Cathie Bolstad - Akaitcho Territory Government

     

    data-entity-uuid="d01c95d9-2d9f-4556-8305-8473e7840361"
    data-view-mode="media_image">

    Elizabeth Wright - Chair

    Deanna Smith – GTC

    William Koe – GNWT

    Roger Fraser – GTC

     

    data-entity-uuid="ab51cb64-e63f-4adb-a006-109ec99ca354"
    data-view-mode="media_image">

    Philippe di Pizzo – GNWT

    Violet Doolittle – Canada

    Gina Dolphus - SSI

     

    data-entity-uuid="f6934bbf-4195-4124-9cf5-08e41969202c"
    data-view-mode="media_image">

    Mason Mantla – Chair 

    Mike Nitsiza – GNWT

    Rachel Crapeau – Canada

    Jocelyn Zoe – TG

    Regan Jeremick’ca – TG

    The MVLWB is supported by a Chairs Committee, an Executive Directors
    Committee, and staff from the MVLWB Yellowknife office and from the offices of
    the Regional Boards, as required.  The offices of the GLWB, SLWB,
    and WLWB are located in Inuvik, Fort Good Hope, and Wekweètì respectively. The
    WLWB has a second office in Yellowknife.

    Updated: 13/05/2025

    ```
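
The greedy versus non-greedy behaviour described above can be reproduced in isolation. This is a minimal Python sketch using the `re` module as a stand-in for PHP's `preg_*` functions; the two patterns and the sample markup are simplified, hypothetical stand-ins and not the module's actual regex.

```python
import re

# Hypothetical field value with two embeds and some text between them.
html = (
    '<drupal-entity data-entity-uuid="aaa"></drupal-entity>'
    '<p>Text between the embeds</p>'
    '<drupal-entity data-entity-uuid="bbb"></drupal-entity>'
)

# Greedy: (.*) grabs as much as it can, so one match spans both embeds.
GREEDY = re.compile(r'<drupal-entity(.*)>(.*)</drupal-entity>')
# Non-greedy: (.*?) stops at the first possible '>' and closing tag.
LAZY = re.compile(r'<drupal-entity(.*?)>(.*?)</drupal-entity>')

greedy_matches = GREEDY.findall(html)
lazy_matches = LAZY.findall(html)

# Rewrite each match as a media embed, keeping the captured attributes.
replacement = r'<drupal-media\1></drupal-media>'
greedy_converted = GREEDY.sub(replacement, html)
lazy_converted = LAZY.sub(replacement, html)

print(len(greedy_matches), len(lazy_matches))
print(greedy_converted)
print(lazy_converted)
```

With the greedy pattern, only the first opening tag and the last closing tag are rewritten and the tags in between are swallowed into the first capture group, which matches the symptom reported in this issue. The non-greedy pattern converts each embed separately.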

    However, I should also mention that when it changes data-entity-embed-display into data-view-mode, the result is no good, because the embed display is not necessarily a valid view mode. In this case, the source entities render the image using an image style, so the embed display of "media_image" works in conjunction with the entity display settings to achieve the final result. So this needs more work than just fixing the regex in order to be useful: it needs to check whether the embed display is a valid view mode and, if so, use it; otherwise it should use the default view mode instead.

    Of course every site will have some differences and unique configuration that will make it impossible to fully automatically migrate it to the desired end result with one module, but with a couple of little improvements this module can be very helpful and get you most of the way there.
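
The view mode fallback proposed above amounts to a simple membership check. This is a minimal Python sketch of that logic only; the function name `resolve_view_mode`, the set of view mode names, and the `"default"` fallback value are assumptions for illustration. In the module itself, the list of valid view modes would have to come from the site's configuration.

```python
def resolve_view_mode(embed_display: str, valid_view_modes: set,
                      default: str = "default") -> str:
    """Return the embed display if it names a valid view mode, else the default."""
    if embed_display in valid_view_modes:
        return embed_display
    return default


# Hypothetical view modes configured for the media entity type.
view_modes = {"default", "full", "embedded"}

# "media_image" is an embed display, not a view mode, so it falls back.
print(resolve_view_mode("media_image", view_modes))
# A value that is a real view mode is kept as-is.
print(resolve_view_mode("embedded", view_modes))
```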

  • @teknocat opened merge request.
  • 🇨🇦Canada teknocat

    Closing this as a duplicate of "Markup having new lines in tags not matched" (https://www.drupal.org/project/convert_entity_media_embed/issues/3460653), since that resolves both this and another related issue and is thus the more robust solution that should be incorporated into a new release.
