Use CDATA in XML RSS Feeds (was: Malformed XML Feeds)

Created on 2 October 2003, over 21 years ago
Updated 18 June 2023, almost 2 years ago

Problem/Motivation

http://drupal.org/node/feed

Description:
The feed has html elements in the description field for each node. This is not allowed, all blocks of html codes should be contained within a CDATA tag.

You can read more at this site, http://webservices.xml.com/pub/a/ws/2002/11/19/rssfeedquality.html.

Steps to reproduce

Proposed resolution

TBD

Remaining tasks

Explain why using CDATA over escaped HTML would be preferable. See

User interface changes

API changes

Data model changes

Release notes snippet

Feature request
Status

Closed: outdated

Version

11.0 🔥

Component
Views 

Last updated about 10 hours ago

Created by

🇺🇸United States CodeMonkeyX

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • 🇺🇸United States smustgrave

    If still a valid feature request please reopen updating issue summary per #28.

    Also could be unrelated but 🐛 Html::escapeCdataElement() not adding CDATA correctly Fixed was committed so wonder if that solved anything.

    Thanks!

  • Status changed to Active over 1 year ago
  • 🇫🇷France fgm Paris, France

    Technically, as I understand the RSS 2.0 standard, HTML in the items should (must ?) be escaped, and should still be in a CDATA section: I had to look this up when upgrading the G2 module for Drupal 10/9.

    This being said, comment #28 mentions Google Reader, which is no longer a thing, but all current RSS readers I've used (only a few, admittedly) handle this "incorrect" escaped HTML correctly these days.

    I've reset the issue to active because it still feels like we should have a clearly defined, standards-backed answer, instead of just some "it works" anecdotal info.

  • 🇳🇿New Zealand luke.stewart

    Assuming this validator is valid...

    https://validator.w3.org/feed/check.cgi?url=drupal.org%2Fnode%2Ffeed

    As of now reports that this is a valid feed.
    It recommends the following:
    description should not contain iframe tag
    description should not contain relative URL references:
    Missing atom:link with rel="self"

    Does this mean that this issue can be closed? (Again?)

  • 🇬🇧United Kingdom james.williams

    @luke.stewart If you’re seeing “description should not contain iframe tag”, then no, this is still a problem to fix. Drupal is outputting HTML in the description elements which needs escaping via CDATA tags.

  • First commit to issue fork.
  • 🇺🇸United States nicxvan

    I've been following this for a while. I found this: https://www.drupal.org/project/views_rss/issues/3409451 🐛 RSS output is no longer valid XML Fixed and the related with more discussion: https://www.drupal.org/project/views_rss/issues/3079683 💬 Add CDATA tags at description field on D8 Closed: duplicate

    I am going to create an issue fork and just wrap the description in CDATA and see if any tests complain to see if we can move this forward.

  • 🇺🇸United States nicxvan

    Aside from being curious if this will have any failures I do have two implementation questions.

    1. Do we need to detect if there is html in the description before adding the CDATA wrapper, my understanding is it is not harmful to add the CDATA wrapper everywhere.
    2. Do we need to wrap the main description in views-view-rss.html.twig? It was done in the issue I linked above, but I don't think that is actually necessary.

  • Merge request !7201Add CDATA Wrapper to item description → (Open) created by nicxvan
  • Pipeline finished with Failed
    about 1 year ago
    Total: 1032s
    #130157
  • 🇺🇸United States nicxvan

    Looks like there was at least one failure in:

    Testing Drupal\Tests\node\Functional\NodeRSSContentTest
        .E                                                                  2 / 2
        (100%)
        
        Time: 00:14.744, Memory: 4.00 MB
        
        There was 1 error:
        
        1) Drupal\Tests\node\Functional\NodeRSSContentTest::testUrlHandling
        Behat\Mink\Exception\ExpectationException: The string
        "http://localhost/subdirectory/sites/simpletest/14042105/files/root-relative"
        was not found anywhere in the HTML response of the current page.
        
        /builds/issue/drupal-3433/vendor/behat/mink/src/WebAssert.php:888
        /builds/issue/drupal-3433/vendor/behat/mink/src/WebAssert.php:363
        /builds/issue/drupal-3433/core/tests/Drupal/Tests/WebAssert.php:547
        /builds/issue/drupal-3433/core/modules/node/tests/src/Functional/NodeRSSContentTest.php:120
        /builds/issue/drupal-3433/vendor/phpunit/phpunit/src/Framework/TestResult.php:728
    

    I can look at this later if someone else doesn't get to it before me.
    There was another failure in AJAX but I think it was unrelated because rerunning it succeeded.

  • 🇫🇷France duaelfr Montpellier, France

    I did the exact same change in my own views-view-row-rss.html.twig for both the title and the description.
    While there is no issue with the title, the description gets filtered somehow.

    I tried to duplicate the description field with another name:

    <item>
      <title><![CDATA[{{ title }}]]></title>
      <link>{{ link }}</link>
      <description><![CDATA[{{ description }}]]></description>
      <anything_else_but_description><![CDATA[{{ description }}]]></anything_else_but_description>
      {% for item in item_elements -%}
        <{{ item.key }}{{ item.attributes -}}
        {% if item.value -%}
          >{{ item.value }}</{{ item.key }}>
        {% else -%}
          {{ ' />' }}
        {% endif %}
      {%- endfor %}
    </item>

    The "anything_else_but_description" field does not get filtered out.
    Does anyone know where it could happen?

  • 🇺🇸United States nicxvan

    I'm not sure why that is happening, but you shouldn't need to add cdata to the title as I don't think that can be html.

  • 🇺🇸United States nicxvan

    I did some investigation, there is some process that is converting root relative and protocol relative urls to absolute in the twig file when the cdata is not wrapping it.

    Here are the assertions with comments alluding to this:
    // Verify that root-relative URL is transformed to absolute.
    $this->assertSession()->responseContains($file_url_generator->generateAbsoluteString('public://root-relative'));
    // Verify that protocol-relative URL is left untouched.
    $this->assertSession()->responseContains($protocol_relative_url);
    // Verify that absolute URL is left untouched.
    $this->assertSession()->responseContains($absolute_url);

    I downloaded the artifacts and here are the urls after decoding:
    Protocol-relative URL
    Root-relative URL

    I compared with a passing test to confirm those do get http://localhost. Does anyone know what mechanism is converting those to absolute urls?

    I pushed up a description filtered with RAW to test how that gets output. For some reason these tests are not working locally.

  • Pipeline finished with Success
    about 1 year ago
    Total: 603s
    #132157
  • 🇺🇸United States nicxvan

    That actually worked!

    I'm not sure it actually solves the issue, I don't see the CDATA section properly.

    Other modules that use xml with CDATA add it like this:
    public function addCdata($cdata_text): void {
    $node = dom_import_simplexml($this);
    $no = $node->ownerDocument;
    $node->appendChild($no->createCDATASection($cdata_text));
    }

    And builds it from a \SimpleXMLElement object first.

    I'm not sure we need to do all of that. It's possible I'm looking at the wrong thing.

  • 🇺🇸United States nicxvan

    There is something weird going on.
    With this in the template:

    <description><![CDATA[{{ description|raw }}]]></description>
    The output looks like this and is encoded:

    <description>
    &lt;span&gt;test&lt;/span&gt;
    &lt;span&gt;&lt;span&gt;admin&lt;/span&gt;&lt;/span&gt;
    &lt;span&gt;&lt;time datetime="2023-11-19T02:52:51+00:00" title="Sunday, November 19, 2023 - 02:52"&gt;Sun, 11/19/2023 - 02:52&lt;/time&gt;
    &lt;/span&gt;
    </description>
    

    I then changed the twig template to this to confirm that I was editing the right template:
    <description>Before CDATA<![CDATA[{{ description|raw }}]]>After CDATA</description>

    Now the output looks like this
    The description content shows up twice for some reason, the before callout shows up once and the after callout shows up twice.

    <description>Before CDATA
    &lt;span&gt;adsf&lt;/span&gt;
    &lt;span&gt;&lt;span&gt;admin&lt;/span&gt;&lt;/span&gt;
    &lt;span&gt;&lt;time datetime="2023-11-19T02:52:58+00:00" title="Sunday, November 19, 2023 - 02:52"&gt;Sun, 11/19/2023 - 02:52&lt;/time&gt;
    &lt;/span&gt;
    After CDATA<![CDATA[
    <span>adsf</span>
    <span><span>admin</span></span>
    <span><time datetime="2023-11-19T02:52:58+00:00" title="Sunday, November 19, 2023 - 02:52">Sun, 11/19/2023 - 02:52</time>
    </span>
    ]]>After CDATA</description>
    
  • 🇫🇷France fgm Paris, France

    FWIW I temporarily had this problem of escaped content after the D9->D10 migration on https://osinet.fr/rss.xml, and is disappeared a few days ago when I upgraded to 10.2.4. Not sure what changed in core that could cause these differences.

  • Status changed to Needs review 12 months ago
  • 🇫🇷France duaelfr Montpellier, France

    I traced down the CDATA removal to \Drupal\Core\EventSubscriber\RssResponseRelativeUrlFilter::transformRootRelativeUrlsToAbsolute()

    I updated MR !7201 with a fix.

  • 🇺🇸United States nicxvan

    Great find!

  • Status changed to Needs work 12 months ago
  • 🇺🇸United States nicxvan

    Functional test is failing, I can't take a look at the moment.

  • 🇺🇸United States nicxvan

    Ah it's the new test: Drupal\Tests\Core\EventSubscriber\RssResponseRelativeUrlFilter

  • Pipeline finished with Failed
    12 months ago
    Total: 9902s
    #143057
  • 🇺🇸United States nicxvan

    I'm rerunning again, I'm slightly suspicious since the failing test is the one that was just updated, but running that test locally passes, trying to run all tests locally and see if I get failures.

  • Assigned to nicxvan
  • 🇺🇸United States nicxvan

    Ok I can replicate this locally now, I think I can fix this.

  • Issue was unassigned.
  • 🇺🇸United States nicxvan

    Ok this test should run, it seems to put the closing CDATA tag on a new line.

    I patched a site and ran the rss validator and it looks valid as far as the CDATA is concerned.
    It shows this error: Missing atom:link with rel="self" which is unrelated to the issue being solved here.

    I'll move to needs review once I confirm tests are passing.

  • Pipeline finished with Success
    12 months ago
    Total: 613s
    #143193
  • Status changed to Needs review 12 months ago
  • 🇺🇸United States nicxvan

    I am using this site: https://validator.w3.org/feed/#validate_by_input
    Here is the test rss:

    <?xml version="1.0" encoding="utf-8"?>
    <rss xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0" xml:base="https://mercury.ddev.site/">
      <channel>
        <title>Drush Site-Install</title>
        <link>https://mercury.ddev.site/</link>
        <description/>
        <language>en</language>
        
        <item>
      <title>test</title>
      <link>https://mercury.ddev.site/node/3</link>
      <description><![CDATA[
    <span data-entity-title-editable-id="node/3">test</span>
    
    <span><a title="View user profile." href="https://mercury.ddev.site/user/1">admin</a></span>
    
    <span><time datetime="2024-04-10T19:11:26+00:00" title="Wednesday, April 10, 2024 - 19:11">Wed, 04/10/2024 - 19:11</time>
    </span>
    
                <div class="text-content clearfix field field--name-body field--type-text-with-summary field--label-hidden field__item"><p>asdf</p></div>
          ]]></description>
      <pubDate>Wed, 10 Apr 2024 19:11:26 +0000</pubDate>
        <dc:creator>admin</dc:creator>
        <guid isPermaLink="false">3 at https://mercury.ddev.site</guid>
        </item>
    
      </channel>
    </rss>
    

    If I remove the cdata tag and use this instead

    <?xml version="1.0" encoding="utf-8"?>
    <rss xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0" xml:base="https://mercury.ddev.site/">
      <channel>
        <title>Drush Site-Install</title>
        <link>https://mercury.ddev.site/</link>
        <description/>
        <language>en</language>
        
        <item>
      <title>test</title>
      <link>https://mercury.ddev.site/node/3</link>
      <description>
    <span data-entity-title-editable-id="node/3">test</span>
    
    <span><a title="View user profile." href="https://mercury.ddev.site/user/1">admin</a></span>
    
    <span><time datetime="2024-04-10T19:11:26+00:00" title="Wednesday, April 10, 2024 - 19:11">Wed, 04/10/2024 - 19:11</time>
    </span>
    
                <div class="text-content clearfix field field--name-body field--type-text-with-summary field--label-hidden field__item"><p>asdf</p></div>
          </description>
      <pubDate>Wed, 10 Apr 2024 19:11:26 +0000</pubDate>
        <dc:creator>admin</dc:creator>
        <guid isPermaLink="false">3 at https://mercury.ddev.site</guid>
        </item>
    
      </channel>
    </rss>
    

    The validator complains about the div and html as expected.

  • Status changed to Needs work 12 months ago
  • 🇺🇸United States smustgrave

    So as a template change will need to update the other views-view-row-rss.html.twig in core.

    But also do a CR to announce to other themes to update their templates.

  • Pipeline finished with Canceled
    12 months ago
    Total: 107s
    #144990
  • Pipeline finished with Failed
    12 months ago
    Total: 568s
    #144992
  • 🇺🇸United States nicxvan

    Got some new failures, I'm taking a look at them.

  • Pipeline finished with Failed
    12 months ago
    Total: 1110s
    #145005
  • 🇺🇸United States nicxvan

    Here is the failing test:

    [31mDrupal\Tests\views\Functional\Plugin\DisplayFeedTest           0 passes             1 exceptions             
    [0mFATAL Drupal\Tests\views\Functional\Plugin\DisplayFeedTest: test runner returned a non-zero error code (2).
    [31mDrupal\Tests\views\Functional\Plugin\DisplayFeedTest           0 passes   1 fails    
    

    I am unable to replicate this failure locally at the moment.

  • 🇺🇸United States nicxvan

    I was able to replicate I am pushing up a fix:

    I'm not sure if the updated test needs more work.

    We also updated stable, let me know if that needs a follow up issue instead.

  • Pipeline finished with Success
    12 months ago
    Total: 1019s
    #145059
  • Status changed to Needs review 12 months ago
  • Status changed to RTBC 12 months ago
  • 🇺🇸United States smustgrave

    I reviewed this the other day and forgot to hit save :(

    But tweaked the CR slightly to include an examples.

    https://git.drupalcode.org/issue/drupal-3433/-/jobs/1313487 shows the test coverage

    All core templates appear to be updated.

    Believe this is good.

  • Status changed to Needs work 12 months ago
  • 🇺🇸United States nicxvan

    Temporarily moving this back to needs work while I double check something, I think I can simplify this a bit.

  • Pipeline finished with Canceled
    12 months ago
    Total: 10s
    #150667
  • Status changed to Needs review 12 months ago
  • 🇺🇸United States nicxvan

    Moving back to needs review since I changed the approach.

    I think the raw filter was not appropriate and I realized we could always wrap the description in CDATA so I moved it to the url transform.

    I then used XSS::filterAdmin to make sure no dangerous scripts get through.

    I'll bump it back to needs work if the tests don't pass.

  • Status changed to Needs work 12 months ago
  • 🇺🇸United States nicxvan

    Tests faililng

  • Pipeline finished with Canceled
    12 months ago
    #150668
  • Pipeline finished with Canceled
    12 months ago
    Total: 108s
    #150677
  • Status changed to Needs review 12 months ago
  • 🇺🇸United States nicxvan

    I cleaned up the tests further since we are wrapping the description in CDATA we don't need an extra test case, we just need to update the feed test to account for CDATA and ensure the feed is outputting CDATA (which we already put in place).

  • 🇺🇸United States nicxvan

    I also simplified the Change Record since we are no longer updating the templates.

  • Pipeline finished with Success
    12 months ago
    Total: 989s
    #150683
  • Status changed to RTBC 12 months ago
  • 🇺🇸United States smustgrave

    Even better! Whenever we don't have to update all the twig templates its definitely the route. Helps contrib themes.

    Believe the CR is still needed and what you have is good.

  • Status changed to Needs work 12 months ago
  • 🇬🇧United Kingdom alexpott 🇪🇺🌍

    Is the HTML admin created or user created. If it is user created we should be using Xss::filter() because the admin filter is still very liberal.

  • 🇺🇸United States nicxvan

    I used filterAdmin because that is what views usually uses as a final pass for rewrites and I wanted to preserve the most HTML that might have been configured as allowed on filter formats and added to the view.

    You're probably right though, cause if someone entered HTML in a plain text field then it passed through this filter it would convert it to HTML.

    I think converting this to filter could strip out some HTML that was enabled by an admin on a text format once it passes through the CDATA conversion, but that is probably better than outputting more HTML than expected.

    I'll make the change.

  • Pipeline finished with Failed
    12 months ago
    Total: 507s
    #152214
  • Pipeline finished with Failed
    12 months ago
    Total: 1106s
    #152215
  • Pipeline finished with Success
    12 months ago
    Total: 988s
    #152226
  • Status changed to Needs review 12 months ago
  • 🇺🇸United States nicxvan

    Ok I used the example in locale for which html tags to allow since the default blank filter even removes paragraph tags:

    ['a', 'abbr', 'acronym', 'address', 'b', 'bdo', 'big', 'blockquote', 'br', 'caption', 'cite', 'code', 'col', 'colgroup', 'dd', 'del', 'dfn', 'dl', 'dt', 'em', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'hr', 'i', 'ins', 'kbd', 'li', 'ol', 'p', 'pre', 'q', 'samp', 'small', 'span', 'strong', 'sub', 'sup', 'table', 'tbody', 'td', 'tfoot', 'th', 'thead', 'tr', 'tt', 'ul', 'var']

  • Status changed to RTBC 11 months ago
  • 🇺🇸United States smustgrave

    Coverage still appears to be there https://git.drupalcode.org/issue/drupal-3433/-/jobs/1385050

    Believe the feedback/change request has been addressed.

    CR probably is still useful, 50/50 since it doesn't require a twig change now.

  • First commit to issue fork.
  • Merge request !7859Resolve #3433 Reroll for 10.2 → (Open) created by paulsilva
  • Pipeline finished with Failed
    11 months ago
    Total: 836s
    #161112
  • Pipeline finished with Success
    11 months ago
    Total: 1047s
    #161666
  • 🇦🇺Australia larowlan 🇦🇺🏝.au GMT+10

    I think this is a bug, not a feature request. Which means its still eligible for 11.x/10.3.x

  • Status changed to Needs work 11 months ago
  • 🇦🇺Australia larowlan 🇦🇺🏝.au GMT+10

    Left some minor nits on the MR, feel free to self RTBC after addressing.

  • Status changed to RTBC 11 months ago
  • 🇺🇸United States nicxvan

    Thanks, I updated the comments for both instances!

  • Status changed to Needs work 11 months ago
  • 🇺🇸United States nicxvan

    ah my local is out of date, one moment while I rebase.

  • Status changed to RTBC 11 months ago
  • 🇺🇸United States nicxvan

    Ok, that is resolved now, I'll keep an eye on tests, but changing comments shouldn't affect them.

  • Pipeline finished with Failed
    11 months ago
    Total: 482s
    #176677
  • Status changed to Needs work 11 months ago
  • 🇺🇸United States nicxvan

    I'm doing a full rebase, there are a bunch of unrelated failures.

  • Pipeline finished with Canceled
    11 months ago
    #176674
  • Status changed to RTBC 11 months ago
  • 🇺🇸United States nicxvan

    Ok, I rebased on 11.x I'll keep an eye on tests.

  • Pipeline finished with Success
    11 months ago
    Total: 574s
    #176685
  • 🇺🇸United States nicxvan

    Tests are happy on 11.x.

    10.2 branch is failing for an unrelated test.

  • 🇳🇿New Zealand quietone

    Added useful links to the issue summary and updated credit.

  • Status changed to Needs review 10 months ago
  • 🇬🇧United Kingdom longwave UK

    Is piggybacking on RssResponseRelativeUrlFilter really the right thing to do? That event subscriber is only supposed to be fixing up relative URIs, and now we're extending it to convert the description to CDATA - this feels like the wrong place to be doing this. Should we be doing this when the RSS feed is generated? Or maybe we need a separate event subscriber?

  • 🇺🇸United States nicxvan

    Good point, I'll take a look later this week unless someone else gets to it first.

  • Assigned to nicxvan
  • Pipeline finished with Canceled
    10 months ago
    #195647
  • Pipeline finished with Failed
    10 months ago
    Total: 581s
    #195650
  • Issue was unassigned.
  • 🇺🇸United States nicxvan

    Got some failures, my local tests are not working at the moment so I'm working on figuring that out now.

  • 🇺🇸United States nicxvan

    Got my local working, this should fix it.

  • Pipeline finished with Failed
    10 months ago
    Total: 669s
    #195664
  • 🇺🇸United States nicxvan

    I did not update the 10.2 branch this time since I didn't create it and I'm not sure this is eligible for backport anyway.

  • Pipeline finished with Success
    10 months ago
    Total: 766s
    #195679
  • 🇺🇸United States nicxvan

    I ran test only and it's failing as expected.

  • Pipeline finished with Canceled
    10 months ago
    Total: 213s
    #195740
  • Status changed to RTBC 10 months ago
  • 🇺🇸United States smustgrave

    Removing Needs markup review as this isn't changing the twig template (assuming that's what the tag is for).

    Per the slack thread, issue summary and change record have both been updated with new approach.

    Small feedback to the MR has also been addressed.

    Test-only feature did fail before last push so coverage is there.

    LGTM

  • Pipeline finished with Success
    10 months ago
    #195744
  • 🇳🇿New Zealand quietone

    I made changes to comments, mostly about capitalization. And small changes to the change record to put the change first. I did not do a code review.

    After that I then saw that there are two new functions that do not have return types. I need a break so I am setting to NW for that. See https://www.drupal.org/docs/develop/standards/php/php-coding-standards#s... .

  • Status changed to Needs work 9 months ago
  • 🇺🇸United States nicxvan

    I'll take a look

  • Status changed to Needs review 9 months ago
  • 🇺🇸United States nicxvan

    I set string|false since https://www.php.net/manual/en/domdocument.savexml.php shows that, let me know if that should be handled differently.

  • Pipeline finished with Success
    9 months ago
    Total: 499s
    #208256
  • Status changed to Needs work 9 months ago
  • 🇺🇸United States smustgrave

    2 small comments.

  • Status changed to Needs review 9 months ago
  • 🇺🇸United States nicxvan

    Committed the suggestion and answered your question.

  • Pipeline finished with Success
    9 months ago
    Total: 658s
    #212978
  • 🇺🇸United States nicxvan

    I addressed your second comment.

  • Pipeline finished with Success
    9 months ago
    Total: 586s
    #212998
  • Status changed to RTBC 9 months ago
  • 🇺🇸United States smustgrave

    Thanks LGTM!

  • Status changed to Needs work 9 months ago
  • 🇬🇧United Kingdom longwave UK

    I think this solution is really clean, just one nit about an unused argument.

  • Status changed to RTBC 9 months ago
  • 🇬🇧United Kingdom longwave UK

    In the interests of not holding up a 20+ year old issue any longer, I fixed the nit and set it back to RTBC and will commit later if the tests pass.

  • Pipeline finished with Success
    9 months ago
    Total: 512s
    #231051
    • longwave committed ebc8e063 on 10.4.x
      Issue #3433 by nicxvan, quietone, longwave, DuaelFr, smustgrave,...
    • longwave committed eba56b0f on 11.x
      Issue #3433 by nicxvan, quietone, longwave, DuaelFr, smustgrave,...
  • Status changed to Fixed 9 months ago
  • 🇬🇧United Kingdom longwave UK

    Decided to commit this to 11.1.x and 10.4.x only as there is a chance of it being vaguely disruptive to existing sites that are using RSS and not necessarily expecting CDATA in their output here, given we wrote a change record it makes sense to put it in a minor release only.

    Still, great to see a 20+ year old bug finally fixed, and in such a clean way given we can just add the event subscriber to modify RSS responses.

    Committed and pushed eba56b0fd7 to 11.x and ebc8e0639e to 10.4.x. Thanks!

  • Automatically closed - issue fixed for 2 weeks with no activity.

  • Is there a way to avoid this xss filter? If we're constructing an rss feed whose content we trust, then there should be a way for that content to survive this event subscriber. I would've assumed that wrapping the trusted content in its own CDATA during rendering would've worked, but this new event subscriber doesn't check for existing cdata, and just filters everything without prejudice.

  • 🇬🇧United Kingdom opsdemon

    Definitely agree with this idea. Since 10.4.0 all div, style and img tags are getting stripped from our newsletter RSS feed and we ended up downgrading to 10.3 to get it working again.

    We use mailchimp and there is no option to add html structure or styling on the other end. If we could wrap our trusted, styled output from the view using a CDATA section and bypass the new filter then that would clear our upgrade path to 10.4 (which is currently blocked by this issue).

  • 🇺🇸United States nicxvan

    They are event subscribers so you should be able to override them.

  • 🇬🇧United Kingdom opsdemon

    How and where would I do that? Any pointers or examples would be much appreciated.

  • I put together this module to override those services (both the xss filter and the relative url filter, as both strip an existing cdata tag) and add in a check for cdata sections: https://www.drupal.org/project/rss_trust_cdata

    You can use that module or check the source to see how it was implemented in order to write your own.

    I still think this core service should include some ability to trust content or just opt out of rewriting, but as nicxvan said, overriding the core services will work for now.

  • 🇬🇧United Kingdom opsdemon

    Thank you for providing a module to override this behaviour.

    I installed the module, enabled it and then wrapped the output of the view (Rewrite Results) in a CDATA section:

    But looking at the rss.xml I can see that the output is still stripping the div, style and img tags.

    Is there anything else I need to do to make this work?

  • 🇧🇪Belgium Dries

    This commit appears to have broken my RSS feed at https://dri.es/rss.xml: all 'img' tags are now being stripped from the feed.

    I suspect the issue is related to this change, as it seems the 'img' tag isn't included in the allow-list.

    It is also be worth considering whether other elements like 'figcaption', 'figure', and 'video' should be preserved to maintain important content in RSS feeds.

  • 🇬🇧United Kingdom longwave UK

    Thanks for reporting, the regression with img and certain other tags is being tracked at 🐛 RssResponseCdata filtering out common html tags Active

  • Merge request !11127Remove CDATA RSS listener → (Open) created by alexpott
  • Pipeline finished with Failed
    about 2 months ago
    Total: 428s
    #416589
  • 🇬🇧United Kingdom alexpott 🇪🇺🌍

    FYI: this issue is being reverted in 🐛 RssResponseCdata filtering out common html tags Active

  • 🇫🇮Finland heikkiy Oulu

    Just wanted to add a small detail that we have been using a view where we generate the RSS feed with fields instead of the display mode. We preprocess the feed and add CDATA to the description that is basically combining two fields together as the feed description.

    After updating to 10.3 we started to see double encoding in the RSS item description.

    I was able to also fix the problem based on comment #117 and using Markup::create() to define the CDATA content. I guess the RSS system now sanitizes the input also later and using Markup::create() marks it as safe to render.

    We haven't yet updated to 10.4 so will be monitoring this issue for possible changes in the related regressions.

Production build 0.71.5 2024