- Issue created by @wlofgren
- 🇬🇧United Kingdom opsdemon
I'm having the same issue. Since 10.4.0 all styles and images are getting stripped from our newsletter RSS feed and we ended up downgrading to 10.3 to get it working again.
Is there any workaround for this?
It's obviously a temporary measure, but you could apply the reverse of the feature patch.
git diff ebc8e0639e0e12391087d9abe0ad8eca0034d1ea ebc8e0639e0e12391087d9abe0ad8eca0034d1ea~1 >patches/3497758.patch
I've attached that patch.
- 🇬🇧United Kingdom opsdemon
After some more investigation I found a workaround for the issue here: views_rss/issues/3079683#comment-15110037 💬 Add CDATA tags at description field on D8 Closed: duplicate
Applying the fix to the .theme file fixed the issue and the newsletter formatting looks ok now.
I ended up extending the RssResponseCdata class and register a service that decorates it in my module.
module_name.services.yml
services: module_name.response_filter.rss.cdata: class: Drupal\module_name\EventSubscriber\RssResponseCdata decorates: response_filter.rss.cdata decoration_priority: 10
src/EventSubscriber/RssResponseCdata.php
<?php namespace Drupal\module_name\EventSubscriber; use Drupal\Component\Utility\Xss; use Drupal\Core\EventSubscriber\RssResponseCdata as PrimaryRssResponseCdata; use Symfony\Component\EventDispatcher\EventSubscriberInterface; use Symfony\Component\HttpKernel\Event\ResponseEvent; use Symfony\Component\HttpKernel\KernelEvents; /** * Subscribes to wrap RSS descriptions in CDATA. */ class RssResponseCdata extends PrimaryRssResponseCdata implements EventSubscriberInterface { /** * Converts description node to CDATA RSS markup. * * @param string $rss_markup * The RSS markup to update. * * @return string|false * The updated RSS XML or FALSE if there is an error saving the xml. */ protected function wrapDescriptionCdata(string $rss_markup): string|false { $rss_dom = new \DOMDocument(); // Load the RSS, if there are parsing errors, abort and return the unchanged // markup. $previous_value = libxml_use_internal_errors(TRUE); $rss_dom->loadXML($rss_markup); $errors = libxml_get_errors(); libxml_use_internal_errors($previous_value); if ($errors) { return $rss_markup; } foreach ($rss_dom->getElementsByTagName('item') as $item) { foreach ($item->getElementsByTagName('description') as $node) { $html_markup = $node->nodeValue; if (!empty($html_markup)) { $tag_list = [ 'a', 'abbr', 'acronym', 'address', 'b', 'bdo', 'big', 'blockquote', 'br', 'caption', 'cite', 'code', 'col', 'colgroup', 'dd', 'del', 'dfn', 'div', 'dl', 'dt', 'em', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'hr', 'i', 'img', 'ins', 'kbd', 'li', 'ol', 'p', 'pre', 'q', 'samp', 'small', 'span', 'strong', 'sub', 'sup', 'table', 'tbody', 'td', 'tfoot', 'th', 'thead', 'tr', 'tt', 'ul', 'var' ]; $html_markup = Xss::filter($html_markup, $tag_list); $new_node = $rss_dom->createCDATASection($html_markup); $node->replaceChild($new_node, $node->firstChild); } } } return $rss_dom->saveXML(); } }
- 🇺🇸United States nicxvan
You can also use the module posted at the end of the parent issue.
It's easy to override, but as you can imagine we couldn't allow all html by default for security.
At least we get valid rss feeds now, and if you need more you can decorate or override.
- 🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺
I too am getting hit by this on my blog 😅 This explains why mysteriously I couldn't get feeds to work correctly after migrating from Drupal 7 to 10!
The change record → states:
[…] add the CDATA directive and provide XSS Filtering. Since this can be user entered data we use strict filtering.
… but in my case, this is not data entered by an untrusted user. And isn't that quite often the case? It's content filtered through a text format, so any images that appear are explicitly allowed by the text format.
AFAICT this was not considered in ✨ Use CDATA in XML RSS Feeds (was: Malformed XML Feeds) Closed: outdated 🫣
How often are RSS feeds created that expose data from untrusted user input (e.g. a feed of comments) vs that what is the very purpose of the site (blog, photos, products …)?
- 🇺🇸United States nicxvan
It was considered, initially I wanted to use the admin filter and @alexpott recommended a more strict filter since users can override it.
I'd be happy changing the core if it passes security muster or we can release a module that overrides it like referenced so it's more secure in core and more eyes open contrib if you want more permissive.
- 🇬🇧United Kingdom longwave UK
@Dries also reported this over in #3433-118: Use CDATA in XML RSS Feeds → .
If we add
<img>
and<div>
to the list of allowed tags, is that good enough as a stopgap? We can consider widening the filtering later but it feels like that needs more investigation. - 🇺🇸United States nicxvan
That seems more reasonable to add, are there other tags we should add?
- 🇬🇧United Kingdom opsdemon
In my case, the style tag was also needed as there is no way to style our newsletter items at the other end in mailchimp.
- 🇧🇪Belgium Dries
I believe allowing the
img
anddiv
tags would resolve many of the issues, but not all of them.For example, on my site, videos are being incorrectly filtered out. My implementation for embedding videos uses an
iframe
, which is a common approach recommended by providers like YouTube (see Google's YouTube documentation). I believe that Drupal's oEmbed feature relies oniframe
, as this is the standard method for embedding external content.iframes
are interesting because they can be used for embedding malicious content or phishing. But of course, so can thea
orimg
tags.I tend to agree with Wim's comment in #7: why do we need additional filtering beyond what already passes Drupal's text formatters and output filters? Don't they do the job of balancing flexibility and security? It seems like RSS feeds should not require filtering beyond what is applied to regular HTML pages.
If there are specific security concerns, it would be great to have those clarified or documented to help everyone figure out a solution, or understand the solution.
- 🇺🇸United States nicxvan
I think there are some alternatives:
- We can use XSS:filterAdmin
- We can create a contrib module that overrides this and gives more flexibility
- We can add a form to configure this in Drupal directly
The first option is the direction I took in the original issue, that was considered too risky, do we want to add a form / permission and configuration to allow overriding this? Or does that belong in contrib?
- 🇧🇪Belgium Dries
Why not the following option: 'Use Drupal's text formatters and output filters already configured for the fields'?
- 🇺🇸United States nicxvan
When that event fires we don't necessarily know the source, I bet we could let you choose a text formatter to use then you could configure it however you wanted. There are still concerns there if you have untrusted content in the RSS feed and you use a text format that is more trusted.
That might work though.
- 🇺🇸United States nicxvan
Ok so this probably needs tests.
Also a review of the following specifics.
Confirmation that the text format loading is correct.
Confirmation that renderInIsolation is correct.@alexpott had security concerns on the original issue so we may want him to take a look here too.
- 🇺🇸United States nicxvan
Tests are failing, but a review of the process would be nice before continuing.
- First commit to issue fork.
- 🇺🇸United States nicxvan
@oily would you mind reverting your changes?
We likely need that additional mock, phpstan was failing because I forgot the use statement and it needs some more work, but it's needed so we can provide the filter format which this is adding.
It's also probably in the wrong place, but as I mentioned I'm hoping for a review of the approach.
- 🇬🇧United Kingdom longwave UK
We are trying to remove the global RSS settings form over in 🐛 Views RSS view mode settings are completely broken Needs work , I don't think we should be adding to it here.
After thinking about it a bit more this event subscriber should really only be responsible for adding CDATA - any sanitization needs to be done further up the chain. In core is there anything other than Views that outputs RSS? Can we add a test that ensures XSS attempts via e.g. node body are correctly escaped by Twig/Views and that we don't need to filter here at all?
- 🇺🇸United States nicxvan
That might work, I don't know how we'd write that test though, but that would be a simpler change here.
I think the concern in the original issue was changing contexts, if only partially trusted users add the content then we could promote the content here.
But if we only wrap it I think that might work.
- 🇬🇧United Kingdom catch
I don't think we should be using text formats here, it makes the entire thing dependent on filter module. The whole event subscriber seems a bit wrong, which seems to be what @longwave is pointing to in #26 - e.g. remove the sanitization and make the it the responsibility of the code that's producing the RSS to sanitize it.
- 🇺🇸United States nicxvan
I think this is ready for review. I added a test for the title, full html and plain text.
It's escaped for plain text and title and stripped for the full that I created.
I think this is ready for review, I tagged for security review.
- 🇺🇸United States smustgrave
Left a comment on the MR.
Can issue summary be updated too as proposed solution doesn't seem to line up.
Question has it ever been discussed allowing a settings variable to be set for what's allowed? Not in scope of this but just curious
- 🇺🇸United States nicxvan
I responded to your comment with a question, I updated the proposed resolution.
I think the consensus is to not have filtering at this stage at all so a variable doesn't make sense.
- 🇺🇸United States cmlara
Adjusting tag assigned in #32.
Security is for disclosing vulnerabilities that need fixing.
- 🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺
any sanitization needs to be done further up the chain
+1
In core is there anything other than Views that outputs RSS?
AFAICT any/all other RSS feed generation logic was removed in the Drupal 7-to-8 transition, where we embraced Views' ability to generate it in favor of custom logic.
Can we add a test that ensures XSS attempts via e.g. node body are correctly escaped by Twig/Views and that we don't need to filter here at all?
Such escaping is not the responsibility of Twig/Views, but of the text format author, unless I'm missing something? 🤔
The whole event subscriber seems a bit wrong, which seems to be what @longwave is pointing to in #26 - e.g. remove the sanitization and make the it the responsibility of the code that's producing the RSS to sanitize it.
This! The text format/filtering discussion is a red herring. I only raised it in #7 because that is the only way Drupal displays untrusted user input, unless the site is horrendously broken already.
The MR looks like it's on the right track, but I think the test coverage could be a lot clearer. Left suggestions to clarify it.
- 🇺🇸United States nicxvan
Thank you for your review!
I applied most suggestions. I had to revert most of them for the following reason:
- We reuse the field storage array multiple times
- I think your computer made all of the arrows pretty font which broke the arrays
- I had to add the save for the format so that the permission is created.
I added a div to the plaintext one to confirm it's being escaped and added the img to the trusted format.
I think that addresses all of your comments.
I'll got through and review after tests run to double check.
- 🇬🇧United Kingdom longwave UK
The fix is trivial and the new test looks good to me and should ensure this feature won't regress in the future. Removing the security review tag. Thanks!
- 🇬🇧United Kingdom alexpott 🇪🇺🌍
I wonder if we should be concerned if there is another way of meeting the condition
stripos($event->getResponse()->headers->get('Content-Type', ''), 'application/rss+xml') === TRUE
and not having already filtered HTML here. Probably not from views… but what about contrib?I think adding the CDATA sections in \Drupal\Core\EventSubscriber\RssResponseCdata means we become responsible for all ways that Drupal generates rss not just from Views and that makes this very tricky as we have to consider the security of custom and contrib code.
- 🇬🇧United Kingdom longwave UK
Wondering if we should drop the event subscriber entirely and try to do this in Views. https://www.rssboard.org/rss-encoding-examples documents the encoding.
In views-view-row-rss.html.twig can we just force Twig to double encode
<description>{{ description|escape }}</description>
or just add CDATA there
<description><![CDATA[{{ description }} ]]></description>
?
- 🇬🇧United Kingdom alexpott 🇪🇺🌍
I really like the idea of making this the responsibility of the thing constructing the output and not a blanket response subscriber where the security concerns are hard to impossible to consider all possibilities. Setting back to needs work to see if we can implement #43.
I think we'll also need to adjust views-view-rss.html.twig. Oh we don't need to do that because of
// The RSS 2.0 "spec" doesn't indicate HTML can be used in the description. // We strip all HTML tags, but need to prevent double encoding from properly // escaped source data (such as & becoming &amp;). $variables['description'] = Html::decodeEntities(strip_tags($style->getDescription()));
in \template_preprocess_views_view_rss()
Hmmm.... actually reading template_preprocess_views_view_rss()
// The description is the only place where we should find HTML. // @see https://validator.w3.org/feed/docs/rss2.html#hrelementsOfLtitemgt // If we have a render array, render it here and pass the result to the // template, letting Twig autoescape it. if (isset($item->description) && is_array($item->description)) { $variables['description'] = (string) \Drupal::service('renderer')->render($item->description); }
I think the whole cdata approach we did can be removed. It's just not necessary for views in core. Anything in contrib that needed that should implement what views has.
- 🇺🇸United States nicxvan
That was attempted here: [3433-49]
Cdata gets stripped by Drupal\Core\EventSubscriber\RssResponseRelativeUrlFilter::transformRootRelativeUrlsToAbsolute()
- 🇬🇧United Kingdom catch
At this point should we revert ✨ Use CDATA in XML RSS Feeds (was: Malformed XML Feeds) Closed: outdated , add the XSS test coverage in this issue, and then start again on ✨ Use CDATA in XML RSS Feeds (was: Malformed XML Feeds) Closed: outdated ?
- 🇬🇧United Kingdom alexpott 🇪🇺🌍
I agree with #46 I think we should revert the cdata change. I think we already were complying to the spec as per the escaping being done by
// The description is the only place where we should find HTML. // @see https://validator.w3.org/feed/docs/rss2.html#hrelementsOfLtitemgt // If we have a render array, render it here and pass the result to the // template, letting Twig autoescape it. if (isset($item->description) && is_array($item->description)) { $variables['description'] = (string) \Drupal::service('renderer')->render($item->description); }
The comment in #3433-60: Use CDATA in XML RSS Feeds → is not a justification for the change because we escape html in the description field. We need to test that XSS injections are filtered or double escaped.
- 🇬🇧United Kingdom alexpott 🇪🇺🌍
I've added two MRs to this issue to do the revert - for 11.x and 10.x - we need a post update function because the container needs rebuilding due to the event registration.
- 🇬🇧United Kingdom alexpott 🇪🇺🌍
alexpott → changed the visibility of the branch 3497758-SubscriberFiltering to hidden.
- 🇬🇧United Kingdom catch
Should we also still add the additional test coverage from https://git.drupalcode.org/project/drupal/-/merge_requests/11023 here?
- 🇬🇧United Kingdom alexpott 🇪🇺🌍
@catch I think we should do that in a follow-up.
- 🇬🇧United Kingdom catch
OK makes sense. Let's make sure we open the follow-up and transfer the test coverage over, moving to RTBC.
- 🇬🇧United Kingdom alexpott 🇪🇺🌍
Created the follow-up - 📌 Add XSS testing for RSS descriptions Active
- 🇬🇧United Kingdom catch
We can skip the post update because the container is cached by both Drupal version and deployment identifier so will automatically be invalidated by a core update.
Also makes this easier to backport to a patch release.
- 🇬🇧United Kingdom alexpott 🇪🇺🌍
Removed the update function. Back to RTBC as this is quite a simple change.
- 🇫🇮Finland heikkiy Oulu
I would like to understand this change record a bit better. I was testing this with the core rss.xml.
We basically have previously implemented a preprocess where we add CDATA to certain RSS views and also alter the headers to use text/xml to pass RSS validation with CDATA.
After upgrading to 10.3 we started to see double encoding in the feed. I was testing this with the core rss.xml and I was seeing also the double encoding there. After investigation I noticed that core is now checking for the header to be application/rss+xml before it adds the CDATA wrapper to the description. So our problem was setting a wrong header type in custom functionality. Upgrading to 10.4 and removing our header alter seemed to fix the problem and it started to render HTML and CDATA correctly. But after I apply the patch from here the CDATA gets removed and it gives double encoding again.
So my question basically is that is it still possible to get CDATA in RSS views from ✨ Use CDATA in XML RSS Feeds (was: Malformed XML Feeds) Closed: outdated or is this functionality going to be completely reverted from core? To me this issue title seems to say that certain HTML tags are being stripped away but when looking at the MR, it seems that the CDATA functionality is going to be removed and the recommended alternative is to use for example https://www.drupal.org/project/rss_trust_cdata → ?
- 🇬🇧United Kingdom alexpott 🇪🇺🌍
@heikkiy we're going to remove the cdata tags because they are not necessary. We're already escaping the html in RSS description fields which as per #43 is a supported way of embedding HTML in there. The whole cdata approach appears to not have been necessary. Do you have needs or information that says it is necessary?
- 🇬🇧United Kingdom catch
Committed/pushed to 11.x and 10.5.x and cherry-picked back to 11.1.x and 10.4.x, thanks!
@heikkiy this commit will restore the situation to how it was in 10.3.x, as if ✨ Use CDATA in XML RSS Feeds (was: Malformed XML Feeds) Closed: outdated had never been committed.
- 🇫🇮Finland heikkiy Oulu
@alexpott Not necessarily. I would like to perhaps just understand why the CDATA was deemed as a good solution in #3433 and then reverted back to the original version? I can understand the regression problem but was just wondering when a very old issue gets resolved and then it gets reverted. My original approach has been also that if you want to put HTML in your RSS feed that it should be inside CDATA but seems like encoding is fine.
I guess the change record still needs updating https://www.drupal.org/node/3504403 → because it says:
The CDATA event now only wraps the HTML output in CDATA tags.
One thing I noticed is that when I was testing our current feed in 10.3 that when validating the feed I got a warning about drupal-media not being a valid tag. This might be because of our current custom preprocess and definitely out of scope for this issue but just something that caught my eye when testing.
- 🇬🇧United Kingdom alexpott 🇪🇺🌍
@heikkiy I think the premise of ✨ Use CDATA in XML RSS Feeds (was: Malformed XML Feeds) Closed: outdated was probably valid when it was filed - back in 2003!!! - but the code in views and use of views for core's RSS feed came later and had an alternate fix - escaping the html. Unfortunately 3433 remained open and people continued to work on it without validating that the fix was still necessary although there definitely is discussion of this on the issue. Also it appears that what RSS validors say and RSS readers do is different. Note that https://www.rssboard.org/rss-encoding-examples is quite clear that escaped HTML tags in the description field are supposed to work and be respected and this is was working well prior to 10.3.
- 🇬🇧United Kingdom alexpott 🇪🇺🌍
I've rewritten the CR and published it and I've added a note to the original CR https://www.drupal.org/node/3440505 → that it is no longer valid.
- 🇬🇧United Kingdom longwave UK
https://stackoverflow.com/questions/7220670/difference-between-descripti... has a summary of the problem, but there is no clear answer, because it depends on the reader. Some interpret
<description>
as plaintext, others as encoded HTML. Some readers also support<content:encoded>
for HTML but this is not backward compatible with older readers that only support<description>
. What a mess, and I don't think we can have a one-size-fits-all solution here. - 🇬🇧United Kingdom longwave UK
FWIW Wordpress.com's RSS feed does use
<content:encoded>
*and* wraps everything in CDATA: https://wordpress.com/blog/feed/Neither d.o or Wordpress.com's RSS validates 100%:
https://validator.w3.org/feed/check.cgi?url=https%3A%2F%2Fwww.drupal.org...
https://validator.w3.org/feed/check.cgi?url=https%3A%2F%2Fwordpress.com%... - 🇫🇮Finland heikkiy Oulu
Would it make sense to create a follow up ticket where this would be possible to be configured? Or perhaps a contrib module. I guess it's quite regular that people are using RSS feed to share content and now that Drupal CMS has already launched I think the need might be growing to be able to build RSS feeds easily for different use cases. Would be nice if you could also configure the
<content:encoded>
as an optional value if you know you are exposing the RSS feed to a known reader that can support it.My original interpretation also was that CDATA is more supported and it was news to me that encoded is fine also. Live and learn.
And regarding @longwave WP example, it's interesting that iframe for example is not passing validation.
- 🇫🇮Finland heikkiy Oulu
Answering myself but seems like https://www.drupal.org/project/views_rss → does have the support for both
description
andcontent:encoded
.I quickly tested it with the core rss.xml view and it seems like it wraps both the
description
andcontent:encoded
inside CDATA.I think that might be a good option if you need more robust RSS support.
- 🇺🇸United States aaronrus
What is the solution? I updated from 10.3.x to 10.4.3 and now my xml feed used for a mail program no longer shows images. The img tags are being stripped out and it shows ![cdata[[
- 🇬🇧United Kingdom catch
@aaronrus this will get released in the next patch release of Drupal 11/10, which is likely to be first week of March (e.g. next week).
- 🇺🇸United States aaronrus
Ive been waiting for the next update with the fix but Ive not seen anything past 10.4.3 ready for production. My newsletter is down on my production site an I need a little help. What is the status of the fix being rolled out into the next production update? or what is the process of patching the existing 10.4.3 as Ive been unable to successfully apply the patch.
- 🇺🇸United States nicxvan
@aaronrus catch already told you the planned release is this week, I'm not sure if that is still the case, but you can patch your site using the instructions here: https://www.drupal.org/docs/develop/git/using-git-to-contribute-to-drupa... →
- 🇬🇧United Kingdom longwave UK
https://www.drupal.org/project/drupal/releases/10.4.4 → has been tagged and will be available in the next few minutes, including this fix.
- 🇬🇧United Kingdom opsdemon
Upgraded today to drupal 10.4.4 and removed the workaround we've been using since January to fix the issues with our newsletters.
I tested the RSS feed with the new release and it seems to be working normally again.
- Status changed to Fixed
12 days ago 6:54pm 21 March 2025 Automatically closed - issue fixed for 2 weeks with no activity.