- last update
over 1 year ago 29,885 pass, 1 fail - last update
over 1 year ago 29,886 pass - 🇳🇿New Zealand quietone
I don't know why the fix is made in two places. In my testing I found that having it only in the preprocess function was sufficient. Therefor, I have removed the change from the update manager. I also added a simple unit test. Also, screenshots in the Issue Summary.
The last submitted patch, 21: 3069491-21-fail.patch, failed testing. View results →
- Status changed to RTBC
over 1 year ago 9:18pm 30 July 2023 - 🇺🇸United States smustgrave
Thanks @quietone test case seems to show the problem so will remove that tag
Fix LGTM.
- last update
over 1 year ago 29,909 pass - Status changed to Needs work
over 1 year ago 10:45am 31 July 2023 - 🇫🇮Finland lauriii Finland
-
+++ b/core/modules/update/tests/src/Unit/ProjectStatusTest.php @@ -0,0 +1,39 @@ + 'title' => Html::escape('This & that'),
I think it would be better to write an integration test for this to make sure that this continues to work if the input changes.
-
+++ b/core/modules/update/update.report.inc @@ -5,6 +5,7 @@ +use Drupal\Component\Utility\Html; @@ -149,6 +150,7 @@ function template_preprocess_update_project_status(&$variables) { + $variables['title'] = Html::decodeEntities($variables['title']);
Why do we need to call
\Drupal\Component\Utility\Html::decodeEntities
here? Shouldn't we wrap the title with\Drupal\Component\Render\MarkupInterface
where it gets escaped?
-
- 🇳🇿New Zealand quietone
It looks like it is escaped in the release xml.
<?xml version="1.0" encoding="utf-8"?> <project xmlns:dc="http://purl.org/dc/elements/1.1/"> <title>Save &amp; Edit</title> <short_name>save_edit</short_name> <dc:creator>himerus</dc:creator>
- Status changed to Needs review
over 1 year ago 3:41am 1 August 2023 - last update
over 1 year ago 29,891 pass, 2 fail - last update
over 1 year ago 29,912 pass - 🇳🇿New Zealand quietone
The comment above is not correct.
The XML from d.o is<project> <title>Save & Edit</title> <short_name>save_edit</short_name> <dc:creator>himerus</dc:creator> <type>project_module</type> <supported_branches>8.x-1.</supported_bran
where it is escaped but not double escape. The extra escape happens in the fetching process, in stream_get_contents().
It took awhile to figure out how the update Functional tests work. Or, at least enough to add tests for problem.
#24
1. This patch adds a new test module and update XML with double escaped characters. The additions to \Drupal\Tests\update\Functional\UpdateContribTest::testUpdateContribBasic test the text on the updates page.
2. I don't really understand this nor do I know about MarkupInterface. I opted to decode the title just after it is fetched.The interdiff is between the success and fail patch. There is no interdiff with a previous patch because this is a new approach. Testing confirms that the screenshots are still correct.
The last submitted patch, 26: 3069491-26-fail.patch, failed testing. View results →
- Status changed to RTBC
over 1 year ago 3:14pm 2 August 2023 - 🇺🇸United States smustgrave
Additional test coverage looks good.
For #24
1 looks like additional test coverage has been added
2 decodeEntities is still there but a comment was added.LGTM!
- last update
over 1 year ago 29,947 pass - last update
over 1 year ago 29,954 pass - last update
over 1 year ago 29,954 pass - last update
over 1 year ago 29,959 pass - last update
over 1 year ago 29,959 pass - last update
over 1 year ago 29,959 pass - last update
over 1 year ago 29,960 pass - last update
over 1 year ago 29,972 pass - last update
over 1 year ago 30,050 pass - last update
over 1 year ago 30,057 pass - last update
over 1 year ago 30,057 pass - 🇳🇿New Zealand quietone
I'm triaging RTBC issues → .
After reading the IS and comments I don't see anything left to do. I guess there is still a question of whether adding the double escape to the title in source XML is the correct way to test this but I didn't see another way. See #26
Leaving at RTBC.
- last update
over 1 year ago 30,059 pass - last update
over 1 year ago 30,061 pass - last update
over 1 year ago 30,064 pass - last update
over 1 year ago 30,118 pass, 4 fail The last submitted patch, 26: 3069491-26.patch, failed testing. View results →
- last update
over 1 year ago 30,135 pass - Status changed to Needs review
over 1 year ago 11:45pm 31 August 2023 - 🇳🇿New Zealand quietone
Neither failure is related to the change here. Retesting.
- Status changed to RTBC
over 1 year ago 2:33am 1 September 2023 - last update
over 1 year ago 30,136 pass - last update
over 1 year ago 30,137 pass - last update
over 1 year ago 30,137 pass - last update
over 1 year ago 30,147 pass - last update
over 1 year ago 30,147 pass - last update
over 1 year ago 30,151 pass 26:11 25:03 Running- last update
over 1 year ago 30,162 pass - last update
over 1 year ago 30,169 pass - last update
over 1 year ago 30,157 pass, 2 fail The last submitted patch, 26: 3069491-26.patch, failed testing. View results →
- last update
over 1 year ago 30,169 pass - last update
over 1 year ago 30,206 pass - last update
over 1 year ago 30,209 pass - last update
over 1 year ago 30,361 pass - last update
over 1 year ago 30,362 pass - Status changed to Needs work
over 1 year ago 5:08pm 30 September 2023 - 🇺🇸United States xjm
-
+++ b/core/modules/update/src/UpdateProcessor.php @@ -175,6 +176,9 @@ public function processFetchTask($project) { + // The title is escaped in the XML from drupal.org and the fetch + // process will encode it again. Therefor, it is decoded here.
Two things:
- Drupal.org is canonically capitalized when used as a sentence. (Or put otherwise, there is a website at https://drupal.org and it is called "Drupal.org".)
- "Therefore" is misspelled.
-
+++ b/core/modules/update/src/UpdateProcessor.php @@ -175,6 +176,9 @@ public function processFetchTask($project) { + $available['title'] = Html::decodeEntities($available['title']) ?? '';
Honestly, this makes me quite nervous. I read the explanation in the comment, but still.
I mean, if the update data from d.o is compromised, the site has way worse problems than a little XSS in a title (i.e., full RCE). The same is true if the module author is malicious. But I'd still be inclined to harden it after the decode so that it's protected in case of future refactoring etc. This would also mean an additional test case for a
Test module <script>confirm('DO YOU LIKE MY MODULE???');</script>
or something.
-
Has anyone tried the approach @lauriii suggested, Since @quietone confirmed it was single-escaping in the XML. That would also allow us to sanitize it for disallowed markup and mark it as safe at the same time.
-
- 🇳🇿New Zealand quietone
35.1. Fixed.
I am not running tests because it is only a comment change.
Still needs work for 35, 1 and 2.
- 🇺🇸United States mfb San Francisco
As far as I see, comment #25 correct, the bug appears to be on drupal.org
https://updates.drupal.org/release-history/save_edit/current has
<title>Save &amp; Edit</title>
- this is double escaped. - 🇺🇸United States mfb San Francisco
I found the bug in project module release/project_release.drush.inc
$xml = new SimpleXMLElement('<root></root>'); $xml->release->name = check_plain($release->title);
simplexml handles the escaping for you, so this should be
$xml->release->name = $release->title;
e.g.:
$xml = new SimpleXMLElement('<root></root>'); $xml->release->name = 'Save & Edit'; echo $xml->asXml();
Prints
<?xml version="1.0"?> <root><release><name>Save & Edit</name></release></root>
which is correct.
- Status changed to Needs review
over 1 year ago 7:37pm 1 October 2023 - last update
over 1 year ago 5 pass - 🇺🇸United States mfb San Francisco
@xjm well, all I can do is assure you that the property overloading used by project_release module, e.g.
$xml->title = $project->title;
is how you instruct SimpleXML to escape some text and add it to the document.SimpleXML actually does have a way to add already-escaped text to a document, but this is not it. The code could be rewritten to use this method, but would be more verbose, and presumably slower since the escaping would be happening via the drupal function rather than C code.
- 🇺🇸United States drumm NY, US
My concern would be if any version of Drupal (or other clients) double-decoded this data, so if resetting this to be technically correct would open them up to XSS/etc issues.
- 🇺🇸United States mfb San Francisco
I would say it's best to just fix the bug at the source, because if a client is HTML-decoding the data and treating it as trusted markup with no filtering, they would also have a security issue today. The key is that clients need to treat this data like any other untrusted, third-party, user-generated content; i.e. these fields should be HTML-encoded when being rendered in markup.
But if you really want to leave the XML as-is for backwards compatibility reasons, then yes you could specify that these fields have an extra round of HTML-encoding, meaning clients need to get the data from XML parser, do the extra round of HTML-decoding, then HTML-encode it again when rendering in markup. Seems silly (to me) but certainly works as a solution.
- 🇺🇸United States dww
Coming here from 🐛 Double encoded ampersand in /admin/reports/updates Active which I was pinged about in Slack for a subsystem maintainer review.
From a little Git archeology, looks like this double-encoding was introduced in #1003764: Add an alter hook invoked while generating release-history XML files → when @drumm converted this from manually putting the XML together into using SimpleXML. Apparently no one noticed that SimpleXML is doing another round of encoding for us.
I just did this:
% wget https://updates.drupal.org/release-history/save_edit/current -O save_edit.xml ... HTTP request sent, awaiting response... 200 OK Length: 9702 (9.5K) [text/xml] Saving to: ‘save_edit.xml’ save_edit.xml 100%[=====================================>] 9.47K --.-KB/s in 0s 2025-01-15 14:45:23 (23.8 MB/s) - ‘save_edit.xml’ saved [9702/9702] % grep title save_edit.xml <title>Save &amp; Edit</title>
So the source is definitely double-encoded. It seems wonky to say that core should have to decode twice to get the raw values.
I don't understand the "BC" concern here. The primary client for these feeds, update.module, was written to assume properly encoded markup in the XML feeds, and is now "broken" by the double encoding. If other clients are already decoding twice, and we stop double encoding, what's the harm?
Here's a little test script:
$inputs = [ 'not_encoded' => '<title>Save & Edit</title>', 'encoded_once' => '<title>Save & Edit</title>', 'encoded_twice' => '<title>Save &amp; Edit</title>', ]; foreach ($inputs as $key => $value) { echo "------\n$key\n"; echo "Raw: $value\n"; $decoded = html_entity_decode($value, ENT_QUOTES | ENT_HTML5, 'UTF-8'); echo "Decoded: $decoded\n"; echo "Double decoded: " . html_entity_decode($decoded, ENT_QUOTES | ENT_HTML5, 'UTF-8') . "\n"; }
If I run this, I get:
------ not_encoded Raw: <title>Save & Edit</title> Decoded: <title>Save & Edit</title> Double decoded: <title>Save & Edit</title> ------ encoded_once Raw: <title>Save & Edit</title> Decoded: <title>Save & Edit</title> Double decoded: <title>Save & Edit</title> ------ encoded_twice Raw: <title>Save &amp; Edit</title> Decoded: <title>Save & Edit</title> Double decoded: <title>Save & Edit</title>
We're still going to turn around and re-encode this when printing it in the available updates report. That happens via
core/modules/update/templates/update-project-status.html.twig
right here:<div class="project-update__title"> {%- if url -%} <a href="{{ url }}">{{ title }}</a> {%- else -%} {{ title }} {%- endif %} {{ existing_version }} {% if install_type == 'dev' and datestamp %} <span class="project-update__version-date">({{ datestamp }})</span> {% endif %} </div>
So Twig is going to escape this for us, and I don't see how this could be used as an attack vector.
IMHO, this is the right fix, not the extra decode being proposed at #3493742. I am a maintainer for project*, so I could just commit this, but I can't deploy the fix on d.o anymore. 😅
- 🇺🇸United States mfb San Francisco
Yeah only bad code interpreting the feed could be a danger here, because untrusted third party strings need to be just that, untrusted, and run thru something like HTML Purifier or the filter built into drupal core; i.e. security issues shouldn't be possible regardless of what's in a third party string.
- 🇺🇸United States drumm NY, US
Thanks for the research!
I don't understand the "BC" concern here. The primary client for these feeds, update.module, was written to assume properly encoded markup in the XML feeds, and is now "broken" by the double encoding. If other clients are already decoding twice, and we stop double encoding, what's the harm?
My only real concern is if core was ever double-decoding. And I hope every version of core does always filter on output, so this should be fine.
- 🇺🇸United States drumm NY, US
This is now deployed and I rebuilt the XML for https://www.drupal.org/project/entityreference_dragdrop →
Before rebuilding the XML for every project with an
&
in its title, I'd appreciate a final confirmation that this is looking good in Drupal core now. - 🇺🇸United States dww
Sweet, thanks!
Looks great to me on a local site still running 10.3.10 core. Just did:
composer require drupal/entityreference_dragdrop drush en entityreference_dragdrop
Visited /admin/reports/updates on the site, and saw this:
Please rebuild all the projects with
&
.Thanks so much!
-Derek Automatically closed - issue fixed for 2 weeks with no activity.