Url::fromUri($uri)->isExternal() and UrlHelper::isExternal($uri) do not always match

Created on 15 March 2019, almost 6 years ago
Updated 21 February 2023, almost 2 years ago

Problem/Motivation

Drupal\Component\Utility\UrlHelper::isExternal($url) and \Drupal\Core\Url::fromUri($url)->isExternal() can return different results for the same URL in certain situation (with data URLs, for example).

Per #33 πŸ› Url::fromUri($uri)->isExternal() and UrlHelper::isExternal($uri) do not always match Needs work core contributors believe the data example, at least, should return FALSE and ultimately that the results should match for the same URL.

Example reproduced from #28 πŸ› Url::fromUri($uri)->isExternal() and UrlHelper::isExternal($uri) do not always match Needs work :

> use \Drupal\Component\Utility\UrlHelper;
> use \Drupal\Core\Url;
> $url = "data:text/plain;base64,SGVsbG8sIFdvcmxkIQ%3D%3D";
= "data:text/plain;base64,SGVsbG8sIFdvcmxkIQ%3D%3D"

> UrlHelper::isExternal($url);
= false

> Url::fromUri($url)->isExternal();
= true

Original issue description

The Drupal\Component\Utility\UrlHelper::isExternal() method returns FALSE for URLs using the "data" URL scheme. This method is used, for example, as part of the link Twig function and prevents the building of data URLs (with that function) in Twig templates.

The following example Twig code throws an InvalidArgumentException error:

{% set dataUrl = 'data:text/plain;base64,SGVsbG8sIFdvcmxkIQ%3D%3D' %}
{{ link('Text file link', dataUrl) }}

The expected result is a link with the text "Text file link" linking to the data URL data:text/plain;base64,SGVsbG8sIFdvcmxkIQ%3D%3D (a plain text file with the content "Hello, world!").

Proposed resolution

Set the external property to the result of UrlHelper::isExternal($uri) instead of just TRUE for certain URI values.

Original proposed resolution

Add a strpos check for "data:" at the beginning of the path, similar to the current method for handling URLs without an explicit protocol part.

A workaround for the issue with the Twig link function is to build the a element directly, e.g. --

{% set dataUrl = 'data:text/plain;base64,SGVsbG8sIFdvcmxkIQ%3D%3D' %}
<a href="{{ dataUrl }}">Text file link</a>

Remaining tasks

  • Reviews needed
  • Tests to be written and run?

User interface changes

None.

API changes

None.

Data model changes

None.

πŸ› Bug report
Status

Needs work

Version

10.1 ✨

Component
BaseΒ  β†’

Last updated about 2 hours ago

Created by

πŸ‡ΊπŸ‡ΈUnited States wells Seattle, WA

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • The Needs Review Queue Bot β†’ tested this issue. It either no longer applies to Drupal core, or fails the Drupal core commit checks. Therefore, this issue status is now "Needs work".

    Apart from a re-roll or rebase, this issue may need more work to address feedback in the issue or MR comments. To progress an issue, incorporate this feedback as part of the process of updating the issue. This helps other contributors to know what is outstanding.

    Consult the Drupal Contributor Guide β†’ to find step-by-step guides for working with issues.

  • Status changed to Needs review almost 2 years ago
  • πŸ‡ΊπŸ‡ΈUnited States wells Seattle, WA

    Review so far has been addressed and this bug is still relevant. Attaching a re-roll for 10.1.x-dev and moving back to NR.

  • Status changed to Needs work almost 2 years ago
  • πŸ‡«πŸ‡·France nod_ Lille

    commit check failures

  • πŸ‡ΊπŸ‡ΈUnited States wells Seattle, WA

    So the commit check failure here is β€”

    /var/www/html/core/tests/Drupal/Tests/Component/Utility/UrlHelperTest.php:461:41 - Unknown word (Fdvcmxk)

    Caused by this added test case β€”

    ['data:text/plain;base64,SGVsbG8sIFdvcmxkIQ%3D%3D', TRUE]
    

    I’m not quite sure what I can do about this. Any recommendations? I suppose I could change SGVsbG8sIFdvcmxkIQ to some word that pleases cspell as it doesn’t necessary have to be a real data URL but that seems like a silly workaround.

  • Status changed to Needs review almost 2 years ago
  • πŸ‡ΊπŸ‡ΈUnited States wells Seattle, WA

    Attaching an updated patch using base64_encode with a string that doesn't make CSpell unhappy.

  • πŸ‡§πŸ‡ͺBelgium borisson_ Mechelen, πŸ‡§πŸ‡ͺ

    That's a great solution! +1

    I wonder how to answer #11

  • πŸ‡ΊπŸ‡ΈUnited States wells Seattle, WA

    I think I answered #11 in #13. I don't have a strong use case beyond wanting to use the features of the link function in Twig with data URLs.

    Having said that... I guess this fix is too narrow. The real issue here is that \Drupal\Component\Utility\UrlHelper::stripDangerousProtocols only allows http and https and the docblock on that function makes it all pretty clear why.

    I think the bad DX here is just this oddness:

    > use \Drupal\Component\Utility\UrlHelper;
    > use \Drupal\Core\Url;
    > $url = "data:text/plain;base64,SGVsbG8sIFdvcmxkIQ%3D%3D";
    = "data:text/plain;base64,SGVsbG8sIFdvcmxkIQ%3D%3D"
    
    > UrlHelper::isExternal($url);
    = false
    
    > Url::fromUri($url)->isExternal();
    = true
    

    It's inconvenient not to be able to use link with data URLs but now I see this more as a feature request (to support data as an external URL scheme) than a bug. And that perhaps leads to a much more complicated consideration.

  • Status changed to RTBC almost 2 years ago
  • πŸ‡ΊπŸ‡ΈUnited States smustgrave

    This issue is being reviewed by the kind folks in Slack, #needs-review-queue-initiative. We are working to keep the size of Needs Review queue [2700+ issues] to around 400 (1 month or less), following Review a patch or merge request β†’ as a guide.

    We may want to have a follow up to do more research.

    But for this ticket and this request I think it covers this one case, even if it's updated later.

    May not want to bother with a 9.5 backport but will need a different patch as str_starts_with is php8 only.

    Will see if the committers have any other thought.

  • πŸ‡¦πŸ‡ΊAustralia larowlan πŸ‡¦πŸ‡ΊπŸ.au GMT+10

    We have a polyfill for str_starts_with

    I'm still no certain about this one, will ask for some other opinions

  • Status changed to Closed: works as designed almost 2 years ago
  • πŸ‡¦πŸ‡ΊAustralia larowlan πŸ‡¦πŸ‡ΊπŸ.au GMT+10

    Discussed this with other core committers who agreed that we don't think isExternal should return TRUE for data urls. One suggestion was to instead add an isData method but then that opened up a whole question of what the security implications are here.

    @alexpott also noted that in theory you could add data to filter_protocols in sites/default/services.yml and this might work, but also that there were security concerns around that.

    I think we're leaning towards closed works as designed here folks, given you can just output a raw a tag. We don't really want to open a security can of worms around letting content-editors enter data: urls.

    I'm going to mark it as such, but if anyone feels strongly the other way, please re-open with counter points and we can discuss further.

    Crediting @alexpott and @GΓ‘bor Hojtsy who I discussed this with.

  • πŸ‡¬πŸ‡§United Kingdom alexpott πŸ‡ͺπŸ‡ΊπŸŒ
  • πŸ‡ΊπŸ‡ΈUnited States wells Seattle, WA

    @larowlan, how about the DX example in #28?

    Discussed this with other core committers who agreed that we don't think isExternal should return TRUE for data urls.

    So Url::fromUri($url)->isExternal(); should return FALSE instead of TRUE (current behavior), right?

  • Status changed to Active almost 2 years ago
  • πŸ‡¦πŸ‡ΊAustralia larowlan πŸ‡¦πŸ‡ΊπŸ.au GMT+10

    Yeah good point, we might need to repurpose things to address that?

  • Status changed to Needs review almost 2 years ago
  • πŸ‡ΊπŸ‡ΈUnited States wells Seattle, WA

    Agreed. Attempting to (coherently) re-work the ticket a bit here and attaching a potential patch for approach.

    In this patch I have changed final else dealing with the URI to use the value of UrlHelper::isExternal($uri) instead of just TRUE.

    Interested to see how this tests and curious for feedback on if this makes sense and/or how it could be tested? I took a quick look at \Drupal\Tests\Core\UrlTest but couldn't think of a good way to fit anything that isn't just a rehash of \Drupal\Tests\Component\Utility\UrlHelperTest::testIsExternal.

  • Status changed to Needs work almost 2 years ago
  • πŸ‡ΊπŸ‡ΈUnited States wells Seattle, WA

    Alright so the first two test fails relate to the test URL invalid://not-a-valid-protocol.

    Here's how that URL is treated before #36 patch:

    > use \Drupal\Core\Url;
    > use \Drupal\Component\Utility\UrlHelper;
    > $url = "invalid://not-a-valid-protocol"
    = "invalid://not-a-valid-protocol"
    
    > UrlHelper::isExternal($url);
    = false
    
    > Url::fromUri($url)->isExternal();
    = true
    

    And after #36 patch:

    > use \Drupal\Core\Url;
    > use \Drupal\Component\Utility\UrlHelper;
    > $url = "invalid://not-a-valid-protocol"
    = "invalid://not-a-valid-protocol"
    
    > UrlHelper::isExternal($url);
    = false
    
    > Url::fromUri($url)->isExternal();
    = false
    

    This fails because \Drupal\link\Plugin\Validation\Constraint\LinkExternalProtocolsConstraintValidator::validate prevents external URLs using untrusted protocols only.

    So it seems this change goes to far? If so I don't really think there is a logical place to draw a line here. This issue bleeds from a seemingly simple internal vs. external to to a much more complex valid vs. invalid scenario.

    Maybe there is some other solution here? Or maybe this really does work as designed? The design is just a bit tough to parse here.

Production build 0.71.5 2024