Internal URL handling (language prefixes, base://, ...)

Created on 2 February 2015, almost 10 years ago
Updated 20 January 2023, about 2 years ago

The config schema and test is currently a mess, there seem to be multiple different ways of configurating the global and local settings.

This cleans it up a bit. There are still lots of test fails, but at least the test is now running through. I also fixed the incorrect getEditableConfigNames(), changed the test @group to pathologic (the standard is to have a group per project. This is also required for my d8 module status overview at http://d8ms.worldempire.ch/.

Replacing base:// with user-path: as I think this can be considered as user input, base: is still available too.

πŸ› Bug report
Status

Needs work

Version

1.0

Component

Code

Created by

πŸ‡¨πŸ‡­Switzerland berdir Switzerland

Live updates comments and jobs are added and updated live.
  • Needs tests

    The change is currently missing an automated test that fails when run with the original code, and succeeds when the bug has been fixed.

Sign in to follow issues

Merge Requests

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • πŸ‡¦πŸ‡ΉAustria guedressel
    +++ b/pathologic.module
    @@ -192,10 +199,18 @@ function _pathologic_replace($matches) {
    +  // file url starting with 'sites/default/files' (or with a leading slash).
    ...
    +    preg_match('/sites\/(.*?)\/files\//', $parts['path'], $files_matches);
    

    The public file path is configurable - see https://www.drupal.org/node/22241 β†’

    Please check on actual path, instead of hard-coding patterns for the default path.

  • πŸ‡³πŸ‡±Netherlands spadxiii

    I found that the last patch didn't work for our platform anymore. We use flysystem (and s3/cloudfront) so our 'file public path' isn't anywhere near /sites//files and had to find a different solution.

    I removed the whole part of trying to rewrite the file-path, because the path is fine already.
    I've added the 'is_file' boolean to the $parts so that the alter can change it later on. (We added an alter to set is_file to true for flysystem-urls)

  • πŸ‡¨πŸ‡¦Canada nickdjm

    Updated to work with latest release.

  • πŸ‡§πŸ‡ͺBelgium RandalV

    I've refactored my patch from #48 to use the dynamic path retrieved from PublicStream::basePath().
    Now this should work with non-standard public file paths.

    As well as applying the changes from the latest version, testing it against 2.0.x.

    I've taken #48 instead of the other patches because I do think we need to step away from the hardcoded list of file extensions, even if I was the one (to my regret) to introduce that into the line of patches.

  • Open in Jenkins β†’ Open on Drupal.org β†’
    Core: 9.5.x + Environment: PHP 8.1 & MySQL 8
    last update almost 2 years ago
    2 pass, 2 fail
  • Open in Jenkins β†’ Open on Drupal.org β†’
    Core: 9.5.x + Environment: PHP 8.1 & MySQL 8
    last update over 1 year ago
    Patch Failed to Apply
  • πŸ‡ͺπŸ‡ΈSpain bletch

    Patch against 2.0-alpha

  • πŸ‡ΊπŸ‡ΈUnited States ronaldmulero

    Patch #52 πŸ› Internal URL handling (language prefixes, base://, ...) Needs work fixes the following problem.

    1. Create a Spanish translation of an English page.
    2. Hard-code a link to an existing pdf on the same site.
    <a href="/files/whatever/file.pdf">same-site.pdf</a>
    3. Save
    4. Before patch, the same-site link is incorrectly rendered with "/es" at the start of the path: <a href="http://example.com/es/files/whatever/file.pdf">same-site.pdf</a>
    5. After patch, same-site link is correctly rendered as: <a href="http://example.com/files/whatever/file.pdf">same-site.pdf</a>

    Thanks @RandalV

    Pathologic: 8.x-1.0-alpha4
    Drupal core 9.5.9

  • Open in Jenkins β†’ Open on Drupal.org β†’
    Core: 9.5.x + Environment: PHP 8.1 & MariaDB 10.3.22
    last update over 1 year ago
    2 pass, 2 fail
  • Open in Jenkins β†’ Open on Drupal.org β†’
    Core: 10.1.x + Environment: PHP 8.1 & MariaDB 10.3.22
    last update over 1 year ago
    2 pass, 2 fail
  • Status changed to Postponed: needs info over 1 year ago
  • πŸ‡ΊπŸ‡ΈUnited States dww

    Now that I'm starting to get the tests for this module a bit more under control, I'm turning my attention to trying to fix this. However, even after re-reviewing the issue, I'm not clear what bugs we're actually fixing here. πŸ˜… I'm not sure how to flesh out the test coverage we need since I don't know what should be working once the patch is applied.

    Can someone who's experiencing a problem here clarify what bug you're actually hitting that this patch would solve? What kinds of links are being broken and why?

  • πŸ‡§πŸ‡ͺBelgium RandalV

    Hi @dww,

    Thank you for turning your attention to this issue.
    One of the main problems is the module adding language prefixes to inline file URLs.

    If an editor adds a file inline, for example: `/sites/default/files/2023-05/a-picture-of-a-cat.jpg`
    Then the module would refactor this to, for example: `/en-GB/sites/default/files/2023-05/a-picture-of-a-cat.jpg`

    I've been messing around to find the right solution but even in my last patch I'm not quite sure that it's what we want.
    I do think it's the most stable patch so far, with dynamic public file paths recognition etc.

    TL;DR: I think we need to check if language path prefixes are added (or not added) correctly in all situations, this doesn't just include files but also other URLs like translated nodes etc

  • πŸ‡§πŸ‡ͺBelgium svendecabooter Gent

    FYI: patch #52 breaks our specific use case, where we have relative paths to files, that we want to turn into a full url ("Full URL" selected for "Processed URL format" setting).

    This change gets ignored in the following blob of code:

    elseif (str_contains($parts['path'], PublicStream::basePath())) {
        // This url should not be turned into a URL object, because we don't want
        // language handling for this path.
        $dont_rewrite = TRUE;
      }

    In the initial if statement, the clause $parts['scheme'] === 'files' is FALSE by the way.
    Perhaps because these are images uploaded inline via a WYSIWYG editor?

  • πŸ‡¨πŸ‡­Switzerland tcrawford

    We are using #51 on a project. I have re-rolled patch #51 so that it applies on 2.0.x, which is required for an upgrade to D10.0.x. As per previous concerns, there is no test coverage for the patch, but I hope it helps someone who is upgrading.

  • πŸ‡―πŸ‡΄Jordan n.ghunaim Amman - Jordan

    Adding "webp" extension for file extensions.

  • πŸ‡ΊπŸ‡ΈUnited States dww

    I started working on fleshing out test coverage for this, and ran into πŸ› Kernel tests can't use path aliases on entities RTBC . Maybe we'll have to do this in a Functional test until that's fixed and backported. 😞

  • Assigned to dww
  • Status changed to Needs work about 1 year ago
  • πŸ‡ΊπŸ‡ΈUnited States dww

    Re: #60: No, we can override that behavior in specific Kernel tests if we want to. But I still couldn't get Kernel tests to run the Pathologic filter and get the same behavior as when doing it in Functional tests. 😒 It's a mess I don't really have time to fully untangle.

    I'm adding a Functional test for this, which starts to flesh out the expected behavior, and the need for something like the patches in here. Moving the code for creating and using a filter format out of PathologicKernelTestBase into a Trait for reuse.

    However, some of the expected cases I added are still failing. I think the way the filter is stripping country codes is broken, and it's doing it far more often than it should. See also πŸ› Option to keep language prefix in URLs Fixed . For now, I commented those out.

    Anyway, I'm actively working on this now, so I'm going to assign myself. Stay tuned for more.

  • Open in Jenkins β†’ Open on Drupal.org β†’
    Core: 10.1.x + Environment: PHP 8.1 & MariaDB 10.3.22
    last update about 1 year ago
    3 pass, 1 fail
  • Open in Jenkins β†’ Open on Drupal.org β†’
    Core: 10.1.x + Environment: PHP 8.1 & MariaDB 10.3.22
    last update about 1 year ago
    4 pass
  • πŸ‡ΊπŸ‡ΈUnited States dww

    This is with all the assertions active. Fails locally like so:

    There was 1 failure:
    
    1) Drupal\Tests\pathologic\Functional\PathologicLanguageAliasTest::testContentTranslation
    en: fr/reference-fr link uses the FR alias
    Failed asserting that two strings are identical.
    --- Expected
    +++ Actual
    @@ @@
    -'<a href="/drupal-10/fr/reference-fr">Test node link</a>'
    +'<a href="/drupal-10/reference-fr">Test node link</a>'
    
    /Applications/MAMP-common/htdocs/drupal-10/vendor/phpunit/phpunit/src/Framework/Constraint/Constraint.php:122
    /Applications/MAMP-common/htdocs/drupal-10/vendor/phpunit/phpunit/src/Framework/Constraint/IsIdentical.php:79
    /.../pathologic/tests/src/Functional/PathologicLanguageAliasTest.php:145
    /Applications/MAMP-common/htdocs/drupal-10/vendor/phpunit/phpunit/src/Framework/TestResult.php:728
    
    FAILURES!
    Tests: 4, Assertions: 94, Failures: 1.
    
    
  • Open in Jenkins β†’ Open on Drupal.org β†’
    Core: 10.1.x + Environment: PHP 8.1 & MariaDB 10.3.22
    last update about 1 year ago
    3 pass, 1 fail
  • Open in Jenkins β†’ Open on Drupal.org β†’
    Core: 10.1.x + Environment: PHP 8.1 & MariaDB 10.3.22
    last update about 1 year ago
    4 pass
  • Issue was unassigned.
  • Status changed to Needs review about 1 year ago
  • πŸ‡ΊπŸ‡ΈUnited States dww

    Moved to https://git.drupalcode.org/project/pathologic/-/merge_requests/12

    Would love to get feedback on https://git.drupalcode.org/project/pathologic/-/merge_requests/12/diffs?...

    When do we ever want to be stripping off the langcodes in the URLs we're processing? That just always seems wrong. All existing tests are now passing, but I don't trust them that well to believe this isn't going to break things for someone.

    Should we proceed with πŸ› Option to keep language prefix in URLs Fixed where we "fix" the language handling with a setting? That also doesn't seem right.

  • The last submitted patch, 62: 2418369-62.patch, failed testing. View results β†’
    - codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

  • πŸ‡ΊπŸ‡ΈUnited States dww

    Bot is confused, the good code is in the MR now.

  • πŸ‡ΊπŸ‡ΈUnited States dww

    To minimize disruption, should we add filter settings that control any / all of these behaviors? Seems like a mess, but I’m open to input from others.

    Thanks,
    -Derek

  • πŸ‡ΊπŸ‡ΈUnited States mark_fullmer Tucson

    To minimize disruption, should we add filter settings that control any / all of these behaviors? Seems like a mess, but I’m open to input from others.

    My initial instinct is similiar -- adding settings would complicate the permutations of what this module can do, requiring additional overhead to maintain and add test coverage. That said, given that it appears that there are multiple, valid ways that sites may want to handle language prefixes in URLs, I don't think there is a way to have a one-size-fits-all solution.

    This feeling is based on conversations coming from the Linkit module, which like Pathologic, provides a text format filter that manipulates URLs:

    I think this should be configurable, whether the link should switch to the language of the referenced entity or if it should stay in the current.

    -- Berdir, https://www.drupal.org/project/linkit/issues/3041045#comment-13038924 ✨ Allow "Linkit URL converter" filter to generate links based on the language of the current page, rather than the language of the referenced entity Postponed: needs info

    A far safer solution would be to instead set hreflang if the user chose a specific translation that does not match the current language.

    -- Wim Leers, https://www.drupal.org/project/linkit/issues/2886455#comment-14982870 ✨ Multilingual support: Allow linking to specific translations Needs review

    Agreed this probably something that needs to be configurable given the maturity of the module. Hard to sell a change like this without a way to revert it to how it works now.

    -- bkosborne, https://www.drupal.org/project/linkit/issues/3041045#comment-13875365 ✨ Allow "Linkit URL converter" filter to generate links based on the language of the current page, rather than the language of the referenced entity Postponed: needs info

    (That last link provides an example of where different sites would want different URL outputs for translated node URLs)

    So, with that in mind, I'm almost leaning toward thinking we *should* put πŸ› Option to keep language prefix in URLs Fixed in place as a first step, and maybe even expand the settings options.

    Following along on this issue now and helping where I can...

  • Open in Jenkins β†’ Open on Drupal.org β†’
    Core: 10.1.x + Environment: PHP 8.1 & MariaDB 10.3.22
    last update about 1 year ago
    4 pass
  • Status changed to Postponed about 1 year ago
  • πŸ‡ΊπŸ‡ΈUnited States dww

    I'm still not clear when the current behavior is ever correct. πŸ˜… But I picked up πŸ› Option to keep language prefix in URLs Fixed , moved it to 2.0.x, added tests, etc. I'll be working on an upgrade path next to maintain current behavior for existing sites.

    Meanwhile, I guess I'll postpone this one until that's merged...

  • Open in Jenkins β†’ Open on Drupal.org β†’
    Core: 10.1.x + Environment: PHP 8.1 & MariaDB 10.3.22
    4:12
    59:42
    Running
  • Status changed to Needs review about 1 year ago
  • πŸ‡ΊπŸ‡ΈUnited States dww

    πŸ› Option to keep language prefix in URLs Fixed is merged. This needed to be rebased. Probably still needs work, but the MR is reviewable again. Thanks!

  • Open in Jenkins β†’ Open on Drupal.org β†’
    Core: 10.1.x + Environment: PHP 8.1 & MariaDB 10.3.22
    last update about 1 year ago
    4 pass
  • πŸ‡ΊπŸ‡ΈUnited States dww

    The full logic and tests here are now merged the latest changes in 2.0.x, namely πŸ› Option to keep language prefix in URLs Fixed .

    Not sure if we now want to add yet another setting here to control if the filter should allow URL aliases and language processing to happen at all or not. Basically, do we make the final fallback case where we're now going to try to use internal: check a config setting for if we should continue to use base: like we do now. 😬 πŸ€”

    Also, we really need to flesh out the test coverage for the "is_file" logic. Tagging for that, but leaving NR since I'd love more eyes on what we should do here now.

  • Open in Jenkins β†’ Open on Drupal.org β†’
    Core: 10.1.x + Environment: PHP 8.1 & MariaDB 10.3.22
    last update about 1 year ago
    Patch Failed to Apply
  • πŸ‡―πŸ‡΄Jordan n.ghunaim Amman - Jordan

    Leave internal links alone without changing them.

  • πŸ‡§πŸ‡·Brazil astutonet Sao Paulo, SP

    The patch in #59 solved my issues with language prefixes in the URL. But we need a definitive and fast alternative, as I have seen many users abandon Drupal due to problems like this.

  • Open in Jenkins β†’ Open on Drupal.org β†’
    Core: 10.2.x + Environment: PHP 8.1 & MariaDB 10.3.22
    last update 12 months ago
    8 fail
  • πŸ‡¨πŸ‡¦Canada mark.labrecque Vancouver

    Patch reroll from #59 to account for upstream updates. Still no tests, unfortunately, but it will work with the latest module update.

  • The last submitted patch, 74: pathologic_2418369-74.patch, failed testing. View results β†’
    - codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

  • Status changed to Needs review 12 months ago
  • Open in Jenkins β†’ Open on Drupal.org β†’
    Core: 10.2.x + Environment: PHP 8.1 & MariaDB 10.3.22
    last update 12 months ago
    8 fail
  • πŸ‡¨πŸ‡¦Canada mark.labrecque Vancouver

    Let's try that again....

  • The last submitted patch, 76: pathologic_2418369-76.patch, failed testing. View results β†’
    - codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

  • πŸ‡ΊπŸ‡ΈUnited States dww

    Thanks. However, please drop patch #59 or anything based on it. The code is now in the merge request. That’s where the action needs to happen now.

  • πŸ‡ΊπŸ‡ΈUnited States dww

    Also note that language prefixes in the URL should be mostly solved in alpha2 thanks to πŸ› Option to keep language prefix in URLs Fixed

  • πŸ‡ΈπŸ‡ͺSweden twod Sweden

    The PR works well for us. Had issues with canonical node paths being marked as unrouted and bypassing the outbound path processing. Having them processed with the internal:// prefix where possible, makes more sense.

  • πŸ‡¨πŸ‡­Switzerland ayalon

    Here is the latest MR as a diff for composer patches

Production build 0.71.5 2024