Part of the path is lost after image path conversion

Created on 4 August 2023, over 1 year ago
Updated 10 July 2024, 6 months ago

When I use the process() method to convert the path, some image paths will be filtered out.
For example, when the input value is: /sites/japan/files/styles/wide_300/public/image/2023-08/5.png?itok=xxx, after conversion, it will become: https://xxx.com/sites /japan/files/styles/wide_300/public?itok=xxx

πŸ› Bug report
Status

Needs review

Version

2.2

Component

Code

Created by

πŸ‡¨πŸ‡³China eason xu

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Comments & Activities

  • Issue created by @eason xu
  • Open in Jenkins β†’ Open on Drupal.org β†’
    Core: 10.1.x + Environment: PHP 8.2 & MySQL 8
    last update over 1 year ago
    2 fail
  • πŸ‡ΊπŸ‡ΈUnited States jeffschuler Boulder, Colorado

    Weird. You said "some image paths will be filtered out". Can you try to identify which paths? ... What are the conditions where this happens?

    The code in your patch is kind of re-doing functionality that setAbsolute() already handles, so it shouldn't really be necessary.

    Maybe you can drop in some debug dpm()s, like on $matches and $url inside the preg_replace callback to see at exactly which point you're losing part of the path.

  • πŸ‡¨πŸ‡³China eason xu

    Just like my example, when I upload a picture in CKeditor and select wide, when I go through this function, I will lose /image/2023-08/5.png.

  • Status changed to Needs review over 1 year ago
  • πŸ‡ΊπŸ‡ΈUnited States mlncn Minneapolis, MN, USA

    Same thing happening to us and it took hours to debug and identify this module as the culprit.

    This causes a conflict with the Imagefield to text if none β†’ module, but as Eason Xu showed it can happen with any setup. For some inserted images the problem could be worked around by putting the relative to absolute filter before the image processing, except that defeats the purpose of rel_to_abs module!

    The error was introduced in this massive re-factoring: https://git.drupalcode.org/project/rel_to_abs/-/commit/1e353a211b72252a6...

    We are working around this by holding to "drupal/rel_to_abs": "2.2.3", in our composer.json but Eason Xu's fix needs review.

    And it would seem we need more tests.

    It should be reproducible with just the string:

    <img src="/sites/default/files/styles/large/public/images/Award5_Press%20Conference.jpeg?itok=L25Jd2Zr" />

    And ensuring it is still completely output, rather than transformed to no longer having the filename portion:

    <img src="https://example.com/sites/default/files/styles/large/public/images/?itok=L25Jd2Zr

  • πŸ‡¨πŸ‡³China eason xu

    Sorry, there is something wrong with this patch. It does not take into account the use of the == symbol in the judgment, and 0 and false will be recognized as equal. I have uploaded a new patch.

  • Open in Jenkins β†’ Open on Drupal.org β†’
    Core: 10.1.x + Environment: PHP 8.2 & MySQL 8
    last update about 1 year ago
    1 fail
  • πŸ‡ΊπŸ‡ΈUnited States jeffschuler Boulder, Colorado

    I just had this happen, and noticed the filename only gets removed in URLs with a query string.

    It looks like the removal happens here:

    $url = Url::fromUserInput($url)->setAbsolute()->toString();
    

    specifically, when fromUri() (called by fromUserInput()) gets the internal: prefix that fromUserInput() adds, it removes the filename.

    Test with a simple example using @mlncn's URL:

    $url = '/sites/default/files/styles/large/public/images/test.jpeg?itok=L25Jd2Zr';
    print_r(Url::fromUserInput($url)->setAbsolute()->toString());
    

    The output is

    http://MYSITE/sites/default/files/styles/large/public?itok=L25Jd2Zr

    I note that the docs for Url::fromUri say:

    "For user input that may correspond to a Drupal route, use internal: for the scheme. For paths that are known not to be handled by the Drupal routing system (such as static files), use base: for the scheme to get a link relative to the Drupal base path (like the HTML element)."

    We should be able to do this instead:

    $url = Url::fromUri('base:' . $url)->setAbsolute()->toString();
    

    This has fixed the issue for me and I'm continuing to test it.

    I'm not sure, BTW, why it was only failing when there was a query string present.

  • Status changed to Needs work about 1 year ago
  • πŸ‡ΊπŸ‡ΈUnited States jeffschuler Boulder, Colorado

    This gets a little weirder.

    With internal:, the filename is removed but the query string is not urlEncoded.

    print_r(Url::fromUri('internal:' . $url)->setAbsolute()->toString());
    

    Result: http://default/sites/default/files/styles/large/public?itok=L25Jd2Zr

    But with base:, while the filename persists, the query string is urlEncoded, which will break image embeds:

    print_r(Url::fromUri('base:' . $url)->setAbsolute()->toString());
    

    Result: http://default/sites/default/files/styles/large/public/images/test.jpeg%3Fitok%3DL25Jd2Zr

  • πŸ‡¬πŸ‡§United Kingdom paulmartin84

    The attached patch fixes the query string issue mentioned in #8 and also handles a url fragment should it exist.

  • πŸ‡¬πŸ‡§United Kingdom paulmartin84

    Sorry I made a small mistake in the last patch, This is correct now

  • Status changed to Needs review 6 months ago
  • πŸ‡ΊπŸ‡ΈUnited States jeffschuler Boulder, Colorado

    Makes sense.
    Thanks @paulmartin84!

  • πŸ‡¬πŸ‡§United Kingdom paulmartin84

    I have spotted another issue with the above parsing method. URLs that contain multiple query strings will have & instead of &.

    Accoring to https://www.drupal.org/project/entity_embed/issues/2973178 β†’ this is correct.

    This causes parse_url not to work correctly, so I have added decoding and re-encoding the url to work around this.

  • πŸ‡¬πŸ‡§United Kingdom paulmartin84

    Hopefully this is the final change, I have noticed that URLs containing percentage encoded characters are getting double encoded.
    The final solution isn't that clean, but not sure much else can be done.

Production build 0.71.5 2024