The URI is malformed

Created on 29 August 2023, about 1 year ago
Updated 30 August 2023, about 1 year ago

I have a Drupal 10.1 site that uses the migration module to import RSS feed items and also look up an og:image tag from a feed source URL.

This setup also requires remote_stream_wrapper to import a remote file entity.

When importing images using the DOM process plugin, I have the following configuration:

field_media_image:
    -
      plugin: migrate_process_html
      source: link
    -
      plugin: dom
      method: import
      # log_messages: false
    -
      plugin: dom_select
      selector: //meta[@property="og:image"]/@content
    -
      plugin: skip_on_empty
      method: process
      message: 'Meta Field og:image is missing'
    -
      plugin: extract
      index:
        - 0
    -
      plugin: skip_on_condition
      method: row
      condition:
        plugin: not:matches
        regex: /^(https?:\/\/)/i
      message: 'We only want a string if it starts with http(s)://'

However, some remote image URLs start with https:////. These URLs pass as valid and are imported successfully. But when Drupal tries to render these files, say using a teaser or default display, they result in the following exception:

The website encountered an unexpected error. Please try again later.

InvalidArgumentException: The URI 'https:////m.files.bbci.co.uk/modules/bbc-morph-sport-seo-meta/1.23.3/images/bbc-sport-logo.png' is malformed. in Drupal\Core\Url::fromUri() (line 286 of core/lib/Drupal/Core/Url.php).
Drupal\Core\File\FileUrlGenerator->generate() (Line: 246)
Drupal\image\Plugin\Field\FieldFormatter\ImageFormatter->viewElements() (Line: 89)
Drupal\Core\Field\FormatterBase->view() (Line: 76)
Drupal\Core\Field\Plugin\Field\FieldFormatter\EntityReferenceFormatterBase->view() (Line: 265)
Drupal\Core\Entity\Entity\EntityViewDisplay->buildMultiple() (Line: 339)
Drupal\Core\Entity\EntityViewBuilder->buildComponents() (Line: 281)
Drupal\Core\Entity\EntityViewBuilder->buildMultiple() (Line: 238)
Drupal\Core\Entity\EntityViewBuilder->build()
call_user_func_array() (Line: 111)
Drupal\Core\Render\Renderer->doTrustedCallback() (Line: 788)
Drupal\Core\Render\Renderer->doCallback() (Line: 377)
Drupal\Core\Render\Renderer->doRender() (Line: 204)
Drupal\Core\Render\Renderer->render() (Line: 474)
Drupal\Core\Template\TwigExtension->escapeFilter() (Line: 124)
__TwigTemplate_9389c3ff9b0808d2f7f2ed1006f046b0->doDisplay() (Line: 394)
Twig\Template->displayWithErrorHandling() (Line: 367)
Twig\Template->display() (Line: 379)
Twig\Template->render() (Line: 40)
Twig\TemplateWrapper->render() (Line: 53)
twig_render_template() (Line: 372)
Drupal\Core\Theme\ThemeManager->render() (Line: 436)
Drupal\Core\Render\Renderer->doRender() (Line: 449)
Drupal\Core\Render\Renderer->doRender() (Line: 204)
Drupal\Core\Render\Renderer->render() (Line: 474)
Drupal\Core\Template\TwigExtension->escapeFilter() (Line: 107)
__TwigTemplate_398f8481f4b8ac91d449f82143cb4dab->doDisplay() (Line: 394)
Twig\Template->displayWithErrorHandling() (Line: 367)
Twig\Template->display() (Line: 379)
Twig\Template->render() (Line: 40)
Twig\TemplateWrapper->render() (Line: 53)
twig_render_template() (Line: 372)
Drupal\Core\Theme\ThemeManager->render() (Line: 436)
Drupal\Core\Render\Renderer->doRender() (Line: 204)
Drupal\Core\Render\Renderer->render() (Line: 238)
Drupal\Core\Render\MainContent\HtmlRenderer->Drupal\Core\Render\MainContent\{closure}() (Line: 583)
Drupal\Core\Render\Renderer->executeInRenderContext() (Line: 239)
Drupal\Core\Render\MainContent\HtmlRenderer->prepare() (Line: 128)
Drupal\Core\Render\MainContent\HtmlRenderer->renderResponse() (Line: 90)
Drupal\Core\EventSubscriber\MainContentViewSubscriber->onViewRenderArray()
call_user_func() (Line: 111)
Drupal\Component\EventDispatcher\ContainerAwareEventDispatcher->dispatch() (Line: 171)
Symfony\Component\HttpKernel\HttpKernel->handleRaw() (Line: 74)
Symfony\Component\HttpKernel\HttpKernel->handle() (Line: 58)
Drupal\Core\StackMiddleware\Session->handle() (Line: 48)
Drupal\Core\StackMiddleware\KernelPreHandle->handle() (Line: 106)
Drupal\page_cache\StackMiddleware\PageCache->pass() (Line: 85)
Drupal\page_cache\StackMiddleware\PageCache->handle() (Line: 48)
Drupal\Core\StackMiddleware\ReverseProxyMiddleware->handle() (Line: 51)
Drupal\Core\StackMiddleware\NegotiationMiddleware->handle() (Line: 51)
Drupal\Core\StackMiddleware\StackedHttpKernel->handle() (Line: 704)
Drupal\Core\DrupalKernel->handle() (Line: 19)

The URL also renders in the browser without issue; I have attached a screenshot of this. If I choose to open it in a new tab, the browser seems to remove a // from the URL. Opening it as shown seems to redirect to https://localhost//m.files.bbci.co.uk/... with a "can't connect to server" message.

https://// also seems to redirect to localhost for me locally, with a "file not found" error.

I have added a browser screenshot that seems to show the asset being rendered without issue in the DOM edit screen.

More images attached showing setup

OK, I think the issue stems from my regular expression, which needs improving so that it also checks whether the URL is malformed. The current check only looks at the start of the string, e.g.

    -
      plugin: skip_on_condition
      method: row
      condition:
        plugin: not:matches
        regex: /^(https?:\/\/)/i
      message: 'We only want a string if it starts with http(s)://'
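One possible tightening, shown only as a sketch: require at least one character that is neither a slash nor whitespace immediately after the scheme, so that a value beginning with https://// no longer matches and the row is skipped. The pattern and message below are assumptions for illustration and have not been tested against the feed; the plugin names and keys are the same ones used in the step above.

```yaml
    -
      plugin: skip_on_condition
      method: row
      condition:
        plugin: not:matches
        # Assumed pattern: scheme, then "//", then a first host character
        # that is not "/" or whitespace. "https:////example" fails this test.
        regex: '/^https?:\/\/[^\/\s]+/i'
      message: 'We only want a string that starts with http(s):// followed by a host'
```

The regex is single-quoted here so that YAML treats the backslashes and brackets literally. This only guards against the extra-slash case seen above; it does not validate the rest of the URL.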

In fact there may be a use case for a process plugin to use FILTER_VALIDATE_URL which does seem to work as expected here.

🐛 Bug report
Status

Closed: works as designed

Version

10.1

Component
File system

Last updated 1 day ago

Created by

🇬🇧 United Kingdom 2dareis2do
