Url::fromUri() can throw an exception with a bad URI value from user data, log an error instead?

Created on 15 March 2018, about 7 years ago
Updated 27 January 2023, over 2 years ago

I had invalid URL values in a link field, stored in the database due to a flawed import from contentimport β†’ . The module managed to save "n/a" in some URL field values. That issue about the module is here: #2953064: Invalid URLs cause cron & views to stop working β†’

Later I found that both views and cron content indexing died fatally from any attempt to work with the nodes containing this URL. Fortunately the node editing page let me delete the bad links easily and everything returns to normal.

A step to reproduce, could be making a view that displays a node with "n/a" set in the URL field (or any other invalid value).

I think this is too severe a failure for Drupal for this type of content. If invalid content is in a field it is better to help the end user locate and remove this, than it is to simply stop working without any pointer to the source of the problem. Bad links in particular are something more likely to happen, perhaps imported from a feed automatically or similarly.

IMHO there should be an error placed on Watchdog logs with helpful information about which node is offending. Currently the error does not indicate what node is the problem because I think that is not available to the function.

I'm not sure what parts of the software are counting on an Exception to happen so what needs to be accounted for in a change, isn't clear to me.

Also I have put this under cron but it applies equally to views. If a particular node is jamming cron's search indexing process then that would be best to make it possible to trace which node is bad.

The exceptions are like:
Uncaught PHP Exception InvalidArgumentException: "The URI 'n/a' is invalid. You must use a valid URI scheme." at /web/core/lib/Drupal/Core/Url.php line 281

The bit of the public static function fromUri is:

    elseif (empty($uri_parts['scheme'])) {
      throw new \InvalidArgumentException("The URI '$uri' is invalid. You must use a valid URI scheme.");
    }

I wonder if it would be better to simply return an empty string instead of an exception in this case. Certainly, I think if it is killing the cron indexing it would be better to not have an exception or work around it somehow.

Also maybe another optional element on the $options parameter array, perhaps like, "context" or "origin" which might be a node URL or some other reference to include in the error. Then whatever process fired the function fromUri could include this info. That alone would improve the situation a lot.

Related issues: This concerns generally changing the ways that URLs are stored and read out of the database in Drupal. Some of these are kind of loosely related but illustrate different aspects of how URLs are handled.

πŸ“Œ Task
Status

Active

Version

10.1 ✨

Component
BaseΒ  β†’

Last updated about 1 hour ago

Created by

πŸ‡ΊπŸ‡ΈUnited States hongpong Philadelphia

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • πŸ‡³πŸ‡±Netherlands seanB Netherlands

    The big problem with the InvalidArgumentException is the site breaking in a way users are not able to correct any mistakes. Of course an argument could be made that bad links should never get saved in the first place, but once you get to a place where this already happened, it would be nice to be able to fix it.

  • πŸ‡«πŸ‡·France mably

    We have the problem with entities imported using the Feeds module.

    What would be the best way to handle it?

    Check the the url validity when importing and remove the invalid ones?

Production build 0.71.5 2024