Image link is mapped incorrectly

Created on 2 June 2022, about 2 years ago
Updated 28 September 2023, 9 months ago

I am importing an RSS feed using Feeds. The import works fine when run from the UI.
When I run the drush command to import all feeds, the imported data gets mismatched.

I have an image field that stores the URL of the image, and for some reason this field ends up with incorrect links. Some nodes have links that belong to another node; every node should have its own image link.

I am running Drupal locally using DDEV, and I can see that all the images are downloaded into the files directory, but not all of them are mapped to the field correctly. I don't see any errors.

🐛 Bug report
Status

Active

Component

Code

Created by

Comments & Activities

  • 🇺🇸 United States jnicola

    We're running into a similar issue. No updates on how we manage to handle it.

  • First commit to issue fork.
  • Core: 9.5.5 + Environment: PHP 7.4 & MySQL 5.7
    last update 9 months ago
    703 pass
  • @maskedjellybean opened merge request.
  • 🇺🇸 United States maskedjellybean Portland, OR

    Merge request opened:

    https://git.drupalcode.org/project/feeds/-/merge_requests/137

    If anyone needs a patch:

    https://git.drupalcode.org/project/feeds/-/merge_requests/137.patch

    Explanation and disclaimer

    Please know this is only a workaround and I don't truly expect this to be merged. It does not get at the true source of the issue. But let me explain what this workaround does.

    I can only speak from our experience. This can happen when you have multiple feed types with multiple feeds and they all update one content type. It's certainly possible this bug occurs under other circumstances, but this is ours.

    Removing this if statement from \Drupal\feeds\Laminas\Extension\Mediarss\Entry::getMediaElement will work around the issue:

        if (array_key_exists($media_key, $this->data)) {
          return $this->data[$media_key];
        }
    

    This is because if you import multiple feeds at once, either by running cron or by running drush feeds:import-all (although I have seen this happen more consistently when triggered by cron), $this->data can go stale. Once one feed containing <media:content /> data has been imported, and the next feed imported also contains <media:content /> data, $this->data can still hold the values from the previous feed. Removing the if statement forces the media:content URL to be parsed/retrieved from the current feed every time.
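
To make the stale-data behaviour concrete, here is a minimal, self-contained sketch. It is not the Feeds or Laminas code: the class CachingEntryExtension, the function import_feed(), the 'media__' key format and the example.com URLs are all hypothetical. It only assumes, as described above, that one object ends up being reused across feeds and memoizes values under a key that does not identify the feed.

```php
<?php

/**
 * Hypothetical stand-in for an entry extension that memoizes parsed values.
 * This is NOT the real Mediarss\Entry class, only an illustration.
 */
final class CachingEntryExtension {

  /** @var array<string, string> Memoized values, keyed by entry position only. */
  private array $data = [];

  /** @var string[] Media URLs of the feed currently being parsed. */
  private array $urls = [];

  private int $entryKey = 0;

  /** Points the extension at one entry of one feed. */
  public function setEntry(array $urls, int $entryKey): void {
    $this->urls = $urls;
    $this->entryKey = $entryKey;
  }

  /** Returns the media URL, optionally short-circuiting through the cache. */
  public function getMediaUrl(bool $useCache = TRUE): string {
    // Assumed key format: it identifies the entry position, not the feed.
    $media_key = 'media__' . $this->entryKey;
    if ($useCache && array_key_exists($media_key, $this->data)) {
      // Early return, analogous to the if statement removed by the workaround.
      return $this->data[$media_key];
    }
    $this->data[$media_key] = $this->urls[$this->entryKey];
    return $this->data[$media_key];
  }

}

/** Reads every media URL of a feed through the given extension object. */
function import_feed(CachingEntryExtension $ext, array $urls, bool $useCache): array {
  $result = [];
  foreach (array_keys($urls) as $key) {
    $ext->setEntry($urls, $key);
    $result[] = $ext->getMediaUrl($useCache);
  }
  return $result;
}

$feed_a = ['https://example.com/a-0.jpg', 'https://example.com/a-1.jpg'];
$feed_b = ['https://example.com/b-0.jpg', 'https://example.com/b-1.jpg'];

// Same object reused for both feeds, early return in place.
$shared = new CachingEntryExtension();
import_feed($shared, $feed_a, TRUE);
$stale = import_feed($shared, $feed_b, TRUE);

// Same reuse, but the early return is skipped (the workaround).
$patched = new CachingEntryExtension();
import_feed($patched, $feed_a, FALSE);
$fresh = import_feed($patched, $feed_b, FALSE);
```

In this sketch $stale holds feed A's URLs even though feed B was being imported, while $fresh holds feed B's URLs. That matches the symptom of nodes receiving another node's image link, but whether the real code shares an instance in this way is exactly the open question here.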

    Why $this->data contains stale data is a larger issue. I've tried to understand it but I can't. I think it has to do with \Drupal\feeds\Feeds\Parser\SyndicationParser::parse and the way $entry is instantiated within the foreach loop (currently starting at line 94). It is not clear to me what class $entry is or how the current code even works.

    Within the foreach loop, $entry may be one of these classes:

    • \Drupal\feeds\Laminas\Extension\Mediarss\Entry
    • \Laminas\Feed\Reader\Feed\Rss
    • \Laminas\Feed\Reader\Entry\Rss

    It seems to me that \Drupal\feeds\Laminas\Extension\Mediarss\Entry should extend \Laminas\Feed\Reader\Feed\Rss instead of \Laminas\Feed\Reader\Feed\AbstractFeed. I say this because there are no issues with stale/incorrect data for the methods on \Laminas\Feed\Reader\Feed\Rss such as \Laminas\Feed\Reader\Feed\Rss::getTitle.
