Problem/Motivation
In https://www.drupal.org/project/feeds/issues/2989789, logic was added that resets the hash of a feed item if an entity reference was not found, forcing the item to update on the next run. The reasoning is that if feeds with dependencies are imported in the wrong order and a referenced item is not present on the initial import, the item will still get updated next time (when the reference should be there).
However, sometimes a missing reference field could be a valid situation and we don't want to reset the hash. This is the case for me, where I'm importing feeds in a strict order and some entity references are missing because that's the way the API operates. See "Missing Resource Identifiers" in Drupal's JSON API documentation.
As a result of this code, my purge queue was constantly growing faster than it could be purged (see https://www.drupal.org/project/purge/issues/3132514, "your queue exceeded 100 000 items ! Purge shut down"): a significant number of entities with missing reference fields were being re-imported on every run because their hash was reset each time.
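The effect described above can be illustrated with a minimal sketch. This is a simplified Python model, not the Feeds module's actual PHP code: the function names, the item structure, and the use of an empty string as a "reset" hash are all assumptions made for illustration.

```python
import hashlib


def item_hash(item):
    """Stable hash of a source item (hypothetical stand-in for the Feeds item hash)."""
    return hashlib.md5(repr(sorted(item.items())).encode()).hexdigest()


def run_import(items, stored_hashes, existing_guids):
    """Import items and return the guids that were (re)processed in this run."""
    processed = []
    for item in items:
        guid = item["guid"]
        new_hash = item_hash(item)
        if stored_hashes.get(guid) == new_hash:
            continue  # Source unchanged since the last run: skip the item.
        processed.append(guid)
        stored_hashes[guid] = new_hash
        reference = item.get("article")
        if reference and reference not in existing_guids:
            # Reference not found: reset the hash so the next run retries.
            stored_hashes[guid] = ""
    return processed


# Hypothetical data: item "1" points at a reference that will never exist.
items = [
    {"guid": "1", "title": "Page A", "article": "missing-1"},
    {"guid": "2", "title": "Page B", "article": ""},
]
hashes = {}
first = run_import(items, hashes, existing_guids=set())
second = run_import(items, hashes, existing_guids=set())
print(first)   # ['1', '2']
print(second)  # ['1']
```

In this model, the item whose reference can never be resolved is processed again on every run even though its source data has not changed, which is the kind of repeated update that keeps feeding the purge queue.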
Steps to reproduce
- Install drupal with standard profile so you have a page and article content type.
- Add an entity reference field called field_article to the page content type, that targets the article content type.
- Create an article feed type that uses a csv parser. Under processor settings check the option 'Update existing content items'. Save and add mappings.
- Using the data in tests/resources/content.csv as a guide, map source data to target fields.
guid > feeds_item (guid) (unique=TRUE)
title > title
- Create a page feed type that uses a csv parser. Under processor settings check the option 'Update existing content items'. Save and add mappings.
- Using the data in tests/resources/content-with-reference.csv as a guide, map source data to target fields.
guid > feeds_item (guid) (unique=TRUE)
title > title
article > field_article (referenced by feeds_item guid)
- Run the page feed type with the content-with-reference.csv file. There should be two page nodes imported. Both should have an empty field_article entity reference.
- Change the title of 'Eodem modo typi' to 'Page 1' and save.
- Change the title of 'Aliquam feugiat diam' to 'Page 2' and save.
- Re-import the page feed with the same file.
- The titles should have reverted to 'Eodem modo typi' and 'Aliquam feugiat diam' because the hash was reset.
Proposed resolution
I think that if the behaviour introduced in issue 2989789 is desired, we should add a new opt-in option that changes it, so that existing functionality is preserved by default.
Remaining tasks
Create a patch to change the behaviour.
User interface changes
A new config field (probably boolean).
API changes
N/A
Data model changes
A new configuration setting on the entity reference target plugin.
Manual Testing
To manually test, apply the patch and follow the Steps to reproduce. The only difference is that when you map the entity reference field in the page feed type, you check the new 'Do not reset hash when an entity reference was not found' option. With that option checked, the page titles will not revert the second time you import the feed.
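
The expected outcome of the manual test can be modelled with a short sketch. This is a Python simulation, not the patch itself: the `skip_hash_reset` parameter stands in for the proposed option, and the guids and article keys are hypothetical because the real values live in the test CSV files.

```python
import hashlib


def item_hash(row):
    """Stable hash of a source row (hypothetical stand-in for the Feeds item hash)."""
    return hashlib.sha256(repr(sorted(row.items())).encode()).hexdigest()


def import_pages(rows, nodes, hashes, articles, skip_hash_reset=False):
    """Import page rows, overwriting nodes whose stored hash differs."""
    for row in rows:
        guid = row["guid"]
        new_hash = item_hash(row)
        if hashes.get(guid) == new_hash:
            continue  # Source unchanged: leave the node (and manual edits) alone.
        nodes[guid] = {
            "title": row["title"],
            "field_article": articles.get(row["article"]),
        }
        hashes[guid] = new_hash
        missing = bool(row["article"]) and row["article"] not in articles
        if missing and not skip_hash_reset:
            hashes[guid] = None  # Default behaviour: force an update next run.


def scenario(skip_hash_reset):
    """Import, edit both titles by hand, re-import, and return the titles."""
    rows = [
        {"guid": "p1", "title": "Eodem modo typi", "article": "a1"},
        {"guid": "p2", "title": "Aliquam feugiat diam", "article": "a2"},
    ]
    nodes, hashes, articles = {}, {}, {}  # No articles were imported.
    import_pages(rows, nodes, hashes, articles, skip_hash_reset)
    nodes["p1"]["title"] = "Page 1"
    nodes["p2"]["title"] = "Page 2"
    import_pages(rows, nodes, hashes, articles, skip_hash_reset)
    return [nodes["p1"]["title"], nodes["p2"]["title"]]


print(scenario(skip_hash_reset=False))  # ['Eodem modo typi', 'Aliquam feugiat diam']
print(scenario(skip_hash_reset=True))   # ['Page 1', 'Page 2']
```

With the option off, the second import overwrites the edited titles. With it on, the stored hash still matches the unchanged source, so the items are skipped and the edits survive.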