Empty file doesn't trigger clean() to delete all imported items

Created on 2 May 2024, 8 months ago
Updated 8 September 2024, 4 months ago

Hi !

I implemented a specific feed parser, extending CSV parser, to parse IDX protocol based files (same format, without header, separating by #). I selected "delete" on the processor to remove items that disapears from the file. When I import the first time the file, parsing is successful and items are created. When I remove some lines in my files, the processor remove the item accordingly.

Now when I give an empty file to the parser, it bypass the "clean" event and old items remains in the system... It seems the clean method is called ONLY if there is one or more items in the file...

I don't know if it's possible, but maybe is there a possibility to "force" the cleaning of old items when the file is empty ? If anyone see a quick solution to implement it would be helpful.

Thanks in advance for any suggestion.

πŸ› Bug report
Status

Needs work

Component

Generic entity processor

Created by

πŸ‡¨πŸ‡­Switzerland titouille

Live updates comments and jobs are added and updated live.
  • Needs tests

    The change is currently missing an automated test that fails when run with the original code, and succeeds when the bug has been fixed.

Sign in to follow issues

Merge Requests

Comments & Activities

  • Issue created by @titouille
  • πŸ‡¨πŸ‡­Switzerland titouille

    Ok, so if anyone search a quick answer to this problem, I solved it like this.

    Because I have a IDXFetcherResult extending FetcherResult, I can override the "cleanUp" method.

    I have a IDXFileFetcher too, with fetch() method. I used this fetch method to pass to the IDXFetcherResult constructor the $feed and the "update_non_existent" state :

    return new IDXFetcherResult(
      $path,
      $feed,
      $this->feedType->getProcessor()->getConfiguration('update_non_existent')
    );
    

    Now, on my IDXFetcherResult, I implemented the cleanUp method like this :

      public function cleanUp() {
    
        // Get file content as string.
        $content = file_get_contents($this->filePath);
    
        // Check if file is empty.
        // If it's the case, and only if it's the case, we must
        // clean ALL previously imported items.
        // We must add this process because the standard feeds system
        // doesn't clean items when it encount an empty list.
        if (empty(trim($content))) {
    
          $storage = \Drupal::entityTypeManager()
            ->getStorage('my-entity-type');
    
          // Get list of entities to clean.
          $ids = $storage
            ->getQuery()
            ->accessCheck(FALSE)
            ->condition('feeds_item.target_id', $this->feed->id())
            ->condition('feeds_item.hash', $this->updateNonExistent, '<>')
            ->execute();
    
          if (!empty($ids)) {
            /** @var \Drupal\Component\EventDispatcher\ContainerAwareEventDispatcher $dispatcher */
            $dispatcher = \Drupal::service('event_dispatcher');
    
            $entities = $storage->loadMultiple($ids);
    
            // Iterate over each entity and mimic the clean process.
            // We use the same process to avoid potential errors.
            foreach ($entities as $entity) {
    
              $dispatcher->dispatch(new InitEvent($this->feed, 'clean'), FeedsEvents::INIT_IMPORT);
              $dispatcher->dispatch(new CleanEvent($this->feed, $entity), FeedsEvents::CLEAN);
            }
          }
        }
      }
    

    Hope this can help.

  • πŸ‡¦πŸ‡ΊAustralia nigelcunningham Geelong

    nigel cunningham β†’ made their first commit to this issue’s fork.

  • Status changed to Needs review 4 months ago
  • πŸ‡¦πŸ‡ΊAustralia nigelcunningham Geelong

    Attaching a patch I've prepared that ensures the cleanup is done when there is no data to import.

  • Pipeline finished with Failed
    4 months ago
    Total: 555s
    #272937
  • Status changed to Needs work 4 months ago
  • πŸ‡³πŸ‡±Netherlands megachriz

    This looks like a good approach.

    It does require fixing to tests and spelling:

    And I think that this requires test coverage:

    • Add a test case to \Drupal\Tests\feeds\Kernel\UpdateNonExistentTest that demonstrates that content is cleaned when the source is empty.
    • Add a test case to \Drupal\Tests\feeds\Kernel\UpdateNonExistentTest that ensures that content is not cleaned when there is a fetch error. For example when the HTTP Fetcher receives a 403. Perhaps this case can be simplified with a custom fetcher that throws an exception of some kind. I think that a \Drupal\feeds\Exception\FetchException gets thrown on a 403.
    • Add a test case to \Drupal\Tests\feeds\Kernel\UpdateNonExistentTest that ensures that content is not cleaned when a fetch resulted into a case where no data changed. The HTTP Fetcher receives a 304 in that case and throws a \Drupal\feeds\Exception\EmptyFeedException. Perhaps this case can be simplified with a custom fetcher that just throws a EmptyFeedException.
  • πŸ‡¦πŸ‡ΊAustralia nigelcunningham Geelong

    Ah, the spelling isn't a mistake, it's because I'm not an American!

    I'll see if I can find time to write the tests but I'm quite likely not to get around to it due to all the other things on my plate, sorry.

  • πŸ‡³πŸ‡±Netherlands megachriz

    Ah, the spelling isn't a mistake, it's because I'm not an American!

    Ah okay. I'm not American either. But I'm also not English. I just happen to see that the MR failed cspell tests. I think that the standards on drupal.org are American English. For example, elsewhere I see that the word "color" is used instead of the British "colour". So I think we should follow the American English spelling here.

    I'll see if I can find time to write the tests but I'm quite likely not to get around to it due to all the other things on my plate, sorry.

    I understand. Then this issue will likely hang here for a while. My plate is also quite big and I do my best to prioritize Feeds issues. Currently I think there are many issues with a higher priority than this one, so this one then will have to wait.

Production build 0.71.5 2024