Allow parsers to validate source before import

Created on 28 August 2024, 8 months ago

Problem/Motivation

Users sometimes misname columns when supplying a CSV file, so the column names in the file don't match those specified on the mapping form. This creates unnecessary frustration for the user: they either get validation errors for every imported item (for example, if the column name for a required field was misspelled), or they wonder why data for a certain field wasn't imported.

Steps to reproduce

  1. Create a feed type, select the CSV parser.
  2. On the mapping form, map "title" to "Title" and "body" to "Body".
  3. Create a CSV file with the following contents instead:
    title,description
    Foo,Lorem Ipsum Dolor Sit Amet
    

In the above case, the column that was meant for the Body field is called "description" instead of "body" and as a result no value for the Body field is imported.
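To illustrate the kind of check a parser could run on this file, here is a minimal, language-neutral sketch in Python (the actual Feeds code would be PHP). The function name `find_missing_columns` and its parameters are hypothetical and not part of the Feeds module; the sketch only compares the CSV header row against the source names used on the mapping form.

```python
import csv
import io


def find_missing_columns(csv_text, mapped_sources):
    """Return the mapped source names that are absent from the CSV header row.

    Hypothetical helper for illustration only; not part of the Feeds module.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file yields an empty header
    present = {name.strip() for name in header}
    return [name for name in mapped_sources if name not in present]


# The CSV file from the steps above, checked against the mapped sources.
csv_text = "title,description\nFoo,Lorem Ipsum Dolor Sit Amet\n"
print(find_missing_columns(csv_text, ["title", "body"]))  # prints ['body']
```

A non-empty result is what the import form would turn into a warning before any item is processed.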

Proposed resolution

When clicking the "Import" button on /feed/x/import:

  1. Validate the source before starting the import process, if the parser supports that.
  2. If the parser reports issues:
    1. Display a warning message with the reported issues.
    2. Provide an option to ignore the warning and continue the import.

In follow-up issues we could also think about how to apply this type of validation in other places where a user can start an import:

  • On the "Import on background" page (/feed/1/schedule-import)
  • On bulk imports (/admin/content/feed)
  • On imports using the Drush commands "feeds:import" or "feeds:import-all"

But for now, I think we had better limit the focus to the import form at /feed/x/import.

Implementation

This is what I think needs to be coded to implement the proposed resolution.

  • An interface needs to be added that specifies a method to validate the source.
  • \Drupal\feeds\Feeds\Parser\CsvParser would implement this interface.
  • \Drupal\feeds\Form\FeedImportForm needs to check on submit if the selected parser implements the added interface. If not, it starts the import like it does now.
  • There needs to be an easy programmatic way to fetch only a source and pass it to a parser for validation. Perhaps add a method for that to \Drupal\feeds\FeedImportHandler?
  • The import form needs to call the method(s) required to fetch the source and have the parser validate it.
  • When there are no issues found on validation, the import should continue with parsing. It should not fetch again.
  • When the user chooses to continue the import (despite the validation issues), it would be cool if refetching isn't necessary either.
  • When there are validation issues, the import form should unlock the feed, and lock it again when the user chooses to continue the import.
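
The flow described in the list above can be sketched as follows. This is a language-neutral illustration in Python, not Feeds module code: every name here (`ValidatableParser`, `validate_source`, `Feed`, `submit_import`, `continue_import`) is hypothetical, and keeping the fetched source on the feed object is an assumption about how refetching could be avoided.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Callable, List, Optional


class ValidatableParser(ABC):
    """Hypothetical interface: a parser that can validate a fetched source."""

    @abstractmethod
    def validate_source(self, raw: str) -> List[str]:
        """Return a list of human-readable issues; empty means valid."""


class CsvLikeParser(ValidatableParser):
    def __init__(self, mapped_sources: List[str]):
        self.mapped_sources = mapped_sources

    def validate_source(self, raw: str) -> List[str]:
        lines = raw.splitlines()
        header = [h.strip() for h in lines[0].split(",")] if lines else []
        return [f'Missing column "{s}"' for s in self.mapped_sources
                if s not in header]


@dataclass
class Feed:
    locked: bool = False
    fetched: Optional[str] = None  # assumption: fetch result is kept for reuse
    fetch_count: int = 0


def submit_import(feed: Feed, parser, source: str,
                  parse: Callable[[str], None]) -> List[str]:
    """Fetch once, validate if the parser supports it, then parse."""
    feed.locked = True
    feed.fetched = source
    feed.fetch_count += 1
    if isinstance(parser, ValidatableParser):
        issues = parser.validate_source(feed.fetched)
        if issues:
            feed.locked = False  # unlock while waiting for the user's choice
            return issues
    parse(feed.fetched)  # no issues, or parser cannot validate: import as now
    feed.locked = False
    return []


def continue_import(feed: Feed, parse: Callable[[str], None]) -> None:
    """User chose to ignore the warning: lock again and reuse the fetch."""
    feed.locked = True
    parse(feed.fetched)
    feed.locked = False


feed, parsed = Feed(), []
issues = submit_import(feed, CsvLikeParser(["title", "body"]),
                       "title,description\nFoo,Lorem\n", parsed.append)
if issues:
    continue_import(feed, parsed.append)
```

In this sketch the source is fetched exactly once, whether validation passes or the user chooses to continue. In a real implementation the fetch result would have to survive between the form submission and the "Continue import" request, which this in-memory example does not address.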

Remaining tasks

TBD.

User interface changes

After clicking "Import" on /feed/x/import, the user would see a warning when there are validation issues, along with a button called "Continue import". Clicking this button would ignore the validation issues and continue the import.

API changes

TBD.

Data model changes

None.

Feature request
Status

Active

Version

3.0

Component

Code

Created by

🇳🇱Netherlands megachriz
