Migration messages for failed rows are gone

Created on 9 July 2021, over 3 years ago
Updated 21 June 2023, over 1 year ago

Problem/Motivation

I am creating migrations to add content to the system. I would like entity validation to occur, so I included "validate: true" in the migration YAML file. To test validation, I created a migration whose source data was missing a required field. When I run the migration, however, I only get an error message for the last record in the CSV file. I expected a message for every row.

I am using Migrate Source UI to run the migration.

Steps to reproduce

I created this test scenario to reproduce this issue.

Create a required field on the Article content type. I created a field called Random Info (field_random_info) that's just a plain text field. Make it required.

Create a migration for Articles, like so:

langcode: en
status: true
dependencies: {  }
id: new_articles
class: null
field_plugin_method: null
cck_plugin_method: null
migration_tags: null
migration_group: article_migration_group
label: 'Test article ingest'
source:
  plugin: csv
  ids:
    - local_id
  path: 'Will be populated by the Migrate Source UI'
process:
  title: title
  body: body
  field_random_info: random_info
destination:
  plugin: 'entity:node'
  default_bundle: article
  validate: true
migration_dependencies: null

Migration group:

langcode: en
status: true
dependencies: {  }
id: article_migration_group
label: article_migration_group
description: ''
source_type: null
module: null
shared_configuration: null

Put both files in the "config/sync" directory and import the configuration.

Create CSV migration file:

local_id,title,random_info,body
test_01,Article 1,,This is article one
test_02,Article 2,,This is article two
test_03,Article 3,,This is article three

Note that it's missing data for the required field Random Info. Run the migration via the Migrate Source UI. (I don't think Migrate Source UI is causing the issue, but I'm really not sure. It might be).

All the records should fail to ingest because they are missing the required field "Random Info".

However, the only message you'll see in the migration table is about the 3rd item. There is no messaging about the other two items. That seems odd.

Proposed resolution

Ideally all error messages would show up. From some testing, errors are being created and then aggressively deleted.

I've tracked it to the code linked below. It appears the code might need some reordering, but I'm not sure about all the scenarios it handles.
It seems like this line of code here: https://git.drupalcode.org/project/drupal/-/blob/8.9.x/core/modules/migr...
should only be run if you're going to act on the row. I noticed that the source iterates over all the rows each time it looks for a new row to work on. So on each pass it appears to delete the messages for rows that have already run (and failed), even if it's not going to act on those rows again.
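
To illustrate the behavior described above, here is a minimal standalone PHP sketch. It is not Drupal code: `run_batch()`, the `$processed` set, and the `$messages` array are hypothetical stand-ins for the batched executable, the ID map, and the message table, and the batch size of one row is an assumption made to keep the trace short.

```php
<?php
// Hypothetical model of a batched import; none of these names are Drupal APIs.

/**
 * Runs one batch: visits rows in order and handles at most $limit new rows.
 *
 * @param array $rows      Source rows keyed by source ID; TRUE means the row is valid.
 * @param array $processed Source IDs already handled by earlier batches (used as a set).
 * @param array $messages  Message store keyed by source ID.
 * @param int   $limit     Maximum number of new rows to handle in this batch.
 */
function run_batch(array $rows, array &$processed, array &$messages, int $limit): void {
  $handled = 0;
  foreach ($rows as $id => $valid) {
    if ($handled === $limit) {
      break;
    }
    // Messages are cleared for every row the iterator visits ...
    unset($messages[$id]);
    // ... even when the row was handled by an earlier batch and is skipped now.
    if (isset($processed[$id])) {
      continue;
    }
    $processed[$id] = TRUE;
    $handled++;
    if (!$valid) {
      $messages[$id] = "Row $id failed validation";
    }
  }
}

// Three invalid rows, as in the CSV above, imported one row per batch.
$rows = ['test_01' => FALSE, 'test_02' => FALSE, 'test_03' => FALSE];
$processed = [];
$messages = [];
for ($batch = 0; $batch < 3; $batch++) {
  run_batch($rows, $processed, $messages, 1);
}

// Only the message for test_03 is left at this point.
print_r($messages);
```

In this model each later batch walks past the rows that already failed and wipes their messages on the way, which matches what shows up in the message table.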

Not sure if reordering this code is the solution, it's just what I've come up with after hours of looking at this. Perhaps there is something else going on that I'm not aware of.

🐛 Bug report
Status

Closed: won't fix

Version

11.0 🔥

Component
Migration


Comments & Activities


  • 🇺🇸 United States mikelutz Michigan, USA

    Hmm, I'm not sure we can/should do anything in core here. Your issue is due to migrate_source_ui, because that uses a batched migrate executable. It runs the migration a few rows at a time. Since this isn't a SQL source, where we could do some magic with the query to remove rows that aren't going to be processed in advance, the batch executable does what you said: each batch has to run through each already processed row to see if it needs to be processed in this batch. That check happens further down:

          // Check whether the row needs processing.
          // 1. This row has not been imported yet.
          // 2. Explicitly set to update.
          // 3. The row is newer than the current highwater mark.
          // 4. If no such property exists then try by checking the hash of the row.
          if (!$row->getIdMap() || $row->needsUpdate() || $this->aboveHighwater($row) || $this->rowChanged($row)) {
            $this->currentRow = $row->freezeSource();
          }
    

    But, before we can run that check, we need to give the source and hooks a chance to act on the row:

          // Preparing the row gives source plugins the chance to skip.
          if ($this->prepareRow($row) === FALSE) {
            continue;
          }
    

    But these prepare row functions can also skip rows, and can insert a message explaining why they are skipping the row at the same time. So if we were to move the message clear to below the needs-update check, any messages from prepare row on rows that are skipped by prepare row would stack up and never be cleared. Any attempt to save the old messages and re-insert them if no new messages are added in prepare row would be flaky at best, and incorrect from a core standpoint, as the message table is supposed to reflect messages from the most recent migration run only. The problem is in the way that migrate_source_ui works under the hood, by making a migration run process only a few rows (or one row) at a time. Since we don't natively support batching in core, we have no way of knowing that this isn't a full migration run and that we should somehow not clear messages for rows we aren't processing.

    I hate to, but because this gets triggered by a non-core method of running migrations, I'm going to close it as 'won't fix'.
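
The stacking-up concern from the comment above can be sketched with a small standalone PHP model. This is only an illustration under simplifying assumptions: `run_import()` and its arguments are hypothetical, the needs-processing check is omitted, and messages are modeled as a plain list per source ID.

```php
<?php
// Hypothetical model; run_import() and its arguments are not Drupal APIs.

/**
 * Runs a full import pass over $rows.
 *
 * @param array $rows                Rows keyed by source ID; TRUE means a prepare
 *                                   step skips the row and logs a reason.
 * @param array $messages            Lists of messages keyed by source ID.
 * @param bool  $clearAfterSkipCheck FALSE models the current order, TRUE the
 *                                   reordering discussed in this issue.
 */
function run_import(array $rows, array &$messages, bool $clearAfterSkipCheck): void {
  foreach ($rows as $id => $skippedByPrepare) {
    if (!$clearAfterSkipCheck) {
      // Current order: clear the row's messages before anything else.
      $messages[$id] = [];
    }
    if ($skippedByPrepare) {
      // Stand-in for a prepare-row hook that logs a reason and skips the row.
      $messages[$id][] = "Row $id skipped during prepare";
      continue;
    }
    if ($clearAfterSkipCheck) {
      // Reordered variant: clear only for rows that are actually processed.
      $messages[$id] = [];
    }
    // The row would be imported here.
  }
}

$rows = ['row_a' => TRUE, 'row_b' => FALSE];

$current = [];
$reordered = [];
for ($run = 0; $run < 3; $run++) {
  run_import($rows, $current, FALSE);
  run_import($rows, $reordered, TRUE);
}

echo count($current['row_a']), "\n";   // 1
echo count($reordered['row_a']), "\n"; // 3
```

With the clear moved below the skip, the skipped row never reaches it, so in this model its message count grows by one on every run instead of reflecting only the latest run.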
