Skipped rows are processed continuously at each batch iteration

Created on 6 January 2022, over 2 years ago
Updated 17 July 2024, about 2 months ago

Problem/Motivation

MigrateBatchExecutable appears to handle skipped rows incorrectly.

Neither MigrateBatchExecutable nor core's MigrateExecutable can distinguish rows skipped during the current run from rows skipped previously, so each batch iteration retries the skipped rows and processing time grows as the batch progresses.

As a side effect, this may lead to incomplete migrations (e.g. batch-based migrations such as a CSV import run from the UI), since the batch counts skipped rows multiple times.
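
To illustrate the repeated work, here is a toy Python model of the behaviour described above. It is not Drupal code: simulate_batches and ROWS are hypothetical names, and the model assumes that every batch iteration rescans the source from the start, that already-imported rows are filtered out, and that only imported rows count toward the batch limit. The real executable may differ in all of these details.

```python
# Toy model (assumption, not Drupal code): each batch iteration rescans the
# source from the beginning and has no record of rows skipped earlier in the run.

# Hypothetical sample data: (row id, value). An empty value stands in for a
# row that a skip_on_empty-style check would skip.
ROWS = [(1, ""), (2, "a"), (3, "b"), (4, ""), (5, "c"), (6, "d")]


def simulate_batches(rows, batch_size):
    """Return (imported ids, processing attempts per row id)."""
    imported = set()
    attempts = {}
    while True:
        imported_now = 0
        for row_id, value in rows:
            if row_id in imported:
                # Already-imported rows are filtered out before processing.
                continue
            attempts[row_id] = attempts.get(row_id, 0) + 1
            if value == "":
                # Skipped: nothing remembers that this happened in this run,
                # so the next iteration will attempt the row again.
                continue
            imported.add(row_id)
            imported_now += 1
            if imported_now == batch_size:
                break
        if imported_now == 0:
            # No new imports in this iteration, so the model stops.
            break
    return imported, attempts
```

In this model, calling simulate_batches(ROWS, batch_size=2) attempts every imported row once, while the skipped rows are attempted again on each later iteration. The earlier a skipped row appears in the source, the more often it is reprocessed.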

Steps to reproduce

Run a migration that uses the skip_on_empty process plugin with a debugger attached. Observe that the plugin's transform() method is called for the first skipped row on every batch iteration.

Proposed resolution

Patch attached.

Remaining tasks

None.

User interface changes

None.

API changes

None.

Data model changes

None.

🐛 Bug report
Status

Needs work

Version

6.0

Component

Code

Created by

🇺🇦 Ukraine abramm Lutsk

  • Needs issue summary update


Comments & Activities

  • First commit to issue fork.
  • Pipeline finished with Failed
    about 2 months ago
    Total: 287s
    #226393
  • Pipeline finished with Canceled
    about 2 months ago
    Total: 187s
    #226397
  • Pipeline finished with Canceled
    about 2 months ago
    Total: 112s
    #226398
  • Pipeline finished with Failed
    about 2 months ago
    Total: 204s
    #226400