Provide a way to unset empty Migrate Row destinations

Created on 3 May 2022, almost 3 years ago
Updated 12 January 2024, over 1 year ago

Problem/Motivation

Currently some non-one-to-one Drupal migrations can result in empty destination properties of really important fields e.g. ID.

e.g. doing a migration lookup against the primary ID, and returning a NULL would result in the ID being set as an "empty property", and subsequently set to NULL everytime the Migration entity is updated as part of the migration, which might not necessarily be the expected behaviour especially for Migration stubs.

And there's currently no way for process plugins to remove empty properties.

Steps to reproduce

Provide a migration which does a lookup against the exisiting ID in some way in order to detect uniqueness, if the lookup fails, the ID will always be set to NULL.

Proposed resolution

Provide API that allows pre row save hooks to easily stop certain important properties from being explicitly flagged as NULL.

Or provide a plugin property/API which can be used to skip the setEmptyDestinationProperty behaviour similar to the $plugin->multiple(); method.

Or we could explicitly ignore IDs and other special fields from being flagged as "empty destinations" (unsure what consequences this might have).

Remaining tasks

Provide issue fork/patch with tests.

User interface changes

N/A

API changes

Potentially new Migrate methods:

  • \Drupal\migrate\Row::unsetEmptyDestinationProperty($property)

And potentially:

  • \Drupal\migrate\Plugin\MigrateProcessInterface::emptyDestination()
  • \Drupal\migrate\ProcessPluginBase::emptyDestination()

To determine when $row->setEmptyDestinationProperty($destination); is called by the MigrateExecutable.

Data model changes

N/A

Release notes snippet

✨ Feature request
Status

Active

Version

9.5

Component
MigrationΒ  β†’

Last updated about 7 hours ago

Created by

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • πŸ‡ΊπŸ‡ΈUnited States amaisano Boston

    Not sure if this is 100% the same problem, but I solved this by completely rejecting migration mappings from entering the database if arbitrary criteria was not met. I did this by extending the skip_on_empty plugin to what I call exclude_on_empty.

    class ExcludeOnEmpty extends SkipOnEmpty {
    
      /**
       * Excludes the current row when value is not set.
       *
       * {@inheritDoc}
       */
      public function row($value, MigrateExecutableInterface $migrate_executable, Row $row, $destination_property) {
        try {
          return parent::row($value, $migrate_executable, $row, $destination_property);
        }
        catch (MigrateSkipRowException $e) {
          throw new MigrateSkipRowException($e->getMessage(), FALSE);
        }
      }
    
    }
    
  • πŸ‡ΊπŸ‡ΈUnited States danflanagan8 St. Louis, US

    Here's a process-plugin approach to the same basic problem: ✨ Add set_on_condition process plugin Needs review

    I'm not sure it's a good idea, but I wanted to throw it out there.

  • Pipeline finished with Failed
    6 months ago
    Total: 97s
    #313556
  • Thanks @danflanagan8, I think this issue addresses a different situation, which is related to the fact that the empty value gets out of sync with the actual state of the row field.

    I've updated the issue summary accordingly.

  • Pipeline finished with Canceled
    6 months ago
    Total: 133s
    #313568
  • Pipeline finished with Failed
    6 months ago
    Total: 491s
    #313570
  • Pipeline finished with Failed
    6 months ago
    Total: 400s
    #313606
  • Pipeline finished with Failed
    6 months ago
    Total: 5374s
    #313817
  • πŸ‡¨πŸ‡¦Canada dylan donkersgoed London, Ontario

    I also encountered a problem related to this and codebymikey's solution fixed it for me. I don't think a process plugin would've worked for my use case.

    Essentially I have several migrations that would conditionally set the node or user ID (e.g. because the node might already have translations or revisions which I think is a fairly common use case).

    So, e.g., I most recently ran into the issue with a user migration that looked like this:

    uuid: 337cbd24-5039-4a30-b646-1c2032f68163
    langcode: en
    status: true
    dependencies: {  }
    id: upgrade_d7_user
    class: Drupal\user\Plugin\migrate\User
    field_plugin_method: null
    cck_plugin_method: null
    migration_tags:
      - 'Drupal 7'
      - Content
    migration_group: migrate_drupal_7
    label: 'User accounts'
    source:
      plugin: hh_migration_d7_user_with_group
    process:
      # A subset of users already exist, target their UIDs if they are found.
      uid:
        - plugin: skip_on_value
          source: uid
          method: process
          not_equals: true
          value:
            - 1
            - 14347
            - 14927
            - 16473
            - 18850
        - plugin: static_map
          map:
            1: 1
            14347: 6
            14927: 4
            16473: 5
            18850: 3
          default_value: null
      name:
        -
          plugin: get
          source: name
        -
          plugin: skip_on_value
          method: row
          value:
      pass:
        -
          plugin: get
          source: pass
      mail:
        -
          plugin: get
          source: mail
      created:
        -
          plugin: get
          source: created
      access:
        -
          plugin: get
          source: access
      login:
        -
          plugin: get
          source: login
      status:
        -
          plugin: get
          source: status
      timezone:
        -
          plugin: get
          source: timezone
      langcode:
        -
          plugin: user_langcode
          source: entity_language
          fallback_to_site_default: false
      preferred_langcode:
        -
          plugin: user_langcode
          source: language
          fallback_to_site_default: true
      preferred_admin_langcode:
        -
          plugin: user_langcode
          source: language
          fallback_to_site_default: true
      init:
        -
          plugin: get
          source: init
      roles:
        - plugin: static_map
          source: roles
          map:
            authenticated user: employee
            administrator: administrator
          default_value: employee
      user_picture:
        -
          plugin: default_value
          source: picture
          default_value: null
        -
          plugin: migration_lookup
          migration: upgrade_d7_file
      groups:
        - plugin: static_map
          source: og_gid
          map:
            6093: 3
            18179: 4
          default_value: 1
        - plugin: default_value
          default_value: 1
      field_about_me:
        -
          plugin: get
          source: field_about_me
      field_contact_me:
        -
          plugin: get
          source: field_contact_me
      field_first_name:
        -
          plugin: get
          source: field_first_name
      field_last_name:
        -
          plugin: get
          source: field_last_name
      field_photo:
        -
          plugin: sub_process
          source: field_photo
          process:
            target_id:
              plugin: migration_lookup
              migration: upgrade_d7_file
              source: fid
            alt: alt
            title: title
            width: width
            height: height
    destination:
      plugin: 'entity:user'
    migration_dependencies:
      optional:
        - upgrade_d7_file
    

    The important section is the uid bit. It will assign the UID for certain users only if the user already exists in Drupal so the migration will map it to the existing user. This is needed to make sure all the user relationships are preserved in revision history etc. even for the users that were already manually created. I've run into the problem in several other places though for migrations that handle node revisions/translations.

    The problem was when I ran a migration with the --update flag I would get numerous errors like:

    [error] SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry '775cbb48-ed5d-4fb0-aa15-09b315905fe7' for key 'user_field__uuid__value': INSERT INTO "users" ("uuid", "langcode") VALUES (:db_insert_placeholder_0, :db_insert_placeholder_1); Array
    (
    [:db_insert_placeholder_0] => 775cbb48-ed5d-4fb0-aa15-09b315905fe7
    [:db_insert_placeholder_1] => en
    )

    this happened when the uid was configured in the migration even if a process plugin skipped processing that field and no value was set. I think because the migration would somehow end up trying to do a mix of updating the already existing entity based on the existing destination IDs and creating a new one because the uid was not set.

    Removing the empty destination property seems to have fixed the issue, but there is no way to do that without codebymikey's patch.

Production build 0.71.5 2024