The migrate Row class API is incomplete

Created on 19 April 2025, 26 days ago

Problem/Motivation

There is currently no way to remove an empty destination property from a Row once it has been added. This limits the ability of migration event subscribers to alter the outcome of a migration update.

Steps to reproduce

Create a migration that will null out the contents of a field for a particular entity. Use the PRE_ROW_SAVE event to attempt to undo that, and see that the Row class does not provide a public method to accomplish this.

Proposed resolution

Add public methods to the Row class. Rather than add a method specific to remove an empty property, the MR adds functions that can check for and remove any property, whether it is empty or not, which I think is more useful.

Remaining tasks

None

User interface changes

None

Introduced terminology

None

API changes

New methods added to \Drupal\migrate\Row.

Data model changes

None

Release notes snippet

None

πŸ› Bug report
Status

Active

Version

11.1 πŸ”₯

Component

migration system

Created by

πŸ‡ΊπŸ‡ΈUnited States john.oltman

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

  • Issue created by @john.oltman
  • πŸ‡ΊπŸ‡ΈUnited States john.oltman
  • Pipeline finished with Canceled
    26 days ago
    Total: 70s
    #477387
  • πŸ‡ΊπŸ‡ΈUnited States john.oltman
  • πŸ‡ΊπŸ‡ΈUnited States john.oltman

    john.oltman β†’ changed the visibility of the branch 3520065-migrate-row-remove-empty to hidden.

  • Pipeline finished with Failed
    26 days ago
    Total: 677s
    #477386
  • Pipeline finished with Success
    26 days ago
    Total: 1145s
    #477391
  • πŸ‡ΊπŸ‡ΈUnited States john.oltman
  • Pipeline finished with Success
    26 days ago
    Total: 550s
    #477403
  • Pipeline finished with Success
    26 days ago
    Total: 759s
    #477409
  • πŸ‡ΊπŸ‡ΈUnited States john.oltman
  • πŸ‡ΊπŸ‡ΈUnited States smustgrave

    Small comments. Tagging for sub-maintainer sign off for API changes.

  • Pipeline finished with Success
    25 days ago
    Total: 1246s
    #477813
  • πŸ‡ΊπŸ‡ΈUnited States john.oltman
  • πŸ‡ΊπŸ‡ΈUnited States benjifisher Boston area

    Thanks for suggesting this addition to the Migrate API. I think this is more of a feature request than a bug report, so I am updating the category.

    I am trying to understand the use case for this addition. Is it mostly for migrations that update existing entities? If a migration is creating a new entity, then I think it does not matter whether the Row object has a null property or is just missing that property.

    I also wonder whether we need all four of the new methods. Are hasDestinationOrEmptyProperty() and removeDestinationOrEmptyProperty() really useful? Again, what is the use case?

    For both of those questions, it would help to see a code snippet of how you are using (or plan to use) the new methods.

  • πŸ‡ΊπŸ‡ΈUnited States john.oltman

    Yes, typically an update migration. I have a client with a business rule that states that under certain circumstances, data in an important field should not be altered or removed by a migration import, even if it otherwise would be. This is impossible without patching the Row class. Simply put, the class has two protected variables that work in tandem, destination properties and empty properties. The destination has set, get and remove. The empty only has set and get. It needs remove, just like its sibling. Call it a feature request if you will, but it seems to be an oversight.

    Regarding `hasDestinationOrEmptyProperty` and `removeDestinationOrEmptyProperty`, those are convenience methods - strictly speaking they are not needed, but are very useful when trying to implement the "prevent alteration of field in any way" use case. We might as well add them while we are in there.

  • πŸ‡ΊπŸ‡ΈUnited States john.oltman
  • πŸ‡ΊπŸ‡ΈUnited States john.oltman

    Code snippet in an event subscriber:

    public function onPreRowSave(MigratePreRowSaveEvent $event): void {
      $row = $event->getRow();
      if ($field_name = $this->complexBusinessLogicThatDeterminesWhichFieldNotToAlter($row)) {
        if ($row->hasDestinationOrEmptyProperty($field_name)) {
          $row->removeDestinationOrEmptyProperty($field_name);
        }
      }
    }
    
    public static function getSubscribedEvents(): array {
      return [
        MigrateEvents::PRE_ROW_SAVE => ['onPreRowSave'],
      ];
    }
    
  • πŸ‡ΊπŸ‡ΈUnited States benjifisher Boston area

    Thanks for the feedback.

    I think we should not implement the two convenience methods. There are several reasons:

    The has...() convenience method calls hasDestinationProperty() and (if that returns FALSE) hasEmptyDestinationProperty(). If the code next calls the remove... convenience method, as in your snippet, it calls the same has... method(s). That seems sloppy.

    The method names are long, and they do not adequately describe what the methods do. Anyone reading the code will have to look up what the method does. It is convenient to write, but not to read.

    Adding extra methods means we have to add and maintain test coverage for them.

    In simple cases with content entities, you can get the same effect by adding overwrite_properties to the configuration of the destination plugin. (See Drupal\migrate\Plugin\migrate\destination\EntityContentBase.) I think this covers the most common use cases. I do not want to add convenience methods to the API if they are only going to be used by occasional projects that have especially complex business logic. Those projects can add convenience functions (not methods on the Row class) themselves.

  • πŸ‡ΊπŸ‡ΈUnited States benjifisher Boston area

    On second thought, I do not think we need to add hasEmptyDestinationProperty().

    First reason: it is safe to call removeEmptyDestinationProperty() without first testing hasEmptyDestinationProperty().

    Second reason: if code does need to test, it can just use in_array($field_name, $row->getEmptyDestinationProperties()).

    Source and destination properties need the has... methods because they use the NestedArray methods, but $emptyDestinationProperties is just a flat list of strings.

Production build 0.71.5 2024