Field storage migration fails on re-import

Created on 3 November 2023, 11 months ago
Updated 14 March 2024, 7 months ago

Problem/Motivation

When running a migration repeatedly, with the "update" option, it should update existing entities, instead of creating new entities.

Unfortunately, when migrating field storage configs from e.g. Drupal 7, the migration fails to find the existing field storage.

This leads to errors like "'field_storage_config' entity with ID 'comment.comment_body' already exists."

The reason is that the ->getEntity method only uses the first part of the config entity id array, when trying to load an existing field storage.

Other destination plugins don't have this problem, because they have their own logic when dealing with the id.

-----

The problem is the reset() in Drupal\migrate\Plugin\migrate\destination\Entity::getEntity():

  protected function getEntity(Row $row, array $old_destination_id_values) {
    $entity_id = reset($old_destination_id_values) ?: $this->getEntityId($row);
    if (!empty($entity_id) && ($entity = $this->storage->load($entity_id))) {

For field storage this will fail, if e.g. $old_destination_id_values === ['comment', 'comment_body'].

Steps to reproduce

Generate migrations to upgrade from a from Drupal 7 site.
(I use migrate_tools and migrate_plus to create them as config entities. But I suppose the same happens if you just use core)

Run the "upgrade_d7_field" (or d7_field) migration repeatedly with --update.

Expected: Field storages are updated on repeated import.
Actual: Errors like "'field_storage_config' entity with ID 'comment.comment_body' already exists."

Proposed resolution

Field storage needs its own logic to deal with the ids or load the old entity.

Remaining tasks

Also review other migrate destinations.
Those that I checked seemed ok, but maybe there are others that are broken like field storage.

User interface changes

API changes

Data model changes

Release notes snippet

πŸ› Bug report
Status

Needs work

Version

11.0 πŸ”₯

Component
MigrationΒ  β†’

Last updated about 3 hours ago

Created by

πŸ‡©πŸ‡ͺGermany donquixote

Live updates comments and jobs are added and updated live.
  • Needs tests

    The change is currently missing an automated test that fails when run with the original code, and succeeds when the bug has been fixed.

  • Needs subsystem maintainer review

    It is used to alert the maintainer(s) of a particular core subsystem that an issue significantly impacts their subsystem, and their signoff is needed (see the governance policy draft for more information). Also, if you use this tag, make sure the issue component is set to the correct subsystem. If an issue significantly impacts more than one subsystem, use needs framework manager review instead.

Sign in to follow issues

Merge Requests

Comments & Activities

  • Issue created by @donquixote
  • Status changed to Needs review 11 months ago
  • πŸ‡©πŸ‡ͺGermany donquixote

    Needs tests, but first I just want to get some feedback.

  • Status changed to Needs work 11 months ago
  • πŸ‡ΊπŸ‡ΈUnited States smustgrave

    should a new approach be taken? Seemed to cause some kernel and functional test failures.

  • πŸ‡©πŸ‡ͺGermany donquixote

    should a new approach be taken? Seemed to cause some kernel and functional test failures.

    I see failures where it tries to serialize the container.
    This is probably obfuscating something else.

  • πŸ‡©πŸ‡ͺGermany donquixote

    This is probably obfuscating something else.

    Actually.. my code does not check whether $old_destination_id_values is empty!
    So I should be able to fix it.

  • πŸ‡©πŸ‡ͺGermany donquixote

    I just found that these arrays are sometimes like [NULL, NULL].

  • Status changed to Needs review 9 months ago
  • πŸ‡©πŸ‡ͺGermany donquixote

    This needs tests. But a review is not wrong.

  • πŸ‡ΊπŸ‡ΈUnited States smustgrave

    Will post in #migration channel

  • πŸ‡ΊπŸ‡ΈUnited States alison

    I stopped by to say I have this issue with other types of migrations (view modes, field instances) -- looks like y'all already accounted for that, though! -- but, issue could use a title and summary update, I think? (but I'm too anxious to add that tag myself :) )

    ANYWAY. Thank you for the fix!! -- it applies cleanly and does the trick on Drupal core 10.2.0, btw (I haven't tried on 11.x).

  • Status changed to Needs work 8 months ago
  • πŸ‡ΊπŸ‡ΈUnited States mikelutz Michigan, USA

    Dang. I had a whole thing typed up here, but I must not have hit submit. Here goes trying to remember what I was going to post...

    This is a bug, NW for tests, but I'm not certain if this is the right approach here.

    So the core EntityConfigBase expects the entity id key (generally id for configs) to contain the whole id. for a field storage entity, that would look like "node.field_text' we have 12 plugins that extend EntityConfigBase for various reasons. The three addressed in the MR are the only ones affected by this bug, as they override the getIds() method to change the destination plugin's ids and map table storage such that the individual parts of the config id are separate properties in the migration (i.e entity_type: node, field_name: field_text)

    And as mentioned above, and fixed in this patch is the issue that the base getEntity method (from \Drupal\migrate\Plugin\migrate\destination\Entity) can't handle that, it tries to load an entity using only the first element of the ids array.

    So there are two interesting things to note here. The first is EntityBaseFieldOverride which needs to do a similar thing but does it in a different way. Instead of overriding getIds, it overrides getEntityId instead. I got a bit into the weeds with trying to understand it, but it seems like this works for base_field_override, and might have worked for field instances because of the FieldConfigBase id() method, but it won't work in general.

    The main thing that give me pause is this bit of EntityConfigBase's import method:

        if (count($ids) > 1) {
          // Ids is keyed by the key name so grab the keys.
          $id_keys = array_keys($ids);
          if (!$row->getDestinationProperty($id_key)) {
            // Set the ID into the destination in for form "val1.val2.val3".
            $row->setDestinationProperty($id_key, $this->generateId($row, $id_keys));
          }
        }
    

    So, EntityConfigBase is smart enough to know that if a subclass has overridden the getIds() method, then it needs to implode the non langcode ids with a '.' separator and load it into the 'id' property so the config can be created with the right id, but it doesn't do the same with the old_destination_id_values before calling getEntity(). So my gut says we should fix this in EntityConfigBase, either by manuipulating old_destination_id_values before calling getEntity(), or more likely by overriding getEntity and putting the logic there (That way subclasses could still override getEntity with the original old_destination_id_values if they wanted to.

    NW to refactor this into EntityConfigBase for any subclass that overrides getIds() and for tests.

  • πŸ‡ΊπŸ‡ΈUnited States rclemings

    I know this is awaiting a refactoring but meanwhile I had to move ahead with a migration and made a patch from the current merge request. To say others step I have attached it here.

Production build 0.71.5 2024