Include uuid field in node migrations, if present

Created on 25 July 2022, over 2 years ago
Updated 31 August 2024, 4 months ago

Problem/Motivation

When migrating Drupal 7 data to Drupal 8+, it is desirable to retain universally unique identifiers (UUIDs) of each entity. Although UUIDs are provided by a contributed UUID module in Drupal 7, Drupal 8+ supports them natively, which can lead people to expect that they can import the uuid field by simply mapping it in the process portion of a migration YML file. This works as expected for files and taxonomy terms because the source plugins for those entities include all fields from the source table.

However, the Node.php source plugin explicitly names the fields to import from the node table, and the uuid field is not among them. As a result, new UUIDs are generated during import instead of the original ones being retained, which can prevent the migrations from being updated.

Steps to reproduce

Create a custom migration (see migrate_plus module) of nodes from a Drupal 7 site that uses the UUID module. In the process part of the YML file, map the uuid field. Run the migration. Compare the uuid values of the imported nodes to their original UUIDs. Note that the values do not match.

Now attempt to run the migration again with the --update flag (see migrate_tools module). Note that the migration fails because the UUIDs do not match.

Proposed resolution

The simplest solution would be to specify $query->fields('n'); in the Node.php source plugin rather than explicitly naming all the fields to use. This approach works when I test it locally; however, it is discouraged in the documentation because collisions with fields from other tables such as node_revision could produce unexpected results.

Instead, my solution is to test for the presence of the uuid field in the node table and add it if it is present.
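
The idea of "add the field only if the column exists" can be illustrated outside Drupal. The following is a minimal, language-neutral sketch in Python against SQLite, not the actual Node.php change: the helper names (`table_has_column`, `node_fields`, `select_nodes`) and the short `BASE_FIELDS` list are hypothetical stand-ins for the explicit field list in the source plugin, and the real patch would use Drupal's own database API for the existence check.

```python
import sqlite3

# Illustrative subset of explicitly named node columns (assumption, not the
# real list used by the Node.php source plugin).
BASE_FIELDS = ["nid", "type", "status", "created", "changed"]


def table_has_column(conn, table, column):
    """Return True if `table` has a column called `column` (SQLite only)."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)


def node_fields(conn):
    """Explicit field list, extended with uuid only when the column exists."""
    fields = list(BASE_FIELDS)
    if table_has_column(conn, "node", "uuid"):
        fields.append("uuid")
    return fields


def select_nodes(conn):
    """Select only the named columns, so other tables cannot collide."""
    cols = ", ".join(f"n.{name}" for name in node_fields(conn))
    return conn.execute(f"SELECT {cols} FROM node n").fetchall()
```

Keeping the explicit list and appending one optional column preserves the protection against column collisions that a blanket "select every column of n" would give up, while sites without the UUID module see no change in the query.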

Remaining tasks

Make a fork and/or patch.

✨ Feature request
Status

Needs work

Version

11.0 🔥

Component
Migration

Last updated 3 days ago

Created by

🇺🇸 United States BenStallings

Live updates comments and jobs are added and updated live.
  • Needs tests

    The change is currently missing an automated test that fails when run with the original code, and succeeds when the bug has been fixed.

  • Needs issue summary update

    Issue summaries save everyone time if they are kept up-to-date. See Update issue summary task instructions.

  • Needs change record

    A change record needs to be drafted before an issue is committed. Note: Change records used to be called change notifications.

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • 🇩🇪 Germany donquixote

    I like patch #7 as an incremental improvement.
    For it to work correctly, we also need to add 'uuid' in the ->fields() method, with the same condition.

    As for adding the "uuid: uuid" mapping to migrations:
    One option is to conditionally add this mapping. So it would not be present in the default yml files.

    If we always want to add the "uuid: uuid" mapping, then we need to auto-fill this if the uuid module was not installed in D7.

    The usual random uuid is not great: it will produce a different value each time the migration is run.
    This causes issues e.g. with views that reference taxonomy terms by uuid.

    I can see two options to generate "stable" uuids, if uuid module was not enabled in D7:
    - Store the generated uuids in a yml file somewhere, which would be committed to code. I don't like this, because this file would have to scale with the amount of content.
    - Generate fake uuids that are actually hashes based on entity type, entity id, and a fixed salt defined somewhere. E.g. create a sha256, get a substring, insert the dashes for uuid format, done.
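
The hash-based option above can be sketched roughly as follows. This is only an illustration of the idea in Python, not part of any patch: the function name `stable_uuid`, the `SALT` value, and the way the inputs are joined are all assumptions.

```python
import hashlib

# Hypothetical fixed salt; it would be defined once and never changed,
# otherwise every derived uuid changes with it.
SALT = "example-migration-salt"


def stable_uuid(entity_type, entity_id, salt=SALT):
    """Derive a repeatable, uuid-shaped string from entity type, id and salt."""
    source = f"{salt}:{entity_type}:{entity_id}".encode("utf-8")
    # sha256 gives 64 hex characters; a uuid needs 32 of them.
    digest = hashlib.sha256(source).hexdigest()[:32]
    # Insert the dashes in the usual 8-4-4-4-12 layout.
    return "-".join(
        [digest[:8], digest[8:12], digest[12:16], digest[16:20], digest[20:]]
    )


# The same inputs always give the same value, so re-running a migration
# would not change the uuid.
print(stable_uuid("taxonomy_term", 12))
```

Note that a plain hash substring does not set the version and variant bits of a standard UUID, so a strict validator might reject it. Name-based UUIDs (version 5, available in Python as `uuid.uuid5`) are a standard way to get the same repeatability with a well-formed result.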

    Maybe all of this would have to be opt-in, to not mess with existing migrations.

    And yes, test coverage.

  • 🇩🇪 Germany donquixote

    Here is a patch that adds to the #7 patch.
    This only covers the first part of the problem, exposing uuid for nodes.

    Actually it might be a good idea to treat this as two separate issues:
    - Add uuid for node source plugin, if uuid module was installed in D7.
    - Follow-up to add uuid mapping to the default migrations.

  • 🇩🇪 Germany donquixote

    And btw, if I run "migrate:import --update" with a node or term migration with this patch, it does work, and it does update the entity uuid to the correct value.

    We probably do need something else for revision uuids.
    But in a first version we can live without them.
    Term uuids and some content uuids are important if they are referenced from elsewhere, esp in config.
    Revision uuids are quite unlikely to be referenced in config. The main reason to support them would be a sense of technical completeness.
