Process plugin to build an array

Created on 14 April 2024, 8 months ago

Problem/Motivation

The process plugins callback (core) and service (this module) now accept an array of arguments if the option unpack_source is set. What is still missing is an easy way to construct the array to pass to these plugins.

Proposed resolution

Add a process plugin that generates an array based on a template (part of the plugin configuration), making the following substitutions:

  1. 'source:foo' is replaced by the source property foo.
  2. 'dest:bar' is replaced by the destination property bar.
  3. 'pipeline:' is replaced by the pipeline value.

All three support sub-properties, like source:body/0/value. Any other value is copied directly from the template.
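
The substitution rules can be sketched outside Drupal. The following Python is only an illustration of the rules listed above, not the plugin's PHP implementation; the function names `lookup` and `apply_template`, the sample row data, and the choice to return None for a missing property are all assumptions.

```python
def lookup(data, path):
    """Follow a slash-separated path such as 'body/0/value'.

    An empty path returns the data unchanged.
    """
    current = data
    for part in filter(None, path.split("/")):
        if isinstance(current, dict) and part in current:
            current = current[part]
        elif (isinstance(current, (list, tuple)) and part.isdigit()
              and int(part) < len(current)):
            current = current[int(part)]
        else:
            return None  # Assumption: a missing property yields None.
    return current


def apply_template(template, source, dest, pipeline):
    """Recursively copy the template, replacing prefixed strings."""
    if isinstance(template, dict):
        return {key: apply_template(value, source, dest, pipeline)
                for key, value in template.items()}
    if isinstance(template, list):
        return [apply_template(value, source, dest, pipeline)
                for value in template]
    if isinstance(template, str):
        prefixes = (("source:", source), ("dest:", dest), ("pipeline:", pipeline))
        for prefix, data in prefixes:
            if template.startswith(prefix):
                return lookup(data, template[len(prefix):])
    return template  # Any other value is copied directly.


# Hypothetical row data, for illustration only.
row_source = {"foo": "from source", "body": [{"value": "Body text"}]}
row_dest = {"bar": "from destination"}
pipeline_value = ["a", "b"]

template = {
    "key": "literal string",
    "properties": ["source:body/0/value", "dest:bar", "pipeline:1"],
    "whole_pipeline": "pipeline:",
}
result = apply_template(template, row_source, row_dest, pipeline_value)
print(result)
```

In this sketch, template keys are never substituted; only string values that start with one of the three prefixes are replaced.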

This plugin will provide an alternative, often a simpler one, to adding constants to the source plugin configuration and to using "pseudo fields" as temporary variables.

Remaining tasks

User interface changes

API changes

Data model changes

✨ Feature request
Status

Active

Version

6.0

Component

Plugins

Created by

🇺🇸 United States benjifisher Boston area


Merge Requests

Comments & Activities

  • Issue created by @benjifisher
  • 🇺🇸 United States benjifisher Boston area

    In #3236774-9: Provide ability to reference current value of process pipeline as a source property, @danflanagan8 pointed out that passing null as part of the source array of a process plugin has the effect of inserting the pipeline value.

    At [meeting] Migrate Meeting 2024-03-29 (Active), @mikelutz said that this behavior is a bug, and we should avoid using it.

    The process plugin proposed here is more flexible, and it makes the intention clearer.

    I think this process plugin is also more flexible than the one proposed in ✨ Add wrapper process plugin to wrap/unwrap values in arrays (Needs review).

  • Merge request !91 Add the build_array process plugin (Merged) created by benjifisher
  • Status changed to Needs work 8 months ago
  • Pipeline finished with Success
    8 months ago
    Total: 231s
    #146438
  • 🇺🇸 United States benjifisher Boston area

    Some more examples, from the doc block:

    Generic example:

    process:
      bar:
        plugin: build_array
        source: foo
        template:
          key: literal string
          properties:
            - source:field_body/0/value
            - dest:field_body/0/value
            - pipeline:some/nested/key
    

    Prepare an entity reference revision (ERR) field

    process:
      field_paragraph:
        - plugin: migration_lookup
          # ...
        - plugin: build_array
          template:
            target_id: pipeline:0
            target_revision_id: pipeline:1
    

    Here is an example from my current project.

    Create a serialized array for the layout_paragraphs module

    process:
      behavior_settings:
        # ...
        - plugin: build_array
          template:
            layout_paragraphs:
              parent_uuid: pipeline:0/value
              region: first
        - plugin: callback
          callable: serialize
    
  • 🇺🇸 United States danflanagan8 St. Louis, US

    Very excited to see this, @benjifisher! I haven't reviewed the code yet, but my first impression on the name is that we're dangerously close to the core array_build plugin! The first best replacement name that jumps out at me would be array_template.

  • Pipeline finished with Success
    8 months ago
    Total: 242s
    #152144
  • Status changed to Needs review 8 months ago
  • 🇺🇸 United States benjifisher Boston area

    @danflanagan8:

    Thanks for taking a look!

    I added some tests, starting with yours from ✨ Add wrapper process plugin to wrap/unwrap values in arrays (Needs review). That way,

    1. Your excellent test coverage does not go to waste if we decide to close that issue in favor of this one.
    2. I confirm that this plugin can do anything the wrap plugin can do (with method: wrap).

    When I first wrote this plugin, I called it array_build, but then I realized the problem. I confess I was not very creative when I turned it into build_array, and you are right about the potential for confusion.

    I can go with array_template, but other variations are worth considering:

    • array_template
    • template_array
    • template

    Or maybe the name should indicate that it can work with source, destination, and pipeline values.

    How do you feel about using the short form "dest" instead of spelling out "destination"?

  • Issue was unassigned.
  • 🇺🇸 United States benjifisher Boston area

    I changed the plugin ID to array_template, and I changed the class names (plugin and test classes) to match.

  • Pipeline finished with Success
    6 months ago
    Total: 206s
    #203298
  • 🇺🇸 United States benjifisher Boston area

    Here is a more complicated example from my current project. A custom source plugin provides the source field filters, which looks something like this:

    [
      [
        'vid' => 'some_vocab',
        'tids' => [1, 2, 3, 5],
      ],
      [
        'vid' => 'another_vocab',
        'tids' => [8, 13, 21],
      ],
    ]
    

    That represents two vocabularies and a few terms from each vocabulary.

    Here is the pipeline:

      field_hwp_default_filter_values:
        - plugin: sub_process
          source: filters
          process:
            data:
              - plugin: array_template
                template:
                  target_id: 'pipeline:'
                source: tids
            reference_field:
              - plugin: migration_lookup
                migration: hwp_vocabularies
                source: vid
              - plugin: array_template
                template:
                  - field
                  - 'pipeline:'
              - plugin: concat
                delimiter: _
        - plugin: single_value
        - plugin: array_template
          template:
            - 'pipeline:'
            - data
            - reference_field
        - plugin: callback
          callable: array_column
          unpack_source: true
        - plugin: callback
          callable: serialize
    

    After the first step in the pipeline (sub_process), the example at the top is converted to this:

    [
      [
        'data' => [['target_id' => 1], ['target_id' => 2], ['target_id' => 3], ['target_id' => 5]],
        'reference_field' => 'field_some_vocab',
      ],
      [
        'data' => [['target_id' => 8], ['target_id' => 13], ['target_id' => 21]],
        'reference_field' => 'field_another_vocab',
      ],
    ]
    

    By default, a process plugin (like array_template) is applied to each element of a source array. In this example, migration_lookup is a no-op.
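
    As a rough illustration of that default (plain Python, not the Migrate API; the helper name `target_id_template` is hypothetical), compare applying the `target_id: 'pipeline:'` template to each element of tids with applying it to the whole array:

```python
def target_id_template(pipeline_value):
    """Stand-in for array_template with `template: {target_id: 'pipeline:'}`."""
    return {"target_id": pipeline_value}


tids = [1, 2, 3, 5]

# Default handling: the plugin runs once per element of the array.
per_element = [target_id_template(tid) for tid in tids]

# What a plugin would receive if the array were treated as a single value,
# which is the situation single_value sets up for the following step.
whole_value = target_id_template(tids)

print(per_element)
print(whole_value)
```

    The per-element result matches the 'data' values shown above; the whole-value result wraps the entire list under a single target_id key.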

    The single_value process plugin overrides that default behavior, so the next array_template prepares the input for the callback plugin:

    [
      [['data' => ..., 'reference_field' => ...], ['data' => ..., 'reference_field' => ...]],
      'data',
      'reference_field',
    ]
    

    and so the callback plugin returns

    array_column(..., 'data', 'reference_field')
    

    or

    [
      'field_some_vocab' => [['target_id' => 1], ['target_id' => 2], ['target_id' => 3], ['target_id' => 5]],
      'field_another_vocab' => [['target_id' => 8], ['target_id' => 13], ['target_id' => 21]],
    ]
    

    The last step in the pipeline uses callback with callable: serialize to serialize that array.
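
    For readers less familiar with array_column(), here is a rough Python stand-in for the call made with unpack_source: true. It is a sketch, not PHP's implementation: the Python function `array_column` is a hypothetical helper that assumes every row has both keys, and the serialize step is left out.

```python
def array_column(rows, column_key, index_key):
    """Approximate PHP's array_column($rows, $column_key, $index_key).

    Assumption: every row contains both keys.
    """
    return {row[index_key]: row[column_key] for row in rows}


rows = [
    {
        "data": [{"target_id": 1}, {"target_id": 2},
                 {"target_id": 3}, {"target_id": 5}],
        "reference_field": "field_some_vocab",
    },
    {
        "data": [{"target_id": 8}, {"target_id": 13}, {"target_id": 21}],
        "reference_field": "field_another_vocab",
    },
]

# The second array_template step builds this three-element argument list.
args = [rows, "data", "reference_field"]

# unpack_source: true means the list is spread into separate arguments.
result = array_column(*args)
print(result)
```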

  • Pipeline finished with Failed
    4 months ago
    Total: 257s
    #245151
  • Status changed to Needs work 3 months ago
  • 🇺🇸 United States danflanagan8 St. Louis, US

    I played around with this in Migrate Sandbox with great success. I also took my findings over to one of the related issues (✨ Add wrapper process plugin to wrap/unwrap values in arrays, Needs review).

    I love being able to easily mix string literals with source properties and destination properties and (perhaps best of all) the pipeline value.

    The test coverage is expansive, too. And the documentation isn't bad at all. Really nice stuff, @benjifisher.

    My only complaint (well, I complained about something else way back in #7 that Benji humored me on) is that I feel weird about the trailing colon in pipeline:. It's a strange thing to type.

    At the same time, it's consistent with source: and dest:, though with those you have to put something after the colon. It's just that with pipeline: you don't have to put anything after the colon, and I would naively think that I would rarely put anything after it.

    And what would I suggest in place of that syntax that wouldn't simply be my personal preference?

    At the end of the day, I think I've convinced myself that the syntax on the MR is fine. Phew!

  • Pipeline finished with Failed
    3 months ago
    Total: 260s
    #274246
  • Status changed to Needs review 3 months ago
  • 🇺🇸 United States benjifisher Boston area

    @danflanagan8:

    I think I fixed all the things you pointed out on the MR.

    I agree that 'pipeline:' is awkward. I even considered allowing pipeline as a synonym, but I think the Migrate API already goes too far in making things "convenient". If I did that, then I would have to add test coverage for it, too. In the end, I decided to keep the PHP simple and accept a little ugliness in the YAML for the sake of consistency.

  • Pipeline finished with Failed
    3 months ago
    Total: 274s
    #277510
  • Status changed to RTBC 3 months ago
  • 🇺🇸 United States danflanagan8 St. Louis, US

    This is great stuff. Thanks, @benjifisher!

  • Pipeline finished with Skipped
    13 days ago
    #344970
  • First commit to issue fork.
  • Status changed to Fixed 13 days ago
  • heddn Nicaragua

    Thanks for the contributions.
