Process plugin: build an array from source, destination, pipeline

Created on 14 April 2024, 2 months ago
Updated 24 June 2024, 1 day ago

Problem/Motivation

The process plugins callback (core) and service (this module) now accept an array of arguments if the option unpack_source is set. What is still missing is an easy way to construct the array to pass to these plugins.

Proposed resolution

Add a process plugin that generates an array from a template (part of the plugin configuration), making the following substitutions:

  1. 'source:foo' is replaced by the source property foo.
  2. 'dest:bar' is replaced by the destination property bar.
  3. 'pipeline:' is replaced by the pipeline value.

All three support sub-properties, like source:body/0/value. Any other value is copied directly from the template.
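
The substitution rules above can be sketched in plain PHP. This is only an illustration, not the plugin code from the merge request: the function names sketch_get_nested() and sketch_resolve_template() are hypothetical, and two behaviors are assumptions (an empty path returns the whole value, and a missing key yields NULL).

```php
<?php
// Illustrative sketch only; not the plugin implementation.

/**
 * Follows a path such as "body/0/value" into a nested array.
 * Assumption: an empty path returns the whole value, a missing key NULL.
 */
function sketch_get_nested(mixed $data, string $path): mixed {
  if ($path === '') {
    return $data;
  }
  foreach (explode('/', $path) as $key) {
    if (!is_array($data) || !array_key_exists($key, $data)) {
      return NULL;
    }
    $data = $data[$key];
  }
  return $data;
}

/**
 * Recursively copies a template, replacing 'source:', 'dest:', and
 * 'pipeline:' strings; any other value is copied unchanged.
 */
function sketch_resolve_template(mixed $template, array $source, array $dest, mixed $pipeline): mixed {
  if (is_array($template)) {
    $result = [];
    foreach ($template as $key => $item) {
      $result[$key] = sketch_resolve_template($item, $source, $dest, $pipeline);
    }
    return $result;
  }
  if (!is_string($template)) {
    return $template;
  }
  $roots = ['source:' => $source, 'dest:' => $dest, 'pipeline:' => $pipeline];
  foreach ($roots as $prefix => $root) {
    if (str_starts_with($template, $prefix)) {
      return sketch_get_nested($root, substr($template, strlen($prefix)));
    }
  }
  return $template;
}

// Hypothetical row data, for illustration only.
$source = ['body' => [['value' => 'Hello']]];
$dest = ['title' => 'Draft'];
$pipeline = [12, 34];
$template = [
  'text' => 'source:body/0/value',
  'title' => 'dest:title',
  'ids' => ['pipeline:0', 'pipeline:1'],
  'all' => 'pipeline:',
  'region' => 'first',
];
$result = sketch_resolve_template($template, $source, $dest, $pipeline);
```

With this made-up data, $result holds 'Hello' under text, 'Draft' under title, [12, 34] under both ids and all, and the literal string 'first' under region.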

This plugin offers an often simpler alternative to adding constants to the source plugin configuration and to using "pseudo fields" as temporary variables.

Remaining tasks

User interface changes

None

API changes

Add a new process plugin.

Data model changes

None

✨ Feature request
Status

Needs review

Version

6.0

Component

Plugins

Created by

🇺🇸 United States benjifisher Boston area


Comments & Activities

  • Issue created by @benjifisher
  • 🇺🇸 United States benjifisher Boston area

    In #3236774-9: Provide ability to reference current value of process pipeline as a source property →, @danflanagan8 pointed out that passing null as part of the source array of a process plugin has the effect of inserting the pipeline value.

    At 📌 [meeting] Migrate Meeting 2024-03-29 Active, @mikelutz said that this behavior is a bug, and we should avoid using it.

    The process plugin proposed here is more flexible, and it makes the intention clearer.

    I think this process plugin is also more flexible than the one proposed in ✨ Add wrapper process plugin to wrap/unwrap values in arrays Needs review .

  • Merge request !91 Add the build_array process plugin → (Open) created by benjifisher
  • Status changed to Needs work 2 months ago
  • 🇺🇸 United States benjifisher Boston area
  • Pipeline finished with Success
    2 months ago
    Total: 231s
    #146438
  • 🇺🇸 United States benjifisher Boston area

    Some more examples, from the doc block:

    Generic example:

    process:
      bar:
        plugin: build_array
        source: foo
        template:
          key: literal string
          properties:
            - source:field_body/0/value
            - dest:field_body/0/value
            - pipeline:some/nested/key
    

    Prepare an entity reference revision (ERR) field

    process:
      field_paragraph:
        - plugin: migration_lookup
          # ...
        - plugin: build_array
          template:
            target_id: pipeline:0
            target_revision_id: pipeline:1
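
    Written out by hand in PHP, that template maps the two positions of the lookup result to the two keys of the field value. The IDs 12 and 34 are made up for illustration, and the snippet assumes the lookup returns a pair of paragraph ID and revision ID.

    ```php
    <?php
    // Hypothetical lookup result: [paragraph ID, revision ID].
    $pipeline = [12, 34];

    // What the template expresses: position 0 becomes target_id,
    // position 1 becomes target_revision_id.
    $field_paragraph = [
      'target_id' => $pipeline[0],
      'target_revision_id' => $pipeline[1],
    ];
    ```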
    

    Here is an example from my current project.

    Create a serialized array for the layout_paragraphs module

    process:
      behavior_settings:
        # ...
        - plugin: build_array
          template:
            layout_paragraphs:
              parent_uuid: pipeline:0/value
              region: first
        - plugin: callback
          callable: serialize
    
  • 🇺🇸 United States danflanagan8 St. Louis, US

    Very excited to see this, @benjifisher! I haven't reviewed the code yet, but my first impression on the name is that we're dangerously close to the core array_build plugin! The first best replacement name that jumps out at me would be array_template.

  • Pipeline finished with Success
    2 months ago
    Total: 242s
    #152144
  • Status changed to Needs review 2 months ago
  • 🇺🇸 United States benjifisher Boston area

    @danflanagan8:

    Thanks for taking a look!

    I added some tests, starting with yours from ✨ Add wrapper process plugin to wrap/unwrap values in arrays Needs review . That way,

    1. Your excellent test coverage does not go to waste if we decide to close that issue in favor of this one.
    2. I confirm that this plugin can do anything the wrap plugin can do (with method: wrap).

    When I first wrote this plugin, I called it array_build, but then I realized the problem. I confess I was not very creative when I turned it into build_array, and you are right about the potential for confusion.

    I can go with array_template, but other variations are worth considering:

    • array_template
    • template_array
    • template

    Or maybe the name should indicate that it can work with source, destination, and pipeline values.

    How do you feel about using the short form "dest" instead of spelling out "destination"?

  • Issue was unassigned.
  • 🇺🇸 United States benjifisher Boston area

    I changed the plugin ID to array_template, and I changed the class names (plugin and test classes) to match.

  • Pipeline finished with Success
    6 days ago
    Total: 206s
    #203298
  • 🇺🇸 United States benjifisher Boston area

    Here is a more complicated example from my current project. A custom source plugin provides the source field filters, which looks something like this:

    [
      [
        'vid' => 'some_vocab',
        'tids' => [1, 2, 3, 5],
      ],
      [
        'vid' => 'another_vocab',
        'tids' => [8, 13, 21],
      ],
    ]
    

    That represents two vocabularies and a few terms from each vocabulary.

    Here is the pipeline:

      field_hwp_default_filter_values:
        - plugin: sub_process
          source: filters
          process:
            data:
              - plugin: array_template
                template:
                  target_id: 'pipeline:'
                source: tids
            reference_field:
              - plugin: migration_lookup
                migration: hwp_vocabularies
                source: vid
              - plugin: array_template
                template:
                  - field
                  - 'pipeline:'
              - plugin: concat
                delimiter: _
        - plugin: single_value
        - plugin: array_template
          template:
            - 'pipeline:'
            - data
            - reference_field
        - plugin: callback
          callable: array_column
          unpack_source: true
        - plugin: callback
          callable: serialize
    

    After the first step in the pipeline (sub_process), the source value shown above is converted to this:

    [
      [
        'data' => [['target_id' => 1], ['target_id' => 2], ['target_id' => 3], ['target_id' => 5]],
        'reference_field' => 'field_some_vocab',
      ],
      [
        'data' => [['target_id' => 8], ['target_id' => 13], ['target_id' => 21]],
        'reference_field' => 'field_another_vocab',
      ],
    ]
    

    By default, a process plugin (like array_template) is applied to each element of a source array. In this example, migration_lookup is a no-op.
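
    The per-element behavior can be pictured with array_map(). This is a rough analogy rather than how the migrate system is implemented, and the closure $wrap is a hypothetical stand-in for one array_template step whose template is target_id: 'pipeline:'.

    ```php
    <?php
    // Hypothetical stand-in for array_template with the template
    // target_id: 'pipeline:' applied to a single value.
    $wrap = fn($tid) => ['target_id' => $tid];

    // The tids of the first vocabulary in the example above.
    $tids = [1, 2, 3, 5];

    // Applying the step to each element, not to the array as a whole.
    $data = array_map($wrap, $tids);
    ```

    Here $data matches the data key of the first row shown above: one ['target_id' => ...] array per term ID.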

    The single_value process plugin overrides that default behavior, so the next array_template prepares the input for the callback plugin:

    [
      [['data' => ..., 'reference_field' => ...], ['data' => ..., 'reference_field' => ...]],
      'data',
      'reference_field',
    ]
    

    and so the callback plugin returns

    array_column(..., 'data', 'reference_field')
    

    or

    [
      'field_some_vocab' => [['target_id' => 1], ['target_id' => 2], ['target_id' => 3], ['target_id' => 5]],
      'field_another_vocab' => [['target_id' => 8], ['target_id' => 13], ['target_id' => 21]],
    ]
    

    The last step in the pipeline uses callback with callable: serialize to serialize that array.
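
    The last three steps can be reproduced in plain PHP to check the result. This sketch assumes that unpack_source: true amounts to spreading the three-element array into the callable's arguments; the variable names are only for illustration.

    ```php
    <?php
    // Output of sub_process, copied from the example above.
    $rows = [
      [
        'data' => [['target_id' => 1], ['target_id' => 2], ['target_id' => 3], ['target_id' => 5]],
        'reference_field' => 'field_some_vocab',
      ],
      [
        'data' => [['target_id' => 8], ['target_id' => 13], ['target_id' => 21]],
        'reference_field' => 'field_another_vocab',
      ],
    ];

    // What the second array_template builds: the pipeline value plus two literals.
    $args = [$rows, 'data', 'reference_field'];

    // callback with callable: array_column (assumed: arguments are unpacked).
    $keyed = array_column(...$args);

    // callback with callable: serialize.
    $serialized = serialize($keyed);
    ```

    $keyed is the array keyed by field_some_vocab and field_another_vocab shown above, and unserialize($serialized) gives the same array back.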
