Add wrapper process plugin to wrap/unwrap values in arrays

Created on 10 October 2022, over 2 years ago
Updated 14 April 2024, about 1 year ago

Problem/Motivation

Sometimes I wish a process plugin that is set to "handle_multiples" didn't handle multiples.

Proposed resolution

I ran into another issue, "Add an iterate process plugin" (Needs review), that proposes an "iterate" process plugin. While reviewing that issue, I felt it added a lot of complexity that I'd rather defer to the sub_process plugin, which already handles this kind of thing. The problem is that sub_process requires an array of arrays as input.

A simple way around this would be to add a "wrapper" process plugin that could turn an array into an array of arrays. Then the process plugin I wished didn't handle multiples could be used inside a sub_process. Boom! Oh, and then the "wrapper" process plugin could "unwrap" after the sub_process if needed.

Keeping the "hard part" in sub_process instead of introducing a new "iterate" process plugin makes this very easy to unit test. It's also possible that wrapping or unwrapping a value could be useful outside the context of iteration/sub_processing.

Example:

 * For an input array of arrays, flatten each of the child arrays.
 *
 * Assume the input array my_array is:
 * [
 *   ['a', 'b', ['c', 'd']],
 *   [1, 2, [3, 4]],
 * ]
 *
 * And we want to transform this into:
 * [
 *   ['a', 'b', 'c', 'd'],
 *   [1, 2, 3, 4],
 * ]
 *
 * Since the flatten plugin is designed to "handle_multiples",
 * doing this:
 *
 * @code
 * flatten_wont_work:
 *   plugin: flatten
 *   source: my_array
 * @endcode
 *
 * Will result in a single array:
 * ['a', 'b', 'c', 'd', 1, 2, 3, 4]
 *
 * This is where the wrapper plugin can save the day. We can
 * call wrap on my_array, then use a sub_process, and finally
 * unwrap the result of the sub_process.
 *
 * @code
 * my_desired_output:
 *   -
 *     plugin: wrapper
 *     method: wrap
 *     source: my_array
 *     key: element
 *   -
 *     plugin: sub_process
 *     process:
 *       '0':
 *         plugin: flatten
 *         source: element # This should match the 'key' used when wrapping.
 *   -
 *     plugin: wrapper
 *     method: unwrap
 * @endcode
 *
 * In general terms, the most powerful use case is when you wish
 * that an existing process plugin didn't "handle_multiples". In such
 * a case the pattern above can be used: wrap, sub_process, unwrap.
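
To make the pattern concrete, the following sketch traces the value after each step of the my_desired_output pipeline. The shapes are inferred from the description above (wrap nests each item under the configured key, and unwrap removes a single-valued wrapper) and have not been checked against the plugin's code. The labels after_wrap, after_sub_process and after_unwrap are hypothetical names used only for this illustration.

```yaml
# Assumed intermediate values for the my_desired_output pipeline.
# The top-level labels are for this sketch only; they are not configuration keys.

# 1. After wrap with key "element": each item is nested under that key,
#    which gives sub_process the array of arrays it requires.
after_wrap:
  - element: ['a', 'b', ['c', 'd']]
  - element: [1, 2, [3, 4]]

# 2. After sub_process: flatten ran once per item and wrote to destination '0'.
after_sub_process:
  - '0': ['a', 'b', 'c', 'd']
  - '0': [1, 2, 3, 4]

# 3. After unwrap: the single-valued wrapper around each item is removed.
after_unwrap:
  - ['a', 'b', 'c', 'd']
  - [1, 2, 3, 4]
```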

Remaining tasks

N/A

User interface changes

N/A

API changes

New "wrapper" process plugin

Data model changes

N/A

✨ Feature request
Status

Needs review

Version

6.0

Component

Plugins

Created by

🇺🇸 United States danflanagan8 St. Louis, US


Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • 🇺🇸 United States benjifisher Boston area

    I think that wrapper with method: wrap can be replaced by the more flexible build_array plugin from "Process plugin to build an array" (Active), and method: unwrap can be replaced by callback with callable: array_pop.

    For example, the pipeline from the issue description,

      -
        plugin: wrapper
        method: wrap
        source: my_array
        key: element
      -
        plugin: sub_process
        process:
          '0':
            plugin: flatten
            source: element # This should match the 'key' used when wrapping.
      -
        plugin: wrapper
        method: unwrap
    

    is equivalent to

      - 
        plugin: build_array
        template:
          element: 'pipeline:'
        source: my_array
      -
        plugin: sub_process
        process:
          '0':
            plugin: flatten
            source: element
      - plugin: callback
        callable: array_pop
    

    I verified that they give the same results on the example input using migrate_sandbox.

    So I suggest we close this issue in favor of "Process plugin to build an array" (Active).

  • 🇺🇸 United States danflanagan8 St. Louis, US

    I'm working my way through reviewing the related array_template process plugin.

    I can confirm the part of comment #5 about replacing the wrap method using array_template.

    I'm still thinking about array_pop as a replacement for unwrap. It's not exactly the same, because the unwrap method fails if the argument is not a single-valued array, but that's not really a big deal.

    So I'm convinced that the wrapper plugin could be completely reproduced as described by @benjifisher in #5.

    The only question is whether there's enough DX value in:
    1. the symmetry that comes with wrap/subprocess/unwrap
    2. the relative simplicity of the wrap syntax compared to the array_template syntax

  • 🇬🇧 United Kingdom joachim

    > The only question is whether there's enough DX value in:
    > 1. the symmetry that comes with wrap/subprocess/unwrap
    > 2. the relative simplicity of the wrap syntax compared to the array_template syntax

    For the syntax, the plugin from this issue is definitely nicer.

    The MR from the other issue, "Process plugin to build an array" (Active), has this example for migrating a paragraph reference:

     * @code
     * process:
     *   field_paragraph:
     *     - plugin: migration_lookup
     *       # ...
     *     - plugin: array_template
     *       template:
     *         target_id: pipeline:0
     *         target_revision_id: pipeline:1
     * @endcode
    

    But with the plugin from this issue, my process array is simpler:

      field_paragraph:
        -
          plugin: migration_lookup
          # SNIP
        -
          plugin: wrapper
          method: wrap
    

    The array_template plugin allows much more complexity and is much more powerful for other use cases, but for this particular use case, having to specify a template for the array feels like overkill when I just want to nest the values down a level.

    > 1. the symmetry that comes with wrap/subprocess/unwrap

    I'd actually prefer two plugins, called 'wrap' and 'unwrap'.

    This would then match the symmetry of the 'multiple_values' / 'single_value' plugins.

    (A thought which may muddy the waters; if so, ignore it. What about, instead of adding a wrapper plugin, we add an option to work with nesting to the 'multiple_values' / 'single_value' plugins? The functionality of the 'key' property here would be covered by the more complex 'array_template' plugin from "Process plugin to build an array" (Active).)

  • 🇺🇸 United States mikelutz Michigan, USA

    I just want to note that "wrap" is just syntactical sugar for

    plugin: get
    source:
      - ~
    

    and unwrap is just syntactical sugar for

    plugin: extract
    index:
      - 0
    

    You might add single_value or multiple_values plugins in there if you need them, though if the purpose is to pipe a value into a handle_multiples plugin that can't handle a single value, there will often not be much point, as those plugins don't care if the value is a multiple or not.

    (BTW, I'm starting to come around to just documenting the 'hack' with the get plugin above and officially supporting it. I've found enough cases where it's useful)

  • Status changed to Closed: outdated 4 months ago
  • 🇺🇸 United States benjifisher Boston area

    Now that "Process plugin to build an array" is fixed (and part of the 6.0.5 release), I think we can close this issue.

    @mikelutz:

    > (BTW, I'm starting to come around to just documenting the 'hack' with the get plugin above and officially supporting it. I've found enough cases where it's useful)

    Your timing is interesting. Now that we have the array_template plugin, we can get the same functionality (a little more verbose, but perhaps easier to read/clearer intention). So now you decide to change your mind? ;)
