How to get the current source within field process

Created on 3 May 2023, over 1 year ago

Hi,
I use migrate_plus to migrate multiple JSON files into content. Each target node has its own JSON source.
My source section looks like this:

source:
  plugin: url
  data_fetcher_plugin: http
  data_parser_plugin: json
  track_changes: true
  urls:
    - https://example.com/slug-a/api/node
    - https://example.com/slug-a/api/node
    - https://example.com/slug-a/api/node

Since I need the URL to build some remote file URLs, is there any way to get the current source URL within the process section of my migration?

E.g.

  source_full_path:
    -
      plugin: concat
      delimiter: /
      source:
        - '@url'
        - 'images'
        - filename
    -
      plugin: urlencode
  uri:
    plugin: file_copy
    source:
      - '@source_full_path'
      - uri

where '@url' would resolve to the entry of the source urls list that the current item comes from?

I am not sure if
https://www.drupal.org/project/migrate_plus/issues/3050274
is related?!

💬 Support request
Status: Active
Version: 6.0
Component: Documentation
Created by: 🇩🇪Germany vistree

Comments & Activities

  • Issue created by @vistree
  • 🇩🇪Germany vistree

    I was able to solve this issue with a custom source plugin:
    mymodule/src/Plugin/migrate/source/UrlWithActiveSourceUrl.php

    <?php
    namespace Drupal\mymodule\Plugin\migrate\source;
    
    use Drupal\migrate_plus\Plugin\migrate\source\Url;
    use Drupal\migrate\Row;
    
    /**
     * Source plugin that retrieves data via URLs and exposes the active URL.
     *
     * @MigrateSource(
     *   id = "url_with_active_source_url"
     * )
     */
    class UrlWithActiveSourceUrl extends Url {

      /**
       * {@inheritdoc}
       */
      public function prepareRow(Row $row) {
        // Ask the data parser which of the configured URLs it is reading.
        $activeUrl = $this->getDataParserPlugin()->currentUrl();

        // Expose the URL to the process section as "active_url".
        $row->setSourceProperty('active_url', $activeUrl);
        return parent::prepareRow($row);
      }
    }
    

    Now I can use in my migration config:

    source:
      plugin: url_with_active_source_url
      data_fetcher_plugin: http
      data_parser_plugin: json
      track_changes: true
      urls:
        - https://www.url1.de/data.json
        - https://www.url2.de/data.json
        - https://www.url3.de/data.json
    

    Within process, I can now use "active_url" wherever needed, e.g.:

    process:
      _active_source_url: active_url
  • 🇩🇪Germany vistree

    I found one problem with my implementation: when the source changes, the last item before the change already gets the new source URL. It seems that $dataParser->currentUrl() does not return the URL of the item being processed ;-(
    Any idea? I can't access the properties $dataParser->activeUrl or $dataParser->urls, as they are protected ;-(
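
One plausible explanation for this off-by-one, offered as an assumption rather than a confirmed diagnosis: if the source plugin reads the current item and advances its iterator before prepareRow() is called, the parser already points at the next item by the time currentUrl() is queried. The standalone sketch below simulates that call order with a hypothetical FakeParser class and made-up URLs; it does not use any Drupal or migrate_plus code.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for a multi-URL data parser (not migrate_plus code).
 */
final class FakeParser {
  private int $pos = 0;

  /** @param list<array{url: string, item: string}> $rows */
  public function __construct(private array $rows) {}

  public function valid(): bool {
    return $this->pos < count($this->rows);
  }

  public function current(): string {
    return $this->rows[$this->pos]['item'];
  }

  /** URL of the item the parser points at, or NULL past the end. */
  public function currentUrl(): ?string {
    return $this->rows[$this->pos]['url'] ?? null;
  }

  public function next(): void {
    $this->pos++;
  }
}

/**
 * Mimics the assumed call order: read the item, advance, then "prepareRow".
 */
function urlsSeenInPrepareRow(FakeParser $parser): array {
  $seen = [];
  while ($parser->valid()) {
    $item = $parser->current();
    // The iterator moves on before the row is prepared (assumption).
    $parser->next();
    // This is the value a prepareRow() implementation would read.
    $seen[$item] = $parser->currentUrl();
  }
  return $seen;
}

$seen = urlsSeenInPrepareRow(new FakeParser([
  ['url' => 'https://example.com/a.json', 'item' => 'a1'],
  ['url' => 'https://example.com/a.json', 'item' => 'a2'],
  ['url' => 'https://example.com/b.json', 'item' => 'b1'],
]));
print_r($seen);
```

In this simulation a2, the last item of a.json, is paired with b.json, which matches the symptom described above. Recording the URL inside the data parser at the moment an item is fetched would sidestep the look-ahead.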

  • 🇩🇪Germany vistree

    It looks like there definitely is a problem with using $dataParser->currentUrl() inside a source plugin. As a solution for myself, I now use a data_parser plugin, following @hctom's comment ( https://www.drupal.org/project/migrate_plus/issues/3050274#comment-13935165 ):
    1. Create a file in your custom module (mymodule/src/Plugin/migrate_plus/data_parser/CustomJson.php).
    2. Add the following to CustomJson.php:

    <?php
    
    namespace Drupal\mymodule\Plugin\migrate_plus\data_parser;
    
    use Drupal\migrate_plus\Plugin\migrate_plus\data_parser\Json;
    
    /**
     * Obtain JSON data for migration.
     *
     * @DataParser(
     *   id = "custom_json",
     *   title = @Translation("JSON Parser for My Module")
     * )
     */
    class CustomJson extends Json {
      
      /**
       * {@inheritdoc}
       */
      protected function fetchNextRow(): void {
        parent::fetchNextRow();
    
        // Inject file metadata.
        if ($this->valid()) {
          $activeUrl = $this->currentUrl();
          $activeUrlParts = explode('/api/', $activeUrl);
          $activeBaseUrl = str_replace('https://', 'https://username:userpassword@', $activeUrlParts[0]);
    
          $this->currentItem['remote_auth_url'] = $activeBaseUrl;
          $this->currentItem['active_source_url'] = $activeUrl;
        }
      }
    }
    
    

    Now within your migration use:

    source:
      plugin: url
      data_fetcher_plugin: http
      data_parser_plugin: custom_json

    Within the process section you can now use "active_source_url" and any other property injected by the parser (here "remote_auth_url").

    process:
      _active_source_url: active_source_url
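
To make the string handling in fetchNextRow() above easier to check, here is a standalone sketch of what ends up in "remote_auth_url" for a URL shaped like the ones in the question. The function name remoteAuthBaseUrl() and the credentials are hypothetical, and encoding the credentials with rawurlencode() is an addition that the original code does not make.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper mirroring the URL handling of the custom data parser.
 */
function remoteAuthBaseUrl(string $activeUrl, string $user, string $password): string {
  // Everything before the first "/api/" is treated as the site base URL.
  $base = explode('/api/', $activeUrl, 2)[0];

  // Encode the credentials so special characters cannot break the URL.
  $credentials = rawurlencode($user) . ':' . rawurlencode($password) . '@';

  // Only an "https://" scheme is handled, as in the original code.
  return str_replace('https://', 'https://' . $credentials, $base);
}

$authUrl = remoteAuthBaseUrl('https://example.com/slug-a/api/node', 'username', 'userpassword');
echo $authUrl, PHP_EOL;
```

Keeping this logic in a small pure function makes it testable without running a migration; whether credentials belong in the URL at all is a separate decision.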
  • Hi,
    I want to get the filename of my XML source so that I can name the block content after the file.
    I tried the solutions above. However, the first URL is skipped and the last one is repeated. Any suggestions on how you fixed this issue?

    <?php

    namespace Drupal\teamsite_migration\Plugin\migrate_plus\data_parser;
    
    use Drupal\migrate_plus\Plugin\migrate_plus\data_parser\Xml;
    
    /**
     * Obtain XML data for migration.
     *
     * @DataParser(
     *   id = "custom_xml",
     * )
     */
    class CustomXML extends Xml {
      
      /**
       * {@inheritdoc}
       */
      protected function fetchNextRow(): void {
        parent::fetchNextRow();
    
        // Inject file metadata.
        if ($this->valid()) {
          $current_file_url = $this->currentUrl();
          $this->currentItem['active_source_url'] = $current_file_url;
          // Filename without directory or extension, e.g. "default".
          $this->currentItem['current_filename'] = pathinfo($current_file_url, PATHINFO_FILENAME);
        }
      }
    }
    

    and I am using it like this:

    source:
      plugin: url_with_active_source_url
      data_fetcher_plugin: file
      data_parser_plugin: custom_xml
      urls:
        - modules/custom/teamsite_migration/data/Accordion/default.xml
        - modules/custom/teamsite_migration/data/Accordion/default-skin-thumbnail.xml
        - modules/custom/teamsite_migration/data/Accordion/default-skin-enhanced.xml
        - modules/custom/teamsite_migration/data/Accordion/default-skin-enhanced-large.xml
      item_selector: /accordion
      
      # defined source fields from XML
      fields:
        - name: title
          label: 'Title'
          selector: 'title'
        - name: description
          label: 'Description'
          selector: 'description'
        - name: disclaimer
          label: 'Disclaimer'
          selector: 'disclaimer'
        - name: accItems
          label: 'Accordion Items'
          selector: 'items'
          
      
      ids:
        title:
          type: string
    
    process:
      type:
        plugin: default_value
        default_value: accordion_block_type
    
      # Modify the info field to use the filename without extension
      info: current_filename
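
As a quick sanity check of the filename logic alone, the sketch below runs pathinfo() on two of the paths from the urls list above, outside of Drupal. It only shows that the string handling yields the expected names; it says nothing about which URL the parser reports for a given row.

```php
<?php
declare(strict_types=1);

// Paths copied from the urls list in the migration config above.
$urls = [
  'modules/custom/teamsite_migration/data/Accordion/default.xml',
  'modules/custom/teamsite_migration/data/Accordion/default-skin-thumbnail.xml',
];

$names = [];
foreach ($urls as $url) {
  // PATHINFO_FILENAME drops the directory part and the last extension,
  // so a separate basename() call is not required.
  $names[] = pathinfo($url, PATHINFO_FILENAME);
}
print_r($names);
```

If these values look right, the skipped and repeated URLs are more likely to come from the order in which the parser advances than from the filename extraction.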
  • 🇩🇪Germany vistree

    Sorry, I don't see the error. Maybe it is related to parent::fetchNextRow()?

  • 🇬🇧United Kingdom joachim

    I'd say this is a feature request rather than a support request.

    There are cases where some data you need is only in the filename, not in the XML or JSON data.

  • 🇬🇧United Kingdom joachim

    I've made a start on an MR, but it needs review from a maintainer for the approach before more is done:

    - I'm adding a field and checking that we're not clobbering an existing field, but that check happens when a migration runs rather than during plugin discovery. It would be better done at discovery, so that a `drush ms` command fails before a `drush mim` does.
    - I'm not sure how to declare this field, as the data parser plugin doesn't have a way to influence the source plugin's declaration of fields().

    This also needs to be done to the other data parser plugins.

    As a follow-on, it would be useful to have another metadata field which is the delta of the current item in the current URL.
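
The two ideas in this comment, refusing to clobber an existing field and tracking the delta of an item within its URL, can be sketched in plain PHP as follows. This is not the code from the MR: the function addSourceMetadata() and the field names "_source_url" and "_source_delta" are hypothetical, and the sample data is made up.

```php
<?php
declare(strict_types=1);

/**
 * Adds hypothetical metadata fields to a parsed item without overwriting data.
 */
function addSourceMetadata(array $item, string $url, int $delta): array {
  $metadata = ['_source_url' => $url, '_source_delta' => $delta];
  foreach ($metadata as $name => $value) {
    // Refuse to clobber a field that the source data already provides.
    if (array_key_exists($name, $item)) {
      throw new InvalidArgumentException("Source field '$name' already exists.");
    }
    $item[$name] = $value;
  }
  return $item;
}

// Items grouped by the URL they came from; the delta restarts for each URL.
$sources = [
  'a.xml' => [['title' => 'One'], ['title' => 'Two']],
  'b.xml' => [['title' => 'Three']],
];

$rows = [];
foreach ($sources as $url => $items) {
  foreach ($items as $delta => $item) {
    $rows[] = addSourceMetadata($item, $url, $delta);
  }
}
print_r($rows);
```

In this sketch the check runs per item at import time, which is the behavior the comment would like to move to plugin discovery.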
