Return data source ids in Index::getDatasourceIds() without initializing plugins

Created on 16 April 2025, 6 days ago

Problem/Motivation

An infinite loop occurs when a custom datasource deriver interacts with Search API Solr's dynamic data types.
The loop is triggered by calling $this->entityFieldManager->getFieldMapByFieldType() while
search_api_solr is enabled, which leads to recursive datasource plugin loading.

Steps to Reproduce:

  1. Enable both search_api and search_api_solr.
  2. Implement a datasource deriver that uses $this->entityFieldManager->getFieldMapByFieldType().
  3. Observe the infinite recursion caused by:
    • Custom deriver → getFieldMapByFieldType()
    • Triggers SolrDocumentDeriver::getDerivativeDefinitions()
    • Calls Utility::hasIndexSolrDatasources() → Index::getDatasourceIds()
    • Reloads datasource plugins, re-triggering the custom deriver

Fixing this could also lead to a potential performance improvement.

Proposed resolution

array_keys($this->datasource_settings) could be used to return the datasource plugin IDs without initializing the plugins. The same value is already used by \Drupal\search_api\Entity\Index::getDatasources(), which is the method that initializes the datasource plugins.
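
As a rough illustration of the idea, here is a minimal sketch using a simplified stand-in class. Only the $datasource_settings property and the getDatasourceIds() method name come from this issue; the class name IndexSketch and the sample settings are hypothetical, and this is not the code in the merge request.

```php
<?php

// Hypothetical, simplified stand-in for \Drupal\search_api\Entity\Index.
// It only models the one property this issue is about.
final class IndexSketch {

  /** @var array Datasource settings, keyed by datasource plugin ID. */
  private array $datasource_settings;

  public function __construct(array $datasource_settings) {
    $this->datasource_settings = $datasource_settings;
  }

  /**
   * Returns the datasource plugin IDs straight from the stored settings,
   * so no plugin (and no deriver) needs to be instantiated.
   */
  public function getDatasourceIds(): array {
    return array_keys($this->datasource_settings);
  }

}

// Sample settings (assumed values, for illustration only).
$index = new IndexSketch([
  'entity:node' => [],
  'solr_document' => [],
]);
$ids = $index->getDatasourceIds();
```

Because the IDs are read from configuration only, calling this method cannot re-enter plugin discovery, which is what breaks the loop described above.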

Remaining tasks

🐛 Bug report
Status: Active
Version: 1.0
Component: General code
Created by: mxr576 (🇭🇺 Hungary)

Comments & Activities

  • Issue created by @mxr576
  • 🇭🇺Hungary mxr576 Hungary
  • Pipeline finished with Failed
    6 days ago
    Total: 447s
    #475094
  • 🇭🇺Hungary mxr576 Hungary
  • Pipeline finished with Success
    6 days ago
    Total: 357s
    #475204
  • 🇦🇹Austria drunken monkey Vienna, Austria

    drunken monkey made their first commit to this issue’s fork.

  • 🇦🇹Austria drunken monkey Vienna, Austria

    Thanks for reporting this problem!

    As you already saw by the test failure, the Index class uses $this->datasourceInstances as the “source of truth” for its datasources. Keeping this consistent helps avoid a lot of thorny problems when changing index settings (see #2638116: Clean up caching of Index class method results (especially fields) for background on this decision). For instance, a change similar to the one for removeDatasource() would also have to be made to addDatasource(); otherwise getDatasourceIds() would return incorrect data when called after adding a datasource.

    However, I guess this doesn’t really apply when the datasource plugins aren’t loaded yet, so maybe using either $datasourceInstances or $datasource_settings based on whether the former is initialized would work in all cases. The only “break” in functionality would now be that Index::getDatasourceIds() will never throw an exception, and since that exception wasn’t even documented I guess this is acceptable. As you say, it might even improve performance in rare cases.

    Please give the new code in the MR a try!

  • 🇭🇺Hungary mxr576 Hungary

    Good thinking! I can confirm that the changes you made still mitigate the reported issue. Should I just RTBC this then? :thinking-face:
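
The fallback discussed in the comments above (use $datasourceInstances when it is already initialized, otherwise fall back to $datasource_settings) can be sketched roughly as follows. This is an illustrative model only, not the code from the merge request: the class LazyIndexSketch, the $pluginLoads counter, and the use of stdClass in place of real datasource plugins are all assumptions made for the example.

```php
<?php

// Hypothetical, simplified model of the Index entity's datasource handling.
final class LazyIndexSketch {

  /** @var array Datasource settings, keyed by datasource plugin ID. */
  private array $datasource_settings;

  /** @var array|null Loaded plugins, or NULL while nothing is loaded yet. */
  private ?array $datasourceInstances = NULL;

  /** @var int Counts how often plugins were loaded (illustration only). */
  public int $pluginLoads = 0;

  public function __construct(array $datasource_settings) {
    $this->datasource_settings = $datasource_settings;
  }

  /** Loads the plugins once; stdClass stands in for real plugin objects. */
  public function getDatasources(): array {
    if ($this->datasourceInstances === NULL) {
      $this->pluginLoads++;
      $this->datasourceInstances = [];
      foreach (array_keys($this->datasource_settings) as $id) {
        $this->datasourceInstances[$id] = new stdClass();
      }
    }
    return $this->datasourceInstances;
  }

  /** Adds a datasource to the loaded instances (the "source of truth"). */
  public function addDatasource(string $id): void {
    $this->getDatasources();
    $this->datasourceInstances[$id] = new stdClass();
  }

  /** Uses loaded instances if present, otherwise the stored settings. */
  public function getDatasourceIds(): array {
    if ($this->datasourceInstances !== NULL) {
      return array_keys($this->datasourceInstances);
    }
    return array_keys($this->datasource_settings);
  }

}

$index = new LazyIndexSketch(['entity:node' => []]);
$idsBefore = $index->getDatasourceIds();
$loadsBefore = $index->pluginLoads;
$index->addDatasource('solr_document');
$idsAfter = $index->getDatasourceIds();
```

In this model, the first getDatasourceIds() call does not load any plugins, while the call after addDatasource() reflects the newly added datasource, which is the consistency concern raised for addDatasource() and removeDatasource().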
