Search API: Paragraph IDs and Cross-Domain Search Issues

Created on 13 December 2024, 4 months ago

Problem/Motivation

When Paragraphs, Domain, and Search API are used together, a search view can be configured to query nodes across all domains and display them even when they do not belong to the current domain. Listing the results this way works correctly.

However, two issues arise:

  • When a search uses an excerpt, everything works correctly unless a node does not belong to the current domain. In that case, paragraphs are displayed as IDs (e.g., 123..124..125..126) instead of their text or formatted content.
  • If a search query includes a term that exists only in a paragraph attached to a node from another domain, that node is not returned at all. This prevents the search from working across all domains as expected.

Steps to reproduce

  • Install and configure the Paragraphs, Domain, and Search API modules.
    • Ensure that Domain is set up to manage content visibility across multiple domains.
    • Configure Paragraphs to be used as fields in nodes.
    • Set up Search API to index nodes and their paragraph content.
  • Create nodes with paragraph fields.
    • Assign some nodes to different domains using the Domain module.
    • Populate the paragraph fields with text content for testing.
  • Configure a Search API view:
    • Ensure the view is set up to query all indexed nodes, regardless of their domain.
    • Add an excerpt field to display search context in the results.
  • Perform the following searches from a specific domain:
    • Search for a term that matches the content of a paragraph in a node from a different domain.
    • Observe whether the node appears in the search results.
    • Check the excerpt field for nodes from other domains: other fields display correctly, but paragraph fields are shown as IDs.

Proposed resolution

  • Modify the Search API integration to ensure that paragraphs render their content in the excerpt field, even when the node belongs to a different domain.
  • Adjust the indexing or querying logic to ensure that paragraph content is indexed and searchable across domains.
  • Extend normalization plugins to better handle paragraphs in the search context.
  • Document the required configuration or implement an option to handle these edge cases effectively.

Remaining tasks

🐛 Bug report

Status: Active
Version: 1.37
Component: General code
Created by: 🇧🇪Belgium hezounay

Comments & Activities

  • Issue created by @hezounay
  • 🇦🇹Austria drunken monkey Vienna, Austria

    Thanks for reporting this issue!

    However, it seems very specific to your setup, with modules I’m not using, so I’m not available for debugging this. I’d be open to reviewing and merging any MRs, though, if you find a solution that works (and doesn’t negatively impact the module in other ways). We might also need test coverage at that point, but probably pointless to start with that unless we have an idea about a solution.

    Are you using Solr? Then I can at least suggest a workaround that should work pretty well. When enabling the “Retrieve highlighted snippets” setting, Solr will already give you highlighted field values for all searched fields, so you could just build the excerpt using those. (This is actually what we did in Drupal 7 – see SearchApiSolrService::getExcerpt(). We abandoned this approach for security reasons, but if there are no field-level access restrictions for any of the fulltext fields on your site then this wouldn’t be a concern for you.)
    You can get the Solr response (with the highlighting values) via $result_set->getExtraData('search_api_solr_response').
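
The workaround described in the comment above can be sketched roughly as follows. This is an illustration of the logic only, written in Python rather than Drupal's PHP, and it rests on assumptions: the Solr response is taken to be already decoded into a plain dictionary with the standard `highlighting` section (document ID, then field name, then a list of snippets), and the function name `build_excerpt`, its parameters, and the document IDs and field names in the usage example are hypothetical. It is not how Search API itself builds excerpts.

```python
from typing import Any, List, Mapping


def build_excerpt(
    solr_response: Mapping[str, Any],
    doc_id: str,
    max_snippets: int = 3,
    separator: str = " … ",
) -> str:
    """Join the highlighted snippets Solr returned for one document.

    Assumes the decoded response contains Solr's "highlighting" section:
    {doc_id: {field_name: [snippet, ...]}}. Returns "" if there is none.
    """
    highlighting = solr_response.get("highlighting", {})
    fields = highlighting.get(doc_id, {})

    snippets: List[str] = []
    # Sort field names so the output order is deterministic.
    for field_name in sorted(fields):
        for snippet in fields[field_name]:
            text = snippet.strip()
            # Skip empty snippets and exact duplicates across fields.
            if text and text not in snippets:
                snippets.append(text)

    # No access checks happen here: every highlighted field is used as-is.
    return separator.join(snippets[:max_snippets])


if __name__ == "__main__":
    # Hypothetical response fragment; the IDs and field names are made up.
    response = {
        "highlighting": {
            "doc-1": {
                "field_a": ["<em>term</em> in a paragraph"],
                "field_b": ["a title with the <em>term</em>"],
            }
        }
    }
    print(build_excerpt(response, "doc-1"))
```

Because the sketch uses every highlighted field without any access check, it carries the security caveat mentioned in the comment: it is only suitable where none of the fulltext fields have field-level access restrictions.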
