Rendering is not changing the current content and string translation language

Created on 9 August 2023, almost 2 years ago

Problem/Motivation

While rendering content entities, search_api does not switch the current content language to the translation language of the item being processed. This happens only when content is indexed through the UI job or through drush, not when it is indexed on save. The problem shows up when, for example, a preprocess hook uses string translation that relies on the current interface language. Messages that are indexed as part of the content then appear in the interface language the user had while running the indexing job, instead of the language of the item being indexed.

Steps to reproduce

See Problem/Motivation

Proposed resolution

When indexing, change the current content language and the language used by the string translation service to the language of the item being processed.
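
The switch-and-restore idea behind this resolution can be sketched generically as follows. This is a minimal, language-agnostic illustration in Python and not Drupal or Search API code: `LanguageState`, `item_language` and `render_items` are hypothetical names made up for the example, and the real module would have to go through the language manager and string translation service instead.

```python
from contextlib import contextmanager


class LanguageState:
    """Hypothetical stand-in for the global language state (not a Drupal API)."""

    def __init__(self, content_langcode, translation_langcode):
        self.content_langcode = content_langcode
        self.translation_langcode = translation_langcode


@contextmanager
def item_language(state, item_langcode):
    """Temporarily switch both languages to the language of the item."""
    previous = (state.content_langcode, state.translation_langcode)
    state.content_langcode = item_langcode
    state.translation_langcode = item_langcode
    try:
        yield state
    finally:
        # Restore even if rendering raises, so later items and the rest of
        # the request are not affected by the switch.
        state.content_langcode, state.translation_langcode = previous


def render_items(state, items, render):
    """Render each (item_id, langcode) pair in that item's own language."""
    rendered = {}
    for item_id, langcode in items:
        with item_language(state, langcode):
            rendered[item_id] = render(item_id, state)
    return rendered
```

Putting the restore step in a `finally` block means the previous languages come back even when rendering a single item fails, which is one way to limit side effects of changing global state during indexing.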

Remaining tasks

🐛 Bug report
Status

Active

Version

1.0

Component

General code

Created by

🇩🇪 Germany hchonov 🇪🇺🇩🇪🇧🇬


Merge Requests

Comments & Activities

  • Issue created by @hchonov
  • Status changed to Needs review almost 2 years ago
  • Core: 9.5.x + Environment: PHP 8.1 & sqlite-3.27
    last update almost 2 years ago
    Patch Failed to Apply
  • Core: 9.5.x + Environment: PHP 8.1 & sqlite-3.27
    last update almost 2 years ago
    538 pass, 2 fail
  • 🇩🇪 Germany hchonov 🇪🇺🇩🇪🇧🇬
  • Status changed to Needs work over 1 year ago
  • 🇦🇹 Austria drunken monkey Vienna, Austria

    Thanks for reporting this problem and already providing a patch!

    First off, it’s important to note that, in general, no UI text should be present in the indexed HTML contents of an item, since that would lead to useless results. E.g., if your rendered items all contained a “Tags:” field label, then you couldn’t search for the word “tag” anymore without receiving every single piece of content as a result.
    However, some setups of course use rendered fields in the index just as storage, a sort of cache, to be able to display results faster without loading the entity. In this case, of course, you want the HTML exactly as it appears on the page, including all UI text. And then I guess you will run into this problem.

    Regarding your patch: I’m always very hesitant to change the global application state during indexing, as it seems almost certain that this will lead to problems in some scenarios. However, as we can see from the code that is already in RenderedItem::addFieldValues(), this cannot really be avoided without sacrificing reliability of the generated field values. So, I guess we might need to change the translation language, too, as you indicate, and just hope it doesn’t lead to more problems than it solves.

    However, this patch is definitely missing two things:

    1. Code that switches the language back after indexing (unless I’m mistaken?)
    2. A regression test demonstrating the problem and that it is actually fixed by this change

    Would be great if you could add these two things. In any case, thanks again!

  • 🇧🇪 Belgium swentel

    Hmm, is this related to, or the same as, 🐛 "Rendered HTML Output doesnt respect activeLanguage completely" (Needs work)?

  • Status changed to Closed: duplicate over 1 year ago
  • 🇦🇹 Austria drunken monkey Vienna, Austria

    @swentel: You’re right, thanks for noticing!

    @hchonov: Please see whether the patch in #3035977-43: "Rendered HTML Output doesnt respect activeLanguage completely" resolves the problem for you, too. Otherwise, please re-post your patch there.

  • First commit to issue fork.
  • Merge request !237 "Add patch as GitLab MR" (Open) created by ressa
  • Pipeline finished with Failed
    19 days ago
    Total: 1163s
    #498934
  • 🇩🇰 Denmark ressa Copenhagen

    Thanks @hchonov and @drunken monkey. I tried the patch in the other issue (#3035977), but the MR here is actually the only one that gets translated labels in a "Rendered HTML output" (rendered_item) field correctly indexed in Search API Solr, so thanks very much for sharing.

    Otherwise, all rendered field labels use the original value instead of both the original and the translated label. So I have added the patch as a GitLab MR, in case it's useful. Perhaps some of the code that produces the translations could be transferred to #3035977?

    However, I still cannot get my View to filter correctly. It seems to ignore the "Language (with fallback)" filter, which I added as a Search API field. Also, I don't see "Language (with fallback)" as an option on the "Processors" page (/admin/config/search/search-api/index/solr_index/processors). I wonder if that's a bug or the expected behaviour?

    It would be fantastic if the minimum steps required to pull data from Search API Solr with "Language (with fallback)" as a Views filter, and with "Detection and selection" (/admin/config/regional/language/detection) configured correctly, were documented somewhere. Or maybe they already are?

    Perhaps I should create a dedicated issue for this problem, or is it so closely connected to the task solved here that it should be handled here as well?

  • 🇩🇰 Denmark ressa Copenhagen

    After trying many things, I went for a more systematic approach, using the wonderful Search API Solr Devel to debug the requests and responses. Looking at a node translated into two languages (da and en), which had a Views block attached (using Contextual filters > Content datasource: ID (Default: Content ID from URL)), I noticed that both languages were being requested, which puzzled me -- all nodes showed the block twice, in both languages. I thought I had set up the "Language (with fallback)" filter in my view correctly, by selecting "Danish" and "English", but apparently not ...

    Because this request was sent to Solr, under a /da/node/123 page:

    '(+ss_type:"municipality" +sm_language_with_fallback:("da" "en"))'

    I tried some other options in the Views "Language (with fallback)" filter, and when I finally selected "Content language selected for page" it worked: only da or only en was requested, as desired. Now this was sent to Solr, under example.org/da/ pages:

    '(+ss_type:"municipality" +sm_language_with_fallback:"da")'

    So just to spell it out: to get data pulled from Search API Solr using "Language (with fallback)" as a Views filter, add the field as a filter in your View and then select "Content language selected for page", for example if you are using "Path prefix" under "URL language detection configuration". Search API will use the context of the page as a filter, so that when pages under example.org/da/ are accessed, Solr will only return items with "sm_language_with_fallback":["da"]. Likewise, if pages under example.org/en/ are accessed, Solr will only return items with "sm_language_with_fallback":["en"]. The filter settings look like this:

    Configure filter criterion: Search: Language (with fallback)
    
    Operator
    x Is one of
      Is not one of
    
    Language
      Select all
      Site's default language (Danish)
      Interface text language selected for page
    x Content language selected for page
      Danish
      English
      Not specified
      Not applicable

    I wonder if calling that option something else, like "Content language of page", would work better? The current wording makes it sound like the user is actively doing something ("selected for"), which the user isn't ...
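
The difference between the two requests quoted above can be reproduced with a small sketch. It only mimics the filter strings seen in the debug output and is not how Search API Solr builds its queries; `langcode_from_path`, `language_filter` and `build_query` are hypothetical helpers, and the path-prefix logic is a simplified assumption about "Path prefix" language detection.

```python
def langcode_from_path(path, known_langcodes, default_langcode):
    """Return the language prefix of a path such as /da/node/123."""
    first_segment = path.strip("/").split("/", 1)[0]
    if first_segment in known_langcodes:
        return first_segment
    return default_langcode


def language_filter(langcodes, field="sm_language_with_fallback"):
    """Format a required filter on one or several language codes."""
    quoted = " ".join(f'"{code}"' for code in langcodes)
    if len(langcodes) > 1:
        # Several values are grouped in parentheses, as in the first request.
        quoted = f"({quoted})"
    return f"+{field}:{quoted}"


def build_query(content_type, langcodes):
    """Combine the type filter and the language filter into one clause."""
    return f'(+ss_type:"{content_type}" {language_filter(langcodes)})'


# Selecting "Danish" and "English" explicitly asks for both languages.
both = build_query("municipality", ["da", "en"])

# "Content language selected for page" narrows it to the page's language.
page_langcode = langcode_from_path("/da/node/123", ["da", "en"], "da")
single = build_query("municipality", [page_langcode])
```

Here `both` corresponds to the request that returned the block twice, and `single` to the request sent once the filter followed the page's content language.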
