Make external entities language aware

Created on 31 January 2023, over 1 year ago
Updated 6 May 2024, about 2 months ago

Problem/Motivation

Currently it is not possible to map the langcode of an external entity, because external entities have no support for it. Content language is a special property in Drupal that requires more work than just creating a langcode field.

Proposed resolution

The proposed patch adds a langcode field and allows the language code of the external data to be mapped. LanguageInterface::LANGCODE_NOT_SPECIFIED is used when no mapping is set.
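
The fallback described here can be sketched in a few lines. This is a minimal illustration in Python, not the patch's PHP code: `resolve_langcode`, `raw` and `mapped_field` are hypothetical names, and the only value taken from Drupal is 'und', the value of LanguageInterface::LANGCODE_NOT_SPECIFIED. Treating an empty source value like a missing mapping is an assumption.

```python
from typing import Mapping, Optional

# Value of Drupal's LanguageInterface::LANGCODE_NOT_SPECIFIED.
LANGCODE_NOT_SPECIFIED = "und"


def resolve_langcode(raw: Mapping[str, object], mapped_field: Optional[str]) -> str:
    """Return the language code for one external record.

    `raw` is the record as returned by the external source and `mapped_field`
    is the source field mapped to the language code, or None when no mapping
    is configured. Both names are hypothetical.
    """
    if not mapped_field:
        return LANGCODE_NOT_SPECIFIED
    value = raw.get(mapped_field)
    # Assumption: an empty or missing value also falls back to "und".
    return str(value) if value else LANGCODE_NOT_SPECIFIED
```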

✨ Feature request
Status

RTBC

Version

2.0

Component

Code

Created by

🇵🇱 Poland gugalamaciek


Comments & Activities

  • Issue created by @gugalamaciek
  • Status changed to Needs review over 1 year ago
  • First commit to issue fork.
  • Status changed to Needs work over 1 year ago
  • 🇳🇱 Netherlands pgrond

    @gugalamaciek This works great and fixes the issue with multilingual listings: https://www.drupal.org/project/external_entities/issues/3249146 (🐛 Links in listing do not respect language of external entity, Postponed)

    The only thing that bothers me is that we get 3 extra fields in the mapping:

    • Language code
    • Language object
    • Default translation

    I think we only need Language code, and maybe default translation, although I don't see the use case right now. I think we should hide these fields from the user, maybe with a hook_form_alter?

  • 🇨🇦 Canada joseph.olstad

    This fix is needed because, if you are indexing into a search engine like Solr, language undefined ('und') results in broken stemming and advanced text processing. Advanced text processing depends on the language, so with language undefined your search functionality will be vastly degraded in all languages, including English.

  • Assigned to joseph.olstad
  • Status changed to Active 5 months ago
  • 🇨🇦 Canada joseph.olstad

    Reviewing. So far, the patch still applies to my build.

  • Open in Jenkins → Open on Drupal.org →
    Core: 10.2.x + Environment: PHP 8.1 & MySQL 8
    last update 5 months ago
    4 fail
  • 🇨🇦 Canada joseph.olstad

    Looks like a test failure was introduced in the recent batch of commits, unrelated to the patch above.

  • 🇨🇦 Canada joseph.olstad

    This patch allowed us to specify a language for our external_entities, which fixed language-specific advanced processing such as stemming when indexing into Solr.

    With this patch applied, the language mapping correctly configured, and our external_entities indexes re-indexed, a search for baseball also shows results for baseballs, and a search for baseballs also shows results for baseball. These are the results we were looking for.
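
    Why the language matters for the baseball/baseballs case can be shown with a toy example. This is only a sketch under strong assumptions: the single suffix rule below is invented for illustration and is not how Solr's analyzers work, and `stem` and `matches` are hypothetical names. The point is only that a stemmer is selected per language, so 'und' gets none.

```python
# Toy suffix rules per language; real analyzers are far more elaborate.
# No entry exists for "und", so undefined-language text is never stemmed.
STEM_RULES = {"en": ("s",)}


def stem(token: str, langcode: str) -> str:
    """Lowercase the token and strip one known suffix for the given language."""
    token = token.lower()
    for suffix in STEM_RULES.get(langcode, ()):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def matches(query: str, indexed_word: str, langcode: str) -> bool:
    """Compare query and indexed word after language-specific stemming."""
    return stem(query, langcode) == stem(indexed_word, langcode)


print(matches("baseball", "baseballs", "en"))   # True
print(matches("baseball", "baseballs", "und"))  # False
```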

  • Issue was unassigned.
  • Status changed to RTBC 5 months ago
  • 🇨🇦 Canada francismak

    Hi,

    After applying the patch, I am able to see there are 3 new fields in the field mappings:
    Language Code
    Language Object
    Default translation

    May I know how to add translations? Let's say I have a custom table:
    CREATE TABLE `samples` (
    `id` int(11) NOT NULL AUTO_INCREMENT,
    `langcode` varchar(12) NOT NULL,
    `field1` text,
    PRIMARY KEY (`id`)
    );

    With 2 rows:
    id | langcode | field1
    1 | 'en' | 'English field1 text'
    2 | 'fr' | 'French field1 text'

    My goal is to set up the entity, let Solr index it, and provide a view to search 'field1'.
    I was able to set up the field1 mapping using this module, set up the Solr index, and create a view with a search box to search field1.

    However, with multiple languages, my view just shows all rows from the table above, even though I have already added the filter "The item's language" using "Interface text language selected for page" to the view.

    Thank you.

  • 🇨🇦 Canada joseph.olstad

    @francismak
    Two choices. If you have a monolingual external entity and want to map it to a specific language, you can use a literal instead of a real mapping.
    Example for English:
    +en

    The + symbol marks a literal.

    See screenshot

    However, if your external entity is multilingual or bilingual, you'll want to actually map to a language field from the external entity.

    In my case I have one language per external entity bundle, so I hard-code the mapping with +en or +fr.

    With that said, I did have a langcode from the external source, could skip that.

    Hope this helps.
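
    The two choices can be sketched as a small resolver. This is a simplified illustration, not the module's code: `resolve_mapping` is a hypothetical name, and only the two forms discussed in this thread (a leading + for a literal, a plain field name for a mapping) are handled. The sample row reuses the values from the table posted earlier in the thread.

```python
from typing import Mapping, Optional


def resolve_mapping(expression: str, raw: Mapping[str, object]) -> Optional[object]:
    """Resolve a field-mapping expression against one external record.

    A leading "+" marks a literal value; anything else is treated as the
    name of a field in the external record.
    """
    if expression.startswith("+"):
        return expression[1:]
    # A field that does not exist in the record resolves to None.
    return raw.get(expression)


row = {"id": 2, "langcode": "fr", "field1": "French field1 text"}
print(resolve_mapping("+en", row))       # en
print(resolve_mapping("langcode", row))  # fr
print(resolve_mapping("en", row))        # None
```

    The last call shows why a bare en is not the same as +en: without the + it is looked up as a field name, and the sample row has no such field.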

  • 🇨🇦 Canada joseph.olstad

    Note: this is also important for unilingual configurations. Without this fix, if you index external content in Solr without a language (for example, language undefined), your search functionality will be severely degraded.

    Solr's stemming functionality requires a language to be defined.
    Other advanced text processing use cases also require a language other than Undefined.

  • 🇨🇦 Canada francismak

    Thank you @josepholstad for the instructions; the screenshot was important for me.
    After some trial and error, this is how I could configure it:

    1. Create 2 external entities, one each for English and French.
    2. In the Language mapping field, use '+en' or '+fr' to hard-code the language.
    3. Create two separate Solr indexes for those two languages.
    4. Create two views and connect them to the Solr indexes for both languages ...

    I think there is no way to dynamically pick up the langcode column from the table.

  • 🇨🇦 Canada joseph.olstad

    @francismak

    Yes, there is a way to pick up the langcode.

    It is not necessary to have two external entities.

    Add the langcode to your external datasource.

    If you leave out the + symbol, it's a mapping.

    The instructions I provided are the other option: hard-coding the langcode instead of mapping it.

  • 🇨🇦 Canada francismak

    @josepholstad appreciate your help!! I got it working now, using one entity without a hard-coded language.

    Sorry for turning this thread into a support issue. I'm sharing my setup here in the hope that it benefits other users.

    So I have my external database table as described in #13.

    Apply the patch from this issue and set up the external entity. Make sure of 2 things:
    1. Under the SQL queries, select the langcode field, i.e.
    Full object. select id, field1, langcode from {1:samples} where id = :id
    List objects. select id, field1, langcode from {1:samples} where TRUE :filters

    2. Under field mappings, put the langcode column under the "Language » Language code" field.

    Then create the Solr search index with the new entity. If any field is updated, make sure to 'Rebuild tracking information'.

    Finally, in the view, you will be able to filter by "Item Language".
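
    The data flow of this setup can be sketched outside Drupal. The following is only an illustration under stated assumptions: it uses an in-memory SQLite table as a stand-in for the MySQL `samples` table, plain SQL without the module's {1:samples} and :filters placeholders, and hypothetical names (`FIELD_MAPPING`, `load_entities`, `filter_by_language`). Solr and Views are not involved; the filter function only mimics the effect of the language filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
# SQLite stand-in for the MySQL `samples` table from the earlier comment.
conn.execute(
    "CREATE TABLE samples ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "langcode VARCHAR(12) NOT NULL, "
    "field1 TEXT)"
)
conn.executemany(
    "INSERT INTO samples (langcode, field1) VALUES (?, ?)",
    [("en", "English field1 text"), ("fr", "French field1 text")],
)

# Hypothetical field mapping: entity field name -> source column.
FIELD_MAPPING = {"id": "id", "field1": "field1", "langcode": "langcode"}


def load_entities(connection):
    """Run the list query and apply the field mapping to every row."""
    rows = connection.execute("SELECT id, field1, langcode FROM samples")
    return [{field: row[col] for field, col in FIELD_MAPPING.items()} for row in rows]


def filter_by_language(entities, langcode):
    """Mimic the effect of a language filter on the mapped entities."""
    return [e for e in entities if e["langcode"] == langcode]


entities = load_entities(conn)
print([e["field1"] for e in filter_by_language(entities, "fr")])
# ['French field1 text']
```

    Because the langcode column is selected and mapped, a single entity type carries both languages and the filter can separate them.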

    01-mapping

    02-views

    03-views-preview

  • 🇨🇦 Canada joseph.olstad

    @francis

    I would set your default language to a hard-coded value instead of a mapped value.

    Mapping 'en' most likely resolves to null, since you probably do not have a field named en in your external source.

    Change this to a hard-coded value instead:

    en
    becomes
    +en

  • Open in Jenkins → Open on Drupal.org →
    Core: 9.5.x + Environment: PHP 8.1 & MySQL 8
    last update 5 months ago
    2 pass
  • 🇨🇦 Canada joseph.olstad

    This is an important fix: without a language, Solr indexes give very poor results, because language-specific advanced text processing is lost in every language, including English.

    • 245d4a63 committed on 3.0.x
      Issue #3337903 by pgrond, gugalamaciek: Make external entities language...
  • 🇫🇷 France guignonv

    Applied to v3. Needs to be also applied to v2.
