Synonyms don't work because the analyzer isn't used

Created on 28 March 2023, over 1 year ago

Problem/Motivation

Synonyms don't seem to work. I can add the list of synonyms to the server settings but they don't take effect when running a query.

Steps to reproduce

  1. Add some synonyms using the form on the 'edit server' page
  2. Force a recreation of the index - for example, by visiting the 'edit fields' page on our search index and saving the form with no changes (see extra note below)
  3. Confirm the synonyms are shown in an analyzer in the index settings - for example, using the OpenSearch dashboard: GET /my_index_name/_settings
  4. Perform a search that references synonyms in our list
  5. The search does not seem to use the synonyms. I expect it to.

Proposed resolution

I've done a bit of poking and can see the synonym config is added by SynonymsSubscriber::onAlterSettings.

The config is added to an analyzer called querytime_synonyms. I can't find any other reference to querytime_synonyms in the code or in OpenSearch/Elasticsearch docs. It looks like a custom name. I think that means it could only be invoked by setting it as the analyzer for a field in the field mapping, and/or at query time. I can't see that the module currently does this, which means I think it's just not used?
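
To illustrate the reasoning above: a custom analyzer only takes effect when something refers to it by name. A rough Python sketch of that check follows, meant to be run against the analysis settings and field properties returned by GET /my_index_name/_settings and GET /my_index_name/_mapping. The function name and the sample data are made up for illustration and are not part of the module. It does not cover analyzers named in individual queries.

```python
def analyzer_is_referenced(index_settings, properties, name):
    """Return True if `name` is an index default or is used by any field mapping."""
    analyzers = index_settings.get("analysis", {}).get("analyzer", {})
    # "default" and "default_search" are picked up by the search engine
    # automatically; any other name has to be referenced explicitly.
    if name in ("default", "default_search") and name in analyzers:
        return True

    def walk(props):
        for field in props.values():
            if name in (field.get("analyzer"), field.get("search_analyzer")):
                return True
            # Recurse into object sub-properties and multi-fields.
            for key in ("properties", "fields"):
                if walk(field.get(key, {})):
                    return True
        return False

    return walk(properties)


# Hypothetical data shaped like the situation in this issue: the analyzer is
# defined in the settings, but no field points at it.
settings = {"analysis": {"analyzer": {"querytime_synonyms": {"type": "custom"}}}}
properties = {"title": {"type": "text"}, "body": {"type": "text"}}

print(analyzer_is_referenced(settings, properties, "querytime_synonyms"))
```

With the sample data this prints False, which matches what I see: the analyzer exists in the index settings but nothing uses it.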

I think a reasonable choice would be to set it as the default analyzer (and default_search too?), or have an option alongside the synonym form for whether or not to use it as the defaults. For comparison, the default config that ships with the Solr module has synonyms enabled for fulltext fields.
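
A minimal sketch of what the first option could look like, written in Python purely to show the shape of the settings body (the module itself builds this as a PHP array). The tokenizer and filter list are copied from the existing subscriber. The helper name, the `analyzer_names` parameter and the `synonym` filter type are assumptions for illustration.

```python
import json


def build_synonym_settings(synonyms, analyzer_names=("default",)):
    """Build an index settings fragment that registers a synonym analyzer.

    The same analyzer definition is registered under every name in
    `analyzer_names`, e.g. ("default", "default_search").
    """
    analyzer = {
        "type": "custom",
        "tokenizer": "standard",
        "filter": ["lowercase", "asciifolding", "synonyms"],
    }
    return {
        "analysis": {
            "filter": {
                "synonyms": {
                    "type": "synonym",  # assumed filter type, not confirmed
                    "lenient": True,
                    "synonyms": [s.strip() for s in synonyms],
                }
            },
            "analyzer": {name: dict(analyzer) for name in analyzer_names},
        }
    }


# Example synonym rules, invented for this sketch.
settings = build_synonym_settings([" tv, television ", "couch, sofa"])
print(json.dumps(settings, indent=2))
```

Passing `analyzer_names=("default", "default_search")` would register the analyzer under both names, which is the variant I'm unsure is needed.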

Alternatives could be to create a processor that can be enabled per-field, or a new field type, but I think either of those is quite a bit more complex.

---

As an aside, I think there's also a minor problem shown by steps #1/#2 in my steps to reproduce above. Changing synonyms on the 'edit server' form, and then saving the form, doesn't trigger an update or recreation of the index.

This means that, separately from the fact that the analyzer config isn't being used, the analyzer config isn't actually set on the OpenSearch index until you make a different change to the index that triggers BackendClient::updateIndex (like re-saving the field configuration). I'd expect saving the server form to update the index, or, if not, to show a message noting that the index hasn't been updated.

🐛 Bug report
Status

Active

Version

2.0

Component

Code

Created by

🇦🇺 Australia tallytarik

Comments & Activities

  • Issue created by @tallytarik
  • 🇳🇿 New Zealand magunz

    I can confirm that renaming "querytime_synonyms" to "default" fixes the problem:

    diff --git a/src/Event/SynonymsSubscriber.php b/src/Event/SynonymsSubscriber.php
    index 829bc4e..56e4713 100644
    --- a/src/Event/SynonymsSubscriber.php
    +++ b/src/Event/SynonymsSubscriber.php
    @@ -26,7 +26,7 @@ class SynonymsSubscriber implements EventSubscriberInterface {
             'lenient' => TRUE,
             'synonyms' => array_map('trim', $synonyms),
           ];
    -      $settings['analysis']['analyzer']['querytime_synonyms'] = [
    +      $settings['analysis']['analyzer']['default'] = [
             'type' => 'custom',
             'tokenizer' => 'standard',
             'filter' => ['lowercase', 'asciifolding', 'synonyms'],
    