Synonyms don't work because the analyzer isn't used

Created on 28 March 2023, over 1 year ago

Problem/Motivation

Synonyms don't seem to work. I can add the list of synonyms to the server settings but they don't take effect when running a query.

Steps to reproduce

  1. Add some synonyms using the form on the 'edit server' page
  2. Force a recreation of the index - for example, by visiting the 'edit fields' page on our search index and saving the form with no changes (see extra note below)
  3. Confirm the synonyms are shown in an analyzer in the index settings - for example, using the OpenSearch dashboard: GET /my_index_name/_settings
  4. Perform a search that references synonyms in our list
  5. The search does not seem to use the synonyms. I expect it to.

Proposed resolution

I've done a bit of poking and can see the synonym config is added by SynonymsSubscriber::onAlterSettings.

The config is added to an analyzer called querytime_synonyms. I can't find any other reference to querytime_synonyms in the code or in OpenSearch/Elasticsearch docs. It looks like a custom name. I think that means it could only be invoked by setting it as the analyzer for a field in the field mapping, and/or at query time. I can't see that the module currently does this, which means I think it's just not used?

I think a reasonable choice would be to set it as the default analyzer (and default_search too?), or have an option alongside the synonym form for whether or not to use it as the defaults. For comparison, the default config that ships with the Solr module has synonyms enabled for fulltext fields.

Alternatives could be to create a processor that can be enabled per-field, or a new field type, but I think either of those are quite a bit more complex.

---

As an aside, I think there's also a minor problem shown by steps #1/#2 in my steps to reproduce above. Changing synonyms on the 'edit server' form, and then saving the form, doesn't trigger an update or recreation of the index.

This means that, separately from the fact that the analyzer config isn't being used, the analyzer config isn't actually set on the Opensearch index until you make a different change to the index that triggers BackendClient::updateIndex (like re-saving the field configuration). I think I'd expect it to, or, if not, show a message that notes the index hasn't been updated.

πŸ› Bug report
Status

Active

Version

2.0

Component

Code

Created by

πŸ‡¦πŸ‡ΊAustralia tallytarik

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Comments & Activities

  • Issue created by @tallytarik
  • πŸ‡³πŸ‡ΏNew Zealand magunz

    I confirming that: Renaming "querytime_synonyms" to "default" fix the problem

    diff --git a/src/Event/SynonymsSubscriber.php b/src/Event/SynonymsSubscriber.php
    index 829bc4e..56e4713 100644
    --- a/src/Event/SynonymsSubscriber.php
    +++ b/src/Event/SynonymsSubscriber.php
    @@ -26,7 +26,7 @@ class SynonymsSubscriber implements EventSubscriberInterface {
             'lenient' => TRUE,
             'synonyms' => array_map('trim', $synonyms),
           ];
    -      $settings['analysis']['analyzer']['querytime_synonyms'] = [
    +      $settings['analysis']['analyzer']['default'] = [
             'type' => 'custom',
             'tokenizer' => 'standard',
             'filter' => ['lowercase', 'asciifolding', 'synonyms'],
    
  • πŸ‡¦πŸ‡ΊAustralia mstrelan

    The solution in #2 did not work for me, but what did work was to disable fuzziness on the server configuration. We may need to find a way to combine fuzziness with synonyms.

    The other issue I found is that the AlterSettingsEvent is dispatched when the index is saved, not when the server is saved, yet the synonyms configuration is on the server. It might make more sense to move the synonyms to the index, but failing that we should probably update all indexes for a server when the server is updated.

  • πŸ‡¦πŸ‡ΊAustralia kim.pepper πŸ„β€β™‚οΈπŸ‡¦πŸ‡ΊSydney, Australia

    kim.pepper β†’ made their first commit to this issue’s fork.

  • Pipeline finished with Failed
    2 months ago
    Total: 196s
    #313113
  • Pipeline finished with Failed
    2 months ago
    Total: 226s
    #313112
  • Pipeline finished with Failed
    2 months ago
    Total: 317s
    #313111
  • πŸ‡¦πŸ‡ΊAustralia kim.pepper πŸ„β€β™‚οΈπŸ‡¦πŸ‡ΊSydney, Australia

    kim.pepper β†’ changed the visibility of the branch 3.x to hidden.

  • πŸ‡¦πŸ‡ΊAustralia kim.pepper πŸ„β€β™‚οΈπŸ‡¦πŸ‡ΊSydney, Australia

    Yes, the AlterSettingsEvent should probably be called AlterIndexSettingsEvent.

    We don't currently have anything to specifically update the server settings.

  • πŸ‡¬πŸ‡§United Kingdom rupertj Bristol, UK

    The solution from #2 is working for me. I have to rebuild the index after changing the synonyms, but after that it works as expected.

Production build 0.71.5 2024