Remove end punctuations from autocomplete suggestions

Created on 29 March 2022, over 2 years ago
Updated 11 May 2024, about 2 months ago

Problem/Motivation

The end punctuations need to remove from autocomplete suggestions.

Steps to reproduce

Ignore character configurations removes all punctuations, but in some cases we need to remove only end punctuations from autocomplete suggestions.

Proposed resolution

✨ Feature request
Status

Needs work

Version

1.0

Component

Plugins

Created by

🇮🇳India mukesh.dev

Live updates comments and jobs are added and updated live.
  • Needs tests

    The change is currently missing an automated test that fails when run with the original code, and succeeds when the bug has been fixed.

Sign in to follow issues

Merge Requests

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • 🇦🇹Austria drunken monkey Vienna, Austria

    Thanks a lot for suggesting this improvement.
    It is true that trailing punctuation is often a problem in autocomplete, when not set up carefully. Ideally, you’ll want to make sure that the indexed fields used for autocomplete simply don’t contain any punctuation. However, I do realize that this is unfortunately not that easy, and a simpler, if less ideal, solution would make sense.

    However, as it’s not certain that everyone will want this behavior, hard-coding it into the Suggestion class, which is really just a value object, is definitely not the right way for a generally useful implementation.
    Rather, I’d suggest to make this part of the “Retrieve from server” suggester plugin, make it optional, and implement it at the end of \Drupal\search_api_autocomplete\Plugin\search_api_autocomplete\suggester\Server::getAutocompleteSuggestions(), by going through the suggestions returned from the server and adapting them accordingly. (We’ll also have to make sure to keep the suggestions unique – it’s not unlikely that there will be suggestions for both “foo” and “foo.”, so when removing punctuation you’d end up with a duplicate suggestion.)
    For what to remove, I’d suggest removing all non-letter, non-number characters (PCRE pattern /[^\pL\pN]+$/u, which should give us maximal compatibility across languages.
    If you’d be willing to work on such a patch, I’d be happy to assist/review.

  • 🇷🇴Romania adevms

    Added a patch for having a configuration option in the Retrieve from server plugin that removes non-letter, non-number characters from the suggestions.

  • 🇦🇹Austria drunken monkey Vienna, Austria

    Thanks a lot for your patch!
    In the future, though, please remember to set the issue status to “Needs review” when posting a patch or creating/updating an MR that you feel would be ready for being added to the project.
    Also, in this particular project, we have now switched to using issue forks and MRs exclusively, patches cannot be tested by the test bot anymore. I now created an MR containing the changes from your patch.

    Now, some notes on the patch itself:

    1. The new config key would need to be reflected in \Drupal\search_api_autocomplete\Plugin\search_api_autocomplete\suggester\Server::defaultConfiguration() and config/schema/search_api_autocomplete.schema.yml (under plugin.plugin_configuration.search_api_autocomplete_suggester.server).
    2. Your current code doesn’t really do what the config form description says: the description says the new option will “remove non letter characters”, but instead you remove any suggestions ending with such characters entirely. As per my comment #4 above (about both “foo” and “foo.” being suggested) this might make perfect sense, but then this should be clearly stated in the description.
      Maybe even better, there might be different settings for this option, about what to do with non-alphanumeric characters: ignore, strip, strip when at the end of the suggestion, remove whole suggestion when ending with one. However, when stripping characters we have to be careful not to run into this issue with duplicates, so if enough people are interested then starting with just your implementation might be fine. The description just needs to clearly state what this does. (Also, we might want to already define the new config key in a way that would later make it easier to expand?)
    3. Finally, before it can be merged this would also need automated test coverage.

    In any case, thanks again for making another jab at this!

  • Issue was unassigned.
  • 🇦🇹Austria drunken monkey Vienna, Austria
Production build 0.69.0 2024