Sorting and sort boost

Created on 22 September 2022, about 2 years ago
Updated 28 August 2023, about 1 year ago

Problem/Motivation

The module does not add sort parameters to the meilisearch query. Nor does it add any settings to the index about sort boosting. https://docs.meilisearch.com/learn/advanced/sorting.html#sorting

Steps to reproduce

1. Create a meilisearch index and add fields with various boost settings.
2. Create a search api index view with exposed sort
3. Visit the view and try sorting - it does not do any sorting

Proposed resolution

Add code to the backend plugin to handle sorting

📌 Task
Status

Fixed

Version

1.0

Component

Code

Created by

🇸🇮Slovenia bcizej

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • 🇸🇮Slovenia DeaOm

    Based on the documentation available (keep in mind this module does not support the latest version, yet), the ranking attributes are an array, in which first rule/first slot has the most impact, and the last rule has the least impact. Sort is therefore added to the first place/slot, this additionally only works if there is a sort search parameter, so I kinda don't mind that this applies to all indexes, even if you do not want to sort, as it's then ignored. The boosting of the fields only works for fulltext fields, which means based on the boost number added the fields are then added to the end of the ranking attributes in that order. The sorting works as expected, I'm not entirely sure boosting does anything, but then again, I do not have a "good" dataset to test with. Based on the code and documentation, it should work.
    I think this would benefit if there were tests added, so marking this issue as needs work. It would also be good to first review and merge the 📌 Providing tests Fixed .

  • 🇸🇮Slovenia DeaOm

    Added the code for integer sorting, that removes the casting to string from related issue 🐛 Sorting impossible, numeric fields erroneously indexed as string (patch included) Closed: duplicate . Because this still needs tests, leaving it to needs work.

  • 🇦🇹Austria tgoeg

    Regarding your question over at #3365060: Sorting impossible, numeric fields erroneously indexed as string (patch included):

    The only question I have for that issue is, does this sorting attribute need to be added as a setting?

    Yes, indeed. It needs to be added with a request to the server.
    That's what these lines in my patch do (and the added function setServerSorts() I am leaving out here for clarity):

    +        if ($preverror['errorCode'] == 'invalid_sort' || $preverror['errorCode'] == 'invalid_filter') {
    +          // Attributes that could be used as sortable fields are saved in
    +          // settings/sortable-attributes or settings/filterable-attributes
    +          // on the server. The settings update needs to finish before you can
    +          // continue.
    +          $status = $this->meiliService->setServerSorts($index->id(), array_keys($sorts));
    +          if (isset($status['taskUid'])) {
    +            $this->meiliService->waitForUpdate($index->id(), $status['taskUid']);
    +          }

    But looking at the commits in the MR, this is performed here as well with updateSortableAttributes().

    If you mean a user facing webUI setting - I could imagine adding a setting like "strictly prefer sorting over relevance" and only then put "sorts" on top of the ranking rules. In the default setup, "sorts" is just a tie-breaker for results coming out at the same rank based on other criteria before, which might be preferred by some.

    Other than that: No. I don't think there's anything more that needs to be added to either of the settings.

    A short code review afaict:

    1. The updateSortableAttributes() function gets called directly in the MR's src/Plugin/search_api/backend/SearchApiMeiliSearchBackend.php whereas the unchanged src/Api/MeiliSearchApiService.php still has wrapped calls to updateFilterableAttributes(). My patch also wrapped updateSortableAttributes(). I am not the one to judge which method is better, but it seems inconsequent now.
    2. As lined out in my patch's comment, updating ranking rules every time is unnecessary. This is solved in the MR by retrieving the rules every time and comparing them. My approach seems more efficient and saves one request for each query: Blindly try to retrieve the results and only if the attributes are not set, set them and try again. This way, these requests will only be performed when strictly necessary. Which will only be a very very small fraction when actually changing settings. The bulk of the (search) queries (when everything is set) will see improved performance.
    3. The lines starting at 291 in src/Plugin/search_api/backend/SearchApiMeiliSearchBackend.php in my patch try to deal with other repeated calls to the server where I think some can be removed. At least I still seem to get the desired results with my patch applied :-)
      There are some comments as well where I think further reduction can be reached but I am unsure how to solve it.
    4. There's this comment in my patch:
      +      // TODO:  Wouldn't setting "hitsPerPage" and therefore getting an exhaustive
      +      //        number of "totalHits" be better than setting off two requests?
      +      //        https://www.meilisearch.com/docs/reference/api/search#number-of-results-per-page

      as it seems this is another piece of code that produces more requests than actually necessary, also maybe because Meilisearch gained features that need to be worked around earlier.

    5. src/Api/MeiliSearchApiService.php in my patch over at the other issue adds a try-catch in function search() as it seems I hit an error in that location that was not caught. Maybe because of the missing sorts settings mentioned above, I don't quite remember. Doesn't hurt though, I'd add that.
    6. src/Api/MeiliSearchApiServiceInterface.php in my patch adds a few comments for the now imprecisely called "setSettings" and "getSettings", as they don't update all settings but just the filterable attributes. They (and calls to them) should probably be renamed. Or, depending on point 1, they could be removed as well and updateFilterableAttributes() be called directly.
    7. I am unsure whether
      // This is done as a workaround because the Meilisearch API does not support optional query yet.
      still applies or what it means. It and the code implementing it can probably removed because of the following:
      // TODO:    An empty query string seems to be possible now, if that's what "optional query" means:
      //          https://www.meilisearch.com/docs/reference/api/search#placeholder-search

    Of the points mentioned above, 1+2 (probably 6) are directly related to this very issue.
    Points 3-5 (or 6) may be more of the general improvement kind and could be split out to a separate issue as well to get this merged more quickly. Or they are easy enough to integrate them as well while at it. You decide!

  • Open in Jenkins → Open on Drupal.org →
    Core: 9.5.x + Environment: PHP 7.4 & MySQL 5.7
    last update about 1 year ago
    26 pass
  • 🇸🇮Slovenia bcizej

    Branch rebased so tests can be implemented for sorting as well.

    As lined out in my patch's comment, updating ranking rules every time is unnecessary. This is solved in the MR by retrieving the rules every time and comparing them. My approach seems more efficient and saves one request for each query: Blindly try to retrieve the results and only if the attributes are not set, set them and try again. This way, these requests will only be performed when strictly necessary. Which will only be a very very small fraction when actually changing settings. The bulk of the (search) queries (when everything is set) will see improved performance.

    I agree and think that the settings should only be updated when creating or updating the index. We should also take into account if the index is set to "Read only" that no operations other than read should be possible. Granted this can be done by using a key with limited permissions but since this is a setting in Drupal it should also be respected regardless of key permissions. This issue is also present in filtering logic.

    The lines starting at 291 in src/Plugin/search_api/backend/SearchApiMeiliSearchBackend.php in my patch try to deal with other repeated calls to the server where I think some can be removed. At least I still seem to get the desired results with my patch applied :-)
    There are some comments as well where I think further reduction can be reached but I am unsure how to solve it.

    The code in here is pretty chaotic, MeiliSearchApiService.php is not that much better. I'd stick to solving the sorting in this issue and we open separate issues for optimizations (of course we optimize calls regarding sorts here).

    Points 4-7 are separate issues as well.

  • First commit to issue fork.
  • Open in Jenkins → Open on Drupal.org →
    Core: 9.5.x + Environment: PHP 7.4 & MySQL 5.7
    last update about 1 year ago
    26 pass
  • I agree with this.

    The updateSortableAttributes() function gets called directly in the MR's src/Plugin/search_api/backend/SearchApiMeiliSearchBackend.php whereas the unchanged src/Api/MeiliSearchApiService.php still has wrapped calls to updateFilterableAttributes(). My patch also wrapped updateSortableAttributes(). I am not the one to judge which method is better, but it seems inconsequent now.

    So for code consistency, I added a wrapper for ranking rules and sortable attributes.

  • 🇦🇹Austria tgoeg

    Alright, I think we're heading the right way.
    ad point 2 / setting attributes only when they are changed:
    Definitely. This would be even better than having to test/compare or try/catch in expectancy of missing settings. A try/catch should probably still stay in, just as a safety measure (but catch could just bail out with a meaningful error message; I don't know how a DB import in Meilisearch works for example, maybe the attribute settings have to be recreated in such a scenario. Or when deploying to a production system.)

    Thanks for the wrapping functions, that solves point 1 then.

    I'll rip out the remaining points and open new issues.
    Let's get the core issue here "sorted" ;-) first.

  • 🇦🇹Austria tgoeg

    Split out issues:
    https://www.drupal.org/project/search_api_meilisearch/issues/3382615 📌 Fix unnecessary multiple requests for each search Fixed
    https://www.drupal.org/project/search_api_meilisearch/issues/3382618 📌 Rename MeiliSearchApiServiceInterface's setSettings and getSettings Fixed
    https://www.drupal.org/project/search_api_meilisearch/issues/3382620 📌 Catch errors when trying to retrieve indexes Fixed
    https://www.drupal.org/project/search_api_meilisearch/issues/3382621 📌 Remove workaround for optional queries Fixed

    OT: Can someone please reveal how to markup links to issues correctly so they show their status and get colorized? Been searching high and low but couldn't find it!

  • 🇸🇮Slovenia bcizej

    @tgoed You can use [#issuenumber] https://www.drupal.org/filter/tips .

  • Open in Jenkins → Open on Drupal.org →
    Core: 9.5.x + Environment: PHP 7.4 & MySQL 5.7
    last update about 1 year ago
    26 pass
  • A small change to implement what @bcizej mentioned. Now the updateIndex() shall be skipped if the index is set to read-only.

  • Open in Jenkins → Open on Drupal.org →
    Core: 9.5.x + Environment: PHP 7.4 & MySQL 5.7
    last update about 1 year ago
    26 pass
  • After checking what read-only does, it's supposed to separate indexers from searchers, but it still has to allow other options to be set by both. Also, meilisearch always reindexes when changing ranking rules or sortable attributes, so there is no real need to queue reindexing on Drupal side.

  • Open in Jenkins → Open on Drupal.org →
    Core: 9.5.x + Environment: PHP 7.4 & MySQL 5.7
    last update about 1 year ago
    Unable to generate test groups
  • Open in Jenkins → Open on Drupal.org →
    Core: 9.5.x + Environment: PHP 7.4 & MySQL 5.7
    last update about 1 year ago
    28 pass, 2 fail
  • Open in Jenkins → Open on Drupal.org →
    Core: 9.5.x + Environment: PHP 7.4 & MySQL 5.7
    last update about 1 year ago
    28 pass, 2 fail
  • Open in Jenkins → Open on Drupal.org →
    Core: 9.5.x + Environment: PHP 7.4 & MySQL 5.7
    last update about 1 year ago
    31 pass
  • Status changed to Needs review about 1 year ago
  • Open on Drupal.org →
    Core: 9.5.x + Environment: PHP 7.4 & MySQL 5.7
    last update about 1 year ago
    Not currently mergeable.
  • Open on Drupal.org →
    Core: 9.5.x + Environment: PHP 7.4 & MySQL 5.7
    last update about 1 year ago
    Not currently mergeable.
  • Open in Jenkins → Open on Drupal.org →
    Core: 9.5.x + Environment: PHP 7.4 & MySQL 5.7
    last update about 1 year ago
    98 pass
  • Currently, rewriting the SortingTest.php to use BackendKernelTestBase . When that is done it should be ready for merge.

  • Open in Jenkins → Open on Drupal.org →
    Core: 9.5.x + Environment: PHP 7.4 & MySQL 5.7
    last update about 1 year ago
    99 pass
  • Open in Jenkins → Open on Drupal.org →
    Core: 9.5.x + Environment: PHP 7.4 & MySQL 5.7
    last update about 1 year ago
    99 pass
  • Open in Jenkins → Open on Drupal.org →
    Core: 9.5.x + Environment: PHP 7.4 & MySQL 5.7
    last update about 1 year ago
    99 pass
  • Open in Jenkins → Open on Drupal.org →
    Core: 9.5.x + Environment: PHP 7.4 & MySQL 5.7
    last update about 1 year ago
    95 pass
  • 🇸🇮Slovenia bcizej

    Branch rebased

  • Open in Jenkins → Open on Drupal.org →
    Core: 9.5.x + Environment: PHP 7.4 & MySQL 5.7
    last update about 1 year ago
    95 pass
  • Open in Jenkins → Open on Drupal.org →
    Core: 9.5.x + Environment: PHP 7.4 & MySQL 5.7
    last update about 1 year ago
    95 pass
  • Open in Jenkins → Open on Drupal.org →
    Core: 9.5.x + Environment: PHP 7.4 & MySQL 5.7
    last update about 1 year ago
    Unable to generate test groups
  • Open in Jenkins → Open on Drupal.org →
    Core: 9.5.x + Environment: PHP 7.4 & MySQL 5.7
    last update about 1 year ago
    95 pass, 1 fail
  • Assigned to bcizej
  • Status changed to Needs work about 1 year ago
  • 🇸🇮Slovenia bcizej

    The boosting of the fields only works for fulltext fields, which means based on the boost number added the fields are then added to the end of the ranking attributes in that order. The sorting works as expected, I'm not entirely sure boosting does anything, but then again, I do not have a "good" dataset to test with.

    Correct, the field boosting is not properly implemented as evidenced by the failing test in the last commit. I've set up a test with the index to boost the body field then added two entities, one with "keyword" string in name field, another with "keyword" string in the body field.
    When searching without a query the results order should not be altered in any way but it is because the "body:asc" is being added to the ranking rules which is incorrect. Only when searching with a query should the field boosting take into account.

  • I was looking at the documentation of meilisearch. Do I understand this correctly, we should not be adding fields to ranking rules, but to searchable attributes. Since the order of fields in this list also determines the relevancy of results.

    That said this also has an effect on what fields can be used for searching, so I assume we want all fields (string, int, and text) to be searchable, just the boosted fields are first in the list, of course, higher boost bigger priority?

  • Open in Jenkins → Open on Drupal.org →
    Core: 9.5.x + Environment: PHP 7.4 & MySQL 5.7
    last update about 1 year ago
    96 pass
  • Issue was unassigned.
  • Status changed to Needs review about 1 year ago
  • 🇸🇮Slovenia bcizej

    Yes the searchable attributes determines the relevancy of specific fields so I've changed the logic to use that. I have set the searchable fields only for the fulltext types as these are then retrieved from the $query->getKeys() method. The other fields appear in query conditions and are treated as filters.

  • Open in Jenkins → Open on Drupal.org →
    Core: 9.5.x + Environment: PHP 7.4 & MySQL 5.7
    last update about 1 year ago
    97 pass
  • Added a small test for the meilisearch API service, just to be consistent with the rest of the code. But for me, everything passes locally, and looking at the meilisearch web interface it correctly updates the field order when settings change in the admin config. I think this can be merged.

  • Status changed to Fixed about 1 year ago
  • 🇸🇮Slovenia bcizej

    Merged, thanks to everyone.

  • Automatically closed - issue fixed for 2 weeks with no activity.

Production build 0.71.5 2024