Sorting impossible, numeric fields erroneously indexed as string (patch included)

Created on 6 June 2023, about 1 year ago
Updated 24 August 2023, 10 months ago

Problem/Motivation

I am trying to sort results based on a numeric field. It comes out sorted lexically instead of numerically.

Steps to reproduce

Currently, marking the field as sortable is something that needs to be done manually as well (I'll see whether I can bake this into code as well):
curl -X PUT 'http://localhost:7700/indexes/projects/settings/sortable-attributes' -H 'Authorization: Bearer THEKEY' -H 'Content-Type: application/json' --data-binary '[ "field_project_number" ]'

To force Meilisearch to sort by this sortable field first (this most likely does not make sense in a production environment, just for debugging purposes), I issued
curl -X PUT 'http://localhost:7700/indexes/projects/settings/ranking-rules' -H 'Authorization: Bearer THEKEY' -H 'Content-Type: application/json' --data-binary '[ "sort", "words", "typo", "proximity", "attribute", "exactness" ]'

The index is set up to use the integer type for "field_project_number".

However, when a search view is configured to filter by this integer value, it has no effect. I patched in the necessary parts to add sorting functionality (see below).
Still, the output gets sorted lexically as in

$ xidel -e '$json//field_project_number' --method=post -H 'Authorization: Bearer THEKEY' -H 'Content-Type: application/json' -d '{{ "q": "", "sort": ["field_project_number:asc"] }}' 'http://localhost:7700/indexes/projects/search'
**** Retrieving (post): http://localhost:7700/indexes/projects/search ****
**** Processing: http://localhost:7700/indexes/projects/search ****
10
100
1000
1002
1006
1007
1008
101

This is caused by a forced string type-cast in src/Api/MeiliSearchApiService.php:

294             case 'integer':
295             case 'duration':
296             case 'decimal':
297               if (is_numeric($field_value)) {
298                 $values[] = (string) $field_value;
299               }
300               break;

I cannot imagine any situation where this makes sense (apart from maybe older versions of Meilisearch that did not support anything else but strings?). Any pointers? The respective line of code seems to be by @rokzabukovec.

Proposed resolution

Stop type-casting numeric values to strings.

diff --git a/src/Api/MeiliSearchApiService.php b/src/Api/MeiliSearchApiService.php
index 3e1dd16..f0b670d 100644
--- a/src/Api/MeiliSearchApiService.php
+++ b/src/Api/MeiliSearchApiService.php
@@ -295,7 +295,7 @@ class MeiliSearchApiService implements MeiliSearchApiServiceInterface {
             case 'duration':
             case 'decimal':
               if (is_numeric($field_value)) {
-                $values[] = (string) $field_value;
+                $values[] = $field_value;
               }
               break;

Allow passing of sorting criteria of a view:

--- a/src/Plugin/search_api/backend/SearchApiMeiliSearchBackend.php
+++ b/src/Plugin/search_api/backend/SearchApiMeiliSearchBackend.php
@@ -291,6 +291,7 @@ class SearchApiMeiliSearchBackend extends BackendPluginBase implements PluginFor
     $config = $this->configFactory->getEditable('search_api_meilisearch.settings');
     $results = $query->getResults();
     $options = $query->getOptions();
+    $sorts = $query->getSorts();
     $index = $query->getIndex();

     // If the synonyms processor is disabled we have to reset
@@ -354,6 +355,12 @@ class SearchApiMeiliSearchBackend extends BackendPluginBase implements PluginFor
         $meili_options['limit'] = $all_hits;
       }

+      if (isset($sorts)) {
+          foreach ($sorts as $criteria => $order) {
+              $meili_options['sort'][] = $criteria . ":" . strtolower($order);
+          }
+      }
+
       $data = $this->meiliService->search($index->id(), $keys, $meili_options);
       if (!$data->getHits()) {
         return $results;

Remaining tasks

Sorting only works if the sortable-attributes setting gets set once. Unsure how to do that in code right now, but I'll see what I can do.
Checking whether it is possible with every query and setting it then seems inefficient and will likely break the first query and all following ones until the task is completed.
Probably try to sort, catch a possible error and set the attributes then?
The place this would most likely make the most sense would be in the index config. Add a checkbox "sortable" in the field config and set/unset the sortable-attributes setting accordingly. This would be the only way to get rid of the setting as well.
Remember adding this setting could take a while as Meilisearch has to create an internal index for the respective field.

User interface changes

Probably add some kind of configuration interface for sortable attributes, maybe in the index field config (see above).

🐛 Bug report
Status

Closed: duplicate

Version

1.0

Component

Code

Created by

🇦🇹Austria tgoeg

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Comments & Activities

  • Issue created by @tgoeg
  • 🇦🇹Austria tgoeg

    Alright, this got pretty elaborate but I have a setup that works cleanly from the first search on. And it saves a request to the Meilisearch server for each query compared to the current 1.x implementation.

    There are a lot of comments in the code that would need a review as well. (DRACONES: I am just a sysadmin so consider my coding skills experimental :-) )

    Consider this w.i.p., I'll see whether I can prepare a PR sometime. Might take a while, however. Just wanted to put this up somewhere so others can benefit.

    diff --git a/src/Api/MeiliSearchApiService.php b/src/Api/MeiliSearchApiService.php
    index 3e1dd16..f304eff 100644
    --- a/src/Api/MeiliSearchApiService.php
    +++ b/src/Api/MeiliSearchApiService.php
    @@ -117,10 +117,15 @@ class MeiliSearchApiService implements MeiliSearchApiServiceInterface {
           return [];
         }
     
    -    $index = $this->getIndex($index_name);
    -    if ($index) {
    -      return $index->search($query, $options);
    +    try {
    +      $index = $this->getIndex($index_name);
    +      if ($index) {
    +        return $index->search($query, $options);
    +      }
    +    } catch (\Exception $e) { //TODO: probably catch/handle API vs. timeout exception differently ?
    +      throw new MeiliSearchApiException($e->getMessage(), $e->getCode(), $e);
         }
    +
         return [];
       }
     
    @@ -295,7 +300,7 @@ class MeiliSearchApiService implements MeiliSearchApiServiceInterface {
                 case 'duration':
                 case 'decimal':
                   if (is_numeric($field_value)) {
    -                $values[] = (string) $field_value;
    +                $values[] = $field_value;
                   }
                   break;
     
    @@ -377,6 +382,34 @@ class MeiliSearchApiService implements MeiliSearchApiServiceInterface {
         return FALSE;
       }
     
    +  /**
    +   * {@inheritdoc}
    +   * @throws \Drupal\search_api_meilisearch\Api\MeiliSearchApiException
    +   */
    +  public function setServerSorts(string $index_name, array $sorts) {
    +    try {
    +      $index = $this->getIndex($index_name);
    +      if ($index) {
    +         return $index->updateSortableAttributes($sorts);
    +      }
    +    }
    +    catch (\Exception | NetworkException $e) {
    +      throw new MeiliSearchApiException($e->getMessage(), $e->getCode(), $e);
    +    }
    +  }
    +
    +  /**
    +   * {@inheritdoc}
    +   * @throws \Drupal\search_api_meilisearch\Api\MeiliSearchApiException
    +   */
    +  public function getServerSorts(string $index_name) {
    +    $index = $this->getIndex($index_name);
    +    if ($index) {
    +      return $index->getSortableAttributes();
    +    }
    +    return FALSE;
    +  }
    +
       /**
        * {@inheritdoc}
        */
    diff --git a/src/Api/MeiliSearchApiServiceInterface.php b/src/Api/MeiliSearchApiServiceInterface.php
    index 25daa67..02d23b9 100644
    --- a/src/Api/MeiliSearchApiServiceInterface.php
    +++ b/src/Api/MeiliSearchApiServiceInterface.php
    @@ -216,6 +216,7 @@ interface MeiliSearchApiServiceInterface {
     
       /**
        * Sets the Meilisearch settings.
    +   * TODO: Imprecise. This only updates the filterableAttributes, not all "settings"
        *
        * @param string $index_name
        *   The name of the index.
    @@ -228,6 +229,7 @@ interface MeiliSearchApiServiceInterface {
     
       /**
        * Gets the Meilisearch settings.
    +   * TODO: Imprecise. This only gets the filterableAttributes, not all "settings"
        *
        * @param string $index_name
        *   The name of the index.
    @@ -239,6 +241,31 @@ interface MeiliSearchApiServiceInterface {
        */
       public function getSettings(string $index_name);
     
    +  /**
    +   * Update Meilisearch server's sortableAttributes.
    +   *
    +   * @param string $index_name
    +   *   The name of the index.
    +   * @param array $settings
    +   *   An array of setting for updating.
    +   *
    +   * @see https://www.meilisearch.com/docs/reference/api/settings#sortable-attributes
    +   */
    +  public function setServerSorts(string $index_name, array $sorts);
    +
    +  /**
    +   * Get Meilisearch server's sortableAttributes.
    +   *
    +   * @param string $index_name
    +   *   The name of the index.
    +   *
    +   * @return array
    +   *   An array of settings currently set on the server.
    +   *
    +   * @see https://www.meilisearch.com/docs/reference/api/settings#sortable-attributes
    +   */
    +  public function getServerSorts(string $index_name);
    +
       /**
        * Removes the id mapping from the config.
        *
    diff --git a/src/Plugin/search_api/backend/SearchApiMeiliSearchBackend.php b/src/Plugin/search_api/backend/SearchApiMeiliSearchBackend.php
    index bffcf15..eaf7ce7 100644
    --- a/src/Plugin/search_api/backend/SearchApiMeiliSearchBackend.php
    +++ b/src/Plugin/search_api/backend/SearchApiMeiliSearchBackend.php
    @@ -291,10 +291,13 @@ class SearchApiMeiliSearchBackend extends BackendPluginBase implements PluginFor
         $config = $this->configFactory->getEditable('search_api_meilisearch.settings');
         $results = $query->getResults();
         $options = $query->getOptions();
    +    $sorts = $query->getSorts();
         $index = $query->getIndex();
     
         // If the synonyms processor is disabled we have to reset
         // the synonyms on the server and remove the configuration string.
    +    // TODO:  Every time? Couldn't this be solved as below, only if it is really necessary?
    +    //        Or could a state be saved somewhere so this only needs to be called if the processor is on?
         $processors = $query->getIndex()->getProcessors();
         if (!isset($processors['search_api_meilisearch_synonyms'])) {
           try {
    @@ -313,6 +316,21 @@ class SearchApiMeiliSearchBackend extends BackendPluginBase implements PluginFor
         // set for every document so that you get all the documents and facets
         // information. This is done as a workaround because the Meilisearch
         // API does not support optional query yet.
    +
    +    // TODO:    An empty query string seems to be possible now, if that's what "optional query" means:
    +    //          https://www.meilisearch.com/docs/reference/api/search#placeholder-search
    +
    +    // TODO:    Saving/updating facets and sortableAttributes on the server each time seems unnecessary
    +    //          There is getServerSorts() already, could be used for comparison.
    +    //          Requesting and setting however are two calls, whereas setting although it is already set does not hurt
    +    //          For inexistant sorts, the error received is:
    +    //          {"message":"Attribute `field_myfield` is not sortable. Available sortable attributes are: `field_myfield2`.","code":"invalid_sort","type":"invalid_request","link":"https://docs.meilisearch.com/errors#invalid_sort"}
    +    //          or
    +    //          { code: 400, message: "Attribute `field_myfield` is not sortable. This index does not have configured sortable attributes.", error_code: "invalid_sort", error_type: "invalid_request", error_link: "https://docs.meilisearch.com/errors#invalid_sort" }
    +    //          Therefore, blindly try retrieving and catch the error and only then try to update settings
    +    //          This should work out in every but the very first call with a new attribute and should save one request
    +    //          for each search
    +
         $keys = $query->getOriginalKeys() != NULL ? $query->getOriginalKeys() : '*';
         try {
           $filterableAttributes = $index->get('field_settings');
    @@ -320,13 +338,6 @@ class SearchApiMeiliSearchBackend extends BackendPluginBase implements PluginFor
           $meili_options = [];
     
           if (!empty($facets)) {
    -        // Attributes that could be used as facets are saved in settings
    -        // on the server. The settings update needs to finish before you can
    -        // continue.
    -        $status = $this->meiliService->setSettings($index->id(), $facets);
    -        if (isset($status['taskUid'])) {
    -          $this->meiliService->waitForUpdate($index->id(), $status['taskUid']);
    -        }
     
             $meili_options['facets'] = $facets;
           }
    @@ -338,10 +349,51 @@ class SearchApiMeiliSearchBackend extends BackendPluginBase implements PluginFor
             $meili_options['filter'] = $conditions['facetFilters'];
           }
     
    +      if (!empty($sorts)) {
    +        foreach ($sorts as $criteria => $order) {
    +          $meili_options['sort'][] = $criteria . ":" . strtolower($order);
    +        }
    +      }
    +
           // First, perform a search with a limit of 0 documents to get the nbHits
           // that represents the number of all documents.
    +      // TODO:  Wouldn't setting "hitsPerPage" and therefore getting an exhaustive
    +      //        number of "totalHits" be better than setting off two requests?
    +      //        https://www.meilisearch.com/docs/reference/api/search#number-of-results-per-page
           $meili_options['limit'] = 0;
    -      $first_query = $this->meiliService->search($index->id(), $keys, $meili_options);
    +
    +      try {
    +        $first_query = $this->meiliService->search($index->id(), $keys, $meili_options);
    +      }
    +      catch (MeiliSearchApiException $e) {
    +        $preverror=(array)$e->getPrevious(); //TODO: This is a hack; I don't know how to actually get the 'errorCode' of Meilisearch's returned JSON(?) via a function
    +        // This is the first time ever this search gets called for this index
    +        // We don't care what attributes are actually missing (sortable, filterable)
    +        // Just set all of them, it does not hurt to do so twice and we can be sure
    +        // even the first request cuts through successfully afterwards
    +        // TODO: May there be more attributes that need to be set on server side? Set them as well!
    +
    +        if ($preverror['errorCode'] == 'invalid_sort' || $preverror['errorCode'] == 'invalid_filter') {
    +          // Attributes that could be used as sortable fields are saved in
    +          // settings/sortable-attributes or settings/filterable-attributes
    +          // on the server. The settings update needs to finish before you can
    +          // continue.
    +          $status = $this->meiliService->setServerSorts($index->id(), array_keys($sorts));
    +          if (isset($status['taskUid'])) {
    +            $this->meiliService->waitForUpdate($index->id(), $status['taskUid']);
    +          }
    +          $status = $this->meiliService->setSettings($index->id(), $facets);
    +          if (isset($status['taskUid'])) {
    +            $this->meiliService->waitForUpdate($index->id(), $status['taskUid']);
    +          }
    +          // TODO: What if this one fails b/c of some other exception? Will it be caught?
    +          $first_query = $this->meiliService->search($index->id(), $keys, $meili_options);
    +        } else {
    +          $this->handleExceptions($e);
    +          throw new SearchApiException($e->getMessage(), $e->getCode(), $e);
    +        }
    +      }
    +
           $all_hits = 0;
           if ($est_total_hits = $first_query->getEstimatedTotalHits()) {
             $all_hits = $est_total_hits;
  • Issue was unassigned.
  • Status changed to Closed: duplicate 11 months ago
  • 🇸🇮Slovenia DeaOm

    It seems this is kinda duplicate of this 📌 Sorting and sort boost Fixed , the only difference I see is the numerical sorting. But then again, I'm not sure the older version of meilisearch that this module supports (v.0.28) support anything other than strings. Will close it for now as a duplicate and if that is not the case, can be reopened.

  • 🇦🇹Austria tgoeg

    The related issue cannot solve the sorting, as the type-cast is still in the code.
    I don't consider this a duplicate, then.

    I don't quite understand how finding the work in here can rationally be done once it gets needed (and it will be if anyone wants sorting of decimals to actually work) if the ticket containing it is closed (like others that are indeed implemented when this one is not).

    I did explicitly *not* select the current stable version but 1.0.x-dev, which is definitely not only working with v0.24 (I know for sure because of https://www.drupal.org/project/search_api_meilisearch/issues/3264559 ).

    If this is considered duplicate: Where is the right place to file bugs/fixes/feature requests for the upcoming/more current version using current Meilisearch?

    Don't get me wrong, I am not trying to be pushy (and sorry in advance if it sounds like I'd be)!
    I just fear that hours of debugging/fixing work will get lost. The patch presented above works on a production data set with a few thousand nodes beautifully.

    You also mention the current stable version in the related ticket:
    Is there any good reason to stick to 1.0.0 of this module and v0.24 of Meilisearch? It's actually pretty broken without the patches floating around in various issues/branches, and Meilisearch stabilized their API starting with v0.28 (https://github.com/meilisearch/meilisearch/releases/tag/v0.28.0) so staying with an outdated API and developing new stuff for it seems like a potential loss of dev time for me.

    With currently only 4 sites reporting to use this module (which might very well be all of ours ;-) ) I'd say updating (dev version) to or at least accepting changes for a newer codebase would be the way to go!
    Is there anything preventing this?

  • Status changed to Needs work 11 months ago
  • 🇸🇮Slovenia DeaOm

    Like I said, the issue can be re-opened if you feel like it's not a duplicate, so re-opening it again with the tag needs work (as the code needs to be pushed and probably only the type-cast changed based on the other issue).

    If the ticket is closed it does not disappear, it's still present in the issue queue and can be found.

    I did mention the version as v0.28 which is the official version that module supports for DEV branch, for all other versions the code needs to be tested. I do think however it does work with 1.0. Probably since this is not yet a stable version but a beta version of the module, the goal is to make this stable version first, which means fixing all the current issues that are present and providing tests, then go for an upgrade and test/see what functionalities are breaking. Then this sort by numeric fields comes into play.

    I think the only thing that is "a mistake" for this issue is that it's reported as a bug fix. I think it falls under a feature request (so changed it to that also).

    Don't get me wrong, I'm with you on updating the meilisearch version to support the latest version or at least 1.2, but I think currently nothing new is developed or reported as an issue for this module, so the goal is to clean up the issue queue with the dev branch and v0.28 so it can be released as a stable version and then test with later versions to see which version breaks what. If nothing is breaking, there is nothing preventing the release of a new stable module version with the support of latest meilisearch. But this is on maintainers how they want to proceed.

    I think maybe the best approach here would be to open a new fork which would be intended for the latest version and a new stable release, where all the patches currently not merged or tested/reviewed can be added and manually tested.
    For testing purposes your help would be greatly appreciated as you have a good data set to test functionalities in the "real world" with "real data".

    And FYI, you are not pushy ;)

    I'm not sure if you mentioned anywhere (sorry if you did), but which version of meilisearch are you using on PROD?

  • 🇸🇮Slovenia DeaOm

    Based on the @tgoeg code and findings I tested the sortability done in related issue 📌 Sorting and sort boost Fixed via field that is integer on the supported meilisearch version v0.28 and meili-php v0.24.1 and it's not working as seen from the pic.

    Then I removed the string conversion and re-tested (re-indexed the index) and it's working as expected:

    Pushed the changes to that branch, as the code for sorting is done there a little differently. The only question I have for that issue is, does this sorting attribute need to be added as a setting? So users have the option to either enable or disable it? Meilisearch documentation says if no sort is present, the sort in the ranking attributes is ignored. But does that affect anything?

    I would really like to move this discussion to that issue, as it seems it also applies to the older version/currently supported version of meilisearch, so it's more clear and better organized. But I'll leave that to @tgoeg and leave this to needs work, as the other issue also needs tests.

  • Status changed to Closed: duplicate 10 months ago
  • 🇸🇮Slovenia bcizej

    As pointed out by @DeaOm all the work on sorting including fixing this is done on the related issue so any further issues regarding sorting should be discussed there.

Production build 0.69.0 2024