Sort by relevancy of matching terms

Created on 6 December 2024, 4 months ago

Setup

  • Solr version: 8
  • Drupal Core version: 10
  • Search API version: 8.x-1.37
  • Search API Solr version: 4.3.7
  • Configured Solr Connector: Solr

Issue

I didn't want to revive the older thread 💬 "How to sort search results by relevance (score)" (Closed: works as designed), but I needed some extra clarification, so I am writing down my findings in this issue.

According to Solr's documentation: "The fq parameter defines a query that can be used to restrict the superset of documents that can be returned, without influencing score." So, while the fq parameter allows the queries to be cached independently of the main query, it has the drawback that it doesn't affect the score. As a result, you have to add a boost to the fulltext fields so that your keywords have a higher relevancy.

If you don't want this behavior and instead want the score to be calculated based on how many terms match, it seems that you somehow have to move the fq parameters into q. I can't find a UI for this change, though, only events to subscribe to programmatically.

💬 Support request
Status

Active

Version

4.3

Component

Code

Created by

🇬🇷Greece vensires


Comments & Activities

  • Issue created by @vensires
  • 🇩🇪Germany mkalkbrenner 🇩🇪

    Did you debug the query?

    I think we create the query as "q" and only conditions as "fq". So I think that it is correct that conditions should not influence the scoring as all documents have to meet the condition.

    In which use case should filter queries influence the scoring?

  • 🇬🇷Greece vensires

    Thanks for the answer. Yes, I did debug the query (attached), and the conditions from facets really do end up in fq.

    What did the job for me was the following:

    
    namespace Drupal\custom\EventSubscriber;
    
    use Drupal\search_api_solr\Event\PostConvertedQueryEvent;
    use Drupal\search_api_solr\Event\SearchApiSolrEvents;
    use Symfony\Component\EventDispatcher\EventSubscriberInterface;
    
    /**
     * Event subscriber for solr queries.
     */
    final class CustomSolrQuerySubscriber implements EventSubscriberInterface {
    
      /**
       * Reacts to the post convert query event.
       *
       * @param \Drupal\search_api_solr\Event\PostConvertedQueryEvent $event
       *   The post converted query event.
       */
      public function solrQueryAlter(PostConvertedQueryEvent $event) {
        $search_api_query = $event->getSearchApiQuery();
        // The "content_relevancy_sorting" tag is set from Views UI.
        if (in_array('content_relevancy_sorting', $search_api_query->getTags(), TRUE)) {
          $sorts = &$search_api_query->getSorts();
          if (isset($sorts['search_api_relevance'])) {
            $solarium_query = $event->getSolariumQuery();
            $filter_queries = $solarium_query->getFilterQueries();
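            // Collect the query strings of the facet filter queries.
            $q = [];
            foreach ($filter_queries as $filter_id => $filter_query) {
              if (!str_starts_with($filter_id, 'filters_')) {
                continue;
              }
              $q[] = $filter_query->getOption('query');
            }
            // Only rewrite "q" when at least one facet filter was found;
            // otherwise the query would end with a dangling " AND ".
            if ($q) {
              $string_query = $solarium_query->getQuery() . ' AND ' . implode(' AND ', $q);
              $solarium_query->setQuery($string_query);
            }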
          }
        }
      }
    
      /**
       * {@inheritdoc}
       */
      public static function getSubscribedEvents(): array {
        return [
          SearchApiSolrEvents::POST_CONVERT_QUERY => 'solrQueryAlter',
        ];
      }
    
    }
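
    For the subscriber above to be called, it also has to be registered as a tagged service. A minimal sketch, assuming the module's machine name is "custom" (as the namespace suggests); the service ID is only a placeholder:

    ```yaml
    # custom.services.yml (file name assumes the module is called "custom")
    services:
      # Hypothetical service ID; any unique name works.
      custom.solr_query_subscriber:
        class: Drupal\custom\EventSubscriber\CustomSolrQuerySubscriber
        tags:
          - { name: event_subscriber }
    ```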
    

    What I basically do in the code above is take the conditions from fq and also append them with "AND" to the "q" parameter. But I wonder whether this should really require extra code instead of a UI setting.

  • 🇩🇪Germany mkalkbrenner 🇩🇪

    I still don't understand the use-case.
    All result items must match any of the filters. How should a result item match more filters than another?

    Could you describe a concrete example?

  • 🇬🇷Greece vensires

    Yes, let me explain to you my exact use case... Intersection!

    Let's say we have two nodes:

    Node #1:
    - Tags: A, B, C, D
    
    Node #2:
    - Tags: C, D, E
    

    A regular user uses facets to filter the results of a view on the index, displaying the nodes matching B or C or D.
    As an admin, I set the view to sort the results by Solr's score. So I expect Solr to return Node #1 first and then Node #2, since #1 matches more of the terms the user has filtered by than #2 does.

    Since both nodes match the B or C or D condition, both are displayed. But since the condition is sent as fq instead of q, the relevancy score is "1.0" for both.
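
    To make the difference concrete, here is a rough sketch of the two request shapes. The field name "tags" is only an assumption for illustration (the real Solr field name generated by Search API will differ), and the actual scores depend on the similarity settings:

    ```
    # Facet selection sent as a filter query: it restricts the result set
    # but contributes nothing to the score.
    q=*:*
    fq=tags:(B OR C OR D)

    # The same condition sent in the main query: every matching term
    # contributes to the score.
    q=tags:(B OR C OR D)
    ```

    With the second form, Node #1 matches three of the terms (B, C, D) and Node #2 only two (C, D), so Node #1 would typically be ranked first.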

  • 🇩🇪Germany mkalkbrenner 🇩🇪

    Thanks for the explanation.

    But this is how it works. Facets are filters and don't influence the scoring.
    Converting them into a query will break a lot of different facets features, especially tagging and excluding:
    https://solr.apache.org/guide/solr/latest/query-guide/faceting.html#tagg...

  • 🇬🇷Greece vensires

    I understand your point. In case I come up with another idea in the future, a more generic approach let's say, I will come back here.

    In the meantime, I am taking the opportunity to change this to "Won't fix", so that anyone coming here in the future doesn't assume it was fixed and go looking for what was changed.
