"can not use FieldCache on multivalued field" error when using boost recent dates processor

Created on 12 December 2022, almost 2 years ago
Updated 16 June 2023, over 1 year ago

Setup

  • Solr version: 7.7.3
  • Drupal Core version: 9.4.8
  • Search API version: 8.x-1.26
  • Search API Solr version: 4.2.9
  • Configured Solr Connector: Standard / Acquia Search

Issue

Since upgrading from Search API Solr 4.2.8 to 4.2.9 we have an error on search related to boost configuration.

The error from Solr is this:

can not use FieldCache on multivalued field: dm_field_sort_date

We use Acquia Solr but I am also able to replicate this locally on a Lando setup.

In the search index processors, we have "Boost more recent dates" enabled, and the error goes away when I disable this.

I've regenerated the solr configuration after updating to 4.2.9.

So it seems that something has changed between 4.2.8 and 4.2.9 that is breaking the boost more recent dates.

Any suggestions for how to debug this?

I'm thinking it might be some solr config that needs tweaking, so looking there initially, but any thoughts from people who know more would be much appreciated.

Thank you!

🐛 Bug report
Status

Fixed

Version

4.0

Component

Code

Created by

🇬🇧United Kingdom Finn Lewis

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • 🇬🇧United Kingdom cbrody

    This resolves the issue we were having with the boost processor – can we get the MR approved?

  • 🇩🇪Germany mkalkbrenner 🇩🇪

    The minimum required schema version currently is 4.2.8, not 4.2.0.

  • Status changed to Needs work almost 2 years ago
  • 🇩🇪Germany mkalkbrenner 🇩🇪

    Regardless if the patch avoids that error or not, you have an issue in your entity model.
    It makes no sense to boost on most recent dates if you randomly pick one out of multiple.

    For example, if you store 1980 and 2023 in these fields, the entity will be shown at the end of the results if 1980 is select or on top of the list when 2023 is selected.

    You need to implement that selection by yourself.
    You can define an additional sine valued filed on the entity and store the most recent date in it on entity save.

    Or you can. Add a dummy field to the index and do that calculation in a Search API Solr pre index event.

    Or you can implement a Search API processor to do it.

    But I know that we have that stupid hack for sort fields to compatible to the database backend, just to avoid errors in views regardless that the result might be wrong.
    But the database backend can't boost on dates. So it is a question if we really should repeat the same erroneous approach here or if the exception is correct to inform you about issues in your data model.

    At least the patch should be extended to log an error if someone tries to boost on a multi valued field.

  • 🇬🇧United Kingdom Finn Lewis

    Thanks for your feedback @mkalkbrenner.

    As to the data model, I will let @cbrody comment on that as he is more familiar with the data model and how and why it is how it is.

    Even so, the data model did not change but changes in the module between 4.2.8 and 4.2.9 did break things, so I do think it would be good to get the patch committed if possible to maintain the behaviour that we were relying on, even if less than ideal.

    I will look at adding a log message with something appropriate.

  • 🇬🇧United Kingdom cbrody

    Thanks for the input @mkalkbrenner. In our case the date boost works by using search_api's aggregated field mechanism to select the last encountered value of either created or field_date_published (a field that is present only on certain specific entity types, so if present the aggregate field takes its value). So the resulting dm_field_sort_date is a single-valued field, and the data model is sound for our purposes. The underlying issue seems to be that fields are given a dm_ prefix, regardless of whether they are actually multi-valued or not. A solution might be for the aggregated field tool to name and type fields according to their cardinality.

  • 🇩🇪Germany mkalkbrenner 🇩🇪

    Thanks @cbrody for that explanation. So your data model seems to be fine.

    But the bug fix should be something else. We have a detection for that cardinality. Maybe it doesn't work correctly for that use-case and needs to be fixed.

    But could you please verify the 'Fallback to multiValued field types' setting in the advanced options of the Search API Server Backend config?
    Maybe it is turned on in your installation.

  • 🇬🇧United Kingdom cbrody

    We've tested with 'Fallback to multiValued field types' enabled and disabled and the error still appears with the unpatched module.

  • 🇩🇪Germany mkalkbrenner 🇩🇪

    Thanks for checking this.

    So the error must be here:

    if ($field->getDataDefinition()->isList() || $this->isHierarchicalField($field)) {
                      $pref .= 'm';
                    }
                    else {
                      try {
                        // Returns the correct list of field definitions including
                        // processor-added properties.
                        $index_properties = $index->getPropertyDefinitions($field->getDatasourceId());
                        $pref .= $this->getPropertyPathCardinality($field->getPropertyPath(), $index_properties) != 1 ? 'm' : 's';
                      }
    

    Either that aggregated field is erroneously declared as list or as hierarchical field or getPropertyPathCardinality() returns a wrong result.
    @Finn Lewis could you debug that?

  • 🇬🇧United Kingdom Finn Lewis

    It looks like aggregated fields are always declared as a list:

    https://git.drupalcode.org/project/search_api/-/blob/8.x-1.28/src/Plugin...

      public function getPropertyDefinitions(DatasourceInterface $datasource = NULL) {
        $properties = [];
    
        if (!$datasource) {
          $definition = [
            'label' => $this->t('Aggregated field'),
            'description' => $this->t('An aggregation of multiple other fields.'),
            'type' => 'string',
            'processor_id' => $this->getPluginId(),
            // Most aggregation types are single-valued, but "Union" isn't, and we
            // can't know which will be picked, so err on the side of caution here.
            'is_list' => TRUE,
          ];
          $properties['aggregated_field'] = new AggregatedFieldProperty($definition);
        }
    
        return $properties;
      }
    
    

    Can we check if the aggregated field is single or multivalued rather than just relying on the is_list property?

    What happens if we just remove the check for $field->getDataDefinition()->isList() to fall back to the getPropertyPathCardinality method?
    ?

    
                  else {
                    if ($this->isHierarchicalField($field)) {
                      $pref .= 'm';
                    }
                    else {
                      try {
                        // Returns the correct list of field definitions including
                        // processor-added properties.
                        $index_properties = $index->getPropertyDefinitions($field->getDatasourceId());
                        $pref .= $this->getPropertyPathCardinality($field->getPropertyPath(), $index_properties) != 1 ? 'm' : 's';
                      }
                      catch (SearchApiException $e) {
                        // Thrown by $field->getDatasource(). As all conditions for
                        // multiple values are not met, it seems to be a single
                        // value field. Note: If the assumption is wrong, Solr will
                        // throw exceptions when indexing this field. In this case
                        // you should add an explicit 'isList' => TRUE to your
                        // property or data definition! Or activate
                        // fallback_multiple in the advanced server settings.
                        $pref .= empty($this->configuration['fallback_multiple']) ? 's' : 'm';
                      }
                    }
                  }
                }
    
  • 🇩🇪Germany mkalkbrenner 🇩🇪

    Can we check if the aggregated field is single or multivalued rather than just relying on the is_list property?

    The perfect solution would be to fix that in Search API, for example by providing a dedicated aggregated field for union.
    In Search API Solr we can't skip the check for lists because it is also required for other fields.
    So what we can do in Search API Solr is to add a check before we check for lists. If it is an aggregated field we could read its config and check for single value aggregations.

  • 🇬🇧United Kingdom Finn Lewis

    The perfect solution would be to fix that in Search API, for example by providing a dedicated aggregated field for union.

    Thanks @mkalkbrenner - started an issue in search_api to explore that: https://www.drupal.org/project/search_api/issues/3340305 🐛 Aggregated fields are always declared as multivalued / list causing problems with boosting in search_api_solr Fixed

    Meanwhile, I will take a look at adding a check before to delve into the aggregated field and see if we can check for single value aggregations.

  • 🇳🇱Netherlands Gertlor

    This patch works in combination with https://www.drupal.org/project/search_api/issues/3340305#comment-14940876 🐛 Aggregated fields are always declared as multivalued / list causing problems with boosting in search_api_solr Fixed to ensure the field prefix for aggregated fields works correctly.

  • 🇬🇧United Kingdom Finn Lewis

    That looks great @Gertlor!! I will test.

  • 🇬🇧United Kingdom Finn Lewis

    The patch in #17 works for me, in combination with the patch for search_api in comment #4 of https://www.drupal.org/project/search_api/issues/3340305#comment-14940876 🐛 Aggregated fields are always declared as multivalued / list causing problems with boosting in search_api_solr Fixed

    Thank you @Gertlor !!!

    Assuming this approach is sound, what's the best way to get this committed and released @mkalkbrenner ?

  • Status changed to RTBC over 1 year ago
  • Status changed to Needs work over 1 year ago
  • 🇩🇪Germany mkalkbrenner 🇩🇪

    The patch looks good. But ideally we can extend the existing test for field names.

    That change will require to deploy a new config-set to Solr and to re-index. Therefore the required minimum schema version needs to be raised.

  • Status changed to RTBC over 1 year ago
  • 🇩🇪Germany mkalkbrenner 🇩🇪

    Search API 1.29 contains the required counterpart. I'll raise the minimum version requirement for Search API and the schema version for the next release and include this patch.

  • Status changed to Fixed over 1 year ago
  • Automatically closed - issue fixed for 2 weeks with no activity.

Production build 0.71.5 2024