Fields with no main property cannot be indexed

Created on 22 May 2020, over 4 years ago
Updated 26 August 2024, 4 months ago

I have a complex field that has no main property:

  public static function mainPropertyName() {
    return NULL;
  }

I used hook_search_api_field_type_mapping_alter to map this field to a custom data type I created for it, and I can see that the mapping works. Still, the field cannot be indexed.

I see this in IndexAddFieldsForm.php:

        $can_be_indexed = FALSE;
        $nested_properties = $this->fieldsHelper->getNestedProperties($property);
        $main_property = $property->getMainPropertyName();
        if ($main_property && isset($nested_properties[$main_property])) {
          $parent_child_type = $property->getDataType() . '.';
          $property = $nested_properties[$main_property];
          $parent_child_type .= $property->getDataType();
          unset($nested_properties[$main_property]);
          $can_be_indexed = TRUE;
        }

Even though I can handle the data type, requiring the field to have a main property prevents me from indexing it.

Would it be possible to allow indexing complex fields that have no main property?
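
To make the limitation concrete, here is a minimal standalone sketch that models the check above with plain arrays instead of Drupal's typed data objects. The function name `canBeIndexed()` and the array shapes are hypothetical and not part of Search API; the function only mirrors the condition `$main_property && isset($nested_properties[$main_property])`.

```php
<?php

/**
 * Hypothetical stand-in for the check in IndexAddFieldsForm.
 *
 * @param string|null $mainProperty
 *   What mainPropertyName() returns for the field type.
 * @param array $nestedProperties
 *   Nested property names mapped to their data types.
 */
function canBeIndexed(?string $mainProperty, array $nestedProperties): bool {
  // Same condition as in the form: a main property must exist and be nested.
  return $mainProperty && isset($nestedProperties[$mainProperty]);
}

// A field type with a main property passes the check.
$text = ['value' => 'string', 'format' => 'filter_format'];
var_dump(canBeIndexed('value', $text)); // bool(true)

// A field type whose mainPropertyName() returns NULL never does,
// no matter which data type it has been mapped to.
$complex = ['xmin' => 'integer', 'xmax' => 'integer'];
var_dump(canBeIndexed(NULL, $complex)); // bool(false)
```

In this model the mapped data type never enters the decision, which is why the mapping alone does not make the field indexable.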

Feature request
Status: Needs review
Version: 1.0
Component: Framework
Created by: 🇸🇪Sweden alayham

  • Needs tests

    The change is currently missing an automated test that fails when run with the original code, and succeeds when the bug has been fixed.

Comments & Activities

  • 🇿🇦South Africa rudolfbyker South Africa

    Here is an example from my use case: I have a custom Drupal FieldType with a few different properties, say xmin, xmax, ymin, ymax. None of these are the "main" property. They work together to describe a bounding box. I want to create a custom BBox or RPT data type (See https://solr.apache.org/guide/solr/latest/query-guide/spatial-search.html ). This is how far I got:

    Set up the data type:

    namespace Drupal\vv\Plugin\search_api\data_type;
    
    use Drupal\search_api\Plugin\search_api\data_type\StringDataType;
    
    /**
     * Provides a BBox / RPT data type.
     *
     * @SearchApiDataType(
     *   id = "vv_solr_bbox",
     *   label = @Translation("BBox"),
     *   description = @Translation("A bounding box field. Useful for finding overlapping ranges in 2 dimensions."),
     *   fallback_type = "string",
     *   prefix = "bbox"
     * )
     */
    class BBoxDataType extends StringDataType {
    
      /**
       * {@inheritdoc}
       */
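      public function getValue($value) {
        // @todo I would expect to get the entire field item here. Until this
        //   issue is resolved, Search API does not pass it, because the field
        //   has no main property.
        $xmin = (int) $value->get("xmin")->getValue();
        $xmax = (int) $value->get("xmax")->getValue();
        $ymin = (int) $value->get("ymin")->getValue();
        $ymax = (int) $value->get("ymax")->getValue();
        // Solr's ENVELOPE order is minX, maxX, maxY, minY.
        return "ENVELOPE({$xmin}, {$xmax}, {$ymax}, {$ymin})";
      }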
    
    }
    

    Set up the data type mapping:

    
    namespace Drupal\vv\EventSubscriber;
    
    use Drupal\search_api\Event\MappingFieldTypesEvent;
    use Drupal\search_api\Event\SearchApiEvents;
    use Symfony\Component\EventDispatcher\EventSubscriberInterface;
    
    /**
     * Subscribe to events from the `search_api` module.
     */
    class SearchApiEventSubscriber implements EventSubscriberInterface {
    
      /**
       * {@inheritDoc}
       */
      public static function getSubscribedEvents(): array {
        return [SearchApiEvents::MAPPING_FIELD_TYPES => 'onMappingFieldTypes'];
      }
    
      /**
       * Handle the `search_api.mapping_field_types` event.
       */
      public function onMappingFieldTypes(MappingFieldTypesEvent $event) {
        $mapping = &$event->getFieldTypeMapping();
        $mapping['field_item:my_bbox_field_type'] = 'vv_solr_bbox';
      }
    
    }
    

    Search API config at search_api.index.commentary_comments.yml:

    field_settings:
      my_bbox:
        label: BBox example
        datasource_id: 'entity:node'
        property_path: field_my_bbox
        type: vv_solr_bbox
        dependencies:
          config:
            - field.storage.node.my_bbox_field_type
    

    Solr config in schema_extra_types.xml:

    <fieldType name="bbox" class="solr.SpatialRecursivePrefixTreeFieldType" geo="false" distanceUnits="kilometers" worldBounds="ENVELOPE(0,4800,1,0)" />
    

    Solr config in schema_extra_fields.xml:

    <dynamicField name="bboxs_*" type="bbox" indexed="true" stored="true" multiValued="false" />
    <dynamicField name="bboxm_*" type="bbox" indexed="true" stored="true" multiValued="true" />
    

    But the problem is that, in FieldsHelper::extractFieldValues on line 221, it returns an empty array when there is no main property:

        // Process complex data types.
        if ($definition instanceof ComplexDataDefinitionInterface) {
          $main_property_name = $definition->getMainPropertyName();
          $data_properties = $data->getProperties(TRUE);
          if (isset($data_properties[$main_property_name])) {
            return $this->extractFieldValues($data_properties[$main_property_name]);
          }
          return []; // line 221
        }
    
  • Status changed to Needs review 5 months ago
  • 🇦🇹Austria drunken monkey Vienna, Austria

    Thanks a lot for your input, that was very helpful. I see that there are cases in which no individual (scalar) property can give you the needed information for indexing a field, where you do need complex data to arrive at a single, scalar field value. It would indeed be great if we could support this use case. From what I can see, at least the contract of \Drupal\search_api\DataType\DataTypeInterface::getValue() doesn’t actually forbid us from passing complex values (arrays or objects) as $value – but the assumption was definitely built into all of the implementations, at least of the default types, and might also be present in other places in our code.

    In any case, instead of returning $data directly, what about just returning $data->getValue()? That would seem a bit more benign. It also would enable us to better handle cases where the property contains multiple values – though, on the other hand, it seems like that should already be handled by the $definition->isList() check at the top of the method.
    Also, I guess your approach has the big advantage of carrying type information with it – if you just get an associative array like ['value' => 2, 'options' => 'abc'] it could potentially come from any number of data types, while your Search API data type plugin probably just handles a specific one. (At this point it would be helpful if we’d allow Search API data type plugins to restrict the types of properties they can be used for, but that’s another issue entirely.)

    I created a draft MR with my suggestion, let’s see whether this blows up with our existing tests. In any case, we’ll need further tests to make sure this really enables use cases such as yours, and that it doesn’t cause errors when using the built-in data types with complex properties.
    Please let me know what you think. I’m now sceptical myself regarding the switch from $data to $data->getValue(), so we can also go with your approach if you agree.

  • 🇿🇦South Africa rudolfbyker South Africa

    My knowledge of the search_api code base is VERY limited. I simply stumbled upon line 221 of FieldsHelper::extractFieldValues while stepping through the code with xdebug to where my field values are getting lost. I'm not in a position to make good suggestions on the way forward with the code. I just wanted to show a sensible use case that would be solved by this issue.

  • 🇺🇸United States apmsooner

    @rudolfbyker - It might work to just add another 'computed' property on your field that combines all those other properties into a single string. Then you should be able to index just that single property in a simple way.

  • 🇦🇹Austria drunken monkey Vienna, Austria

    Needs work (NW) for the tests.
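
As a rough illustration of the change discussed in this thread, the following standalone PHP sketch models the complex-data branch of `FieldsHelper::extractFieldValues()` with plain arrays. Everything in it is an assumption made for illustration: `extractValues()` and `bboxToEnvelope()` are hypothetical names, not Search API functions, and the `$fallback` flag stands in for the suggestion of returning `$data->getValue()` when there is no main property.

```php
<?php

/**
 * Models the complex-data branch of extractFieldValues() (hypothetical).
 *
 * @param array $properties
 *   Property values of one field item, keyed by property name.
 * @param string|null $mainProperty
 *   Main property name, or NULL if the field type has none.
 * @param bool $fallback
 *   TRUE to return the whole item when no main property matches.
 */
function extractValues(array $properties, ?string $mainProperty, bool $fallback = FALSE): array {
  if ($mainProperty !== NULL && isset($properties[$mainProperty])) {
    return [$properties[$mainProperty]];
  }
  // Current behaviour: nothing is extracted. Suggested: pass the raw item on.
  return $fallback ? [$properties] : [];
}

/**
 * What a BBox data type's getValue() could do with such an array (hypothetical).
 */
function bboxToEnvelope(array $item): string {
  // Solr's ENVELOPE order is minX, maxX, maxY, minY.
  return sprintf('ENVELOPE(%d, %d, %d, %d)', $item['xmin'], $item['xmax'], $item['ymax'], $item['ymin']);
}

$item = ['xmin' => 10, 'xmax' => 20, 'ymin' => 1, 'ymax' => 5];

// Without the fallback, the bounding box never reaches the data type.
echo count(extractValues($item, NULL)), "\n"; // 0

// With the fallback, the data type receives all four properties at once.
$values = extractValues($item, NULL, TRUE);
echo bboxToEnvelope($values[0]), "\n"; // ENVELOPE(10, 20, 5, 1)
```

The same formatting function could equally back the computed property suggested earlier in the thread, in which case the computed string would become the single value to index.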
