Allow to alter collection names

Created on 4 June 2025, 9 days ago

Problem/Motivation

Currently the name of the Typesense collection is always the same as the ID of the TypesenseSchema config entity.

It would be great to introduce an event to allow developers to alter the collection name.

The use case we have is that we have a single Typesense server that is serving all our testing environments. We would like to add a prefix to the collection name, so we can have separate collections like test.myindex, staging.myindex etc.

Proposed resolution

Emit an event that developers can subscribe to if they need to alter the collection name.

An easier but less developer friendly solution would be to make the private method SearchApiTypesenseBackend::getCollectionName() protected, so that developers can extend the backend class and override the method with their custom logic.

API changes

A new event is added.

Data model changes

None.

Feature request
Status

Active

Version

1.0

Component

Code

Created by

🇧🇬Bulgaria pfrenssen Sofia

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

  • Issue created by @pfrenssen
  • 🇧🇬Bulgaria pfrenssen Sofia
  • 🇧🇬Bulgaria pfrenssen Sofia

    In the current implementation the Search API index name is used when the Typesense collection is created. This has caused some confusion in the code base for interacting with the schema:

    Using the index ID
    Sometimes we are retrieving the schema using the index ID, example:

    src/Controller/TypesenseIndexController.php
    62:    $schema = $backend->getSchema($search_api_index->id())?->getSchema();
    

    Using the collection name
    Sometimes we are retrieving the schema using the collection name, example:

    src/Plugin/search_api/backend/SearchApiTypesenseBackend.php
    407:    $schema = $this->getSchema($collection);
    

    Since in 100% of the cases, the collection name is derived from the index, let's standardize this on the index.

  • 🇧🇬Bulgaria pfrenssen Sofia
  • 🇧🇬Bulgaria pfrenssen Sofia
  • Pipeline finished with Failed
    9 days ago
    Total: 146s
    #514389
  • Pipeline finished with Failed
    9 days ago
    Total: 158s
    #514418
  • Pipeline finished with Success
    9 days ago
    Total: 148s
    #514445
  • 🇧🇬Bulgaria pfrenssen Sofia

    Many methods in TypesenseClientInterface are taking a collection name as an argument, for example:

      /**
       * Removes a Typesense index.
       *
       * @param string|null $collection_name
       *   The name of the index to remove.
       */
      public function dropCollection(?string $collection_name): void;
    

    In the current code base, since the collection name was always equal to the index ID, we were often passing in the index ID in lieu of the collection name, here is an example from SearchApiTypesenseBackend::removeIndex()

    $this->getTypesenseClient()->dropCollection($index->id());
    

    We should change this everywhere to use the collection name instead, so we properly decouple the collections from indexes. It will become:

    $collection_name = $this->getCollectionName($index);
    $this->getTypesenseClient()->dropCollection($collection_name);
    

    In order to facilitate this, we should make SearchApiTypesenseBackend::getCollectionName() public so we can easily get the collection name for a given index.

  • Pipeline finished with Failed
    9 days ago
    Total: 165s
    #514478
  • 🇧🇬Bulgaria pfrenssen Sofia

    Next up, we can improve and simplify the ::getCollectionName() method. It currently works like this:

      public function getCollectionName(?IndexInterface $index): ?string {
        if ($index == NULL) {
          return NULL;
        }
    
        try {
          /** @var \Drupal\search_api_typesense\Entity\TypesenseSchemaInterface $typesense_schema */
          $typesense_schema = $this->entityTypeManager
            ->getStorage('typesense_schema')
            ->load($index->id());
    
          if ($typesense_schema == NULL) {
            return NULL;
          }
    
          return $typesense_schema->id();
        }
        catch (InvalidPluginDefinitionException | PluginNotFoundException $e) {
          return NULL;
        }
      }
    

    So basically the logic is:

    1. Get the index ID.
    2. Load the TypesenseSchema entity with the same ID.
    3. If it exists, return the ID of the entity we just loaded.

    But since both the index and the schema entities have exactly the same ID, this whole logic can be reduced to:

      public function getCollectionName(?IndexInterface $index): ?string {
        return $index?->id();
      }
    

    I also think it doesn't make any sense to allow passing in NULL for $index, and we should also not allow NULL to be returned. So this can become:

      public function getCollectionName(IndexInterface $index): string {
        return $index->id();
      }
    
  • 🇧🇬Bulgaria pfrenssen Sofia
  • Pipeline finished with Success
    9 days ago
    Total: 262s
    #514532
  • 🇧🇬Bulgaria pfrenssen Sofia

    When creating a collection, the name of the collection is part of the schema that is passed to the Typesense server, so we need to make sure the altered collection name is set on the schema.

  • Pipeline finished with Failed
    9 days ago
    Total: 146s
    #514543
  • Pipeline finished with Failed
    9 days ago
    Total: 180s
    #514549
  • Pipeline finished with Success
    9 days ago
    Total: 150s
    #514552
  • 🇧🇬Bulgaria pfrenssen Sofia

    And finally we can introduce an event that can be used to alter the collection name.

  • 🇧🇬Bulgaria pfrenssen Sofia
  • 🇧🇬Bulgaria pfrenssen Sofia

    Thanks for the review! I will address the remarks,

  • Pipeline finished with Success
    about 19 hours ago
    Total: 233s
    #520651
Production build 0.71.5 2024