Enable optional RAG prefix for asymmetric models

Created on 17 May 2025

Problem/Motivation

The AI Search Block currently sends the raw user query to the retriever, which works for symmetric embedding models. However, many instruct-tuned or asymmetric models (e.g., multilingual-e5-large-instruct, GTE-QWEN) expect the query to carry a prompt prefix (such as “query: ”) and tend to produce weaker retrieval embeddings without it.

Proposed resolution

  • Add an optional checkbox “Enable retrieval prefix” to the block’s settings form.
  • When checked, reveal a textarea for users to enter their custom prefix template via Drupal’s Form API state system.
  • Modify the retrieval service to prepend the configured prefix to the query string, or to substitute the query into the template when it contains a {query} placeholder.
  • Include usage examples (e.g., prefix = “Instruct: <task description>\nQuery: {query}”) in the field description.
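
The prefix step could be sketched roughly as follows. This is a language-neutral illustration in Python, not the module's actual code: the function name apply_prefix and its parameters are hypothetical, and the handling of a literal "\n" typed into the textarea is an assumption about how the template might be stored.

```python
def apply_prefix(query: str, enabled: bool, template: str) -> str:
    """Return the text that would be sent to the embedding model.

    Hypothetical helper: `enabled` mirrors the proposed checkbox and
    `template` mirrors the proposed textarea value.
    """
    # Feature switched off or nothing configured: send the raw query.
    if not enabled or not template:
        return query

    # Assumption: users may type "\n" as two literal characters in the
    # textarea, so convert it into a real newline.
    template = template.replace("\\n", "\n")

    # Template style: substitute the query into the {query} placeholder.
    # str.replace is used instead of str.format so that braces in the
    # user's query or template cannot raise an error.
    if "{query}" in template:
        return template.replace("{query}", query)

    # Plain prefix style (e.g. "query: "): prepend it to the query.
    return template + query
```

With this sketch, a plain prefix such as “query: ” is prepended, while a template containing {query} has the query substituted in place, so both styles mentioned in this issue are covered by one setting.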

User interface changes

  • New checkbox: enable_prefix under “Retrieval settings”.
  • Conditional textarea: prefix_template appears when enable_prefix is checked.
  • Help text explaining which models need a prefix and example syntax.

Remaining tasks

  1. Implement form elements and #states logic in AI_search_block\Form\SettingsForm.
  2. Update retrieval plugin to inject prefix when enabled.
  3. Create unit tests to verify prefix behavior.
  4. Adjust documentation to cover prefix configuration and best practices.

Feature request
Status

Active

Version

1.0

Component

Code

Created by

🇪🇸Spain Nikro Benalmadena, Malaga
