Add optional re‑ranking step

Created on 17 May 2025, 21 days ago

Problem/Motivation

The AI Search Block currently returns only the top “RAG max results” hits directly from the retriever. Highly relevant documents can rank just below that cutoff, so important results are omitted. An optional re‑ranking pass over a larger candidate set can improve result quality.

Proposed resolution

  • Rename the existing RAG max results label to “RAG final max results” for clarity.
  • Add a checkbox Enable Reranking in the block’s settings form.
  • When enabled (via Form API #states), reveal:
    • Rerank Candidate Count: number of initial hits to fetch (e.g. 50).
    • Rerank Embedding Model: select which embedding model to use for re‑ranking.
    • Reranking Template: text-based template for reranking.
  • Enhance the retrieval pipeline:
    1. Fetch Rerank Candidate Count documents.
    2. Compute embeddings via Rerank Embedding Model and the provided template.
    3. Sort candidates by similarity to the query.
    4. Return the top “RAG final max results” documents (the original RAG max results setting) as the block output.
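
The four pipeline steps above can be sketched roughly as follows. This is a minimal Python illustration, not the module's actual implementation: `search_with_rerank`, `retrieve`, and `embed` are hypothetical names, cosine similarity is assumed as the similarity measure, and the template is assumed to wrap only the query text through a `{text}` placeholder.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_with_rerank(
    query: str,
    retrieve: Callable[[str, int], List[str]],  # hypothetical retriever
    embed: Callable[[str], Vector],             # hypothetical embedding call
    candidate_count: int = 50,                  # "Rerank Candidate Count"
    final_max_results: int = 5,                 # "RAG final max results"
    template: str = "{text}",                   # "Reranking Template" (assumed format)
) -> List[str]:
    # 1. Fetch a larger candidate set than will finally be returned.
    candidates = retrieve(query, candidate_count)

    # 2. Embed the templated query and each candidate with the rerank model.
    query_vec = embed(template.format(text=query))
    scored = [(cosine_similarity(query_vec, embed(doc)), doc) for doc in candidates]

    # 3. Sort candidates by similarity to the query, best first.
    #    The sort is stable, so ties keep the retriever's original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)

    # 4. Return only the final cutoff as the block output.
    return [doc for _, doc in scored[:final_max_results]]
```

With `candidate_count=20` and `final_max_results=5`, this corresponds to the "20 candidates, cut-off of 5" setup. Whether the real template also applies to the candidate documents is left open here.
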
Feature request
Status

Active

Version

1.0

Component

Code

Created by

🇪🇸Spain Nikro Benalmadena, Malaga

Comments & Activities

  • Issue created by @Nikro
  • 🇪🇸Spain Nikro Benalmadena, Malaga

    Okay, now that we have cleared these two issues:
    https://www.drupal.org/project/ai/issues/3525296 (Add optional RAW vectors in RAG results)
    https://www.drupal.org/project/ai_vdb_provider_postgres/issues/3525329 (Implement exposing raw vector)

    --

    While implementing this, I realized that the rerank operation type has not been done yet (https://www.drupal.org/project/ai/issues/3488114, Add support for rerank operation type), so we can't proceed with full (cross-encoder) reranking. For now we can simply reuse the same embedding model for the reranking pass, especially if it is an "instruct" bi-encoder.

    I tested it and it does help: it pulled some results closer. With 20 candidates and a cut-off of 5, the final 5 were usually not identical to what the plain top 5 would have been. The difference is small, but it is there.

    NOTE: this applies specifically to instruct-based embedding models.

  • Pipeline finished with Failed
    18 days ago
    Total: 225s
    #501344
  • Pipeline finished with Success
    18 days ago
    Total: 146s
    #501632
  • Pipeline finished with Failed
    18 days ago
    Total: 157s
    #501665
  • Pipeline finished with Success
    18 days ago
    #501668
  • Pipeline finished with Success
    17 days ago
    Total: 174s
    #502262