Convert search query to markdown

Created on 20 May 2025

Problem/Motivation

Searching for reference-style identifiers such as A1_345/2003 or E6_627/2024 does not return the expected content as the top result.
The likely cause is a mismatch between indexing and querying: markdown-sensitive characters (here, the underscore) are escaped during indexing but left unescaped in the query string.

Steps to reproduce

  1. Add content with identifiers like A1_345/2003.
  2. Index the content.
  3. Search using the raw identifier (e.g., A1_345/2003) in Vector DB Explorer.
  4. Observe that the expected content is not the top result.
  5. Escape the query string as markdown would (e.g., A1\_345/2003) and search again.
  6. Note that the expected content is now ranked correctly.

Proposed resolution

The search query should be preprocessed in the same way as indexed content, by escaping markdown characters, before it is embedded or used for querying.
A patch has been provided that escapes the query using the same mechanism as indexing (HTMLtoMarkdown::convert()).
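
The idea can be illustrated with a minimal, language-neutral sketch in Python. This is not the patch itself: the function names (escape_markdown, prepare_for_index, prepare_query) are hypothetical, and the set of escaped characters is an assumption that may differ from what HTMLtoMarkdown::convert() actually escapes. The point is only that both paths go through one shared normalization step.

```python
import re

# Characters treated as markdown-sensitive in this sketch. This set is an
# assumption; the real converter may escape a different set.
_MD_SPECIAL = re.compile(r"([\\`*_\[\]#])")


def escape_markdown(text: str) -> str:
    """Prefix each markdown-sensitive character with a backslash."""
    return _MD_SPECIAL.sub(r"\\\1", text)


def prepare_for_index(text: str) -> str:
    """Hypothetical indexing path: content is escaped before embedding."""
    return escape_markdown(text)


def prepare_query(query: str) -> str:
    """Hypothetical query path: apply the same escaping as indexing."""
    return escape_markdown(query)


if __name__ == "__main__":
    raw = "A1_345/2003"
    # Without the fix, the raw query differs from the indexed form.
    print(raw == prepare_for_index(raw))
    # With the fix, query and indexed text are normalized identically.
    print(prepare_query(raw) == prepare_for_index(raw))
```

Note that this sketch is not idempotent: a query that the user has already escaped by hand (as in step 5) would be escaped a second time.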

🐛 Bug report

Status: Active
Version: 1.0
Component: AI Search
Created by: ayrmax (🇩🇪 Germany)
