How to Chunk Multiple Fields as Main Content in Indexing?

Created on 25 June 2025, about 2 months ago
Updated 12 August 2025, 6 days ago

Problem/Motivation

Hi all, I'm using the AI Search functionality and am currently stuck at the indexing stage. Specifically, I'm trying to figure out how to handle chunking and embedding when my content is spread across multiple fields (e.g., `summary`, `solution`, etc.).

The module only allows one field to be marked as "Main content" for chunking, as described here: “This is the main body content. It is typically longer and needs to be broken into chunks. Queries by end-users are performed on this content. Usually only one field should be main content and a more advanced Embedding Strategy may be needed to support multiple main contents.”

Is there a recommended way to include multiple main content fields in chunking and embedding? Any guidance or clarification would be greatly appreciated!

Steps to reproduce

Proposed resolution

I attempted to use the rendered HTML output as the single field for "Main content", but it includes a lot of irrelevant markup (e.g., links, images), which introduces noise into the vectorization process.
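To illustrate the kind of cleanup that rendered HTML would need before vectorization, here is a minimal sketch in Python using only the standard library. It is not part of the AI Search module (which is PHP), and the names `TextExtractor` and `html_to_text` are hypothetical. It keeps visible text (including link text), drops images and attributes such as `href`, and skips `script` and `style` content.

```python
import html.parser


class TextExtractor(html.parser.HTMLParser):
    """Collects visible text from an HTML fragment (illustrative helper)."""

    # Tags whose text content should never reach the embedding input.
    SKIP = {"script", "style"}
    # Tags that should act as word boundaries in the extracted text.
    BLOCK = {"p", "div", "br", "li", "ul", "ol", "h1", "h2", "h3", "h4", "td", "tr"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag in self.BLOCK:
            self.parts.append(" ")
        # Void tags such as <img> carry no text, so they vanish naturally.

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag in self.BLOCK:
            self.parts.append(" ")

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def html_to_text(markup: str) -> str:
    """Return the visible text of `markup` with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()
    return " ".join("".join(parser.parts).split())
```

Even with this kind of stripping, rendered output can still contain navigation labels or captions, which is one reason merging the raw field values may be the cleaner route.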

I'd like to explore ways to merge the text from multiple fields before chunking, ideally without modifying how the content is displayed on the site.
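As a rough sketch of the idea, the following Python example merges several plain-text fields into one labelled string and then splits it into overlapping word-based chunks. It only illustrates the technique and does not use or reflect the AI Search module's API; `merge_fields`, `chunk_words`, the field names, and the chunk sizes are all assumptions made for the example.

```python
def merge_fields(record: dict, field_names: list) -> str:
    """Join the non-empty fields of `record` into one labelled text block."""
    sections = []
    for name in field_names:
        value = (record.get(name) or "").strip()
        if value:
            # Prefix each section with its field name so every chunk
            # keeps a hint of where the text came from.
            sections.append(f"{name}: {value}")
    return "\n\n".join(sections)


def chunk_words(text: str, max_words: int = 200, overlap: int = 20) -> list:
    """Split `text` into chunks of at most `max_words` words.

    Consecutive chunks share `overlap` words so that a sentence cut at a
    boundary still appears whole in at least one chunk.
    """
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("require max_words > 0 and 0 <= overlap < max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


# Hypothetical record with the fields mentioned in this issue.
record = {"summary": "Printer offline", "solution": "Restart the spooler", "notes": ""}
merged = merge_fields(record, ["summary", "solution", "notes"])
chunks = chunk_words(merged, max_words=4, overlap=1)
```

Because the merge happens only on the text sent to the embedder, the way the fields are displayed on the site is unaffected.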

Remaining tasks

None expected

User interface changes

None expected

API changes

None expected

Data model changes

None expected

💬 Support request
Status

Active

Version

1.1

Component

AI Search

Created by

🇺🇸 United States yijin928@163.com


