Implement Advanced Input Mode with Token Chunking for Text Automator

Created on 22 November 2024

Problem/Motivation

The current Text Automator in the AI module handles many-to-many relationships well, using prompts such as “Provide output row per each input row.” However, it struggles with long input texts because of the token (context-window) limits of Large Language Models (LLMs).

Proposed resolution

Introduce an “Advanced Mode (Token, Chunked)” option under the “Automator Input Mode” settings. This mode will:

  1. Allow users to define the number of tokens per chunk, enabling customization based on specific LLM constraints and content requirements.
  2. Divide lengthy input texts into manageable chunks according to the user-defined token count (a minimal chunking sketch follows this list).
  3. Sequentially process each chunk through the LLM with instructions to iteratively refine the output as more data is provided.
  4. Aggregate the refined outputs to form a cohesive final result.
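For illustration, here is a minimal PHP sketch of the chunking step. It assumes one whitespace-delimited word approximates one token; a real implementation should use the LLM provider's tokenizer, and chunk_text is a hypothetical helper name, not an existing module function:

    <?php

    /**
     * Splits input text into chunks of roughly $tokensPerChunk tokens.
     *
     * Approximates one token as one whitespace-delimited word; a real
     * implementation would use the LLM provider's tokenizer instead.
     */
    function chunk_text(string $text, int $tokensPerChunk): array {
      $words = preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
      $chunks = [];
      foreach (array_chunk($words, $tokensPerChunk) as $group) {
        $chunks[] = implode(' ', $group);
      }
      return $chunks;
    }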

Remaining tasks

  • Develop the chunking mechanism to split input texts based on user-defined token counts.
  • Implement logic to manage iterative refinement across chunks (see the sketch below).
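A sketch of what the refinement loop could look like. The $callLlm callable stands in for whatever client the AI module exposes; its signature and the prompt wording here are assumptions for illustration:

    <?php

    /**
     * Iteratively refines a draft output across chunks.
     *
     * $callLlm is a placeholder for the module's LLM client; each call
     * is expected to return an updated draft incorporating the chunk.
     */
    function refine_across_chunks(array $chunks, callable $callLlm, string $instruction): string {
      $draft = '';
      $total = count($chunks);
      foreach ($chunks as $i => $chunk) {
        $prompt = sprintf(
          "%s\n\nCurrent draft (may be empty):\n%s\n\nChunk %d of %d:\n%s\n\nRefine the draft using this chunk.",
          $instruction,
          $draft,
          $i + 1,
          $total,
          $chunk
        );
        // The updated draft becomes the context for the next chunk.
        $draft = $callLlm($prompt);
      }
      return $draft;
    }

Processing chunks sequentially keeps each request within the token limit while letting later chunks build on what earlier chunks produced.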

User interface changes

  • Add the “Advanced Mode (Token, Chunked)” option to the “Automator Input Mode” dropdown menu within the Text Automator settings.
  • Include a field for users to specify the desired number of tokens per chunk (a possible Form API sketch follows).
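One possible Form API sketch of these UI changes. The hook name, form keys, option machine name, and default value are all assumptions; the module's actual settings form may be structured differently:

    <?php

    use Drupal\Core\Form\FormStateInterface;

    /**
     * Hypothetical settings-form alterations; key names are assumptions.
     */
    function mymodule_form_text_automator_settings_alter(array &$form, FormStateInterface $form_state): void {
      // Add the new mode to the existing input-mode dropdown.
      $form['automator_input_mode']['#options']['advanced_token_chunked'] =
        t('Advanced Mode (Token, Chunked)');

      // Token count per chunk, shown only when the advanced mode is selected.
      $form['tokens_per_chunk'] = [
        '#type' => 'number',
        '#title' => t('Tokens per chunk'),
        '#description' => t('Maximum number of tokens sent to the LLM per chunk.'),
        '#min' => 1,
        '#default_value' => 1000,
        '#states' => [
          'visible' => [
            ':input[name="automator_input_mode"]' => ['value' => 'advanced_token_chunked'],
          ],
        ],
      ];
    }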
📌 Task

Status: Needs work
Version: 1.0
Component: AI Automators
Created by: seogow (United Kingdom)
