Support new embeddings models and dimensions (3-small, 3-large)

Created on 10 April 2024, 7 months ago
Updated 8 May 2024, 6 months ago

Problem/Motivation

OpenAI has released new embeddings models (`text-embedding-3-small` and `text-embedding-3-large`); see https://openai.com/blog/new-embedding-models-and-api-updates#new-embeddi...

Steps to reproduce

Currently only `text-embedding-ada-002` is supported.

Proposed resolution

Let the site builder choose the embedding model.

Remaining tasks

Update code

User interface changes

Needs a configuration setting for this.

API changes

None

Data model changes

Changing the model requires re-embedding existing content, so existing installs should stay on `text-embedding-ada-002`. Fresh installs get new defaults.
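The fallback rule described above can be sketched as follows. This is an illustrative Python sketch rather than the module's PHP code; the function name, the `model` config key, and the choice of `text-embedding-3-small` as the fresh-install default are all assumptions, since the issue does not say which model the new default should be.

```python
# Hypothetical sketch of the "keep ada-002 for existing installs" rule.
LEGACY_MODEL = "text-embedding-ada-002"
# Assumption: the issue does not name the new default; 3-small is a placeholder.
FRESH_INSTALL_DEFAULT = "text-embedding-3-small"


def resolve_embedding_model(config: dict, is_existing_install: bool) -> str:
    """Return the embedding model to use for this site."""
    # An explicit site-builder choice always wins.
    configured = config.get("model")
    if configured:
        return configured
    # No explicit choice: existing installs keep the legacy model so that
    # stored vectors stay comparable and no re-embedding is forced.
    return LEGACY_MODEL if is_existing_install else FRESH_INSTALL_DEFAULT
```

Under this sketch, an upgraded site with no stored setting keeps producing vectors compatible with what is already indexed, and only an explicit change of model triggers the need to re-embed.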

Feature request
Status

Needs review

Version

1.0

Component

OpenAI Embeddings

Created by

🇬🇧United Kingdom scott_euser

Comments & Activities

  • Issue created by @scott_euser
  • Assigned to scott_euser
  • Issue was unassigned.
  • Status changed to Needs review 6 months ago
  • 🇬🇧United Kingdom scott_euser

    This adds support for any `text-embedding-*` model. It also has to update the OpenAIApi service to accept dimensions as a parameter, because the default number of dimensions for some models is not yet supported by some vector databases (such as Pinecone).

    Each Vector Database plugin can therefore declare which dimensions it supports, and the site builder chooses among them when configuring embeddings.

    The QueueWorker and SearchForm both generated embeddings directly through the OpenAI PHP Client rather than through the OpenAIApi service wrapper from the base module, so they needed rework as well to support dimensions.
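To illustrate the dimensions handling described in this comment, here is a minimal Python sketch (not the module's PHP implementation). It builds the request body for OpenAI's embeddings endpoint and checks the chosen size against a list a vector database plugin might declare. The function name and the `supported_dimensions` argument are hypothetical. The native sizes (1536, 1536, 3072) and the fact that only `text-embedding-3-*` models accept a `dimensions` parameter come from OpenAI's published documentation.

```python
from typing import Optional, Sequence

# Native output sizes as published by OpenAI.
NATIVE_DIMENSIONS = {
    "text-embedding-ada-002": 1536,
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}


def build_embedding_request(
    model: str,
    text: str,
    dimensions: Optional[int] = None,
    supported_dimensions: Optional[Sequence[int]] = None,
) -> dict:
    """Build an embeddings request body (hypothetical helper)."""
    native = NATIVE_DIMENSIONS.get(model)
    # Size the vectors will actually have: the explicit choice, else native.
    effective = dimensions if dimensions is not None else native
    # supported_dimensions stands in for what a vector database plugin
    # declares; models with an unknown native size fail this check unless
    # a dimension is chosen explicitly.
    if supported_dimensions is not None and effective not in supported_dimensions:
        raise ValueError(
            f"{effective} dimensions is not supported by the vector database"
        )
    payload = {"model": model, "input": text}
    if dimensions is not None:
        if model.startswith("text-embedding-3-"):
            # Only the text-embedding-3 family accepts this parameter.
            payload["dimensions"] = dimensions
        elif dimensions != native:
            raise ValueError(f"{model} cannot produce {dimensions} dimensions")
    return payload
```

For example, `build_embedding_request("text-embedding-3-large", "hello", dimensions=1024, supported_dimensions=[1024, 1536])` yields a body that includes `"dimensions": 1024`, while a request for `text-embedding-ada-002` never carries the parameter.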
