Lower costs with Batch API

Created on 17 April 2024

Problem/Motivation

"BatchAPI gives a 50% discount on regular completions and much higher rate limits (250M input tokens enqueued for GPT-4T). Results guaranteed to come back with 24hrs and often much sooner."

For more details, visit the docs: https://platform.openai.com/docs/api-reference/batch

Proposed resolution

If a delay between content generation and retrieval is acceptable, we could provide a feature to enable the Batch API and reduce costs.

To do so, we would need to implement a delayed batch process: queue requests, submit them as a batch, then retrieve the results once the batch completes.
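As a rough illustration only (the module itself is PHP; this sketch uses OpenAI's Python SDK, and the file name requests.jsonl is a placeholder), the Batch API flow the docs describe looks roughly like this:

```python
from openai import OpenAI  # official OpenAI Python SDK

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# 1. Upload a JSONL file where each line is one queued request, e.g.:
# {"custom_id": "node-42", "method": "POST", "url": "/v1/chat/completions",
#  "body": {"model": "gpt-4-turbo", "messages": [{"role": "user", "content": "..."}]}}
batch_file = client.files.create(
    file=open("requests.jsonl", "rb"),
    purpose="batch",
)

# 2. Create the batch job; results are promised within the 24h window.
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)

# 3. Later (e.g. from a cron run), poll the job and fetch results when done.
batch = client.batches.retrieve(batch.id)
if batch.status == "completed":
    results = client.files.content(batch.output_file_id).text
    print(results)  # one JSON result object per line, keyed by custom_id
```

In a Drupal context, steps 1-2 would naturally happen when items are queued and step 3 from a cron-driven queue worker, which is what makes the delay between generation and retrieval acceptable.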

Feature request
Status

Postponed

Version

1.0

Component

OpenAI Embeddings

Created by

🇧🇪Belgium mpp


Comments & Activities

  • Issue created by @mpp
  • Status changed to Postponed 8 months ago
  • 🇬🇧United Kingdom scott_euser

At the moment I believe batch is just for completions and not embeddings. I don't think we can use this yet (at least not for the embeddings sub-module).
