Vertex provider uses PredictionServiceClient and slows down response time.

Created on 12 February 2025, 2 months ago

Problem/Motivation

Performance and Authentication Issues:
The current ai_provider_google_vertex module uses the PredictionServiceClient for requests. In testing, this approach has led to long response times.

Steps to reproduce

Install the module and use it for an AI request, such as one made by ai_image_alt_text.

Proposed resolution

Replace the Client Requests:
Instead of using PredictionServiceClient, instantiate a lightweight Guzzle HTTP client with the proper base URI and timeout.

Construct request payloads as associative arrays (instead of Protobuf objects) and send them as JSON.
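
As a rough illustration of the second point, the sketch below builds a generateContent request body as a plain associative array and encodes it as JSON. The helper name build_generate_content_payload() is hypothetical and not part of the module. The field names (contents, role, parts, text, inlineData, mimeType, data) are an assumption based on the public Vertex AI REST format and should be checked against the API version in use.

```php
<?php

declare(strict_types=1);

/**
 * Builds a generateContent request body as a plain associative array.
 *
 * Hypothetical helper for illustration only; the field names are assumed
 * from the public Vertex AI REST format.
 */
function build_generate_content_payload(string $prompt, ?string $imageBytes = NULL, string $mimeType = 'image/png'): array {
  // Every request carries at least one text part.
  $parts = [['text' => $prompt]];
  if ($imageBytes !== NULL) {
    // Binary data has to be base64-encoded to travel inside JSON.
    $parts[] = [
      'inlineData' => [
        'mimeType' => $mimeType,
        'data' => base64_encode($imageBytes),
      ],
    ];
  }
  return [
    'contents' => [
      ['role' => 'user', 'parts' => $parts],
    ],
  ];
}

// Placeholder bytes stand in for a real image file.
$payload = build_generate_content_payload('Describe this image.', 'fake-bytes');
echo json_encode($payload, JSON_PRETTY_PRINT | JSON_THROW_ON_ERROR), PHP_EOL;
```

With Guzzle, an array like this can be passed through the json request option, which encodes it and sets the Content-Type header to application/json, so no Protobuf objects are involved.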

Remaining tasks

User interface changes

API changes

Data model changes

🐛 Bug report
Status

Active

Version

1.0

Component

Code

Created by

🇺🇸 United States aolivera


Merge Requests

Comments & Activities

  • Issue created by @aolivera
  • 🇺🇸 United States aolivera

    Remove the use statements that are no longer needed:
    use Google\Cloud\AIPlatform\V1\Blob;
    use Google\Cloud\AIPlatform\V1\Client\PredictionServiceClient;
    use Google\Cloud\AIPlatform\V1\Content;
    use Google\Cloud\AIPlatform\V1\GenerateContentRequest;
    use Google\Cloud\AIPlatform\V1\Part;
    use Google\Cloud\AIPlatform\V1\PredictRequest;

    Add the use statements that are now needed:

    use GuzzleHttp\Client;
    use GuzzleHttp\Exception\RequestException;
    use Google\Auth\Credentials\ServiceAccountCredentials;

    Change the client property to hold a GuzzleHttp client instead of the Google Cloud client:
    /**
     * The HTTP client.
     *
     * (Originally this was a PredictionServiceClient; now we use a Guzzle HTTP client.)
     *
     * @var \GuzzleHttp\Client|null
     */
    protected $client;

    Change getClient() from returning a PredictionServiceClient:
    /**
     * Gets the raw chat client.
     *
     * This is the client for inference.
     *
     * @return \Google\Cloud\AIPlatform\V1\Client\PredictionServiceClient
     *   The Google Vertex client.
     */
    public function getClient(): PredictionServiceClient {
      $this->loadClient();
      return $this->client;
    }

    To returning a GuzzleHttp client:

    /**
     * Gets the raw chat client.
     *
     * This is the client for inference.
     *
     * @return \GuzzleHttp\Client
     *   The HTTP client.
     */
    public function getClient(): Client {
      $this->loadClient();
      return $this->client;
    }

    Add the helper methods getAccessToken() and getAuthHeaders():

    /**
     * Retrieves an OAuth2 access token using the service account credentials.
     *
     * @return string
     *   The access token.
     *
     * @throws \Exception
     *   Thrown if unable to retrieve an access token.
     */
    protected function getAccessToken(): string {
      $credentialsJson = $this->loadCredentialFile();
      $credentialsArray = json_decode($credentialsJson, true);
      if (!$credentialsArray) {
        throw new \Exception('Invalid credentials JSON.');
      }
      $sc = new ServiceAccountCredentials(
        ['https://www.googleapis.com/auth/cloud-platform'],
        $credentialsArray
      );
      $token = $sc->fetchAuthToken();
      if (!isset($token['access_token'])) {
        throw new \Exception('Unable to retrieve access token.');
      }
      return $token['access_token'];
    }

    /**
     * Returns the authentication headers.
     *
     * @return array
     *   An array containing the Authorization header.
     */
    protected function getAuthHeaders(): array {
      return [
        'Authorization' => 'Bearer ' . $this->getAccessToken(),
      ];
    }

    Create Endpoint
    $endpoint = 'https://us-central1-aiplatform.googleapis.com/v1/' . $url . ':generateContent';
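
The comment above does not show how $url is built. The sketch below is one possible way to assemble it, assuming $url holds the resource path of a Google publisher model (projects/.../locations/.../publishers/google/models/...). The function name build_vertex_endpoint(), the project ID, and the model name are placeholders, not values taken from the module.

```php
<?php

declare(strict_types=1);

/**
 * Builds a regional generateContent endpoint URL.
 *
 * Hypothetical helper; the resource path layout is an assumption based on
 * the public Vertex AI REST documentation.
 */
function build_vertex_endpoint(string $project, string $location, string $model): string {
  // Resource path of the model, with each segment URL-encoded.
  $url = sprintf(
    'projects/%s/locations/%s/publishers/google/models/%s',
    rawurlencode($project),
    rawurlencode($location),
    rawurlencode($model)
  );
  // The host is regional, so the location appears in both host and path.
  return 'https://' . $location . '-aiplatform.googleapis.com/v1/' . $url . ':generateContent';
}

// Placeholder project and model names.
$endpoint = build_vertex_endpoint('my-project', 'us-central1', 'gemini-1.5-flash');
echo $endpoint, PHP_EOL;
```

Deriving the host from the configured location, rather than hardcoding us-central1, would keep the endpoint usable for projects deployed in other regions.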

  • Pipeline finished with Failed
    about 2 months ago
    Total: 157s
    #430885
  • Pipeline finished with Failed
    about 2 months ago
    Total: 172s
    #433965
  • Pipeline finished with Failed
    about 2 months ago
    Total: 177s
    #433979
  • Pipeline finished with Success
    about 2 months ago
    Total: 157s
    #433988
  • Status changed to Needs work about 6 hours ago
  • 🇧🇬 Bulgaria valthebald Sofia
  • Pipeline finished with Success
    about 1 hour ago
    #476143