Vertex provider uses PredictionServiceClient and slows down response time.

Created on 12 February 2025, 3 months ago

Problem/Motivation

Performance and Authentication Issues:
The current ai_provider_google_vertex module uses the PredictionServiceClient for requests. In testing, this approach has led to long response times.

Steps to reproduce

Install the module and use it for an AI request, such as ai_image_alt_text.

Proposed resolution

Replace the Client Requests:
Instead of using PredictionServiceClient, instantiate a lightweight Guzzle HTTP client with the proper base URI and timeout.

Construct request payloads as associative arrays (instead of Protobuf objects) and send them as JSON.
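
A minimal sketch of the payload step, assuming the public Vertex AI generateContent REST body shape (`contents` / `parts` / `inlineData`). The helper name `build_generate_content_payload()`, the prompt text, and the placeholder image bytes are hypothetical and are not taken from the module:

```php
<?php

declare(strict_types=1);

/**
 * Builds a generateContent request body as a plain associative array.
 *
 * Hypothetical helper for illustration only; not part of the module.
 */
function build_generate_content_payload(string $prompt, ?string $imageBytes = NULL, string $mimeType = 'image/jpeg'): array {
  // Every request carries at least one text part.
  $parts = [['text' => $prompt]];
  if ($imageBytes !== NULL) {
    // Binary data travels base64-encoded inside the JSON body.
    $parts[] = [
      'inlineData' => [
        'mimeType' => $mimeType,
        'data' => base64_encode($imageBytes),
      ],
    ];
  }
  return [
    'contents' => [
      ['role' => 'user', 'parts' => $parts],
    ],
  ];
}

// Placeholder bytes stand in for a real image file.
$payload = build_generate_content_payload('Describe this image for alt text.', 'fake-bytes');
echo json_encode($payload, JSON_PRETTY_PRINT), PHP_EOL;
```

With Guzzle available, an array like this could be passed through the `json` request option, for example `$client->post($endpoint, ['headers' => $headers, 'json' => $payload])`, which serializes it and sets the Content-Type header. The variables `$client`, `$endpoint`, and `$headers` are assumed to be set up elsewhere.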

Remaining tasks

User interface changes

API changes

Data model changes

🐛 Bug report
Status

Active

Version

1.0

Component

Code

Created by

🇺🇸 United States aolivera


Merge Requests

Comments & Activities

  • Issue created by @aolivera
  • 🇺🇸 United States aolivera

    Remove the use statements that are no longer needed:
    use Google\Cloud\AIPlatform\V1\Blob;
    use Google\Cloud\AIPlatform\V1\Client\PredictionServiceClient;
    use Google\Cloud\AIPlatform\V1\Content;
    use Google\Cloud\AIPlatform\V1\GenerateContentRequest;
    use Google\Cloud\AIPlatform\V1\Part;
    use Google\Cloud\AIPlatform\V1\PredictRequest;

    Add the use statements that are now needed:

    use GuzzleHttp\Client;
    use GuzzleHttp\Exception\RequestException;
    use Google\Auth\Credentials\ServiceAccountCredentials;

    Change the client property to use GuzzleHttp instead of the Google Cloud client:
    /**
     * The HTTP client.
     *
     * (Originally this was a PredictionServiceClient; now we use a Guzzle HTTP client.)
     *
     * @var \GuzzleHttp\Client|null
     */
    protected $client;

    Change getClient() from returning a PredictionServiceClient:
    /**
     * Gets the raw chat client.
     *
     * This is the client for inference.
     *
     * @return \Google\Cloud\AIPlatform\V1\Client\PredictionServiceClient
     *   The Google Vertex client.
     */
    public function getClient(): PredictionServiceClient {
      $this->loadClient();
      return $this->client;
    }

    To returning a GuzzleHttp client:

    /**
     * Gets the raw chat client.
     *
     * This is the client for inference.
     *
     * @return \GuzzleHttp\Client
     *   The HTTP client.
     */
    public function getClient(): Client {
      $this->loadClient();
      return $this->client;
    }

    Add the helper methods getAccessToken() and getAuthHeaders():

    /**
     * Retrieves an OAuth2 access token using the service account credentials.
     *
     * @return string
     *   The access token.
     *
     * @throws \Exception
     *   Thrown if unable to retrieve an access token.
     */
    protected function getAccessToken(): string {
      $credentialsJson = $this->loadCredentialFile();
      $credentialsArray = json_decode($credentialsJson, TRUE);
      // Reject anything that does not decode to a non-empty JSON object.
      if (!is_array($credentialsArray) || !$credentialsArray) {
        throw new \Exception('Invalid credentials JSON.');
      }
      // Request a token scoped to the Google Cloud Platform APIs.
      $sc = new ServiceAccountCredentials(
        ['https://www.googleapis.com/auth/cloud-platform'],
        $credentialsArray
      );
      $token = $sc->fetchAuthToken();
      if (!isset($token['access_token'])) {
        throw new \Exception('Unable to retrieve access token.');
      }
      return $token['access_token'];
    }

    /**
     * Returns the authentication headers.
     *
     * @return array
     *   An array containing the Authorization header.
     */
    protected function getAuthHeaders(): array {
      return [
        'Authorization' => 'Bearer ' . $this->getAccessToken(),
      ];
    }

    Create Endpoint
    $endpoint = 'https://us-central1-aiplatform.googleapis.com/v1/' . $url . ':generateContent';
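
The line above hardcodes the us-central1 region and relies on a `$url` value that is not shown here. As a rough sketch only, assuming `$url` holds a publisher model resource path of the form `projects/{project}/locations/{location}/publishers/google/models/{model}`, the endpoint could be derived from its parts. The function name and the project and model values below are hypothetical:

```php
<?php

declare(strict_types=1);

/**
 * Builds a regional generateContent endpoint URL.
 *
 * Hypothetical helper for illustration only; not part of the module.
 */
function build_generate_content_endpoint(string $project, string $location, string $model): string {
  // Resource path of a Google-published model (assumed shape of $url).
  $path = sprintf(
    'projects/%s/locations/%s/publishers/google/models/%s',
    rawurlencode($project),
    rawurlencode($location),
    rawurlencode($model)
  );
  // The host is regional, so the location appears in the hostname as well.
  return sprintf('https://%s-aiplatform.googleapis.com/v1/%s:generateContent', $location, $path);
}

// Example values are placeholders, not real project settings.
$endpoint = build_generate_content_endpoint('my-project', 'us-central1', 'gemini-1.5-flash');
echo $endpoint, PHP_EOL;
```

Deriving the host from the location would keep the hostname and the resource path in agreement if the module is ever configured for a region other than us-central1.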

  • Pipeline finished with Failed
    3 months ago
    Total: 157s
    #430885
  • Pipeline finished with Failed
    3 months ago
    Total: 172s
    #433965
  • Pipeline finished with Failed
    3 months ago
    Total: 177s
    #433979
  • Pipeline finished with Success
    3 months ago
    Total: 157s
    #433988
  • Status changed to Needs work 28 days ago
  • 🇧🇬 Bulgaria valthebald Sofia
  • Pipeline finished with Success
    27 days ago
    Total: 151s
    #476143
  • 🇺🇸 United States aolivera

    Thank you valthebald,

    I have removed the requested code and refactored it to work with the credentials the correct way. Please let me know if you need anything else.

  • 🇧🇬 Bulgaria valthebald Sofia
  • Pipeline finished with Failed
    23 days ago
    Total: 159s
    #478504
  • Pipeline finished with Success
    23 days ago
    Total: 224s
    #478513
  • 🇧🇬 Bulgaria valthebald Sofia

    Adding contribution

  • 🇧🇬 Bulgaria valthebald Sofia
  • 🇧🇬 Bulgaria valthebald Sofia

    @aolivera: thank you, fixed!
    Just a small note for the future: it's better to use a branch named after the issue number (i.e. 3506263-something) instead of 1.0.x on a separate git origin; that makes it easier to tell the branches apart locally.

  • 🇧🇬 Bulgaria valthebald Sofia
  • Automatically closed - issue fixed for 2 weeks with no activity.
