πŸ‡ΊπŸ‡ΈUnited States @d0t101101

Account created on 26 May 2007, about 17 years ago

Recent comments

πŸ‡ΊπŸ‡ΈUnited States d0t101101

@kevinquillen:

I've been using ChatGPT Plus for a few months, but it seems that for the GPT-4 API calls to work, the model must also be visible under your OpenAI account's 'Playground' too. And for me, it still isn't! Chat.openai.com shows a Plus paid plan. So, as mentioned, I applied for their waiting list here and will follow up with their support directly to expedite things if possible...

https://openai.com/waitlist/gpt-4-api

On the development front, after upgrading openai-php/client to the latest v0.4.1 via composer, restarting Apache, and clearing all Drupal caches, the main blocker I've been fighting with is getting my code to successfully call the new createStreamed() method in the vendor lib instead of just create() here:

https://git.drupalcode.org/project/openai/-/blob/1.0.x/modules/openai_ch...

Changing this line to createStreamed() results in this error getting logged:

Uncaught PHP Exception Error: "Call to undefined method OpenAI\\Resources\\Chat::createStreamed()"

Thinking this was related to the OpenAI Drupal module's factory pattern, I've tried to work through this by adding a similar public function createStreamed() method here, but that doesn't seem to have an impact:

https://git.drupalcode.org/project/openai/-/blob/1.0.x/src/Http/ClientFa...
https://git.drupalcode.org/project/openai/-/blob/1.0.x/modules/openai_ch...

Any ideas or suggestions as to where I might need to define createStreamed() for this to work in the Drupal OpenAI module's ChatGptForm.php? After that's addressed, this thread has lots of juicy details to proceed further:

https://github.com/openai-php/client/pull/84

In the above, my guess is that I'm probably missing something fundamental around the use of $instance, $container and/or Symfony's dependency injection! Forgive my ignorance, sir. :)
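Separately, for anyone debugging the streaming side: independent of the Drupal wiring, the API's streamed responses are Server-Sent Events, where each `data:` line carries a JSON chunk with an incremental delta, terminated by a `[DONE]` sentinel. A minimal, self-contained sketch of reassembling that payload (extractStreamedText() is a hypothetical helper, not part of the module or the client library, which handles this for you):

```php
<?php

// Illustrative only: a minimal parser for the Server-Sent Events (SSE)
// payload that OpenAI's streaming chat endpoint emits. The real
// openai-php/client wraps this once createStreamed() is available;
// this just shows what each "data:" line carries.
function extractStreamedText(string $sseBody): string
{
    $text = '';
    foreach (explode("\n", $sseBody) as $line) {
        if (!str_starts_with($line, 'data: ')) {
            continue;
        }
        $payload = substr($line, strlen('data: '));
        // The stream is terminated by a literal "[DONE]" sentinel.
        if (trim($payload) === '[DONE]') {
            break;
        }
        $chunk = json_decode($payload, true);
        // Each chunk carries an incremental "delta" with a content fragment.
        $text .= $chunk['choices'][0]['delta']['content'] ?? '';
    }
    return $text;
}
```

Seeing the raw chunks this way helped me sanity-check what the vendor lib should be yielding once the method resolves.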

πŸ‡ΊπŸ‡ΈUnited States d0t101101

@mindaugasd - thanks for the input and code examples here! I've been incredibly busy lately and am stoked to get this all wrapped up ASAP. Looks like the underlying openai-php/client lib now has some solid examples of how to handle streamed Chat responses too, which could vastly improve the UX:

https://github.com/openai-php/client/tree/v0.4.1#created-streamed

I'm on the 'waiting list' but am still awaiting actual access to the OpenAI GPT-4 API, so I can further develop and debug that as well.

πŸ‡ΊπŸ‡ΈUnited States d0t101101

@kevinquillen - +1 - this looks awesome!!! :)

πŸ‡ΊπŸ‡ΈUnited States d0t101101

@mindaugasd - Agreed! 90% sure the native chat.openai.com UI is using the Highlight.js lib, which is what I implemented here :) Automatic language detection is pretty darn good on its own, assuming it's contained, formatted, and escaped correctly within the DOM.

Noticed that more recent versions of the chat.openai.com UI also let code blocks span multiple replies: if the code is too long for the initial/single response length, it can continue in a new code block in a subsequent response. Sometimes prompting it with 'continue this code block' helps keep it going and properly contained as well. This can break down, though, as markup/code may get truncated along the way, at which point it reverts to spitting out raw code in the response.

Very much looking forward to getting streaming responses from the underlying PHP library/API, so we can better handle this as it's returned and displayed for each request.

Food for thought!

πŸ‡ΊπŸ‡ΈUnited States d0t101101

@Ressa & @mindaugasd - there are some TODOs in the code around the $message variable storage and handling. The way the underlying OpenAI PHP library works is that it passes the previous messages back up with each request to keep the chat context. Without controls in place to truncate the older messages, the history will inevitably grow too large, resulting in AJAX errors related to using too many tokens in the request. Once that limit is hit, you currently have to reload the page. The good news is I have this mostly fixed and will re-roll a patch when time permits!
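For illustration, the truncation fix could look roughly like this. truncateHistory() is a hypothetical helper, and the 4-characters-per-token ratio is only a rough heuristic (a real token counter would be more accurate):

```php
<?php

// A sketch of how the stored message history could be trimmed before each
// request so the combined prompt stays under the model's token limit.
// The 4-characters-per-token estimate is a rough heuristic, not exact.
function truncateHistory(array $messages, int $maxTokens = 3000): array
{
    $estimate = fn(array $m): int => (int) ceil(strlen($m['content']) / 4);

    $total = array_sum(array_map($estimate, $messages));
    // Drop the oldest messages first until we fit within the budget,
    // always keeping at least the latest message.
    while (count($messages) > 1 && $total > $maxTokens) {
        $removed = array_shift($messages);
        $total -= $estimate($removed);
    }
    return $messages;
}
```

The same idea could later be extended to summarize (rather than drop) the oldest turns, but dropping is the simplest fix for the AJAX errors.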

Regarding taking 'too long' to respond, the PHP library just recently added the ability to stream the response. I'm waiting on that to be fully supported, which will make it possible to animate the response and give more immediate results to the client. Today, it gets everything back in 'one go', which can take anywhere between 3 and 15 seconds, depending on ChatGPT's load at the time.

πŸ‡ΊπŸ‡ΈUnited States d0t101101

Regarding #7 above - it seems the openai-php client is working towards making this library capable of handling streamed responses from the OpenAI APIs. Very much looking forward to support for this! Such a feature will vastly improve the UI/UX here, as we can show immediate responses and 'type the response' as it's being generated:

https://github.com/openai-php/client/issues/80

https://platform.openai.com/docs/api-reference/completions/create#comple...

πŸ‡ΊπŸ‡ΈUnited States d0t101101

Attaching a more recent screenshot of this patch in action!

πŸ‡ΊπŸ‡ΈUnited States d0t101101

@ressa - ah ha! I've attached an updated patch created using 'git diff', which should do the trick!

πŸ‡ΊπŸ‡ΈUnited States d0t101101

Also, FYI here, I did invest some time into getting this new ChatGPT form to animate the typing of the response, to mimic Chat.openai.com. Chat.openai.com seems to use its own backend APIs to get a 'stream' of the response as it's being created, making it possible to display results to the user sooner. Unless I maybe missed something, I don't believe that's possible with the OpenAI PHP client library in use today.

There are many ways to handle the typewriter-like animation of the text, ideally in pure CSS, but I also checked out various JS libraries. The 'gotcha' I ran into was handling the code blocks in the response: getting these to cleanly animate along with normal text proved to be a challenge, so I opted to just gut this bell and whistle for now in favor of displaying the full response as quickly as possible, with a smooth scroll to the bottom of the page...
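If I revisit it, one possible workaround is to split the response on fenced code blocks first, so plain-text segments can be animated while code segments are rendered whole (and handed straight to Highlight.js). A rough sketch; splitResponse() is a hypothetical helper, not part of the patch:

```php
<?php

// Split a model response into alternating text/code segments so the text
// can be animated while fenced code blocks are rendered in one piece.
function splitResponse(string $response): array
{
    $segments = [];
    // Capture ```...``` fences as their own segments.
    $parts = preg_split(
        '/(```[\s\S]*?```)/',
        $response,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );
    foreach ($parts as $part) {
        $segments[] = [
            'type' => str_starts_with($part, '```') ? 'code' : 'text',
            'content' => $part,
        ];
    }
    return $segments;
}
```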

πŸ‡ΊπŸ‡ΈUnited States d0t101101

@mindaugasd - thank you, sir!

I personally wasn't familiar with the ECA module at all, until now. :) After toying with it on my Drupal 10 playground environment, I better see your vision there too. It would be incredibly neat to have ECA's ease of use for modeling any website's workflows able to hook into OpenAI and create content/comments/etc. on the fly.

πŸ‡ΊπŸ‡ΈUnited States d0t101101

@kevinquillen - for your consideration, here is a semi-heavy patch (created against the latest DEV branch) to vastly improve the openai_chatgpt submodule! I have many ideas in mind for further enhancements, but thought I'd share this earlier on.

πŸ‡ΊπŸ‡ΈUnited States d0t101101

@ressa - thanks! Coming your way here soon!

πŸ‡ΊπŸ‡ΈUnited States d0t101101

+1

The quality of the GPT-4 responses is definitely improved!

πŸ‡ΊπŸ‡ΈUnited States d0t101101

Have some quick 'n' crude code working on this front, which appends the output of repeated ChatGPT requests into the same Response textarea.

Next I'd like to somewhat mimic https://chat.openai.com/chat UI/UX; will work on this as time permits! Patch to follow.

πŸ‡ΊπŸ‡ΈUnited States d0t101101

Thanks for merging this in!

@kevinquillen - "For example, after submitting once, a new ask/response field appears, similar to the ChatGPT UI"

I actually thought of that too and took a swing at it already. Ran into some issues with how to append/prepend. Will revisit to see if I can get it working!

πŸ‡ΊπŸ‡ΈUnited States d0t101101

I'm personally a big fan of PostgreSQL overall, and if a possible pgvector implementation could bridge this gap related to Embeddings in your OpenAI module here (between vectors in search-centric DBs vs. an RDBMS), I'd have no hesitation switching over from MySQL to PostgreSQL everywhere needed!

Other large/established websites might run into challenges switching to the PostgreSQL DB backend, however, including compatibility with other Drupal contributed modules, so ideally it's best to support both MySQL and PostgreSQL if reasonably possible. It would obviously be MUCH cleaner and faster to do all computation within the local DB!

πŸ‡ΊπŸ‡ΈUnited States d0t101101

The latest 'Suggest Taxonomy' feature is great and quite powerful! Excellent English grammar skills over there, with the prompt now requesting 'nouns and adjectives' only, too :)

I've noticed that if you repeatedly 'Suggest Taxonomy' again and again, sometimes it's a numbered list, and other times it's a comma-separated list. Ideally this should be a comma-separated list only, so that it can be quickly copied and pasted into a Drupal Autocomplete Tags type of input. This seems to do the trick:

'Suggest five words to classify the following text. The words must be nouns or adjectives, comma separated:'
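Even with that prompt, the model occasionally slips back into a numbered list, so a small normalizer on our side could make the output safe for an Autocomplete Tags field either way. A sketch; normalizeTerms() is a hypothetical helper, not something in the module:

```php
<?php

// Normalize a model's taxonomy suggestions into a clean comma-separated
// string, whether it returned "alpha, beta" or a numbered list.
function normalizeTerms(string $response): string
{
    // Split on commas or newlines, then strip list numbering like "1." or "2)".
    $parts = preg_split('/[,\n]+/', $response);
    $terms = [];
    foreach ($parts as $part) {
        $term = trim(preg_replace('/^\s*\d+[.)]\s*/', '', $part));
        if ($term !== '') {
            $terms[] = $term;
        }
    }
    return implode(', ', $terms);
}
```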

πŸ‡ΊπŸ‡ΈUnited States d0t101101

So far so good here too after pulling in the latest DEV version. Thanks again!

πŸ‡ΊπŸ‡ΈUnited States d0t101101

@kevinquillen - I can also confirm that these tweaks to the OpenAI queries made a dramatic improvement in summarization and taxonomy generation, which are part of the openai_content submodule (and now appear on the node edit pages). Tested this across 10 different nodes with varying subjects and lengths; working great.

Well done, sir!

πŸ‡ΊπŸ‡ΈUnited States d0t101101

@kevinquillen - All points taken; glad to help however I can assist!

Another thought: while building out a small blogger-like website recently with Drupal 10, I didn't want the administrative overhead of keeping a separate SOLR service (or other API) in the mix. I landed on this Drupal module for very basic 'more like this' functionality, which has been working well thus far for this particular use case. Scalability remains to be seen/validated... In any case, it's a very simple approach to a similar problem: connecting related content. This of course wouldn't help 'out of the box' for direct content comparison/matching/searching/de-duplication/etc., and it's certainly not taking the sophisticated vector approach to similarity, but it does pretty well connecting content, assuming the nodes are already classified via taxonomy terms.

https://www.drupal.org/project/similarterms β†’

Otherwise, the underlying DB engine is obviously a key consideration. MySQL has its pros and cons, but is it possible PostgreSQL's 'fuzzy matching' on a per-field basis could boost performance here without the 3rd-party dependencies? Some interesting progress with trigrams and similarity search is referenced here:

https://www.postgresql.org/docs/current/pgtrgm.html
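For intuition, pg_trgm scores similarity as the ratio of shared trigrams to total distinct trigrams between two strings. A simplified pure-PHP re-implementation of that idea (the real extension also pads words with spaces and handles more edge cases):

```php
<?php

// Extract the set of 3-character substrings (trigrams) of a string.
function trigrams(string $s): array
{
    $s = strtolower($s);
    $set = [];
    for ($i = 0; $i <= strlen($s) - 3; $i++) {
        $set[substr($s, $i, 3)] = true;
    }
    return $set;
}

// Jaccard similarity of the two trigram sets, roughly what
// pg_trgm's similarity() computes.
function trigramSimilarity(string $a, string $b): float
{
    $ta = trigrams($a);
    $tb = trigrams($b);
    $shared = count(array_intersect_key($ta, $tb));
    $total = count($ta + $tb); // distinct trigrams across both strings
    return $total === 0 ? 0.0 : $shared / $total;
}
```

In Postgres itself this would be the `%` operator / `similarity()` function with a GIN or GiST trigram index, which is what makes it fast per-field.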

πŸ‡ΊπŸ‡ΈUnited States d0t101101

New to the thread, and just signed up for Pinecone to experiment with this too...

This is certainly not my area of expertise, but I will say I am a big fan of Apache SOLR. I've used it in numerous Drupal projects and successfully created 'more like this' types of widgets that matched related content almost uncannily well across 100k+ Drupal nodes. A huge advantage is that, via the Drupal UI, it can be custom-tuned to set weights and influence the score based on particular fields of interest; for instance, a similarity match on Title or Body can have a greater (custom-defined) weight than a Taxonomy term match alone. It also gets into more advanced Geospatial Lat/Lon types of considerations for scoring content, has visibility into all of the custom fields, and allows the site admin to easily influence how scores are generated. How it does all of this under the hood, IDK, but it looks like SOLR 9 is adding a lot via neural search capabilities here.

I'd personally really prefer to see this type of functionality self-hosted in Free and Open Source Software rather than relying on a paid 3rd-party service wherever possible! At the same time, I respect how much time and energy is needed to just 'make it work' :-D

Digging into this, thought these references might be of interest to you with regards to SOLR/OpenAI, if you haven't already come across them. Just food for thought here!

https://github.com/apache/solr/pull/476#issuecomment-1028997829
"dimensionality reduction step"

https://openai.com/blog/introducing-text-and-code-embeddings/
"To visualize the embedding space, we reduced the embedding dimensionality from 2048 to 3 using PCA"

Is it maybe possible to intelligently reduce, or otherwise programmatically truncate, the vector size from 1536 (OpenAI) to 1024 (SOLR), so the two 'pair well'? Then you'd have the Apache SOLR community behind it to further assist, rather than a 'black box' solution! Not bashing at all here; just sayin'!!!
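To make the truncation idea concrete, a naive sketch: keep the first 1024 of the 1536 dimensions and re-normalize to unit length so cosine/dot-product scoring still behaves. PCA, as in OpenAI's own visualization, would preserve more information than plain truncation; truncateVector() below is purely illustrative:

```php
<?php

// Naively reduce an embedding to $dims dimensions by truncation, then
// re-normalize to unit length so cosine similarity remains meaningful.
function truncateVector(array $vector, int $dims = 1024): array
{
    $slice = array_slice($vector, 0, $dims);
    $norm = sqrt(array_sum(array_map(fn($x) => $x * $x, $slice)));
    if ($norm == 0.0) {
        return $slice;
    }
    return array_map(fn($x) => $x / $norm, $slice);
}
```

Whether truncated OpenAI embeddings still rank results sensibly would need empirical testing; this only shows the mechanics.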

πŸ‡ΊπŸ‡ΈUnited States d0t101101

Closing this - the module authors are working towards taxonomy classifications already. Awesome!

https://git.drupalcode.org/project/openai/-/commit/a7a785b843a54159961e6...

Interesting touch with the "Suggest up to five words to classify the following text" statement here, too!

πŸ‡ΊπŸ‡ΈUnited States d0t101101

@kevinquillen - good catch, the upper limit of the MAX is 2 as you already wired in here. Nice! :)

https://platform.openai.com/docs/api-reference/completions#completions/c...

πŸ‡ΊπŸ‡ΈUnited States d0t101101

Sounds good to me!

Regarding the "Return response in HTML format." checkbox, I would be curious to see how CKEditor5 behaves on first render/resave with such an option (i.e. does it change the body text in the DB), but I suppose that would depend on the text format options each user has set up on their Drupal install under Admin > Configuration > Content Authoring > Text formats and editors.

Thanks again!

πŸ‡ΊπŸ‡ΈUnited States d0t101101

Similarly, when running drush openai:generate-content (which needs to be updated in the README.md, BTW!), the validation around temperature wasn't working. In ContentGenerateCommand.php, line 179, I changed 'is_float' to 'is_numeric' to correct this.
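For anyone curious why that one-word change fixes it: option values arriving from the CLI are strings, and is_float() checks the variable's actual type rather than whether it looks numeric. A quick illustration:

```php
<?php

// CLI options arrive as strings, so is_float() rejects them even when
// they look like floats; is_numeric() accepts numeric strings.
$temperature = '0.7';                  // as received from a drush option

$isFloat = is_float($temperature);     // false: the type is string
$isNumeric = is_numeric($temperature); // true: numeric string
$value = (float) $temperature;         // cast before passing along
```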

Along with my manual testing of both the Content Generation form and drush, it would be great to be able to set default values for the Content Type/Title/Body fields somehow. Having to select them every time via the form slows down the process, and there isn't a way to do this during the interactive drush command. In drush, after going through the steps and confirming 'Proceed with operation', in my environment it bombs out, not knowing where to store the response, with:

 [warning] Undefined array key "content_type" ContentGenerationCommand.php:101
 [warning] Undefined array key "title" ContentGenerationCommand.php:102
 [warning] Undefined array key "body" ContentGenerationCommand.php:103

(and then a slew of errors around these missing arguments...)

Are there any plans to set these values via the OpenAI config pages somehow? Would also be handy to be able to create content via drush with a single terminal command (i.e. all options could be set via the CLI).

Thanks!!!

πŸ‡ΊπŸ‡ΈUnited States d0t101101

@kevinquillen - thank you very much! Pulled the latest alpha2 and tested - working well over here now (minus the Media/Image generator which, as you mentioned, wasn't 'touched' just yet). Just for reference, filling in the Generator's Media 'Image description' results in the following error ATM:

Error: Call to undefined method OpenAI\Client::getImageUrl() in Drupal\openai_api\Controller\GenerationController->getImageUrlResponseBodyData() (line 112 of /data/web/modules/contrib/openai/modules/openai_api/src/Controller/GenerationController.php).