Make the translator able to translate longer HTML content

Created on 2 September 2023, about 1 year ago
Updated 17 September 2023, about 1 year ago

Problem/Motivation

When translating content, the body field commonly contains HTML, which can cause OpenAI requests to exceed the model's token limit.

Proposed resolution

First, before sending the OpenAI request, split the content into smaller chunks. The chunk length can be defined by the user according to the model and server environment.
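The chunking step can be sketched as follows. This is a minimal illustration in Python, not the module's actual PHP code; the function names and the rough 4-characters-per-token heuristic are assumptions (the module uses the rajentrivedi/tokenizer-x library for real token counts):

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text.
    # A real implementation would use a proper tokenizer library.
    return max(1, len(text) // 4)

def split_into_chunks(segments: list[str], max_chunk_tokens: int) -> list[str]:
    """Greedily pack HTML segments into chunks that stay under the token limit."""
    chunks: list[str] = []
    current = ""
    for segment in segments:
        candidate = current + segment
        if current and estimate_tokens(candidate) > max_chunk_tokens:
            # Adding this segment would exceed the limit: close the
            # current chunk and start a new one with this segment.
            chunks.append(current)
            current = segment
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks

segments = ["<p>First paragraph.</p>", "<p>Second paragraph.</p>", "<p>Third.</p>"]
print(split_into_chunks(segments, max_chunk_tokens=12))
```

Note that a single segment larger than the limit is still emitted as its own chunk; splitting inside an HTML element would risk producing invalid markup.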

Second, use the Batch API to make the requests to OpenAI.
This avoids sending one single long request and triggering a timeout error.
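The overall flow — translate each chunk in its own request, then reassemble the result — can be illustrated with this hedged sketch; `translate_chunk` is a stand-in for the real OpenAI call (an assumption, not the module's API), tagged locally so the example runs without network access:

```python
def translate_chunk(chunk: str) -> str:
    # Placeholder for one OpenAI API call per batch operation.
    # Here we just tag the chunk instead of calling the API.
    return f"[translated]{chunk}"

def translate_in_batches(chunks: list[str]) -> str:
    # Each chunk becomes one batch operation, so no single request
    # carries the full (possibly very long) body field.
    translated = [translate_chunk(chunk) for chunk in chunks]
    return "".join(translated)

print(translate_in_batches(["<p>Hello</p>", "<p>World</p>"]))
```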

User interface changes

Add a new "Maximum chunk tokens" field to the provider settings.

Feature request
Status

Fixed

Version

1.0

Component

Code

Created by

🇹🇼Taiwan amourow


Comments & Activities

  • Issue created by @amourow
  • @amourow opened merge request.
  • Status changed to Needs review about 1 year ago
  • 🇹🇼Taiwan amourow

    Made changes to make the chunk processing and Batch API work.

  • 🇹🇭Thailand AlfTheCat

    This is great to have!

    For me the patch didn't work, the log shows:

    Error: Class "Rajentrivedi\TokenizerX\TokenizerX" not found in Drupal\tmgmt_openai\Plugin\tmgmt\Translator\OpenAiTranslator->countTokens() (line 211 of /var/www/XXXX/modules/contrib/tmgmt_openai/src/Plugin/tmgmt/Translator/OpenAiTranslator.php).

    Hope this helps.

  • 🇹🇼Taiwan amourow

    @AlfTheCat

because the additional library used to count tokens is declared in this module's composer.json.

Applying the patch via the root composer.json won't install it, because during the composer update the additional vendor package is not yet listed in this module's composer.json.

There are two ways to fix it:

    1. Run composer require rajentrivedi/tokenizer-x in the Drupal project to make sure the library is installed.
    2. Since the maintainer merged the MR, you can try requiring this module at version 1.x-dev.

    It should work for you; please let me know how it goes.

  • @amourow opened merge request.
  • Status changed to Fixed about 1 year ago
  • 🇹🇼Taiwan amourow

Marking this fixed since the MR was merged.

  • Status changed to Fixed about 1 year ago
  • 🇹🇭Thailand AlfTheCat

    @amourow Thanks for the info! Option 2 didn't work for me, composer reports that it can't find a dev release of this module. I don't see it under "all releases" on the project page either.

    Option 1 worked though, and I'm no longer getting errors. However, I don't see the option in the UI that the proposed solution describes:

    "First, before sending the OpenAI request, split content into smaller chunks. The length of chunks can be defined by user according to the model and server environment."

    If that's by design then I suppose it works :)

    Thanks again!

  • 🇹🇼Taiwan amourow

    @AlfTheCat

I recently realized that option 2 doesn't work for this module, because the maintainer doesn't have that branch.
    Let's create a separate issue for that.

There is indeed an additional field, "Maximum chunk tokens", in the provider settings.
    It defines the maximum number of tokens per chunk when sending text in OpenAI API requests.

Also, I found that the update here may create two batch runners, so I made another patch to fix that issue.
    Could you also try including the patch from 🐛 Reduce redundant batch runner Active along with this one?

  • 🇹🇭Thailand AlfTheCat

    Awesome, thanks, I found that setting :)
