- Issue created by @prashant.c
- Merge request !36 Added default model and provider for each operation type dynamically. → (Merged) created by prabha1997
- 🇮🇳India prabha1997
I’ve added a dynamic implementation to set the default provider and model ID for each operation type.
- 🇮🇳India Ishani Patel
I've checked with the 1.1.x branch of AI, and it is working as expected. Please refer to the attached screenshot.
Thank you!
- 🇮🇳India prashant.c Dharamshala
Left some review comments, moving to Needs work (NW). Kindly address those and move back to Needs review (NR).
- 🇩🇪Germany marcus_johansson
I don't think they should be dynamically set in the providers with known choices. These are editorial choices: models that fit best for tested use cases like Drupal CMS agents, while still being cheap. gpt-4 was always better than gpt-4o on text tasks, but we still chose 4o because it was faster and cheaper.
In many providers we do not even set these, because they are not up to the standards for every operation type.
The question is whether it's time to bump chat_with_complex_json, chat_with_structured_json and chat_with_tools up to 4.1, since it's cheaper and better (not tested with Drupal CMS yet).
WDYT?
Technically the code looks good. I'm just wondering if it makes sense, since the order coming back could in theory pick 3.5 turbo for JSON and tool calling.
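To make the concern concrete, here is a minimal sketch of the difference between a dynamic pick and an editorial default. It is not the module's code: the function names, the `EDITORIAL_DEFAULTS` mapping, the model IDs and the provider's return order are all assumptions made up for illustration.

```python
# Hypothetical illustration only; none of these names come from the AI module.

# Assumed editorial choices: one curated model per operation type.
EDITORIAL_DEFAULTS = {
    "chat_with_complex_json": "gpt-4o",
    "chat_with_structured_json": "gpt-4o",
    "chat_with_tools": "gpt-4o",
}


def pick_dynamic_default(available_models):
    """Return the first model the provider lists, or None if the list is empty."""
    return available_models[0] if available_models else None


def pick_editorial_default(operation_type, available_models):
    """Return the curated model for the operation type if the provider offers it."""
    model = EDITORIAL_DEFAULTS.get(operation_type)
    return model if model in available_models else None


# Assumed order in which a provider might return its models.
available = ["gpt-3.5-turbo", "gpt-4o", "gpt-4.1"]

print(pick_dynamic_default(available))                       # gpt-3.5-turbo
print(pick_editorial_default("chat_with_tools", available))  # gpt-4o
```

With the dynamic pick, the result depends entirely on the order the provider returns. With the curated mapping, changing a default (for example to 4.1) is a one-line edit.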
- 🇩🇪Germany marcus_johansson
One way to clean up the code would be to add it here and then link to this in the form?
- 🇮🇳India prabha1997
Thank you for the feedback! As of now, I’m getting the default models from the plugin and have set them dynamically like this. Please review this approach. I understand your point about editorial choices, and regarding bumping models to GPT-4.1, we can wait until the tests with Drupal CMS are completed. Once we have more insights, we can make the necessary changes.
- 🇩🇪Germany marcus_johansson
I did some initial tests with 4.1 vs 4o and they work more or less the same on the small data set I've tested against: https://docs.google.com/spreadsheets/d/1IfWlRhlq2E4Lh5R53W3qicink9eDzi6U.... The failures happening are code or prompt failures, rather than provider issues.
Since 4.1 is cheaper I think at least for chat_with_tools we could set it?
- 🇮🇳India prabha1997
Hi @marcus_johansson,
I'm currently unable to access the spreadsheet — it looks like I don't have permission. Could you please grant me access?
Thanks!
- 🇩🇪Germany marcus_johansson
Could you make a request for it or write your e-mail (on Slack if you don't want to publish it here)?
- 🇮🇳India prabha1997
Hi @marcus_johansson,
Thanks for sharing access. I’ve checked the sheet. While I understand the data, I’m not yet confident enough to analyze and compare the cost-effectiveness of 4.1 and 4o in depth.
- Automatically closed - issue fixed for 2 weeks with no activity.