Updated issue summary steps
Thanks for looking into it. I retested and can see I used \Drupal::httpClient()->get('https://drupal11.ddev.site/search?query=test&skip_debug=true'); (updated with your skip-debug snippet) without the core Search module enabled and without yet creating a /search page, so that results in a 404 Not Found.
Whereas with my test Search API page, \Drupal::httpClient()->get('https://drupal11.ddev.site/test-search-api?query=test&skip_debug=true'); works fine after drush cr.
So it seems causing an exception (a GuzzleHttp\Exception\ClientException for the 404 response there) helped reproduce the issue - I suppose it is sort of equivalent to the 'Serialization failure: 1213 Deadlock found' insofar as both are exceptions.
So I suppose in that case the try/catch in search_api_views_data() stops it from populating the $data array with Search API data, whereas other hook_views_data() implementations don't seem to have a try/catch; that would have stopped core from continuing to build the views data, but instead core thinks it has built the data successfully and stores it in cache?
Surprisingly, however, I don't see this appearing in the logs as a result:

```php
catch (\Exception $e) {
  $args = [
    '%index' => $index->label(),
  ];
  Error::logException(\Drupal::logger('search_api'), $e, '%type while computing Views data for index %index: @message in %function (line %line of %file).', $args);
}
```

Only the Guzzle error shows up.
Feedback from @AstonVictor addressed by the way; back to Ready for review. Updated the issue summary to make it clearer.
Updated issue summary. I believe there is one outstanding comment about whether this should be set up to cover moderated content more broadly, beyond nodes.
As discussed further offline, just to recap: closing this because a custom token is the way to go for something project-specific like this. Thanks!
Unable to retest because of ✨ Cloudflare PHP SDK is abandoned, affects v1 and v2 Active, but I resolved the conflicts. It's stored as a string so I left it as a string, not a sequence. Or do you want to split the textarea prior to save and recombine when setting the default value?
You can use the rendered entity instead, then you can control within a view mode what exactly you would like to have chunked. Hope that helps!
From 1:06 in the video here I do a walkthrough of the setup:
https://www.drupal.org/project/ai/issues/3478581 🌱 #2 Drupal AI Online Meetup Active
At 1:40 I get into combining Search API Database and AI Search.
Hope that helps!
Thanks!
scott_euser → made their first commit to this issue’s fork.
scott_euser → created an issue.
For those coming across this, here is the link → to all modules tagged with the new category, to help see at a glance if there are AI modules you know of that are still missing the categorisation.
Updated the issue summary to make it clearer.
I managed to reliably reproduce the error locally in ddev with a bit of a hack, per https://www.drupal.org/project/search_api/issues/3029653#comment-16226130 🐛 SQLSTATE[40001]: Serialization failure: 1213 Deadlock found Postponed: needs info, and I can see that the current MR does not solve it. I provided details there rather than here as it could be this solves a separate issue.
Okay, I managed to reliably reproduce this now with a bit of a hack that essentially forces a request as if a visitor is visiting the site at exactly the wrong moment, as described in #18.
- Have a /search page for nodes with an entity reference exposed filter
- Just inside the _search_api_views_get_handlers() function, add something temporary like: \Drupal::httpClient()->get('https://mysite.ddev.site/search?query=test');
- Run drush cr
- Visit your https://mysite.ddev.site/search
- See that the view loads fine but filter handlers like entity reference are silently missing, without any errors thrown
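For clarity, the temporary hack from the steps above would look something like this pasted at the top of _search_api_views_get_handlers() (a throwaway debug aid only; the URL is your local ddev site, and wrapping it in a try/catch is just my suggestion so the nested request failing doesn't itself throw):

```php
// TEMPORARY debug hack: trigger a nested request to the search page while
// the views data is still being rebuilt, simulating a visitor arriving at
// exactly the wrong moment. Remove after testing.
try {
  \Drupal::httpClient()->get('https://mysite.ddev.site/search?query=test');
}
catch (\GuzzleHttp\Exception\GuzzleException $e) {
  // Ignore failures; the point is only to trigger the nested request.
}
```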
I can see that 🐛 Consider to remove exception handling from search_api_views_data Needs review does not help in that scenario - the filters will still be missing - so I think this still belongs here as a separate issue.
This is also reliably reproducible with the Standard install profile of Drupal core:
- Enable "Database Search Defaults" submodule of Search API.
- Create a few nodes and add tags to them.
- Create a new view like 'Test Search API' using the default Search API index that submodule creates, and add an exposed filter for 'Tags' on it
- Add the above test code into _search_api_views_get_handlers()
- Run drush cr
Then even the update preview within the view fails to load the exposed filter and no results load; subsequent updates continue to fail because the incorrect views data is now cached.
Found another more significant issue: if you install e.g. "Database Search Defaults" from the Search API module and try to use a normal non-vector index, when you click "Index now" at /admin/config/search/search-api/index/default_index all nodes get indexed but it gets stuck showing 0% complete rather than 100% complete. So our functionality here seems to be seeping into indexes beyond our own.
@iamdroid did you end up managing to solve it for your site(s)?
Okay, looks good, thanks! We just need to do a coordinated release because Pinecone & Milvus both need updates to match this (and really the other unsupported VDB providers listed on the AI module homepage do as well, I think).
Thanks! A lot less boilerplate. Just re-running the failed test as it looks like a random/unrelated test failure.
If nobody else cares it's not a hill I'll die on, but I really do think the progress bar is better off showing the overall progress rather than the progress of the individual item. E.g.:
- Total progress 55%
- Reaches item with 9 chunks
- Progress bar remains at 55%
- The message below the bar changes, like "Processed chunk 1 of 9 for item ...", "Processed chunk 2 of 9 for item ...", etc.
- Then the progress bar moves to the next percentage when the next content item is processed
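A rough sketch of what I mean with Drupal's Batch API, where $context['finished'] only advances per item while $context['message'] reports per-chunk progress (the mymodule_* function names are purely illustrative, not the actual indexing code):

```php
/**
 * Illustrative batch operation callback: the progress bar percentage
 * reflects overall item progress, while the message reports chunks.
 */
function mymodule_index_items_batch(array $items, array &$context) {
  if (!isset($context['sandbox']['current'])) {
    $context['sandbox']['current'] = 0;
    $context['sandbox']['total'] = count($items);
  }
  $item = $items[$context['sandbox']['current']];
  // Hypothetical helpers standing in for the real chunk/embed logic.
  $chunks = mymodule_chunk_item($item);
  foreach ($chunks as $i => $chunk) {
    mymodule_embed_chunk($chunk);
    // The message changes per chunk, but the bar stays on the item percentage.
    $context['message'] = t('Processed chunk @current of @total for item @item', [
      '@current' => $i + 1,
      '@total' => count($chunks),
      '@item' => $item->label(),
    ]);
  }
  $context['sandbox']['current']++;
  // Only now does the bar itself advance.
  $context['finished'] = $context['sandbox']['current'] / $context['sandbox']['total'];
}
```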
In any case there are unresolved comments, like the Milvus-specific code in there, so it's not ready yet either way, sorry! Thanks for all the persistence with this.
Side note: after dealing with this we can reconsider whether 📌 Deprecate AiVdbProviderInterface::getVdbIds() Active can be dealt with.
Hmm actually we may need this for 📌 Improve AI Search Module Indexing to Handle Long-Running Chunk Embedding Processes Needs work for the ChunkQueueWorker. Going to postpone to see where that goes
Thanks! I think it's closer, but there is a Milvus-specific chunk of code that has come in here, I think (added a comment to the MR).
Looking at the issue queue, I think we'll have a number of changes; it might be easier to target 2.0 with a bunch of these issues, as I believe we'll have a 1.2 release soon and lose the ability to add version constraints to composer.
Checked and this is doable; the steps would be, I believe:
- Copy the current state of deleteItems() from AiVdbProviderClientBase into Milvus
- Remove the deleteItems() method from AiVdbProviderClientBase since it's Milvus-specific code that others currently override
- Remove getVdbIds() from AiVdbProviderInterface
- Do a coordinated release (e.g. 2.0 branch) for Milvus
Given it's experimental, I don't think we need an actual deprecation throw unless others disagree; we can just add it to the release notes. Currently the AI module homepage does not list any supported VDB provider other than Milvus that would need a coordinated update.
Okay, I believe what is left to do:
- An update to search_api_ai_search.backend.schema.yml to add the new config to the schema
- An update hook like ai_search_update_10003 which sets the new config's default value to false
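A minimal sketch of what that update hook might look like, assuming the new setting lives in the backend config of each AI Search server ('new_setting' and the backend ID check are placeholders for the actual key and plugin ID):

```php
/**
 * Sketch only: set the new backend config key's default value to FALSE
 * on existing AI Search servers. 'new_setting' is a placeholder for the
 * actual key added to search_api_ai_search.backend.schema.yml.
 */
function ai_search_update_10003() {
  foreach (\Drupal\search_api\Entity\Server::loadMultiple() as $server) {
    // Placeholder backend ID; use whatever the AI Search backend plugin ID is.
    if ($server->getBackendId() !== 'search_api_ai_search') {
      continue;
    }
    $config = $server->getBackendConfig();
    $config['new_setting'] = FALSE;
    $server->setBackendConfig($config);
    $server->save();
  }
}
```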
Okay, 📌 Reinstate MySQL tests Active is now fixed; rebased now.
I suspect this is an upstream issue, as I have seen this error in the past and it required cache clearing to sort out. Some possibilities:
- 🐛 SQLSTATE[40001]: Serialization failure: 1213 Deadlock found Postponed: needs info
- 🐛 Consider to remove exception handling from search_api_views_data Needs review
Feel free to re-open if we can have steps to reproduce that relate to AI Search. Thanks!
Actually this is fine, given it's experimental: of those listed on the AI module homepage, only Pinecone and Milvus seem to be actively supported VDB providers (I checked Azure, SQLite and Postgres; they seem to lock to old AI core versions), and they continue to work fine since they extend AiVdbProviderClientBase.
Unless someone else has concerns, we can merge this.
Thanks! Added a comment to the MR
Thanks! I don't think that was quite what @kim.pepper was referring to. If you search for "vdbProviderManager->createInstance" in `SearchApiAiSearchBackend` you'll find many calls creating a new instance, whereas instead we should call `$this->getClient()`.
Doesn't seem to do any harm since we are implementing all methods from it anyway, it seems.
Thanks, makes sense. I think we might need to target 2.0 though, to be able to have the VDB providers separately lock to earlier/later versions. I will start a Slack thread discussion because there seem to be several issues in the queue that change the interfaces/methods, so it will get confusing talking about it across multiple issues.
I'm not sure we'll ever get them aligned as we have been building them based on what is most sensible for a free user (which in some cases is also most sensible for a paid user) who wants multiple Search API Indexes (in the Search API sense of the word).
Pinecone free tier (serverless):
- Index = Search API Server https://docs.pinecone.io/guides/manage-data/manage-indexes
- Namespace = Search API Index https://docs.pinecone.io/guides/manage-data/manage-namespaces
Milvus/Zilliz (https://milvus.io/docs/manage-collections.md):
- Database = Search API Server https://milvus.io/docs/manage_databases.md
- Collection = Search API Index https://milvus.io/docs/manage-collections.md
I think when @seogow (if I remember right) converted the first one to a VDB provider, it was with Milvus/Zilliz, which is why the language in AiVdbProviderInterface is *Collection.
Maybe lining up with Search API terminology and ignoring the conflicting terms of the providers themselves would be less confusing for users. What do you think?
scott_euser → made their first commit to this issue’s fork.
scott_euser → made their first commit to this issue’s fork.
scott_euser → made their first commit to this issue’s fork.
scott_euser → made their first commit to this issue’s fork.
Sorry, can you provide a bit more info? What is the error and how can it be reproduced?
Otherwise, can you confirm the main view for search is Search API Database? For Solr there is 📌 Check that SOLR 'boost' of results finds results that are not found by traditional SOLR search Active open, which means Solr for now only boosts results that are also found by the main view, whereas the database version also finds results that are not otherwise found by the database search alone.
Hi! Might be the 0,2 instead of 0.2 for the relevance threshold.
Could it be a custom hook token that subsequently evaluates the default token setup, and then strips stopwords from the evaluated token?
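Something along these lines with hook_tokens(), where the custom token first resolves the normal token and then strips stopwords from the result (the token name, module name, and stopword list are made up for illustration):

```php
/**
 * Implements hook_tokens().
 *
 * Illustrative only: a hypothetical [node:title-no-stopwords] token that
 * resolves the standard title token, then strips stopwords from it.
 */
function mymodule_tokens($type, $tokens, array $data, array $options, \Drupal\Core\Render\BubbleableMetadata $bubbleable_metadata) {
  $replacements = [];
  if ($type === 'node' && isset($tokens['title-no-stopwords'])) {
    // Evaluate the default token setup first.
    $title = \Drupal::token()->replace('[node:title]', $data, $options, $bubbleable_metadata);
    // Then strip a (made-up) stopword list from the evaluated value.
    $stopwords = ['a', 'an', 'the', 'of', 'and'];
    $words = preg_split('/\s+/', $title);
    $words = array_filter($words, fn ($word) => !in_array(mb_strtolower($word), $stopwords, TRUE));
    $replacements[$tokens['title-no-stopwords']] = implode(' ', $words);
  }
  return $replacements;
}
```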
Okay, turns out it was just minor phpcs issues. The main test failure was 🐛 PHPUnit job hangs as of update to PHPUnit 11.5.29 Active, which they resolved yesterday (or at least added a temporary workaround).
scott_euser → created an issue.
scott_euser → created an issue.
Beautiful, thank you!
scott_euser → made their first commit to this issue’s fork.
Given it's a test-only change that can easily be reverted, merging to be able to progress again on other AI Search issues.
Okay, the retry passed, all green now. I think good to merge, unless any concerns Marcus?
Looks like it's mostly passing now, but there is one error in Drupal\Tests\ai\Kernel\Utility\TextChunkerTest which I guess must also be an error elsewhere, as I don't think at a glance that it has to do with this reinstate.
The Search API change is now fixed; waiting on 🐛 PHPUnit job hangs as of update to PHPUnit 11.5.29 Active now for GitLab CI to be working again.
Confirmed locking to 11.5.28 solves it https://git.drupalcode.org/project/page_to_pdf/-/merge_requests/30/diffs?commit_id=1f61289de256f556304b775a958584f89422ae40 (temporary of course), so it's definitely a change somewhere in 11.5.28...11.5.29: https://github.com/sebastianbergmann/phpunit/compare/11.5.28...11.5.29
With --debug on it just seems to hang on 'Child process started' https://git.drupalcode.org/project/page_to_pdf/-/jobs/6162154 - no further debug info there, unfortunately, to see where the child process is hanging.
Seems this will be blocked on 🐛 PHPUnit job hangs as of update to PHPUnit 11.5.29 Active.
Thanks for working on this! A couple of comments though:
- The merge request itself is empty; I would suggest using the buttons in the issue summary area to create the branch, fork, and merge request.
- Looking at what's here so far, I think that /admin/config/media/file-system covers most of the options around filenames (lowercase/remove space/transliterate).
- The removal of stop words is an opinionated bit that is perhaps better kept as a custom token using the "Generated PDF filename pattern".
Test coverage is being addressed in 🐛 Fix test coverage Active as it's failing in the monthly pipeline as well, so not related to here (or at least not fully).
So possibly this is 'Closed (works as designed)', but feel free to disagree.
scott_euser → created an issue.
Thanks!
scott_euser → made their first commit to this issue’s fork.
scott_euser → created an issue.
scott_euser → created an issue.
Thank you for the work on this! Added a nitpick in addition to Kevin's comment in #8, but it would be good to add a tiny bit to the test here to ensure removal continues to work in the future: https://git.drupalcode.org/project/ai/-/blob/1.2.x/tests/src/Functional/...
Thank you!
Yeah, the problem I had here was how to actually build context without knowing where the prompt element was being used (which could be anywhere). So for now in the docs I just said it's the responsibility of the developer implementing the element, and gave sample code: https://project.pages.drupalcode.org/ai/1.2.x/developers/ai_prompt_manag...
Probably someone will have a bright idea about how to do that automatically, but I got stuck at the context bit and didn't want to let perfect be the enemy of good here. Open to suggestions :)
Yep, perfectly happy with the new MR "3531256-views-row-alter-checks" you created with just the initial commit from @vegardjo, and your explanation for not wanting an extra sub-module is fair enough - perhaps if the use statement not existing somehow causes a fatal error in a scenario we haven't come across, then a follow-up can be raised with lower priority. Thank you!!
Ah, and tests now pass after updating the fork now that 🐛 Fix failing pipelines Active is resolved (thank you!)
Rereading the comments and checking the code change, moving to a submodule does seem to be the right way to fix it, rather than just masking the problem like the initial commit. If we extend a core module we should make sure it is enabled. The approach of auto-enabling the sub-module when in use is a pattern we used before when splitting providers out of the AI core module.
That said, it could be split into a separate follow-up issue if it means getting it fixed sooner with just the initial-commit workaround. At least for me this is slowing progress on AI Search, since it's harder to keep up with the issue queue there with tests failing, and both options would unblock that.
Great thank you!
Doesn't sound fun! Couldn't see anything at a glance causing the remaining failures :(
🐛 The new purge() method is missing from \Drupal\Core\Database\Transaction\TransactionManagerInterface Active is the missing interface reference, with a workaround for others.
I think it depends on the provider; in this case I think the problem was Milvus, if I remember right (I would have to check). Pinecone was fine. I suppose that could then be a problem for Milvus specifically to sort out... I haven't checked others.
Hi! It's blocked at the moment due to an issue with Search API 📌 Reinstate MySQL tests Active
Thanks for the contribution! Yeah, makes sense to me re the version and timeline. Is there any reason not to also have ::getDimensions(), given the eventual Symfony AI Platform compatibility plan?
@kim.pepper sounds like you are making a new VDB provider, so by the way, once you have it in place we can reference it from the AI module homepage and add it to the docs https://project.pages.drupalcode.org/ai/1.2.x/providers/matris/ - this also helps keep track of maintainers to contact with breaking changes/coordinated releases, particularly while we are still experimental.
Ah and the attachment here →
Thanks for progressing this! When I tried it out I hit a fair number of errors that I had to hack around to get it working (comments added to the MR). When it is chunking it's still fairly confusing: in my attached video I have 3 items being indexed, all of which get chunked into various quantities of chunks. It seems to say 100% quite quickly but then does a lot more, and I had trouble following along to understand where I was in the process. It does seem to properly index it all in the end.
Confirmed tests pass locally again when applying the patch from 🐛 Cannot create Search API based views On Drupal 11 Active
Thank you for sorting this one out! This also fixes the AI Search module (sub-module of AI core), where tests had to be disabled temporarily as a result: 📌 Reinstate MySQL tests Active
A very vague error, so it was a real pain to see where it was going wrong. For convenience, the error:

```
Ai Search Setup My Sql (Drupal\Tests\ai_search\Functional\AiSearchSetupMySql)
⚠ Field indexing options
✘ Search view
┐
├ Exception: Warning: Undefined array key "title"
├ Drupal\views\Views::fetchPluginNames()() (Line: 157)
```
Looks like the problem is upstream in 🐛 Cannot create Search API based views On Drupal 11 Active
Thanks both!
Closing per comment above unless we hear further.
Closing per comment above
Thanks for the contribution; however, Taxonomy Index Tid is a valid filter and already allows things like BEF to work on it (compared to numeric/string entity reference filters).
If this module was to accept such a change it would need to be opt-in, not default (e.g. a new config page), and include test coverage. Otherwise we are forcing a new behaviour on all existing users of the module, which others may not want. For now I will close, but if multiple people feel strongly and are willing to contribute the above, I could reconsider.
Okay, that now handles:
- New module install, navigation not yet enabled but gets installed later (hook_modules_installed kicks in)
- New module install, navigation already enabled (hook_install kicks in)
- Existing module update, navigation already enabled (hook_update_N kicks in)
- Existing module update, navigation not yet enabled but gets installed later (hook_modules_installed kicks in)
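The install-time scenarios above boil down to two hooks, roughly like this (module name and the block placement helper are stand-ins for whatever the module actually does; the hook_update_N case would call the same helper):

```php
/**
 * Implements hook_modules_installed().
 *
 * Covers the cases where Navigation gets enabled after this module.
 * _mymodule_place_navigation_block() is a hypothetical stand-in for the
 * real block placement logic.
 */
function mymodule_modules_installed(array $modules, bool $is_syncing) {
  if (in_array('navigation', $modules, TRUE) && !$is_syncing) {
    _mymodule_place_navigation_block();
  }
}

/**
 * Implements hook_install().
 *
 * Covers a fresh install where Navigation is already enabled.
 */
function mymodule_install() {
  if (\Drupal::moduleHandler()->moduleExists('navigation')) {
    _mymodule_place_navigation_block();
  }
}
```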
I think this needs a new 2.x branch to make this Drupal 11+ only.
After:
Thank you!
scott_euser → made their first commit to this issue’s fork.
Hi! Anything I can do here to help get this over the line? Thank you!
Yeah, the problem with placing the block is we need to check if Navigation is installed, but also need to react if it's not yet installed and later gets installed. I'll see if I can spend a bit more time on this to figure out how to do that.
This commit essentially: https://git.drupalcode.org/project/rebuild_cache_access/-/merge_requests... - no CSS needed in the end; it just takes the styling from the Navigation module defaults, so less maintenance burden too if they change things.
Actually, we could just leverage the Navigation module:
A custom icon could be a potential follow-up too, but I think it's not worth being a blocker.