πŸ‡¬πŸ‡§United Kingdom @andrewbelcher

Account created on 21 November 2009, over 14 years ago
#

Merge Requests

Recent comments

πŸ‡¬πŸ‡§United Kingdom andrewbelcher
πŸ‡¬πŸ‡§United Kingdom andrewbelcher
πŸ‡¬πŸ‡§United Kingdom andrewbelcher

andrewbelcher β†’ created an issue.

πŸ‡¬πŸ‡§United Kingdom andrewbelcher

@scott_euser apologies for the silence. Could you take a look at πŸ“Œ Add decouple Milvus support Needs review ? We've put effort into pluggable backends and that issue adds Milvus support.

πŸ‡¬πŸ‡§United Kingdom andrewbelcher

Can we get an update hook to enable the new submodule to provide an upgrade path?

We'll also need to fix πŸ“Œ [P1] Update pinecone backend to use new engines & base class Active before we do a release, as this change won't support existing pinecone setups without it.

πŸ‡¬πŸ‡§United Kingdom andrewbelcher

Reviewed along side ✨ Add a trait, base class and interface for AI Search API backend engines RTBC and looks good!

Worth noting this probably mostly solves πŸ“Œ Implement Milvus as a vector storage backend Active as well (and crediting @scott_euser accordingly)

πŸ‡¬πŸ‡§United Kingdom andrewbelcher

Reviewed along side πŸ“Œ Add decouple Milvus support Needs review and ✨ Add decoupled Postgres + pgvector support Needs review - very happy with the approach and will make it easier to add various backends as required. I have opened πŸ“Œ [P1] Update pinecone backend to use new engines & base class Active as a follow up to update the pinecone backend.

πŸ‡¬πŸ‡§United Kingdom andrewbelcher

Reviewed and tested in conjunction with ✨ Make Embeddings Engine Pluggable RTBC - nice approach and will be easy to add more providers.

πŸ‡¬πŸ‡§United Kingdom andrewbelcher

Reviewed and tested in conjunction with #3450812 - nice to move the embeddings into something configurable and swappable, with scope to bring in more models and providers.

πŸ‡¬πŸ‡§United Kingdom andrewbelcher

So I think there are a couple things in place already that help with this?

  1. You can index multiple fields in the embeddings, so the author etc can already be in there. However, at the moment it'll just blindly do the value, without any context of the label. That might be a worthwhile option (i.e. prefix with label). I think ensuring chunking doesn't split up small fields would also be wanted.
  2. You can index the rendered entity in the embeddings, so you can set up an "embedding" display mode, then use entity view displays to set up the exact rendering you want, including/excluding labels as appropriate.
πŸ‡¬πŸ‡§United Kingdom andrewbelcher

Yes, i think so. We're running that patch in prod and it is working nicely.

πŸ‡¬πŸ‡§United Kingdom andrewbelcher

Switched to needs work pending tests.

Have also added a new patch that fixes some indentation and an error when a menu item is not translatable.

In terms of items we need to test this with:

  • Menu is/is not translatable
  • Item is/is not a MenuLinkContent

Test should validate:

  • langcode only included on output for MenuLinkContent on translatable menus
  • Cacheability is correct in all scenarios
  • No errors for non-translatable menu links (the bug I've just fixed)
πŸ‡¬πŸ‡§United Kingdom andrewbelcher

@scot_euser thanks for this!

I think rather than swapping out Pinecone for Milvus as the only supported backend, it would be better to use an interface that can be implemented and then make the places that use the index capable of working with any backend that implements the interface. This will also pave the way for SOLR to be added into the mix once it has support for SOLR when it has the appropriate vector depth.

I think there is a wider issue of lots of modules that overlap. OpenAI now has a VectorClientInterface that overlaps a lot with what Search API AI is aiming to do. I think leveraging Search API to achieve the tracking/indexing makes more sense than OpenAI doing that itself. It's also agnostic to OpenAI itself, as you may use an alternative method for embedding.

My suggestion would be that we:

  • Have a separate, lightweight library β†’ for the interfaces. This will make it easier for modules such as Search API SOLR to support AI without a full module dependency. This could be done at a later date when SOLR is ready if we don't want the additional overhead right now.
  • Abstract the relevant methods required for AI specific tasks into an interface. The Pinecone and Milvus backends would then implement that.
  • Where checking which indexes are possible to use, we check for the interface rather than the specific backend.
  • OpenAI could then make use of Search API AI to handle the vector repositories etc, and provide an embedding plugin for Search API AI to be configured with (i.e. #3361507 could again provide an interface, OpenAI can then have it's implementation).

I see this PR also fixes some compatibility issues with free Pinecode / OpenAI's updates. I believe those are the same that are in #3403561? I'll try and get that reviewed and merged in the next few days, that can keep this issue a bit cleaner!

πŸ‡¬πŸ‡§United Kingdom andrewbelcher

This feature actually paves way to fix a bug, which is that there are a couple broken bits of cache when using menu item content:

  1. The JSON:API normalisation cache uses the language of the ResourceObject as part of it's cache key, resulting in normalisation getting cached incorrectly for language specific requests including translated menu item content.
  2. The cacheability doesn't vary enough for the language content. Adding a cacheable dependency on the content (which will also resolve cache tags etc) fixes that.

As such, have updated the patch and updated the category/priority of this issue!

πŸ‡¬πŸ‡§United Kingdom andrewbelcher

Here is a starter on an approach for the persistent cache. It caches based on the resource type and include path (parts, concatenated and hashed).

Based on the performance testing below, there is definitely potential with warm caches, but a hit with cold caches.

Some initial local performance test (with memcache as the default backend):

Baseline: 325ms

Patch as is:
- With cold caches: 1,530ms
- With warm caches: 137ms

Using cache.jsonapi_resource_types (chained memory & default):
- With cold caches: 1,100ms
- With warm caches: 146ms

As above but with the database as the default backend:

Baseline: 340ms

Patch as is:
- With cold caches: 2,460ms
- With warm caches: 175ms

Using cache.jsonapi_resource_types (chained memory & default):
- With cold caches: 2,550ms
- With warm caches: 177ms

πŸ‡¬πŸ‡§United Kingdom andrewbelcher

Have opened a MR with a fix for this. The issue is actually due to the checkbox that gets exposed and therefore submits a value when you submit the views exposed form. To fix this, I am excluding the filter value from the checkbox, which acts as a reset in the URL.

πŸ‡¬πŸ‡§United Kingdom andrewbelcher

andrewbelcher β†’ made their first commit to this issue’s fork.

Production build 0.69.0 2024