Restrict RAG content based on context like group, taxonomy, or user

Created on 15 October 2024, 6 months ago

Problem/Motivation

RAG is focused on letting users upload data (PDFs, spreadsheets, etc.) locally so that the LLM they use can draw on that data specifically.

This question concerns the uploaded content. Can we restrict the LLM's use of RAG content to the user who uploaded it, and perhaps to a group or other context, so that no one else on the system without access can use that content?

I am using http://drupal.org/project/group but I should be able to use any context like taxonomy and modules like http://drupal.org/project/tac to restrict access to uploaded content.

💬 Support request
Status

Active

Version

1.0

Component

AI Core module

Created by

🇺🇸 United States SocialNicheGuru


Comments & Activities

  • Issue created by @SocialNicheGuru
  • 🇩🇪 Germany marcus_johansson

    There is a setting in Search API that you can set when running the query, though I think it's off in most cases. If you set search_api_bypass_access to false, then each result from AI Search gets an entity access check to confirm that the user has view access to it.

    This will of course be very slow, because it loads each entity and checks access to it individually. If that leaves too few results, it runs another search, trying up to 10 times to see whether it gets enough content or the results drop below some threshold.

    So it's not recommended, unless it's for a few editor users or something similar, but it would be the most secure way of doing it.
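A minimal sketch of the post-retrieval access check described above, written in plain Python rather than Drupal code. The names `Hit`, `search_with_access_check`, `toy_search_page`, and `toy_can_view` are hypothetical and only illustrate the idea: fetch a page of ranked hits, check view access on each one individually, and keep fetching (up to 10 attempts, as mentioned above) until enough permitted results are collected. The score threshold mentioned in the comment is left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Hit:
    """One ranked search result (hypothetical shape, for illustration only)."""
    entity_id: int
    score: float


def search_with_access_check(
    search_page: Callable[[int, int], List[Hit]],
    can_view: Callable[[int], bool],
    wanted: int,
    page_size: int = 5,
    max_attempts: int = 10,
) -> List[Hit]:
    """Collect up to `wanted` hits that the current user may view.

    Each hit is checked one by one, which is why this approach is slow:
    the cost grows with the number of hits inspected, not the number kept.
    """
    if wanted <= 0:
        return []
    allowed: List[Hit] = []
    for attempt in range(max_attempts):
        # Ask the search backend for the next page of ranked hits.
        hits = search_page(attempt * page_size, page_size)
        if not hits:
            break  # The backend has nothing more to offer.
        for hit in hits:
            if can_view(hit.entity_id):  # Per-entity access check.
                allowed.append(hit)
                if len(allowed) >= wanted:
                    return allowed
    return allowed


# Toy data: 20 ranked hits, of which only even-numbered entities are viewable.
RANKED = [Hit(entity_id=i, score=1.0 - i / 100) for i in range(1, 21)]


def toy_search_page(offset: int, limit: int) -> List[Hit]:
    return RANKED[offset:offset + limit]


def toy_can_view(entity_id: int) -> bool:
    return entity_id % 2 == 0


results = search_with_access_check(toy_search_page, toy_can_view, wanted=3)
print([hit.entity_id for hit in results])  # prints [2, 4, 6]
```

In this toy run, six access checks are needed to return three results, which hints at why the approach scales poorly when most content is restricted.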

    The second way is that AI Search now allows filters, so if you index the fields you need to filter on and use Views or a query in code that applies them, the results are filtered accordingly.
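The filter-based approach can be illustrated with a small in-memory sketch. It assumes each indexed chunk carries a `group_ids` field (a hypothetical name standing in for whatever access context is indexed, such as group or taxonomy term IDs) and that the query restricts candidates to the user's groups before ranking by vector similarity. This is plain Python, not the AI Search API.

```python
import math
from dataclasses import dataclass
from typing import FrozenSet, List, Sequence


@dataclass(frozen=True)
class Chunk:
    """An indexed piece of content with its embedding and access metadata."""
    text: str
    vector: Sequence[float]
    group_ids: FrozenSet[int]  # Hypothetical indexed filter field.


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def filtered_search(
    chunks: List[Chunk],
    query_vector: Sequence[float],
    user_groups: FrozenSet[int],
    top_k: int = 3,
) -> List[Chunk]:
    """Filter by indexed group membership first, then rank by similarity."""
    candidates = [c for c in chunks if c.group_ids & user_groups]
    candidates.sort(key=lambda c: cosine(c.vector, query_vector), reverse=True)
    return candidates[:top_k]


INDEX = [
    Chunk("Budget spreadsheet", (1.0, 0.0), frozenset({1})),
    Chunk("Team handbook", (0.9, 0.1), frozenset({2})),
    Chunk("Shared glossary", (0.0, 1.0), frozenset({1, 2})),
]

# A member of group 2 never sees the group 1 spreadsheet,
# even though it is the closest match to the query vector.
found = filtered_search(INDEX, (1.0, 0.0), frozenset({2}))
print([c.text for c in found])  # prints ['Team handbook', 'Shared glossary']
```

Because the filter runs on indexed metadata, no entity has to be loaded per result, but the index is only as accurate as its last update: access changes need a reindex before they take effect.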

    Hope this answers your question.

  • 🇺🇸 United States SocialNicheGuru

    OK so setting up search_api_ai is crucial to using the correct documentation and ensuring only content accessible to the user is used in answering AI queries and used by AI agents, etc.

    Thank you.

  • 🇬🇧 United Kingdom scott_euser

    Two things allow you to do this more efficiently:

    1. The filters Marcus mentioned, though support for filters is somewhat limited in most vector databases; i.e., don't expect all the features of Views filters or Views exposed filters.
    2. Use the combined approach (see docs), where Search API Database or Solr handles the filtering and the vector database is used simply to boost results.
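As a loose illustration of the combined approach in item 2, the sketch below lets a conventional index decide which documents match (keyword plus access filter) and uses vector similarity only to reorder those matches. All names here (`Doc`, `combined_search`, the `boost` weight, the precomputed `vector_scores`) are assumptions for illustration; how AI Search actually combines scores is described in its docs and may differ.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Doc:
    """A document in the primary (database or Solr style) index."""
    doc_id: int
    text: str
    group_ids: FrozenSet[int]  # Hypothetical access field.


def combined_search(
    docs: List[Doc],
    keyword: str,
    user_groups: FrozenSet[int],
    vector_scores: Dict[int, float],
    boost: float = 2.0,
) -> List[Doc]:
    """Filter with the primary index, then boost ranking with vector scores."""
    needle = keyword.lower()
    # Step 1: the primary index decides membership (access and keyword match).
    matches = [
        d for d in docs
        if d.group_ids & user_groups and needle in d.text.lower()
    ]

    # Step 2: vector similarity only reorders; it never adds documents.
    def score(d: Doc) -> float:
        keyword_score = d.text.lower().count(needle)
        return keyword_score + boost * vector_scores.get(d.doc_id, 0.0)

    return sorted(matches, key=score, reverse=True)


DOCS = [
    Doc(1, "Drupal access control guide", frozenset({1})),
    Doc(2, "Access rules for groups in Drupal", frozenset({1})),
    Doc(3, "Secret access notes", frozenset({2})),
]
# Assumed similarity scores as a vector database might return them.
SIMILARITY = {1: 0.1, 2: 0.9, 3: 0.99}

ranked = combined_search(DOCS, "access", frozenset({1}), SIMILARITY)
print([d.doc_id for d in ranked])  # prints [2, 1]
```

Document 3 has the highest similarity score but never appears, because the vector scores are consulted only after the primary index has applied the access filter.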

    At the moment I believe there is no way to use the results of a View in, e.g., AI Assistants, but that could be created as an AI Assistant Action plugin. See Drupal\ai_search\Plugin\AiAssistantAction\RagAction as an example action (i.e., the action the assistant uses without Views in between).

  • 🇬🇧 United Kingdom scott_euser

    > OK so setting up search_api_ai is crucial to using the correct documentation and ensuring only content accessible to the user is used in answering AI queries and used by AI agents, etc.
    >
    > Thank you.

    Not really, AI Search now supersedes that.

  • 🇺🇸 United States SocialNicheGuru

    Thanks. I need to go back and rewatch the videos to get a better understanding of how it all fits together now. There is great, rapid work being done with the AI module.

    AI_search should do what I thought search_api_ai did.

    OK so setting up ai_search is crucial to using the correct documentation and ensuring only content accessible to the user is used in answering AI queries and used by AI agents, etc.

    I will investigate ai_search setup.

  • 🇬🇧 United Kingdom scott_euser

    Conveniently, on Thursday I'll be running a talk/discussion on it, https://www.drupal.org/community/events/drupal-ai-meetup-2024-10-17 (virtual), in case it's helpful.

    Also a bit of an overview in the docs here: https://project.pages.drupalcode.org/ai/modules/ai_search/

  • 🇺🇸 United States SocialNicheGuru

    @scott_euser, what is the actual start time? The page lists both 3pm CEST and 7pm CEST. Thanks!

  • 🇬🇧 United Kingdom scott_euser

    Yeah, it's super confusing. I messaged Nico about it and we came to the conclusion that the auto-date is buggy, so he manually wrote out all the times; i.e., the manually written 7pm CEST is right. The auto date (which is 3pm CEST for you, but for me, for example, says 8pm CEST) is wrong.

  • 🇩🇪 Germany marcus_johansson

    Was this solved, @socialnicheguru, and can it be closed?
