Track Summary
This issue outlines the scope, technical specifications, and user stories for a new AI Image and Media Agents Track, designed to integrate AI capabilities into the Drupal Media module and ecosystem.
- Description: Empower the Media library with AI to make it easier to find specific media and to improve the creation of image metadata that specifically takes advantage of what AI Search can offer. It may include tools for transforming images directly in Drupal.
- Workstream: 1A: Smart Content/Page Creation. Allows AI Agents to find images in the library and insert them into the landing page, bringing landing pages to life.
- Business Value: Provides a collection of tools addressing many issues people have with manipulating images and finding them.
- Lead: ???
Issues and modules included:
Deadlines
Introduction and Scenarios
The Drupal AI Image and Media agents will enhance the Media library by allowing users to interact with it through natural language. This feature will utilize Drupal AI agents and sub-agents to interpret user commands and translate them into specific media searches and operations.
Sample scenarios include:
- An event organizer wants to bring speaker headshots into their event landing page.
- A non-profit organization wants to find a stock image in their library that really communicates the feel of their new initiative and use it on their landing page.
- A content creator needs help taking an existing image in the library and fitting it into a specific landing page, including getting a good focus, size, etc. at all the different responsive breakpoints.
These scenarios are created based on https://new.drupal.org/assets/2025-06/Drupal-AI-Strategy-June-25_0.pdf.
Scope
The project's scope covers:
- Development of the Drupal AI Media Finder Agent for finding specific media. This can be used as a chatbot, within the XB AI Agent workflow, or directly through a UI. This includes improving the metadata within a media entity to enable AI to find things more easily.
- Integration with the existing Drupal AI Agent framework.
- Identification and conversion of key parameters from natural language input (e.g., which filters should be used for search based on the goal and intention of the user).
- Programmatic creation and configuration of media entities based on context and metadata to enable more powerful AI Search.
- A potential Media dashboard that can use AI to perform a variety of transforms and operations on existing media (lower priority).
Technical Specifications
The Drupal AI Image and Media Agents Track will process natural language instructions using an LLM to derive specific parameters.
An example instruction:
"TODO"
Extracted parameters from this instruction would include: TODO
- Content type: article
- Display type: block
- Style type: grid
- Style specific options: 3x3
- Fields to use: title, summary, image, link to the article
- Limit: 9
- Sort order: published date descending
- Content status: published
These parameters will then be passed to a function call plugin to find or update the media entity.
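As a rough illustration of the hand-off described above, the following sketch shows how parameters extracted by an LLM might be validated before they reach a function call plugin. It is not the Drupal AI module's API: the parameter names (`query`, `media_type`, `limit`), the allowed media types, and the limit cap are all hypothetical and chosen only for this example.

```python
import json
from dataclasses import dataclass

# Hypothetical values for illustration; a real site would read these from config.
ALLOWED_MEDIA_TYPES = {"image", "video", "document"}
MAX_LIMIT = 50


@dataclass(frozen=True)
class MediaSearchParams:
    """Validated parameters for a hypothetical media search tool."""
    query: str
    media_type: str = "image"
    limit: int = 10


def parse_llm_params(raw: str) -> MediaSearchParams:
    """Validate the JSON an LLM returned before any tool is called with it."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    query = str(data.get("query", "")).strip()
    if not query:
        raise ValueError("query must not be empty")

    media_type = data.get("media_type", "image")
    if not isinstance(media_type, str) or media_type not in ALLOWED_MEDIA_TYPES:
        raise ValueError(f"unsupported media_type: {media_type!r}")

    try:
        limit = int(data.get("limit", 10))
    except (TypeError, ValueError):
        raise ValueError("limit must be an integer")
    # Clamp rather than reject, so an over-eager LLM cannot request huge result sets.
    limit = max(1, min(limit, MAX_LIMIT))

    return MediaSearchParams(query=query, media_type=media_type, limit=limit)
```

Validating in one place like this means the tool itself only ever sees well-formed input, which also narrows the surface for the LLM inaccuracy and security risks listed later in this issue.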
User Stories
User stories covering the AI agent's abilities:
TODO (borrowed from Views)
Future Considerations
Future enhancements may include:
- Advanced filtering options (date ranges, taxonomy terms, user fields).
- Relationship support between content types.
- Contextual filters/arguments.
- Exposed filters.
- Internationalization for multi-language support.
Risks
Potential risks identified are:
- LLM inaccuracies.
- Complexity of deciding where this functionality should reside (the AI module, the Media module, or its own module?). For now, perhaps focus on a playground and forking everything?
- Performance overhead.
- Security vulnerabilities from natural language interpretation.
- Instability and breaking changes due to ongoing Drupal AI module development.
- Prompt injection from uploaded images/videos, where instructions are included as text hidden in the image.
Next Steps
The next steps involve:
TODO
- Review and approval of this document.
- Detailed technical design phase.
- Establishment of a comprehensive project plan and timeline.
- Commencement of development.
The detailed document can be found at https://docs.google.com/document/d/16yCLk7Q7WWbbqH9PxD6yJX9VufHNUqEWOGCW...
Remaining tasks
Issues and Epics
Phase 1: Core Search and Discovery for XB AI (Essential)
- Provide the AI Search Media Library as a tool/Agent - Expose AI search of media library as a tool that agents can use to create chatbots that can find media or other forms of automation.
- Problem: For AI Agents - Many problems that involve using AI to help build something will benefit from AI finding good images.
- Workstream: AI Agents can provide images to landing pages from the end user's own library, making pages feel more relevant instead of relying on placeholder images (1A: Smart Content/Page Creation)
- AI Search enabled Media Library - Enable the use of vector databases and/or agentic search to find images in the media library.
- Problem: For Content Editors - Often when adding an image you know conceptually what you want but Drupal's media library only allows searching by name, making it difficult to find images (e.g., a night time landscape picture).
- Workstream: We need to make it so humans can find things effectively before providing this to agents for automation (1A: Smart Content/Page Creation)
- Visual Review Agents - Have the ability to scan pages and take images/screenshots that can be saved and used elsewhere in a process or provided to AI to help AI improve the images/page layouts.
- Problem: For AI Agents - Currently when a user asks AI to produce something visual, AI will regularly get things wrong the first time. Humans have to keep going back to the AI to improve upon its results.
- Workstream: Important for all XB related agents to make pages look much better first time round (1A: Smart Content/Page Creation)
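To make the vector-database idea in Phase 1 more concrete, here is a minimal sketch of semantic ranking by cosine similarity. It assumes each media item already has an embedding vector derived from its metadata; the toy three-dimensional vectors, the media IDs, and the `search_media` helper are hypothetical, and a real setup would use an embedding model and a vector store rather than an in-memory dictionary.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_media(query_vector, index, top_k=3):
    """Return the IDs of the top_k media items most similar to the query vector."""
    scored = [
        (cosine_similarity(query_vector, vector), media_id)
        for media_id, vector in index.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [media_id for _, media_id in scored[:top_k]]


# Toy index: hypothetical media IDs mapped to made-up embedding vectors.
MEDIA_INDEX = {
    "media/1": [0.9, 0.1, 0.0],
    "media/2": [0.1, 0.9, 0.0],
    "media/3": [0.7, 0.3, 0.1],
}

results = search_media([1.0, 0.0, 0.0], MEDIA_INDEX, top_k=2)
```

The same ranking function could back both the editor-facing Media library search and the tool exposed to agents, which fits the goal of making search work for humans before handing it to automation.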
Phase 2: Improved Semantic Search of Media (Nice to Have)
- Media Metadata Overrides - Create the ability for metadata such as alt-text on a media entity to be overridden when used in a specific field/page.
- Problem: For Content Editors - Alt-text is created for the media entity and can't be changed per usage, but context-specific alt-text may be needed for different pages.
- Workstream: Analytics agents may need metadata on images for accessibility or search to change on specific pages (3: Performance Intelligence)
- AI Relevant and Generated Detailed Search Metadata - Create agents that can generate unstructured, lengthy and detailed metadata (use-cases, descriptions) that humans can edit and search can index.
- Problem: For AI Agents - AI Agents benefit from more context about images that humans just know. RAG search needs relevant detailed metadata to find good semantic matches.
- Workstream: AI Agents will provide more relevant images to the landing pages (1A: Smart Content/Page Creation)
- AI Generated Contextual Image Metadata - Using AI to generate AI-relevant metadata about the image from the content around where it was first uploaded.
- Problem: For Content Editors - People struggle to find information as search needs contextual information that is not on the media entity but on the page where it was first used (e.g., a headshot on a team page).
- Workstream: AI Agents will provide more relevant images to the landing pages (1B: Improvements - Smart Content/Page Creation)
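The metadata override idea in Phase 2 can be pictured as a simple fallback lookup: use the alt-text set for a specific usage if one exists, otherwise use the alt-text stored on the media entity. The sketch below is only a conceptual model under that assumption; the `MediaItem` class, the usage-key format, and the sample text are hypothetical and do not reflect how Drupal stores media fields.

```python
from dataclasses import dataclass, field


@dataclass
class MediaItem:
    """Hypothetical stand-in for a media entity with per-usage alt-text overrides."""
    media_id: str
    alt_text: str
    # Maps a usage key (for example "node/12:field_hero") to override alt-text.
    overrides: dict = field(default_factory=dict)

    def alt_for(self, usage_key: str) -> str:
        """Return the override for this usage, falling back to the entity default."""
        override = self.overrides.get(usage_key, "")
        # Blank or whitespace-only overrides are treated as "not set".
        return override.strip() or self.alt_text


item = MediaItem(
    media_id="media/7",
    alt_text="Portrait of a conference speaker",
    overrides={"node/12:field_hero": "Keynote speaker on stage at the event"},
)
```

Keeping the entity-level alt-text as the default means existing usages keep working unchanged, while a page that needs context-specific wording only has to supply its own override.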
Phase 3: Basic AI Image Manipulation (Aspirational)
- Image Augmentation AI Assistant + Agent - A chatbot assistant to help users understand available tools and provide ideas based on image purpose.
- Problem: For Content Editors/Agents - Users need help understanding what is possible with many different tools available.
- Workstream: Enables agents to change not just page content and layout but also the images themselves (3: Performance Intelligence)
- Responsive Image Styles and Resizing Agents - AI-assisted calculation and application of appropriate image styles for responsive designs.
- Problem: For Content Editors - Currently requires complicated math to figure out image sizing without distortion across multiple columns and responsive breakpoints.
- Workstream: Makes AI generated landing pages better use the image library across all design versions (3: Performance Intelligence)
- AI Image Crops - Using AI to optimize and intelligently crop images based on focal points and use case.
- Problem: For Content Editors/Agents - Image crops combined with focal point analysis could make images much nicer and more effective. People pre-crop images because Drupal tools are complicated.
- Workstream: Analytics agents may perform these operations to achieve more effective landing pages (3: Performance Intelligence)
- AI powered Focal Point Analysis - Create a tool that uses AI or ML to understand the likely focal point of an image for better cropping and resizing.
- Problem: For AI Agents - Many tools benefit from knowing focal points; resizing and cropping without this knowledge could crop out the useful section.
- Workstream: Makes other AI augmentation features more effective (3: Performance Intelligence)
- Space for Image Augmentation with AI - Create a unified interface for AI-powered image manipulation tools within Drupal.
- Problem: For Content Editors - AI provides many new tools but there needs to be an easy to use consistent space to use these tools.
- Workstream: We need to first provide tools to humans to ensure they work before automating them (3: Performance Intelligence)
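The cropping and focal point items in Phase 3 involve arithmetic that is easy to get wrong by hand. As a sketch of the underlying calculation, the function below picks the largest crop with a given aspect ratio and positions it so the focal point is as close to the center as the image bounds allow. The function name and signature are hypothetical, the focal point is assumed to be given in pixels, and detecting the focal point itself (the AI or ML part) is out of scope here.

```python
def crop_around_focal_point(width, height, focal_x, focal_y, target_ratio):
    """Return (left, top, crop_width, crop_height) for the largest crop with
    aspect ratio target_ratio (width / height), centered on the focal point
    where possible and shifted to stay inside the image otherwise."""
    if width <= 0 or height <= 0 or target_ratio <= 0:
        raise ValueError("dimensions and ratio must be positive")

    if width / height > target_ratio:
        # Image is wider than the target shape: keep full height, trim the sides.
        crop_h = height
        crop_w = round(height * target_ratio)
    else:
        # Image is taller than (or matches) the target shape: keep full width.
        crop_w = width
        crop_h = round(width / target_ratio)

    # Center on the focal point, then clamp so the crop stays inside the image.
    left = min(max(focal_x - crop_w // 2, 0), width - crop_w)
    top = min(max(focal_y - crop_h // 2, 0), height - crop_h)
    return left, top, crop_w, crop_h


# A 4000x3000 image with the subject near the right edge, cropped to a square.
square = crop_around_focal_point(4000, 3000, 3500, 1500, 1.0)
# The same image cropped to 16:9 for a wide hero banner.
banner = crop_around_focal_point(4000, 3000, 3500, 1500, 16 / 9)
```

Because the crop keeps the target aspect ratio, scaling it down to each responsive breakpoint afterwards does not distort the image, which is the concern raised in the responsive image styles item.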
Phase 4: Advanced AI Image Manipulation (or never)
- AI Image Generation Tools - Integration with specialized image generation tools like Midjourney for creating images from scratch.
- Problem: For Content Editors/Agents - There may be situations where generating images from scratch is appropriate, especially for full automation with human in the loop.
- Workstream: Could support automated page generation when appropriate images don't exist (3: Performance Intelligence)
- AI Image Transformation Suite (DreamStudio Integration) - Integration with AI augmentation tools including inpaint, outpaint, background removal/replacement, recolor, style transfer, upscale, and variations.
- Problem: For Content Editors/Agents - Many different image manipulation problems need solving with advanced AI tools.
- Workstream: Analytics agents may perform these operations to achieve more effective landing pages (Currently: NONE - For future consideration)