Problem/Motivation
Create an image-to-image operation type that works for all DreamStudio-style operations, including:
Upscale
Creative Upscale
Image to creative QR code
Inpaint
Outpaint
Remove Background
Erase Object
Sketch to Image
Structure to Image
Recolor
This means the operation type needs very flexible input and output. The only constant requirements are that an image goes in and an image comes out. For upscale models, for instance, that might be the only required data.
More complex operations such as Inpaint might also require a prompt or even a mask.
We create feature flags for models, similar to how support for tool calling is flagged, for instance. This means that getModels for providers that support image-to-image has to be very flexible.
This operation type will work very badly for tasks that rely on a default operation type. If that becomes important, for instance for upscaling, we would add a pseudo default operation type called image_to_image_upscale or something similar.
Because it is unclear during scoping what exactly will be needed, this will be developed together with the AI API Explorer as part of this issue, with the DreamStudio provider used as the example for most use cases.
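To make the feature-flag idea concrete, here is a minimal, language-neutral sketch in Python (not the module's PHP). Every name in it (ImageToImageCapability, ModelInfo, MODELS, get_models and the model ids) is a hypothetical stand-in chosen for illustration, not an existing API. It only shows how a model list might be filtered by the capabilities a caller needs.

```python
from dataclasses import dataclass
from enum import Flag


class ImageToImageCapability(Flag):
    """Hypothetical per-model feature flags; names are illustrative only."""
    NONE = 0
    REQUIRES_PROMPT = 1
    REQUIRES_MASK = 2
    UPSCALE = 4


@dataclass(frozen=True)
class ModelInfo:
    model_id: str
    capabilities: ImageToImageCapability


# Made-up model list standing in for what a provider might report.
MODELS = [
    ModelInfo("upscale", ImageToImageCapability.UPSCALE),
    ModelInfo(
        "inpaint",
        ImageToImageCapability.REQUIRES_PROMPT | ImageToImageCapability.REQUIRES_MASK,
    ),
    ModelInfo("remove_background", ImageToImageCapability.NONE),
]


def get_models(required=ImageToImageCapability.NONE):
    """Return the ids of models that carry every requested flag."""
    return [m.model_id for m in MODELS if (m.capabilities & required) == required]
```

Calling get_models() with no argument returns every model, while get_models(ImageToImageCapability.REQUIRES_MASK) narrows the list to models that declare a mask requirement.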
Steps to reproduce
Proposed resolution
Create ImageToImage operation type directory
Create ImageToImageInput that requires an abstract Image
Add Helper Traits for media
Create ImageToImageOutput that returns an abstract Image
Create ImageToImageInterface with imageToImage method
Add a method called requiresImageToImageMask that returns a boolean
Add a trait for ImageToImage that sets this to false by default
Explore what the input actually needs, but make all of those fields optional. The providers have to throw an exception when something is missing and make sure that the model configs have required fields for mask, prompt, etc.
Create an AI API Explorer plugin for this.
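The steps above can be sketched roughly as follows. This is a language-neutral illustration in Python rather than the module's PHP, and it rests on several assumptions: the Image class, the prompt and mask field names, the model_id parameter, the mixin standing in for the trait, and both example providers are hypothetical. The real input fields are still to be explored as part of this issue.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


@dataclass
class Image:
    """Stand-in for the abstract image object (hypothetical)."""
    binary: bytes
    mime_type: str = "image/png"


@dataclass
class ImageToImageInput:
    image: Image                  # the only field that is always required
    prompt: Optional[str] = None  # assumed optional field
    mask: Optional[Image] = None  # assumed optional field


@dataclass
class ImageToImageOutput:
    image: Image


class ImageToImageInterface(ABC):
    @abstractmethod
    def image_to_image(self, payload: ImageToImageInput, model_id: str) -> ImageToImageOutput:
        ...

    @abstractmethod
    def requires_image_to_image_mask(self, model_id: str) -> bool:
        ...


class ImageToImageDefaults:
    """Mixin playing the role of the trait: no mask required by default."""

    def requires_image_to_image_mask(self, model_id: str) -> bool:
        return False


class UpscaleProvider(ImageToImageDefaults, ImageToImageInterface):
    """Keeps the default: an image in, an image out, nothing else needed."""

    def image_to_image(self, payload, model_id):
        # A real provider would call its API here; this just echoes the image.
        return ImageToImageOutput(image=payload.image)


class InpaintProvider(ImageToImageDefaults, ImageToImageInterface):
    """Overrides the default and rejects input that lacks a mask."""

    def requires_image_to_image_mask(self, model_id):
        return model_id == "inpaint"

    def image_to_image(self, payload, model_id):
        if self.requires_image_to_image_mask(model_id) and payload.mask is None:
            raise ValueError(f"Model '{model_id}' requires a mask.")
        return ImageToImageOutput(image=payload.image)
```

Keeping every field except the image optional pushes validation into the provider, which is the only place that knows what a given model actually needs.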
Remaining tasks
User interface changes
API changes
Data model changes