- Issue created by @tonypaulbarker
This issue extends the issue "AI Media Discovery: Investigate JavaScript scanners as a method of extracting information from images" to look at contextual information from web pages.
We would like to extract information from web pages so that we have data that is not currently available to a media entity, such as:
- When a crop yields undesired results (e.g. faces cropped awkwardly from an image)
- Effects applied to an image in a certain context
- Overridden alt text and information provided by captions
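To make the kinds of data above more concrete, here is a minimal sketch of how a scanner could summarise the context of one rendered image. Everything in it is an assumption for illustration: the `RenderedImage`, `MediaDefaults` and `ImageContext` shapes and the `describeContext` function are hypothetical names, not part of Drupal or of any library. It assumes a scanner running in the page has already read the rendered `alt` attribute, any caption text, and the computed CSS `filter` value.

```typescript
// Hypothetical shapes for illustration; none of these are Drupal or library APIs.
interface MediaDefaults {
  alt: string; // alt text stored on the media entity
}

interface RenderedImage {
  alt: string | null;       // alt attribute as rendered on the page
  caption: string | null;   // e.g. text of an enclosing <figcaption>
  cssFilter: string | null; // computed CSS `filter` value, "none" when unset
}

interface ImageContext {
  altOverridden: boolean;
  renderedAlt: string | null;
  caption: string | null;
  effects: string[];
}

// Trim and collapse whitespace so cosmetic differences are not reported.
function normalise(text: string | null): string {
  return (text ?? "").trim().replace(/\s+/g, " ");
}

function describeContext(rendered: RenderedImage, media: MediaDefaults): ImageContext {
  const renderedAlt = rendered.alt === null ? null : normalise(rendered.alt);
  const filter = normalise(rendered.cssFilter);
  // Split "grayscale(1) blur(2px)" into separate function tokens.
  // Limitation: nested parentheses (e.g. drop-shadow with rgb()) are not handled.
  const effects: string[] =
    filter === "" || filter === "none" ? [] : (filter.match(/[a-z-]+\([^)]*\)/g) ?? []);
  return {
    // A missing alt attribute is reported as "not overridden" here; that is a design choice.
    altOverridden: renderedAlt !== null && renderedAlt !== normalise(media.alt),
    renderedAlt,
    caption: normalise(rendered.caption) || null,
    effects,
  };
}
```

In a browser, the inputs could plausibly come from `img.getAttribute("alt")`, the nearest `figcaption`, and `getComputedStyle(img).filter`; how the result is stored is out of scope here.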
Capturing and analysing screenshots was discussed previously, but JavaScript tools may be an efficient and effective alternative: for example, the in-page scanning approach used by the Editoria11y module, or libraries such as OpenCV.js and face-api.js.
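As one illustration of the awkward-crop case, the sketch below checks whether a detected face is cut by a crop rectangle. It assumes face detection has already run on the original image (for example with a library such as face-api.js, whose detection boxes are assumed here to expose `x`, `y`, `width` and `height`) and that the crop is expressed in the same pixel coordinates. The `Box` type, the function names and the 0.95 threshold are hypothetical choices, not taken from any library.

```typescript
// Hypothetical rectangle shape in original-image pixel coordinates.
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

type CropVerdict = "intact" | "cut" | "excluded";

// Fraction of the face box area that lies inside the crop (0 to 1).
function visibleFraction(face: Box, crop: Box): number {
  if (face.width <= 0 || face.height <= 0) return 0;
  const overlapW =
    Math.min(face.x + face.width, crop.x + crop.width) - Math.max(face.x, crop.x);
  const overlapH =
    Math.min(face.y + face.height, crop.y + crop.height) - Math.max(face.y, crop.y);
  if (overlapW <= 0 || overlapH <= 0) return 0;
  return (overlapW * overlapH) / (face.width * face.height);
}

// "cut" marks faces that are only partly visible, the case worth flagging.
function classifyFace(face: Box, crop: Box, intactThreshold = 0.95): CropVerdict {
  const fraction = visibleFraction(face, crop);
  if (fraction === 0) return "excluded";
  return fraction >= intactThreshold ? "intact" : "cut";
}
```

A face that is entirely outside the crop is reported separately as "excluded", since leaving a person out may be intentional while a half-visible face usually is not.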
How to store and retrieve the data captured by these tools is out of scope for this issue. Perhaps the data can be stored in a vector database (see "AI Media Discovery: Store and retrieve extended media data in vector database").
Explore how JavaScript tools might be used in combination with AI tools to extract such data from rendered web pages, from content saved in Drupal, or from content at the point of creation and editing.
- Investigate and explore capabilities of JS libraries and tools
- Investigate and explore capabilities of AI provider tools
- Document general findings
- Provide code examples and technical information that can be used to help realise stories in the AI media track.
Active
Planning