Extend DomSelect and Dom to handle web scraping

Created on 8 November 2023, 8 months ago
Updated 10 November 2023, 8 months ago

Problem/Motivation

For a project, I needed to extend/override the DomSelect and Dom plugins to handle a use case where content had to be pulled from the page rather than from a field. Scraping content from the fully rendered web page seems to have many uses and isn't well supported by the existing plugins.

It only takes a few modifications to get there, though. I've created this issue after being encouraged to contribute this work back.

Steps to reproduce

Proposed resolution

Remaining tasks

User interface changes

API changes

Data model changes

✨ Feature request
Status

Active

Version

6.0

Component

Plugins

Created by

🇺🇸 United States cosmicdreams Minneapolis/St. Paul


Comments & Activities

  • Issue created by @cosmicdreams
  • First commit to issue fork.
  • 🇺🇸 United States nicxvan

    The main change is to have DomSelect return the DOMElement rather than rendered HTML, and to update Dom to handle a DOMElement rather than a string.

    An open question is how to handle the case where the select doesn't find the path. In some instances you would want processing to keep going, and in others you'd want to skip the process.

  • 🇺🇸 United States nicxvan
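
The approach discussed in the comments above (the selector returns the matched element instead of a rendered string, and a later step works on that element) can be sketched in a language-neutral way. The following is a minimal Python illustration using only the standard library; it is not the Drupal plugin code. The names `select_element`, `remove_children`, `render`, `SkipRow`, and the `on_missing` option are all hypothetical, and the sketch assumes the input is well-formed XML/XHTML, which real scraped pages often are not.

```python
import xml.etree.ElementTree as ET
from typing import Optional


class SkipRow(Exception):
    """Hypothetical signal that the current item should be skipped."""


def select_element(markup: str, path: str,
                   on_missing: str = "continue") -> Optional[ET.Element]:
    """Return the matched element itself rather than its serialized markup.

    on_missing is an assumed option covering the open question from the
    thread: "continue" returns None so later steps can decide what to do,
    while "skip" raises SkipRow to abandon the item.
    """
    root = ET.fromstring(markup)
    element = root.find(path)
    if element is None:
        if on_missing == "skip":
            raise SkipRow(f"No element matched {path!r}")
        return None
    return element


def remove_children(element: ET.Element, tag: str) -> ET.Element:
    """A later step that manipulates the element directly (no re-parsing)."""
    # Copy the list first, since we mutate the element while iterating.
    for child in list(element.findall(tag)):
        element.remove(child)
    return element


def render(element: ET.Element) -> str:
    """Serialize to a string only at the very end of the pipeline."""
    return ET.tostring(element, encoding="unicode")
```

Passing the element between steps avoids serializing and re-parsing the markup at each stage. Whether a missing path should continue or skip is left to the caller here, which mirrors the open question rather than settling it.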