HTML to Markdown abstraction

Created on 4 July 2025

Problem/Motivation

At the moment, HTML to Markdown conversion is implemented with the league/html-to-markdown package. It does its job well, but it has a lot more to offer: there are settings that control how the conversion is done, it is possible to add more tag converters, and so on. Currently the package is used "as is", with its default (reasonable) settings.

Even though the league package is good, someone might want to use another tool or configure the league package differently. Also, when web components are used, not all HTML tags can be converted properly to Markdown, simply because the package is not aware of them or their purpose.

Proposed resolution

Create an abstraction layer that allows any HTML to Markdown conversion tool to be used, according to the user's preference. For example, there are already modules such as https://www.drupal.org/project/markdownify that expose HTML to Markdown conversion as a service with a pluggable structure, so that any tool can be used to convert markup to Markdown through a common interface. That module ships with the league package out of the box. I assume there are other such modules as well.
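To make the idea concrete, here is a minimal sketch of what such a common interface could look like in plain PHP. All class and interface names below (HtmlToMarkdownConverterInterface, LeagueHtmlToMarkdownConverter, StripTagsConverter, HtmlToMarkdownRegistry) are hypothetical and exist neither in the AI module nor in markdownify; only League\HTMLToMarkdown\HtmlConverter and its convert() method come from the league package. A real implementation would more likely use Drupal services or plugins, which is what remains to be discussed.

```php
<?php

/**
 * Hypothetical common interface that every conversion tool would implement.
 */
interface HtmlToMarkdownConverterInterface {

  /**
   * Converts an HTML string to Markdown.
   */
  public function convert(string $html): string;

}

/**
 * Adapter around league/html-to-markdown.
 *
 * Assumes the Composer package is installed; the class is only resolved
 * when the adapter is instantiated.
 */
final class LeagueHtmlToMarkdownConverter implements HtmlToMarkdownConverterInterface {

  private \League\HTMLToMarkdown\HtmlConverter $converter;

  public function __construct(array $options = []) {
    // Options are passed straight through to the league converter,
    // so a site can override the defaults used today.
    $this->converter = new \League\HTMLToMarkdown\HtmlConverter($options);
  }

  public function convert(string $html): string {
    return $this->converter->convert($html);
  }

}

/**
 * Deliberately naive alternative, included only to show that
 * implementations are interchangeable. It strips tags and does not
 * produce real Markdown.
 */
final class StripTagsConverter implements HtmlToMarkdownConverterInterface {

  public function convert(string $html): string {
    return trim(strip_tags($html));
  }

}

/**
 * Hypothetical registry that lets other code pick a converter by ID.
 */
final class HtmlToMarkdownRegistry {

  /** @var HtmlToMarkdownConverterInterface[] */
  private array $converters = [];

  public function add(string $id, HtmlToMarkdownConverterInterface $converter): void {
    $this->converters[$id] = $converter;
  }

  public function get(string $id): HtmlToMarkdownConverterInterface {
    if (!isset($this->converters[$id])) {
      throw new \InvalidArgumentException("Unknown converter: $id");
    }
    return $this->converters[$id];
  }

}
```

With something like this in place, a site could register `new LeagueHtmlToMarkdownConverter(['strip_tags' => true])` under one ID and a custom converter that understands its web components under another, and calling code would depend only on the interface.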

Remaining tasks

Discuss how the abstraction layer should be implemented:

  • possibly by using markdownify or some other module
  • or with a plugin manager, so that other modules can implement plugins for the AI module and take over the conversion process
  • ....
Feature request
Status

Active

Version

2.0

Component

AI Core module

Created by

🇩🇪Germany a.dmitriiev


