Question about form of llms.txt file

Created on 3 July 2025, 29 days ago

Problem/Motivation

In the documentation at https://llmstxt.org/#example, most of the links in the file end with .md. Is it better to include links that already point to the Markdown format, or does it not matter? For example, the menu links token from this module adds links to the pages, but if the markdownify module is also installed, should those links end with .md instead?

💬 Support request
Status

Active

Version

1.0

Component

Miscellaneous

Created by

🇩🇪 Germany a.dmitriiev


Comments & Activities

  • Issue created by @a.dmitriiev
  • 🇩🇪 Germany a.dmitriiev

    I checked some examples from https://llmstxt.site/ and those sites' llms.txt files do not actually link to the MD format. So what is the best approach? Will an LLM find the MD version on its own through link rel="alternate"?
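To make the link rel="alternate" idea concrete, here is a minimal sketch of what a client would have to do to discover a Markdown version from a page's HTML. It assumes the page advertises the alternate with type="text/markdown"; the helper name `find_markdown_alternates` and the sample URLs are hypothetical, and nothing here shows that any existing crawler actually does this.

```python
from html.parser import HTMLParser


class AlternateMarkdownFinder(HTMLParser):
    """Collects href values of <link rel="alternate" type="text/markdown"> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.markdown_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        # Attribute names arrive lowercased; values may be None for bare attributes.
        attr = {name: (value or "") for name, value in attrs}
        rel_tokens = attr.get("rel", "").lower().split()
        # Assumption: the Markdown alternate is marked with type="text/markdown".
        if "alternate" in rel_tokens and attr.get("type", "").lower() == "text/markdown":
            href = attr.get("href")
            if href:
                self.markdown_links.append(href)


def find_markdown_alternates(html):
    """Return the hrefs of all Markdown alternates declared in the given HTML."""
    finder = AlternateMarkdownFinder()
    finder.feed(html)
    return finder.markdown_links
```

A client would call `find_markdown_alternates(page_html)` after fetching the HTML page, which means it still has to download and parse HTML first. A direct .md link in llms.txt avoids that extra step.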

  • 🇭🇺 Hungary mxr576

    Drupal's ability to serve content in Markdown format appears to be unique among CMS platforms as highlighted in this Pronovix article about making developer portals AI-ready.

    However, I haven't seen evidence that current LLM crawlers automatically fetch .md versions of content or interpret link rel attribute values. Most crawlers already have established methodologies for mass site scanning, and the llms.txt specification is still relatively new in terms of widespread adoption.

    My recommendation: explicitly include the Markdown version of pages in your llms.txt file whenever possible, rather than relying on crawlers to discover them automatically.

    You're correct that the module's menu token doesn't currently expose URLs with enforced MD format. This connects to another requested feature for embedding entity links via tokens that point to MD versions (✨ Support dynamic entity linking using tokens, Active), which would address this limitation.
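As an illustration of that recommendation, the following sketch renders a small llms.txt whose entries link straight to Markdown URLs. It follows the layout shown on llmstxt.org (H1 title, blockquote summary, H2 sections with link lists); the function name `render_llms_txt`, the site name, and the example.com URLs are hypothetical and are not part of the module.

```python
def render_llms_txt(title, summary, sections):
    """Render a minimal llms.txt: H1 title, blockquote summary, H2 link lists.

    `sections` maps a heading to a list of (name, url, note) tuples;
    an empty note is omitted from the rendered entry.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            entry = f"- [{name}]({url})"
            if note:
                entry += f": {note}"
            lines.append(entry)
        lines.append("")
    # Collapse trailing blank lines into a single final newline.
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    print(render_llms_txt(
        "Example Site",  # hypothetical site name
        "Hypothetical developer portal.",
        {"Docs": [
            ("Getting started", "https://example.com/docs/getting-started.md", "Install and first steps"),
            ("API", "https://example.com/docs/api.md", ""),
        ]},
    ))
```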

  • 🇺🇸 United States christophweber

    Part of the overall llms.txt proposal is having all relevant site content available as Markdown:

    "We furthermore propose that pages on websites that have information that might be useful for LLMs to read provide a clean markdown version of those pages at the same URL as the original page, but with .md appended."

    Even though many example files do not contain MD links, my strong conviction is that all links in llms.txt should directly point to the MD version of the page, as the proposal implies. While current crawlers have been optimized to deal with HTML, future AI agents should not have to and ought to be able to fetch and parse MD only.
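The URL convention quoted from the proposal can be sketched as a small helper. The proposal also says that URLs without a file name should append index.html.md; the function name `markdown_url` and the example URLs are hypothetical, and this is only an illustration of the convention, not how the module or markdownify builds its links.

```python
from urllib.parse import urlsplit, urlunsplit


def markdown_url(page_url):
    """Derive the Markdown URL for a page under the llms.txt proposal's convention."""
    parts = urlsplit(page_url)
    path = parts.path
    if path == "" or path.endswith("/"):
        # No file name in the URL: the proposal appends index.html.md.
        path = (path or "/") + "index.html.md"
    elif not path.endswith(".md"):
        # Same URL as the original page, with .md appended.
        path += ".md"
    # Query string and fragment are carried over unchanged.
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))
```

For example, `markdown_url("https://example.com/docs/intro")` gives `https://example.com/docs/intro.md`, and `markdown_url("https://example.com/docs/")` gives `https://example.com/docs/index.html.md`.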

  • 🇩🇪 Germany a.dmitriiev

    Thank you all, it is clearer now.

  • Automatically closed - issue fixed for 2 weeks with no activity.
