Add option to "Invalidate robots.txt cache on each cron run"

Created on 15 April 2025

Problem/Motivation

While hook_robotstxt() is an excellent mechanism for altering yourwebsite.com/robots.txt, its output is cached on most Drupal sites. As a result, the file may serve outdated data until the cache is manually cleared or the robotstxt cache tag is invalidated programmatically. This matters especially when lines are generated dynamically from database content or external APIs.
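
For reference, a module appends lines through this hook roughly as follows (the module name mymodule and the disallowed path are placeholders in this sketch):

    /**
     * Implements hook_robotstxt().
     *
     * Returns extra lines to append to robots.txt. A real implementation
     * might build these from database content or an external API.
     */
    function mymodule_robotstxt() {
      return [
        '# Added by mymodule.',
        'Disallow: /private',
      ];
    }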

Right now, the only action that invalidates this cache is submitting the module's configuration form at /admin/config/search/robotstxt.

In practice, content managers or admins are unlikely to know that this step is required to keep yourwebsite.com/robots.txt up to date. And realistically, no one wants to repeatedly save a config form or clear the entire Drupal cache just to refresh the content of the file.

Proposed resolution

  1. Add a configuration option: "Invalidate robots.txt cache on each cron run".
  2. Implement hook_cron() to read that setting and, when it is enabled, invalidate the robotstxt cache tag (see the sketch after this list):
    Cache::invalidateTags(['robotstxt']);
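
A minimal sketch of that hook, assuming the new setting is stored in robotstxt.settings under a hypothetical cron_invalidate_cache key:

    use Drupal\Core\Cache\Cache;

    /**
     * Implements hook_cron().
     *
     * Invalidates the robotstxt cache tag on every cron run when the
     * "Invalidate robots.txt cache on each cron run" option is enabled.
     */
    function robotstxt_cron() {
      // The 'cron_invalidate_cache' config key is an assumption of this sketch.
      if (\Drupal::config('robotstxt.settings')->get('cron_invalidate_cache')) {
        Cache::invalidateTags(['robotstxt']);
      }
    }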

This would provide a simple, automated way to ensure the robots.txt content stays current.

Remaining tasks

  1. Start the discussion
  2. Gather feedback
  3. Create the merge request

User interface changes

A new checkbox, "Invalidate robots.txt cache on each cron run", on the module's existing settings form at /admin/config/search/robotstxt.
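
The checkbox could be added to that form along these lines (the form array key and the cron_invalidate_cache config key are assumptions of this sketch):

    // Inside the settings form's buildForm().
    $form['cron_invalidate_cache'] = [
      '#type' => 'checkbox',
      '#title' => $this->t('Invalidate robots.txt cache on each cron run'),
      '#description' => $this->t('Clear the cached robots.txt output on every cron run so that lines added via hook_robotstxt() stay current.'),
      '#default_value' => $this->config('robotstxt.settings')->get('cron_invalidate_cache'),
    ];

The matching submitForm() would then persist the checkbox value back into robotstxt.settings.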

API changes

None.

Data model changes

None.

Category

Feature request

Status

Active

Version

1.6

Component

Code

Created by

camilo.escobar (Colombia)
