Robots.txt module settings only work if there is no robots.txt file

Created on 16 January 2025

Problem/Motivation

The SEO tools recipe adds the RobotsTxt module to manage robots.txt via the UI. But the module only works if there is no robots.txt file in the web root. That file is provided by core, and it is needed by default because the SEO tools recipe is optional.

Steps to reproduce

  1. Install Drupal CMS
  2. Apply the SEO tools recipe
  3. See the warning on the status page, and see that the changes in Robots.txt settings do not have any effect

Proposed resolution

??

πŸ› Bug report
Status

Active

Component

Track: SEO

Created by

πŸ‡¦πŸ‡ΊAustralia pameeela


Comments & Activities

  • Issue created by @pameeela
  • πŸ‡ΊπŸ‡ΈUnited States thejimbirch Cape Cod, Massachusetts
  • πŸ‡¬πŸ‡§United Kingdom catch

    I don't think the robots.txt file can be removed from Drupal CMS - even if this is removed from the base recipe, the robots.txt module could be uninstalled, and then the site would be left with no robots.txt at all. Fine for sites to do it themselves, that's why it's part of scaffolding, but Drupal CMS isn't a site.

  • πŸ‡ΊπŸ‡ΈUnited States thejimbirch Cape Cod, Massachusetts

    Looks like there is an issue with a patch in the module's queue. Postponing on that.

  • Status changed to Postponed 10 days ago
  • πŸ‡¬πŸ‡§United Kingdom dunx

    I spotted the same 'RobotsTxt module works only if you remove the existing robots.txt file in your website root.' warning on the status screen this morning when doing some work on a fresh install of Drupal CMS based on D11.2 with just the SEO Recipe installed.

    Module description: Use this module when you are running multiple Drupal sites from a single code base (multisite) and you need a different robots.txt file for each one.
    Why is this module even part of the SEO recipe, which is very unlikely to be running in a multisite configuration?

    The easiest solution is to remove RobotsTxt from the SEO recipe in Drupal CMS. Its inclusion generates a warning that the target audience of Drupal CMS may miss or be unable to easily deal with. Unless the module description is wrong and it has other uses I'm not aware of.

  • πŸ‡ΊπŸ‡ΈUnited States Amber Himes Matz Portland, OR USA

    The recommendation in the RobotsTxt module's README is to update the project's composer.json like so:

    "extra": {
            "drupal-scaffold": {
                "locations": {
                    "web-root": "web/"
                },
                "file-mapping": {
                    "[web-root]/robots.txt": false
                }
            },
    

    ...to prevent Drupal Scaffold from placing robots.txt in the web root on installation and update.

    Would that be a viable option for Drupal CMS?
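    If Drupal CMS went this route, the composer.json edit could even be scripted so site builders don't have to touch the file by hand. A minimal Python sketch of that idea (the `exclude_robots_txt` helper name is hypothetical; it only adds the `file-mapping` entry the RobotsTxt README recommends):

    ```python
    import json
    import pathlib


    def exclude_robots_txt(composer_json_path):
        """Add the drupal-scaffold file-mapping entry that stops Composer
        from (re)placing robots.txt in the web root."""
        path = pathlib.Path(composer_json_path)
        data = json.loads(path.read_text())
        scaffold = data.setdefault("extra", {}).setdefault("drupal-scaffold", {})
        scaffold.setdefault("file-mapping", {})["[web-root]/robots.txt"] = False
        path.write_text(json.dumps(data, indent=4) + "\n")
    ```

    After running it once against the project's composer.json, the existing web/robots.txt would still need to be deleted manually; the mapping only prevents the scaffold plugin from restoring it on the next install or update.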

  • πŸ‡ΊπŸ‡ΈUnited States phenaproxima Massachusetts

    We could do that...but the problem is then that you don't have a robots.txt file if you either uninstall RobotsTxt, or never install it to begin with. If you're not comfortable at the command line (which is presumed to be the case in Drupal CMS's target audience), you're sort of screwed.

    To me, this is something RobotsTxt should handle on its own. It could maybe delete the existing robots.txt when it gets installed (or try to), and then automatically restore the scaffold version when it gets uninstalled. It could even use Package Manager for this, although that might be a bit of a heavy lift.

    Or maybe we could ship an ECA configuration which does the same thing (although it'd need write access to the filesystem, but it could just quietly fail with a log message if the web root is not writable).
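    The install-time behaviour suggested above can be sketched in a few lines of shell. This is illustration only, not the module's actual code: the throwaway web root and the messages are assumptions, and a real implementation would live in the module's install hook or an ECA model:

    ```shell
    # Demo setup: a throwaway web root containing a scaffold robots.txt
    # (in a real site this would be the Composer-managed web/ directory).
    WEBROOT=$(mktemp -d)/web
    mkdir -p "$WEBROOT"
    printf 'User-agent: *\n' > "$WEBROOT/robots.txt"

    # What RobotsTxt (or an ECA model) could do when the module is installed:
    # delete the scaffold file so the module's dynamic /robots.txt route takes
    # over, and quietly log a warning if the web root is not writable.
    if [ -e "$WEBROOT/robots.txt" ]; then
      if rm -f "$WEBROOT/robots.txt" 2>/dev/null && [ ! -e "$WEBROOT/robots.txt" ]; then
        echo "robots.txt removed; module route now serves /robots.txt"
      else
        echo "warning: could not remove $WEBROOT/robots.txt (web root not writable?)" >&2
      fi
    else
      echo "no scaffold robots.txt found; nothing to do"
    fi
    ```

    The uninstall path would be the mirror image: restore the scaffold file (or re-run the scaffold plugin) so the site is never left without a robots.txt.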

  • πŸ‡©πŸ‡ͺGermany jurgenhaas Gottmadingen

    I second that; this is something the RobotsTxt module should do itself, because it has the same issue on every Drupal site which uses it. Site owners need to be made aware that the scaffold file is preventing the module from doing its job, in case the module can't delete that file after installation, or in case the robots.txt file comes back during a composer update.

    The module should even show a warning message to admin users and/or show the issue in the status report using the requirements hook.

    This could be done by ECA, yes. But doing it in the module would serve all its users, not just some.

  • πŸ‡¦πŸ‡ΊAustralia pameeela
  • πŸ‡¬πŸ‡§United Kingdom dunx

    @jurgenhaas, for clarity, the module does display a 'RobotsTxt module works only if you remove the existing robots.txt file in your website root.' warning on the status screen. I came here because this module is included in the SEO recipe in Drupal CMS, and there's not a lot a 'marketer' can do about that message.

    Totally agree that the module should try to resolve the issue itself as already suggested.

  • πŸ‡ΊπŸ‡ΈUnited States Amber Himes Matz Portland, OR USA

    Thanks everyone for clarifying the issue. FYI, I've updated the Drupal CMS docs for SEO Checklist about configuring RobotsTxt with recommendations from the RobotsTxt module's README. The skills and permissions required to complete SEO Checklist module's action items range from basic site administration to developer-level skills, so I don't think it hurts to surface the information provided by RobotsTxt module in the Drupal CMS docs. If anything, it would help a site admin troubleshoot why their changes in the UI for RobotsTxt weren't having any effect.

    https://new.drupal.org/docs/drupal-cms/promote/using-seo-checklist-in-dr...
