For incoming requests with facet query parameters, add robots metatag with noindex value

Created on 18 November 2019, about 5 years ago
Updated 19 September 2024, 2 months ago

Problem/Motivation

Our project team has been asked to add a robots metatag with the value noindex to incoming requests that carry facet query parameters, so that search engines do not index those pages and penalize the site for duplicate content.

While the canonical URL points to the base path, we are asked to take this conservative approach to protect against SEO penalties.
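
To make the requested behavior concrete, here is a minimal sketch in Python. It is not the patch attached to this issue and not the Facets module's code. It assumes the facet query parameter is named `f` (as in the `f[0]=...` URLs quoted in the comments), and the helper names `has_facet_params` and `robots_meta` are hypothetical.

```python
import re
from typing import Optional
from urllib.parse import parse_qsl, urlsplit

# Matches facet keys such as "f[0]" or "f[12]". The parameter name "f" is an
# assumption taken from the URLs quoted in this thread.
FACET_KEY = re.compile(r"^f\[\d*\]$")


def has_facet_params(url: str) -> bool:
    """Return True if the URL's query string contains any facet parameter."""
    query = urlsplit(url).query
    # parse_qsl percent-decodes keys, so "f%5B0%5D" is seen as "f[0]".
    pairs = parse_qsl(query, keep_blank_values=True)
    return any(FACET_KEY.match(key) for key, _ in pairs)


def robots_meta(url: str) -> Optional[str]:
    """Return a noindex robots metatag for faceted URLs, otherwise None."""
    if has_facet_params(url):
        return '<meta name="robots" content="noindex">'
    return None
```

With this sketch, `robots_meta("/majors?f[0]=search:history")` yields the noindex tag, while `robots_meta("/majors")` returns `None` and the base page stays indexable.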

Proposed resolution

Remaining tasks

User interface changes

Not necessary, see comment #3.

API changes

None

Data model changes

Not necessary, see comment #3.

Release notes snippet

TBD

Feature request
Status

Needs review

Version

3.0

Component

Code

Created by

🇺🇸United States jasonawant New Orleans, USA


Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • 🇬🇧United Kingdom mbatterton

    Patch #5 works very nicely for Facets 3.0 as well. We find this quite important for SEO, as it helps stop Google from crawling pages that have facet queries in the URL. Facet links already have nofollow on each link, but apparently that is not enough.

  • 🇺🇸United States uotonyh

    Added patch #5 to version 2.0.6 due to crawlers recursing into links used for the autocomplete widget.

    We do want the page to be indexed ONCE, but bots are not smart, and they will just keep going.

    Part of the problem is how the autocomplete builds on itself. With zero terms, the links look like this:

    "urls": {
      "history": "/majors?f[0]=search:history",
      "culture": "/majors?f[0]=search:culture",
      ...
    }

    If I look for 'philosophy' then the autocomplete URLs look like this:

    "urls": {
      "history": "/majors?f[0]=search:history&f[1]=search:philosophy",
      "culture": "/majors?f[0]=search:culture&f[1]=search:philosophy",
      ...
    }

    And the crawler will just keep going.

  • Status changed to Needs work about 1 year ago
  • 🇩🇪Germany mkalkbrenner 🇩🇪

    This behavior needs to be configurable.
    We use facet blocks as menus and want the search engine to follow the links.

  • 🇬🇧United Kingdom danharper

    Looking for this feature in 3.x

    We're getting hammered by bots hitting uncached pages with various combinations of facets that rarely get used.

  • 🇮🇳India credevator

    Patch #5 fails after upgrading to version 2.0.8.
    I also have to use #254 (Facets with AJAX not working in most of situations, Needs review) to fix issues with the Drupal 10.3.1 upgrade.
    I have updated this patch to work with 2.0.8 and patch #254.

  • Status changed to Needs review 2 months ago
  • 🇲🇾Malaysia ckng

    We're also using facet links in a menu, which we would like to be indexed. Some facets may need to be excluded from noindex, nofollow, or other settings.

    So instead of applying noindex across the board based on the 'f' parameters, I have taken another approach: checking the active filter(s). I also added configuration to the block settings to allow selecting the robots meta value. It defaults to 'all', i.e. do nothing, and 'all' is excluded from the final robots value, as we don't want to mix 'all' with 'noindex' or the other options.

    This should work with Facets Pretty Paths, though that is not tested.

  • 🇲🇾Malaysia ckng

    Same patch from #12 but for 3.0.x.

  • 🇳🇱Netherlands rudy de kok

    Same patch as #11 but without the changes from #254 (Facets with AJAX not working in most of situations, Needs review).
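
The configurable approach described in the comments above (a per-facet robots setting that defaults to 'all', with 'all' left out of the final value) can be sketched roughly as follows. This is an illustration in Python under stated assumptions, not the code from any patch on this issue: the function name `combined_robots`, the settings layout, and the facet IDs are all hypothetical.

```python
from typing import Dict, Iterable, List, Optional


def combined_robots(
    active_facets: Iterable[str],
    settings: Dict[str, List[str]],
) -> Optional[str]:
    """Merge the robots directives of all active facets.

    `settings` maps a (hypothetical) facet ID to its configured directives.
    A facet without configuration is treated as ["all"], i.e. do nothing.
    """
    directives: List[str] = []
    for facet_id in active_facets:
        for directive in settings.get(facet_id, ["all"]):
            # "all" is never emitted, so it cannot end up next to "noindex".
            if directive != "all" and directive not in directives:
                directives.append(directive)
    # None means: do not add a robots metatag for this request.
    return ", ".join(directives) if directives else None


# Hypothetical configuration: a search facet that should not be indexed,
# and a menu-style facet that search engines may index and follow.
settings = {
    "search": ["noindex", "nofollow"],
    "menu_topics": ["all"],
}
```

Here `combined_robots(["menu_topics"], settings)` returns `None`, so menu-style facet pages remain indexable, while any request with the `search` facet active gets `noindex, nofollow`.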
