For incoming requests with facet query parameters, add robots metatag with noindex value

Created on 18 November 2019, over 4 years ago
Updated 21 June 2024, 6 days ago

Problem/Motivation

Our project team has been asked to add the robots metatag with the noindex value for incoming requests with facet query parameters, to prevent search engines from indexing those pages and penalizing the site for duplicate content.

While the canonical URL points to the base path, we are asked to take this conservative approach to protect against SEO penalties.
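
As a rough, language-neutral illustration of the requested check (not the patch discussed in the comments), the sketch below decides whether a request URL carries facet query parameters and, if so, returns a robots metatag with the noindex value. The function names are hypothetical, and the `f[...]` key is an assumption taken from the example URLs quoted later in this issue; a site may use a different key.

```python
import re
from urllib.parse import parse_qsl, urlsplit

# Assumption: facet filters arrive as f[0], f[1], ... as in the URLs quoted
# in this issue. A site may be configured with a different key.
FACET_PARAM = re.compile(r"f\[\d*\]")


def has_facet_params(url: str) -> bool:
    """Return True if the query string carries at least one facet filter."""
    query = urlsplit(url).query
    # parse_qsl decodes percent-encoded keys, so f%5B0%5D is seen as f[0].
    pairs = parse_qsl(query, keep_blank_values=True)
    return any(FACET_PARAM.fullmatch(key) for key, _ in pairs)


def robots_meta_tag(url: str) -> str:
    """Return a noindex robots tag for faceted URLs, else an empty string."""
    if has_facet_params(url):
        return '<meta name="robots" content="noindex">'
    return ""


if __name__ == "__main__":
    for candidate in ("/majors", "/majors?f[0]=search:history"):
        print(candidate, "->", robots_meta_tag(candidate) or "(no robots tag)")
```

The base path stays indexable and only the faceted variants are marked noindex. A real implementation would presumably also need a setting to turn this off, since some sites want search engines to follow facet links.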

Proposed resolution

Remaining tasks

User interface changes

Not necessary, see comment #3.

API changes

None

Data model changes

Not necessary, see comment #3.

Release notes snippet

TBD

✨ Feature request
Status

Needs work

Version

3.0

Component

Code

Created by

🇺🇸 United States jasonawant New Orleans, USA


Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • 🇬🇧 United Kingdom mbatterton

    Patch #5 works very nicely for facets 3.0 as well. We find this quite important for SEO, as it helps stop Google from crawling pages that have facet queries in the URL. This is despite facet links already having nofollow on each link; apparently that is not enough.

  • 🇺🇸 United States uotonyh

    Added patch #5 to version 2.0.6 due to crawlers recursing into links used for the autocomplete widget.

    We do want the page to be indexed ONCE, but bots are not smart, and they will just keep going.

    Part of the problem is how the autocomplete builds on itself. With zero terms, the links look like this:

    "urls": {
      "history": "/majors?f[0]=search:history",
      "culture": "/majors?f[0]=search:culture",
      ...
    }

    If I look for 'philosophy' then the autocomplete URLs look like this:

    "urls": {
      "history": "/majors?f[0]=search:history&f[1]=search:philosophy",
      "culture": "/majors?f[0]=search:culture&f[1]=search:philosophy",
      ...
    }

    And the crawler will just keep going.

  • Status changed to Needs work 9 months ago
  • 🇩🇪 Germany mkalkbrenner 🇩🇪

    This behavior needs to be configurable.
    We use facet blocks as menus and want the search engine to follow the links.

  • 🇬🇧 United Kingdom danharper

    Looking for this feature in 3.x

    We're getting hammered by bots hitting uncached pages with various combinations of facets that rarely get used.
