Canonical (and other) links with multiple query parameters have ampersand encoded

Created on 1 May 2018, over 6 years ago
Updated 24 June 2023, over 1 year ago

Problem/Motivation

If a page has a URL like /blog?tag=security&page=4, the canonical URL will be rendered as /blog?tag=security&page=4.

This will cause problems with SEO and crawlers as they depend on canonical links to be valid. It also makes it impossible to meet Google's recommendations on how to indicate paginated content properly. This is why this issue is initially marked Major.

To reproduce this:

- Create a SimplyTest.me project using the MetaTags module
- Add the metatags field to the basic page content type
- Create a basic page and use the Metatag advanced setting to set the canonical URL to /blog?tag=security&page=4
- Open in Chrome and view the page source (Note: Inspect will hide the & value).
- The canonical href will have be encoded.

Proposed resolution

The underlying problem is in the HtmlTag code. It creates the tag markup using the Attribute class's __toString method. This method escapes all attribute labels and values.

The fix here is to handle link tag href attributes and script tag src attributes as special cases in the HtmlTag code.

The HREF and SRC attributes should not be added to the Attributes class. Instead, their values should be validated with UrlHelper::isValid(). If they fail validation, then they should be fully escaped. These attributes should be added to the tag directly. All other attributes should go thru the Attribute class.

Remaining tasks

- Write the patch
- Write some tests

User interface changes

None

API changes

None

Data model changes

None

🐛 Bug report
Status

Postponed: needs info

Version

11.0 🔥

Component
Render 

Last updated 3 days ago

Created by

🇺🇸United States cgmonroe

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • 🇺🇸United States smustgrave

    I tried replicating on a fresh D10 following the steps but was not able to reproduce.

    Wonder if there are additional steps.

  • Status changed to Active over 1 year ago
  • 🇦🇹Austria arthur_lorenz Vienna

    How to reproduce on a fresh D10 install:

    $options = [
      'query' => [
        'uri' => 'foo/bar',
        'page' => '/test',
      ],
    ];
    $url = new \Drupal\Core\Url('<front>', [], $options);
    (string) \Drupal::service('link_generator')->generate('Example link', $url)
    

    will result in

    <a href="/?uri=foo/bar&amp;page=/test">Example link</a>

  • 🇩🇪Germany metalbote Aachen

    Simple reroll as needed by one of our projects against 10.3.6

  • 🇮🇳India heykarthikwithu Bengaluru 🌍
    1. Able to reproduce the issue with Drupal 11.x & drupal/metatag:^2.1
    2. Updated the steps to reproduce in Issue Summary
    3. Changing the state to Needs work
  • 🇮🇳India heykarthikwithu Bengaluru 🌍
  • Merge request !10151Resolve #2968558 "Canonical and other" → (Open) created by heykarthikwithu
  • Pipeline finished with Failed
    3 months ago
    Total: 123s
    #335999
  • Pipeline finished with Failed
    3 months ago
    Total: 491s
    #336005
  • Pipeline finished with Failed
    3 months ago
    Total: 488s
    #336021
  • 🇵🇹Portugal joao.ramos.costa

    HI @heykarthikwithu , thank you for the mr.
    I'm not sure why scope is being reduced to link or src tags, when this also affects meta tags, for instance.
    Regards.

Production build 0.71.5 2024