Support Unicode regular expressions in routes, from Symfony 4.3

Created on 25 June 2023, over 1 year ago
Updated 10 April 2024, 8 months ago

Problem/Motivation

When defining a route, we can define a requirements clause, which may contain a regexp to match a route parameter, like:

g2.initial:
  path: '/g2/initial/{g2_initial}'
  defaults:
    _controller: '\Drupal\g2\Controller\Initial::indexAction'
    _title_callback: '\Drupal\g2\Controller\Initial::indexTitle'
  requirements:
    _permission: 'access content'
    g2_initial: '[\p{Ll}\p{Lm}\p{Lo}\p{Lt}\p{Lu}\p{Mc}\p{Nd}\p{Nl}\p{No} _-]'

Since Symfony 4.3 (inclusive), these regular expressions are documented as supporting Unicode property classes: https://symfony.com/doc/6.4/routing.html#route-parameters. For these classes to match multi-byte characters, the regexp has to be compiled with the u modifier appended, as per https://www.php.net/manual/en/reference.pcre.pattern.modifiers.php

However, in UrlGenerator::doGenerate we use this line to match requirements:

          if (!preg_match('#^' . $token[2] . '$#', $mergedParams[$token[3]])) {

which means the pattern is compiled without the u modifier, so PCRE treats both the pattern and the subject as byte sequences rather than UTF-8 text, unlike Symfony ≥ 4.3.

As a consequence, generating a URL for any route whose requirement relies on these classes throws an InvalidParameterException when the parameter contains a multi-byte character like é.
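
The difference can be checked outside Drupal with a minimal sketch. It reuses the requirement from the route above and the same '#^' … '$#' construction as the line quoted from doGenerate; the variable names are illustrative only, not core code.

```php
<?php
// Requirement copied from the g2.initial route definition above.
$requirement = '[\p{Ll}\p{Lm}\p{Lo}\p{Lt}\p{Lu}\p{Mc}\p{Nd}\p{Nl}\p{No} _-]';

// Same construction as the quoted line: no "u" modifier, byte-wise matching.
$bytePattern = '#^' . $requirement . '$#';
// Illustrative variant with the "u" modifier: UTF-8 aware matching.
$unicodePattern = '#^' . $requirement . '$#u';

// "\u{E9}" is é, encoded in UTF-8 as the two bytes C3 A9.
$eAcute = "\u{E9}";

$results = [];
foreach (['a', $eAcute] as $value) {
  $results[$value] = [
    'without_u' => preg_match($bytePattern, $value),
    'with_u' => preg_match($unicodePattern, $value),
  ];
}

// 'a' matches in both modes; é only matches with the "u" modifier, because
// without it the subject is two bytes long and the pattern allows only one.
var_dump($results);
```

Without the modifier the character class consumes a single byte, so the anchored pattern cannot match the two-byte UTF-8 encoding of é.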

Steps to reproduce

  • Define a route like the one above.
  • In some code, e.g. using drush core-cli, try to generate links to /g2/initial/a and /g2/initial/é, like:
    $ug = \Drupal::urlGenerator();
    $ug->generateFromRoute('g2.initial', ['g2_initial' => 'a']);
    $ug->generateFromRoute('g2.initial', ['g2_initial' => 'é']);
    
  • Observe how the first call succeeds while the second throws an InvalidParameterException.

Note that the problem does not exist when routing incoming requests, as symfony/routing handles the regexp correctly for us.

Proposed resolution

Append the u modifier to the pattern built in UrlGenerator::doGenerate to support full Unicode matching.

Rerun the example above: now both calls work normally, generating:

"/g2/initial/a"
"/g2/initial/%C3%A9"

As expected.
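
A rough standalone sketch of the idea, not the actual core patch: the function g2_sketch_matches_requirement() is a hypothetical stand-in for the requirement check in doGenerate, and plain rawurlencode() is assumed to approximate the path encoding.

```php
<?php
/**
 * Hypothetical stand-in for the requirement check, with "u" appended.
 *
 * With the "u" modifier, preg_match() returns FALSE when the subject is not
 * valid UTF-8, so the result is compared strictly against 1.
 */
function g2_sketch_matches_requirement(string $requirement, string $value): bool {
  return preg_match('#^' . $requirement . '$#u', $value) === 1;
}

// Requirement copied from the g2.initial route definition.
$requirement = '[\p{Ll}\p{Lm}\p{Lo}\p{Lt}\p{Lu}\p{Mc}\p{Nd}\p{Nl}\p{No} _-]';

$paths = [];
foreach (['a', "\u{E9}"] as $value) {
  if (!g2_sketch_matches_requirement($requirement, $value)) {
    // The real generator throws InvalidParameterException at this point.
    throw new InvalidArgumentException('Parameter does not match requirement.');
  }
  // Assumption: rawurlencode() stands in for the generator's path encoding.
  $paths[] = '/g2/initial/' . rawurlencode($value);
}

print_r($paths);
```

One design point to keep in mind: because preg_match() returns FALSE rather than 0 for invalid UTF-8 subjects when the u modifier is set, a loose `!preg_match(...)` check, as in the existing line, treats such input as a non-match, which seems to be the desirable outcome here.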

Remaining tasks

Implement change.

User interface changes

None.

API changes

None.

Data model changes

None.

Release notes snippet

The UrlGenerator can now correctly generate URLs to routes using full Unicode character matching with parameters outside the ASCII range.

Type: Bug report
Status: Needs work
Version: 11.0
Component: Routing
Last updated: 2 days ago
Created by: fgm (Paris, France)
