- Issue created by @sboden
- πΊπΈUnited States cmlara
I'm going to send this over to Drupal Core as this error itself would appear to originate from the cache system.
If we could see the file URI that would be helpful for the core developers.
I was able to reproduce this with the following code (going off my knowledge of database primary key limitations for mysql) which may or may not be the cause of the original post
$cache = \Drupal::cache(); $cache->set('test1', 'lowercase'); $cache->get('test1'); // Good here $cache->get('test1 '); // The trailing space causes an issues because mysql ignores trailing spaces on primary keys and will return a result of cid 'test1' instead of 'test1 '
https://dev.mysql.com/doc/refman/5.7/en/char.html for documentation on trailing pad characters and character primary keys.
@sboden Note that even if this is space related that even if core fixes this we will still not support paths with trailing spaces until 4.x at the earliest, see π Issue with directories and trailing whitespaces Needs work . While we have work to do in s3fs, the cache system is suppose to handle converting strings into a form that can be stored/retrieved so a trailing slash in a cache::get() is not to my knowledge a fault in s3fs.
- πΈπͺSweden twod Sweden
We get the same type of error when page cache stores the results of a view with an exposed filter of some value, and you filter again for the same string with a different combination of uppercase/lowercase characters.
Page cache generated a cache id with the first search string and stored that in the cache table via the database backend.
The next time page cache asks for the same string (only case differs) the database backend performs a case-insensitive query and returns the original item, which has a different cache id compared to what it put in the mapping array.IIRC MariaDB, MySQL and SQLite all do case-insensitive queries by default, unsure about PostgreSQL, so maybe it would be easiest to make the database backend just lowercase the cids when normalizing them.
Other cache backends, like memory, will of course treat these as different cache ids. The interface does not explicitly mention how this is intended to work so maybe we can get away with that? - πΈπͺSweden twod Sweden
There is indeed a test which validates that all cache backends have case-sensitive cache ids in
GenericCacheBackendUnitTestBase
, but what makes it case-[in]sensitive is (obvious in hind-sight), the selected collation for the cid column.We were using utf8mb4_sv_0900_ai_ci, and switching to utf8mb4_sv_0900_as_cs for just that column on the cache tables fixes the issue with notices for us.
- π¬π§United Kingdom alexpott πͺπΊπ
Can people who have the problem originally reported here confirm the following details:
- What database you using - including version - you can get this information on admin/reports/status
- If you connect to your database run the query
select cid from cache_page limit 1;
and then use the cid returned to a select where you have changed the case of the cidselect cid from cache_page where cid = 'cid_case_changed';
- The output of the SQL statement
show create table cache_page;
(mysql) or the CLI commandpg_dump --table cache_page --schema-only
(postgres)
For the create table I get
| cache_page | CREATE TABLE `cache_page` ( `cid` varchar(255) CHARACTER SET ascii COLLATE ascii_bin NOT NULL DEFAULT '' COMMENT 'Primary Key: Unique cache ID.', `data` longblob COMMENT 'A collection of data to cache.', `expire` bigint NOT NULL DEFAULT '0' COMMENT 'A Unix timestamp indicating when the cache entry should expire, or -1 for never.', `created` decimal(14,3) NOT NULL DEFAULT '0.000' COMMENT 'A timestamp with millisecond precision indicating when the cache entry was created.', `serialized` smallint NOT NULL DEFAULT '0' COMMENT 'A flag to indicate whether content is serialized (1) or not (0).', `tags` longtext COMMENT 'Space-separated list of cache tags for this entry.', `checksum` varchar(255) CHARACTER SET ascii COLLATE ascii_general_ci NOT NULL COMMENT 'The tag invalidation checksum when this entry was saved.', PRIMARY KEY (`cid`), KEY `expire` (`expire`), KEY `created` (`created`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci COMMENT='Storage for the cache API.' |
And although the collation for the table is utf8mb4_0900_ai_ci the queries are case-sensitive.
mysql> select cid from cache_page where cid = 'http://drupal8alt.test/fr:'; +----------------------------+ | cid | +----------------------------+ | http://drupal8alt.test/fr: | +----------------------------+ 1 row in set (0.00 sec) mysql> select cid from cache_page where cid = 'Http://drupal8alt.test/fr:'; Empty set (0.00 sec)
That's because the cid column has it's pwn collation of ascii_bin.
- First commit to issue fork.
- π¬π§United Kingdom catch
The trailing space issue is easy to reproduce without needing any special collation, i.e. the steps given in #2. I've pushed some test coverage that should show it on gitlab too hopefully.
- π¬π§United Kingdom catch
Here we go:
https://git.drupalcode.org/project/drupal/-/jobs/2308350
Fail Other phpunit-55.xml 0 Drupal\KernelTests\Core\Cache\Datab PHPUnit Test failed to complete; Error: PHPUnit 10.5.29 by Sebastian Bergmann and contributors. Runtime: PHP 8.3.10 Configuration: /builds/project/drupal/core/phpunit.xml.dist ..........W... 14 / 14 (100%) Time: 00:11.572, Memory: 8.00 MB 1 test triggered 1 PHP warning: 1) /builds/project/drupal/core/lib/Drupal/Core/Cache/DatabaseBackend.php:139 Undefined array key "trailing-space-test" Triggered by: * Drupal\KernelTests\Core\Cache\DatabaseBackendTest::testSetGet /builds/project/drupal/core/tests/Drupal/KernelTests/Core/Cache/DatabaseBackendTest.php:51
- Status changed to Needs review
5 months ago 1:46am 2 August 2024 - π¬π§United Kingdom catch
And pushed a separate commit with a suggested fix - comment could probably use some work, I couldn't find MySQL docs specific to treatment of trailing spaces in indexes as opposed to comparisons in general.
Major because this is potentially contributing to the 404 issue reported in π After updating Drupal from version 10.2.7 to version 10.3.0, a 404 error occurred! Needs review .
- πΊπΈUnited States cmlara
All MySQL collations are of type PAD SPACE. This means that all CHAR, VARCHAR, and TEXT values are compared without regard to any trailing spaces. βComparisonβ in this context does not include the LIKE pattern-matching operator, for which trailing spaces are significant.
For those cases where trailing pad characters are stripped or comparisons ignore them, if a column has an index that requires unique values, inserting into the column values that differ only in number of trailing pad characters results in a duplicate-key error. For example, if a table contains 'a', an attempt to store 'a ' causes a duplicate-key error.
- Status changed to RTBC
5 months ago 9:53pm 2 August 2024 - π¬π§United Kingdom alexpott πͺπΊπ
This looks great now. Nice reviews @cmlara and nice MR @catch.
- π¬π§United Kingdom catch
Oh good spot on the whitespace, I briefly noticed it but thought it was my eyes playing tricks on me instead of an actual bug :(
Retitling for the actual problem we're fixing.
-
larowlan β
committed fb2469d3 on 11.x
Issue #3368537 by catch, TwoD, alexpott, cmlara, sboden: Ensure trailing...
-
larowlan β
committed fb2469d3 on 11.x
-
larowlan β
committed ea0088d7 on 10.4.x
Issue #3368537 by catch, TwoD, alexpott, cmlara, sboden: Ensure trailing...
-
larowlan β
committed ea0088d7 on 10.4.x
-
larowlan β
committed 71b0922a on 11.0.x
Issue #3368537 by catch, TwoD, alexpott, cmlara, sboden: Ensure trailing...
-
larowlan β
committed 71b0922a on 11.0.x
- π¦πΊAustralia larowlan π¦πΊπ.au GMT+10
Committed to 11.x and backported to 10.4.x and 11.0.x
Waiting for a second opinion about 10.3.x, leaving at RTBC - π¬π§United Kingdom alexpott πͺπΊπ
@larowlan thanks for the commit. We have decent evidence this is behind some of the original reports on π After updating Drupal from version 10.2.7 to version 10.3.0, a 404 error occurred! Needs review and will fix a possible regression in 10.3.x
-
alexpott β
committed 3974f844 on 10.3.x authored by
larowlan β
Issue #3368537 by catch, TwoD, alexpott, cmlara, sboden: Ensure trailing...
-
alexpott β
committed 3974f844 on 10.3.x authored by
larowlan β
- Status changed to Fixed
5 months ago 12:02pm 3 August 2024 - π¬π§United Kingdom alexpott πͺπΊπ
Discussed with @catch and @larowlan - we agreed to backport this to 10.3.x
Automatically closed - issue fixed for 2 weeks with no activity.
- π§πͺBelgium wim leers Ghent π§πͺπͺπΊ
WOW, what a crazy edge case! π€―