\Drupal::entityTypeManager()->getStorage('user_role')->load(); can't load non-english string

Created on 18 January 2024, about 1 year ago

Problem/Motivation

Steps to reproduce

1. Download the latest Drupal 10 code:
composer create-project drupal/recommended-project my_site_name
2. Fresh install by drush:
drush si --db-url=mysql://root:password@mariadb:3306/d10 --account-pass=abcd1234 --sites-subdir=default -y
3. Use 'devel php', 'drush ev', or a custom module to execute PHP code:

\Drupal::entityTypeManager()->getStorage('user_role')->load("中文");

You will get this error:

SQLSTATE[HY000]: General error: 1267 Illegal mix of collations (ascii_general_ci,IMPLICIT) and (utf8mb4_general_ci,COERCIBLE) for operation '=': SELECT "name", "data" FROM "config" WHERE "collection" = :collection AND "name" IN ( :names__0 ); Array ( [:collection] => [:names__0] => user.role.测试 )
🐛 Bug report
Status

Active

Version

11.0 🔥

Component
Database →

Last updated 4 days ago

  • Maintained by
  • 🇳🇱Netherlands @daffie
Created by

🇨🇳China lawxen


Merge Requests

Comments & Activities

  • Issue created by @lawxen
  • Assigned to arunkumark
  • 🇮🇳India arunkumark Coimbatore
  • Issue was unassigned.
  • 🇮🇳India arunkumark Coimbatore

    @Lawxen,
    The issue is not caused by Drupal core. It is a common database error with Chinese text. Please check the SQL collation settings (utf8_general_ci).

    I resolved the issue by running the queries below on the database server.

    ALTER DATABASE DBNAME CHARACTER SET utf8 COLLATE utf8_general_ci;
    ALTER TABLE config CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
    

    After running these queries, I was able to load the Chinese strings successfully.

  • 🇨🇳China lawxen

    @arunkumark thanks for the review.
    I retested the problem:
    1. Create the database following the official Drupal doc: create-a-database →
    2. Install Drupal through the browser at /core/install.php
    3. Check that the config table's CHARACTER SET is utf8mb4 and its COLLATE is utf8mb4_general_ci
    4. The problem still exists.
    5. I have confirmed that the problem did not exist on Drupal 8 and Drupal 9.
    So this is still a bug in core.


  • 🇨🇳China lawxen

    No matter which of these commands is used to create the database:

    CREATE DATABASE d10 CHARACTER SET utf8 COLLATE utf8_general_ci;
    CREATE DATABASE d10 CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
    CREATE DATABASE d10 CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

    The site install will install all tables with utf8mb4 and utf8mb4_general_ci.

    And after executing the command from arunkumark in #4:
    ALTER TABLE config CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;

    \Drupal::entityTypeManager()->getStorage('user_role')->load("中文");
    

    can be executed successfully.

    This is weird, but I can confirm this is a regression in Drupal 10.

  • Those character sets and collations are defaults for new tables. Drupal may deliberately set other character sets and collations, like ascii_general_ci, for some tables. I think that is the case here, and with some queries, when joining tables with differing collations, this can occur.

  • 🇹🇷Turkey makbay

    I can also confirm that this happens for Turkish characters.

    All of my tables and database are collated as utf8mb4_general_ci

    Just came here from: https://www.drupal.org/project/webform/issues/3415445 🐛 Search Chinese in /admin/structure/webform cause error: Illegal mix of collations (Closed: duplicate)

  • 🇫🇮Finland konstara Helsinki

    Workaround where no collation changes are required.

  • 🇫🇮Finland konstara Helsinki

    https://www.drupal.org/files/issues/2024-02-26/3415478-cant-load-non-eng... →
    I created this as a workaround that requires no collation changes.

  • 🇫🇮Finland konstara Helsinki
  • The site install will install all tables with utf8mb4 and utf8mb4_general_ci.

    That is true for tables, but not for columns.

    This is the config table schema:

    CREATE TABLE `config` (
      `collection` varchar(255) CHARACTER SET ascii NOT NULL DEFAULT '' COMMENT 'Primary Key: Config object collection.',
      `name` varchar(255) CHARACTER SET ascii NOT NULL DEFAULT '' COMMENT 'Primary Key: Config object name.',
      `data` longblob COMMENT 'A serialized configuration object data.'
    )
    

    It is impossible to query the collection or name columns with non-ASCII characters, for example, "€".

    The table has had this design since #1923406: Use ASCII character set on alphanumeric fields so we can index all 255 characters →, which is from eight years ago. If this is a regression, in which version of Drupal did this work?

  • 🇨🇳China lawxen

    in which version of Drupal did this work?

    I can't remember the exact version.
    When the issue was created, Webform with Drupal 9 worked; Webform with Drupal 10 did not.

  • Status changed to Closed: won't fix 4 months ago
  • This can't be fixed as such. Webform just has to stop calling this function in that way.
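
A caller-side guard along the lines suggested above could look like the following minimal sketch. It assumes the caller simply wants "not found" for any ID that cannot exist in the ASCII-only `name` column; `is_ascii_id()` and `load_if_ascii()` are hypothetical helper names, not part of Drupal core or Webform, and the loader is a stand-in closure so the snippet runs without a Drupal bootstrap.

```php
<?php

declare(strict_types=1);

/**
 * Returns TRUE if $id consists only of 7-bit ASCII bytes.
 *
 * Hypothetical helper for illustration; not a Drupal core function.
 */
function is_ascii_id(string $id): bool {
  // No /u modifier: the pattern is matched byte by byte, so any
  // multibyte UTF-8 character (bytes >= 0x80) makes the match fail.
  return preg_match('/^[\x00-\x7F]*$/', $id) === 1;
}

/**
 * Calls $loader only when $id is pure ASCII; otherwise returns NULL.
 *
 * Hypothetical wrapper: the query never reaches the database for IDs
 * that the ASCII column could not contain anyway.
 */
function load_if_ascii(string $id, callable $loader): mixed {
  return is_ascii_id($id) ? $loader($id) : null;
}

// Stand-in loader (an assumption for this sketch). In a Drupal site it
// might instead wrap the call from this issue:
// fn ($id) => \Drupal::entityTypeManager()->getStorage('user_role')->load($id)
$loader = static fn (string $id): string => "loaded:$id";

var_dump(load_if_ascii('administrator', $loader));
var_dump(load_if_ascii('中文', $loader));
```

With this guard, a search term such as "中文" is treated as a missing entity instead of triggering the collation error.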

  • 🇬🇧United Kingdom oily Greater London

    There is now a more comprehensible error message, at the 'Needs review' stage, in the related issue 3475540. I have tested this issue by using drush at the command line:

    ddev drush ev "\Drupal::entityTypeManager()->getStorage('user_role')->load('8îïîö');"

    This command triggers the same comprehensible error message as when reproducing issue 3475540. I have adjusted the message so that it also covers this ticket: it now refers to command-line input in addition to browser input.

  • 🇺🇸United States smustgrave
  • 🇬🇧United Kingdom oily Greater London

    A failing functional test has been created. It tests creating a user role entity with invalid characters. This is not an identical reproduction of the issue, but once the fix from 3475540 is applied this test should pass. That should mean that entity CRUD involving user role entities in this issue, together with the node type entity in 3475540, is validated without touching the database, and an understandable but generically worded InvalidArgumentException is returned.

  • Pipeline finished with Failed
    4 months ago
    Total: 659s
    #299456
  • 🇬🇧United Kingdom oily Greater London

    @cilefen Regarding #12, I am seeing a different CREATE TABLE schema in 11.x:

    config | CREATE TABLE `config` (
    `collection` varchar(255) CHARACTER SET ascii COLLATE ascii_general_ci NOT NULL DEFAULT '' COMMENT 'Primary Key: Config object collection.',
    `name` varchar(255) CHARACTER SET ascii COLLATE ascii_general_ci NOT NULL DEFAULT '' COMMENT 'Primary Key: Config object name.',

    This part of the configuration of the 'collection' and 'name' columns is different:
    COLLATE ascii_general_ci

  • Pipeline finished with Failed
    4 months ago
    Total: 6230s
    #301214
  • Status changed to Needs work 6 days ago
  • 🇦🇹Austria shyam-sawhney

    #10 Works for me. Thanks

  • 🇮🇳India mdsohaib4242

    Ensure your database and tables use the utf8mb4 character set and the utf8mb4_general_ci collation.

    Run the following SQL command to check the collation

    SHOW VARIABLES LIKE 'collation_database';
    

    Update the database collation to utf8mb4_general_ci

    ALTER DATABASE your_database_name CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
    

    For the config table and other relevant tables, update the collation

    ALTER TABLE config CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
    ALTER TABLE config CHANGE name name VARCHAR(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
    

    Specify utf8mb4 during the installation process.
    Make sure to add the following to your settings.php inside the database connection.

        'collation' => 'utf8mb4_general_ci',
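
For context, the `'collation'` key above would sit inside the connection array in settings.php, roughly as in the sketch below. The database name, credentials, and host are placeholders borrowed from the install command at the top of this issue. Note that, per the earlier comments, the `collection` and `name` columns of the config table are created as ASCII by design, so this setting on its own may not change the behavior reported here.

```php
<?php

// Sketch of a settings.php database definition; every value except the
// 'collation' line is a placeholder and must match your own environment.
$databases['default']['default'] = [
  'driver' => 'mysql',
  'database' => 'd10',
  'username' => 'root',
  'password' => 'password',
  'host' => 'mariadb',
  'port' => '3306',
  'prefix' => '',
  // Collation suggested in the comment above.
  'collation' => 'utf8mb4_general_ci',
];
```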
    