The database tables should be using a consistent collation in order to avoid unexpected errors especially with multilingual sites.

Created on 27 November 2023, 7 months ago
Updated 28 November 2023, 7 months ago

We audited our website, and the audit report flagged this error for the S3FS module. The report recommends using the default collation, 'utf8mb4_general_ci'.

🐛 Bug report
Status

Closed: works as designed

Version

3.3

Component

Code

Created by

🇮🇳India gawalin

Comments & Activities

  • Issue created by @gawalin
  • 🇮🇳India gawalin

    I've generated a patch to address this issue. Could someone please test it and confirm whether it's the right approach to resolve this?

  • Status changed to Closed: works as designed 7 months ago
  • 🇺🇸United States cmlara

    The utf8mb4_general_ci collation is case-insensitive, which does not work for us because S3 storage is case-sensitive. The entries s3://test.txt and s3://Test.txt may both exist on S3, and we need to be able to store them without a primary key conflict.

    This is our reason for using utf8_bin in 8.x-3.x. In 4.x we (IIRC) use utf8mb4_bin, which does not exist on all the versions of MySQL/MariaDB that we need to support with 8.x-3.x.

    While it is traditionally encouraged to maintain the same collation across a database, the reasons for doing so (multi-table queries) generally do not apply to the s3fs_table, as it is only queried independently (never as part of a join).

    This can likely be considered a false positive on the part of your scanning software, which does not know our design reason for the collation choice.

    Closing as WAD per above.

  • 🇮🇳India gawalin

    Thank you so much for your response. It's truly helpful.
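
The primary key conflict described in the closing comment can be illustrated with a minimal sketch. It is not taken from the S3FS module: it uses Python's built-in sqlite3 module, with SQLite's NOCASE and BINARY collations standing in (as an assumption) for MySQL's utf8mb4_general_ci and utf8_bin. The table name demo_files and the helper insert_uris are hypothetical.

```python
import sqlite3


def insert_uris(collation, uris):
    """Insert URIs into a table keyed on `uri` and return the ones rejected.

    `collation` is a SQLite collation name. NOCASE (case-insensitive for
    ASCII) and BINARY (byte-wise) are used here only as stand-ins for
    MySQL's utf8mb4_general_ci and utf8_bin.
    """
    if collation not in ("NOCASE", "BINARY"):
        raise ValueError("unsupported collation for this sketch")

    conn = sqlite3.connect(":memory:")
    # Hypothetical table; the primary key compares values with the
    # column's collation.
    conn.execute(
        f"CREATE TABLE demo_files (uri TEXT COLLATE {collation} PRIMARY KEY)"
    )

    rejected = []
    for uri in uris:
        try:
            conn.execute("INSERT INTO demo_files (uri) VALUES (?)", (uri,))
        except sqlite3.IntegrityError:
            # The key already exists according to the column's collation.
            rejected.append(uri)
    conn.close()
    return rejected


# Two distinct S3 objects whose names differ only by case.
uris = ["s3://test.txt", "s3://Test.txt"]

rejected_nocase = insert_uris("NOCASE", uris)
rejected_binary = insert_uris("BINARY", uris)

print("case-insensitive key rejected:", rejected_nocase)
print("binary key rejected:", rejected_binary)
```

With the case-insensitive key, the second URI collides with the first and is rejected. With the binary key, both rows are stored, which is the behaviour the maintainer describes needing for S3 object names.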
