- last update
about 1 year ago Build Successful - last update
about 1 year ago Build Successful - Status changed to Needs work
about 1 year ago 10:26am 15 January 2024 The Needs Review Queue Bot → tested this issue.
While you are making the above changes, we recommend that you convert this patch to a merge request → . Merge requests are preferred over patches. Be sure to hide the old patch files as well. (Converting an issue to a merge request without other contributions to the issue will not receive credit.)
- Status changed to Needs review
about 1 year ago 11:04am 15 January 2024 - 🇪🇸Spain interdruper
#25 missed update the function baseFieldDefinitions(). Re-rolled as #26 with the update.
- last update
about 1 year ago Build Successful - Status changed to Needs work
about 1 year ago 12:50am 16 January 2024 - 🇺🇸United States smustgrave
Issue summary should follow standard issue template.
Also will need test coverage.
- First commit to issue fork.
- 🇪🇸Spain vidorado Logroño (La Rioja)
I've created the MR and applied the patch from #27.
I believe the original post motivation is no longer an issue. Nowadays, nearly all DBs treat the VARCHAR() length as characters, not bytes. Perhaps in the old days when this issue was posted, it was bytes.
There has been motivation in this issue about a year ago, but I think that it was for the limited length of the subject field and not the multibyte problem.
I'm not pretty sure about what to test. I've made a couple of tests that check the maximum comment subject length limitation in action... but this behavior is common for all fields, so I'm a bit uneasy testing this only for the subject field and not for all fields, generally.
So, What do you think we should really test here @smustgrave?
- 🇺🇸United States smustgrave
First glance should have test coverage for the update hook. Simple check schema, run updates, check schema again
- 🇺🇸United States tr Cascadia
I believe the original post motivation is no longer an issue. Nowadays, nearly all DBs treat the VARCHAR() length as characters, not bytes. Perhaps in the old days when this issue was posted, it was bytes.
I believe @vidorado is correct here.
I haven't done the research so I can't say for certain, but I believe all current DBs supported by Drupal use (or can use) a utf8 character set. In which case VARCHAR(n) means n characters, not n bytes. And yes, that would hold true for all fields, not just comment title fields.
In the case of MySQL, this change (from bytes to characters) took place between MySQL 4 and 5. We are now up to MySQL 8.
Drupal installation instructions say that utf8 encoding should be used when configuring the database ( https://www.drupal.org/docs/getting-started/installing-drupal/create-a-d... → ). I don't know if this enforced at the core level or whether it is just assumed - regardless, I expect if you set pure ascii encoding on your database then Drupal core would fail in many places.
It would probably be a good idea to verify that utf8 encoding in the database is enforced by core, and perhaps to clarify the Drupal documentation about defining DB schemas to point out that VARCHAR() refers to characters not bytes. Note, https://api.drupal.org/api/drupal/core%21lib%21Drupal%21Core%21Database%... does make a distinction between 'varchar' and 'varchar_ascii' in a schema definition, but doesn't make it explicit that 'varchar' refers to characters not bytes. Not sure it matters in most cases anymore, but IMO it's better to be explicit to dispel any doubts given that there is a long history in Drupal (and PHP, and SQL, and all code) of poor support for non-ascii character sets.
- 🇪🇸Spain vidorado Logroño (La Rioja)
Thanks for pointing it out @tr.
I've seen that, at least for MySQL, the
utf8mb4
character set is the default for new tables, so every table will have that, although the database could have been configured with another default character set.protected function createTableSql($name, $table) { $info = $this->connection->getConnectionOptions(); // Provide defaults if needed. $table += [ 'mysql_engine' => 'InnoDB', 'mysql_character_set' => 'utf8mb4', ];
For PostgreSQL there is no possibility of setting a different character set for each table, and all of them inherit the character set of their database, but UTF8 is the standard nowadays. The same for SQLite. I haven't searched more in-detail.
- 🇪🇸Spain vidorado Logroño (La Rioja)
After many attempts, I still haven't been able to solve it. 😞
I've modified the test to be a subclass of
UpdatePathTestBase
, but I can't update the subject field's max length. It keeps telling me: "The Subject field needs to be updated."There's something I don't fully understand about schema and field definitions.
I've read this article → and checked out this repository, but without success.
In several places, I've seen suggestions to export the data, uninstall the field, reinstall it, and then restore the data—but that approach seems strange to me. What would make sense to me is altering the database column and updating the field definition, and the existing data should remain unchanged, since it's not a dramatic change.
Maybe someone with more expertise could help out?
- 🇺🇸United States smustgrave
From a quick glance I would say trust the test. Very possible the update hook isn't working fully, as the test looks correct to me.