- Issue created by @joachim
- Merge request !10152Issue-3486125-add-long-table: Added the createAbbreviatedTable and... โ (Open) created by anish.ir
- ๐ฎ๐นItaly mondrake ๐ฎ๐น
This is kinda similar to ๐ [WIP] Decouple identifier management from database Connection, introduce an IdentifierHandler class Postponed .
I have added both the mentioned functions : createAbbreviatedTable and getAbbreviatedTableName.
The table names which will be created will be named as : @fixed_prefix . $hashed_suffix, and it will look like : test_abbrev_55ec7ea143ca25227bd0a4eff922798c8f7693d8e59 (which will be less than 64 characters).
The createAbbreviatedTable takes 3 inputs :
1. TableName - actual table name
2. Fixed Prefix - to be attached in the starting of the abbreviated table name.
3. Schema (array) - as the function uses the createTable function which requires two parameters : tableName and the schema.Please check it.
- ๐ฌ๐งUnited Kingdom joachim
Looks good!
I've made some comments on the MR.
Needs tests too.
Hey joachim โ ,
The comments you have made on the MR have been resolved now.
Please look into it.- Status changed to Needs work
about 1 month ago 5:28am 27 February 2025 Hey, I just wanted to clarifyโshould we add a new test case here to cover this scenario, or is it enough to simply resolve the pipeline errors without adding a new test case?
Please let me know how you'd like to proceed so I can move forward accordingly.
Thanks!Hi,
As part of the fix for handling long table names in the Database API, I see two possible approaches:
1. Modify createTable() to check the table name length and, if necessary, prompt users to provide a prefix before automatically calling createAbbreviatedTable().
- Pros: Ensures all table creations automatically handle long names, reducing the chance of errors.
- Cons: Alters core behavior, which may introduce unexpected changes for existing modules.
2. Keep createTable() unchanged ( as requested above and the current approach ) and let module developers explicitly use createAbbreviatedTable() when needed.
- Pros: Maintains existing behavior and gives developers full control.
- Cons: Modules that donโt explicitly use the new method might still run into long table name issues.
Which approach would be preferable? Should we enforce automatic handling within createTable(), or should it remain an opt-in feature for developers?
Looking forward to your thoughts. Thanks!
- ๐ฌ๐งUnited Kingdom joachim
> 1. Modify createTable() to check the table name length and, if necessary, prompt users to provide a prefix before automatically calling createAbbreviatedTable().
This is an API. We can't prompt users. And in any case, users shouldn't be exposed to internal details like database table names.
- ๐ณ๐ฑNetherlands daffie
The big problem for me is what are we going to do with, for instance, tables from entity field storage system that have long tables names that have been abbreviating with a hash? Do we still want to support that or do you want to rename them? To me that is the big problem. If you have a better idea on how to solve this, then please speak up.
- ๐ฌ๐งUnited Kingdom joachim
The entity field storage system currently hashes its own table names, right?
In which case, can't it just carry on doing that if it wants to? Using this new API is optional.
A follow-up could take care of renaming those tables and switching the entity field storage system to using the API.
- ๐ณ๐ฑNetherlands daffie
In which case, can't it just carry on doing that if it wants to? Using this new API is optional.
We shall need to be very careful that we do not break any core/contrib/custom code.
A follow-up could take care of renaming those tables and switching the entity field storage system to using the API.
That will be a very big update for site owners. Are you sure that this improvement is worth the update for site owners?
We have similar issues like: ๐ migrate mapping & messages table names are truncated, can lead to incorrect mapping lookups Needs work and ๐ Identifiers longer than 63 characters are truncated, causing Views to break on Postgres Needs work .
- ๐ฌ๐งUnited Kingdom joachim
> We shall need to be very careful that we do not break any core/contrib/custom code.
The MR isn't changing any existing behaviour:
- call createTable() -- same behaviour as before. If the table name you supply is too long, then Bad Things happen!
- createAbbreviatedTable() -- new API, ensures that your table name is not too long.For follow-ons, the rule is:
- were you passing table names to createTable() that MIGHT sometimes be too long? When the table name was too long, then your code was crashing anyway. Change to calling createAbbreviatedTable() to prevent crashing
- were you already passing hashed table names to createTable()? If so, you need to either:
- a. Keep doing that, no change
- b. Change to calling createAbbreviatedTable(), and rename your tables in an update function. Proceed with care! This can only really be done if your table names are internal. If contrib or custom code might be referring to your tables, there's a problem! We need a way to deprecate table names! - ๐ณ๐ฑNetherlands daffie
I am ok with adding the functionality for abbreviated table/alias names and the new methods
createAbbreviatedTable()
andgetAbbreviatedAlias()
. The last method shall be used by the views module. I'll surely look on the changes suggested above and update the MR.
Thanks!Hi,
Thank you for the feedback on the MR.
I've updated the patch per the review feedback:
- Replaced the constant with a protected property for the maximum table name length.
- Consolidated the abbreviation logic into a single helper function, which is now used in both createAbbreviatedTable() and the new getAbbreviatedAlias() method.
- Updated PHPDoc comments to clarify that table/alias names are only abbreviated when they exceed the allowed length.
I have tested these changes locally using a custom module, and everything works as expected. Please let me know if there are any further changes to be made here.
Ready for further review. Thanks !- ๐ฌ๐งUnited Kingdom joachim
@daffie your suggested change is a big change to how this was envisaged to work:
* When a table name or alias combined with the database prefix is greater * than the maximum from the class variable $maxTableNameLength, the name * will be abbreviated. The abbreviation will remove the middle part of the * name and replace it with a 10 character hash. The first and last parts of * the name will be preserved.
What counts as the 'middle' part? What if the first and last part don't guarantee uniqueness? I think it's best to leave it up to callers to know which part of their table name must be kept for uniqueness.
- ๐ณ๐ฑNetherlands daffie
To make the change as useful as possible for the views module, the last part of the original name/alias is very important. For others it is the first part. Therefore my suggestion is to replace the 'middle' part. As little as possible and replace it with a short 10 character hash. In this way we can preserve as much as possible from the original name/alias.
The problem with the views module is that we have a lot of aliases that are being abbreviated in the way you want and that results in aliases like:
- QWERTYUIOPASDFGHJKLZXCVBNM1234567890qwertyuiopasdfgh__DFSGHeabdfag
- QWERTYUIOPASDFGHJKLZXCVBNM1234567890qwertyuiopasdfgh__bdfhbEWRGEAI have 2 aliases and that are different and know idea what the originals are. Yes, they are from 2 different joins, only which is which. The first part is important and the last part is important. That is why I would like to remove the "middle" part.
We can ask a framework manager/committer for some input/ideas?
- ๐ฌ๐งUnited Kingdom joachim
> To make the change as useful as possible for the views module, the last part of the original name/alias is very important. For others it is the first part. Therefore my suggestion is to replace the 'middle' part
That's fair enough. But what's the algorithm for deciding what counts as the 'middle'?
- ๐ณ๐ฑNetherlands daffie
But what's the algorithm for deciding what counts as the 'middle'?
Take the maximum allowed length and subtract the strlen($this->getPrefix()).
From that subtract 14 characters for the hash value and 2 underscore characters before and after the hash value.
Now you got a number characters left. Half is for the first part of the table/alias name and the rest is for the last part of the table/alias name.For instance: QWERTYUIOPASDFGHJKLZXCVBNM1234567890qwertyuiopasdfghjklzxcvbnm
Becomes something like: QWERTYUIOPASDFGHJK__10charhash__iopasdfghjklzxcvbnmI have also made in the PR a suggestion for the method for how to do it..
- ๐ฌ๐งUnited Kingdom joachim
> Now you got a number characters left. Half is for the first part of the table/alias name and the rest is for the last part of the table/alias name.
That might cut off words in the middle, which is going to make some ugly table names.
I think it's better if callers can specify which bits of the table name they want to preserve -- it's fine if that's bits at the start AND at the end.
- ๐ฎ๐นItaly mondrake ๐ฎ๐น
I haven't gone through the details, but just wanted to say I'd be +1 on making the shortened table name the result of a deterministic algo only based on the length of the full name (including prefix). That would be what I'd be pursuing in ๐ [WIP] Decouple identifier management from database Connection, introduce an IdentifierHandler class Postponed .
- ๐ฎ๐นItaly mondrake ๐ฎ๐น
๐ [WIP] Decouple identifier management from database Connection, introduce an IdentifierHandler class Postponed now has a MR in NR that addresses this (for the time being for table identifiers) on the Database API level.