[WIP] Decouple identifier management from database Connection, introduce an IdentifierHandler class

Created on 27 February 2021, about 4 years ago
Updated 20 October 2021, over 3 years ago

Problem/Motivation

Database identifiers resolution has grown recently, with escaping, caching and introduction of the square brackets syntax to resolve the problem of generic SQL statements to be ported across platforms using different quoting characters. This on top of the concept of prefixing tables.

All this has been added on the Connection class that is IMHO a bit overdoing.

Also, some platforms are hitting identifier length issues like e.g. ๐Ÿ› Identifiers longer than 63 characters are truncated, causing Views to break on Postgres Needs work .

Proposed resolution

Decouple identifier management, including prefix management, in a class of its own, that platforms can override to manage their own restrictions.

Remaining tasks

User interface changes

API changes

Data model changes

Release notes snippet

๐Ÿ“Œ Task
Status

Postponed

Version

10.0 โœจ

Component
Databaseย  โ†’

Last updated about 4 hours ago

  • Maintained by
  • ๐Ÿ‡ณ๐Ÿ‡ฑNetherlands @daffie
Created by

๐Ÿ‡ฎ๐Ÿ‡นItaly mondrake ๐Ÿ‡ฎ๐Ÿ‡น

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • Open on Drupal.org โ†’
    Environment: PHP 8.2 & MySQL 8
    last update almost 2 years ago
    Not currently mergeable.
  • last update almost 2 years ago
    Custom Commands Failed
  • last update almost 2 years ago
    19,868 pass, 2,915 fail
  • last update almost 2 years ago
    19,868 pass, 2,915 fail
  • last update almost 2 years ago
    29,780 pass, 9 fail
  • ๐Ÿ‡ฎ๐Ÿ‡นItaly mondrake ๐Ÿ‡ฎ๐Ÿ‡น

    Refreshed a bit.

  • last update almost 2 years ago
    Custom Commands Failed
  • last update almost 2 years ago
    20,010 pass, 2,905 fail
  • last update almost 2 years ago
    29,901 pass, 1 fail
  • last update almost 2 years ago
    29,892 pass, 5 fail
  • Pipeline finished with Failed
    2 months ago
    Total: 164s
    #441285
  • Pipeline finished with Failed
    2 months ago
    Total: 163s
    #441298
  • Pipeline finished with Failed
    2 months ago
    Total: 560s
    #441310
  • Status changed to Active 2 months ago
  • ๐Ÿ‡ฎ๐Ÿ‡นItaly mondrake ๐Ÿ‡ฎ๐Ÿ‡น
  • Pipeline finished with Failed
    2 months ago
    Total: 117s
    #442277
  • Pipeline finished with Failed
    2 months ago
    Total: 810s
    #442550
  • Pipeline finished with Failed
    2 months ago
    Total: 196s
    #443119
  • Pipeline finished with Failed
    2 months ago
    Total: 1828s
    #443151
  • Pipeline finished with Failed
    2 months ago
    Total: 860s
    #443176
  • Pipeline finished with Failed
    2 months ago
    Total: 680s
    #443196
  • Pipeline finished with Failed
    2 months ago
    Total: 477s
    #443435
  • Pipeline finished with Failed
    2 months ago
    Total: 658s
    #443446
  • Pipeline finished with Failed
    2 months ago
    Total: 615s
    #443456
  • Pipeline finished with Failed
    2 months ago
    Total: 163s
    #443534
  • Pipeline finished with Failed
    2 months ago
    Total: 700s
    #443564
  • Pipeline finished with Failed
    2 months ago
    Total: 115s
    #443576
  • Pipeline finished with Failed
    2 months ago
    Total: 619s
    #443577
  • Pipeline finished with Failed
    2 months ago
    Total: 1000s
    #443629
  • Pipeline finished with Failed
    2 months ago
    Total: 128s
    #443935
  • Pipeline finished with Failed
    2 months ago
    Total: 132s
    #443936
  • Pipeline finished with Failed
    2 months ago
    Total: 626s
    #443940
  • Pipeline finished with Failed
    2 months ago
    Total: 189s
    #443974
  • Pipeline finished with Failed
    2 months ago
    Total: 770s
    #443978
  • Pipeline finished with Failed
    2 months ago
    Total: 947s
    #443991
  • Pipeline finished with Failed
    2 months ago
    Total: 152s
    #444014
  • Pipeline finished with Failed
    2 months ago
    Total: 127s
    #444034
  • Pipeline finished with Failed
    2 months ago
    Total: 817s
    #444037
  • Pipeline finished with Failed
    2 months ago
    Total: 787s
    #444055
  • Pipeline finished with Failed
    2 months ago
    Total: 879s
    #444059
  • Pipeline finished with Failed
    2 months ago
    Total: 644s
    #444104
  • Pipeline finished with Failed
    2 months ago
    Total: 1252s
    #444112
  • Pipeline finished with Failed
    2 months ago
    Total: 680s
    #445089
  • Pipeline finished with Failed
    2 months ago
    Total: 1283s
    #445115
  • Pipeline finished with Failed
    2 months ago
    Total: 133s
    #445922
  • Pipeline finished with Failed
    2 months ago
    Total: 186s
    #445949
  • Pipeline finished with Failed
    2 months ago
    Total: 530s
    #445980
  • Pipeline finished with Failed
    2 months ago
    Total: 967s
    #446015
  • Pipeline finished with Failed
    2 months ago
    Total: 170s
    #446555
  • Pipeline finished with Failed
    2 months ago
    Total: 2392s
    #446585
  • Pipeline finished with Failed
    2 months ago
    Total: 479s
    #446615
  • Pipeline finished with Success
    2 months ago
    Total: 6352s
    #446663
  • Pipeline finished with Failed
    2 months ago
    Total: 125s
    #446935
  • Pipeline finished with Failed
    2 months ago
    Total: 128s
    #447294
  • Pipeline finished with Failed
    2 months ago
    Total: 126s
    #447540
  • Pipeline finished with Failed
    2 months ago
    Total: 128s
    #447578
  • Pipeline finished with Failed
    2 months ago
    Total: 180s
    #447920
  • Pipeline finished with Failed
    2 months ago
    Total: 208s
    #448230
  • ๐Ÿ‡ฎ๐Ÿ‡นItaly mondrake ๐Ÿ‡ฎ๐Ÿ‡น

    So, here is the idea.

    Developers should not be worried by exceeding identifiers length or using wrong identifiers; the database API should be self-sufficient in receiving input and cleansing/transforming it in machine usable information; failing as a last resort.

    Therefore:

    1. Table names, database name, column names, index names in a db are 'identifiers'.
    2. 'Identifiers' are subject to db-specific limitations (length, chars allowed, etc)
    3. In Drupal, 'Identifiers' are value objects. Each identifier has:
    • a type (e.g. table, column, alias, etc)
    • a raw value (e.g. 'config', 'key_value', 'mydb."key_value"', 'with$$invalidchar', etc)
    • a canonical value, i.e. the raw value cleansed of unallowed characters (e.g. from the above 'config', 'key_value', 'mydb.key_value', 'withinvalidchar', etc)
    • a machine value, i.e. the canonical value transformed to be used on the db-specific SQL query - including prefixing and/or shortening (e.g. from the above, assuming a prefix 'abba': '"abbaconfig"', '"abbakey_value"', '"mydb"."key_value"', '"abbawithinvalidchar"', etc)
  • each db-driver implements an 'IdentifierHandling' class that parses/transforms a raw value into the relevant value object, including a local cache to prevent processing again and again each raw value.
  • This works now for MySql and SQLite; the MySql implementation shortens table names that are longer than 64 characters using hashing techniques.

    PgSql will follow with its own 63 chars max limitation.

    In follow ups, this approach could be extended from tables to columns, aliases, indexes etc.

  • Pipeline finished with Success
    2 months ago
    Total: 2991s
    #448237
  • Pipeline finished with Failed
    2 months ago
    Total: 113s
    #448655
  • Pipeline finished with Running
    2 months ago
    #448681
  • Pipeline finished with Failed
    about 2 months ago
    Total: 4874s
    #449197
  • Pipeline finished with Failed
    about 2 months ago
    Total: 233s
    #449524
  • Pipeline finished with Success
    about 2 months ago
    Total: 1292s
    #449537
  • Pipeline finished with Success
    about 2 months ago
    Total: 3460s
    #449563
  • ๐Ÿ‡ฎ๐Ÿ‡นItaly mondrake ๐Ÿ‡ฎ๐Ÿ‡น

    Added pgsql conversion. Tests are now green on all 3 core drivers. The concept is in and implemented; now to cleaning, BC and documenting.

  • Pipeline finished with Failed
    about 2 months ago
    Total: 188s
    #449621
  • Pipeline finished with Success
    about 2 months ago
    #449626
  • Pipeline finished with Failed
    about 2 months ago
    Total: 248s
    #449741
  • Pipeline finished with Failed
    about 2 months ago
    Total: 296s
    #449747
  • Pipeline finished with Failed
    about 2 months ago
    Total: 938s
    #449762
  • Pipeline finished with Failed
    about 2 months ago
    Total: 271s
    #450423
  • Pipeline finished with Failed
    about 2 months ago
    Total: 131s
    #450470
  • ๐Ÿ‡ฎ๐Ÿ‡นItaly mondrake ๐Ÿ‡ฎ๐Ÿ‡น

    Ready for review now.

  • ๐Ÿ‡ฎ๐Ÿ‡นItaly mondrake ๐Ÿ‡ฎ๐Ÿ‡น
  • Pipeline finished with Success
    about 2 months ago
    Total: 2712s
    #450497
  • ๐Ÿ‡ฎ๐Ÿ‡นItaly mondrake ๐Ÿ‡ฎ๐Ÿ‡น

    The machineName in the value objects should be stored unquoted. On that.

  • Pipeline finished with Success
    about 2 months ago
    Total: 3971s
    #451036
  • ๐Ÿ‡ฎ๐Ÿ‡นItaly mondrake ๐Ÿ‡ฎ๐Ÿ‡น

    Done #22, back to NR.

  • Pipeline finished with Success
    about 2 months ago
    Total: 3790s
    #451104
  • ๐Ÿ‡ณ๐Ÿ‡ฑNetherlands daffie

    I support the basic solution. Great work!

    1. The DX can be improved. Do we really need to deprecate methods like Connection::escapeTable() and Connection::getPrefix(). Can we keep them? Changes like:
      - $this->connection->getPrefix()
      + $this->connection->identifiers->tablePrefix
      

      or

      - $this->connection->escapeTable($this->table)
      + $this->connection->identifiers->table($this->table)->forMachine()
      

      Those changes do not make for an improved developer experience. Lets keep those methods as a wrapper for the new implementation. Maybe we need to add some more wrapper methods to improve the DX. It is great that we have an improved identifier management solution. Now lets hide it from them as much as possible. It works just a little bit better then before. For most of them that is all they need to know and maybe they do not need to know at all.

    2. The default database driver for Drupal is MySQL/MariaDB. Let the default implementation in the namespace Drupal\Core\Database\Identifier be for just MySQL/MariaDB. The specific changes that are needed for PostgreSQL and SQLite need to be done in their database driver modules. Like no "schema" stuff in the namespace Drupal\Core\Database\Identifier. Other database drivers that are not supported by Drupal core, have their own specific stuff they need to override. Let use the database drivers for PostgreSQL and SQLite as an example for those contrib database drivers. The current solution is limited to making it work for the by Drupal core supported database drivers.
  • ๐Ÿ‡ฎ๐Ÿ‡นItaly mondrake ๐Ÿ‡ฎ๐Ÿ‡น

    #24 thanks Daffie for your review!

    1) for sure we can undeprecate those, or postpone removal to a later major (D13+). But why would that be desirable? Besides that those 'wrappers' would be redundant, what would someone call those for, given that the value objects encapsulate the result of those wrappers? Besides, once we move to value objects then we could pass them as arguments for methods instead of strings and the reason for such wrappers would simply not hold anymore - see for example [ #3514117] and how sqlite's Schema class would benefit.

    Supposing a prefix 'foo' is required, given that the Table identifier implements \Stringable and it returns the machine name identifier:

    $table = $this->connection->identifier->table('try_me');
    echo "SELECT * FROM $table";
    

    would return, without any further processing:

    for SQLite:
    SELECT * FROM "foo"."try_me"

    for MySQL
    SELECT * FROM "footry_me"

    for PostgreSql (supposing a 'bar' schema is in use)
    SELECT * FROM "bar"."footry_me"

    2) that's exactly what the MR is doing no? Identifier/Database, Identifier/Schema, Identifier/Table are value objects, defined final - they only carry the values. The values themselves are determined by each driver's IdentifierHandler class, that parses the 'raw value' string identifier and instantiates the value objects with the needed values. Each driver's IdentifierHandler extends from the abstract IdentifierHandlerBase which has the 'regular' implementation - and all drivers need diversions one way or another, say for the different max lenght, or for how tables are prefixed, or for adding schemas etc.

  • The Needs Review Queue Bot โ†’ tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".

    This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.

    Consult the Drupal Contributor Guide โ†’ to find step-by-step guides for working with issues.

  • ๐Ÿ‡ฎ๐Ÿ‡นItaly mondrake ๐Ÿ‡ฎ๐Ÿ‡น

    Updated fork.

  • Pipeline finished with Success
    about 2 months ago
    Total: 898s
    #452570
  • ๐Ÿ‡ฎ๐Ÿ‡นItaly mondrake ๐Ÿ‡ฎ๐Ÿ‡น
  • Pipeline finished with Success
    about 2 months ago
    Total: 1988s
    #453507
  • Pipeline finished with Canceled
    about 2 months ago
    #454981
  • Pipeline finished with Failed
    about 2 months ago
    Total: 252s
    #454983
  • Pipeline finished with Failed
    about 2 months ago
    Total: 137s
    #455939
  • Pipeline finished with Success
    about 2 months ago
    Total: 2324s
    #455948
  • Pipeline finished with Failed
    about 1 month ago
    Total: 679s
    #462834
  • Pipeline finished with Success
    about 1 month ago
    Total: 2381s
    #462842
  • Pipeline finished with Failed
    about 1 month ago
    #465859
  • Pipeline finished with Success
    26 days ago
    Total: 724s
    #476994
  • Pipeline finished with Failed
    22 days ago
    Total: 120s
    #479414
  • Pipeline finished with Success
    22 days ago
    #479424
  • Pipeline finished with Success
    7 days ago
    Total: 515s
    #491338
  • Status changed to Needs work 2 days ago
  • The Needs Review Queue Bot โ†’ tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".

    This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.

    Consult the Drupal Contributor Guide โ†’ to find step-by-step guides for working with issues.

  • ๐Ÿ‡จ๐Ÿ‡ฆCanada Liam Morland Ontario, CA ๐Ÿ‡จ๐Ÿ‡ฆ

    liam morland โ†’ made their first commit to this issueโ€™s fork.

  • ๐Ÿ‡จ๐Ÿ‡ฆCanada Liam Morland Ontario, CA ๐Ÿ‡จ๐Ÿ‡ฆ

    Re-rolled.

  • Pipeline finished with Failed
    2 days ago
    Total: 147s
    #495004
  • ๐Ÿ‡บ๐Ÿ‡ธUnited States mradcliffe USA

    Switching to Needs work for phpunit and code quality failures.

  • Pipeline finished with Failed
    about 16 hours ago
    Total: 433s
    #496365
  • ๐Ÿ‡ฎ๐Ÿ‡นItaly mondrake ๐Ÿ‡ฎ๐Ÿ‡น
  • Pipeline finished with Success
    about 16 hours ago
    Total: 1152s
    #496380
  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom joachim

    I feel this might clash a bit with โœจ add long table name abbreviation to Database API Active .

    Is the MR here just rejecting names which are too long? Could it leave space for that issue to handle them?

  • ๐Ÿ‡ฎ๐Ÿ‡นItaly mondrake ๐Ÿ‡ฎ๐Ÿ‡น

    #35: this is a totally alternative approach, that removes the need for the other issue.

    Too long names are automatically shortened here to match the db limitations.

  • ๐Ÿ‡บ๐Ÿ‡ธUnited States mradcliffe USA

    I came up with a rough outline for the change record after reviewing the changes.

    • Adds a new nullable argument to the base Connection __construct method that takes an IdentifierHandler.
    • Database drivers should implement IdentifierHandler (details) and pass this into Connection::parent::__construct().
      • Implement at least getMaxLength if extending IdentifierHandlerBase.
    • Method calls and properties that need to be changed:
      • getPrefix() should use identifiers->tablePrefix
      • $identifierQuotes should use identifiers->identifierQuotes
      • $prefix should use identiferies->tablePrefix
      • $escapedTables is no longer used
      • $tablePlaceholderReplacements should use ????
      • escapeDatabase() should use identifiers->database()
      • Any time need to use a table or prefix in code this should be identifiers->table
      • escapeTable() should be identifiers->table. escapeTable() is no longer needed when using connection->query()? A database drive may need to use schema()->quotedMachineName . $tableIdentifier->quotedMachineName?

    It looks like this also applies to any usages of query that use escapeTable manually in addition to database driver modules.

    Finally a question. If a site provided a prefix and used a non-core database driver would that site stop working because $prefix isn't set anywhere by that database driver (if it relied on the protected property)?

  • Production build 0.71.5 2024