- Issue created by @daffie
- 🇳🇱Netherlands daffie
For the bikeshedding part see 🐛 [PP-1] Performance issues with path alias generated queries on PostgreSQL Postponed
- 🇺🇸United States bradjones1 Digital Nomad Life
Minor typo fix in IS and title.
- 🇺🇸United States bradjones1 Digital Nomad Life
A bit of rubber-ducking on the possibility of using an object implementing
ArrayAccess
for a value object that would allow for additional data on the index (beyond just an array of keys.) As proposed by @alexpott.If I understand this properly, the idea is to be able to specify indexes something like (pseudocode):
$schema[$table]['indexes']['indexName'] = new DbIndexSpec($keys, $postgresIndexType);
If
DbIndexSpec
implements bothArrayAccess
andTraversable
, then it could be crafted in a BC-compatible way for any instances where it is foreach()'ed over currently. There is an instance of the Postgres driver doing a typecheck with is_array(), which would not be compatible here, but that code would be changed anyway because this is the whole reason for this conversation.Setting the
is_array()
issue aside, the value object would need to be BC-compatible for nested foreach loops such as found in the mysql driver:protected function getNormalizedIndexes(array $spec) { $indexes = $spec['indexes'] ?? []; foreach ($indexes as $index_name => $index_fields) { foreach ($index_fields as $index_key => $index_field) { // Get the name of the field from the index specification. $field_name = is_array($index_field) ? $index_field[0] : $index_field; // Check whether the field is defined in the table specification. if (isset($spec['fields'][$field_name])) { // Get the MySQL type from the processed field. $mysql_field = $this->processField($spec['fields'][$field_name]); if (in_array($mysql_field['mysql_type'], $this->mysqlStringTypes)) { // Check whether we need to shorten the index. if ((!isset($mysql_field['type']) || $mysql_field['type'] != 'varchar_ascii') && (!isset($mysql_field['length']) || $mysql_field['length'] > 191)) { // Limit the index length to 191 characters. $this->shortenIndex($indexes[$index_name][$index_key]); } } } else { throw new SchemaException("MySQL needs the '$field_name' field specification in order to normalize the '$index_name' index"); } } } return $indexes; }
That said, since there are contrib DB drivers out there, is this just "too much" to try and attempt?
- 🇺🇸United States bradjones1 Digital Nomad Life
Got some feedback and background in Slack on "ArrayPI" - the name appears to be a Drupalism (and a fun one at that). There are similar discussions around trying to chip away at particularly complex or DX-unfriendly structures, e.g. #3380145-19: ViewsData should not cache by language → . Seems @catch is optimistic about being able to refactor some of these with limited BC impact.
While poking around the related code I also discovered that an index's "key column specifier" can either be an array of column names OR arrays which contain a column name and prefix length. Database drivers can elect to respect the prefix or just use the full column value. (IOW, they must support the syntax but not necessarily create a functional index as a result.) I _think_ this potentially opens us up to using this array syntax to also express driver-specific options, e.g. index type. I don't love it, but it is potentially better than the BC concerns around a value object.
So we could rephrase that docblock to say something like:
* A key column specifier is either a string naming a column or an array of two * elements, column name and length, specifying a prefix of the named column. * Note that some DBMS drivers may opt to ignore the prefix length configuration * and still use the whole field value for the key. Code should therefore not * rely on this functionality. * A key column specifier may also contain a key, 'driver_options', which is an array * keyed by database module name (e.g., 'pgsql') which is an associative array of * keys and values to set driver-specific index options. Drivers supporting additional * options should respect a value of NULL for the prefix length, if it is unnecessary or * incompatible given the driver-specific options.
This shouldn't BC-break any code that accesses the numeric-indexed members.
- Status changed to Needs review
11 months ago 8:24am 16 December 2023 - 🇺🇸United States bradjones1 Digital Nomad Life
So I _think_ I might have found a way to implement a value object for the index specification that will allow for this kind of configuration to be conveyed. The only issue would be if a non-core database driver does an explicit
is_array()
check on the index definitions. This would only pass that kind of very strict (and not very useful) type checking if it changed tois_iterable()
, which the value object would pass.Given the talk in Slack and on related issues on this subject, I think that should be OK? And we really, really don't seem to have any other way to include this outside of sticking it in a very awkward new schema definition key like
indexes_config
or something, which is not very ergonomic and shards this information from the index definitions. - 🇺🇸United States bradjones1 Digital Nomad Life
I've added a draft CR → to help explain how this works and start the conversation on how to message this, since it's the first time we're doing a value object inside ArrayPI.
I think it could be interesting to encapsulate this into the coding standards re:
is_array()
isis_iterable()
on type checks for ArrayPI members. - 🇮🇹Italy mondrake 🇮🇹
A comment on the general approach here if I may.
I think the idea of having value objects to represent DB objects is a great step forward. Doctrine DBAL has done something similar and dropped usage of arrays to describe DB structures.
That said, and given this will set a precedent in Drupal DB API, I would suggest:
1. To have a
Schema
subfolder where to store both the value object and the enum.
2. With that, the value object name could easily beIndex
instead ofIndexSpecification
.
3. Why final? The value object could be extended by drivers exactly to cover DB specific features like here.
4. Testing of DB specifics should be done in the driver extensions of DriverSpecificSchemaTestBase, not in the base class. - 🇺🇸United States bradjones1 Digital Nomad Life
Thanks for the vote of confidence and feedback. On points 1-3 that's the kind of feedback I needed because there is no existing pattern to follow so I was just winging it.
As for #4, I do think that we need to have testing of this in the base class, to detect compatibility breaks. Basically, we need to ensure that all drivers work when they don't necessarily support the extra data and successfully just treat it as an array. Perhaps annotating the test to state as such could make it clearer why it's there. The alternative would be having a base test that has just the object and the pgsql version has the pgsql-specific semantics.
- 🇺🇸United States bradjones1 Digital Nomad Life
I realize I missed addressing the question about making the value object
final
. My justification is that the index definition object itself is driver-agnostic; it's the configuration that is specific to each flavor. So there is no legitimate reason to fork this off asPostgresIndex
or whatever because it's not a postgres index - it's an index definition with some postgres metadata attached. - Status changed to Needs work
11 months ago 9:50am 28 December 2023 - 🇳🇱Netherlands daffie
@bradjones1: The solution looks good to me.
We are missing some CRUD operations with the GIN/GIST indexes. We should also add functionality and testing to:
- to check that the GIN/GIST index exists;
- to add GIN/GIST index is created on an existing field;
- to remove a GIN/GIST index;
- when we change an existing field we can also add/remove a GIN/GIST index.Add some testing that when we create a new table with a GIN/GIST index that the index exists and is a GIN/GIST index. Please create a query with the explain to check that we have created a GIN/GIST index and that the query is using it.
- Status changed to Needs review
7 months ago 11:07pm 26 April 2024 - 🇺🇸United States bradjones1 Digital Nomad Life
Rebased.
Re: The requests in #16 -
to check that the GIN/GIST index exists
GIN/GIST indexes are cataloged by Postgres similar to other indexes, so checking they exist is already possible. To determine their type, you must dig a little deeper, so there is now an additional column of information about indexes (using
pg_get_indexdef()
) which we're also using in the new tests.to add GIN/GIST index is created on an existing field
to remove a GIN/GIST indexTest coverage added.
when we change an existing field we can also add/remove a GIN/GIST index
I'm not sure we need to do this, and if we do, this amounts to a new feature for the driver/DBAL overall and nothing specific to GIN/GIST. There is a lot of logic already in
::changeField()
but none of it currently touches indexes. In that respect GIN/GiST aren't special, and I think it's incumbent upon the caller to determine if the existing indexes will be compatible. I suggest addressing this in a separate issue. (GIN/GIST do introduce some interesting new considerations around index type support for changing field types, but I still don't think that means we must build a new feature into::changeField()
to get support in.)Add some testing that when we create a new table with a GIN/GIST index that the index exists and is a GIN/GIST index.
This is in the test coverage. Note that we are in a cart-and-horse situation with respect to testing GIN - its test is almost identical to GIST, we just need a supported field type to test with. So I put in a skipped test as a placeholder and once we commit this, we can enable that test coverage with/after JSON data types go in.
- Status changed to Needs work
6 months ago 3:06pm 8 May 2024 - 🇳🇱Netherlands daffie
The testbot is failing on PostgreSQL on your added code.
The CR needs examples on how the use the added GIN and GIST indexes.
- Status changed to Needs review
6 months ago 3:57pm 8 May 2024 - 🇺🇸United States bradjones1 Digital Nomad Life
Ah, how embarassing, I forget you need to manually trigger the PgSQL tests. Fixed with some updated coverage of the introspection method.
Updating the CR now.
- Status changed to Needs work
6 months ago 4:15pm 8 May 2024 - 🇺🇸United States bradjones1 Digital Nomad Life
Missed that there are also new MR comments.
- 🇸🇰Slovakia poker10
Thanks for great work @bradjones1! I have added some other comments to the MR.
- Status changed to Needs review
6 months ago 8:26pm 8 May 2024 - 🇺🇸United States bradjones1 Digital Nomad Life
Replied to MR comments and made a few changes as a result. Back to NR.
- 🇺🇸United States mradcliffe USA
I think this pretty good to commit. Everything makes sense based on the changes from daffie's review.
I updated the change record.
I think we can update the issue summary with the proposed resolution now. Maybe moving some of the initial resolution into the problem/motivation?
- 🇺🇸United States bradjones1 Digital Nomad Life
@mradcliffe would you mind RTBC'ing per your review?
- Status changed to RTBC
6 months ago 11:32pm 9 May 2024 - 🇺🇸United States mradcliffe USA
Agreed. We'll see if we need to do anything else :-)
- Status changed to Needs work
6 months ago 5:10am 23 May 2024 - 🇳🇿New Zealand quietone
I read the Issue Summary, the CR and then the MR. I reviewed the comments and the MR. I left comments in the MR and suggestions. At least one of the suggestions is incomplete, that is it still needs work.
The change record reads well but I was surprised to find that new enum is only mentioned at the end. I think that is important information and should be at the start. There is a code example for the 'after' situation. Can one be added for the 'before' case? And I don't follow this sentence, "The only backwards-compatibility break would be if a driver is explicitly checking the index definition with is_array(), which is currently unnecessary.' It is not clear to me when this is unnecessary, is it before this change or after. Maybe it is better to change this to something like this, "Sites that are using a driver that is explicitly checking the index definition with is_array() will need to ....'..