- Status changed to RTBC
almost 2 years ago 9:23am 16 January 2023 - Status changed to Needs review
almost 2 years ago 9:56am 15 February 2023 - 🇬🇧United Kingdom alexpott 🇪🇺🌍
Why did we chose GIST over GIN when the docs say that GIN is preferred? I read all the comments from when the GSIT functionality was introduced and I can't see it.
I wish there was a way to implement this without using magic index names. Magic like this always results in surprises. Can't we add something like another key to the schema that has the same structure as
'indexes'
but is'gists'
?Regardless of whether we do the magic index naming or add the new key to the schema we need to document this somewhere.
Actually I have a really good reason to add a new key to schema for this. If you create an index in a module that expects all databases to perform the same and it has _gist it will do different things if we make the change as is in #93. Whereas if we add a special 'gist' key that is only supported by postgres then a module can add a gist that will, as expected, only affect postgres and does not have to have a special postgres adapter module to override it's own storage implementation. It could just have something like:
// Add GIST indexes to make queries on the columns 'alias' and 'source' // faster on PostgreSQL. // @see https://www.postgresql.org/docs/12/textsearch-indexes.html $schema[$this->storage->getBaseTable()]['gists'] += [ 'path_alias__alias' => ['alias'], 'path_alias__path' => ['path'], ]; if ($revision_table = $this->storage->getRevisionTable()) { $schema[$revision_table]['gists'] += [ 'path_alias_revision__alias' => ['alias'], 'path_alias_revision__path' => ['path'], ]; }
- 🇳🇱Netherlands daffie
@alexpott: Moving the GIST/GIN indexes to their own key in the schema API, will result in PostgreSQL to have to override the class Drupal\Core\Entity\Sql\SqlContentEntityStorageSchema. As that class creates the entity tables and indexes. If we do that we will break contrib module on PostgreSQL that have content entities that override the base storage schema class.
What we can do is create a new service called Drupal\Core\Entity\Sql\SqlContentEntityStorageSchemaService that do everything that the class Drupal\Core\Entity\Sql\SqlContentEntityStorageSchem is now doing. The old class will than become a thin wrapper on top of the service. Database driver modules can than override the service and do what they need to do. Every contrib/custom module that changes content entity storage schema class will keeps working and there will be no BC break. - 🇬🇧United Kingdom alexpott 🇪🇺🌍
I think we might not need to override SqlContentEntityStorageSchema. Looking at addIndex we have the full table spec there - so maybe
// Add GIST indexes to make queries on the columns 'alias' and 'source' // faster on PostgreSQL. // @see https://www.postgresql.org/docs/12/textsearch-indexes.html $schema[$this->storage->getBaseTable()]['indexes'] += [ 'path_alias__alias' => ['alias'], 'path_alias__path' => ['path'], ]; $schema[$this->storage->getBaseTable()]['gists'] = array_merge($schema[$this->storage->getBaseTable()]['gists'] ?? [], ['path_alias__alias', 'path_alias__path']); if ($revision_table = $this->storage->getRevisionTable()) { $schema[$revision_table]['indexes'] += [ 'path_alias_revision__alias' => ['alias'], 'path_alias_revision__path' => ['path'], ]; $schema[$revision_table]['gists'] = array_merge($schema[$revision_table]['gists'] ?? [], ['path_alias_revision__alias', 'path_alias_revision__path']; }
And we can use this info in the addIndex() in postrgres. At least it is explicit and not magical. In order to make this less painful we could implement a helper static method in Postgres's schema to add a gist index. So the code becomes something like:
$schema[$this->storage->getBaseTable()] = Schema::addGistsToSchema([ 'path_alias__alias' => ['alias'], 'path_alias__path' => ['path'], ], $schema[$this->storage->getBaseTable()]); // Add then in the \Drupal\pgsql\Driver\Database\pgsql\Schema class... public static function addGistsToSchema(array $gists, array $table_spec) { $table_spec['indexes'] += $gists; $table_spec['gists'] = array_merge($table_spec['gists'] ?? [], array_keys($gists)); return $table_spec; }
Yet another way would be to use an object with array access to denote it is a gist index - but this is a way bigger API change. That said at some point table specifications should move away from ArrayPI and this at least gives us a requirement.
- Status changed to Needs work
over 1 year ago 11:46am 7 April 2023 The Needs Review Queue Bot → tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".
This does not mean that the patch needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.
Consult the Drupal Contributor Guide → to find step-by-step guides for working with issues.
- last update
over 1 year ago Patch Failed to Apply - 🇮🇹Italy kopeboy Milan
Is this only related to Postgres? I see the same type of (non-performant) path alias queries in Views with MariaDB
- 🇳🇱Netherlands daffie
@kopeboy: To test out if you are using an index, you should add "EXPLAIN " in front of the query and run it in something like "phpMyAdmin". For more info: https://mariadb.com/kb/en/explain/
- 🇮🇹Italy kopeboy Milan
Not sure.. are you saying I should add an index to taxonomy term tables that is not already in core?!
- 🇳🇱Netherlands daffie
@kopeboy Something like:
EXPLAIN SELECT "base_table"."path" AS "path", "base_table"."alias" AS "alias" FROM path_alias base_table WHERE (base_table.status = 1) AND (base_table.path LIKE '/taxonomy/term/1423' ESCAPE '\')
- 🇺🇸United States bradjones1 Digital Nomad Life
Dropping in here from ✨ Add "json" as core data type to Schema and Database API Needs work , where we need GIN index creation support for JSON path query-based indexes.
I hope this helps sway the discussion here regarding whether to use GIN or GiST; the former also allows for JSON data and would enable broader functionality in the DB API.
Updating the issue title to better reflect that this is an index-creation issue first, and also includes a specific performance improvement in path aliasing.
For what it's worth, GIN support for JSON is limited to certain operators, mostly when it comes to object/array/scalar "containment" and key-exists type queries. But, if we're going to support these types of indexes in Postgres, it would be great to also leverage GIN for power users/if we end up supporting containment queries at some point in the future.
- 🇳🇱Netherlands daffie
As the GIN indexes do to working for path aliases, the solution will be adding a GIST index. As the best solution for JSON fields is a GIN index, we should add them both to Drupal. Each having its own use case.
- 🇺🇸United States bradjones1 Digital Nomad Life
Re: #105 yes that's a good plan, however I had renamed this issue with the intent that a similar index-creation logic could be extended to GIN as well. I don't think it makes sense to have a separate issue that addresses pgsql index creation if the approach (special case by name) is going to be the same?
28:18 25:11 Running- 🇺🇸United States KoolAidGuy
I've attached a patch re-rolled for Drupal 10.1.5 as the other patches on this thread weren't applying.
- 🇸🇰Slovakia poker10
@KoolAidGuy The patch #107 is missing important parts from the last working patch #93 and it does not seems to be based on the patch #93 either. When doing rerolls please be carefull that the new patch does not loose any code, otherwise such rerolls cause only distraction. Also it is preffered to use merge requests these days (which could be a good idea also here, as it seems that this issue will require some additional changes based on #97). Thanks!
I am hiding the patch #107.
- Status changed to Postponed
11 months ago 12:50am 8 December 2023 - 🇺🇸United States bradjones1 Digital Nomad Life
Index creation (for both GIN and GIST, as there is now a concrete use case for each) is broken out into 📌 Adding GIN and GIST indexes to PostgreSQL databases RTBC .
- 🇺🇸United States bradjones1 Digital Nomad Life
I have an MR for handling creation of both GIN and GIST indexes over at 📌 Adding GIN and GIST indexes to PostgreSQL databases RTBC - please take a look.
- 🇺🇸United States bradjones1 Digital Nomad Life
📌 Adding GIN and GIST indexes to PostgreSQL databases RTBC has test coverage and is in Needs Review if anyone here wants to move that along to unblock this.
- 🇨🇦Canada mparker17 UTC-4
@bradjones1 would it be helpful to update this patch to use the API in 📌 Adding GIN and GIST indexes to PostgreSQL databases RTBC , given that it is RTBC?
Anything else I could help with?
- 🇺🇸United States mradcliffe USA
@mparker17 thanks for asking. I think the best thing is to start an issue fork and apply the patch you re-rolled into the issue fork branch and start a merge request.
Merge requests will soon be the only way to test and accept changes to Drupal core and contributed projects on drupal.org
- 🇺🇸United States bradjones1 Digital Nomad Life
Re: #13 you could apply the patch from that other issue to core using composer-patches and then use it in this issue, yes.
- Merge request !8101Draft: [PP-1] Performance issues with path alias generated queries on PostgreSQL → (Open) created by mparker17
- 🇨🇦Canada mparker17 UTC-4
> Re: #113 you could apply the patch from that other issue to core using composer-patches and then use it in this issue, yes.
I merged in the commits, but that idea sounds be a lot easier to review, so I'm going to try that shortly.
- 🇨🇦Canada mparker17 UTC-4
Re: #113 you could apply the patch from that other issue to core using composer-patches and then use it in this issue, yes.
@bradjones1 Hmm... could you explain how to do this, or point me towards some documentation?
Aside: in order to change this for !8101, I'd have to drop all the commits in the 📌 Adding GIN and GIST indexes to PostgreSQL databases RTBC branch (or re-create the branch), update composer.json to apply the patch, then re-apply the commits specific to this branch... which is fine, but I don't want to force-push until I know it's going to work, and local testing doesn't seem to work. For simplicity below, the logs will assume I'm re-creating the branch.
I've set up a dev environment using justafish/ddev-drupal-core-dev...
$ git clone --branch 11.x https://git.drupalcode.org/project/drupal.git d10-core $ cd d10-core $ ddev config --project-type=drupal10 $ ddev start $ ddev corepack enable $ ddev get justafish/ddev-drupal-core-dev $ ddev restart $ ddev composer install
... as I expected, it didn't seem like cweagans/composer-patches was required by anything...
$ ddev composer depends 'cweagans/composer-patches' In BaseDependencyCommand.php line 100: Could not find package "cweagans/composer-patches" in your project
... so I tried installing cweagans/composer-patches, but that failed because, I guess, core references itself in its own composer.json?
$ ddev composer require 'cweagans/composer-patches:^1' ./composer.json has been updated Running composer update cweagans/composer-patches > Drupal\Composer\Composer::ensureComposerVersion Loading composer repositories with package information Updating dependencies Your requirements could not be resolved to an installable set of packages. Problem 1 - Root composer.json requires drupal/core 1.0.0 (exact version match: 1.0.0 or 1.0.0.0), found drupal/core[dev-main] but it does not match the constraint. Problem 2 - Root composer.json requires drupal/core-project-message 1.0.0 (exact version match: 1.0.0 or 1.0.0.0), found drupal/core-project-message[dev-main] but it does not match the constraint. Problem 3 - Root composer.json requires drupal/core-vendor-hardening 1.0.0 (exact version match: 1.0.0 or 1.0.0.0), found drupal/core-vendor-hardening[dev-main] but it does not match the constraint. Use the option --with-all-dependencies (-W) to allow upgrades, downgrades and removals for packages currently locked to specific versions. Installation failed, reverting ./composer.json and ./composer.lock to their original content. Composer [require cweagans/composer-patches:^1] failed, composer command failed: exit status 2. stderr=
So I tried simply adding the patch to composer.json...
$ ddev composer config --json extra.patches.'drupal/core' '{"#3397622, !5839 Adding GIN and GIST indexes to PostgreSQL databases": "https://git.drupalcode.org/project/drupal/-/merge_requests/5839.diff"}' $ git diff diff --git a/composer.json b/composer.json index b56c8cec3e..e30ad42bdf 100644 --- a/composer.json +++ b/composer.json @@ -96,6 +96,11 @@ " and not intended to be used for production sites.", " See: https://www.drupal.org/node/3082474" ] + }, + "patches": { + "drupal/core": { + "#3397622, !5839 Adding GIN and GIST indexes to PostgreSQL databases": "https://git.drupalcode.org/project/drupal/-/merge_requests/5839.diff" + } } }, "autoload": {
...seems fine so far, so lets apply the patch...
$ ddev composer update drupal/core > Drupal\Composer\Composer::ensureComposerVersion Loading composer repositories with package information Updating dependencies Your requirements could not be resolved to an installable set of packages. Problem 1 - Root composer.json requires drupal/core 1.0.0 (exact version match: 1.0.0 or 1.0.0.0), found drupal/core[dev-main] but it does not match the constraint. Problem 2 - Root composer.json requires drupal/core-project-message 1.0.0 (exact version match: 1.0.0 or 1.0.0.0), found drupal/core-project-message[dev-main] but it does not match the constraint. Problem 3 - Root composer.json requires drupal/core-vendor-hardening 1.0.0 (exact version match: 1.0.0 or 1.0.0.0), found drupal/core-vendor-hardening[dev-main] but it does not match the constraint. Use the option --with-all-dependencies (-W) to allow upgrades, downgrades and removals for packages currently locked to specific versions. Composer [update drupal/core] failed, composer command failed: exit status 2. stderr=
... same as before.
What about just updating the lock file?
$ ddev composer update --lock > Drupal\Composer\Composer::ensureComposerVersion Loading composer repositories with package information Updating dependencies Your requirements could not be resolved to an installable set of packages. Problem 1 - Root composer.json requires drupal/core == 11.9999999.9999999.9999999-dev (exact version match), it is satisfiable by drupal/core[11.x-dev] from composer repo (https://repo.packagist.org) but drupal/core[dev-main] from path repo (core) has higher repository priority. The packages from the higher priority repository do not match your constraint and are therefore not installable. That repository is canonical so the lower priority repo's packages are not installable. See https://getcomposer.org/repoprio for details and assistance. Problem 2 - Root composer.json requires drupal/core-project-message == 11.9999999.9999999.9999999-dev (exact version match), it is satisfiable by drupal/core-project-message[11.x-dev] from composer repo (https://repo.packagist.org) but drupal/core-project-message[dev-main] from path repo (composer/Plugin/ProjectMessage) has higher repository priority. The packages from the higher priority repository do not match your constraint and are therefore not installable. That repository is canonical so the lower priority repo's packages are not installable. See https://getcomposer.org/repoprio for details and assistance. Problem 3 - Root composer.json requires drupal/core-vendor-hardening == 11.9999999.9999999.9999999-dev (exact version match), it is satisfiable by drupal/core-vendor-hardening[11.x-dev] from composer repo (https://repo.packagist.org) but drupal/core-vendor-hardening[dev-main] from path repo (composer/Plugin/VendorHardening) has higher repository priority. The packages from the higher priority repository do not match your constraint and are therefore not installable. That repository is canonical so the lower priority repo's packages are not installable. See https://getcomposer.org/repoprio for details and assistance. Composer [update --lock] failed, composer command failed: exit status 2. stderr=
... apparently not!
What if I try running GitLab CI tests locally → ?
$ alias drupal-ci-local='gitlab-ci-local --remote-variables git@git.drupal.org:project/gitlab_templates=includes/include.drupalci.variables.yml=main --variable="_GITLAB_TEMPLATES_REPO=project/gitlab_templates"' $ drupal-ci-local ... FAIL 📦️ Composer > Invalid version string "HEAD-dev" > > validate [--no-check-all] [--check-lock] [--no-check-lock] [--no-check-publish] [--no-check-version] [-A|--with-dependencies] [--strict] [--] [<file>]
... so I'm not sure what to do next.
- 🇨🇦Canada mparker17 UTC-4
On the plus side, though, it seems like tests are passing in the merge request as it is now.