- Issue created by @amateescu
- @amateescu opened merge request.
- 🇷🇴Romania amateescu
Added the upgrade path with test coverage, this should be fully ready for review now :)
- 🇩🇪Germany Fabianx
I am providing an AI review (made by Gemini) for discussion later:
Decision: Request Changes I would not accept this PR in its current state, although it is a very strong and well-thought-out patch. The architectural direction is excellent, but there are some significant risks and areas for improvement that should be addressed before it's merged into a stable branch like 11.x. --- High-Level Review This patch introduces a fundamental change to how Drupal's Workspaces module tracks entity revisions, by adding a new workspace_association_revision table. The Good: * Improved Architecture: The new table decouples workspace logic from the underlying storage of versionable entities. The previous implementation with its complex queries and joins directly on entity revision tables was brittle and inefficient. This change is a major improvement for maintainability and performance. * Excellent Test Coverage: The addition of a new UpdatePathTestBase test (WorkspaceAssociationRevisionTableUpdateTest) specifically for the data migration is fantastic. The approach of re-implementing the old logic within the test to compare the results before and after the update is a robust way to ensure the refactoring is correct. The Concerns: * Data Migration Risk: The workspaces_update_11302() hook performs a data migration that could be extremely slow or even time out on production sites with a large number of entities and revisions managed by workspaces. Running heavy queries inside a loop is risky for an update hook. --- Specific Feedback and Suggestions Here are the detailed points I would raise in a PR review. 1. Critical: Update Hook Performance The workspaces_update_11302() function needs to be more robust for large datasets. * Issue: The code iterates through every workspace and then every entity type within that workspace, executing a potentially complex database query for each combination. This will not scale well. * Suggestion: This update is a prime candidate for the Batch API. The process should be converted into a batch job that processes a limited number of workspaces or entities in each run. This is the standard Drupal way to handle long-running updates and prevent timeouts during drush updb or updates via the UI. 2. Code Style and Refactoring * Issue: In WorkspaceAssociation.php, the methods getAssociatedRevisions() and getAssociatedInitialRevisions() contain very similar caching logic. They both check for a cached property and then run a database query if it's not present. * Suggestion: Consolidate this logic. A single protected method could be responsible for querying the workspace_association_revision table and populating both $this->associatedRevisions and $this->associatedInitialRevisions caches at the same time. The public methods would then become simple getters that access the cached data. This would reduce code duplication. * Issue: In workspaces_update_11302(), the logic to determine if a revision is the "initial" one relies on an implicitly named column from a joined table: $revision->{\"base_$revision_id_field\"}. * Suggestion: This is not guaranteed to be portable across all database drivers that Drupal supports. The query should explicitly alias the field for reliability. * Change: $revision_query->fields('base', [$revision_id_field]) * To: $revision_query->addField('base', $revision_id_field, 'base_revision_id'); * And then use $revision->base_revision_id in the logic. 3. Test Fixture Management * Issue: The new test fixture workspaces-10.3.0.php is over 2800 lines long and consists mostly of enormous, unreadable serialized PHP object strings. * Suggestion: While this is a common pattern in Drupal core for historical reasons, it's worth questioning if there's a more maintainable way. Could a smaller, more targeted fixture be used? If not, at least ensure the comment at the top of the file, which is very helpful, is kept up-to-date with any changes. For this PR, I wouldn't block on this, but it's a technical debt concern. 4. Minor: Unused use statement * Issue: In WorkspaceAssociation.php, the use Drupal\Core\Entity\Sql\SqlContentEntityStorage; statement is removed from the top, which is correct. However, a new one is added in workspaces.install that is not used. * Suggestion: Remove the unused use statement from workspaces.install. In summary, this is a high-quality and valuable patch that is very close to being acceptable. The primary blocker is the risk associated with the update hook's performance. Once that is addressed by moving to a batch process, and the minor code suggestions are implemented, I would be happy to approve it.My own verdict (so far):
- Agree on batch API (at least one transaction per workspace), but I can probably get this conversion done with AI (so no need for Andrei to do)
- Need to look into this in detail first
- Disagree - it's standard practice
- Disagree - it's standard practice
- No idea if it's true, but would have thought an IDE would have seen that. Low priority and might be false-flag
Claude brought up some other points, which I however need to review first as it also hallucinated a missing upgrade path ...
Overall from my own review so far: Looks good, but need to look more detailed still.
- 🇷🇴Romania amateescu
That's a surprisingly insightful review :)
- Not sure I agree with the need to use batch API, I'd rather use a single transaction for the whole thing. I tested this upgrade path on a site with 13.065 workspace associations and a
node_revisiontable with 150k revisions, and it runs in 8 seconds :) - Both good points, fixed.
- Also disagree.
- Yep, false flag, that class is used in
workspaces_update_11302()
- Not sure I agree with the need to use batch API, I'd rather use a single transaction for the whole thing. I tested this upgrade path on a site with 13.065 workspace associations and a
- 🇬🇧United Kingdom catch
I don't think that it is helpful to post known incorrect points from an LLM onto a complex core issue even with the rebuttals inckuded. I have just had to read hundreds of words and think about them (and haven't re-reviewed the MR yet) whereas the actual review points boil down to:
- The update should use an alias for a table name. (and this could have been found in linting with a phpstan rule).
- We should decide whether the update should batch by workspace or not based on real world testing (which Andrei had apparently already done).
- 🇬🇧United Kingdom catch
I have an issue open about the fixtures but that issue is irrelevant to any new upgrade path testing we have now. I'm only linking it so that no one opens a 'new' followup for the LLM point. 📌 Rework database update tests so we don't have to ship database dumps in git Active .
The Needs Review Queue Bot → tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".
This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.
Consult the Drupal Contributor Guide → to find step-by-step guides for working with issues.
- 🇷🇴Romania amateescu
Merged latest 11.x, and it was a clean merge, not sure what the bot doesn't like.