Refactor workspace association tracking to use a dedicated revision table

Created on 7 August 2025, 3 months ago

Problem/Motivation

The current workspace association implementation relies on complex queries against entity storage tables to determine which revisions belong to which workspace. This approach:

  • Requires SqlContentEntityStorage, limiting support for custom entity storage handlers
  • Uses complex queries on entity base and revision tables
  • Has performance implications for large datasets
  • Makes the code harder to maintain and understand

Proposed resolution

Introduce a new workspace_association_revision table to explicitly track all entity revisions associated with a workspace.

Additionally, this clears the path towards deprecating/removing the workspace revision metadata field, which is needed in order to implement moving revisions between workspaces in a performant way as required for 📌 Allow for / implement simplified content workflow with workspaces Active .

Remaining tasks

Review.

User interface changes

Nope.

Introduced terminology

N/A.

API changes

Nope.

Data model changes

Added a new workspace_association_revision table to track all workspace-associated revisions.

Release notes snippet

The Workspaces module now uses a dedicated database table to track all workspace-associated entity revisions, improving performance and removing limitations on custom entity storage implementations.

📌 Task
Status

Active

Version

11.0 🔥

Component

workspaces.module

Created by

🇷🇴Romania amateescu

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Comments & Activities

  • Issue created by @amateescu
  • @amateescu opened merge request.
  • 🇷🇴Romania amateescu

    Added the upgrade path with test coverage, this should be fully ready for review now :)

  • 🇩🇪Germany Fabianx

    I am providing an AI review (made by Gemini) for discussion later:

      Decision: Request Changes
    
      I would not accept this PR in its current state, although it is a very
       strong and well-thought-out patch. The architectural direction is
      excellent, but there are some significant risks and areas for
      improvement that should be addressed before it's merged into a stable
      branch like 11.x.
    
      ---
    
      High-Level Review
    
      This patch introduces a fundamental change to how Drupal's Workspaces
      module tracks entity revisions, by adding a new
      workspace_association_revision table.
    
      The Good:
    
       * Improved Architecture: The new table decouples workspace logic from
         the underlying storage of versionable entities. The previous
         implementation with its complex queries and joins directly on entity
         revision tables was brittle and inefficient. This change is a major
         improvement for maintainability and performance.
       * Excellent Test Coverage: The addition of a new UpdatePathTestBase
         test (WorkspaceAssociationRevisionTableUpdateTest) specifically for
         the data migration is fantastic. The approach of re-implementing the
         old logic within the test to compare the results before and after the
          update is a robust way to ensure the refactoring is correct.
    
      The Concerns:
    
       * Data Migration Risk: The workspaces_update_11302() hook performs a
         data migration that could be extremely slow or even time out on
         production sites with a large number of entities and revisions
         managed by workspaces. Running heavy queries inside a loop is risky
         for an update hook.
    
      ---
    
      Specific Feedback and Suggestions
    
      Here are the detailed points I would raise in a PR review.
    
      1. Critical: Update Hook Performance
    
      The workspaces_update_11302() function needs to be more robust for
      large datasets.
    
       * Issue: The code iterates through every workspace and then every
         entity type within that workspace, executing a potentially complex
         database query for each combination. This will not scale well.
       * Suggestion: This update is a prime candidate for the Batch API. The
         process should be converted into a batch job that processes a limited
          number of workspaces or entities in each run. This is the standard
         Drupal way to handle long-running updates and prevent timeouts during
          drush updb or updates via the UI.
    
      2. Code Style and Refactoring
    
       * Issue: In WorkspaceAssociation.php, the methods
         getAssociatedRevisions() and getAssociatedInitialRevisions() contain
         very similar caching logic. They both check for a cached property and
          then run a database query if it's not present.
       * Suggestion: Consolidate this logic. A single protected method could
         be responsible for querying the workspace_association_revision table
         and populating both $this->associatedRevisions and
         $this->associatedInitialRevisions caches at the same time. The public
          methods would then become simple getters that access the cached
         data. This would reduce code duplication.
    
       * Issue: In workspaces_update_11302(), the logic to determine if a
         revision is the "initial" one relies on an implicitly named column
         from a joined table: $revision->{\"base_$revision_id_field\"}.
       * Suggestion: This is not guaranteed to be portable across all database
          drivers that Drupal supports. The query should explicitly alias the
         field for reliability.
           * Change: $revision_query->fields('base', [$revision_id_field])
           * To: $revision_query->addField('base', $revision_id_field, 
             'base_revision_id');
           * And then use $revision->base_revision_id in the logic.
    
      3. Test Fixture Management
    
       * Issue: The new test fixture workspaces-10.3.0.php is over 2800 lines
         long and consists mostly of enormous, unreadable serialized PHP
         object strings.
       * Suggestion: While this is a common pattern in Drupal core for
         historical reasons, it's worth questioning if there's a more
         maintainable way. Could a smaller, more targeted fixture be used? If
         not, at least ensure the comment at the top of the file, which is
         very helpful, is kept up-to-date with any changes. For this PR, I
         wouldn't block on this, but it's a technical debt concern.
    
      4. Minor: Unused use statement
    
       * Issue: In WorkspaceAssociation.php, the use 
         Drupal\Core\Entity\Sql\SqlContentEntityStorage; statement is removed
         from the top, which is correct. However, a new one is added in
         workspaces.install that is not used.
       * Suggestion: Remove the unused use statement from workspaces.install.
    
      In summary, this is a high-quality and valuable patch that is very
      close to being acceptable. The primary blocker is the risk associated
      with the update hook's performance. Once that is addressed by moving
      to a batch process, and the minor code suggestions are implemented, I
      would be happy to approve it.
    

    My own verdict (so far):

    1. Agree on batch API (at least one transaction per workspace), but I can probably get this conversion done with AI (so no need for Andrei to do)
    2. Need to look into this in detail first
    3. Disagree - it's standard practice
    4. Disagree - it's standard practice
    5. No idea if it's true, but would have thought an IDE would have seen that. Low priority and might be false-flag

    Claude brought up some other points, which I however need to review first as it also hallucinated a missing upgrade path ...

    Overall from my own review so far: Looks good, but need to look more detailed still.

  • 🇷🇴Romania amateescu

    That's a surprisingly insightful review :)

    1. Not sure I agree with the need to use batch API, I'd rather use a single transaction for the whole thing. I tested this upgrade path on a site with 13.065 workspace associations and a node_revision table with 150k revisions, and it runs in 8 seconds :)
    2. Both good points, fixed.
    3. Also disagree.
    4. Yep, false flag, that class is used in workspaces_update_11302()
  • 🇬🇧United Kingdom catch

    I don't think that it is helpful to post known incorrect points from an LLM onto a complex core issue even with the rebuttals inckuded. I have just had to read hundreds of words and think about them (and haven't re-reviewed the MR yet) whereas the actual review points boil down to:

    - The update should use an alias for a table name. (and this could have been found in linting with a phpstan rule).

    - We should decide whether the update should batch by workspace or not based on real world testing (which Andrei had apparently already done).

  • 🇬🇧United Kingdom catch

    I have an issue open about the fixtures but that issue is irrelevant to any new upgrade path testing we have now. I'm only linking it so that no one opens a 'new' followup for the LLM point. 📌 Rework database update tests so we don't have to ship database dumps in git Active .

  • The Needs Review Queue Bot tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".

    This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.

    Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.

  • 🇷🇴Romania amateescu

    Merged latest 11.x, and it was a clean merge, not sure what the bot doesn't like.

Production build 0.71.5 2024