[PP-1] Enforce conflict enforcement outside of tests and e2e tests

Created on 28 May 2025, 27 days ago

Overview

Follow-up to ๐Ÿ“Œ Add rudimentary conflict prevention to the preview end-point Active
In that issue we put in logic to throw an exception if client didn't send the latest auto-save hash for the entity and regions. This prevents 2 users for editing at the same time and wiping out each others changes

Trying to get the e2e tests to pass I discovered that rapid fire requests could cause this exception to be trigger without only 1 user editing the page.

So we decided to scope #3490565 to just the backend parts

Proposed resolution

In \Drupal\experience_builder\Controller\ApiLayoutController::validateAutoSaves() remove the special casing for phpunit tests.
Bring back the logic that was taken out of ๐Ÿ“Œ Add rudimentary conflict prevention to the preview end-point Active to send the autoSaves key from the client(commit link soon)
Implement system to handle rapid fire requests

User interface changes

๐Ÿ“Œ Task
Status

Postponed

Version

0.0

Component

Page builder

Created by

๐Ÿ‡บ๐Ÿ‡ธUnited States tedbow Ithaca, NY, USA

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

  • Issue created by @tedbow
  • ๐Ÿ‡บ๐Ÿ‡ธUnited States tedbow Ithaca, NY, USA
  • ๐Ÿ‡บ๐Ÿ‡ธUnited States tedbow Ithaca, NY, USA
  • Merge request !1080Resolve #3526907 Client auto-save conflicts โ†’ (Closed) created by tedbow
  • ๐Ÿ‡บ๐Ÿ‡ธUnited States tedbow Ithaca, NY, USA

    I brought in the changes from ๐Ÿ“Œ Add rudimentary conflict prevention to the preview end-point Active because I made progress getting the e2e test to pass, so I thought I would the current state

  • ๐Ÿ‡บ๐Ÿ‡ธUnited States tedbow Ithaca, NY, USA

    The only 2 e2e tests that are failing now are `entity-form-field-types-test.cy.js` which does pass for me locally and `media-library.cy.js` which did fail locally but I didn't see an 409 request errors in the log. So maybe these are unrelated random test errors? Not sure, retesting

  • ๐Ÿ‡ฆ๐Ÿ‡บAustralia larowlan ๐Ÿ‡ฆ๐Ÿ‡บ๐Ÿ.au GMT+10
  • ๐Ÿ‡ง๐Ÿ‡ชBelgium wim leers Ghent ๐Ÿ‡ง๐Ÿ‡ช๐Ÿ‡ช๐Ÿ‡บ

    AFAICT this is also a beta blocker? Otherwise we haven't solved the current auto-save-data-loss scenarios described in ๐Ÿ“Œ Add rudimentary conflict prevention to the preview end-point Active ?

  • ๐Ÿ‡ฎ๐Ÿ‡ณIndia nagwani
  • ๐Ÿ‡บ๐Ÿ‡ธUnited States tedbow Ithaca, NY, USA

    I am actually surprised that no existing e2e tests failed ๐ŸŽ‰

    2 new tests failed

    1. rapid-component-addition.cy.js: I am not sure this can really be test reliability give the random test fails we have in e2e and this test would solely rely on timing. I am removing it
    2. auto-save-conflict.cy.js: This was attempt to try to simulate 2 users with 2 different browser sessions having a conflict. I am not sure how to do this in cypress and searched around a little and couldn't get it to work. Someone with more experience might have better luck
  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom jessebaker

    I am having a hard time coming up with a robust solution on the client side.

    There are two different issues.

    The first issue is concurrent editing from two different client instances. That can be either two different users or the same user in two different tabs.

    This is solved by the autoSaves hash being passed back and forth and checked and a warning being issued in the instance that a conflict/mismatch arises. โœ…

    The second issue is proving more difficult to tackle. This issue occurs when a single user in a single client is able to trigger the mismatch scenario by making multiple requests in quick succession.

    The issue happens in the following scenario:

    1. The client sends an update to the server with the current autoSaves hash (weโ€™ll call it 1).
    2. While that request is still in flight, the client sends another update to the server. This request also has autoSave hash 1 because the first request has not returned yet with the updated hash.
    3. The server receives the first request and generates a new autoSave hash (2)
    4. The response to the first request comes back to the client and the local autoSave hash is updated to 2.
    5. The server receives the second request (which has autoSave hash 1) and rejects it (because it should be 2!) as 409
    6. The response to the second request is received by the client as 409 and the user is shown an error.

    Potential solutions (to problem 2)

    1. โ€œLockโ€ the UI when a request is in flight so that subsequent requests cannot be sent
    2. Silently discard new requests made when a request is in flight (this is what Claudeโ€™s attempt is doing in !MR1080)
    3. Mark in flight requests as "aborted" when a subsequent request is made. When an aborted response returns, discard the returned data knowing that the second response is on the way.
    4. Create a client-side โ€œpendingโ€ system. While a request is in flight, subsequent data updates to the layout/model are stored as a local variable. When a response is received the local variable is checked and if the variable is set, generate a new request with the data from the variable combined with the autoSaves hash from the most recent response. Then Immediately unset the variable. If multiple data updates are made before the response comes back, they can each just overwrite the local variable as each change is a cumulative data that includes all previous data updates.
    5. On loading the React app generate a uuid to use as a client instance ID. Alongside the autoSave hash also send the client instance ID to the server. When the server receives consecutive requests with the same client instance ID, bypass the autoSave hash check as we know that they have come from a single instance and thus canโ€™t be conflicting concurrent changes.

    I've investigated the above solutions and have the following conclusions:

    1. This has poor UX implications. If we visually indicate the UI is locked it appears flickery on a good connection. If we just silently ignore user actions it makes the app feel unresponsive
    2. This allows the server and client to get out of date. The user makes 3 actions but the 2nd and 3rd are silently discarded. It looks like the actions have been made, but on the server only the 1st actually happened.
    3. I spent a long time implementing this only to find that it doesn't solve the issue. The subsequent request still has the out of data autoSave hash at the point it is dispatched so it still leads to a 409 error.
    4. I have spent even longer attempting to create this system. After 2 days I've not been able to successfully get this working with RTK query. I DO think it's possible, it's just VERY difficult and is resulting in a large amount of additional complexity
    5. I have just had this idea and I'm writing this comment on the issue because I'd like to get a second opinion on if its feasible. I think it would work and is less complex than the pending system above.
  • ๐Ÿ‡บ๐Ÿ‡ธUnited States tedbow Ithaca, NY, USA

    I think there is pontential problem with #11 5)

    they have come from a single instance and thus canโ€™t be conflicting concurrent changes.

    Is this true? What if the client makes 3 request in sequence and for some reason request 1 takes a long time to connect? Basically request 2 and 3 are processed and then request 1 gets processed. Wouldn't that wipe out changes in 2 and 3? Is what I suggesting even possible, maybe on randomly sluggish networks? Maybe ok to handle this edge case?

  • ๐Ÿ‡ฆ๐Ÿ‡บAustralia larowlan ๐Ÿ‡ฆ๐Ÿ‡บ๐Ÿ.au GMT+10

    It may be that ๐Ÿ“Œ Replace the postPreview action with atomic equivalents Active is a better use of effort if the approach here (which was meant to be a short-term quick fix) is not possible.

  • ๐Ÿ‡ง๐Ÿ‡ชBelgium wim leers Ghent ๐Ÿ‡ง๐Ÿ‡ช๐Ÿ‡ช๐Ÿ‡บ

    +1 for re-assessing ๐Ÿ“Œ Replace the postPreview action with atomic equivalents Active . It's now >6 months old, but still relevant.

    Marking and leaving assigned to @tedbow to convey we should first check that avenue.

  • Pipeline finished with Failed
    13 days ago
    Total: 1091s
    #519668
  • ๐Ÿ‡บ๐Ÿ‡ธUnited States tedbow Ithaca, NY, USA

    re #13 and #14 I am currently a -1 on doing ๐Ÿ“Œ Replace the postPreview action with atomic equivalents Active because

    1. There is nothing in #3492065 for "User interface changes"(I just copied that part of ๐Ÿ“Œ Add rudimentary conflict prevention to the preview end-point Active to here now)
      if we "Enable concurrent editing" there probably will be some very tricky UX problems and we should figure the desired outcomes before we start making the solution

      Will concurrent editing just work perfectly and we all agree exactly what that means so there is no describe the UX? ๐Ÿ˜œ

      for instance what happens if I am editing props for component, which should show the updates I as type, at the same time another user does another operation that moved that operation at out of view? This would either be by the other user reorder components at the same level as mine or adding another component above it that puts my out of view. Or it could be changing the props of component above mine that changes the height.

    2. Is there no possibility of conflict still in that system? I understand it would be less possible because you are allowing 2 users to edit different parts of the page. But 2 users could still try to edit the same component where 1 user didn't start with the latest changes that were just auto-saved correct?

      If so think a lot of the problems documented in #11 still exist just at a different level down

      I think #12 out of sync requests from the same user could still be problem. Especially because we send auto-save requests as you type for single prop of single component

      So I think you we wouldn't be saving time but not having to solve #11 it would just be a different level, per component

    3. ๐Ÿ“Œ Replace the postPreview action with atomic equivalents Active just seems more complicated so there would be more unknown unknowns. I think we are further along on the current implementation so there would less, less risk
  • Pipeline finished with Failed
    12 days ago
    Total: 818s
    #520552
  • ๐Ÿ‡บ๐Ÿ‡ธUnited States tedbow Ithaca, NY, USA

    Just wanted to make clear my comments in #16 were all with the idea that we have limited time before Beta1 - re ๐ŸŒฑ Milestone 1.0.0-beta1: Start creating non-throwaway sites Active and probably have a lot of other issues. The product decision has been made that we don't need concurrent editing in Beta1. Not sure about GA. but I assume we have a lot of other features that were prioritized in for Beta1

    If we had more development resources and there weren't other features that were deemed beta blockers I would be all for getting concurrent editing in before beta1. But also if also if we made that decision won't the first order of business be defining the UX for that?

  • Pipeline finished with Failed
    11 days ago
    Total: 266s
    #520790
  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom jessebaker

    @tedbow I've found a bug that I think needs to be addressed server side.

    Steps

    1. Ensure you have no autosaves/pending changes
    2. Reload the page to ensure you start fresh
    3. Make a change (e.g. edit a component prop)
    4. Make another change. Note that these changes both make requests with the correct/latest autoSave hash in the payload
    5. Publish all changes
    6. Make a third change. Note that this request correctly sends the last autoSave hash that was received in .4
    7. The response to this third change is a 409.

    I suspect the latest autoSave hash on the server side is either being deleted or updated when changes are published but not communicated back to the FE so the client ends up with an out of date hash.

    I think that either a) publishing should not update the hash, or b) the response from a publish action needs to include the updated hash to the FE can store it for the next request.

  • Pipeline finished with Failed
    11 days ago
    #521462
  • Pipeline finished with Failed
    11 days ago
    Total: 973s
    #521474
  • Pipeline finished with Failed
    11 days ago
    Total: 960s
    #521521
  • Pipeline finished with Failed
    10 days ago
    #521738
  • First commit to issue fork.
  • Pipeline finished with Success
    10 days ago
    Total: 3570s
    #521874
  • ๐Ÿ‡บ๐Ÿ‡ธUnited States tedbow Ithaca, NY, USA

    @jessebaker I found the problem that is causing #18

    It is because during Publish we delete the auto-save entry, including the hashes and the client instance id, for the items that were published. This is because there are no auto-saved information at this point. Everything matches what should have been just saved. We don't want this to show up "Review X changes"

    So when the client sends the auto-saved hashes when find a mismatch, client has auto-saved hashes and the server doesn't. Now we could change the server side mechanism to stored this information separate from the actual auto-save information we will need to publish the changes but that would make the server side more complex. Also it is also possible that this particular entity will never be edited within XB again, so this keeps us from having to keep auto-saved hash information for any entity that was ever edited in XB forever.

    So I was going to suggest that when submit a publish request and get response from the server that you delete the auto-save hashes. That would solve the problem in #18 but I did make me think of another problem, that actually exists right now

    1. User 1 open xb/xb_page/1/editor, does make any changes, there is no auto-saved hash
    2. User 1 leave their browser open
    3. User 2 opens xb/xb_page/1/editor makes some changes
    4. User 2 publishes xb_page 1
    5. User 1, makes a change in their browser session that they never closed

    I think the effect would be that User 1 would make an auto-saved version that would not be aware of the published changes and so would wipe out those changes when it was published again.

    We could solve this by not deleting the auto-saved entry the server-side, and therefore throwing a conflict error for User 1. But again this would require use to keep auto-save hash information forever for all entities, which could start to add up on larger sites.

    So I don't think this is a solution.

    I think what we need to do is have the client be aware of the starting point version of the entity it is editing.

    So for example for a page or node this would be version number, vid or changed property.
    Basically the server would return to the client, along with the "autosaves" key, a "verison_id" key of the last saved version on the server. The client would always return this key.

    So in the example above User 1's last request would properly get a 409 error because it's version number would be 1 version number behind. But this would also require the client to request /xb/api/v0/layout/xb_page/1 after publishing. Though after ๐Ÿ› [Meta] Selective Publishing and Reverting Active the client could try to be more selective and not request if the current page was not publish it would probably be safer to always just request it.

    There are probably unrelated reasons why the client should always call /xb/api/v0/layout/xb_page/1 after publishing. For example Drupal has hook_entity_presave which allows altering an entity before it is saved. Of course XB will not be aware of any changes made in custom code here, therefore if XB doesn't call /xb/api/v0/layout/xb_page/1 after publishing the auto-save request could wipe out any changes that were made in any hook_entity_presave code

  • ๐Ÿ‡บ๐Ÿ‡ธUnited States tedbow Ithaca, NY, USA

    I pushed up 86ce2085 which will return the revision id under "auto" . Probably at least autoSaves: Record<string, string>; will need to updated on the front-end because this now an object.

    but it seems to work manually testing

  • ๐Ÿ‡บ๐Ÿ‡ธUnited States tedbow Ithaca, NY, USA

    Note I have not had time to fix the phpunit tests but the e2e test passed so that is good sign. Just to need to update the phpunit tests to expect to a nested structure with revision info from the client.

  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom jessebaker

    @tedbow re: #18 #20

    1) I think perhaps we should create/address this in a follow up
    2) +1 to using 'changed' or even perhaps 'lastModified' and using a timestamp - I could see that information actually being useful elsewhere in the UI at some point.
    3) After publishing I think we already refetch some data so I don't see it being a problem to update the page.

  • Pipeline finished with Failed
    6 days ago
    Total: 818s
    #524823
  • Pipeline finished with Failed
    6 days ago
    #524849
  • Pipeline finished with Failed
    6 days ago
    Total: 740s
    #524863
  • Pipeline finished with Canceled
    6 days ago
    Total: 127s
    #524870
  • Pipeline finished with Failed
    6 days ago
    Total: 1044s
    #524871
  • Pipeline finished with Failed
    6 days ago
    Total: 493s
    #524903
  • Pipeline finished with Failed
    6 days ago
    Total: 928s
    #524909
  • Pipeline finished with Failed
    6 days ago
    Total: 1110s
    #525016
  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom jessebaker

    There are probably unrelated reasons why the client should always call /xb/api/v0/layout/xb_page/1 after publishing.

    #20

    As of my latest commit the FE now re-requests the page after publishing changes IF the current page is amongst the list of pending changes.

    In my testing there is one remaining bug that I can see that is quite tricky to reproduce.

    1. In a browser tab start from a blank page, no autosaves (everything published)
    2. Copy the URL and open a new tab so you have two tabs with the same blank page state.
    3. In Tab 1, add a Heading component and immediately publish all changes
    4. Switch to Tab 2.
    5. Preview is now out of date (showing blank still, but should have a Heading on it)
    6. Add a Hero component
    7. There should be a conflict warning but there isn't.

    From what I can see, the issue is that the autosave hash has autoSaveRevision but it's always "1". Nothing I do ever seems to increment that number (that is actually a string).

     {
      "autoSaveRevision": "1",
      "hash": "efa8dd3ce4f5f9e1"
    }
    
  • ๐Ÿ‡บ๐Ÿ‡ธUnited States tedbow Ithaca, NY, USA

    re #24 ok might know what is happening.....

  • Pipeline finished with Failed
    6 days ago
    Total: 848s
    #525244
  • ๐Ÿ‡บ๐Ÿ‡ธUnited States tedbow Ithaca, NY, USA

    @jessebaker thanks for testing this. I was testing this with nodes and it was working.

    Basically entity have a seperate revision id but if we create a new revision on every save this won't increment. I guess Page is not incrementing which I guess makes sense because we don't have a way to revert to old revisions or view old revisions AFIACT

    So I used the "changed" time also in the revision for the client

  • Pipeline finished with Failed
    6 days ago
    Total: 997s
    #525251
  • Pipeline finished with Failed
    6 days ago
    Total: 1060s
    #525267
  • Pipeline finished with Success
    6 days ago
    Total: 2645s
    #525282
  • Pipeline finished with Failed
    5 days ago
    Total: 928s
    #525377
  • ๐Ÿ‡บ๐Ÿ‡ธUnited States tedbow Ithaca, NY, USA

    changing title to make clear we are only handling content entities, node and pages with layouts

    We will need this too ๐Ÿ“Œ For config entities add client support conflict detection via the `autoSaves' key Active

  • Pipeline finished with Failed
    5 days ago
    Total: 1063s
    #525403
  • Pipeline finished with Failed
    5 days ago
    Total: 1857s
    #525425
  • Pipeline finished with Failed
    5 days ago
    Total: 821s
    #525450
  • Pipeline finished with Success
    5 days ago
    Total: 810s
    #525461
  • Pipeline finished with Failed
    5 days ago
    Total: 670s
    #525533
  • Pipeline finished with Canceled
    5 days ago
    Total: 547s
    #525561
  • ๐Ÿ‡บ๐Ÿ‡ธUnited States tedbow Ithaca, NY, USA
  • Pipeline finished with Success
    5 days ago
    Total: 1761s
    #525567
  • ๐Ÿ‡บ๐Ÿ‡ธUnited States effulgentsia
  • ๐Ÿ‡ง๐Ÿ‡ชBelgium wim leers Ghent ๐Ÿ‡ง๐Ÿ‡ช๐Ÿ‡ช๐Ÿ‡บ

    How do I begin reviewing this MR? There's lots of comments here, and the issue summary doesn't talk about the client_id and clientInstanceId this MR is adding.

    Can we get an up-to-date issue summary to facilitate a review? And ideally also sprinkle โ„น๏ธ โ€ฆ comments over the MR to make it clearer still ๐Ÿ™

    P.S.: if ๐Ÿ“Œ Document the reasons for not using Workspaces for saving in 0.2.0 Active had landed, then I'd have asked to update that ADR. In fact โ€ฆ any reason not to do that now, @tedbow? ๐Ÿ˜‡๐Ÿ™ That'd be even better than an issue summary update!

  • ๐Ÿ‡ง๐Ÿ‡ชBelgium wim leers Ghent ๐Ÿ‡ง๐Ÿ‡ช๐Ÿ‡ช๐Ÿ‡บ
  • Pipeline finished with Failed
    4 days ago
    Total: 618s
    #527126
  • Pipeline finished with Failed
    4 days ago
    Total: 629s
    #527235
  • Pipeline finished with Failed
    4 days ago
    Total: 997s
    #527250
  • Pipeline finished with Failed
    3 days ago
    Total: 849s
    #527307
  • ๐Ÿ‡บ๐Ÿ‡ธUnited States tedbow Ithaca, NY, USA

    re-rolled, added some more code comments and updating the summary but not finished. Leaving assigned to myself as I am working on the summary

  • ๐Ÿ‡บ๐Ÿ‡ธUnited States tedbow Ithaca, NY, USA
  • ๐Ÿ‡บ๐Ÿ‡ธUnited States tedbow Ithaca, NY, USA

    updated more.... but still more to update

    leaving assigned to myself

  • ๐Ÿ‡บ๐Ÿ‡ธUnited States tedbow Ithaca, NY, USA
  • Pipeline finished with Failed
    about 12 hours ago
    Total: 479s
    #529377
  • Pipeline finished with Failed
    about 12 hours ago
    Total: 289s
    #529385
  • Pipeline finished with Failed
    about 12 hours ago
    Total: 718s
    #529387
  • Pipeline finished with Failed
    about 12 hours ago
    Total: 928s
    #529403
  • ๐Ÿ‡บ๐Ÿ‡ธUnited States tedbow Ithaca, NY, USA

    Change the MR so that \Drupal\experience_builder\Controller\ApiLayoutController::patch only validates the region that contains the component being updated.

    I need to add test to demonstrate how this works

  • Pipeline finished with Failed
    about 11 hours ago
    Total: 777s
    #529435
  • ๐Ÿ‡บ๐Ÿ‡ธUnited States tedbow Ithaca, NY, USA

    Added test coverage for regions to \Drupal\Tests\experience_builder\Kernel\ApiLayoutControllerPostTest::testOutdatedAutoSave. \Drupal\Tests\experience_builder\Kernel\ApiLayoutControllerPatchTest::test already had a test for a conflict with a region

  • ๐Ÿ‡บ๐Ÿ‡ธUnited States tedbow Ithaca, NY, USA

    @wim leers Re: #31 I have update the summary. I have had a chance to update ๐Ÿ“Œ Document the reasons for not using Workspaces for saving in 0.2.0 Active but I think this issue can be reviewed. Most of the back-end changes are in \Drupal\experience_builder\Controller\ApiLayoutController::validateAutoSaves. \Drupal\Tests\experience_builder\Kernel\ApiLayoutControllerPostTest::testOutdatedAutoSave is the most explicit test of this. \Drupal\Tests\experience_builder\Kernel\ApiLayoutControllerPatchTest::test() also catches ConflictHttpException which is thrown from validateAutoSaves()

Production build 0.71.5 2024