Fix race condition for user registration and profile creation

Created on 9 August 2023, 11 months ago
Updated 29 March 2024, 3 months ago

Problem/Motivation

In Open Social we want to ensure every user always has a default profile for our profile type. This simplifies a lot of code downstream that deals with users and user names (for example displaying the right name in a registration e-mail). To solve this we implemented hook_user_insert and created a profile when a user is created.

The problem occurs when adding the ProfileFormWidget to the user registration form. The widget will prepare profiles, but won't actually save them until after the user creation is completed (which includes all invocations of hook_user_insert). It then calls saveProfiles where the profiles will actually be persisted and are loadable as normal entities.

This causes for example ๐Ÿ› When sending out the registration mails, profiles aren't linked to the user yet Needs review .

There are no good workarounds for this that I was able to find in a few full days of experimenting. Modules could hook into the form, but then users that are not created through the registration form (e.g. in tests or through an API) will not have a profile because that code path isn't triggered. We also can't fall-back to hook_user_insert for non-registration form users because we have no way to know whether a user is being created through the form and will thus get a duplicate profile saved in ::saveProfiles right after the user creation is completed.

Steps to reproduce

Proposed resolution

We can move ::saveProfiles to happen after the user entity is built, but before the user is saved. The user will have a UUID generated at that point even though the entity has no ID yet. In case the entity is new, we can then save the profile in a profile_user_insert to make sure that any downstream modules that also want to use hook_user_insert will already have the profile available for loading.

In case the user already exists we can just immediately save the profile (or alternatively also implement hook_user_update but that function is called more often and thus might have a performance penalty).

Transfer of data between ::saveProfiles and profile_user_insert needs to happen through site state. The reason for this is that if we try to attach it to the user entity itself the Drupal entity API will treat it as an entity reference. Entity references cause the referenced entities to be saved first because the ID of the referenced entity is stored on the referencing entity (i.e. User --> Profile, rather than Profile --> User). This also suggests that a future version should possibly consider inverting the relationship between the entity that "has a profile" and the profile (to also allow other types of entities to have profiles, like a profile for an organization).

This also fixes ๐Ÿ› When sending out the registration mails, profiles aren't linked to the user yet Needs review without making the tokens smarter.

Remaining tasks

User interface changes

API changes

ProfileFormWidget::saveProfiles is now called before the ::save method for the account and register form.
Profiles created for users going through the user forms are now available as soon as the user entity is inserted into the database.

Data model changes

โœจ Feature request
Status

Needs review

Version

1.0

Component

Code

Created by

๐Ÿ‡ณ๐Ÿ‡ฑNetherlands Kingdutch

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Comments & Activities

  • Issue created by @Kingdutch
  • Issue was unassigned.
  • Status changed to Needs review 11 months ago
  • Open in Jenkins โ†’ Open on Drupal.org โ†’
    Core: 9.5.x + Environment: PHP 7.3 & MySQL 5.7
    last update 11 months ago
    23 pass
  • ๐Ÿ‡ณ๐Ÿ‡ฑNetherlands Kingdutch
  • ๐Ÿ‡บ๐Ÿ‡ธUnited States mglaman WI, USA

    I read this. I understand it works. I have some concerns over the approach โ€“ using state and blending between a user insert hook and the form. Itโ€™s not wrong but itโ€™s sending off some code smell triggers in my mind. I canโ€™t really articulate them at the moment. Iโ€™ll try to come back.

    Just to note:

    - We definitely cannot use \Drupal\user\UserData as it requires a user ID and cannot use UUID as a replacement.
    - The user object does not have a blob/map field to stuff data into, so we cannot do that
    - There is no way to clean up state information for accounts which bail on registration, possibly filling up the key value table with bad data

    Ping me if I fail to in a day or so.

  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom joachim

    IIRC we had the same problem back in the D7 days with Profile2 module and Party module (drupal.org/project/party).

    Rather than dumping data into state, what about allowing Profile entities to be userless for a temporary period, and then delete userless profiles older than a few hours on cron?

  • ๐Ÿ‡ณ๐Ÿ‡ฑNetherlands Kingdutch

    but itโ€™s sending off some code smell triggers in my mind.

    I agree, I think I spent a good 30 hours going back 'n forth on this before landing on this.

    I think the actual code smell is not here, but it's the fact that the relationship is currently profile -> user rather than user -> profile. So the better (but more complex and backwards incompatible way) to fix this would be:

    * Profiles are not owned objects
    * Profiles are attached to users by an entity reference field on the user entity

    That properly allows you to save the profile before the user without issues.

    * There is no way to clean up state information for accounts which bail on registration, possibly filling up the key value table with bad data

    I think the only time this happens if if there's a SQL error for the user insert. The state isn't actually created until the form has validated and is being submitted (i.e. we're going to create the user) and the state is already removed by the time the user creation finished. So there's no way a user could cause state to blow up, it would mean the MySQL server had gone away in between the state being stored and the user entity being created with the state cleaning up.

    Ideally we'd have some sort of "If the state is a day old we remove it in cron" because then there'd have been a server error, but I think that's sufficiently rare we might not even need to worry about it

    (It's also specifically why I clear the state before I create the profile)

    Rather than dumping data into state, what about allowing Profile entities to be userless for a temporary period, and then delete userless profiles older than a few hours on cron?

    We need some way to relate the profile to the user later. Making sure that code that responds to user creation can figure out which profile to use is the biggest issue.

    I tried transferring the data through the user entity but that caused different issues depending on the different ways of doing it.... this gives me an idea.

  • Open in Jenkins โ†’ Open on Drupal.org โ†’
    Core: 9.5.x + Environment: PHP 7.3 & MySQL 5.7
    last update 11 months ago
    23 pass
  • ๐Ÿ‡ณ๐Ÿ‡ฑNetherlands Kingdutch

    The problem is that we need some kind of reliable global. We don't necessarily need state, but can use drupal static too which has the benefit of disappearing at the end of the request when something goes wrong.

    I specifically didn't choose a service here because this should be (in my opinion) something internal to the profile module. I abstracted this to two internal functions to make sure we can change this if we ever need to attach it to a request for example (or move it over to a service anyway) and make it work in a long running process (I'm looking at the ReactPHP/Fibers work in Drupal core issues).

  • ๐Ÿ‡บ๐Ÿ‡ธUnited States bradjones1 Digital Nomad Life

    There are no good workarounds for this that I was able to find in a few full days of experimenting.

    I have similar considerations around profiles for users, however I don't support a web form-based signup flow (all my users are created by SSO.)

    I wonder though if you aren't falling victim to the synchronous nature of all this, and couldn't asynchronously perform whatever after-registration tasks until a time in the future (even if that's only a few milliseconds) after all the work is done.

    I see two potential paths for you, both of which I'm implementing on Meet Kinksters, depending on the use case.

    Do the work in a post-response task, e.g. a destructable service. You can store the ID(s) of the created user(s) in the service, which will subscribe to an entity insert event, and then during destruction you can do whatever you need to do (e.g., ensure the user has a profile.)
    Similarly to #1, delegate the task to a queue, either in Drupal or (preferably) a message queue like RabbitMQ or Google Cloud Tasks. Queue the job after user creation, and then have a callback handle any additional scaffolding that's needed. Do your email sending on profile insert.

    In both the above cases, you're able to asynchronously perform the follow-up tasks - that is, whatever you want to do after the user is fully registered (that is, has a profile) can only care about firing on the profile insert, and it doesn't need to be coupled to the mechanism by which the profile gets inserted (that is, on registration or as part of a follow-up task.)

  • ๐Ÿ‡ฎ๐Ÿ‡ฑIsrael jsacksick

    Do the work in a post-response task, e.g. a destructable service. You can store the ID(s) of the created user(s) in the service, which will subscribe to an entity insert event, and then during destruction you can do whatever you need to do (e.g., ensure the user has a profile.)

    Was wondering about that too instead...

  • ๐Ÿ‡ณ๐Ÿ‡ฑNetherlands Kingdutch

    I wonder though if you aren't falling victim to the synchronous nature of all this, and couldn't asynchronously perform whatever after-registration tasks until a time in the future (even if that's only a few milliseconds) after all the work is done.

    The short answer, "Trust me, no"

    The long answer. Yes probably, but it means that we need to make everything in Drupal that might possibly respond to user creation and for some reason needs a user's profile to provide a better UX needs to be made async and made aware of the fact that a profile might not exist.

    If we look at this without the profile form then Drupal has an easy flow for this. We know when a user is inserted, so when a user is saved to the database we can make sure that a profile exists in the hook_user_insert code and everything lives happily ever after.

    However as soon as the profile form is introduced, because of the way it's implemented (create the profile before the user is saved, then save that exact profile without checking if anything else happened), this flow breaks down and we're left with no good way to make sure that a profile exists regardless of how the user is registered (be that through an API or through the form).

    The solution in my opinion is not to make the rest of a code base deal with the fact that a profile might not exist. In the two years I've spent overhauling Open Social's profile functionality I've found that life is a lot simpler if we can enforce that the profile always exists for a user and if that's not the case it's a bug.

    ๐Ÿ› When sending out the registration mails, profiles aren't linked to the user yet Needs review is one such example of arbitrary code expecting the profile to exist, while trying to solve this issue I've found more and unfortunately I don't have time to make Drupal's user registration process itself async.

  • ๐Ÿ‡ณ๐Ÿ‡ฑNetherlands Kingdutch

    Just to add: the "even if that's only a few milliseconds" seems to suggest I'm trying to solve a time-based race condition here which is not the case. I'm specifically trying to solve a race condition caused by the order in which Drupal calls hooks and performs tasks, which is not something that can be solved with async without opening up a lot of Drupal core issues.

  • ๐Ÿ‡บ๐Ÿ‡ธUnited States bradjones1 Digital Nomad Life

    You know your use case better than any of us, and I think this is one of those situations where you make an architectural decision that's important for your business based on those constraints. I'm not sure, though, that there must need be a "fix" for this in Profile so much as you set up something that works for Open Social's decision to require, in effect, guaranteed concurrent execution of tasks, or execution in a very specific order.

    The maintainer brain in me says this is almost a "won't fix" type scenario for the module, since it's so specific to your specific use case, and we're uncovering a number of alternatives that might work in other similar situations, but not yours because business rules. That's not to dismiss it, but to perhaps zoom out and say that this might be complexity that 99.99999% of users won't run into, and the remaining (the edge cases like you and I) will solve in different ways.

    In my case, for instance, I create the user's dating profile on user insert, and try to reduce the number of other synchronous side-effects that occur on user insertion. The async ones, even if they only execute a few milliseconds later, can reliably expect to find a dating profile for the user. If they don't (because a sunspot errored out the save or something crazy) then I get an error in my logging and can handle on an edge case... but that's literally never happened.

    Just trying to zoom out a bit and wonder if this is something that needs "fixing" in profile to begin with.

  • ๐Ÿ‡ณ๐Ÿ‡ฑNetherlands Kingdutch

    Okay final try before I'll move this to userland and give up here.

    The business rule I want to enforce: My user_format_name should be able to assume that a profile exists and if this is not the case it's a bug in my usage of the profile module.

    Is that a business rule the profile module is willing to support? Or are we saying "that's too specific and not a guarantee the module can give"?

    If that's a business rule the profile module is willing to support then there's currently a bug in the ProfileFormWidget because it makes it impossible to implement that business rule (as explained below). If that's not a business rule we're willing to support then that's where our difference of opinion is and we can close this as won't fix.

    The business rule can not be implemented because the order of function calls (enforced by Drupal core) is as follows:

    1. ProfileFormWidget::formElement - Creates the profile
    2. ProfileFormWidget::extractFormElements - Transfers form settings to the profile
    3. ProfileFormWidget::validateProfileForm - Validates the profile
    4. RegisterForm::save
      1. User::save
        1. hook_user_insert
      2. _user_mail_notify
        1. user_mail -> [user:display-name] token -> user_format_name -> hook_user_format_name_alter - Profile can not be retrieved here from AccountInterface
    5. ProfileFormWidget::saveProfiles

    The issue is then in 4.2.1 where the profile is not yet available and it's unfortunately not possible to delay those function calls without changing Drupal core.

    I also can't work around the shortcoming in this ordering by implementing my own hook_user_insert because ProfileFormWidget::saveProfiles saves the profile from the other ProfileFormWidget functions regardless. This causes a duplicate profile to be saved instead.

  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom joachim

    > RegisterForm::save

    Would it be enough for that method to be split up between the saving bit and the notify bit -- say, RegisterForm::save() and RegisterForm::notify()

    You'd then need the (I assume?) form alteration that calls ProfileFormWidget::saveProfiles() to put that BETWEEN the two.

    That would be easier to do if the arrays of form validate and submit handlers were keyed rather than indexed -- IIRC there is an issue to change that?

  • ๐Ÿ‡ณ๐Ÿ‡ฑNetherlands Kingdutch

    Would it be enough for that method to be split up between the saving bit and the notify bit -- say, RegisterForm::save() and RegisterForm::notify()

    You'd then need the (I assume?) form alteration that calls ProfileFormWidget::saveProfiles() to put that BETWEEN the two.

    If you can split up the call $account->save and _user_mail_notify you'd fix the specific example that's part of RegisterForm. That would require changing Drupal core though. It would also not yet change any other contrib code trying to do anything on hook_user_insert which is still an important hook to make sure works because it's the only way to reliably do something on user creation regardless of how it happened (through an API or UI).

    The issue you're thinking of is ๐Ÿ› Loss of stored data when multiple profiles are displayed as a form on account form Needs review I believe.

  • ๐Ÿ‡ง๐Ÿ‡ทBrazil viniciusrp

    I recreated the patch for profile 1.10 version.

  • ๐Ÿ‡ง๐Ÿ‡ทBrazil viniciusrp

    I recreated the patch for profile 1.10 version, the from #15 is missing some file injection.

  • ๐Ÿ‡ง๐Ÿ‡ทBrazil viniciusrp

    I recreated the patch for profile 1.10 version, the from #15 is missing some file injection.

Production build 0.69.0 2024