Created on 29 April 2021, over 3 years ago
Updated 9 January 2024, 11 months ago

Update

The below discussion has helped me sharpen my thoughts.

The module will in itself support any nameID format... we can't really prevent people configuring strange formats, anyway.
But it will have

  • a README section about the importance of immutability. (It already mentions this, but the NameID must be clearly woven into this.)
  • warnings in the UI if a 'mutable' (transient) NameID seems to be used / clear hints not to use one.

----

Original text

There's one glaring thing that hasn't been addressed yet in this module: handling of NameIDs. (We've had a configuration option for 'NameID format' since the first version of this module, but it took me a long time to figure out what it really means and... that we don't actually use it.)

The main purpose of this issue is to invite others to comment if they think we should accommodate the use of NameIDs as 'unique ids'. And to lay out where I should be convinced/informed, in order to do so.

But first, zooming out: what's important here:

A SAML login must be able to unambiguously identify which Drupal account to log in as. So we must have a property which is present in the SAML login response, that is both unique and immutable - and that is stored with the account in Drupal.

If that property is not immutable, that can spell trouble:

  • When it changes on the IdP side, the next SAML login won't log into the same account anymore. Instead, depending on configuration, what can happen is
    • The login gets linked to a different Drupal account.
    • A new Drupal account gets created.
    • Login is denied.
  • When it changes in Drupal: same thing (except for 'getting linked to a different Drupal account', that Drupal account would also need to have its unique property changed accordingly.)

All this implies: if you choose the wrong property, this can lead to potential security issues.

What gets used as the 'unique user id' property:

This module has always required administrators to pass such a property in a SAML attribute in the message - alongside the user name and email (and possibly other attributes).

SAML however has an explicit property to identify the subject of a SAML assertion (e.g. in our case, the user who is logging in). Wouldn't that be intrinsically more suitable to use for this purpose?

...Maybe.

It is a somewhat popular concept, because the saml_sp module does it. Also, we've had a proposed samlauth 2 branch in 2016 that supported it (https://git.drupalcode.org/project/samlauth/-/tree/droath).

But the NameID isn't always intrinsically suitable for this; it is not always immutable.

When/how can we use the NameID:

Besides the NameID, a SAML assertion specifies its format. See e.g. StackOverflow for a bit of explanation.

The format urn:oasis:names:tc:SAML:2.0:nameid-format:persistent would be suitable, and would be easy to implement using the attached patch. I'd implement it if someone gives a good reason for it - the patch IMHO needs to be extended with a re-work of placement/explanation of the NameID Format in the configuration screen.

Some searching around suggests (though I'm not sure) that it isn't used very much.

Note that getting back a persistent NameID is not as easy as just specifying this NameID format in our configuration. That will send the format in the authentication/logout request as <samlp:NameIDPolicy Format=...>, but it does not necessarily result in actually having a NameID format of the same format being returned in the response. (I tried specifying 'permanent' to the SimpleSAMLphp test IdP I use, and it just returns a 'transient' NameID without indicating any error.)

In the case of the 'persistent' format, this figures: the authentication source must be equipped for it. It must have a way to store that persistent identifier somewhere along with the data of the authenticated user.

Then there's the email format: urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress. Searches suggest that this format is used often. Also, the saml_sp module uses it.

And if it's used often, that's a reason to support it. My fundamental question is this, though: What happens if an IdP user's email address changes?

I see two possibilities (and I'm going to assume that there are IdPs that implement either, or maybe even both):

A) The NameID doesn't actually change along with the user's email.

This means it's just a special form of 'persistent NameID' and we can implement it as such. No problem.

I assume this also means the changing email is (also) sent along in a SAML attribute.

B) The NameID changes along with the email address.

This means two things:

1. The NameID can also be used to populate the user's email field. (This means a bigger code change on our end.)

Also, if this is the case, I imagine such an IdP does not send the email as a separate attribute.

2. If we allow this NameID to be used as our unique but not immutable ID for the user, the "potential security issues" mentioned above apply.

This is why I'm hesitant to implement support for the 'email' NameID. I'm not against supporting it if there are IdPs that only work this way, especially because there are many IdPs where non-administrators can never change any email address. However, this support will have to be accompanied by at least a warning and clarifying changes in the README, so site administrators don't inadvertently set up an insecure situation.

This is analogous to our support of linking logins to existing Drupal users by email address, which has the same potential security implications - and has had a warning in the UI added in 8.x-3.1.

Other formats we can support if someone details them. The consideration is the same - summarizing:

  • Are its values unique per user and guaranteed not to change? Can do. See "persistent".
  • Is that not the case? That needs arguments / documentation then.

Conclusion: this module can implement support for NameIds if there is demand for it. But the demand will need to be clarified, and the implementation will need to be accompanied by documentation. Especially the inclusion of support for 'email NameID' will need to clarify which things an administrator must be aware of, if they want to prevent setting up a potentially insecure situation.

🌱 Plan
Status

Fixed

Version

3.0

Component

Miscellaneous

Created by

🇳🇱Netherlands roderik Amsterdam,NL / Budapest,HU

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • 🇦🇺Australia mingsong 🇦🇺

    FYI, SAML 2.0 name identifier formats (From IBM Documentation)
    https://www.ibm.com/docs/en/sva/9.0.1?topic=federations-saml-20-name-ide...

  • 🇦🇺Australia mingsong 🇦🇺

    As a contribute module, the more flexibility provided, the more popular this module can be. So an universal solution should can handle all formats, given that I think we should treat the NameID from IdP as an ID of a SSO session rather than an entity ID of a user.
    Even If the format is NAMEID_PERSISTENT, in this case, the user entity ID + service ID is used as the session ID of SSO. Regardless what format is used, the NameID can always be treated as the ID of a SSO session.

    In the Drupal side, once we have the connection between the NameID from an IdP and the user ID in Drupal, we can figure out which user a NameID represent as.

    Therefore, I think we can consider a mapping data table for the NameID and uid in Drupal.

    For example

  • 🇳🇱Netherlands roderik Amsterdam,NL / Budapest,HU

    @chx / #10:

    Thank you for the patch. I had some draft work but was overcomplicating things; indeed just adding the NameID into the return value of getAttributes() is the right thing to do, and I'll add this.

    Caveats:

    • I'll replace 'NameID' with '!nameid' because I consider it more likely that someone ever creates a regular attribute named NameID than !nameid. So anyone using this patch will need to change their *_attribute config values.
    • The reason I didn't jump on this immediately is that it's adding another footgun to the module and I wanted to take the time to add good language to the README / config screen. I think I have time and ideas to do this now.
    • Something in me finds extending getAttributes() icky, but that doesn't matter because it's already icky anyway. (The same part of me doesn't like to have both regular and 'friendly' attribute names duplicated in the same return value.) This should eventually be fixed by 📌 Refactor SamlService::getAttributes() to a value object. Active .

    @Mingsong:

    I don't think I agree with that solution and I think that maybe there's a misunderstanding here... but I'm unable to pinpoint the exact misunderstanding. So I don't know exactly how to approach my reply...

    I think we should treat the NameID from IdP as an ID of a SSO session rather than an entity ID of a user.

    What is the conceptual difference between treating something as an SSO session vs. a user? if we are not using the SSO session / NameID / ... to look up a Drupal user and log them in, then... what use cases are there for having that session?

    In the Drupal side, once we have the connection between the NameID from an IdP and the user ID in Drupal, we can figure out which user a NameID represents as.

    I must be missing something here. I read "once we have the user, we can figure out the user". Well uhm...

    Therefore, I think we can consider a mapping data table for the NameID and uid in Drupal.

    We already have a mapping data table for a 'unique id from the IdP' and the uid in Drupal. (The authmap table.)

    What we see as the 'unique id from the IdP', is configurable. At the moment it must always be an attribute. This issue is about extending the possible configurations to also include the NameID.

    So conceptually this issue is pretty simple. It is, however, complicated by the fact that we can't use transient NameIDs (AFAICT), just like I was arguing against using 'mutable' attributes. This is a question of proper clear documentation.

    You seem to be saying either or both of the following two things:

    • We have a use case for an "SSO session" that is not necessarily tied to a user ID? (See above.) In that case, we need better specifications of that use case - and that likely needs to evolve into a new separate issue.
    • We can use any format of NameID to derive the user. I do not think so. I do not see how we could derive a Drupal user from a transient NameID. (Unless you mean through the feature called 'user linking', but 1) I am leaning toward discouraging that, lately; 2) if we can derive the user already, then I don't really see why/when we need to store the transient NameID. However... I don't think you actually mean this.)
  • 🇦🇺Australia mingsong 🇦🇺

    @Roderik, let me explain my thought by examples.

    What is the conceptual difference between treating something as an SSO session vs. a user? if we are not using the SSO session / NameID / ... to look up a Drupal user and log them in, then... what use cases are there for having that session?

    In a user case, the IdP won't expose any meaningful information of a user identity, such as employee ID. So that the NameID returned from an IdP is a temporary random string to distinguish between SSO sessions. The NameID format is "NAMEID_TRANSIENT". In the SP (Drupal) side, for log/audit purpose, need some information that can tell the connection between NameID (temporary) and Drupal user ID. The currently workaround is that the externalauth module will log the event of a user login via this module. Something like this
    Location: http://yourdomain/saml/acs
    Message: External login of user user1@example.com
    User: user1@example.com
    If the 'Log incoming SAML messages' setting is enabled, then there should be a samlauth log that logged the NameID received from the IdP. The way to connect those information is the timestamp of those logs. In a high-traffic, the timestamp could be confusing and not reliable.

    We already have a mapping data table for a 'unique id from the IdP' and the uid in Drupal. (The authmap table.)

    Again, this data table comes from externalauth module and there is no NameID stored in this table. Why we need NameID? This is back to the original question of this discussion.

  • 🇳🇱Netherlands roderik Amsterdam,NL / Budapest,HU

    In the SP (Drupal) side, for log/audit purpose, need some information that can tell the connection between NameID (temporary) and Drupal user ID.

    Yes, exactly. (Not for log/audit purpose, but we simply need some information to determine which user to log in.) And in case of a transient NameID... that needed information is not there. So we can't tell the connection.

    The currently workaround is that the externalauth module will log the event of a user login via this module.

    Uhm, what?

    I have to guess at your assumptions... but you seem to be implying that the externalauth module is (magically?) determining which user to log in, based on the information in the SAML response?

    That's not true. samlauth determines which user to log in. It then calls externalauth code; externalauth is basically used as a generic utility to store the connection between "unique ID" and user.

    Again, this data table comes from externalauth module and there is no NameID stored in this table

    The table comes from externalauth module. The data is stored in the table by samlauth module.

    After this issue is fixed, NameIDs will be stored in this table (if the samlauth configuration specifies this; see e.g. chx' patch).

  • 🇦🇺Australia mingsong 🇦🇺

    I have to guess at your assumptions... but you seem to be implying that the externalauth module is (magically?) determining which user to log in, based on the information in the SAML response?

    That is not an assumption. If you ever checked the Drupal logs, you will see the log I mentioned. I didn't mean which module is in charge of the login process at what stage, I just said there will be Drupal log to record that event.

    By the way, here is the line that the External Auth module signs in a user.

    https://git.drupalcode.org/project/externalauth/-/blob/2.0.3/src/Externa...

    Yes, External Authentication module is an API module, which means another module that calls the API is in charge of the business logic.

    In this case, samlauth module calls that API function at following lines
    https://git.drupalcode.org/project/samlauth/-/blob/8.x-3.9/src/SamlServi...
    https://git.drupalcode.org/project/samlauth/-/blob/8.x-3.9/src/SamlServi...

    As you can see from the codes above, there is no parameter to tell External Authentication module what NameID is that the Drupal user mapping to.

    The table comes from externalauth module. The data stored in the table is provided by samlauth module.

    So I don't know how can you provide the NameID to externalauth module.

  • 🇳🇱Netherlands roderik Amsterdam,NL / Budapest,HU

    This conversation has gotten surreal.

    So I don't know how can you provide the NameID to externalauth module.

    You provide the NameID in the $unique_id parameter.

  • 🇦🇺Australia mingsong 🇦🇺

    You provide the NameID in the $unique_id parameter.

    The $unique_id is assigned by following line
    https://git.drupalcode.org/project/samlauth/-/blob/8.x-3.9/src/SamlServi...

    which is specified by the configuration of 'unique_id_attribute'.

    More specific, by the configuration form.

    https://git.drupalcode.org/project/samlauth/-/blob/8.x-3.9/src/Form/Saml...

    A SAML attribute whose value is unique per user and does not change over time. Its value is stored by Drupal and linked to the Drupal user that is logged in.

    It could be the NameID, If the it won't be changed ever.

  • 🇩🇰Denmark nicklasmf

    Sorry for not having read the thread.

    I am in a situation where I need to implement SAML for the Danish Government (NemLog-in).
    The user can login with different accounts (private, employee etc.), and they return with different attributes according to their account, thus no attribute is common except the NameID (that I know of).

    I used your patch (2022-01-06/samlauth_nameid.patch) but I end up in a situation where this is longer than the allowed length of the name field.
    SQLSTATE[22001]: String data, right truncated: 1406 Data too long for column 'name' at row 1: INSERT INTO "users_field_data" with NameID https://data.gov.dk/model/core/eid/professional/uuid/d9a3db11-a100-10d6-9c3e-ac3da2000dba.

    I am thinking about shortening it but just want to let you know and whether we can do something further. Maybe add an event that allows to alter the field further before using it?

  • 🇳🇱Netherlands roderik Amsterdam,NL / Budapest,HU

    1)

    The SQL error you are seeing comes from the 'name' field. I see that's 60 characters.

    So I am assuming you are setting "NameID" in the "User name attribute" field.

    There is an event to shorten the user name if you want - or do anything else, like construct intricate logic to build the user name out of various attributes; the USER_SYNC event. (I'm linking to HEAD where the comments were changed a bit recently - but the event code hasn't changed.)

    (You maybe want to only change the user name when the account is new. Note whatever you do here is not restricted by the module's Synchronize user name on every login" option. If you shorten the new user's name and it happens to clash with another already existing user, then you should get a horrible exception without anything being saved yet -- which I guess is as good as anything.)

    2)

    I am also assuming you are setting "NameID" in the "Unique ID attribute" field. This is stored in the authmap.authname field, which is 128 characters.

    That can be altered on the ExternalAuthEvents::AUTHMAP_ALTER event -- only when newly registering a user. Please have a look at the ExternalAuthAuthmapAlterEvent class and/or its caller ExternalAuth::register(). This is outside the samlauth module (though it's called by samlauth).

    ---

    And since you haven't read the thread, I'll repeat something:

    I'll replace 'NameID' with '!nameid' because I consider it more likely that someone ever creates a regular attribute named NameID than !nameid. So anyone using this patch will need to change their *_attribute config values.

    (Unfortunately several issues came in between me finishing this. It's still on the top of my volunteer-coding list.)

    • roderik committed 23d45bbf on 8.x-3.x
      Issue #3211380 by chx, roderik, gpotter: Support for NameID as Unique ID...
  • Status changed to Fixed 11 months ago
  • 🇳🇱Netherlands roderik Amsterdam,NL / Budapest,HU

    Well that took longer than I hoped. Now committed on top of a revised/split edit screen.

    (A 3.10 release will hopefully follow soonish but there are now some changes since 3.9 that I need to take a good look at before really doing a release.)

    Wanted to credit jproctor in the commit message, used wrong name, will try to set credit now, while setting the issue fixed.

  • Automatically closed - issue fixed for 2 weeks with no activity.

Production build 0.71.5 2024