Datahandler plugins, (optionally) mappable from login providers

Created on 19 May 2023, about 1 year ago
Updated 9 August 2023, 11 months ago

TL/DR new functionality:

  • Login providers which want to pass additional data during login, can have a configurable (by the site admin) mapping screen for mapping those data to user fields. Little to no coding required on the provider side. (Plus accompanying security hints/best practices.)
  • A plugin can provide functionality with per-provider configurable settings - e.g. the default plugin can provide configurable flood control (see below)
  • Completely optional: if there's no configuration, the system works as before (no data mapping, no user fields, no need to look into the "plugin type" that this issue adds).

Preamble: my view of this module's intended function

This is a toolkit to aid developers in logging in Drupal users through external mechanisms, in a way that is

  • safe and simple by default;
  • more comprehensive where the caller wants to use extra functionality - still with emphasis on safety.

This needs:

1) A README that tells the user how (and why) to use this module for the default 'safe and simple' method, and provides guidance about extending this where wanted -- including any known security challenges.

2) Code. The following are common ways of interacting with this module by a custom/contrib module: (This can be a draft text for the README)

A) Do their own login in whatever way, and update the authmap themselves. Basically: just use the Authmap service for storage. Responsibility for security is fully left to the developer.

B) Log the user in with a simple call to

$account = EA->loginRegister($authname, PROVIDER, ['mail' => $email]);
// [ ... own processing ]

This work regardless whether the Drupal account already exists; it is automatically created/saved if not. However, it only is this simple if there is no extra data to save with the user / in a session / .... That extra processing is left to the caller; it needs to be done after the above call, or in an event handler / hook.

C) A structure that looks like this:

if ($account = EA->load($authname, PROVIDER)) {
  // [ ... own processing ]
  EA->userLoginFinalize($account, $authname, PROVIDER);
}
else {
  $account = EA->register($authname, PROVIDER, ['mail' => $email]);
  // [ ... own processing ]
}

D) currently not feasible; this issue aims to make it the recommended method:


$account = EA->loginRegister($authname, PROVIDER, array DATA)  | USER CODE
 \                                                             ---
  check for existence of user                                  | MODULE CODE
  optionally create and save user                              |
  log user in                                                  |
   \                                                           ---
    plugin-or-event-handlers($authname, PROVIDER, array DATA)  | USER CODE

This is the recommended method if the Drupal account/user entity needs to be changed by 'user code' (i.e. custom code or a contrib module).
So there's just a single parent call, but the 'own processing' (per A-C) is moved elsewhere, into a location that is called from within the ExternalAuth service. At first sight this seems more complicated than method B or C, however:

  1. this is the only way to make the ExternalAuth service responsible for creating/saving new users and also guarantee that no partial/wrong user entity is ever saved. (Methods B and C would save the user twice upon registration, which can leave partially saved users in the system.)
  2. if custom DATA needs to be synchronized into the user object on each login, this 'inner user code' is a single place where this can be done regardless whether the Drupal account is new or already exists. The outer caller does not need to worry about this.

Problem/Motivation

Motivation: I would like to see this 'toolkit' be more complete/comprehensive -- using experience gained from, and code developed in, the samlauth module. I would like to add default implementations for 'non-basic tasks' into this module, so it becomes a real toolkit:

  • synchronisation of custom properties into user fields
  • linking existing users - only under safe conditions
  • role assignment
  • flood control (i.e. too many failed logins optionally result in "Access is blocked because of IP based flood prevention.". Core has this for its own login form, but does not provide an UI for the configuration.)

...and I'd like to fix things that I perceive as 'bugs' in the process. (Tedious description of details in ✨ Redo authmap_alter / register events. Needs review ; reading is completely optional.)

Ground rule: the 'basic implementation' of login using this module should always remain simple, and safe by default.

Challenges (IMHO) in the current module:

  1. Above D.1 (guaranteeing that no partial/wrong user entities get saved) is challenging at the moment / requires writing ugly 'spaghetti'. (This needs the current events to be redone in a new major version.)
  2. loginRegister() and register() calls have two separate array parameters for DATA; this is confusing, and handling those is hard/partly impossible for the current event handlers.
  3. login() (and the login part of loginRegister() gets no DATA passed in at all, requiring user code to create a separate implementation for any user-data synchronization during login (though this is often the same code as the required implementation of user-data population during registration)
  4. The Authmap service also stores the opaque DATA blob - but it's clunky. (load() and save() are not exactly each other's complement - and likely noone has reported this as a bug because nearly noone is using this storage.)
  5. The README could use work, to explain reasons for the code structure + guide people to do the right/safe thing. Include a section that expands on challenges with the above extra code (drawing on the experience of fixing several security reports against the samlauth module).

AFAICT, points 2 and 3 are this way because the DATA array(s) passed into ExternalAuth::register() service is almost completely unused/opaque to the externalauth module. These can be completely different depending on the provider. Which makes it hard to decide how to treat this DATA.

In other words: overall challenge: it's impossible to know what DATA looks like, but we do want to handle it 'inside' the loginRegister() call -- because D.1.

Proposed resolution

Summary

  • To overcome this "overall challenge", move the handling of DATA into plugins, configurable per provider (solving 2 and 3).
  • Since that's a big overhaul, solve the other points at the same time.
  • This makes it possible to move the field synchronization code from samlauth module into a second plugin (that isn't used by default but can be selected/configured by a site admin).

Details

Change loginRegister(), register() and login() calls to all have a single DATA array argument.

Introduce a system of plugins for handling the DATA array; let's say their interface has preUserSave(User $user, array DATA) and postUserSave(User $user, array DATA); specifics TBD. The plugins may implement custom requirements for the DATA structure.

Introduce a configuration screen in the module:

  • to map login providers to specific plugins
  • each plugin (per provider) can be configured. (Configuration screen/options differ per plugin.)

The ExternalAuth service will load the correct plugin and call its methods, before/after saving the user / logging in. Configuration will be:

  • optional: if no configuration exists, the default/simple plugin is called with default config.
  • overridable: the caller of loginRegister() must be able to ultimately decide which plugin/config gets used. (In this case, the configuration screen will simply have no effect. The way to override this may be a bit clunky / is subject to me asking for approval of the whole system, after I write POC code+README for this.)

The default/simple plugin:

  • will do what the current register() does (i.e. make sure the email address + possibly other standard properties get filled from DATA into the user object) -- exact data/stringency/possible split of some data into another plugin TBD.
  • will check the email address validity? (Is this currently done? Should this be configurable? TBD.)
  • will on subsequent logins, if the email address is present but has changed, throw an exception. (Note this is a new situation because login() currently does not get the email address.) There can be a configuration option for enabling to change the email (which then gets saved in the user).
  • can have a configuration option to disallow creating new users by loginRegister() (so this gets configurable by the site administrator).
  • will save any extra DATA (except the email address), if present, into the authmap.data column. (Possibly: optional config to re-save this data on subsequent logins.)

There will be a second plugin that

  • selectively moves values from DATA into user fields (and does not use authmap.data)
  • has a configuration screen for mapping data values to user fields
  • has configuration for linking existing users (and references to the README explaining when this is unsafe to enable)

This plugin is provided as 'reference, use at own risk' - but I envision that a lot of modules that want to pass extra $DATA into the login process, can actually use it as-is, and will get configurable (by a site admin) data synchronization for free, without needing to write any code.

Reference implementation for 'synchronizing roles' can also be provided (moved from existing samlauth code), but I don't know the exact mechanism yet. (Same plugin, optional submodule...)

Challenge: Since each plugin is master over how they validate/use their DATA parameter, not all plugins will be suitable for all providers. This is why providers must be able to force using specific plugins, and the configuration screen must be peppered with warnings when changing the provider-plugin mapping. This is icky - but I think the ickyness is outweighed by the benefit of having a fully site-admin-configurable synchronization from many providers' data into user fields. The configuration screen will be for administrators with elevated privileges only, and can be placed into an optional UI module.

Remaining tasks

  • Write default/simple plugin + interface, and change code in Externalauth service to match.
  • Make required changes to Authmap (I foresee a separate saveData method to save data separately from the authname. The fact that there would need to be two save calls is very low on my worry-about-optimization list.)
  • Solicit feedback from stakeholders (at least 3 provider authors using the issue queue) to see if they have feedback, based on above PoC for the new version
  • Create config screens for admins (list of provider-plugin mappings; individual config screens)
  • Strip SAML-structure specific code from user-field synchronization/linking service and its config forms; port to second plugin.
    • Rewrite samlauth sync/forms to depend on this plugin. Test.
  • Decide how to include the role-mapping code; do it.
  • Final decision on the future of some methods. (I have a feeling I want to drop ExternalAuth::linkExistingAccount() and force using Authmap for this instead)
  • Write unit tests for both plugins, functional tests for the full login + user-save process using them.
  • In README, include:
    • info about authmap (why it's important to have; why the admin screens don't have an "edit" facility)
    • code structure (see above), why you'd likely not directly interact with Authmap, example of using a non-default plugin and forcing a specific plugin
    • "Considerations regarding your Drupal users" from samlauth README; info about why/when linking existing users is dangerous; suggest prepopulating users instead
    • Changes in major versions (code to port for existing non-'simple' providers)
  • UX/UI feedback on config screen (by someone else than me)

User interface changes

Adds admin interface

API changes

Lots... but nothing changes by default / no code changes are required for callers, unless they currently pass a data array to (login)Register() that contains more than just a mail address / standard user properties. (And almost noone does this, because it's confusing.)

Data model changes

New mapping from provider name to plugin name. This is likely going to be a (mapping-array) configuration value.

🌱 Plan
Status

Active

Version

2.0

Component

Code

Created by

🇳🇱Netherlands roderik Amsterdam,NL / Budapest,HU

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Comments & Activities

Production build 0.69.0 2024