Authored-by should use email address from commits

Created on 9 August 2025, 6 days ago

Problem/Motivation

Review https://new.drupal.org/contribution-record/11413644 and the associated MR https://git.drupalcode.org/project/drupal/-/merge_requests/12403

Observe that the contribution record suggests:


            
              
              
📌 Remove FileSystemInterface::basename() and use PHP native basename() (Active)

feat: Remove FileSystemInterface::basename() and use PHP native basename()

Authored-by: 54534-cmlara@users.noreply.drupalcode.org
Authored-by: 22609-kimpepper@users.noreply.drupalcode.org

Observe that the author address is:
From: Conrad Lara <cmlara@cmlara.com>

The commit author address is a conscious choice by developers that should be respected by default.

Steps to reproduce

1. Create an issue with an MR.
2. Submit commits to the MR with an Author email address that is not the no-reply address.
3. Credit the user who wrote the commits.
4. Observe that the suggested commit message uses the no-reply address.

Proposed resolution

Use the email address as included in the commit.

Remaining tasks

User interface changes

Commit messages will now suggest the address as submitted by the commit author.

API changes

None expected.

Data model changes

None expected.

🐛 Bug report
Status

Active

Version

1.0

Component

User interface

Created by

🇺🇸 United States cmlara


Comments & Activities

  • Issue created by @cmlara
  • 🇺🇸 United States dww

    +1. For folks who push commits to the MR, yes.

    I also opened an issue so that even if I’m credited with reviewing something, I still get identified with my preferred email.

    Thanks for opening this,
    -Derek

  • 🇧🇪 Belgium BramDriesen Belgium 🇧🇪

    +1 But this is key!

    The commit author address is a conscious choice by developers that should be respected by default.

    As described on your git instructions page (https://www.drupal.org/user/UID/git), one has the choice to use an anonymized address.

  • First commit to issue fork.
  • 🇪🇸 Spain fjgarlin

    A few issues/questions:
    - Note that MRs could potentially have commits from the same person coming from different emails, e.g. this MR. Whatever we do in this case (take the first, take the last, put both) will be correct in some cases and wrong in others.
    - If a user decides to remove their account and their personal email was used on a commit (by that user's choice or through an unintentional local git setup), we won't have a way to remove that PII from the commit history.
    - If an issue had multiple MRs (e.g. this one: ✨ New contribution records system, Active), where some were merged, some were closed, and some might be ignored, what's the source of truth? This would only add to the previously raised points, but it's another source of potential conflicts/issues.

    If we were to do it, I don't think we need a setting/field on d.o or even in GitLab, as we already have the information in the patch files for the MRs.
    But the above questions are relevant, especially as we haven't heavily used conventional commits yet.

  • 🇪🇸 Spain fjgarlin

    Related issue.

  • 🇺🇸 United States cmlara

    Note that MRs could potentially have commits from the same person coming from different emails

    I had originally thought take last (allowing users to push an update to correct bad data); however, take all is likely a better choice.

    Committers can clean up the data at commit time if need be. This also reflects the fact that authors may be working for different organizations on the same issue (copyright of work for hire is owned by the hiring company), which aligns with the D.O. credit policy of crediting everyone involved.

    we won't have a way to remove that PII from the commit history.

    This is covered by https://www.drupal.org/docs/develop/git/setting-up-git-for-drupal/drupal...

    The no-reply addresses themselves (since they identify a specific user by username and user ID) are also likely PII. There appears to be no real new concern here.

  • 🇺🇸 United States drumm NY, US

    I think I generally like this idea. This has a good chance of being doable without additional API calls and increased page load time, since we’re already loading commits to help maintainers make crediting decisions. If this does take more API calls, then we should look at whether we need to consider other options.

    And this does let the contributor choose what they want per-issue, as long as they’re making a code contribution.

    Clarifying the proposed resolution to

    Use email address as included in the most recent commit, from any MR if there are multiple

    The most recent commit has the best chance of being what the contributor wants, and can be amended with an easier force push. Most issues won’t have multiple MRs. For ones that do, I don’t think it practically matters too much which one wins; readable, efficient JS can be the priority over extra logic around prioritizing/choosing multiple MRs.
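
As a rough illustration of the "most recent commit wins" rule, here is a minimal Python sketch (the real contribution UI is JS, so this only shows the logic). The `Commit` class and `latest_email_by_author` function are hypothetical names, the sample data is made up, and authors are keyed by display name on the assumption that nothing better is available from the commit data.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Commit:
    """Hypothetical, minimal view of a commit; not a GitLab API type."""
    author_name: str
    author_email: str
    authored_at: datetime


def latest_email_by_author(commits):
    """Return {author_name: email}, using each author's newest commit."""
    latest = {}
    for commit in commits:
        seen = latest.get(commit.author_name)
        # Keep this commit if it is the first, or newer than the one stored.
        if seen is None or commit.authored_at > seen.authored_at:
            latest[commit.author_name] = commit
    return {name: c.author_email for name, c in latest.items()}


# Commits pooled from every MR on the issue; all values are made up.
commits = [
    Commit("Jane Doe", "jane@old.example", datetime(2025, 8, 1)),
    Commit("Jane Doe", "jane@new.example", datetime(2025, 8, 5)),
    Commit("Sam Roe", "sam@example.org", datetime(2025, 8, 3)),
]
emails = latest_email_by_author(commits)
```

Pooling commits from all MRs before picking the newest keeps the logic free of any per-MR prioritization, which matches the preference for simple code stated above.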

  • 🇪🇸 Spain fjgarlin

    Great. I can work with that.

    1. We will load by default the anonymous GitLab address (as some of the contributors listed might not have participated in the code)
    2. We will make a map of user emails from the patch of the MRs: https://git.drupalcode.org/project/drupalorg/-/merge_requests/378/diffs....
    3. We will replace the default emails with the newly found ones.
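
The three steps above could be sketched roughly as follows, in Python for brevity. The function names are hypothetical, the no-reply format is inferred from the addresses quoted in the issue summary, and the sketch assumes the commit emails have already been matched to usernames, which may not be trivial in practice.

```python
# Assumption: no-reply addresses look like "<user id>-<username>@<domain>",
# as in the Authored-by lines quoted in the issue summary.
NOREPLY_DOMAIN = "users.noreply.drupalcode.org"


def noreply_address(user_id, username):
    """Build the anonymous address used as the default."""
    return f"{user_id}-{username}@{NOREPLY_DOMAIN}"


def suggested_emails(credited, commit_emails):
    """credited: list of (user_id, username); commit_emails: {username: email}."""
    # Step 1: everyone credited starts with the no-reply address.
    emails = {
        username: noreply_address(user_id, username)
        for user_id, username in credited
    }
    # Steps 2 and 3: override with emails found in the MR commits.
    for username, email in commit_emails.items():
        if username in emails:
            emails[username] = email
    return emails


credited = [(54534, "cmlara"), (22609, "kimpepper")]
result = suggested_emails(credited, {"cmlara": "cmlara@cmlara.com"})
```

Contributors who were credited without pushing commits simply keep the no-reply default.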

  • 🇪🇸 Spain fjgarlin

    Investigation/progress so far:
    - https://git.drupalcode.org/project/drupalorg/-/merge_requests/147.patch returns name and email, not username nor user ID.
    - https://docs.gitlab.com/api/merge_requests/#get-single-merge-request-com... returns name and email, not username nor user ID
    - https://docs.gitlab.com/api/merge_requests/#get-single-merge-request-par... returns user id, username, and name, but no email

    The only way to be sure of getting the correct user from an email is to do a user search in GitLab (https://docs.gitlab.com/api/users/#list-users). This will only work for the public_email, not secondary emails, and it will require a call per user (if using REST, which is what we are using so far).

    The glue is "username", which is not present in the list of commits from the above endpoints/URLs.

    Also, you can technically set any email via git config --global user.email "EMAIL". That's the one that will be linked to the commit, regardless of whether you have that email in your emails setting: https://git.drupalcode.org/-/profile/emails
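
To illustrate what the `.patch` output offers, here is a hedged Python sketch that pulls the name and email out of each commit. It assumes the patch is in `git format-patch` mbox form, where every commit starts with a `From <sha> Mon Sep 17 00:00:00 2001` line followed by a `From: Name <email>` header. `authors_from_patch` is a hypothetical helper, and the sample hashes, dates, and second author are made up.

```python
import re
from email import message_from_string
from email.utils import parseaddr

# Separator line that begins each commit in format-patch output.
SEPARATOR = re.compile(r"^From [0-9a-f]{40} .*$", re.MULTILINE)


def authors_from_patch(patch_text):
    """Return a list of (name, email) tuples, one per commit in the patch."""
    authors = []
    for chunk in SEPARATOR.split(patch_text):
        if not chunk.strip():
            continue
        # Each chunk is a mail-style message: headers, blank line, body.
        message = message_from_string(chunk.lstrip("\n"))
        name, email = parseaddr(message.get("From", ""))
        if email:
            authors.append((name, email))
    return authors


sample = """From 1111111111111111111111111111111111111111 Mon Sep 17 00:00:00 2001
From: Conrad Lara <cmlara@cmlara.com>
Date: Sat, 9 Aug 2025 10:00:00 +0000
Subject: [PATCH 1/2] Example change

---
 file.txt | 1 +
From 2222222222222222222222222222222222222222 Mon Sep 17 00:00:00 2001
From: Jane Doe <jane@example.org>
Date: Sat, 9 Aug 2025 11:00:00 +0000
Subject: [PATCH 2/2] Another change

---
 file.txt | 1 +
"""
authors = authors_from_patch(sample)
```

Note that the result contains only names and emails, which is exactly the gap described above: nothing here ties an entry back to a username or user ID.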

  • 🇺🇸 United States dww

    If getting everything via API is a pain, can we iterate through all the commits, add whatever is in Author: for each one to a list, and add all unique values as Authored-by: footers in the default commit message? Not as ideal, but it seems like it’d cover the 80% case really well with no data beyond the commits in the issue forks.
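
A minimal Python sketch of that fallback, assuming the Author values have already been read from the commits as plain strings. `authored_by_trailers` is a hypothetical name and the second author is made up.

```python
def authored_by_trailers(author_values):
    """Turn raw Author values into unique Authored-by footers, in commit order."""
    # dict.fromkeys drops duplicates while preserving first-seen order.
    unique = list(dict.fromkeys(value.strip() for value in author_values))
    return [f"Authored-by: {value}" for value in unique]


trailers = authored_by_trailers([
    "Conrad Lara <cmlara@cmlara.com>",
    "Jane Doe <jane@example.org>",
    "Conrad Lara <cmlara@cmlara.com>",
])
```

Because the comparison is on the whole string, the same person committing under two addresses yields two footers, which is the "take all" behaviour discussed earlier in the thread.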

  • 🇺🇸 United States cmlara

    Also, you can technically set any email via git config --global user.email "EMAIL". That's the one that will be linked to the commit, regardless of whether you have that email in your emails setting: https://git.drupalcode.org/-/profile/emails

    That starts to touch on the often-avoided issue of properly acknowledging copyright holders. This may be better as a separate issue; however, it is a point of compliance we eventually need to address.

    We cannot assume that every author is a D.O. user when formatting the Authored-by lines. These should generally include everyone who has a legal claim in the code being committed, which can be oversimplified as anyone who authored a change requiring original thought.

    This was less of an issue when the commit template did not assert that the included names were authors. It is a burden that was already required, and it became more deeply accepted when 🌱 [policy] Decide on format of commit message (Active) was adopted, especially since the original discussions assumed GitLab would populate that data for us as part of its template system, before the need to adhere to Drupalisms pulled this back into the contribution UI.

    This starts to go back to my point in #7 regarding authors who work for multiple organizations on the same issue and that commit email could be indicative of corporate ownership (work for hire copyright law).
