Simplify agreed commit format

Created on 26 August 2025, 21 days ago

Problem/Motivation

On 🌱 [policy] Decide on format of commit message (Active), a format was agreed. When trying to put it into practice, many issues arose, and some of them might not be solvable.

See šŸ› Authored-by should use email address from commits Active , ✨ Add a UI to change footer "token" values for each user Active , and ✨ Allow users to set GitLab Commit email Active with some of their discussions as examples.

Trying to summarize:
- Commit emails may not be linked to any d.o account or GitLab account.
- The same user might have different commits with different emails.
- d.o has personal emails (PII); GitLab has public_email and other emails, but we can only search by the public one.
- The roles of each individual within an issue (reviewer, author, reporter…) might be really complicated to guess without deep, complex logic (commit history, users that commit or not, users that commented or not...).
- The GitLab API only stores the email for a given commit, not the git username, so all we can do is “guess” in some cases.

There are more relevant things on those issues; this is just an excerpt.

The roles calculation can be a bunch of micro-decisions for the maintainer to make (any automation won’t be perfect) that I can’t imagine actually matter.

Also, we seem to be “linking” (again?) the commit message with the credits, when these two things are (and should be) independent, but this might be beyond the scope of this issue and probably was discussed in the parent one. So, we'll focus on how to fix the above issues.

Proposed resolution

- Drop the roles from the produced message (as deciding this will be prone to errors).
- Drop the emails from the produced message (as we cannot match the actual emails used to users).

Doing this would fix all the mentioned issues.

Go from:

[#999999] task: Convert MediaSource plugin discovery to attributes

Authored-by: sorlov <xxx@no-reply...>
Authored-by: quietone <xxx@no-reply...>
Reviewed-by: smustgrave <xxx@no-reply...>
...

To:

[#999999] task: Convert MediaSource plugin discovery to attributes

By: sorlov
By: quietone
By: smustgrave
...

By could also be Authored-by (fixed).

Any further modifications can be done after the copy/paste of the suggested message.
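
As an illustration, the simplified format could be generated along these lines. This is a minimal Python sketch, not the code drupal.org uses: `build_commit_message` and its parameters are hypothetical names, and the trailer key is a fixed parameter because the summary notes that By could also be Authored-by.

```python
def build_commit_message(issue_id, commit_type, title, usernames, trailer="By"):
    """Return a commit message with one fixed trailer line per contributor."""
    summary = f"[#{issue_id}] {commit_type}: {title}"

    # De-duplicate contributors while keeping their original order.
    seen = set()
    unique = []
    for name in usernames:
        if name not in seen:
            seen.add(name)
            unique.append(name)

    # No roles and no emails: every contributor gets the same trailer key.
    trailers = [f"{trailer}: {name}" for name in unique]
    return "\n".join([summary, "", *trailers])


message = build_commit_message(
    999999,
    "task",
    "Convert MediaSource plugin discovery to attributes",
    ["sorlov", "quietone", "smustgrave"],
)
print(message)
```

Because the key is a single argument, switching between By and another fixed key stays a one-line change.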

✨ Feature request
Status

Active

Version

11.0

Component

other

Created by

🇪🇸 Spain fjgarlin


Comments & Activities

  • Issue created by @fjgarlin
  • 🇳🇿 New Zealand quietone

    @fjgarlin, thanks for pulling this together.

    I am all for a simple solution, which the proposed resolution is. And I prefer "By" instead of "Authored-by".

    Also, we seem to be “linking” (again?) the commit message with the credits, when these two things are (and should be) independent.

    Where is this happening, on an issue?

  • 🇪🇸 Spain fjgarlin

    No, it's not happening anywhere, it was mostly about all the conversation to try to get all the contributors of an issue into a commit message and all the problems that it was creating (aka this issue and related issues). I should probably rephrase or even strike that out.

    I also prefer the "By". I'll keep it simple in the proposed resolution.

  • 🇫🇷 France nod_ Lille

    Discussed on slack and I'm +1 for the simplification

  • 🇬🇧 United Kingdom catch

    It already takes quite a lot of time to sort out issue credit relative to everything else on an issue, at least on any issue with more than five people involved. Reviewer vs author is also often blurred - if someone does deep reviews, possibly with gitlab suggestions, vs someone making a couple of quick changes to resolve feedback.

    So agreed with keeping this as simple as possible.

  • 🇬🇧 United Kingdom catch

    This is quite a small change, even though the original format change took years to sort out, so I'm going to move it to RTBC. We're not in a position to actually use the format yet, so agreeing to change it just means changing what we'll eventually be able to use once we move to gitlab issues, although it will impact some contrib modules quicker.

  • 🇬🇧 United Kingdom longwave UK

    These lines are officially called "trailers" and I think there are some vague standards for them.

    GitLab supports certain forms of these, though the docs are sparse: https://docs.gitlab.com/user/project/repository/signed_commits/web_commits/

    GitHub supports at least Co-authored-by: https://docs.github.com/en/pull-requests/committing-changes-to-your-proj...

    Would it be helpful for future git archaeology if we used a somewhat supported standard here?

  • 🇪🇸 Spain fjgarlin

    I think that could open this can of worms, ✨ Add a UI to change footer "token" values for each user (Active), which we are trying to close.
    If we agree on a "fixed" format, that's 100% fine, but if there are expectations for a dynamic role, then that's a bigger issue. That's what we are trying to avoid with this issue.

  • 🇬🇧 United Kingdom longwave UK

    FWIW WordPress's Gutenberg repo also uses Co-authored-by to credit all contributors to a PR: https://make.wordpress.org/core/2024/02/01/new-commit-message-requiremen...

  • 🇪🇸 Spain fjgarlin

    Happy to go with that if that’s the agreed format. One line change from our end.

    Changed the proposed resolution in the issue summary to reflect this.

  • 🇳🇿 New Zealand quietone

    Co-authored-by is better than Authored-by but I still prefer the simplicity and less wordy By. For me, it means the same as Co-authored-by.

    However, I will support either By or Co-authored-by.

  • 🇬🇧 United Kingdom catch

    Yeah agreed with #11.

  • 🇬🇧 United Kingdom longwave UK

    @fjgarlin is it possible to add the noreply address by default for each co-author as well? I think that is required for GitLab to link the commits back to users - assuming those addresses are known on the GitLab side, I can't see my GitLab profile directly so not sure if this is set up or not.

  • 🇪🇸 Spain fjgarlin

    That's exactly what we had (so, yes, it's possible), and then another can of worms opened: 🐛 Authored-by should use email address from commits (Active).

  • 🇪🇨 Ecuador jwilson3

    I've come here via a link to this issue, currently buried inside a collapsed FAQ item on the new contrib record page (https://new.drupal.org/contribution-record/:id), which module maintainers are currently being encouraged to use.

    If you are committing, you are encouraged to use the formatting here.

    Is it really safe to recommend people start using a new commit message format if the part about multiple authors is not yet settled? Will issue credit be properly picked up, if the format ends up changing?

  • 🇪🇨 Ecuador jwilson3

    I still prefer the simplicity and less wordy By.

    If an issue has 20+ authors, then a newline and By: ends up getting really long itself. If we're back to only showing username instead of username <email>, then maybe we can also go back to a single line format like: By: author1, author2, author3, author4.

  • 🇪🇸 Spain fjgarlin

    Will issue credit be properly picked up

    Issue credit has no connection to the commit message. Issue credit is granted if maintainers check the checkbox for that user in the issue / contribution record.

  • 🇬🇧 United Kingdom catch

    Will issue credit be properly picked up, if the format ends up changing?

    Issue credit isn't based on the commit log, the canonical source is the Drupal.org issue credit records, these are then used in the UI to generate a commit message. So there's no actual effect on commit credit except how this influences the d.o UI.

    The thing that could be broken by more than one commit message format change is other things based on commit log parsing like codeswarm.

  • 🇪🇨 Ecuador jwilson3

    Okay, that all makes sense, but now I'm trying to understand whether the (fallible, potentially inaccurate) author credits in the commit message are intended to serve machines or humans?

  • 🇬🇧 United Kingdom catch

    When I look in git blame, I look for the issue number and the names of who worked on an issue - it gives me quick context about who worked on it which is often useful. I will often still need to copy and paste the issue nid and then take a look to see if there's the discussion of the line I'm interested in, but sometimes seeing whether I worked on the line in question or not is enough too.

  • 🇵🇪 Peru marvil07

    These lines are officially called "trailers" and I think there are some vague standards for them.

    For reference, trailers are covered in the official git documentation, e.g. man git-interpret-trailers.
    ℹ️ I already made a few points in the original issue, 🌱 [policy] Decide on format of commit message (Active), that may be relevant here.

  • 🇪🇨 Ecuador jwilson3

    As @longwave pointed out in previous comments, GitLab, GitHub and others seem to have come to some level of semi-standardization around the Co-authored-by: username <email> trailer, though this is not official in any way. So if the goal is for getting off the island, I'd go with that. On the other hand, if the goal is to just make the old "byline" info in the git commit message available (and easily human-scannable) for casual users, then a custom By: [username1]\nBy: [username2] is okay and By: [comma-separated list of users] even better, since IDE tools that do inline Git blame popups, like GitLens or Better Git Line Blame (which I use in Cursor), would get vertically truncated for really long multi-line commit messages, making them less scannable.

    I think it comes down to what problem we're trying to solve. Since we use GitLab, it seems smart to make the effort and kill 2 birds with 1 stone by going with something that GitLab supports. https://github.blog/news-insights/product-news/commit-together-with-co-a...

    From https://gitlab.com/gitlab-org/gitlab-foss/-/issues/31640

    Gitlab should recognize Co-Authored-By: [name] <[email]> in the message body, and should display user icons on UI related to the commit

    they don't actually link to the user from the commit message, as is suggested here, but they list the co-authors along with the main author in the interface

  • 🇪🇸 Spain fjgarlin

    Note that the old format is still available in the Contribution Record page ("Not ready to switch...", at the bottom of the page).

    I wouldn't like this issue to become as big as their predecessors, because here, it's just about simplifying the format due to the limitations explained in the issue description.

    From the link given on #22, there is a neutral "Helped-by" which could also work well here. Tho there seems to be some buy-in for "Co-authored-by".

    Given that the suggested format made it to RTBC in #6, and that most of the conversation after was about changing "By" to "Co-authored-by" (there were also other questions that were replied to), do we have consensus on the simplified format 2️⃣ suggested in the issue description?

  • 🇸🇰 Slovakia poker10

    If we do not consider "co-authored" as a potential issue with copyright, as mentioned in #3540547-8: Add a UI to change "role" values for each user in the footer, then for me either By or Co-authored-by looks good. I would not add emails, as discussed above.

  • 🇪🇸 Spain fjgarlin

    Re 2ļøāƒ£, got a +1 via slack from @nod_, @catch, @longwave, and I'll take @poker10 above comment as another +1.

    Setting back to RTBC based on that and will wait for a final approval on this (aka mark it as "Fixed").

  • 🇬🇧 United Kingdom catch

    Updating the issue summary to reflect that we converged on Co-authored-by

  • 🇫🇷 France mably

    Is there a release notes generator handling the new commit format?

  • 🇺🇸 United States dww

    TL;DR: I’d rather have no information than wrong information.
    -1 to Co-authored-by, +1 to By.

    If my heuristic for automatically differentiating co-authors from reviewers in the other issues (if you push commits to the MR, default to co-author, else, default to reviewer) isn’t feasible for technical reasons and/or we’re worried about people debating it case-by-case, MR suggestions blurring the line, etc, I think I’d prefer just By: . Both for copyright and for honesty, I’d rather not call every participant in an issue / MR a co-author. By is bland and vague, just like we use now, but doesn’t imply everyone wrote the change.
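
The default described above (pushed commits to the MR means co-author, otherwise reviewer) can be sketched in a few lines of Python. This is only an illustration under assumptions: `default_roles` is a hypothetical helper, and the lists of participants and of users who pushed commits are assumed to be available already.

```python
def default_roles(participants, committers):
    """Map each participant to a default trailer key.

    Anyone who pushed commits to the MR defaults to "Co-authored-by";
    everyone else defaults to "Reviewed-by". A maintainer could still
    override any entry before committing.
    """
    committers = set(committers)
    return {
        name: "Co-authored-by" if name in committers else "Reviewed-by"
        for name in participants
    }


roles = default_roles(["sorlov", "quietone", "smustgrave"], ["sorlov", "quietone"])
print(roles)
```

The sketch does nothing about blurred cases such as MR suggestions, which is the concern raised in this thread.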

    When I was originally spearheading this commit format change for core, I was hoping to use it as an opportunity not just to adopt useful 1-line summaries without an incomprehensible wall of usernames, but also to more accurately reflect the reality of who contributed to the change and how. If we call everyone a co-author, it’s less accurate, not more.

    Since doing the most accurate things automatically is revealing some technical hurdles and other resistance, the other approach I keep pondering / suggesting is to put the draft commit message to use in the MR body and/or as another section of the issue template and let folks craft / refine the draft commit message collectively. We already have a culture of encouraging people to keep the issue summary accurate. We already have a “field” in the template to collectively write a release note. However, I know folks don’t always do this. Core committers could start refusing to commit RTBC things if they don’t have this message already written. We do that for CRs and release notes. We’ll delay issues for weeks/months nitpicking code comments, but then we leave the urgent task of the Git history entirely up to committers to have to do on their own.

    The other option, if relying on MR/issue text isn’t going to fly, would be if the credit UI saved state and we could draft a “better” message via that UI. But that’s probably a whole other can of worms.

    If there’s no draft, we need a quick / easy way for committers to get something useful, which is what this issue aims to provide. So if every attempt at reflecting reality is going to be ruled out, I vote for something vague and true, not specific and (partly) false, even if GitLab would provide some limited “magic” automatically if we used Co-authored-by.

  • 🇺🇸 United States cmlara

    Setting needs work as I don't see any discussion about the potential liability for core committers claiming an individual who provided review comments disagreeing with the commit is an 'author' of the commit. This was raised in #3540547-8: Add a UI to change "role" values for each user in the footer. As I am not a lawyer, I would suggest core run this past counsel before using "Co-authored-by:".

    Yes, using By: breaks 3rd party tooling support (which, if I understand correctly, was a major reason to migrate to the new standard message format); however, misusing Co-authored-by would do the same. Additionally, we already deviate from the standard in other ways (prefixing the issue ID), which breaks tooling.

    I also want to note that #29 echoes a similar view I previously posted on Slack: if maintainers (originally discussing Contrib, but it equally applies to Core) have a concern about the commit message, it can indeed be made a gate for RTBC (although I believe only the MR opener and Maintainers can edit an MR message).

    even if GitLab would provide some limited “magic” automatically if we used Co-authored-by.

    Don't GitHub and GitLab need full email addresses for this (that match one of the user's emails)? Or are we referring to the fact that GitLab can automatically populate Co-authored-by based on commit authors? If the former, we won't really see any magic from any of these proposed methods.

    If we need to test any of these ideas, we still have https://www.drupal.org/project/test_commit_message, where we can put some test commits.

    Is there a release notes generator handling the new commit format?

    I believe Matt Glaman's Drupal MRN site should support all 3 versions of this (By, Authored-by, and Co-authored-by). Related Slack thread: https://drupal.slack.com/archives/C1BMUQ9U6/p1756151091432389

    Additionally, some more flexible generators (I'm looking at git-cliff at the moment) may be able to do the basic release notes portions as well with customization (not sure how they would handle the attribution being malformed; I suspect it would be ignored).

  • 🇪🇸 Spain penyaskito Seville 💃, Spain 🇪🇸, UTC+2 🇪🇺

    Setting needs work as I don't see any discussion about the potential liability for core committers claiming an individual who provided review comments disagreeing with the commit is an 'author' of the commit.

    People have often asked not to be credited in a given issue, and this was solved by not crediting them.
    This is not a new problem, and neither the commit message format nor the recent changes in the credit system storage have any effect on it. It shouldn't (and probably cannot) be fixed here.

  • 🇺🇸 United States cmlara

    This is not a new problem, and the commit message format or the recent changes in the credit system storage have no effect on this.

    Previous versions did not attempt to claim the individual wrote (part of) the code, as is being done with the proposed change in text to "Co-authored-by".

  • 🇺🇸 United States dww

    (From the summary):

    Any further modifications can be done after the copy/paste of the suggested message.

    This is part of the approach to this whole problem that I find a bit problematic. I’d rather this wasn’t entirely dependent on core committers to do manually. That’s already a bottleneck. I’m advocating for a solution where at least subsystem maintainers, if not anyone contributing to an issue, can collectively improve this git history in advance. Committers can always override and further edit before they actually push the commit, but we could lighten the load significantly with a collective solution.

  • 🇺🇸 United States dww

    Re emails, usernames, etc: sounds like there are some edge cases where we don’t know everything. But a little work could go a long way to providing as much info as we have in the vast majority of cases, no? Eg if I could set my GitLab email address to match the email address I have my local git configured to use, and I always pushed commits with the same address, everything would Just Work(tm), no? Worst case that someone borked this and the same user had 2 different emails? Big deal, we either see both of them or the last one or whatever. It’s still better than having no way to link anyone to a commit other than the Author.

  • 🇬🇧 United Kingdom catch

    Drupal core doesn't have access to legal counsel and never has done. I'm surprised I even had to type that sentence as it's so far removed from the reality of core development.

    The GPL is explicitly provided with no warranty or liability.

    As @penyaskito mentions if someone requests not to be credited on an issue we remove them.

    For me 'by' and 'co-authored-by' both imply authorship, just the latter sounds artificially a bit grander.

    I personally do not care if we use 'by' or 'co-authored-by' but I do care if improvements to the commit message format and gitlab issues get delayed by bike shedding over entirely hypothetical situations.

  • 🇳🇱 Netherlands bbrala Netherlands

    I'm all for simplifying the workflow, co-authored-by should be fine. I do think we are overly complicating things by giving co-authored-by so much extra weight. I think for an oblivious person it really means the same as the old format. "Issue xxxx by bbrala, catch" reads the same. "This issue has come together by these people" as co authored suggests also.

    Let's not pick a standard, and then move away from that standard too much just because we can explain the words differently.

    Also, wouldn't a possible end goal be that we generate the commit message from gitlab as much as possible? There are variables there to do a lot of heavy lifting.

    https://docs.gitlab.com/user/project/merge_requests/commit_templates/

    That was also part of my frame of reference when discussing the commit message change. Then we don't need to manually do much, but that does mean we don't invent our own. But I guess that part was not iterated on much (or perhaps even mentioned explicitly enough).

  • 🇳🇿 New Zealand quietone

    I personally do not care if we use 'by' or 'co-authored-by' but I do care if improvements to the commit message format and gitlab issues get delayed by bike shedding over entirely hypothetical situations.

    +1. This is all GPL and in a community that wants to recognize all the people who contributed to get an issue finished.

    I also want to remind us all that this issue is to tweak the agreement of the original issue because of technical reasons. Let's keep it to that.

  • 🇪🇨 Ecuador jwilson3

    Since we're removing email due to the limitations above, it makes sense to also not use 'Co-authored-by', in addition to the other arguments made above, because other platforms would typically expect a username + email in that format.

    In #23 my proposal was that if consensus is that we're not trying to satisfy machines and use a standard format, then I'd propose taking it one single step further than just the technical email limitations written in the issue summary, and make the 'By: ' trailer readable as a single comma-separated list of usernames, which aligns more closely with how we used to do it.

    Before:

    Issue #999999 by user1, user2, user3, user4, user5, user6, user7, user8, user9: Convert MediaSource plugin discovery to attributes
    

    After:

    [#999999] task: Convert MediaSource plugin discovery to attributes
    
    By: user1, user2, user3, user4, user5, user6, user7, user8, user9
    

    Instead of:

    [#999999] task: Convert MediaSource plugin discovery to attributes
    
    By: user1
    By: user2
    By: user3
    By: user4
    By: user5
    By: user6
    By: user7
    By: user8
    By: user9
    

    I didn't see anyone address this point yet and it was the primary point I was trying to make. Sorry for what sounds like it devolved into a bikeshed, but this goes directly to the point of considering how the data is used so we can get the format right once and not change it again for 10+ years.
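
For comparison, turning the multi-line form into the single-line form is mechanical. A minimal Python sketch, assuming the message uses one "By: name" line per contributor; `collapse_by_trailers` is a hypothetical helper, not existing tooling.

```python
def collapse_by_trailers(message, key="By"):
    """Merge repeated "By: name" lines into one comma-separated trailer."""
    prefix = f"{key}: "
    kept, names = [], []
    for line in message.splitlines():
        if line.startswith(prefix):
            names.append(line[len(prefix):].strip())
        else:
            kept.append(line)

    # Drop trailing blank lines left behind after removing the trailers.
    while kept and not kept[-1].strip():
        kept.pop()

    if not names:
        return "\n".join(kept)
    return "\n".join([*kept, "", prefix + ", ".join(names)])


multi = "[#999999] task: Convert MediaSource plugin discovery to attributes\n\nBy: user1\nBy: user2\nBy: user3"
print(collapse_by_trailers(multi))
```

Going the other way (splitting on commas) is just as simple, so the choice between the two forms mostly affects how readable the message is in blame popups and logs.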

  • 🇺🇸 United States dww

    #34 is not bikeshedding over hypotheticals. Without email addresses (even hard-coding the no reply if we have to), it’s going to really be hard for any other tooling to be helpful, or for GitLab to do its own magic. Can we prevent letting perfect be the enemy of good and do a best-effort for emails, even if there are edge cases where it doesn’t work flawlessly?

  • 🇺🇸 United States dww

    I wasn’t sold on deviating from conventional commits to put the issue number at the front of the line. My original proposals in the initial issue were all to put the issue number after the colon, as the start of the description, since that’s fully compliant with the standard. So long as we don’t get too creative with the type keywords, and mostly use fix, task and feat, it should still be pretty scannable by humans, and totally readable by tools written for the spec. Even with the issue number at the front, they’re not all the same number of digits, so they won’t perfectly line up, anyway.

    If you’re looking at a rebase or some other action where you only see the 1-line summaries, it’d be something like:

    feat: [#12345] Whatever something nice to say
    fix: [#12346] Some other important thing
    task: [#1234789] Clean up the whatever
    

    Vs.

    [#12345] feat: Whatever something nice to say
    [#12346] fix: Some other important thing
    [#1234789] task: Clean up the whatever
    

    IMHO, the 2nd isn’t fundamentally more scannable by humans, but it’s much less scannable by tools written to support the spec.
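
The difference for tooling can be shown with a small Python sketch. The regular expression below is only an approximation of the Conventional Commits header (type, optional scope, optional "!", then ": " and a description), written for illustration; `parse_header` is a hypothetical name.

```python
import re

# Approximate Conventional Commits header: type, optional "(scope)",
# optional "!" for breaking changes, then ": " and a description.
HEADER_RE = re.compile(
    r"^(?P<type>[A-Za-z]+)(?:\((?P<scope>[^()]+)\))?(?P<breaking>!)?: (?P<description>.+)$"
)


def parse_header(summary):
    """Return the parsed header fields, or None if the line does not fit."""
    match = HEADER_RE.match(summary)
    return match.groupdict() if match else None


for line in (
    "feat: [#12345] Whatever something nice to say",
    "[#12345] feat: Whatever something nice to say",
):
    print(line, "->", parse_header(line))
```

With this pattern the first form parses with type "feat" and the issue number inside the description, while the second form is rejected because the line starts with the bracketed issue number.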

  • 🇺🇸 United States dww

    Finally, I’m still interested in hearing about a collective solution instead of something committers have to do entirely on their own. I’m sad we’re proposing to simplify this message so much, only to make the bespoke tooling a little easier, instead of seeing about ways to make this whole task part of the work of getting an issue ready to be committed and then being able to take full advantage of the improved format for the benefit of our Git history.

    Thanks,
    -Derek

  • 🇬🇧 United Kingdom catch

    I'm not sure how a (more) collective solution could work technically. Currently both committers and subsystem maintainers can assign issue credit (hopefully this is the case with the new system) - when this is already done on a complex issue it definitely helps, and is a collective process.

    However the commit message itself is automatically generated from the contribution credit and people's own accounts/metadata, so any manual tweaks like changing 'fix' to 'feat' or whatever have to be done by the actual person making the commit at commit time (not even people with commit access in general). With the new system I have to exclude myself from commit messages if the only thing I'm doing on an issue is committing it, then manually add the contribution credit for the commit - either doing it in the right order or manually editing the commit message to remove myself.

    If we get into things like making sure people's email addresses are correct (I managed to end up with three in core commit logs - the no-reply d.o one I was trying to use, my personal email when I forgot to set no-reply on a new machine for a while, and a second no-reply one in a different format that I'm still not sure where it came from), then that is a lot of additional admin work that only the person being credited themselves actually knows about - and they may not have been on that issue for three years.

    Issue credit is already the most time consuming part of the technical process of committing an issue (excluding review time, although one line commit typo fixes with 15 people on an issue, not even excluding that). So I'm very resistant to anything that increases that time, even by a little bit.

    With #38 do we know whether gitlab's integration works without an email address or not? If not, I'd be fine with just a comma separated list, but it would be a shame to change things yet again if it turns out gitlab can handle the format without email addresses.

  • 🇺🇸 United States dww

    Possible ways to implement a collective solution:

    1. Add another heading to the default issue body template called “Draft commit message”.
    2. Adopt the convention that the body of the MR (or at least part of it) should be a draft commit message.
    3. Use GitLab commit templates and hopefully get a bunch of the trailers and other goodies automagically per the end of comment #36.

    For 1 and 2, core committer copy/pastes from the issue or MR, not the credit UI. For 3, maybe we start merging via GitLab?

  • 🇪🇸 Spain fjgarlin

    Note on emails šŸ› Authored-by should use email address from commits Active . Having the no-reply address is easy, possible and straightforward, and would not require any additional api calls. It was initially implemented that way until that issue was created.

    Using the actual emails used in the commits (that's what that issue is about) would require an extra api call for every email listed in every MR and keeping a map of users to emails, and then replace those that were matched (which probably most of the time will be the initial no-reply suggestions).
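
The replacement step described above could look roughly like this once a map of users to commit emails exists. This is a minimal Python sketch under assumptions: `resolve_emails` is a hypothetical helper, the API calls that would build the map are not shown, and the no-reply fallback is passed in as a function because its real format is not spelled out in this thread.

```python
def resolve_emails(usernames, commit_emails_by_user, default_email):
    """Pick one email per user for the commit trailers.

    commit_emails_by_user: dict of username -> list of emails seen in commits
        (assumed to have been collected beforehand).
    default_email: callable returning the fallback (no-reply) address
        for a username; its format is an assumption of the caller.
    """
    resolved = {}
    for name in usernames:
        emails = commit_emails_by_user.get(name) or []
        # Use the most recently seen commit email when one was matched,
        # otherwise keep the initial no-reply suggestion.
        resolved[name] = emails[-1] if emails else default_email(name)
    return resolved


# Placeholder fallback using a reserved, non-routable domain.
fallback = lambda name: f"{name}@noreply.example.invalid"
print(resolve_emails(["alice", "bob"], {"alice": ["alice@example.org"]}, fallback))
```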

  • 🇬🇧 United Kingdom catch

    For 3, maybe we start merging via GitLab?

    We historically haven't been able to do this because it wasn't possible to customise the author on squashed commit messages, and probably other things too in the past, but iirc that's the last one, I think that still might be the case but haven't checked in the past month.

    It would be _great_ to be able to use merge trains, but I feel like this issue is one of still several steps before we're able to do that.

    Add another heading to the default issue body template called “Draft commit message”.

    This feels like it would mean having to maintain both the issue credits themselves and the commit message separately, in different interfaces, and that they would get out of sync all the time. The commit message is also currently generated from the title, but that would also now be maintained in a separate place (or just disconnected entirely).
